[UA-discuss] Checking for Mixed Scripts in a domain name or email address - is this a UASG issue? - Group Discussion

Stuart Stuple stuartst at microsoft.com
Mon Aug 7 20:05:22 UTC 2017

I believe you could use the Unicode definitions of scripts since, in that system, digits are NOT Latin. (They are part of ANSI and ASCII, but Unicode clearly separates digits and punctuation from the Latin script; it assigns them to the Common script.)

That said, the other concerns (about whether this is within remit and whether it is a good practice) remain.
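
To illustrate the point about digits, here is a minimal Python sketch. It is not a reference implementation: Python's standard library does not expose the Unicode Script property, so the helper below guesses the script from the first word of the character name, which is only a heuristic. The function names `script_of` and `scripts_in` and the short list of known scripts are assumptions made for this example.

```python
import unicodedata

# Scripts this sketch knows about (an arbitrary, incomplete list).
KNOWN_SCRIPTS = ("LATIN", "CYRILLIC", "DEVANAGARI", "GREEK")

def script_of(ch):
    """Guess a character's script from its Unicode name (heuristic only)."""
    name = unicodedata.name(ch, "")
    for script in KNOWN_SCRIPTS:
        if name.startswith(script):
            return script.title()
    # Letters from scripts not listed above are reported as "Other";
    # digits, dots and hyphens are treated as script-neutral ("Common").
    return "Other" if ch.isalpha() else "Common"

def scripts_in(text):
    """Return the set of scripts used in text, ignoring Common characters."""
    return {script_of(ch) for ch in text} - {"Common"}

print(scripts_in("москва2017"))   # Cyrillic letters plus digits
print(scripts_in("deepak.भारत"))  # Latin label, Devanagari TLD
```

Under this approach a Cyrillic label with digits is reported as single-script, because the digits are never counted as Latin.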

From: ua-discuss-bounces at icann.org [mailto:ua-discuss-bounces at icann.org] On Behalf Of Maxim Alzoba
Sent: Monday, August 7, 2017 12:47 PM
To: Don Hollander <don.hollander at icann.org>
Cc: ua-discuss at icann.org
Subject: Re: [UA-discuss] Checking for Mixed Scripts in a domain name or email address - is this a UASG issue? - Group Discussion

Hello Don,

Usually a Cyrillic IDN contains digits, so one of the examples is not correct (Cyrillic + Latin numerals is not a mixed script).


Usually issues arise when the string contains characters that do not fall into the same IDN table, for example
something from IDN mistakes -



Or, which is important, an IDN string from a table that is not allowed for the particular TLD,

for example moscow.москва or москва.moscow (both TLDs are strictly one script: only Russian Cyrillic in the first, and only the allowed ASCII symbols in the second).

P.S.: formally, each TLD has an IDN policy or a "no-IDN policy" that describes the allowed combinations and, importantly, it may change over time
(for example, some old TLDs decided to allow some IDN tables).
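
A rough Python sketch of the per-TLD check described above. The `ALLOWED` table is hypothetical and hard-coded for illustration; a real check would have to load each registry's current IDN tables, which can change over time. The script name is again guessed from the Unicode character name, and digits and hyphens are ignored, both of which are simplifying assumptions.

```python
import unicodedata

# Hypothetical policy table: TLD -> scripts allowed in labels under it.
ALLOWED = {
    "москва": {"CYRILLIC"},
    "moscow": {"LATIN"},
}

def label_scripts(label):
    """Scripts of the letters in a label (first word of each character name)."""
    found = set()
    for ch in label:
        if ch.isalpha():  # skip digits and hyphens
            found.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return found

def allowed_under_tld(domain):
    """True if every label left of the TLD uses only scripts the TLD allows."""
    *labels, tld = domain.lower().split(".")
    allowed = ALLOWED[tld]  # KeyError means no policy is known for this TLD
    return all(label_scripts(label) <= allowed for label in labels)

for name in ("moscow.москва", "москва.moscow", "москва.москва", "metro2017.moscow"):
    print(name, allowed_under_tld(name))
```

The check is made against the TLD's policy rather than against a generic "one script per name" rule, so the same label can pass under one TLD and fail under another.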

Sincerely Yours,

Maxim Alzoba
Special projects manager,
International Relations Department,

m. +7 916 6761580(+whatsapp)
skype oldfrogger

Current UTC offset: +3.00 (.Moscow)

On Aug 7, 2017, at 21:52, Don Hollander <don.hollander at icann.org> wrote:

We’ve had the following suggested for the Programming Language Criteria:

    Could we include one more test case in "Programming Language Evaluation Criteria"?

   1. Checking for multiple scripts in Email, Domain, URL.
       Ex: еріс.com [xn--e1awd7f.com] contains two scripts (Cyrillic, Latin).
           deepak.भारत contains two scripts (Latin, Devanagari).
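
For the first example, a small Python sketch that decodes the A-label back to its Unicode form and lists the code points. It uses the standard-library `punycode` codec; the helper name `to_u_label` is an assumption made for this illustration, and the sketch does no IDNA validation.

```python
import unicodedata

def to_u_label(label):
    """Decode an 'xn--' A-label to Unicode; return other labels unchanged."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label

decoded = to_u_label("xn--e1awd7f")
for ch in decoded:
    # Print each code point with its Unicode name to show its script.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Every character of the decoded label is Cyrillic, so in this example the label itself is single-script and the Latin part comes only from the .com TLD.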

I understand the thinking behind this, but I’m not sure that a) it’s in our remit or b) it’s a good idea, and c) it would reject perfectly valid domain names (CJK, Cyrillic + Latin numerals, etc.).

I also don’t know what standard or policy this would reference. Something from the Unicode group or M3AAWG?

Your thoughts, please.

