[UA-discuss] Checking for Mixed Scripts in a domain name or email address - is this a UASG issue? - Group Discussion

Maxim Alzoba m.alzoba at gmail.com
Mon Aug 7 19:46:35 UTC 2017


Hello Don,

Cyrillic IDN tables usually include the digits, so one of the examples is not correct: Cyrillic plus Latin numerals is not a mixed script.

https://www.iana.org/domains/idn-tables/tables/xn--80adxhks_ru_1.0.txt
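
A minimal sketch in Python of why digits should not count toward "mixed script". Python's standard library does not expose the Unicode Script property, so this approximates a character's script by the first word of its Unicode name; that shortcut, and the helper names label_scripts and is_mixed_script, are assumptions made for illustration only. A naive check like this would also flag legitimate Japanese labels that combine Han, Hiragana and Katakana, so it is not a validation rule by itself.

```python
import unicodedata

def label_scripts(label):
    """Approximate the set of scripts used by the letters of a label."""
    scripts = set()
    for ch in label:
        # Only letters are considered; digits and the hyphen are skipped,
        # so "Cyrillic + digits" still yields a single script.
        if unicodedata.category(ch).startswith("L"):
            # Approximation: first word of the character name, e.g. "CYRILLIC".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(label):
    """True when the letters of the label come from more than one script."""
    return len(label_scripts(label)) > 1

print(is_mixed_script("москва2017"))  # Cyrillic letters plus digits
print(is_mixed_script("tчху"))        # Latin "t" followed by Cyrillic letters
```

A production check would more likely rely on the Unicode Script property and the mixed-script rules in the Unicode security mechanisms report rather than on character names.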


Issues usually arise when a string contains characters that do not all fall into the same IDN table. For example, this entry from the IDN mistakes:

tчху	xn--t-cubfh

from
https://www.icann.org/sites/default/files/packages/reserved-names/ReservedNames.xml
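
As a rough sketch, the entry above can be decoded and checked against a single table in Python. The built-in "punycode" codec implements the RFC 3492 encoding only; it does no IDNA validation or normalization. RUSSIAN_TABLE is a hypothetical stand-in (Russian letters, digits, hyphen), not the contents of any published IDN table, and the function names are illustrative.

```python
ACE_PREFIX = "xn--"

def to_a_label(u_label):
    """Encode a Unicode label to its ACE form (no IDNA validation is done)."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")

def to_u_label(a_label):
    """Decode an 'xn--' label back to Unicode."""
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")

# Hypothetical stand-in for an IDN table; a real registry table
# should be consulted for the actual repertoire.
RUSSIAN_TABLE = set("абвгдеёжзийклмнопрстуфхцчшщъыьэюя0123456789-")

def outside_table(label, table):
    """Return the characters of the label that the table does not contain."""
    return [ch for ch in label if ch not in table]

label = to_u_label("xn--t-cubfh")
print(label, outside_table(label, RUSSIAN_TABLE))
```

Here the Latin "t" is the only character reported as falling outside the assumed Russian table, which is what makes the string a cross-table mix.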

Another important case is an IDN string taken from a table that is not allowed for the particular TLD, for example moscow.москва or москва.moscow (both TLDs are strictly single-script: the first allows only Cyrillic (Russian), and the second allows only ASCII characters).
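
A minimal Python sketch of such a per-TLD check, assuming the input is a plain "label.tld" name. The TLD_POLICY map, the two character tables and the function name are hypothetical and only mirror the two TLDs mentioned here; real policies come from each registry and would have to be reloaded when they change.

```python
import string

# Hypothetical character tables, for illustration only.
ASCII_LDH = set(string.ascii_lowercase + string.digits + "-")
RUSSIAN = set("абвгдеёжзийклмнопрстуфхцчшщъыьэюя" + string.digits + "-")

# Hypothetical policy map: each TLD lists the tables it accepts.
TLD_POLICY = {
    "москва": [RUSSIAN],
    "moscow": [ASCII_LDH],
}

def label_allowed(name):
    """Check that the label fits entirely within one table allowed by its TLD."""
    label, _, tld = name.lower().rpartition(".")
    tables = TLD_POLICY.get(tld)
    if tables is None:
        raise KeyError("no policy known for TLD: " + tld)
    # The whole label must come from a single allowed table.
    return bool(label) and any(
        all(ch in table for ch in label) for table in tables
    )

for name in ("moscow.москва", "москва.moscow", "москва.москва", "moscow.moscow"):
    print(name, label_allowed(name))
```

Note that neither rejected name is mixed-script within its label; each is rejected only because the label's table is not permitted under that TLD.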

P.S. Formally, each TLD has an IDN policy (or a "no-IDN policy") that describes the allowed combinations and, importantly, this policy may change over time (for example, some older TLDs have decided to allow some IDN tables).
 
Sincerely Yours,

Maxim Alzoba
Special projects manager,
International Relations Department,
FAITID

m. +7 916 6761580(+whatsapp)
skype oldfrogger

Current UTC offset: +3.00 (.Moscow)

> On Aug 7, 2017, at 21:52, Don Hollander <don.hollander at icann.org> wrote:
> 
> We’ve had the following suggested for the Programming Language Criteria:
>  
>     Could we include one more test case in "Programming Language Evaluation Criteria ".
> 
>    1. checking of multiple script in Email,Domain,Url.
>        Ex: еріс.com [xn--e1awd7f.com] contain two language script (Cyrillic, Latin).
>               deepak.भारत  contain two  language script (Latin, Devanagari)
>  
>  
> I understand the thinking behind this, but I’m not sure that a) it’s in our remit, b) it’s a good idea c) it will be rejecting perfectly valid domain names (CJK, Cyrillic+Latin Numerals, etc)
>  
> I also don’t know what standard this would reference or what policy it would reference.   Something from the Unicode group or M3AAWG?
>  
>  
> Your thoughts, please.
>  
> Don
>  
