[UA-discuss] Checking for Mixed Scripts in a domain name or email address - is this a UASG issue? - Group Discussion

Mark Svancarek marksv at microsoft.com
Tue Aug 8 23:54:06 UTC 2017


My gut-level response is that mixing scripts within a label is much worse than mixing scripts across an FQDN, though I am hard-pressed to quantify that feeling. One could use confusable code points from Latin, Greek and/or Cyrillic, with only a single script in each label but a mixture overall within the FQDN. Is that really less dangerous? Hmm…

From: ua-discuss-bounces at icann.org [mailto:ua-discuss-bounces at icann.org] On Behalf Of Asmus Freytag
Sent: Monday, August 7, 2017 3:36 PM
To: ua-discuss at icann.org
Subject: Re: [UA-discuss] Checking for Mixed Scripts in a domain name or email address - is this a UASG issue? - Group Discussion

The ASCII digits have script "common", not "Latin".

The issue isn't numbers, it's mixing letters.

Also, mixing scripts within a label seems somehow different from mixing scripts within a FQDN.
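
A minimal sketch of the two points above (digits do not count as a script, and per-label mixing differs from mixing across the FQDN). Python's standard library does not expose the Unicode Script property, so `rough_script` below is an assumed heuristic that reads the first word of the character name; the function names are hypothetical and this is not a reference implementation. A naive check like this would also flag legitimate combinations such as Han with Hiragana and Katakana.

```python
import unicodedata

def rough_script(ch):
    """Heuristic script guess; an assumption, not the real Script property."""
    if ch.isascii() and not ch.isalpha():
        return "Common"  # ASCII digits and hyphen carry no script of their own
    # First word of the name, e.g. "CYRILLIC SMALL LETTER A" -> "Cyrillic"
    return unicodedata.name(ch, "UNKNOWN").split()[0].capitalize()

def letter_scripts(text):
    """Set of scripts used in text, ignoring "Common" characters."""
    return {rough_script(ch) for ch in text} - {"Common"}

def mixing(fqdn):
    """Return (mixed_within_some_label, mixed_across_whole_fqdn)."""
    per_label = [letter_scripts(label) for label in fqdn.split(".")]
    within = any(len(scripts) > 1 for scripts in per_label)
    across = len(set().union(*per_label)) > 1
    return within, across

print(letter_scripts("москва2017"))  # digits do not add a second script
print(mixing("tчху.com"))            # Latin + Cyrillic inside one label
print(mixing("еріс.com"))            # each label single-script, FQDN mixed
```

Under this sketch, a Cyrillic label with ASCII digits reports only Cyrillic, while a name whose labels are each single-script can still be mixed at the FQDN level.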

A./

On 8/7/2017 12:46 PM, Maxim Alzoba wrote:
Hello Don,

Cyrillic IDN tables usually include digits, so one of the examples is not correct (Cyrillic + Latin numerals is not a mixed script).

https://www.iana.org/domains/idn-tables/tables/xn--80adxhks_ru_1.0.txt


Issues usually arise when a string contains characters that do not fall into the same IDN table, for example,
something from the IDN mistakes:
tчху

xn--t-cubfh



from
https://www.icann.org/sites/default/files/packages/reserved-names/ReservedNames.xml
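
To see why the label above is a problem, it can be decoded and inspected code point by code point. This is a small sketch using Python's built-in `punycode` codec; the helper name `to_unicode_label` is hypothetical, and the sketch skips the validation a real IDNA library would perform.

```python
import unicodedata

def to_unicode_label(a_label):
    """Decode an "xn--" A-label with the standard-library punycode codec."""
    if a_label.lower().startswith("xn--"):
        return a_label[4:].encode("ascii").decode("punycode")
    return a_label  # plain ASCII label, nothing to decode

label = to_unicode_label("xn--t-cubfh")
for ch in label:
    # Prints each code point with its Unicode name, which names the script
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The names printed for this label start with a Latin letter followed by Cyrillic ones, which is the kind of string that does not fit a single IDN table.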

Or, importantly, an IDN string drawn from a table that is not allowed for the particular TLD,

for example moscow.москва or москва.moscow (both TLDs are strictly single-script: only Russian Cyrillic in the first, and only ASCII symbols in the second).
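
A rough sketch of a per-TLD check along these lines. The repertoires and the `POLICY` table are hypothetical stand-ins written for illustration, not the published IDN tables of either TLD; a real check would load the registry's current table, since policies can change.

```python
ASCII_LDH = set("abcdefghijklmnopqrstuvwxyz0123456789-")

def is_ascii_ldh(label):
    """True if the label uses only ASCII letters, digits and hyphen."""
    return all(ch in ASCII_LDH for ch in label)

def is_russian_cyrillic(label):
    """True if the label uses only Russian Cyrillic letters, digits and hyphen.

    Assumed repertoire for this sketch: U+0430..U+044F plus U+0451.
    """
    return all("а" <= ch <= "я" or ch == "ё" or ch in "0123456789-"
               for ch in label)

# Hypothetical policy table: TLD -> predicate each label must satisfy
POLICY = {"moscow": is_ascii_ldh, "москва": is_russian_cyrillic}

def allowed_under_tld(fqdn):
    """Return True/False per the sketch policy, or None if the TLD is unknown."""
    *labels, tld = fqdn.lower().split(".")
    check = POLICY.get(tld)
    if check is None:
        return None
    return all(check(label) for label in labels)

print(allowed_under_tld("moscow.москва"))  # Latin label under a Cyrillic-only TLD
print(allowed_under_tld("москва.moscow"))  # Cyrillic label under an ASCII-only TLD
```

Keeping the policy as data per TLD, rather than one global "no mixed scripts" rule, matches the point that each registry defines its own allowed combinations.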

P.S.: Formally, each TLD has an IDN policy or a "no-IDN" policy that describes the allowed combinations and, importantly, it may change over time
(for example, some old TLDs decided to allow some IDN tables).

Sincerely Yours,

Maxim Alzoba
Special projects manager,
International Relations Department,
FAITID

m. +7 916 6761580(+whatsapp)
skype oldfrogger

Current UTC offset: +3.00 (.Moscow)

On Aug 7, 2017, at 21:52, Don Hollander <don.hollander at icann.org> wrote:

We’ve had the following suggested for the Programming Language Criteria:

    Could we include one more test case in "Programming Language Evaluation Criteria"?

   1. Checking for multiple scripts in email, domain, URL.
       Ex: еріс.com [xn--e1awd7f.com] contains two scripts (Cyrillic, Latin).
              deepak.भारत contains two scripts (Latin, Devanagari).


I understand the thinking behind this, but I’m not sure that a) it’s in our remit or b) it’s a good idea, and c) I worry it would reject perfectly valid domain names (CJK, Cyrillic + Latin numerals, etc.).

I also don’t know what standard or policy this would reference. Something from the Unicode group or M3AAWG?


Your thoughts, please.

Don



