[UA-discuss] IDN Implementation Guidelines [RE: Re : And now about phishing...]
dusan at dukes.in.rs
Fri Apr 21 21:41:57 UTC 2017
As chair of the Cyrillic GP, I can assure you that I know all the things you have just explained. :)
Thanks for the very good explanation, and yes, we have done a wonderful job with cross-script homoglyphs, but it seems to me that I need to explain something else:
In my mind the question is not about a Latin brand written in Arabic script. I used to work for the brand Политика, a Cyrillic brand, so I am thinking about Cyrillic brands written in Arabic, Armenian, Greek, Latin…
All in all, I was saying that because of possible confusion between all scripts, not only confusion related to „ASCII brands“.
And again, maybe I am wrong, I am just discussing. :)
For certificates – in other email.
From: Asmus Freytag (c) [mailto:asmusf at ix.netcom.com]
Sent: Friday, April 21, 2017 9:01 PM
To: Dusan Stojicevic <dusan at dukes.in.rs>; nalini.elkins at insidethestack.com; 'Vittorio Bertola' <vittorio.bertola at open-xchange.com>; ua-discuss at icann.org; 'Edmon Chung' <edmon at registry.asia>
Subject: Re: [UA-discuss] IDN Implementation Guidelines [RE: Re : And now about phishing...]
On 4/21/2017 10:11 AM, Dusan Stojicevic wrote:
And these are just brand names with Cyrillic... more of them can be made with other scripts (Armenian, Georgian, Greek, Arabic...).
Just hold on a minute.
We've just done a pretty thorough first pass over cross-script homoglyphs (the identical-looking code points, not the "looks the same if you squint at them at arms-length" variety).
The conclusion is that Armenian has a small number of letters (q, h, n, u, o, and possibly g) that might qualify. In some fonts, they are rendered practically identically, in others not so much:
They are also less "useful" for whole script confusables, as they lack certain high frequency letters like "e", "a", "i", and "s"
[Inline image not preserved in the archive.]
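To illustrate why the missing high-frequency letters matter, here is a minimal Python sketch. It assumes the lookalike set is exactly the letters named above (q, h, n, u, o, with g as the "possible" one); the function names are made up for this illustration and are not part of any registry tooling.

```python
# Latin letters that, per the review above, have near-identical Armenian
# lookalikes in some fonts. "g" was only a possible match, so it is optional.
ARMENIAN_LOOKALIKES = set("qhnuo")
POSSIBLE_LOOKALIKES = set("g")


def _allowed(include_possible):
    """Build the set of Latin letters considered spoofable."""
    if include_possible:
        return ARMENIAN_LOOKALIKES | POSSIBLE_LOOKALIKES
    return set(ARMENIAN_LOOKALIKES)


def spoofable_with_armenian(label, include_possible=False):
    """True if every letter of a Latin label has an Armenian lookalike."""
    allowed = _allowed(include_possible)
    return bool(label) and all(ch in allowed for ch in label.lower())


def missing_letters(label, include_possible=False):
    """Letters of the label that have no Armenian lookalike in this model."""
    return sorted(set(label.lower()) - _allowed(include_possible))


if __name__ == "__main__":
    for word in ("noun", "hung", "epic"):
        print(word, spoofable_with_armenian(word), missing_letters(word))
```

Under these assumptions very few real words qualify, because almost every label needs an "e", "a", "i" or "s".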
Now for Georgian, the same review concluded there is no high fidelity overlap (near identical pair of code points).
In Greek you have a real issue only to the extent that you show the address in uppercase. Most of the lowercase letters are pretty distinct (except for omicron, and nu (ν) looks more than a little bit like "v"). We had a strong debate on whether to take uppercase into account when deciding which code points constitute cross-script variants.
The conclusion we had was that the protocol is limited to lowercase for a reason.
If you consider uppercase, you get different pairs based on the two cases.
Capital Nu (Ν) looks like "N", lowercase nu (ν) looks like "v". If you require variants to be transitive (very necessary for optimized evaluation), then you get "n" as a variant of "v" in Latin!
It works like this: Lowercase n is a case variant of cap N, N is a (homoglyph-)variant of Cap Nu, Cap Nu is a (case-)variant of lowercase nu, lowercase nu is a (homoglyph-)variant of v. When you traverse this chain, which is what defines transitivity, you can get from "n" to "v" inside the same script.
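The chain above can be sketched in a few lines of Python. This is only an illustration of transitive closure over variant pairs, using a small union-find; the edge list contains just the four relations named in the paragraph, and the function names are assumptions, not how any registry actually evaluates variants.

```python
from collections import defaultdict

# Each edge is one direct variant relation from the chain described above.
EDGES = [
    ("n", "N"),            # Latin case pair
    ("N", "\u039d"),       # Latin N and Greek capital Nu (homoglyphs)
    ("\u039d", "\u03bd"),  # Greek case pair: capital Nu, small nu
    ("\u03bd", "v"),       # Greek small nu and Latin v (homoglyphs)
]


def variant_sets(edges):
    """Group code points into sets that are closed under transitivity."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return list(groups.values())


def are_variants(a, b, edges):
    """True if a and b end up in the same transitive variant set."""
    return any(a in group and b in group for group in variant_sets(edges))


if __name__ == "__main__":
    print(are_variants("n", "v", EDGES))      # all four relations
    print(are_variants("n", "v", EDGES[3:]))  # lowercase homoglyph pair only
```

With all four relations, "n" and "v" land in the same set. With only the lowercase homoglyph pair (the last edge), they do not, which is the effect of restricting the analysis to lowercase.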
We figured that we had reached the limit of what you can address with variants in the registries at this point.
Finally, as for Arabic, I would like to see an example of a Latin label spoofed using only Arabic letters.
(It's possible to write "English" using Chinese characters that vaguely look like letters of the alphabet, but while you can read such texts, they look rather odd).
Also agree entirely with Vittorio, and just want to add another layer to the problem: the epic.com example uses https, and while GeoTrust and at least one other CA stopped issuing automated certificates for IDNs some time ago for other reasons, other CAs can be expected to follow this trend.
Displaying some details about the domain/certificate owner (see my previous message) would seem to be more useful than showing an IDN as an impenetrable xn-- label. The former works for phishing attacks against any script, the latter is only useful for people who can be expected to work entirely without IDNs.
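For readers who have not seen what the xn-- form looks like, here is a small Python sketch of the conversion in both directions. It uses Python's built-in "idna" codec, which implements the older IDNA 2003 rules (IDNA 2008 needs the third-party idna package); the helper names are made up for this illustration.

```python
def to_a_label(u_label):
    """Convert a Unicode label to its ASCII (xn--) form."""
    return u_label.encode("idna").decode("ascii")


def to_u_label(a_label):
    """Convert an ASCII (xn--) label back to its Unicode form."""
    return a_label.encode("ascii").decode("idna")


if __name__ == "__main__":
    brand = "политика"  # the Cyrillic brand mentioned earlier in the thread
    ascii_form = to_a_label(brand)
    print(ascii_form)              # an opaque xn-- string
    print(to_u_label(ascii_form))  # round-trips to the Cyrillic label
    print(to_a_label("epic"))      # pure ASCII labels pass through unchanged
```

The ASCII form tells a Cyrillic reader nothing about which brand it stands for, which is the point made above.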