[UA-discuss] Fw: Re: IDN Implementation Guidelines [RE: Re : And now about phishing...]

Sun Apr 23 06:08:50 UTC 2017

On 4/22/2017 9:24 PM, ajay at data.in wrote:
> Take a look at this paragraph. Can you read what it says? All the 
> letters have been jumbled (mixed). Only the first and last letter of 
> ecah word is in the right place:
>
> I cnduo't bvleiee taht I culod aulaclty uesdtannrd waht I was rdnaieg. 
> Unisg the icndeblire pweor of the hmuan mnid, aocdcrnig to rseecrah at 
> Cmabrigde Uinervtisy, it dseno't mttaer in waht oderr the lterets in a 
> wrod are, the olny irpoamtnt tihng is taht the frsit and lsat ltteer 
> be in the rhgit pclae. The rset can be a taotl mses and you can sitll 
> raed it whoutit a pboerlm. Tihs is bucseae the huamn mnid deos not 
> raed ervey ltteer by istlef, but the wrod as a wlohe. Aaznmig, huh? 
> Yaeh and I awlyas tghhuot slelinpg was ipmorantt!
> Try out with friends. If they can that too.
>
> Some clue from above ? 

The clue from the above is that most people do not read 
"letter-by-letter" most of the time, but based on word-shape - and the 
latter is pretty resilient to alterations in sequences.

If we had limited identifiers to dictionary words, 90% of non-homograph 
spoofing would disappear, because many of the spoofs that look like 
words aren't in the dictionary.

If this weren't the case (and most of the jumbles were words 
themselves), you couldn't read the scrambled text above, because it 
would then look like a different text.
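
As a rough illustration of that point, here is a minimal Python sketch. 
The word list, swap_middle() and is_allowed() are all hypothetical 
stand-ins (a real check would need a real dictionary); the sketch only 
shows that most jumbles fall outside the list, with the occasional 
exception such as "from"/"form".

```python
# Hypothetical word list standing in for a real dictionary.
DICTIONARY = {"from", "form", "read", "that", "what", "word"}


def swap_middle(word):
    """Swap the two interior letters of a four-letter word."""
    return word[0] + word[2] + word[1] + word[3]


def is_allowed(label):
    """Hypothetical registration rule: accept dictionary words only."""
    return label in DICTIONARY


for w in sorted(DICTIONARY):
    jumble = swap_middle(w)
    # Print the word, its jumble, and whether the jumble would be accepted.
    print(w, jumble, is_allowed(jumble))
```

Under such a rule the jumbles "taht", "waht", "raed" and "wrod" would be 
rejected, while "form" would pass because it happens to be a word itself.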

We didn't adopt this, so we have to look at other means to defend 
against attacks.

The interesting thing is that the letter shapes still matter. Note that 
the example doesn't simply keep first and last and then substitute 
different letters.

That means that the use of diacritics, for example, remains highly 
distinctive, because the marks change the "outline" of the word. A 
likely exception to that is populations accustomed to treating 
diacritics as optional.

Note also that while you can figure out the intended content of the 
above text quickly (that is, you can "read" it, rather than having to 
decrypt it letter-by-letter), it is still immediately detectable as 
being "funny" (misspelled).

(Also, the test may be skewed towards English, because there are so many 
short words in English - all the one-, two- and three-letter words are 
retained unchanged, and the four-letter words have precisely one 
possible alteration).
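
To make the counting concrete, here is a minimal Python sketch. The 
function name scramble_variants() is my own, not from any library; it 
lists every distinct jumble of a word that keeps the first and last 
letter in place.

```python
from itertools import permutations


def scramble_variants(word):
    """Return every distinct jumble of `word` that keeps first and last letter."""
    if len(word) < 4:
        return set()  # fewer than two interior letters: nothing to rearrange
    first, middle, last = word[0], word[1:-1], word[-1]
    variants = {first + "".join(p) + last for p in permutations(middle)}
    variants.discard(word)  # the unscrambled word is not a jumble
    return variants


for w in ["a", "is", "the", "that", "word", "letter"]:
    # Print each word with the number of distinct jumbles it allows.
    print(w, len(scramble_variants(w)))
```

Words of up to three letters have no jumbles at all, and "that" and 
"word" each have exactly one ("taht", "wrod"). A four-letter word whose 
two middle letters are identical, such as "look", has none.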

Anything that you see in the example that you shared with us?

A./
