[Latingp] From IP: Diacritics below a security risk?

Tan Tanaka, Dennis dtantanaka at verisign.com
Fri Aug 31 20:42:55 UTC 2018


Mirjana, et al

Below is a proposed response to IP


From: Latin GP
To: IP

The Latin GP appreciates the additional input received on August 29, 2018, titled “Diacritics below a security risk?”.

In its feedback, the IP makes a case for potential security risks when certain diacritics below (e.g. dot below and line below) are used in a domain name. The risk is not always apparent, but it reveals itself when the diacritic below is obscured by an underline, a typical formatting feature of hyperlinks. The IP asserts: “Of all diacritics, diacritics below can be difficult to distinguish or be prone to clipping”.

Security risks deserve a place in our analysis, so the Latin GP is committed to explicitly discussing this matter, resolving whether diacritics below constitute a security risk to the Root Zone, and deciding on a solution vis-à-vis the LGR.

On a related note, the Latin GP would like to get additional clarification regarding some of IP’s statements from the August 29 email:

“It can be argued users have no working understanding of typography and would not reliably interpret small gaps or bulges in the underline as being related to an unfamiliar code point”.

“The IP would like to encourage the Latin GP […] to explicitly examine [the diacritics below] example and other cases like it, where code points can become indistinguishable in common usage scenarios for IDNs”.

Some GP members infer that the IP is welcoming visual similarity or confusability as a criterion for variant definition (e.g. letters with acute and grave diacritics could be deemed indistinguishable, and therefore variants). Other members do not agree with that reasoning. In this context, the Latin GP wants to confirm that prior guidance from the IP (below), which we find consistent with the LGR Procedure, is not at odds with the August 29 email.

LGR Procedure:
“Finally, in investigating the possible variant relations, Generation Panels should ignore cases where the relation is based exclusively on aspects of visual similarity.”


October 18, 2017: Principles for repertoire and variants – Feedback from IP

“In the context of the Root Zone, the Procedure is quite clear in that it considers simple similarity of appearance to be outside the scope of the Root Zone LGR.”... “Having the Root Zone exhibit fundamentally different design decisions with respect to variants than those found on the second level would have to be justified by strong arguments based on factors special to the Root Zone.”


March 22, 2017: Latin GP Proposal: IP Feedback

“The kinds of variants to be defined in the Root Zone LGR are limited to homoglyphs, which are characters essentially identical appearance by design, instead of merely similar appearance.”


Sincerely,
Latin GP


From: Latingp <latingp-bounces at icann.org> on behalf of Sarmad Hussain <sarmad.hussain at icann.org>
Date: Wednesday, August 29, 2018 at 2:58 AM
To: Latin GP <latingp at icann.org>
Subject: [EXTERNAL] [Latingp] From IP: Diacritics below a security risk?

Dear Latin GP members

Kindly find below some feedback from IP for your consideration.

Regards
Sarmad



TO: LatinGP
FROM: IP

There are recent and widely published examples of phishing attacks using Latin IDNs in which the key features involved were diacritics below the letter. Here is an example:

[cid:part1.E6E9F88C.32B00687 at ix.netcom.com]

Of all diacritics, diacritics below can be difficult to distinguish or be prone to clipping -- there is less space below the baseline than between the typical lowercase glyph and the top of the line.

The example given above shows a further interaction with URL underlining - and not all display engines actually do as nice a job interrupting the underline as in the screen shot above. For example, here is how one system will render this (using a designated UI font - Segoe UI):

[cid:part2.59964F92.16B1BC0B at ix.netcom.com]

Note that this code point (U+1E33 LATIN SMALL LETTER K WITH DOT BELOW) is in the MSR, as is U+1E35 LATIN SMALL LETTER K WITH LINE BELOW.
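As an illustration only (not part of the IP's feedback), the two code points named above can be inspected with Python's standard `unicodedata` module. The sketch below prints each character's Unicode name and its canonical decomposition; the helper name `describe` is hypothetical. It says nothing about MSR membership, which is defined by the MSR itself rather than by the Unicode character database.

```python
import unicodedata


def describe(code_point):
    """Return (name, decomposition names) for one code point.

    Hypothetical helper for illustration; it only consults the
    Unicode character database bundled with Python.
    """
    ch = chr(code_point)
    # NFD splits a precomposed letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", ch)
    parts = [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in decomposed]
    return unicodedata.name(ch), parts


if __name__ == "__main__":
    for cp in (0x1E33, 0x1E35):
        name, parts = describe(cp)
        print(f"U+{cp:04X} {name}")
        for part in parts:
            print(f"    {part}")
```

Both letters decompose to a plain "k" followed by a single combining mark below the baseline, so the mark is the only thing that separates either label from one containing an unmarked "k".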

[cid:part3.99B9649E.C3C335FC at ix.netcom.com]

The second example contains U+1E35. While the effect does not show equally at all type sizes, at 12pt and below the LINE BELOW is reliably hidden. Here are the two examples at 10pt:

[cid:part4.B43CE3FD.8C84EACF at ix.netcom.com]

The issue is not limited to "K". We see "B", "D", "L" and "N" with both DOT and LINE BELOW and "M" and "H" with DOT BELOW, all on the same page in the MSR.
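To get a rough sense of how many letters are affected, one could enumerate the basic Latin lowercase letters that have a precomposed form with a dot or line below. The following is a minimal sketch using only Python's standard library; the function name `precomposed_below` and the `MARKS` table are hypothetical. Note the assumption: it reports what exists in Unicode, which is a larger set than the MSR, so the output should not be read as a list of MSR code points.

```python
import string
import unicodedata

# Combining marks that yield the "DOT BELOW" and "LINE BELOW" letters.
# (The LINE BELOW letters decompose to U+0331 COMBINING MACRON BELOW.)
MARKS = {
    "DOT BELOW": "\u0323",
    "LINE BELOW": "\u0331",
}


def precomposed_below(mark):
    """Map each a-z letter to its precomposed form with `mark`, if any.

    Hypothetical helper: a letter counts as having a precomposed form
    when NFC composes base + mark into a single code point.
    """
    found = {}
    for base in string.ascii_lowercase:
        composed = unicodedata.normalize("NFC", base + mark)
        if len(composed) == 1:
            found[base] = composed
    return found


if __name__ == "__main__":
    for label, mark in MARKS.items():
        letters = precomposed_below(mark)
        listing = ", ".join(
            f"{base} (U+{ord(ch):04X})" for base, ch in letters.items()
        )
        print(f"{label}: {listing}")
```

A GP reviewing the issue would still need to intersect such a list with the MSR and with the repertoire actually proposed for the LGR.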

It can be argued users have no working understanding of typography and would not reliably interpret small gaps or bulges in the underline as being related to an unfamiliar code point. This appears to make all diacritics below security-sensitive; however, the initial determination belongs to the relevant GPs.

Note, by the way, that the Devanagari LGR treats sequences containing NUKTA (a dot below) as variants in at least some cases, and recent community comments for that script are calling for more variant sequences. However, while the feature is graphically analogous (dot below), each script works differently and there is no single a priori solution.

The IP would like to encourage the LatinGP (and any other GP facing cases like this) to explicitly examine this example and other cases like it, where code points can become indistinguishable in common usage scenarios for IDNs, and formally conclude whether and how to take these into account when designing their LGR.

At this point, the IP would expect the GP to:

* explicitly discuss this and other scenarios like it

* evaluate whether they constitute a security risk to the Root Zone

* come up with a reasoned decision as to whether and how to address them in the design of the Latin LGR; and finally

* document both the decision and its rationale.

In coming to a decision, the GP may resolve:

1) to make them variants

2) to list them for attention as confusable

3) to take no action, because the GP feels that they do not represent a special security risk.

As part of the review of the Latin LGR, the IP will look at the background and rationale offered by the Latin GP in coming to its conclusion; note that if the IP feels that the facts considered and rationale documented do not support the conclusion reached by the GP it may raise objections at that time.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 131067 bytes
Desc: image001.png
URL: <http://mm.icann.org/pipermail/latingp/attachments/20180831/5c73629f/image001-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 878 bytes
Desc: image002.png
URL: <http://mm.icann.org/pipermail/latingp/attachments/20180831/5c73629f/image002-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.png
Type: image/png
Size: 939 bytes
Desc: image003.png
URL: <http://mm.icann.org/pipermail/latingp/attachments/20180831/5c73629f/image003-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image004.png
Type: image/png
Size: 777 bytes
Desc: image004.png
URL: <http://mm.icann.org/pipermail/latingp/attachments/20180831/5c73629f/image004-0001.png>