[Latingp] Variant cross-script analysis worksheets

Thu May 31 13:14:23 UTC 2018

It would be helpful if we could get the IP to issue a statement on the points you are raising below.

-Dennis

From: Bill Jouris <bill.jouris at insidethestack.com>
Reply-To: "bill.jouris at insidethestack.com" <bill.jouris at insidethestack.com>
Date: Wednesday, May 30, 2018 at 5:24 PM
To: Dennis Tan Tanaka <dtantanaka at verisign.com>, Meikal Mumin <meikal.mumin at uni-koeln.de>, Michael Bauland <Michael.Bauland at knipp.de>, Sarmad Hussain <sarmad.hussain at icann.org>
Cc: Latin GP <LatinGP at icann.org>
Subject: [EXTERNAL] Re: [Latingp] Variant cross-script analysis worksheets

And yet, in speaking with members of the IP (at San Juan), on the subject of the Least Astonishment Principle, what they said was "We are looking to the Generation Panels for guidance."

And said further (on the matter of the breve and caron) "When I am typing something that includes one of them, I have to copy and paste because I can't tell which I am looking at."  In short, while the two diacritics are clearly not identical, at least with sufficient magnification, as far as he was concerned there was no reason that they could not be classified by us as variants.

What constitutes "identical in appearance" depends enormously on just how much magnification is assumed.  The rationale for assuming anything larger than 12 point type is not at all obvious.

Bill

On Wed, May 30, 2018 at 1:08 PM, Tan Tanaka, Dennis
<dtantanaka at verisign.com> wrote:

From: Meikal Mumin <meikal.mumin at uni-koeln.de>
Date: Tuesday, May 29, 2018 at 10:14 AM
To: Bill Jouris <bill.jouris at insidethestack.com>, Dennis Tan Tanaka <dtantanaka at verisign.com>, Michael Bauland <Michael.Bauland at knipp.de>, Sarmad Hussain <sarmad.hussain at icann.org>
Cc: Latin GP <LatinGP at icann.org>
Subject: [EXTERNAL] Re: [Latingp] Variant cross-script analysis worksheets

My conclusion is that it is more complex than reducing things to "homoglyphs", but I do not think that (at least linguistically) we have a strong definition of homoglyphs.

On homoglyphs, the Latin GP has received the following guidance from the IP, in writing and verbally (during the Brussels workshop):

“In the context of the Root Zone, the Procedure is quite clear in that it considers simple similarity of appearance to be outside the scope of the Root Zone LGR. In admitting exact homoglyphs, the IP has been making the argument that ‘e’ in Latin (U+0065) and ‘е’ in Cyrillic (U+0435) are not just visually indistinguishable, but that their distinct code points effectively represent a disunification by script property.” – Email from IP to Latin GP of 18 October 2017 in response to our draft Principles for Inclusion and Exclusion of Code Points in Latin Script for the Root Zone, and in particular to our Analysis of Variants in the Latin Script for the Root Zone.

“The kinds of variants to be defined in the Root Zone LGR are limited to homoglyphs, which are characters with essentially identical appearance by design, instead of merely similar appearance.” – Integration Panel feedback to Latin GP proposal of 22 March 2017.
