[Gnso-epdp-idn-team] [Ext] Re: Call for Volunteers - String Similarity Review Small Grou

Ariel Liang ariel.liang at icann.org
Tue May 3 18:11:20 UTC 2022


Thank you for the correction and the insight, Dennis.

Indeed, Sharp S and Dotless I are the only two code points in the Latin script that have allocatable variants. We realize that it is not accurate to say that German and Turkish broadly have allocatable variants. We mentioned these languages in the hope that someone on the EPDP Team who is familiar with them may be able to help develop examples of confusingly similar variant labels containing those code points, in addition to other examples.

Best Regards,
Ariel

From: "Tan Tanaka, Dennis" <dtantanaka at verisign.com>
Date: Tuesday, May 3, 2022 at 12:47 PM
To: Ariel Liang <ariel.liang at icann.org>, "gnso-epdp-idn-team at icann.org" <gnso-epdp-idn-team at icann.org>, "gnso-secs at icann.org" <gnso-secs at icann.org>
Subject: [Ext] Re: [Gnso-epdp-idn-team] Call for Volunteers - String Similarity Review Small Grou

Hello Ariel, et al

Just wanted to correct the record with respect to the allocatable variants in the Latin script. The two code points that have allocatable variants are Sharp S (00DF) and Dotless I (0131). Colloquially they are referred to as ‘German Sharp S’ and ‘Turkish I’. The rationale for the variant relationship, however, is not derived from language; it stems from stability issues regarding legacy IDNA2003 behavior for Sharp S and Unicode uppercase/lowercase locale-specific behavior for Dotless I.
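
As an illustration of the two stability issues mentioned above, here is a minimal Python sketch. It assumes that Python's built-in `idna` codec, which implements the older IDNA2003 rules, is an acceptable stand-in for "legacy IDNA2003 behavior", and that Python's locale-independent `str.upper()` / `str.lower()` stand in for default Unicode case mapping. The helper names and the sample label are hypothetical and are not taken from the RZ-LGR.

```python
SHARP_S = "\u00df"    # LATIN SMALL LETTER SHARP S
DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I


def idna2003_ascii(label: str) -> bytes:
    """Convert a label with Python's built-in codec, which follows IDNA2003."""
    return label.encode("idna")


def case_round_trip(text: str) -> str:
    """Uppercase, then lowercase, using locale-independent default mappings."""
    return text.upper().lower()


# Under IDNA2003 the Sharp S is mapped to "ss" before lookup, so a label
# containing it and the same label spelled with "ss" resolve identically.
print(idna2003_ascii("stra" + SHARP_S + "e"))

# Dotless I uppercases to plain "I", which lowercases to dotted "i" outside
# a Turkish locale, so the original code point is lost in a case round trip.
print(DOTLESS_I.upper(), case_round_trip(DOTLESS_I) == DOTLESS_I)
```

In both cases two distinct code point sequences can end up being treated as the same label, which is the kind of stability concern described above rather than a German or Turkish spelling rule.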

The Latin script is used by many languages around the world, and each of these languages may have some language-specific conventions as far as alternate spellings, etc. In general, language-specific ‘rules’ were not sufficient to be considered for variant relationships.

Best,
Dennis

From: Gnso-epdp-idn-team <gnso-epdp-idn-team-bounces at icann.org> on behalf of Ariel Liang <ariel.liang at icann.org>
Date: Monday, May 2, 2022 at 9:48 AM
To: "gnso-epdp-idn-team at icann.org" <gnso-epdp-idn-team at icann.org>, "gnso-secs at icann.org" <gnso-secs at icann.org>
Subject: [EXTERNAL] [Gnso-epdp-idn-team] Call for Volunteers - String Similarity Review Small Grou




Dear EPDP Team,


During our meeting on 28 April 2022, there was support for forming a small group to develop examples to assist the EPDP Team’s deliberation on charter questions E3, E1, and E3a. These questions are related to whether/how the string similarity review should be adjusted due to the implementation of variants.


Specifically, the small team is expected to develop concrete examples of strings and variants that are visually confusable, and demonstrate how they would be compared against each other based on the three levels of string similarity review <https://community.icann.org/download/attachments/192217199/EPDP%20Team%20Meeting%20%2331%20Slides%20-%20ccPDP4%20update%2C%20E5.pdf?version=1&modificationDate=1651255806555&api=v2>. The goal of this exercise is to help transform a largely academic discussion of abstract concepts into a more comprehensible discussion with practical examples. This may help the Team better analyze the three levels’ impact on string similarity review, as well as their potential consequences.


Anyone on the EPDP Team is welcome to volunteer to be part of the small team. Participation of members with familiarity or expertise in languages / scripts that have variants would be highly appreciated.


As a refresher, the languages/scripts that have both allocatable and blocked variants according to the RZ-LGR are:
·         Arabic
·         Bengali
·         Chinese
·         Greek
·         Latin (German and Turkish have allocatable variants)
·         Myanmar
·         Tamil


The languages/scripts that only have blocked variants are:
·         Armenian
·         Cyrillic
·         Devanagari
·         Ethiopic
·         Gurmukhi
·         Hebrew
·         Japanese
·         Kannada
·         Khmer
·         Korean
·         Malayalam
·         Oriya
·         Sinhala
·         Telugu


If you are interested in being part of this small group, please email gnso-secs at icann.org by EOB Friday, 6 May.


Thank you,

Steve, Emily, Ariel

