[Gnso-epdp-idn-team] Call for Volunteers - String Similarity Review Small Group

Tan Tanaka, Dennis dtantanaka at verisign.com
Tue May 3 16:47:04 UTC 2022

Hello Ariel, et al

Just wanted to correct the record with respect to the allocatable variants in the Latin script. The two code points that have allocatable variants are Sharp S (U+00DF) and Dotless I (U+0131). Colloquially they are referred to as ‘German Sharp S’ and ‘Turkish I’. The rationale for the variant relationships, however, derives not from language but from stability issues: legacy IDNA2003 behavior for Sharp S, and locale-specific Unicode uppercase/lowercase behavior for Dotless I.
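
As a rough illustration of both stability issues, here is a minimal Python sketch. It is not taken from the RZ-LGR; the helper name case_round_trip is made up for this example, and it assumes Python's built-in "idna" codec, which follows the older IDNA2003 rules (RFC 3490).

```python
def case_round_trip(s: str) -> str:
    """Uppercase, then lowercase, using Unicode's default
    (locale-independent) case mappings. Hypothetical helper."""
    return s.upper().lower()

# Dotless I (U+0131) uppercases to plain "I" (U+0049), which by default
# lowercases to dotted "i" (U+0069), so the original code point is lost.
print(repr(case_round_trip("\u0131")))

# Sharp S (U+00DF) uppercases to "SS", which lowercases to "ss".
print(repr(case_round_trip("\u00df")))

# Python's built-in "idna" codec implements IDNA2003, where Sharp S is
# mapped to "ss" before the label is encoded.
print("stra\u00dfe".encode("idna"))
```

In both cases a label containing the code point and a label containing its plain-ASCII counterpart can end up treated as the same name, regardless of which language the label is written in.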

The Latin script is used by many languages around the world, and each of these languages may have its own conventions, such as alternate spellings. In general, language-specific ‘rules’ were not considered sufficient grounds for defining variant relationships.


From: Gnso-epdp-idn-team <gnso-epdp-idn-team-bounces at icann.org> on behalf of Ariel Liang <ariel.liang at icann.org>
Date: Monday, May 2, 2022 at 9:48 AM
To: "gnso-epdp-idn-team at icann.org" <gnso-epdp-idn-team at icann.org>, "gnso-secs at icann.org" <gnso-secs at icann.org>
Subject: [EXTERNAL] [Gnso-epdp-idn-team] Call for Volunteers - String Similarity Review Small Group

Dear EPDP Team,

During our meeting on 28 April 2022, there was support for forming a small group to develop examples to assist the EPDP Team’s deliberation on charter questions E3, E1, and E3a. These questions are related to whether/how the string similarity review should be adjusted due to the implementation of variants.

Specifically, the small team is expected to develop concrete examples of strings and variants that are visually confusable, and demonstrate how they would be compared against each other based on the three levels of string similarity review<https://community.icann.org/download/attachments/192217199/EPDP%20Team%20Meeting%20%2331%20Slides%20-%20ccPDP4%20update%2C%20E5.pdf?version=1&modificationDate=1651255806555&api=v2>. The goal of this exercise is to help transform a largely academic discussion of abstract concepts into a more comprehensible discussion with practical examples. This may help the Team better analyze the three levels’ impact on string similarity review, as well as their potential consequences.

Anyone on the EPDP Team is welcome to volunteer to be part of the small team. Participation of members with familiarity or expertise in languages / scripts that have variants would be highly appreciated.

As a refresher, the languages/scripts that have both allocatable and blocked variants according to the RZ-LGR are:
· Arabic
· Bengali
· Chinese
· Greek
· Latin (German and Turkish have allocatable variants)
· Myanmar
· Tamil

The languages/scripts that only have blocked variants are:
· Armenian
· Cyrillic
· Devanagari
· Ethiopic
· Gurmukhi
· Hebrew
· Japanese
· Kannada
· Khmer
· Korean
· Malayalam
· Oriya
· Sinhala
· Telugu

If you are interested in being part of this small group, please email gnso-secs at icann.org by EOB Friday, 6 May.

Thank you,

Steve, Emily, Ariel
