[UA-discuss] IANA IDN Tables

Elaine Pruis elainepruis at gmail.com
Mon Feb 26 21:06:04 UTC 2018


It must be over a year since I last looked at the IANA IDN tables and they
have grown massively in that time ➜ iana.org/domains/idn-tables

Actually, all new TLD applicants were required to submit their language
tables during the application window.  Part of the pre-delegation test is
to test the registry function against the rulesets in the tables.
The Registries have a contractual obligation to ICANN to publish the
tables. That is a task managed by IANA.
IANA has finally caught up with the publication of these tables, and that
is why there appears to have been massive growth in a short period of time.
I am a member of the CSC (tasked with oversight of the IANA function), and
we are now monitoring the "speed" at which submitted tables get approved
and then published on this site.


On Mon, Feb 26, 2018 at 11:48 AM, Asmus Freytag <asmusf at ix.netcom.com>
wrote:

> On 2/26/2018 7:51 AM, Andrew Sullivan wrote:
>
>> On Mon, Feb 26, 2018 at 01:47:56PM +0300, Maxim Alzoba wrote:
>>
>>> As I understand it, all IDN tables that passed the 2012 application
>>> round may still be used despite any changes in the LGR (that was part
>>> of the discussion when LGRs were established).
>>>
>> Correct.  But some people are updating in line with LGR tools already
>> -- particularly when their communities are sensitive to the issues of
>> general-purpose domains and need conflicting uses of the same script.
>> This applies (for instance) to Han, certainly Latin and Arabic, and
>> probably Cyrillic (and maybe even Latin, Cyrillic, and Greek, though I
>> know of nobody who's been that careful yet).
>>
>> The previous "variants" approach derived from the JET work made the
>> distinction between "blocked" and "allocatable" less plain than it is
>> now, and the inter-writing-system effects of characters are also now
>> plainer and so easier to represent.  The JET approach worked quite
>> well for CJK when used in relative isolation, but has limitations when
>> applied more generally, which is why the new approach was worked out.
>>
>
> One interesting development for Chinese is a clever bit of tweaking of the
> algorithm that defines the set of "allocatable" variants.
>
> The original JET approach was intended to lead to at most three possible
> labels: one all-simplified, one all-traditional label plus one mixed label
> (as applied for).
>
> Because some code points have more than one simplified or more than one
> traditional variant, a simple-minded scheme would allow a combinatorial
> explosion of allocatable labels in some cases.
>
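
The explosion described above can be illustrated with a minimal sketch. The
variant table below is entirely hypothetical (placeholder letters stand in
for Han code points); it only shows how a naive scheme multiplies the
choices available at each position of a label.

```python
from itertools import product
from math import prod

# Hypothetical variant table, for illustration only: each code point maps
# to every form it may take, itself included.
VARIANTS = {
    "a": ["a", "A", "α"],
    "b": ["b", "B"],
    "c": ["c"],
}

def variant_labels(label):
    """Enumerate every label a naive scheme would produce by substituting
    variants independently at each position."""
    choices = [VARIANTS.get(ch, [ch]) for ch in label]
    return ["".join(combo) for combo in product(*choices)]

def variant_count(label):
    """The same count without enumeration: the product of the number of
    choices at each position."""
    return prod(len(VARIANTS.get(ch, [ch])) for ch in label)

labels = variant_labels("abca")
print(len(labels), variant_count("abca"))  # 18 18
```

A four-character label already yields 3 * 2 * 1 * 3 = 18 candidates here,
which is the growth the new algorithm is said to cap.
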
> The new algorithm is able to limit the number of allocatable labels in
> these cases to four; fewer in the general case.
>
> This would be a big win, as keeping the number of allocatable variants
> small has benefits, especially as the number of allocatable FQDNs is the
> product of the numbers of allocatable labels at each level.
>
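
To make the FQDN arithmetic concrete, here is a small sketch. It assumes
that every level of the name independently has its own set of allocatable
labels, so the counts multiply; the function name and the three-level
example are illustrative only.

```python
from math import prod

def allocatable_fqdn_count(labels_per_level):
    """Number of allocatable FQDNs, given the number of allocatable labels
    at each level of the name (assumes the levels vary independently)."""
    return prod(labels_per_level)

# Hypothetical three-level name with the caps mentioned in the thread:
print(allocatable_fqdn_count([3, 3, 3]))  # 27 (at most three labels per level)
print(allocatable_fqdn_count([4, 4, 4]))  # 64 (at most four labels per level)
```

Even a small per-label cap compounds quickly across levels, which is why
keeping the per-label number low matters.
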
> Embedding the reduction into the algorithm has the advantage of making
> the set of allocatable labels predictable (by evaluating the label against
> the LGR).
> The LGR would fully conform to RFC 7940.
>
> The number of blocked variants is still defined by the permutation of all
> variants that aren't allocatable. For some labels, the numbers can be
> formidable, but fortunately, there is no need to enumerate them, even for
> collision testing.
>
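
The message does not say how collision testing avoids enumeration. One
common way, sketched here as an assumption, is to reduce each label to a
canonical "index" form and compare those. This only works if the variant
relation is symmetric and transitive, so that code points fall into
disjoint variant sets. The sets below are hypothetical placeholders.

```python
# Hypothetical variant sets (assumed disjoint equivalence classes).
VARIANT_SETS = [{"a", "A", "α"}, {"b", "B"}]

# Map every member of a set to one representative (the smallest code point).
CANONICAL = {ch: min(s) for s in VARIANT_SETS for ch in s}

def index_label(label):
    """Replace each code point by its set's representative."""
    return "".join(CANONICAL.get(ch, ch) for ch in label)

def collides(candidate, registered):
    """True if the candidate is a variant of any registered label,
    decided without generating a single variant label."""
    return index_label(candidate) in {index_label(r) for r in registered}

print(collides("αBc", ["abc"]))  # True
print(collides("abd", ["abc"]))  # False
```

The cost is one pass over each label, regardless of how many blocked
variants the label has.
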
> However, even the largest set of blocked variants pales compared to the
> immense size of the namespace: (20,000 code points) to the power of
> (maximal number of code points in a U-label).
>
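
A rough sense of scale, as a sketch: the 20,000 figure comes from the
message above, while the label length and the number of variants per
position are hypothetical values chosen only for illustration (the message
gives no maximal U-label length).

```python
CODE_POINTS = 20_000         # repertoire size quoted in the message
LENGTH = 10                  # hypothetical label length, in code points
VARIANTS_PER_POSITION = 4    # hypothetical, deliberately generous

namespace = CODE_POINTS ** LENGTH
variant_set = VARIANTS_PER_POSITION ** LENGTH

print(f"{namespace:.3e}")    # 1.024e+43
print(variant_set)           # 1048576
print(f"fraction of namespace: {variant_set / namespace:.3e}")
```

Under these assumptions, a variant set of about a million labels is still
a vanishingly small fraction of the labels of that length.
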
> I believe the Chinese Generation Panel is planning a presentation of the
> scheme at ICANN61.
>
> A./
>
>