[Latingp] Variant cross-script analysis worksheets

Fri May 18 14:04:03 UTC 2018

Well, first of all, a string is not the same as a label, since the latter
is subject to further restrictions; I think a minimum length of 3
characters in the case of A-labels is one of them. Secondly, in the Arabic
LGR we defined 16 different sequences which cannot co-occur in labels,
although we did that in the form of WLEs rather than variants. I seem to
remember a discussion within the GP, and between the GP and the IP, in
which the IP explained that such confusabilities can be dealt with either
as variant rules or as whole label evaluation (WLE) rules. In any case,
RFC 8228 gives an example of variants for sequences that is exactly
parallel to some of the cross-script variant candidates I was suggesting:

17.  Variants for Sequences
<https://tools.ietf.org/html/rfc8228#section-17>

   Variant mappings can be defined between sequences or between a code
   point and a sequence.  For example, one might define a "blocked"
   variant between the sequence "rn" and the code point "m" because they
   are practically indistinguishable in common UI fonts.
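
As a rough illustration of the "rn"/"m" example quoted above, here is a
minimal Python sketch of a symmetric blocked variant between a sequence and
a single code point. The names VARIANTS, variant_labels and is_blocked are
hypothetical and are not taken from RFC 8228 or from any LGR tool; the
sketch also ignores contextual rules, WLEs and repertoire checks.

```python
from functools import lru_cache

# Hypothetical blocked variant mappings, declared in both directions
# so that the relation is symmetric. Keys and values are strings of
# one or more code points (a sequence or a single code point).
VARIANTS = {
    "rn": ("m",),
    "m": ("rn",),
}


def variant_labels(label, variants=VARIANTS):
    """Return every label reachable from `label`, including itself,
    by applying the mappings independently at each position."""

    @lru_cache(maxsize=None)
    def expand(i):
        # All possible rewritings of label[i:].
        if i == len(label):
            return frozenset({""})
        results = set()
        # Option 1: keep the original code point at position i.
        for tail in expand(i + 1):
            results.add(label[i] + tail)
        # Option 2: replace a source sequence that starts at position i.
        for source, targets in variants.items():
            if label.startswith(source, i):
                for tail in expand(i + len(source)):
                    for target in targets:
                        results.add(target + tail)
        return frozenset(results)

    return set(expand(0))


def is_blocked(candidate, registered, variants=VARIANTS):
    """True if `candidate` is a distinct variant label of `registered`."""
    return candidate != registered and candidate in variant_labels(registered, variants)
```

Under these assumptions, variant_labels("corn") contains "com", so
is_blocked("com", "corn") is true: the sequence on one side and the single
code point on the other are treated as the same label for blocking
purposes.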

Since we are discussing cross-script variants, I don't think WLEs will be
able to control them. That would mean we must deal with such confusable
characters or sequences of characters in the context of variants, and
therefore come up with a stronger criterion than 'not a "single" code
point in the repertoire'.

On 18 May 2018 at 14:22, Tan Tanaka, Dennis <dtantanaka at verisign.com> wrote:

> a single code point can be in a variant relationship with a sequence of
> code points or not? I was under the impression that the answer is yes...
>
> I would agree with you, only if such sequence is considered as a “single”
> code point in the repertoire, otherwise we are comparing a code point
> against a string (i.e. label)