[Latingp] Digraphs

Meikal Mumin meikal.mumin at uni-koeln.de
Sat May 14 09:50:10 UTC 2016


Dear colleagues,

so that clarifies that question - thanks Abdeslam.

Coming back to your questions, Chris: I believe combining marks could be
excluded, as was done in the case of the Arabic LGR. Meanwhile, cases like ij
could be declared variants of the sequence i + j, provided we see a need
for including the former.
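
To make the idea concrete, here is a minimal sketch of how variant labels
might be enumerated if the ligature U+0133 and the sequence i + j were
declared variants of each other. The function name variant_labels and the
two constants are hypothetical, chosen for illustration only; nothing here
is taken from an existing LGR tool.

```python
from itertools import product

SEQUENCE = "ij"       # U+0069 U+006A
LIGATURE = "\u0133"   # LATIN SMALL LIGATURE IJ

def variant_labels(label):
    """Return every label obtained by swapping the sequence and the ligature."""
    # Start from the fully decomposed spelling so both inputs behave alike.
    base = label.replace(LIGATURE, SEQUENCE)
    parts = base.split(SEQUENCE)
    slots = len(parts) - 1  # number of places where a choice exists
    results = set()
    for choice in product((SEQUENCE, LIGATURE), repeat=slots):
        out = parts[0]
        for filler, part in zip(choice, parts[1:]):
            out += filler + part
        results.add(out)
    return results

print(sorted(variant_labels("vijf")))
```

For the Dutch word vijf this yields two labels, one spelled with the
sequence and one with the ligature. A label with two occurrences of the
sequence yields four.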

If ligatures are not part of MSR-2, then I assume the problem has solved
itself.

Best,

Meikal


2016-05-11 22:27 GMT+02:00 Abdeslam Nasri <abdeslam.nasri at gmail.com>:

> Dear Chris and Colleagues,
>
>
> Digraphs or more generally sequences of code points, can be specified as
> variants of a single code point.
>
> An excerpt from the LAGER specification :
>
> " A sequence of multiple code points can be specified as a variant of a
>    single code point.  For example, the sequence of LATIN SMALL LETTER O
>    (U+006F) then LATIN SMALL LETTER E (U+0065) might hypothetically be
>    specified as a variant for an LATIN SMALL LETTER O WITH DIAERESIS
>    (U+00F6) as follows:
>
>        <char cp="00F6">
>            <var cp="006F 0065"/>
>        </char>
>
> "
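
As an illustration of how such a declaration could be read by a program, the
following sketch parses the fragment quoted above with Python's standard
xml.etree.ElementTree module. The wrapping <data> element, the absence of an
XML namespace, and the helper names parse_cp and read_variants are
assumptions made for this example; a complete LGR file has more structure
than is shown here.

```python
import xml.etree.ElementTree as ET

# The excerpt from the specification, wrapped in a single root element
# so that it parses as a standalone document (an assumption of this sketch).
FRAGMENT = """
<data>
  <char cp="00F6">
    <var cp="006F 0065"/>
  </char>
</data>
"""

def parse_cp(attr):
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(h, 16)) for h in attr.split())

def read_variants(xml_text):
    """Map each <char> sequence to the list of its <var> sequences."""
    root = ET.fromstring(xml_text)
    table = {}
    for char in root.iter("char"):
        source = parse_cp(char.get("cp"))
        table[source] = [parse_cp(v.get("cp")) for v in char.findall("var")]
    return table

variants = read_variants(FRAGMENT)
print(variants)
```

The resulting table maps U+00F6 to the two-code-point sequence "oe", which is
the one-to-many case asked about in this thread.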
>
> In the typical case of digraphs, these are called the precomposed and
> decomposed forms of a single letter. A normalization should exist in
> Unicode in order to allow these variants; otherwise they should be blocked.
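
Whether Unicode relates a given precomposed code point to a sequence can be
checked with Python's standard unicodedata module. The sketch below is only
a quick probe, and the helper name forms is hypothetical. It shows that the
ij ligature (U+0133) maps to i + j under the compatibility form NFKC only,
and that normalization decomposes U+00F6 into o plus a combining diaeresis,
not into the hypothetical "oe" of the specification's example.

```python
import unicodedata

def forms(text):
    """Return the NFC, NFD and NFKC normalizations of a string."""
    return {f: unicodedata.normalize(f, text) for f in ("NFC", "NFD", "NFKC")}

ij = forms("\u0133")      # LATIN SMALL LIGATURE IJ
o_uml = forms("\u00f6")   # LATIN SMALL LETTER O WITH DIAERESIS
ae = forms("\u00e6")      # LATIN SMALL LETTER AE

for name, result in (("ij", ij), ("o-diaeresis", o_uml), ("ae", ae)):
    # Print code points so that invisible differences become visible.
    print(name, {f: [f"U+{ord(c):04X}" for c in s] for f, s in result.items()})
```

In this probe, U+00E6 is unchanged by all three forms, so a variant
relationship between it and a + e would have to be declared explicitly.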
>
>
> Kind Regards,
> Abdeslam NASRI
>
>
>
> 2016-05-09 15:43 GMT+02:00 Dillon, Chris <c.dillon at ucl.ac.uk>:
>
>> Dear Meikal,
>>
>>
>>
>> Thank you for your thoughts on digraphs.
>>
>>
>>
>> In that case, we would have blocked variants like i, dotless i and iota,
>> where an application for a label containing one would block applications
>> for labels containing any of the others.
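
The blocking behaviour described here can be sketched as follows. This is a
simplified model under stated assumptions, not an actual registry
implementation: the variant set, the Registry class and the label_key
function are all hypothetical, and only single code point variants are
handled.

```python
# Hypothetical variant set: i, dotless i (U+0131), Greek iota (U+03B9).
VARIANT_SETS = [{"i", "\u0131", "\u03b9"}]

def build_index(variant_sets):
    """Map every member of a variant set to one representative code point."""
    index = {}
    for members in variant_sets:
        representative = min(members)  # lowest code point stands for the set
        for cp in members:
            index[cp] = representative
    return index

INDEX = build_index(VARIANT_SETS)

def label_key(label):
    """Replace each code point by its representative to get a comparison key."""
    return "".join(INDEX.get(ch, ch) for ch in label)

class Registry:
    """Accepts the first label for a key and blocks later variant labels."""

    def __init__(self):
        self._taken = {}

    def apply(self, label):
        key = label_key(label)
        if key in self._taken:
            return False  # blocked by an earlier application
        self._taken[key] = label
        return True

registry = Registry()
print(registry.apply("pin"), registry.apply("p\u0131n"), registry.apply("p\u03b9n"))
```

Once "pin" has been applied for, the labels spelled with dotless i or iota
are refused, because all three share the same comparison key.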
>>
>>
>>
>> We would also have blocked variants, such as digraphs like ij, which could
>> never be allocated at all. If we need to do this, it will be necessary to
>> describe variants for ligature code points we have not yet analysed in the
>> Latin ranges, as they aren’t in MSR-2.
>>
>>
>>
>> (This distinction is what I was finding difficult during the face-to-face
>> meeting in Marrakech.)
>>
>>
>>
>> Incidentally, I’m fairly sure two code points could be a variant of one.
>> (I wonder what happens with the Arabic ligature of laam and alif that
>> looks like Greek gamma; in Urdu the two do not combine so closely, if at
>> all.)
>>
>>
>>
>> Regards,
>>
>>
>>
>> Chris.
>>
>> --
>>
>> Research Associate in Linguistic Computing, Centre for Digital
>> Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int
>> 31599) www.ucl.ac.uk/dis/people/chrisdillon
>>
>>
>>
>> *From:* Meikal Mumin [mailto:meikal.mumin at uni-koeln.de]
>> *Sent:* 09 May 2016 09:38
>> *To:* Dillon, Chris <c.dillon at ucl.ac.uk>
>> *Cc:* latingp at icann.org
>> *Subject:* Re: [Latingp] Digraphs
>>
>>
>>
>> Dear Chris and colleagues,
>>
>>
>>
>> apologies for the late reply. I believe we don't need to exclude
>> digraphs. We could simply set them up as variants, e.g. ij as an equivalent
>> of i + j. It could be useful to verify with the IP whether it is possible to
>> declare a sequence of two code points as a variant of one; we had not
>> encountered such a case with the Arabic script.
>>
>>
>>
>> Best wishes,
>>
>>
>>
>> Meikal
>>
>>
>>
>> 2016-03-29 9:54 GMT+02:00 Dillon, Chris <c.dillon at ucl.ac.uk>:
>>
>> Dear colleagues,
>>
>>
>>
>> Mirjana’s recent research on Montenegrin has raised some interesting
>> issues.
>>
>>
>>
>> One of them is digraphs.
>>
>> Currently we have digraphs like æ and œ in our repertoire, but Dutch ij
>> (U+0133) as in vijf ‘five’ is white in MSR-2 (not compatible with IDNA
>> 2008). Certainly many digraphs, including ij, are visually similar to their
>> component letters. We could consider adding all digraphs to the list of
>> criteria for exclusion, or adding them with exceptions (less good from a
>> usability point of view). Incidentally, ß and & are probably excluded for
>> other reasons, Longevity Principle and Punctuation, respectively.
>>
>>
>>
>> What do you think?
>>
>>
>>
>> (Also asked in French:) What should we do with the digraphs in our
>> repertoire: allow them or not?
>>
>>
>>
>> Regards,
>>
>>
>>
>> Chris.
>>
>> _______________________________________________
>> Latingp mailing list
>> Latingp at icann.org
>> https://mm.icann.org/mailman/listinfo/latingp
>>
>>
>
>
> --
> Cordialement,
> Abdeslam NASRI
>