[Latingp] Repertoire and Latin Extended A

Mats Dufberg mats.dufberg at iis.se
Mon Jul 23 13:52:28 UTC 2018


Mirjana,

Since I promised to review the second-level LGRs to see whether any of those languages use characters that we have not included, I have now done that. These are my findings (Bill and Meikal have at least looked at some of them):

U+00FF is included for French. Omniglot does not list it, but Wikipedia does.

U+0157 is included for Latvian. Omniglot does not list it. Wikipedia does, but it is used today only in the diaspora (as Meikal pointed out).

The other missing code points from Latin Extended-A were not found as included in any of the LGRs I reviewed.

Code points I looked at:

00FF LATIN SMALL LETTER Y WITH DIAERESIS (French)
0109 LATIN SMALL LETTER C WITH CIRCUMFLEX
0125 LATIN SMALL LETTER H WITH CIRCUMFLEX
0135 LATIN SMALL LETTER J WITH CIRCUMFLEX
014F LATIN SMALL LETTER O WITH BREVE
0157 LATIN SMALL LETTER R WITH CEDILLA (Latvian)
0163 LATIN SMALL LETTER T WITH CEDILLA
0177 LATIN SMALL LETTER Y WITH CIRCUMFLEX
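
The names in the list above can be cross-checked against the Unicode Character Database. This is only a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is my own and not part of any LGR tooling.

```python
import unicodedata

# Code points reviewed in this message (hex values copied from the list above).
REVIEWED = [0x00FF, 0x0109, 0x0125, 0x0135, 0x014F, 0x0157, 0x0163, 0x0177]


def describe(cp: int) -> str:
    """Return 'XXXX NAME' for a code point, as recorded in the Unicode database."""
    return f"{cp:04X} {unicodedata.name(chr(cp))}"


# Print one line per reviewed code point, in the same layout as the list above.
for cp in REVIEWED:
    print(describe(cp))
```

The same lookup also separates the lowercase ÿ (U+00FF) from the uppercase Ÿ (U+0178), which are easy to confuse when typing the glyph by hand.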

The LGRs can be found at https://www.icann.org/resources/pages/second-level-lgr-2015-06-21-en

*

A comment to Bill and Meikal:

The LGRs for German and Spanish list U+014F (ŏ) LATIN SMALL LETTER O WITH BREVE as *excluded*, not as included. You have to go down to “repertoire by code point” to see what is included. They also have a concept of “extended code point”, which sits between included and excluded, e.g. U+00E0 for Spanish.

ÿ (U+00FF) is included in neither the English nor the German LGR.


Mats

---
Mats Dufberg
DNS Specialist, IIS
Mobile: +46 73 065 3899
https://www.iis.se/en/


From: Latingp <latingp-bounces at icann.org> on behalf of Meikal Mumin <meikal.mumin at uni-koeln.de>
Date: Monday, 23 July 2018 at 00:11
To: Bill Jouris <bill.jouris at insidethestack.com>
Cc: ICANN Latin GP <latingp at icann.org>
Subject: Re: [Latingp] Repertoire and Latin Extended A

Dear colleagues,

On 22 July 2018 at 21:49, Bill Jouris <bill.jouris at insidethestack.com> wrote:
Hi Mirjana,



I've reviewed the repertoire we have (after adding Esperanto) and compared it to the Unicode table's Basic Latin, Latin-1 Supplement, and Latin Extended-A codepoints.



The following entries from Latin-1 Supplement are included in MSR-2, but not included in our repertoire:
00FF    ÿ     Latin Small Letter Y with Diaeresis


This occurs rarely in personal names in German https://de.wikipedia.org/wiki/%C5%B8#Franz%C3%B6sisch
and in French in place names amongst others (https://fr.wikipedia.org/wiki/%C5%B8#Fran%C3%A7ais).


The following entries from Latin Extended-A are included in MSR-3 but not included in our repertoire:

014F     ŏ   Latin Small Letter O with Breve
0157     ŗ    Latin Small Letter R with Cedilla
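
A comparison of this kind can be sketched as a set difference between a Unicode block and a repertoire. The sketch below is illustrative only: it uses the Unicode general category `Ll` (lowercase letter) as a stand-in for MSR membership, which is an assumption, and `toy_repertoire` is a made-up set, not the panel's actual repertoire.

```python
import unicodedata

# Latin Extended-A block, U+0100..U+017F.
LATIN_EXTENDED_A = range(0x0100, 0x0180)


def lowercase_letters(block):
    """Code points in the block whose general category is Ll (lowercase letter)."""
    return {cp for cp in block if unicodedata.category(chr(cp)) == "Ll"}


def missing_from(repertoire, block):
    """Lowercase letters of the block that are absent from the repertoire, sorted."""
    return sorted(lowercase_letters(block) - set(repertoire))


# Hypothetical repertoire: every lowercase letter of the block except two,
# chosen only so that the example has something to report.
toy_repertoire = lowercase_letters(LATIN_EXTENDED_A) - {0x014F, 0x0157}

for cp in missing_from(toy_repertoire, LATIN_EXTENDED_A):
    print(f"{cp:04X} {chr(cp)} {unicodedata.name(chr(cp))}")
```

For a real check, the candidate set would have to come from the MSR itself rather than from the general category, since not every lowercase letter in a block is necessarily in the MSR.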

FYI
ÿ is listed in the ICANN LGRs for German and for English (much to my amazement, as I have never encountered it previously), but does not appear in Omniglot, nor in the Wikipedia alphabet referenced in the LGR, for either language.

A quick search did not yield any evidence for English, but it did for German (see above).

ŏ is listed in the ICANN LGRs for German and for Spanish, but does not appear in Omniglot, nor in the Wikipedia alphabet referenced in the LGR, for either language.

A quick search did not yield any supporting evidence.

ŗ is listed in the ICANN LGR for Latvian, but does not appear in Omniglot, nor in the Wikipedia alphabet referenced in the LGR.

This https://de.wikipedia.org/wiki/%C5%96 says it was used historically in Latvian. This https://en.wikipedia.org/wiki/Latvian_orthography clarifies that it is part of an older orthography still in use in diaspora communities.


LGR for language deu-Latn — German<https://www.icann.org/sites/default/files/packages/lgr/lgr-second-level-german-30aug16-en.html>


This is way larger than the set of characters used in German, even taking into consideration loans and borrowings from other languages. I would be interested to know who developed this and on what basis. Some of the cited sources return 404.


LGR for language eng-Latn — English<https://www.icann.org/sites/default/files/packages/lgr/lgr-second-level-english-30aug16-en.html>


LGR for language spa-Latn — Spanish<https://www.icann.org/sites/default/files/packages/lgr/lgr-second-level-spanish-30aug16-en.html>



Bill Jouris
Inside Products
bill.jouris at insidethestack.com
831-659-8360
925-855-9512 (direct)

_______________________________________________
Latingp mailing list
Latingp at icann.org
https://mm.icann.org/mailman/listinfo/latingp


Best,

Meikal
