[Latingp] character-based analysis

Mats Dufberg mats.dufberg at iis.se
Mon Jun 5 10:31:52 UTC 2017


Ahmed,

Your inclusion principles read

>>
2.1.  Letter code point which is a letter and has established contemporary use in a language
<<

This is straightforward.

>>
2.2.  Mark code point which represents a required mark, where at least one of the letters it forms has established contemporary use in a language
<<

Firstly, I do not think that we want to include mark code points without contextual limitation; what we want to include is the combination of a letter code point and a mark or marks. Secondly, the principle should be that the combination has an established use in a language, shouldn't it?
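
To illustrate the first point, here is a minimal Python sketch, assuming the repertoire stores whole letter-plus-mark sequences rather than bare mark code points. The names ALLOWED_SEQUENCES, is_letter_plus_marks and admissible are hypothetical, and the single sequence listed is chosen only for illustration; it is not a claim about any language.

```python
import unicodedata

# Hypothetical repertoire: sequences are admitted as a whole, never bare marks.
# The entry below is illustrative only (U+025B followed by U+0301).
ALLOWED_SEQUENCES = {"\u025b\u0301"}


def is_letter_plus_marks(seq: str) -> bool:
    """True if seq is one Latin letter followed by one or more combining marks."""
    if len(seq) < 2:
        return False
    base, marks = seq[0], seq[1:]
    return (
        unicodedata.category(base).startswith("L")      # general category L*
        and "LATIN" in unicodedata.name(base, "")       # Latin by character name
        and all(unicodedata.category(m).startswith("M") for m in marks)
    )


def admissible(seq: str) -> bool:
    """A sequence is admissible only in its listed context, not mark by mark."""
    return is_letter_plus_marks(seq) and seq in ALLOWED_SEQUENCES
```

With this shape, a bare combining mark is never admissible on its own, and a well-formed letter-plus-mark sequence is still rejected unless that exact combination has been listed.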

>>
2.3.  Code point which represents a combination of letters in a language which has established contemporary use, where at least one of the constituent letters cannot be represented by a combination of letter code points and mark code points.
<<

Can you give an example of what you mean?

>>
2.4.  Code point which represents a lexical word or phrase in a language, which has established contemporary use and cannot be decomposed into a sequence of code points representing letter code points and mark code points.
<<

Can you give an example of what you mean?


You have a third type of principle, "deferral principles". Deferred until when, and to what?


You refer to "Language Table submitted by ccTLD in the context of IDNA2008 in the IANA repository". My experience is that the ccTLD IDN tables are mostly country based (i.e. supporting multiple languages in a country) rather than language based.



Yours,
Mats

---
Mats Dufberg
DNS Specialist, IIS
Mobile: +46 73 065 3899
https://www.iis.se/en/


From: Ahmed Bakhat <ahmedbakhat at yahoo.com>
Date: Saturday 3 June 2017 at 22:20
To: Mirjana Tasić <Mirjana.Tasic at rnids.rs>, Mats Dufberg <mats.dufberg at iis.se>, Textual Solutions <textualsolutions at gmail.com>, Latin GP <latingp at icann.org>, Sarmad Hussain <sarmad.hussain at icann.org>
Subject: Re: [Latingp] character-based analysis

Dear Mirjana and all group members of Repertoire sub group,

I think we should first focus on the characters available in the Unicode charts for the Latin script, and then devise principles / rules for inclusion / exclusion / deferral on the basis of usage in different languages. Once we have a table, we can look at the usage in each language.

I am attaching a first draft of the principles for the Latin script, the available Unicode charts and the MSR-2 documents to start the group's discussion. The first chart (0000 to 007F) does not need any discussion, though, as it is already in use as ASCII.


Best Regards,

Ahmed Bakht


On Thursday, May 25, 2017, 7:49:47 PM GMT+5, Mirjana Tasić <Mirjana.Tasic at rnids.rs> wrote:



Dear Nebiye,



I am trying to understand the idea behind your proposal. What is the purpose of looking for specific characters across all languages? Are you trying to develop a repertoire of all characters used in languages written in the Latin script for future processing?



Regards Mirjana



From: <latingp-bounces at icann.org> on behalf of Mats Dufberg <mats.dufberg at iis.se>
Date: Thursday, May 25, 2017 at 12:17
To: Textual Solutions <textualsolutions at gmail.com>, Latin GP <latingp at icann.org>
Subject: Re: [Latingp] character-based analysis



1.      If not found we still do not know if it should be included or not.

2.      We have to return to all languages for characters that we have not found elsewhere.

3.      We have to investigate all characters in every language anyway, to see whether the language has any combinations of base character and combining mark.



For every character (or combination) that we want to include we should find evidence that it is used according to the principles. To have firm ground we should not register evidence for just one language, but for several, in case a language is excluded at a later stage or its evidence is found to be invalid.
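
The bookkeeping this implies can be sketched as follows. This is only a sketch under stated assumptions: the threshold MIN_LANGUAGES, the function names and the sample entries are hypothetical placeholders, not the group's actual data or agreed procedure.

```python
# Hypothetical threshold: how many languages must back a character before
# we treat its inclusion as resting on firm ground.
MIN_LANGUAGES = 2

# Maps a character (or letter-plus-mark sequence) to the set of languages
# for which evidence of use has been registered.
evidence: dict[str, set[str]] = {}


def record(char: str, language: str) -> None:
    """Register that evidence of use was found for char in language."""
    evidence.setdefault(char, set()).add(language)


def drop_language(language: str) -> None:
    """Remove a language, e.g. if it is excluded at a later stage."""
    for languages in evidence.values():
        languages.discard(language)


def well_supported(char: str) -> bool:
    """True if char is still backed by at least MIN_LANGUAGES languages."""
    return len(evidence.get(char, ())) >= MIN_LANGUAGES


# Placeholder entries for illustration only.
record("ö", "Swedish")
record("ö", "German")
record("å", "Swedish")
```

Here "ö" survives the loss of one language, while "å" rests on a single entry and would need more evidence.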





Mats



---
Mats Dufberg
DNS Specialist, IIS
Mobile: +46 73 065 3899
https://www.iis.se/en/





From: <latingp-bounces at icann.org> on behalf of Textual Solutions <textualsolutions at gmail.com>
Date: Thursday 25 May 2017 at 09:21
To: Latin GP <latingp at icann.org>
Subject: [Latingp] character-based analysis



Dear All,

Each member of the Repertoire group could be asked to look at a single character across the languages listed. What do you think? Please see the attached sample and comment. Thanks.

NPK