[ChineseGP] Updates : Inclusion & Exclusion Principles

Dillon, Chris c.dillon at ucl.ac.uk
Tue Apr 29 08:25:12 UTC 2014


Dear Qi Chao,

Thank you. Your diagram makes things clearer. As you write, “sum” is not accurate either. We need a longer explanation, similar to what you’ve written below, but one that settles questions such as:


· Are we sure we can leave out the characters in CJK Ext-B?

· As you say, is anyone in the group aware of any characters outside MSR-1-Han being required?

Regards,

Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) ucl.ac.uk/dis/people/chrisdillon

From: 齐超 [mailto:qichao at cnnic.cn]
Sent: 29 April 2014 07:23
To: Dillon, Chris; wangwei; ChineseGP at icann.org
Subject: Re: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Hello, Chris,

Thank you. Your edits make the principles clear and comprehensive.

But as for 'SUM', I do not think the CGP script is the sum of CJK and MSR.

Here is a diagram of the CJK, MSR, CGP and CDNC Script (CGP Script) repertoires [Principle-1, Principle-3].

[Image: image003.jpg]

The CGP script, shown in orange, is only a part of MSR-1-Han. The CDNC script is its origin.
1. The CDNC script does not include Hanzi from CJK Ext-B;
2. The CDNC script does not include some Hanzi code points from other registries' scripts, such as JP and DotAsia [Principle-6].
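
As a rough illustration of point 1, a repertoire can be screened for Extension B code points with a simple range check. The sketch below is hypothetical: the sample repertoire and the function name are invented for illustration and are not the CDNC table or any agreed tool. Only the Unicode block range for CJK Unified Ideographs Extension B (U+20000 to U+2A6DF) is taken as given.

```python
# Unicode block range for CJK Unified Ideographs Extension B.
EXT_B = range(0x20000, 0x2A6E0)


def split_ext_b(repertoire):
    """Split a set of code points into (outside Ext-B, inside Ext-B)."""
    inside = {cp for cp in repertoire if cp in EXT_B}
    return repertoire - inside, inside


# Hypothetical sample repertoire, NOT the real CDNC table.
sample = {0x4E2D, 0x6587, 0x3400, 0x20000, 0x2A6DF}
kept, dropped = split_ext_b(sample)

print(sorted(f"U+{cp:04X}" for cp in kept))
print(sorted(f"U+{cp:04X}" for cp in dropped))
```

A check of this kind only identifies which code points fall in Ext-B; whether they can be left out remains a question for the group.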

There are also some code points in the CDNC script that may conflict with other registries such as KR or JP [Principle-5].

So 'SUM' may confuse the relationship between MSR-1 and CJK (Ext-A, Ext-B).

May the CGP script cover code points beyond MSR-1-Han?

If so, it would be hard work for CGP members to handle thousands of CJK Hanzi case by case.
Thanks.

________________________________
               齐超 via foxmail

From: Dillon, Chris<mailto:c.dillon at ucl.ac.uk>
Sent: Monday, 28 April 2014, 4:30 PM
To: Wang Wei<mailto:wangwei at cnic.cn>; ChineseGP at icann.org<mailto:ChineseGP at icann.org>
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear colleagues,

Please find some minor changes in the version below. These are to make the English smoother. There is also one substantial change: the word “intersection” in paragraph one often means the small area where two circles (in this case tables) overlap. I think the meaning here is all characters in the three tables, so a word like “sum” is better.


1.      Based on MSR 1 character set contributed by ICANN, with CJK Unified Ideographs and Extension A as reference, the maximum range of THE CGP character set would be the SUM.

2.      CGP character set shall be programmed according to the requirements of RFC3743/4713 and [Representing] Label Generation Rulesets WILL BE REPRESENTED using XML.

3.      The CDNC table widely accepted among Chinese domain name area can be employed as the initial set of CGP.

4.      The initial set shall be checked following the [standard of] criteriA listed by The Normalized Hanzi Chart for General Use and IIcore.

5.      Some [of the] abandoned archaic characters, for instance KOREAN Idu characterS (이두/吏读字), shall be deleted based on the consensus with CDNC.

6.      Some Code Points are listed in the intersection of CJK and MSR-1, yet not included in the CGP. Such will be included in CGP only when they meet the following requirements:

    A. The Code Points of different languages shall be programmed by THEIR affiliated institutions, such as JP, KR, HK, DotAsia, etc.

    B. Each Code Point shall pass CHECKS conducted by both the language expert in the CGP panels and CDNC.

    C. All the strings in an application FOR THE CGP shall not [be] collide with existing characterS in the process of Variant evaluation.

7.      THE CGP is expected to submit a unified Chinese character set under its combination with all Chinese script communities.

[] means text I have removed.
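
To make the distinction in principle 1 concrete, here is a minimal sketch of "intersection" versus "sum" (set union). The code points are toy values chosen for illustration and the variable names are assumptions; the real tables are far larger. Treating CJK Unified Ideographs plus Extension A as one reference table is also an assumption about how the principle is meant to be read.

```python
# Toy stand-ins for the three tables; NOT the real repertoires.
msr_1_han = {0x4E00, 0x4E2D, 0x3400}
cjk_unified = {0x4E00, 0x4E2D, 0x6587, 0x9FA5}
cjk_ext_a = {0x3400, 0x3401}

# Assumption: the two CJK blocks together form the reference table.
cjk_reference = cjk_unified | cjk_ext_a

# "Sum": every code point that appears in any table.
union = msr_1_han | cjk_reference

# "Intersection": only code points present in both MSR-1 and the reference.
intersection = msr_1_han & cjk_reference

print(len(union), len(intersection))
```

With these toy values the intersection is much smaller than the union, which is why the choice of word changes the maximum range the principle describes.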

Regards,

Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) ucl.ac.uk/dis/people/chrisdillon

From: chinesegp-bounces at icann.org<mailto:chinesegp-bounces at icann.org> [mailto:chinesegp-bounces at icann.org] On Behalf Of Wang Wei
Sent: 25 April 2014 13:29
To: ChineseGP at icann.org<mailto:ChineseGP at icann.org>
Subject: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear CGP members

While ICANN is reviewing the CGP proposal, some members have drafted the principles of character inclusion & exclusion as follows.


1.      Based on MSR 1 character set contributed by ICANN, with CJK Unified Ideographs and Extension A as reference, the maximum range of CGP character set would be the intersection.

2.      CGP character set shall be programmed according to the requirements of RFC3743/4713 and Representing Label Generation Rulesets using XML

3.      The CDNC table widely accepted among Chinese domain name area can be employed as the initial set of CGP.

4.      The initial set shall be checked following the standard of criterion listed by The Normalized Hanzi Chart for General Use and IIcore.

5.      Some of the abandoned archaic characters, for instance Idu character(吏读字), shall be deleted based on the consensus with CDNC.

6.      Some Code Points are listed in the intersection of CJK and MSR-1, yet not included in the CGP. Such will be included in CGP only when they meet the following requirements:

    A. The Code Points of different languages shall be programmed by its affiliated institutions, such as JP, KR, HK, DotAsia, ect.

    B. Each Code Point shall pass the interview conducted by both the language expert in the CGP panels and CDNC.

    C. All the strings in an application to join CGP shall not be collide with existing character in the process of Variant evaluation.

7.      CGP is expected to submit a unified Chinese character set under its combination with all Chinese script communities.

Please give your comments and advice on these principles.

Once we reach a consensus, the technical team will prepare a character table and submit it to the Integration Panel.


Regards
Wang Wei

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.jpg
Type: image/jpeg
Size: 16234 bytes
Desc: image003.jpg
URL: <https://mm.icann.org/mailman/private/chinesegp/attachments/20140429/52620111/image003-0001.jpg>

