[ChineseGP] Updates : Inclusion & Exclusion Principles

齐超 qichao at cnnic.cn
Tue Apr 29 10:13:43 UTC 2014


Dear Chris,

   Thank you for your reply. How about 'subset'? It does not cover your advice below, though.





                                   齐超 via foxmail

From: Dillon, Chris
Sent: Tuesday, 29 April 2014, 4:25 PM
To: 齐超; wangwei; ChineseGP at icann.org
Subject: RE: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles
Dear Qi Chao,
 
Thank you. Your diagram makes things clearer. As you write, “sum” is not accurate either. We need a longer explanation, similar to what you’ve written below, but one that settles questions such as:
 
· Are we sure we can leave out the characters in CJK exts-B?
· As you say, is anyone in the group aware of any characters outside MSR-1-HAN being required?
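
To make the first question easier to check against a concrete table, here is a minimal sketch that reports which Han block a code point falls in. The ranges are the Unicode block ranges for CJK Unified Ideographs, Extension A and Extension B; the function names are only illustrative, and a block range also contains unassigned positions, so this is a rough filter and not a definition of any of the tables under discussion.

```python
# Unicode block ranges (inclusive) for the Han blocks mentioned in this thread.
HAN_BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
    "CJK Extension A": (0x3400, 0x4DBF),
    "CJK Extension B": (0x20000, 0x2A6DF),
}


def han_block(code_point):
    """Return the name of the block containing code_point, or None."""
    for name, (first, last) in HAN_BLOCKS.items():
        if first <= code_point <= last:
            return name
    return None


def in_ext_b(text):
    """List the characters of text that fall in the Extension B block."""
    return [ch for ch in text if han_block(ord(ch)) == "CJK Extension B"]
```

Running `in_ext_b` over a candidate table would show at a glance whether leaving out exts-B removes anything from it.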
 
Regards,
 
Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) ucl.ac.uk/dis/people/chrisdillon
 
From: 齐超 [mailto:qichao at cnnic.cn] 
Sent: 29 April 2014 07:23
To: Dillon, Chris; wangwei; ChineseGP at icann.org
Subject: Re: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles
 
Hello Chris,
 
    Thank you. Your edits make the principles clear and comprehensive.
 
As for 'SUM', though, I do not think the CGP script is a sum of CJK and MSR.
 
Here is a picture of the CJK, MSR, CGP and CDNC scripts [Principle-1, Principle-3].
 

 
The CGP script, shown in orange, is only a part of MSR-1-Han. The CDNC script is its origin.
1. The CDNC script does not include Hanzi from CJK exts-B;
2. The CDNC script does not include some Hanzi code points from other registries' scripts, such as JP and DotAsia [Principle-6].
 
There are also some code points in the CDNC script that may conflict with other registries such as KR or JP [Principle-5].
 
So 'SUM' may confuse the relation between MSR-1 and CJK (exts-A, exts-B).
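
To pin down the three terms being discussed ('intersection', 'sum' and 'subset'), here is a small sketch using Python sets. The code point values are arbitrary toy values chosen only so that the three results differ; they are not taken from the real CJK, MSR-1-Han or CDNC tables, and the variable names are only illustrative.

```python
# Toy stand-ins for the real tables (arbitrary values, NOT the actual contents).
cjk_ref = {0x3400, 0x4E00, 0x4E01, 0x4E03}   # CJK Unified Ideographs + Extension A
msr_han = {0x4E00, 0x4E01, 0x4E03, 0x20000}  # MSR-1-Han
cdnc = {0x4E00, 0x4E01}                      # CDNC table
cgp = set(cdnc)                              # initial CGP set starts from CDNC

# "Intersection": only the code points present in both tables.
intersection = cjk_ref & msr_han

# "Sum" (union): every code point present in either table.
union = cjk_ref | msr_han

# "Subset": every CGP code point is also in MSR-1-Han, but not the reverse.
cgp_is_subset_of_msr = cgp <= msr_han
cgp_equals_union = cgp == union

print(sorted(hex(cp) for cp in intersection))
print(sorted(hex(cp) for cp in union))
print(cgp_is_subset_of_msr, cgp_equals_union)
```

With these toy values the CGP set is a subset of MSR-1-Han and is smaller than both the intersection and the union, which is the relation the diagram is meant to show.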
 
Might the CGP script cover code points beyond MSR-1-Han?
 
If so, it would be hard work for CGP members to handle thousands of CJK Hanzi case by case.
Thanks.
 



               齐超 via foxmail
 
From: Dillon, Chris
Sent: Monday, 28 April 2014, 4:30 PM
To: Wang Wei; ChineseGP at icann.org
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles
Dear colleagues,
 
Please find some minor changes in the version below. These are to make the English smoother. There is also one substantial change: the word “intersection” in paragraph one often means the small area where two circles (in this case tables) overlap. I think the meaning here is all characters in the three tables, so a word like “sum” is better.
 
1.      Based on MSR 1 character set contributed by ICANN, with CJK Unified Ideographs and Extension A as reference, the maximum range of THE CGP character set would be the SUM.
2.      CGP character set shall be programmed according to the requirements of RFC3743/4713 and [Representing] Label Generation Rulesets WILL BE REPRESENTED using XML.
3.      The CDNC table widely accepted among Chinese domain name area can be employed as the initial set of CGP.
4.      The initial set shall be checked following the [standard of] criteriA listed by The Normalized Hanzi Chart for General Use and IIcore.
5.      Some [of the] abandoned archaic characters, for instance KOREAN Idu characterS (이두/吏读字), shall be deleted based on the consensus with CDNC.
6.      Some Code Points are listed in the intersection of CJK and MSR-1, yet not included in the CGP. Such will be included in CGP only when they meet the following requirements:
      A. The Code Points of different languages shall be programmed by THEIR affiliated institutions, such as JP, KR, HK, DotAsia, etc.
      B. Each Code Point shall pass CHECKS conducted by both the language expert in the CGP panels and CDNC.
      C. All the strings in an application FOR THE CGP shall not [be] collide with existing characterS in the process of Variant evaluation.
7.      THE CGP is expected to submit a unified Chinese character set under its combination with all Chinese script communities. 
 
[] means text I have removed.
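
As a rough illustration of how principle 6 could be applied, here is a minimal sketch in which a candidate code point joins the CGP set only when requirements A, B and C all hold. The `Candidate` record, its field names and the two functions are hypothetical; in particular, requirement C concerns strings in an application, and it is simplified here to a single per-code-point flag.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one code point proposed for inclusion."""
    code_point: int
    submitted_by_affiliated_institution: bool  # requirement A
    passed_cgp_expert_check: bool              # requirement B (CGP language expert)
    passed_cdnc_check: bool                    # requirement B (CDNC)
    collides_in_variant_evaluation: bool       # requirement C (simplified)


def may_be_included(c):
    """True only when requirements A, B and C are all met."""
    return (
        c.submitted_by_affiliated_institution
        and c.passed_cgp_expert_check
        and c.passed_cdnc_check
        and not c.collides_in_variant_evaluation
    )


def extend(cgp, candidates):
    """Return a new set: cgp plus every candidate that meets all requirements."""
    return cgp | {c.code_point for c in candidates if may_be_included(c)}
```

For example, a candidate that passes the CGP expert check but not the CDNC check would be left out by `extend`, while the existing CGP set is returned unchanged apart from the accepted additions.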
 
Regards,
 
Chris.
 
From: chinesegp-bounces at icann.org [mailto:chinesegp-bounces at icann.org] On Behalf Of Wang Wei
Sent: 25 April 2014 13:29
To: ChineseGP at icann.org
Subject: [ChineseGP] Updates : Inclusion & Exclusion Principles
 
Dear CGP members
 
         While ICANN is reviewing the CGP proposal, some members have drafted the principles of character inclusion & exclusion as follows.
 
1.      Based on MSR 1 character set contributed by ICANN, with CJK Unified Ideographs and Extension A as reference, the maximum range of CGP character set would be the intersection.
2.      CGP character set shall be programmed according to the requirements of RFC3743/4713 and Representing Label Generation Rulesets using XML
3.      The CDNC table widely accepted among Chinese domain name area can be employed as the initial set of CGP.
4.      The initial set shall be checked following the standard of criterion listed by The Normalized Hanzi Chart for General Use and IIcore.
5.      Some of the abandoned archaic characters, for instance Idu character(吏读字), shall be deleted based on the consensus with CDNC.
6.      Some Code Points are listed in the intersection of CJK and MSR-1, yet not included in the CGP. Such will be included in CGP only when they meet the following requirements:
      A. The Code Points of different languages shall be programmed by its affiliated institutions, such as JP, KR, HK, DotAsia, etc.
      B. Each Code Point shall pass the interview conducted by both the language expert in the CGP panels and CDNC.
      C. All the strings in an application to join CGP shall not be collide with existing character in the process of Variant evaluation.
7.      CGP is expected to submit a unified Chinese character set under its combination with all Chinese script communities. 
 
Please give your comments and advice on these principles.
 
Once we reach a consensus, the technical guys will make a character table and submit it to the Integration Panel.
 
 
Regards
Wang Wei
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003(04-29-18-10-05).jpg
Type: image/jpeg
Size: 16234 bytes
Desc: not available
URL: <https://mm.icann.org/mailman/private/chinesegp/attachments/20140429/0cce9905/image00304-29-18-10-05-0001.jpg>