[ChineseGP] Updates : Inclusion & Exclusion Principles

Jonathan Shea jonathan.shea at hkirc.hk
Wed Apr 30 07:56:06 UTC 2014


Dear Wang Wei, Chris,

+1

Regards,
Jonathan Shea

From: Dillon, Chris [mailto:c.dillon at ucl.ac.uk]
Sent: Wednesday, 30 April 2014 3:00 PM
To: Wang Wei; Jonathan Shea
Cc: ChineseGP at icann.org
Subject: RE: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear Wang Wei,

+1

Incidentally, it is interesting to see Yoshiro Yoneya’s post (http://forum.icann.org/lists/comments-msr-03mar14/msg00000.html ) which ends in a list of kanji not in MSR-1. Judging from the numbers, some of them are also in CJK-B.
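As a rough illustration of how code point numbers indicate the block, the sketch below classifies a code point against the Unicode ranges for CJK Unified Ideographs, Extension A and Extension B. The function name and the sample code points are illustrative assumptions and are not taken from the cited post.

```python
# Unicode block ranges for the Han blocks discussed in this thread.
HAN_BLOCKS = [
    ("CJK", 0x4E00, 0x9FFF),      # CJK Unified Ideographs
    ("CJK-A", 0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    ("CJK-B", 0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B
]


def han_block(code_point: int) -> str:
    """Return the short block name for a code point, or 'other'."""
    for name, first, last in HAN_BLOCKS:
        if first <= code_point <= last:
            return name
    return "other"


# Illustrative code points only (the first code point of each block).
for cp in (0x4E00, 0x3400, 0x20000):
    print(f"U+{cp:04X} -> {han_block(cp)}")
```

Any five-digit code point beginning with 2 and falling in U+20000 to U+2A6DF would be reported as CJK-B by this check.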

Regards,

Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) ucl.ac.uk/dis/people/chrisdillon

From: Wang Wei [mailto:wangwei at cnnic.cn]
Sent: 30 April 2014 04:42
To: 'Jonathan Shea'; Dillon, Chris
Cc: ChineseGP at icann.org<mailto:ChineseGP at icann.org>
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Thanks Jonathan & Chris


      The current CDNC and JP tables both fall entirely within CJK and CJK Extension A.
      That is why we suggested taking the intersection of the MSR and CJK + CJK-A.
      However, I have just checked HKSCS and found that it has hundreds of characters in CJK-B.
      So I would like to change principle 1 to: “the maximum range of the CGP character set would be all CJK Unified Ideographs that are included in the MSR contributed by ICANN”.

      This means that if some characters are in neither the CDNC table nor the MSR, we first ask ICANN to accept them into the MSR, and then add them to the CDNC table and the CGP table through an appropriate evaluation process.
      Would this suggestion work for you?
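The two-step path described above could be sketched roughly as follows. This is only an illustration: the function name, the wording of the steps and the tiny `msr` and `cdnc` sets are hypothetical placeholders, not real table contents or an agreed procedure.

```python
def prerequisites(code_point, msr, cdnc):
    """List what is still needed before a code point can be considered for CGP.

    msr and cdnc are sets of integer code points (placeholder data here).
    """
    steps = []
    if code_point not in msr:
        steps.append("ask ICANN to accept it into the MSR")
    if code_point not in cdnc:
        steps.append("evaluate it for addition to the CDNC table")
    return steps


# Placeholder sets, NOT the real MSR or CDNC tables.
msr = {0x4E00, 0x4E01}
cdnc = {0x4E00}

for cp in (0x4E00, 0x4E01, 0x20000):
    steps = prerequisites(cp, msr, cdnc)
    print(f"U+{cp:04X}:", "; ".join(steps) or "no prerequisites outstanding")
```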


Regards
Wang Wei

From: chinesegp-bounces at icann.org<mailto:chinesegp-bounces at icann.org> [mailto:chinesegp-bounces at icann.org] On Behalf Of Jonathan Shea
Sent: 29 April 2014 17:30
To: Dillon, Chris
Cc: ChineseGP at icann.org<mailto:ChineseGP at icann.org>; wangwei
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear Chris,

We (HKIRC, registry for the .HK ccTLD and .香港 IDN ccTLD) have submitted a comment to ICANN asking them to consider adding 2,677 HKSCS (Hong Kong Supplementary Character Set) characters to MSR-1. HKSCS contains Chinese characters that are used in Hong Kong but may not be used in other Chinese-speaking communities.


http://forum.icann.org/lists/comments-msr-03mar14/msg00002.html.

As these HKSCS characters are not in the CDNC variant table, HKIRC is in the process of applying to CDNC to add them to the CDNC table batch by batch; the first batch, containing 21 characters, is being processed.

Adding HKSCS characters to MSR-1 is mainly for administrative convenience: otherwise, when CDNC approves the addition of some HKSCS characters to the CDNC table in the future, these characters cannot be added to the CGP table because MSR-1 does not contain them. Also, we are not sure at this stage whether the IP will produce new versions of the MSR, such as MSR-2.

As MSR-1 is a superset and CGP is not obliged to consider all characters in the MSR, our comment to add HKSCS characters into MSR-1 should not have any impact on the 7 proposed principles listed by Wang Wei.

Also, I communicated with the CDNC co-chairs, council members and secretariat before submitting the comment to ICANN.


Regards,
Jonathan Shea
HKIRC

From: chinesegp-bounces at icann.org<mailto:chinesegp-bounces at icann.org> [mailto:chinesegp-bounces at icann.org] On Behalf Of Dillon, Chris
Sent: Tuesday, 29 April 2014 4:25 PM
To: 齐超; wangwei; ChineseGP at icann.org<mailto:ChineseGP at icann.org>
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear Qi Chao,

Thank you. Your diagram makes things clearer. As you write, “sum” is not accurate either. We need a longer explanation, similar to what you’ve written below, that also settles questions such as:


•         Are we sure we can leave out the characters in CJK exts-B?

•         As you say, is anyone in the group aware of any characters outside MSR-1-HAN being required?

Regards,

Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) ucl.ac.uk/dis/people/chrisdillon

From: 齐超 [mailto:qichao at cnnic.cn]
Sent: 29 April 2014 07:23
To: Dillon, Chris; wangwei; ChineseGP at icann.org<mailto:ChineseGP at icann.org>
Subject: Re: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles

Hello, Chris

    Thank you. Your edits make the principles clear and comprehensive.

But regarding 'SUM', I do not think the CGP script is a sum of CJK and MSR.

Here is a picture of CJK, MSR, CGP and the CDNC script (CGP script) [Principle-1, Principle-3].

[cid:image001.jpg at 01CF648C.B157BCF0]

The CGP script, shown in orange, is just a part of MSR-1-Han. The CDNC script is its origin.
1. The CDNC script does not include Hanzi from CJK exts-B;
2. The CDNC script does not include some Hanzi code points from other registry scripts, such as JP and DotAsia [Principle-6].

There are also some code points in the CDNC script that may conflict with other registries, such as KR or JP [Principle-5].

So 'SUM' may confuse the relation between MSR-1 and CJK (exts-A, exts-B).

May the CGP script cover code points beyond MSR-1-Han?

If so, it would be hard work for CGP members to handle thousands of CJK Hanzi case by case.
Thanks.

________________________________
               齐超 via foxmail

From: Dillon, Chris<mailto:c.dillon at ucl.ac.uk>
Sent: Monday, 28 April 2014 4:30 PM
To: Wang Wei<mailto:wangwei at cnic.cn>; ChineseGP at icann.org<mailto:ChineseGP at icann.org>
Subject: Re: [ChineseGP] Updates : Inclusion & Exclusion Principles
Dear colleagues,

Please find some minor changes in the version below. These are to make the English smoother. There is also one substantial change: the word “intersection” in paragraph one often means the small area where two circles (in this case tables) overlap. I think the meaning here is all characters in the three tables, so I think a word like “sum” is better.


1.      Based on MSR 1 character set contributed by ICANN, with CJK Unified Ideographs and Extension A as reference, the maximum range of THE CGP character set would be the SUM.

2.      CGP character set shall be programmed according to the requirements of RFC3743/4713 and [Representing] Label Generation Rulesets WILL BE REPRESENTED using XML.

3.      The CDNC table widely accepted among Chinese domain name area can be employed as the initial set of CGP.

4.      The initial set shall be checked following the [standard of] criteriA listed by The Normalized Hanzi Chart for General Use and IIcore.

5.      Some [of the] abandoned archaic characters, for instance KOREAN Idu characterS (이두/吏读字), shall be deleted based on the consensus with CDNC.

6.      Some Code Points are listed in the intersection of CJK and MSR-1, yet not included in the CGP. Such will be included in CGP only when they meet the following requirements:

      A. The Code Points of different languages shall be programmed by THEIR affiliated institutions, such as JP, KR, HK, DotAsia, etc.

      B. Each Code Point shall pass CHECKS conducted by both the language expert in the CGP panels and CDNC.

      C. All the strings in an application FOR THE CGP shall not [be] collide with existing characterS in the process of Variant evaluation.

7.      THE CGP is expected to submit a unified Chinese character set under its combination with all Chinese script communities.

[] means text I have removed.
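The difference between “intersection” and “sum” (a union, in set terms) can be shown with a small sketch. The two sets below are hypothetical placeholders holding a handful of code points each; they are not the real MSR-1 or CJK tables.

```python
# Placeholder code point sets, NOT the real tables.
msr = {0x4E00, 0x4E01, 0x20000}                  # includes one Extension B code point
cjk_and_ext_a = {0x3400, 0x4E00, 0x4E01, 0x4E02}

# "Intersection": only code points present in BOTH sets.
intersection = msr & cjk_and_ext_a

# "Sum" read as a union: code points present in EITHER set.
union = msr | cjk_and_ext_a

print(sorted(f"U+{cp:04X}" for cp in intersection))
print(sorted(f"U+{cp:04X}" for cp in union))
```

With these placeholder sets, the Extension B code point U+20000 drops out of the intersection because it is not in the CJK and Extension A set, while the union keeps it.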

Regards,

Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) ucl.ac.uk/dis/people/chrisdillon

From: chinesegp-bounces at icann.org<mailto:chinesegp-bounces at icann.org> [mailto:chinesegp-bounces at icann.org] On Behalf Of Wang Wei
Sent: 25 April 2014 13:29
To: ChineseGP at icann.org<mailto:ChineseGP at icann.org>
Subject: [ChineseGP] Updates : Inclusion & Exclusion Principles

Dear CGP members

         While ICANN is reviewing the CGP proposal, some members have drafted the principles of character inclusion & exclusion as follows.


1.      Based on MSR 1 character set contributed by ICANN, with CJK Unified Ideographs and Extension A as reference, the maximum range of CGP character set would be the intersection.

2.      CGP character set shall be programmed according to the requirements of RFC3743/4713 and Representing Label Generation Rulesets using XML

3.      The CDNC table widely accepted among Chinese domain name area can be employed as the initial set of CGP.

4.      The initial set shall be checked following the standard of criterion listed by The Normalized Hanzi Chart for General Use and IIcore.

5.      Some of the abandoned archaic characters, for instance Idu character(吏读字), shall be deleted based on the consensus with CDNC.

6.      Some Code Points are listed in the intersection of CJK and MSR-1, yet not included in the CGP. Such will be included in CGP only when they meet the following requirements:

      A. The Code Points of different languages shall be programmed by its affiliated institutions, such as JP, KR, HK, DotAsia, ect.

      B. Each Code Point shall pass the interview conducted by both the language expert in the CGP panels and CDNC.

      C. All the strings in an application to join CGP shall not be collide with existing character in the process of Variant evaluation.

7.      CGP is expected to submit a unified Chinese character set under its combination with all Chinese script communities.

Please give your comments and advice on these principles.

Once we reach a consensus, the technical team will make a character table and submit it to the Integration Panel.


Regards
Wang Wei

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.jpg
Type: image/jpeg
Size: 16234 bytes
Desc: image001.jpg
URL: <https://mm.icann.org/mailman/private/chinesegp/attachments/20140430/ec15239a/image001-0001.jpg>

