[ChineseGP] Integration Panel Considerations on 20151115 Chinese LGR draft

Fri Mar 4 23:50:35 UTC 2016

Dear Wang Wei,

The Integration Panel has not been able to discuss your contribution as 
a full panel. Therefore, I will give you my personal feedback on some 
issues, with the hope that this will be useful for any deliberations you 
will have among yourselves or with the other IP members who will be at 
ICANN55.

I will send, under separate cover, an HTML version of your XML file. It 
provides a different way of looking at the data than the XML file, and 
you might find it useful. It's a rather large file, so I will send a 
compressed version.

A./

Comments below:

On 2/24/2016 10:00 PM, 王伟 wrote:
>
> Dear Asmus
>
> Thanks for reviewing the CGP XML document and pointing out the problems in it.
>
>        I have fixed the flaws concerning reference id, reflective type, 
> usage of “blocked”, etc.
>
The XML looks much improved. We haven't had the chance to test it yet, 
but I did run it through my tool.

Lines 70409-70410 had some minor issues:

      <action disp="allocatable" only-variants="r-simp r-trad r-both" 
common="original label"/>
      <action disp="blocked"     any-variant="simp trad both r-simp 
r-trad r-both" "block any other mixed labels" />

common --> comment
"block... --> comment="block

I believe those were the only syntax issues I found.
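
As a rough illustration of the two fixes above, here is a minimal Python sketch that checks the two <action> lines before and after correction. The helper name check_action and the KNOWN_ATTRS set are assumptions made for this example: the set lists only the attribute names that appear in this message, not the full LGR schema, and this is not the tool I ran.

```python
import xml.etree.ElementTree as ET

# Assumption: only the attribute names seen in this message, not the full schema.
KNOWN_ATTRS = {"disp", "only-variants", "any-variant", "comment"}


def check_action(fragment):
    """Return a list of problems found in a single <action/> element."""
    try:
        elem = ET.fromstring(fragment)
    except ET.ParseError as exc:
        # A bare quoted string in attribute position is not well-formed XML.
        return ["not well-formed: %s" % exc]
    return ["unexpected attribute: %s" % name
            for name in elem.attrib if name not in KNOWN_ATTRS]


# The two lines as submitted (joined onto one line each).
original_1 = ('<action disp="allocatable" only-variants="r-simp r-trad r-both" '
              'common="original label"/>')
original_2 = ('<action disp="blocked" any-variant="simp trad both r-simp '
              'r-trad r-both" "block any other mixed labels" />')

# The same lines with the two corrections applied.
fixed_1 = original_1.replace('common=', 'comment=')
fixed_2 = original_2.replace('"block any', 'comment="block any')

for label, fragment in [("original 1", original_1), ("original 2", original_2),
                        ("fixed 1", fixed_1), ("fixed 2", fixed_2)]:
    print(label, check_action(fragment) or "ok")
```

The first line is well-formed but carries a misspelled attribute name, while the second does not parse at all, which is why the two cases are reported differently.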
>
> Before answering the questions in the PDF, I’d like to explain the 
> development principles of the CGP repertoire.
>
> 1)The core set of CGP repertoire is the intersection of CDNC table 
> (the latest version 2015) and MSR
>
> 2)The intersection of China official Normalized Hanzi List for Common 
> Use and MSR
>
> 3)The intersection of IICore and MSR (to include the other IIcore code 
> points not covered in step 1 and 2)
>
> Considering the principles above, the answers are listed as below:
>
> Q1: The IP would like to know the rationale for adding the 22 IICORE 
> characters that are not used in existing IDN tables (see Section 4.22). 
> We note that over half of them are included in IICORE for support of 
> Korean.
>
> A1: we added 22 IICore code points not for support of Korean, but to 
> reach a maximum support of IICore.
>

Noted.
>
> Q2: The IP would like to learn the reasoning by the CGP with regards 
> to deciding which code points from the J0 set to include. We were 
> unable to identify a pattern to distinguish the 94 that were selected 
> from the 50 that were excluded. For example, we can see that a few of 
> these 50 have glyph shapes that are distinctively ‘Japanese’, but on 
> the other hand, many are shared with other IRG sources.
>
> A2: If a JP code point happens to be included in the above repertoire, 
> CGP experts will analyze its variant relationship with others, whether 
> or not it is a variant.
>
> We don’t intend to analyze every code point in JGP repertoire and 
> discuss to add it into CGP repertoire.
>

To make sure I understand: the cutoff you used for code points that are 
not specific to Chinese is the IICORE set, and any partial coverage of 
the J0 set would be due to partial coverage of that set in IICORE?
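
To make the question concrete, here is a small Python sketch of how I read the three principles: the repertoire is the union of three intersections with the MSR, and the coverage of any other source set simply falls out of that. The function names and the toy integer values are placeholders invented for this example. They are not real code points and not the CGP's actual data.

```python
def build_repertoire(cdnc, hanzi_list, iicore, msr):
    """Union of the three principles: each source set intersected with the MSR."""
    return (cdnc & msr) | (hanzi_list & msr) | (iicore & msr)


def coverage(source_set, repertoire):
    """Split a source set into the part inside and the part outside the repertoire."""
    return source_set & repertoire, source_set - repertoire


# Toy placeholder values only; real inputs would be sets of Unicode code points.
msr = {1, 2, 3, 4, 5, 6}
cdnc = {1, 2, 9}
hanzi_list = {2, 3}
iicore = {3, 4, 8}
j0 = {4, 5, 8}

repertoire = build_repertoire(cdnc, hanzi_list, iicore, msr)
included, excluded = coverage(j0, repertoire)

# If my reading is right, every included J0 code point is also in one of the
# three source sets; in this toy data it comes from IICORE.
print(sorted(repertoire), sorted(included), sorted(excluded))
print(included <= (cdnc | hanzi_list | iicore))
```

With real data, the same comparison would show directly whether the 94 included and 50 excluded code points are explained by IICORE membership alone.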
>
> Q3: The IP wonders whether it is wise not to include the 7 characters 
> that were deemed useful by DotAsia for Singapore and other Chinese 
> constituencies (listed explicitly in Section 4.1).
>
> A3: Thanks for reminding us about the 7 characters from DotAsia. These 
> 7 code points are not covered by the above 3 principles.
>
> We will have a CDNC meeting this coming March to discuss whether a new 
> 4th principle should be adopted, to allow all New gTLD application 
> IDN code points.
>

OK. This means that the issue is still open at this point. We'll wait 
for a report from CGP on the resolution.
>
> Q4: The IP would like to see a more detailed analysis of the 62 
> characters IICORE encoded in the block: CJK UNIFIED IDEOGRAPHS 
> EXTENSION B and which are not included in the CLGR draft. These are 
> all HKSCS characters and are specific to Cantonese. The main reason 
> that they are encoded in Extension B (instead of the main block or 
> Extension A) is because they were processed later by the IRG. The IP 
> feels that any rationale for excluding them (whether or not this is 
> because of their encoding in a Supplementary Plane), would need to be 
> specifically documented by the Chinese Generation Panel.
>
> A4: We didn’t include the code points in Extension B, whose coding 
> length is 5 (from 20000 to 2A6D6); they are not fully supported in IT 
> systems in China.
>
> If the IP thinks it is necessary and insists on including Extension B, 
> we will discuss it in the coming CDNC meeting.
>

I would expect that such implementation issues are transient in nature 
and, over time, support for code points above U+FFFF will become more 
widespread and more reliable. For example, I expect Emoji are as popular 
in China as everywhere else, and IT systems will support them (not for 
IDNs, but for text and SMS of course). Most of them are supplementary 
characters in Unicode, which will put pressure on upgrading IT systems 
to handle code points beyond U+FFFF.
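
For background on why these code points are harder for some systems: in UTF-16, anything above U+FFFF is stored as a pair of surrogate code units rather than a single 16-bit unit, so software that assumes one unit per character mishandles it. The Python sketch below is only an illustration. The function names are made up for this example, and the arithmetic is the standard UTF-16 surrogate calculation.

```python
def utf16_code_units(code_point):
    """Return the UTF-16 code units for one code point.

    Assumes a valid scalar value (not itself in the surrogate range).
    """
    if code_point <= 0xFFFF:
        return [code_point]  # BMP: a single 16-bit unit
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return [high, low]


def length_in_16bit_units(text):
    """Length as seen by software that counts 16-bit units, not characters."""
    return len(text.encode("utf-16-be")) // 2


samples = {
    "BMP ideograph U+4E2D": 0x4E2D,
    "first Extension B code point U+20000": 0x20000,
    "emoji U+1F600": 0x1F600,
}
for label, cp in samples.items():
    units = " ".join("%04X" % u for u in utf16_code_units(cp))
    print(label, "->", units, "| units:", length_in_16bit_units(chr(cp)))
```

An Extension B ideograph and a typical emoji behave the same way here, which is the reason I expect Emoji support to bring supplementary-plane support along with it.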

Because the proposed restriction appears to impact a specific geographic 
region and user community, it would seem important to get direct input 
from representatives of that user community on how critical these are 
for Cantonese and how widespread their support is in IT systems for that 
language. While I'm sure that the IP understands the need to be 
conservative, when the impact is culturally one-sided like this, I'm 
sure the IP would like to have the documentation that the restrictions 
are supported by or at least acceptable to the affected user 
communities. The procedure requires us to avoid the impression of bias.

> I hope these answers could help.
>
> Please give us further advice and correction for the inadequacy of our 
> work.
>

I'm sure I'm speaking for the IP as a whole when I say that we are 
looking forward to getting a complete draft of an LGR proposal, with all 
the principles and background for the decisions fully discussed.

Also, the <description> element in the XML file could be extended a bit 
to give a brief overview of the method used to calculate dispositions 
for variant labels. The idea is that there should be just enough 
information that a knowledgeable reader can understand the XML without 
having to read the full proposal, but not so much as to duplicate all 
the details from the latter. (The <description> would cite specific 
sections of the proposal document - as was done for LGR-1, which was 
just published).

Hope you are having a good meeting, if you are attending ICANN55.

A./

> Best Regards
>
> Wang Wei
>
> *From:* Asmus Freytag [mailto:asmusf at ix.netcom.com]
> *Sent:* December 18, 2015 13:52
> *To:* Wang Wei <wangwei at cnic.cn <mailto:wangwei at cnic.cn>>
> *Cc:* Sarmad Hussain <sarmad.hussain at icann.org 
> <mailto:sarmad.hussain at icann.org>>
> *Subject:* Integration Panel Considerations on 20151115 Chinese LGR draft
>
> Dear Wang Wei,
>
> please find attached the Integration Panel's review of the latest 
> draft that you shared with us, complete with a few requests for 
> additional information. These are about the repertoire at this point, 
> we are still reviewing the variants.
>
> Thanks,
>
> A./
>
