[ChineseGP] [Japanesegp] [Koreangp] Proposed Action items before Seoul meeting

Dillon, Chris c.dillon at ucl.ac.uk
Mon May 11 13:51:53 UTC 2015


Dear Yoneya-san,

JIS X 0208 looks rather familiar to me; when I was working in the British Library in the 1990s I once improved the character candidate listings in a Japanese Front End Processor used with a DOS bibliographic database.

I've been scrolling through J-LGR-s.xlsx and found some interesting things, mostly minor (see below and in red in the attached file). One thing isn't minor and I'll write about it in a separate email.

- Some code points are close to being punctuation, e.g. ゝ 309D and ー 30FC.
- It's difficult to imagine small hiragana in labels e.g. 3041 ぁ.
- I can just about imagine historic kana being used in cafe names e.g. 3090 ゐ.
- There are quite a few code points that look like parts of characters e.g. 4E3F 丿.
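
For anyone wanting to check how Unicode itself classifies the code points listed above, here is a minimal sketch using only Python's standard library. The helper name `describe` is hypothetical and is not part of any LGR tooling. It simply reports each character's Unicode name and general category, which is one way to see that the iteration mark and the prolonged sound mark are classed as modifier letters rather than ordinary letters.

```python
import unicodedata

def describe(code_point: int) -> tuple[str, str, str]:
    """Return (character, Unicode name, general category) for a code point."""
    ch = chr(code_point)
    return ch, unicodedata.name(ch), unicodedata.category(ch)

# Code points mentioned in the list above.
for cp in (0x309D, 0x30FC, 0x3041, 0x3090, 0x4E3F):
    ch, name, category = describe(cp)
    print(f"U+{cp:04X} {ch} {category} {name}")
```

The general category alone does not decide whether a code point belongs in a label; it is only a quick aid for spotting candidates worth a second look.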

Regards,

Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) www.ucl.ac.uk/dis/people/chrisdillon 

-----Original Message-----
From: Yoshiro YONEYA [mailto:yoshiro.yoneya at jprs.co.jp] 
Sent: 30 April 2015 06:42
To: Dillon, Chris
Cc: KoreanGP at icann.org; ChineseGP at icann.org; JapaneseGP at icann.org
Subject: Re: [Japanesegp] [Koreangp] Proposed Action items before Seoul meeting

Dear Chris-san,

> I think the JGP's conclusions are probably correct, but I reckon the best way to be sure is to look at things from as many angles as possible and to try to find any exceptions to and problems with every statement.

As you see, Japanese LGR-1 includes rare characters.  This is because Japanese LGR-1 is drawn from the Japanese Industrial Standard (JIS) repertoire.  In detail, the Kanji repertoire in Japanese LGR-1 is the same as the Kanji defined in level 1 and level 2 of JIS X 0208[1].
JIS X 0208 has a history of more than 30 years and is widely used in Japan.
JGP considered both narrower and wider repertoires, but tentatively concluded that the JIS X 0208 repertoire is a reasonable selection for Japanese LGR-1.
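
A rough way to test whether a given character falls inside JIS X 0208 is sketched below. It rests on an assumption: that Python's built-in `shift_jis` codec maps its double-byte range to JIS X 0208, so a two-byte encoding is used as a proxy for membership. The function name `in_jis_x_0208` is hypothetical. Note that it covers the whole of JIS X 0208 (kana and symbols included), not only the level 1 and level 2 Kanji.

```python
def in_jis_x_0208(ch: str) -> bool:
    """Return True if `ch` encodes to a two-byte Shift_JIS sequence.

    Assumption: the 'shift_jis' codec's double-byte range corresponds to
    JIS X 0208; single-byte results (ASCII, half-width katakana) are excluded.
    """
    try:
        return len(ch.encode("shift_jis")) == 2
    except UnicodeEncodeError:
        # The character has no mapping in the codec at all.
        return False

# A few characters mentioned in this thread.
for ch in "機髪丿园":
    print(ch, in_jis_x_0208(ch))
```

This is only a convenience check; the authoritative source remains the repertoire list itself.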

[1] JIS X 0208
<http://en.wikipedia.org/wiki/JIS_X_0208>
<http://zh.wikipedia.org/wiki/JIS_X_0208>
<http://ko.wikipedia.org/wiki/JIS_X_0208>

Regards,

--
Yoshiro YONEYA <yoshiro.yoneya at jprs.co.jp>

On Wed, 29 Apr 2015 08:22:31 +0000 "Dillon, Chris" <c.dillon at ucl.ac.uk> wrote:

> Dear Yoneya-san,
> 
> Thank you for the spreadsheet with the characters.
> Seeing them really makes a difference. The general impression is of a 
> generous list containing especially at the end of it some characters 
> I've seen for example in novels or old literature over the last 35 
> years, and some I've never seen at all. However, obviously, as a 
> non-native speaker I have not read anything like as much as most 
> native speakers have. The situation (i.e. containing rarer characters) 
> may well be the same with the Chinese tables, but I can't comment at 
> all there, as I have read so little modern Chinese. I did study 
> Classical Chinese at university. (The Chinese students used to laugh 
> if I pronounced 孔子曰 as Koushi iwaku ... as many did not realize that 
> there is a system for reading Classical Chinese in Japanese.)
> 
> I think the JGP's conclusions are probably correct, but I reckon the best way to be sure is to look at things from as many angles as possible and to try to find any exceptions to and problems with every statement.
> 
> Thank you also for clarifying the situation as regards the options on Slide 6.
> 
> Looking forward to seeing you in Seoul,
> 
> Regards,
> 
> Chris.
> --
> Research Associate in Linguistic Computing, Centre for Digital 
> Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 
> 31599) www.ucl.ac.uk/dis/people/chrisdillon
> 
> -----Original Message-----
> From: Yoshiro YONEYA [mailto:yoshiro.yoneya at jprs.co.jp]
> Sent: 28 April 2015 09:56
> To: Dillon, Chris
> Cc: hotta at jprs.co.jp; KoreanGP at icann.org; ChineseGP at icann.org; 
> JapaneseGP at icann.org
> Subject: Re: [Japanesegp] [Koreangp] Proposed Action items before 
> Seoul meeting
> 
> Dear Chris-san,
> 
> Thank you for your comments.
> The following are my responses to some of them.
> 
> > I believe Mr Yoneya’s algorithm will work.
> 
> Thank you, it encourages me a lot.
> 
> > It is fortunate that 機 ‘machine’ / 机 ‘desk’ and 発 ‘send’ / 髪 ‘hair’ 
> > seem to be the only cases where (at least commonly used) different 
> > characters in Japanese are the same character in Simplified Chinese.
> > (I haven’t spent as much time looking for characters that are 
> > separate in Chinese but brought together in Japanese. 弁 replaces at 
> > least three characters in Chinese, but I think none are common. I 
> > can imagine a .弁当 TLD, so that may be good news for bento 
> > companies.)
> 
> JGP assessed how CGP's draft LGR-1 affects the usage of Japanese IDNs.
> 
> What JGP did, and what it tentatively concluded:
> - Comparison of CGP's draft LGR-1 and JGP's draft LGR-1 (with possible
>   variants)
>   - JGP tentatively concluded that any serious influence of CGP's variants
>     on Japanese IDNs would be very limited
> - Assessment of the occurrence of 'may be seriously affected' variants in
>   Japanese JP domain names
>   - JGP tentatively concluded that such variants are mostly used as 
>     different characters in Japanese IDNs
> 
> Based on this assessment, we proposed Japanese LGR-1 with no variants. 
> So, at this moment, JGP does not have much interest in searching for other variants.
> 
> > I note the options for the disposition of variants not defined in the LGR-1s (Slide 6), i.e.:
> > 
> > - Blocked if the variant is not in the LGR-1 / Allocatable otherwise
> > 
> > - Blocked if the variant is not in the LGR-1 / Inherit its original 
> > disposition in the LGR-1 (Allocatable/Simp/Trad/Both)
> 
> For JGP, either is OK.  For CGP, the latter seems to be more acceptable.
> I'd like to reach consensus on this during the meeting.
> 
> > I note that it is difficult to understand Japanese LGR-1, as the characters are not visible.
> 
> The Japanese LGR-1 repertoire list with visible characters is attached. 
> I hope this is helpful.
> 
> Regards,
> 
> --
> Yoshiro YONEYA <yoshiro.yoneya at jprs.co.jp>
> 
> On Mon, 27 Apr 2015 12:42:30 +0000 "Dillon, Chris" <c.dillon at ucl.ac.uk> wrote:
> 
> > Dear colleagues,
> > 
> > Here are some comments, as requested by Hiro.
> > 
> > I reckon I have now caught up after missing the Dallas meeting.
> > 
> > I believe Mr Yoneya’s algorithm will work.
> > 
> > I have spent some amount of time looking for exceptions to various 
> > statements in it e.g. Slide 5 “there exists at least one identical 
> > ideograph”. (No exception found.)
> > 
> > > It is fortunate that 機 ‘machine’ / 机 ‘desk’ and 発 ‘send’ / 髪 ‘hair’ 
> > > seem to be the only cases where (at least commonly used) different 
> > > characters in Japanese are the same character in Simplified Chinese.
> > > (I haven’t spent as much time looking for characters that are 
> > > separate in Chinese but brought together in Japanese. 弁 replaces at 
> > > least three characters in Chinese, but I think none are common. I 
> > > can imagine a .弁当 TLD, so that may be good news for bento 
> > > companies.)
> > 
> > I note the options for the disposition of variants not defined in the LGR-1s (Slide 6), i.e.:
> > 
> > - Blocked if the variant is not in the LGR-1 / Allocatable otherwise
> > 
> > - Blocked if the variant is not in the LGR-1 / Inherit its original 
> > disposition in the LGR-1 (Allocatable/Simp/Trad/Both)
> > 
> > > Both case studies are most interesting. I note that there are some labels, e.g. 予园 (with the first character, I think, used only in Japan and the second only in Simplified Chinese), that perhaps we would prefer not to see allocatable in an ideal world, but I suspect that blocking them would involve adding horrendous complexity.
> > 
> > I note that it is difficult to understand Japanese LGR-1, as the characters are not visible.
> > 
> > I have also been looking for differences between Traditional Chinese characters and Korean hanja. So far I have found one: characters with the progression radical tend to start with two dots in hanja: 逃 and only one in Traditional Chinese: 逃.
> > 
> > Looking forward to Seoul,
> > 
> > Regards,
> > 
> > Chris.


