[Japanesegp] terminology: traditional Chinese chars?

Kenny Huang, Ph.D. huangksh at gmail.com
Wed Sep 30 10:08:06 UTC 2015


Dear Prof. Kim,

ISO/IEC 10646 has no classifier that distinguishes Traditional
Chinese (TC) from Simplified Chinese (SC). The principle is
that identical glyphs share the same code point, so a single
code point can represent both TC and SC.

The basic classification technique we use is to check the source
data at http://www.unicode.org/charts/unihan.html. If a character
has a TSource, we consider it Traditional Chinese.
For example, 曾 (U+66FE)
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=66FE&useutf8=true
has a TSource, so it is initially considered TC. As I mentioned
earlier, a single code point can be both TC and SC.
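
As a rough illustration of the TSource check described above, here is a
minimal Python sketch. It assumes the Unihan data is available as
tab-separated lines (code point, field name, value) and that the TSource
is recorded under the field name kIRG_TSource. The sample lines and their
values are illustrative placeholders, not copied from the real database,
and the function names are hypothetical. This is only the preliminary
filter, not the full TC/SC table process.

```python
# Sample lines in the tab-separated layout used by the Unihan data files
# (code point, field name, value). The values are illustrative placeholders,
# not copied from the real database.
SAMPLE_UNIHAN_LINES = """\
# comment lines start with '#'
U+66FE\tkIRG_GSource\tG0-0000
U+66FE\tkIRG_TSource\tT1-0000
U+4E1A\tkIRG_GSource\tG0-0001
"""


def code_points_with_t_source(lines):
    """Return the set of code points (as ints) that carry a kIRG_TSource field."""
    found = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip malformed lines
        code_point, field = parts[0], parts[1]
        if field == "kIRG_TSource" and code_point.startswith("U+"):
            found.add(int(code_point[2:], 16))
    return found


def is_initially_tc(char, t_source_code_points):
    """Preliminary check only: True if the character has a TSource entry."""
    return ord(char) in t_source_code_points


t_sources = code_points_with_t_source(SAMPLE_UNIHAN_LINES.splitlines())
print(is_initially_tc("曾", t_sources))  # U+66FE has a TSource line in the sample
print(is_initially_tc("业", t_sources))  # U+4E1A has no TSource line in the sample
```

A character that passes this check may still be used in SC as well, so the
result is only a starting point for the human validation.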

This method is used only in the preliminary stage of developing the
TC/SC table. Many language experts are working through the table one
glyph at a time, validating each entry by hand. I can only say that
there is no simple algorithm or method that produces the result.

FYI

Kenny Huang


On 30 September 2015 at 14:40, KIM Kyongsok <gimgs0 at gmail.com> wrote:

> Dear Messrs Wang and Huang:
>
> How are you doing?  I have a question about the term
> "traditional Chinese chars".
>
> 1) In ISO/IEC 10646, there are about 56000 Chinese chars.
>
> 2) There are 2235 simplified Chinese chars
>   and 2261 chars corresponding to these 2235 simplified Chinese chars.
>
> 3) My questions are:
>
>   3-1) By the term "traditional Chinese chars", do you refer to
>    just 2261 chars? Or 56000 - 2235 = 53000+ (roughly) chars?
>
>   3-2) I wonder what terms Chinese experts use to distinguish
>    between a) the 2261 chars and b) the 56000 - 2235 = 53000+ chars?
>
> Thanks in advance.
>
> KIM, K.
>
>
>
> --
> 김 경석      KIM, Kyongsok
>