[ietf-charsets] Update the information for GB18030 in IANA?
Martin J. Dürst
duerst at it.aoyama.ac.jp
Wed Nov 16 08:29:05 UTC 2022
Dear Eiso, others,
Many thanks for your request.
I'm the expert reviewer for character set registrations (Ned Freed was
the primary reviewer, but unfortunately is no longer with us).
I'm sorry I missed your mail for over a month.
On 2022-10-14 12:02, 陈永聪 wrote:
> Dear Anthony,
> (cc Mr. Chen Zhuang and Ken)
>
>
> I found your email address on the GB18030 page at IANA (https://www.iana.org/assignments/charset-reg/GB18030), which was compiled by you.
>
>
> The newest version of GB 18030 has been published as GB 18030-2022; please see https://std.samr.gov.cn/gb/search/gbDetailed?id=E4A2A4C875726A5DE05397BE0A0A61E8 and https://openstd.samr.gov.cn/bzgk/gb/newGbInfo?hcno=A1931A578FE14957104988029B0833D3 . However, the IANA information still reflects the original 2000 version, which is not helpful for users.
Many thanks for this information. What we need to decide is whether to
update the registration for charset "GB18030" to this new standard or
whether we should define a new charset label (e.g. "GB18030-2022") to
distinguish it from the old (2000) version of the standard. This
depends on how many changes there are in the new version, and how
various implementers are expected to deal with the new version.
Given that as far as I understand, GB 18030 is an encoding of
Unicode/ISO 10646, and we do not distinguish versions in charset labels
for Unicode/ISO 10646, there is a good argument for not introducing a
new label for this new version.
On the other hand, if there are new structural features in the 2022
version that are not present in the 2000 version, that might indicate
the need for a new charset label.
So any information on structural changes (or the absence thereof), as
well as on expectations towards industry and on plans and needs from
industry, would be greatly appreciated.
From reading https://en.wikipedia.org/wiki/GB_18030, my understanding
is that there are no structural changes, and that the main change is
that mappings to the PUA have been completely eliminated. That means
that there are some mapping differences between the 2000/2005 version
and the 2022 version, but we might characterize them as minor (80 or so
codepoints) and decide to keep the same label.
> CESI also kindly released the mapping table at http://www.nits.org.cn/getIndex.req?action=findAllNews&req=modulenvpromote&type=0&moduleId=455&sid=4 . If it is not convenient to download the file from the website, you can find the attachment.
Ideally, I'd prefer if you didn't send such large files to the mailing
list. But I have been unable to get any response from the above URI in a
browser. What's interesting is that a ping to www.nits.org.cn works
without problems, with a bit over 100 ms round trip time.
If such data is available somewhere, it might be better to get a diff
between the old and the new mapping tables. That should be much shorter.
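A minimal sketch of how such a diff could be produced, in Python. It
assumes (this is an assumption, not the actual layout of the CESI file)
that each mapping table is a text file with one entry per line, giving
the GB 18030 byte sequence in hex followed by the Unicode code point in
hex. The function names and the two sample rows are hypothetical and
purely illustrative; they are not taken from the published tables.

```python
def parse_mapping(text):
    """Parse lines of the assumed form '<GB bytes hex> <code point hex>'."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        gb_hex, ucs_hex = line.split()[:2]
        table[bytes.fromhex(gb_hex)] = int(ucs_hex, 16)
    return table


def diff_mappings(old, new):
    """Return (gb_bytes, old_cp, new_cp) for every entry that differs."""
    changes = []
    for gb in sorted(old.keys() | new.keys()):
        before, after = old.get(gb), new.get(gb)  # None means "not mapped"
        if before != after:
            changes.append((gb, before, after))
    return changes


def is_bmp_pua(cp):
    """True if cp lies in the BMP Private Use Area (U+E000..U+F8FF)."""
    return cp is not None and 0xE000 <= cp <= 0xF8FF


# Illustrative sample rows only (hypothetical, not real table data).
OLD_SAMPLE = "A1A1 3000\nFE51 E816\n"
NEW_SAMPLE = "A1A1 3000\nFE51 20087\n"

changes = diff_mappings(parse_mapping(OLD_SAMPLE), parse_mapping(NEW_SAMPLE))
for gb, before, after in changes:
    fmt = lambda cp: "unmapped" if cp is None else f"U+{cp:04X}"
    note = " (was PUA)" if is_bmp_pua(before) else ""
    print(f"{gb.hex().upper()}: {fmt(before)} -> {fmt(after)}{note}")
```

Run against the real old and new tables (read with open(...).read()),
this would list only the entries that changed, and flag those whose old
target was in the PUA.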
> I think it would be better to update the information on the corresponding IANA page. Do you know what we need to do?
Once we know exactly what/how we want to update the information, I'll
ask IANA to do so.
My understanding is that this is not immediately urgent, because
https://openstd.samr.gov.cn/bzgk/gb/newGbInfo?hcno=A1931A578FE14957104988029B0833D3
says that the "implementation date" (实施日期, according to Google
Translate) is August 1st, 2023.
Looking forward to getting additional information from anybody who has some.
Regards, Martin.
>
>
> Eiso