[ietf-charsets] Fwd: [IANA #1297322] IANA characters-sets US-ASCII entry incorrect

Wed Dec 20 07:00:19 UTC 2023

To everybody interested in the recent discussion on the character set 
registry, in particular the (absence of the) entry "ASCII".

Many thanks to IANA, and in particular Sabrina Tanamal, for digging up 
the relevant correspondence from 10 and 20 years ago.

Please accept my apology for not remembering this correspondence and 
thereby seriously confusing the discussion.

My summary based on this new information is as follows:

- Ned Freed sent a request to IANA in February 2003 concerning the
   entry for "US-ASCII" and related aliases in the Character Set
   registry, requesting (among other things) the removal of the alias "ASCII".
   The request for this removal was based on the fact that RFC 2046 says
   'The character set name "ASCII" is reserved and must not be used for
   any purpose.'.

- When IANA was moving their registries from .txt to .xml, this request
   was rediscovered and acted upon. Both Ned and I agreed with the
   removal of "ASCII". We decided that there was no need to inform
   the ietf-charsets mailing list, which in hindsight was probably a
   mistake (not least because it would have had the potential to
   shorten the current discussion by quite a bit).

Given the fact that RFC 2046 clearly says that 'The character set name 
"ASCII" is reserved and must not be used for any purpose.', I think that 
the only choice is to leave the registry as it is.

<charset reviewer hat on>
I'm of course ready to reevaluate this and to add this label back in if 
anybody is able to come up with really strong and convincing arguments 
to do so.
<charset reviewer hat off>

For data labeled with charset=ASCII, the correct interpretation is to 
ignore the charset parameter because of an undefined parameter value. 
The implementation would then fall back to the default, which in the case 
of email and text/plain is "US-ASCII". The overall result is the same as an 
"ASCII" alias in the registry.
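
The fallback described above could be sketched roughly as follows. This is 
only an illustration, not taken from any real mail software: the names 
REGISTERED, DEFAULT_CHARSET and effective_charset are hypothetical, and the 
table is a small hand-copied subset of the registry (the US-ASCII aliases 
from the entry quoted below, plus UTF-8 as a second example). It assumes 
case-insensitive matching of charset names.

```python
# Hypothetical, hand-copied subset of the charset registry:
# lowercased name or alias -> preferred MIME name.
REGISTERED = {
    "us-ascii": "US-ASCII",
    "iso-ir-6": "US-ASCII",
    "ansi_x3.4-1968": "US-ASCII",
    "ansi_x3.4-1986": "US-ASCII",
    "iso_646.irv:1991": "US-ASCII",
    "iso646-us": "US-ASCII",
    "us": "US-ASCII",
    "ibm367": "US-ASCII",
    "cp367": "US-ASCII",
    "csascii": "US-ASCII",
    "utf-8": "UTF-8",
}

# Default for text/plain in email when no usable charset parameter is present.
DEFAULT_CHARSET = "US-ASCII"


def effective_charset(param):
    """Return the charset to assume for a text/plain email body part.

    A missing parameter, or a value that is not in the table (such as
    "ASCII"), is ignored and the default is used instead.
    """
    if param is None:
        return DEFAULT_CHARSET
    # Charset names are compared case-insensitively in this sketch.
    return REGISTERED.get(param.strip().lower(), DEFAULT_CHARSET)
```

With this sketch, effective_charset("ASCII") and effective_charset("cp367") 
both give "US-ASCII", the first through the default and the second through 
the alias table.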

Regards,   Martin.

-------- Forwarded Message --------
Subject: [IANA #1297322] IANA characters-sets US-ASCII entry incorrect
Date: Tue, 19 Dec 2023 01:30:10 +0000
From: Sabrina Tanamal via RT <iana-issues-comment at iana.org>
Reply-To: iana-issues-comment at iana.org
CC: duerst at it.aoyama.ac.jp

Hi Martin (trimming the list),
It looks like this change was completed in 2013 (reported by Ned and 
approved by you). Please see the thread below.
Let me know if you need us to forward anything to the list or if any 
changes are required.
Thanks,
Sabrina

=====

Fri Jan 04 08:10:25 2013 Martin Duerst <duerst at it.aoyama.ac.jp> - Correspondence added
CC:	ned.freed at mrochek.com
Subject:	Re: [IANA #111894] Possible update to character-sets (#2)
Date:	Fri, 04 Jan 2013 16:48:30 +0900
To:	iana-matrix at iana.org
From:	"Martin J. Dürst" <duerst at it.aoyama.ac.jp>
  Hello Amanda,

On 2013/01/04 10:37, Amanda Baber via RT wrote:
> Hi Martin,
>
> Are you OK with going ahead with this?

Yes, please go ahead with this. Ned is correct on each and every point.
Sorry for the delay in answering this.

Regards, Martin.

> thanks,
> Amanda
>
> On Thu Dec 20 23:11:34 2012, ned.freed at mrochek.com wrote:
>>> Hi,
>>
>>> I haven't seen anything about this proposal on the charset list:
>>
>> I don't see any need to post this, but since I'm the source of the change
>> it should be Martin's call. I think he's OK with going ahead, but he should
>> probably confirm.
>>
>>
>> Ned
>>
>>>>>> I therefore suggest that this entry [ANSI_X3.4-1968] be changed to read:
>>>>>>
>>>>>> Name: US-ASCII (preferred MIME name) [RFC2046]
>>>>>> MIBenum: 3
>>>>>> Source: ANSI X3.4-1986
>>>>>> Alias: iso-ir-6
>>>>>> Alias: ANSI_X3.4-1968
>>>>>> Alias: ANSI_X3.4-1986
>>>>>> Alias: ISO_646.irv:1991
>>>>>> Alias: ISO646-US
>>>>>> Alias: us
>>>>>> Alias: IBM367
>>>>>> Alias: cp367
>>>>>> Alias: csASCII
>>
>>> Since you both approved it, as Martin noted, should we go ahead and
>>> make this change? Or would one of you prefer to post it to the list
>>> first? We'd prefer to leave it up to you.
>>
>>> thanks,
>>> Amanda
>>
>>> On Fri Sep 28 06:42:32 2012, duerst at it.aoyama.ac.jp wrote:
>>>> Hello Amanda,
>>>>
>>>> I have looked at the proposed change, and I think it makes *a lot* of
>>>> sense. I think we have two choices:
>>>>
>>>> a) Just go ahead and fix it. We have both reviewers agreeing with it.
>>>> b) Just to be sure, send a mail to the charset mailing list saying we
>>>> plan to do this, so that anybody who may have a complaint (I don't
>>>> expect anybody, but anyway). Then after a few weeks, go ahead and do it.
>>>>
>>>> I'm okay with either.
>>>>
>>>> Regards, Martin.
>>>>
On 2012/09/28 7:23, Amanda Baber via RT wrote:
> Martin and Ned,
>
> We have this one last character-sets email from several years ago that
> needs to be addressed. Ned, I know you submitted this in the first
> place, but can you verify that this still needs to be done/won't break
> anything?
>
> all apologies, and thanks,
> Amanda
>
>> -------- Original Message --------
>> Subject: Fix for one particularly serious charset registry problem
>> Date: Sun, 09 Feb 2003 15:51:23 -0800 (PST)
>> From: ned.freed at mrochek.com
>> To: iana at iana.org
>> CC: ned.freed at mrochek.com, paf at cisco.com, harald at alvestrand.no
>>
>>
>> The first entry in the charset registry reads as follows:
>>
>> Name: ANSI_X3.4-1968 [RFC1345,KXS2]
>> MIBenum: 3
>> Source: ECMA registry
>> Alias: iso-ir-6
>> Alias: ANSI_X3.4-1986
>> Alias: ISO_646.irv:1991
>> Alias: ASCII
>> Alias: ISO646-US
>> Alias: US-ASCII (preferred MIME name)
>> Alias: us
>> Alias: IBM367
>> Alias: cp367
>> Alias: csASCII
>>
>> There are, unfortunately, many problems with this entry:
>>
>> (1) The primary name of the charset is US-ASCII, not ANSI_X3.4-1968. It has
>> always been this way as far back as RFC 1341. And this actually matters,
>> since the primary charset name is the one that's supposed to be used
>> in encoded words, and the restrictions on encoded words don't allow use
>> of ANSI_X3.4-1968!
>>
>> (2) The source of the registration should be the ANSI standards document,
>> not some ECMA registry.
>>
>> (3) The alias "ASCII" is specifically prohibited by RFC 2046 section 4.1.2.
>>
>> (4) The defining document for this charset is RFC 2046, not RFC 1345.
>>
>> I therefore suggest that this entry be changed to read:
>>
>> Name: US-ASCII (preferred MIME name) [RFC2046]
>> MIBenum: 3
>> Source: ANSI X3.4-1986
>> Alias: iso-ir-6
>> Alias: ANSI_X3.4-1968
>> Alias: ANSI_X3.4-1986
>> Alias: ISO_646.irv:1991
>> Alias: ISO646-US
>> Alias: us
>> Alias: IBM367
>> Alias: cp367
>> Alias: csASCII
>>
>> There is no formal procedure for fixing errors in the charset registry.
>> However, I believe that identification of an actual problem along with
>> signoff of the charset reviewer should be sufficient to make the change.
>>
>> Ned

On Mon Dec 18 21:59:24 2023, steffen at sdaoden.eu wrote:
> Martin J. Dürst wrote in
>  <6fc74600-cff6-4751-9efa-1200a9ec5aa5 at it.aoyama.ac.jp>:
>  |Hello Stephen,
> 
> Actually Miss Jaeger named me Steve in English, because another
> one got Stephen earlier.
> 
>  |On 2023-12-16 04:06, Steffen Nurpmeso wrote:
>  |> To add that for backward compatibility the plain ASCII alias
>  |> cannot go away,
>  |
>  |I seem to remember too that ASCII was listed as an alias, and have 
>  |confirmed this with
>  |https://web.archive.org/web/20051229042158/http://www.iana.org/assignmen\
>  |ts/character-sets
> 
> My local copy is from 2011.
> 
>  |Stephen, maybe you can do a bisection to find out where this alias 
>  |disappeared.
> 
> I have only a local copy from 2011.  I have no access to an IANA
> revision control system, shall such a thing exist.  Sorry.
> 
>  |<charset reviewer hat on>
>  |I don't remember ever having dealt with a request to remove this ALIAS, 
>  |and I strongly doubt that Ned ever did that.
>  |</charset reviewer hat on>
> 
> Well i could imagine it was broken when taking over the beautiful
> text-only version to that really, really machine readable XML
> variant that is now used as the base.
> I hope the conversion was mechanical and the bugs are little
> jokes, otherwise the entire IANA data collection possibly needs
> an audit :-)
> 
>  |> i do have emails with charset=ascii in my archive,
>  |
>  |It would be interesting to know how many (i.e. what percentage of 
>  |overall mails, and what percentage in comparison to those labeled 
>  |US-ASCII), and how old they are.
> 
> The latter is very large, the former is very low.  Two, to be
> exact.  But they do exist (and i only have a very sparse private
> email archive).
> 
> --steffen
> |
> |Der Kragenbaer,                The moon bear,
> |der holt sich munter           he cheerfully and one by one
> |einen nach dem anderen runter  wa.ks himself off
> |(By Robert Gernhardt)
> |
> | Only in December: lightful Dubai COP28 Narendra Modi quote:
> |  A small part of humanity has ruthlessly exploited nature.
> |  But the entire humanity is bearing the cost of it,
> |  especially the inhabitants of the Global South.
> |  The selfishness of a few will lead the world into darkness,
> |  not just for themselves but for the entire world.
> |  [Christians might think of Revelation 11:18
> |    The nations were angry, and your wrath has come[.]
> |    [.]for destroying those who destroy the earth.
> |   But i find the above more kind, and much friendlier]
>