zic.8 in html

Paul Eggert eggert at twinsun.com
Wed Oct 15 07:23:24 UTC 2003


At Mon, 13 Oct 2003 23:47:00 +0200, Oscar van Vlijmen <ovv at hetnet.nl> writes:

> 1. In the POSIX 1 region of characters below code position 128,
> there are no significant differences between the encodings
> us-ascii, iso-8859-1 and utf-8.

Yes, if we stick to the ASCII subset (and use only TAB and LF among
the control characters) we should be OK.  Pretty much everybody can
read ASCII.  For bytes 128 and above, though, ISO-8859-1 is
incompatible with UTF-8, EUC, Shift-JIS, etc., and so it's a bit more
likely to be mishandled.  (ISO-8859-1 used to be the default
character set for HTML, but that was a while ago now.)
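
To make the point about the ASCII subset concrete, here is a minimal
sketch in Python.  The function names and sample strings are made up
for illustration and are not part of the tz distribution.  It checks
that a byte string uses only printable ASCII plus TAB and LF, and that
such a string decodes to the same text under us-ascii, iso-8859-1 and
utf-8.

```python
# Hypothetical helpers for illustration only; not part of tz.
ALLOWED_CONTROLS = {0x09, 0x0A}  # TAB and LF


def is_portable_ascii(data: bytes) -> bool:
    """True if every byte is printable ASCII, TAB, or LF."""
    return all(b in ALLOWED_CONTROLS or 0x20 <= b <= 0x7E for b in data)


def decodes_identically(data: bytes) -> bool:
    """True if us-ascii, iso-8859-1 and utf-8 all yield the same text."""
    try:
        texts = {data.decode(enc) for enc in ("us-ascii", "iso-8859-1", "utf-8")}
    except UnicodeDecodeError:
        # At least one of the three encodings rejects these bytes.
        return False
    return len(texts) == 1


# Assumed sample inputs: an ASCII-only line, and a line with one
# ISO-8859-1 byte (0xFC, u-umlaut) that is not valid UTF-8 on its own.
ascii_sample = b"Rule\tUS\t1967\t2006\n"
latin1_sample = "Z\u00fcrich\n".encode("iso-8859-1")
```

With these samples, the ASCII-only line passes both checks, while the
line containing the 0xFC byte fails both.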


> 2. If an html page will be viewed off-line, it would be useful to put a
> <meta> tag in the <head> section describing a character set.
> iso-8859-1 would be the most compatible,

For some time the web pages have had <meta> tags that specify
US-ASCII, which is a bit more conservative than ISO-8859-1.  They also
have their US-ASCII encoding specified in their XML declaration.  Hmm,
I just noticed that the HTTP header said "Content-Type: text/html;
charset=iso-8859-1"; I just fixed this to say "us-ascii" so that it's
all consistent.
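
The kind of mismatch described above can be caught mechanically.  The
following is only a rough sketch in Python: the helper names, the
regular expressions, and the sample page are assumptions made for
illustration, and the real pages' markup may differ.  It pulls the
declared charset out of an HTTP Content-Type value, an XML
declaration, and a <meta> tag, then checks that they agree.

```python
import re

# Hypothetical helpers; the patterns are deliberately simple and are
# not a full HTML or HTTP parser.
_NAME = r"([A-Za-z0-9._:-]+)"


def charset_from_content_type(value):
    """Charset parameter of a Content-Type value, lowercased, or None."""
    m = re.search(r'charset\s*=\s*"?' + _NAME, value, re.IGNORECASE)
    return m.group(1).lower() if m else None


def charset_from_xml_decl(page):
    """Encoding named in a leading <?xml ...?> declaration, or None."""
    m = re.match(r"\s*<\?xml[^>]*\bencoding\s*=\s*[\"']" + _NAME, page,
                 re.IGNORECASE)
    return m.group(1).lower() if m else None


def charset_from_meta(page):
    """Charset named inside the first matching <meta> tag, or None."""
    m = re.search(r"<meta\b[^>]*\bcharset\s*=\s*[\"']?" + _NAME, page,
                  re.IGNORECASE)
    return m.group(1).lower() if m else None


def declared_charsets(header, page):
    """Map each declaration site to the charset it names (or None)."""
    return {
        "http": charset_from_content_type(header),
        "xml": charset_from_xml_decl(page),
        "meta": charset_from_meta(page),
    }


def is_consistent(header, page):
    """True if all declarations that are present name the same charset."""
    found = {v for v in declared_charsets(header, page).values() if v}
    return len(found) <= 1


# Assumed sample page, not a copy of the actual tz web pages.
PAGE = (
    '<?xml version="1.0" encoding="US-ASCII"?>\n'
    "<html><head>\n"
    '<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII" />\n'
    "</head><body></body></html>\n"
)
```

Under these assumptions, is_consistent("text/html;
charset=iso-8859-1", PAGE) reports a mismatch, and the same call with
"charset=us-ascii" reports agreement.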


