zic.8 in html
Paul Eggert
eggert at twinsun.com
Wed Oct 15 07:23:24 UTC 2003
At Mon, 13 Oct 2003 23:47:00 +0200, Oscar van Vlijmen <ovv at hetnet.nl> writes:
> 1. There are in the POSIX 1 region of characters below code position
> 128 no significant differences between the encodings us-ascii,
> iso-8859-1 and utf-8.
Yes, if we stick to the ASCII subset (and use only TAB and LF among
the control characters) we should be OK. Pretty much everybody can
read ASCII. Outside the ASCII range, ISO-8859-1 is incompatible with
UTF-8, EUC, Shift-JIS, etc., and so it's a bit more likely to be
mishandled. (ISO-8859-1 used to be the default character set for
HTML, but that was a while ago now.)
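
As a rough illustration of the point above, here is a minimal Python
sketch. The sample strings and the helper name encodings_agree are
made up for the example and are not taken from the tz pages. It checks
whether a string comes out as the same bytes under us-ascii,
iso-8859-1 and utf-8.

```python
# Hypothetical sample strings; not taken from the tz web pages.
ascii_text = "Rule\tUS\t1967\n"   # ASCII only, with TAB and LF
latin_text = "caf\u00e9"          # contains U+00E9, outside ASCII


def encodings_agree(text):
    """Return True if us-ascii, iso-8859-1 and utf-8 all give the same bytes."""
    results = set()
    for name in ("us-ascii", "iso-8859-1", "utf-8"):
        try:
            results.add(text.encode(name))
        except UnicodeEncodeError:
            # This encoding cannot represent the text at all.
            results.add(None)
    return len(results) == 1


print(encodings_agree(ascii_text))
print(encodings_agree(latin_text))

# Bytes written as ISO-8859-1 are not generally valid UTF-8.
try:
    latin_text.encode("iso-8859-1").decode("utf-8")
    mishandled = False
except UnicodeDecodeError:
    mishandled = True
print(mishandled)
```

The ASCII-only string is byte-for-byte identical in all three
encodings, while the string with one accented letter is not, which is
where the mishandling tends to start.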
> 2. If an html page will be viewed off-line, it would be useful to put a
> <meta> tag in the <head> section describing a character set.
> iso-8859-1 would be the most compatible,
For some time the web pages have had <meta> tags that specify
US-ASCII, which is a bit more conservative than ISO-8859-1. They also
have their US-ASCII encoding specified in their XML declaration. Hmm,
I just noticed that the HTTP header said "Content-Type: text/html;
charset=iso-8859-1"; I just fixed this to say "us-ascii" so that it's
all consistent.
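
For what it's worth, the kind of consistency check described above
could be sketched roughly as follows. This is only an illustration:
the sample page, the function names and the regular expressions are
assumptions made for the example, not the actual tz pages or server
configuration, and a real check would want a proper HTML parser.

```python
import re

# Hypothetical page; not the actual tz web page source.
SAMPLE_PAGE = '''<?xml version="1.0" encoding="US-ASCII"?>
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII" />
<title>zic.8</title>
</head><body></body></html>
'''


def declared_charsets(content_type_header, document):
    """Collect the charset named in the HTTP header, XML declaration and <meta> tag."""
    found = {}
    m = re.search(r'charset=([\w-]+)', content_type_header, re.I)
    if m:
        found["http"] = m.group(1).lower()
    m = re.search(r'<\?xml[^>]*encoding=["\']([\w-]+)["\']', document, re.I)
    if m:
        found["xml"] = m.group(1).lower()
    m = re.search(r'<meta[^>]*charset=([\w-]+)', document, re.I)
    if m:
        found["meta"] = m.group(1).lower()
    return found


def is_consistent(content_type_header, document):
    """Return True if every declaration that is present names the same charset."""
    return len(set(declared_charsets(content_type_header, document).values())) <= 1


# The header as it was before the fix, and as it is after.
print(is_consistent("text/html; charset=iso-8859-1", SAMPLE_PAGE))
print(is_consistent("text/html; charset=us-ascii", SAMPLE_PAGE))
```

Charset names are compared case-insensitively here, since "US-ASCII"
and "us-ascii" name the same character set.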