[tz] TZ file comments UTF-8? Bastardized HTML? (was Re: Busingen revisited)

Paul Eggert eggert at cs.ucla.edu
Thu Jan 10 17:58:15 UTC 2013

On 01/10/13 05:04, Ian Abbott wrote:

> I'd prefer the comments to be in UTF-8 without the HTML entities and
> HTML tags, but the non-comment parts of the files to be restricted
> to plain-old ASCII.  The current HTML mark-up tags seem to have been
> added around December 1997 or earlier, although there have been URLs
> in the files since 1996 or earlier.  The TZ files pre-date HTML by
> several years and pre-date UTF-8 by several more years.

The HTML markup has bothered me, too; I have found it more
distracting than useful.  URLs themselves should be fine,
but the <a href='...'> business gets in the way.

> I'm not sure how widespread the adoption of UTF-8 text files is in
> the big, wide world, but I don't suppose we should care as long as
> the zic compilers don't break and the systems that zic is run on
> support 8-bit text files.

There is still the problem that people who edit the files
with their own text editors may be hampered.
In my normal way of editing text across the network (ssh and
LC_ALL=C and emacs -nw), non-ASCII characters are rendered as ugly
hexadecimalish strings that are hard to read.  I can work
around the problem but it is an annoyance.
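
As a rough illustration of the rule proposed above (UTF-8 allowed in
comments, plain ASCII everywhere else), a check could be sketched in
Python as follows. The function name is hypothetical, and the sketch
assumes that the first '#' on a line starts the comment; that is a
simplification, since it ignores any '#' that appears inside a quoted
field.

```python
def non_ascii_outside_comments(lines):
    """Return (line_number, data_part) pairs for lines whose
    non-comment part contains non-ASCII characters.

    Assumption: the first '#' starts the comment. A '#' inside a
    quoted field is not handled specially.
    """
    offenders = []
    for number, line in enumerate(lines, start=1):
        # Everything before the first '#' is treated as data.
        data_part = line.split("#", 1)[0]
        if not data_part.isascii():
            offenders.append((number, data_part))
    return offenders
```

Run over the decoded lines of a data file, an empty result would mean
that any non-ASCII text is confined to comments.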

