[tz] TZ file comments UTF-8? Bastardized HTML? (was Re: Busingen revisited)
Ian Abbott
abbotti at mev.co.uk
Fri Jan 11 10:05:36 UTC 2013
On 2013/01/10 05:58 PM, Paul Eggert wrote:
> On 01/10/13 05:04, Ian Abbott wrote:
>> I'd prefer the comments to be in UTF-8 without the HTML entities and
>> HTML tags, but the non-comment parts of the files to be restricted
>> to plain-old ASCII. The current HTML mark-up tags seem to have been
>> added around December 1997 or earlier, although there have been URLs
>> in the files since 1996 or earlier. The TZ files pre-date HTML by
>> several years and pre-date UTF-8 by several more years.
>
> The HTML markup has bothered me, too; I have found it more
> distracting than useful. URLs themselves should be fine,
> but the <a href='...'> business gets in the way.
The HTML entities are also rather unreadable. If the HTML markup is
removed, and the HTML entities can't be replaced with UTF-8 sequences,
perhaps the non-ASCII characters could be replaced with TeX markup
sequences, e.g. 'B\"usingen' rather than 'B&uuml;singen'.
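As a rough sketch of the two options above (not anything that exists in the tz distribution), the conversion could look something like the following in Python. The function name `convert_line`, the tiny `TEX_ACCENTS` table, and the rule that a comment is everything from the first '#' onward are all assumptions made for illustration; only `html.unescape` is a real standard-library call.

```python
import html

# Hypothetical, deliberately incomplete table of TeX accent sequences.
TEX_ACCENTS = {
    "ä": r'\"a',
    "ö": r'\"o',
    "ü": r'\"u',
    "é": r"\'e",
}


def convert_line(line, use_tex=False):
    """Decode HTML entities in the comment part of one line.

    Assumption: the comment is everything from the first '#' onward.
    The part before the '#' is returned untouched.
    """
    data, hash_mark, comment = line.partition("#")
    if not hash_mark:
        return line  # no comment on this line, nothing to do

    # Turn entities such as &uuml; into the characters they stand for.
    comment = html.unescape(comment)

    if use_tex:
        # Optionally fall back to ASCII-only TeX sequences.
        comment = "".join(TEX_ACCENTS.get(ch, ch) for ch in comment)

    return data + hash_mark + comment
```

With `use_tex=False` the comment ends up as UTF-8 text when the file is written out; with `use_tex=True` the characters listed in the table are replaced by ASCII-only sequences and everything else is left alone.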
>> I'm not sure how widespread the adoption of UTF-8 text files is in
>> the big, wide world, but I don't suppose we should care as long as
>> the zic compilers don't break and the systems that zic is run on
>> support 8-bit text files.
>
> There still is the problem that people who are editing
> the files with their own text editors may be hampered.
> In my normal way of editing text across the network (ssh and
> LC_ALL=C and emacs -nw), non-ASCII characters are rendered as ugly
> hexadecimalish strings that are hard to read. I can work
> around the problem but it is an annoyance.
Even worse for non-8-bit-clean editors such as the original 'vi', I
suppose. Thanks, I hadn't considered that!
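If the comments did move to UTF-8 while the non-comment parts stayed plain ASCII, that restriction could be checked mechanically. A minimal sketch in Python, again assuming that a comment runs from the first '#' to the end of the line; the function name `find_non_ascii_data` is made up for this example.

```python
def find_non_ascii_data(lines):
    """Return (line_number, data) pairs whose non-comment part is not ASCII.

    Assumption: everything from the first '#' onward is a comment and
    may contain any characters; only the part before it is checked.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        data = line.partition("#")[0]
        if not data.isascii():
            problems.append((number, data.rstrip()))
    return problems
```

Reading the file with `open(path, encoding="utf-8")` before calling this would also reject byte sequences that are not valid UTF-8, since decoding raises `UnicodeDecodeError` in that case.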
--
-=( Ian Abbott @ MEV Ltd. E-mail: <abbotti at mev.co.uk> )=-
-=( Tel: +44 (0)161 477 1898 FAX: +44 (0)161 718 3587 )=-