[tz] TZ file comments UTF-8? Bastardized HTML? (was Re: Busingen revisited)

Ian Abbott abbotti at mev.co.uk
Sat Jan 12 08:59:39 UTC 2013

On 11/01/13 15:46, Paul Eggert wrote:
> On 01/11/2013 06:51 AM, Deborah Goldsmith wrote:
>> Can anyone name a system in use today that is not capable of dealing with it?
> It depends on what one means by "capable".  When I type
> the command "emacs southamerica" here's some text
> that I see on my remote-shell terminal window:
> # A partir de entonces, San Luis establecer\u00E1 el huso horario propio de
> That is, Emacs correctly infers that the file uses Latin-1,
> but because I prefer the LC_ALL='C' locale it displays all
> non-ASCII characters by using hexadecimal escape sequences.
> It would do the same if the file used UTF-8.
> I could work around the problem by using, say, the
> LC_ALL='en_US.utf8' locale, but that has some undesirable
> side effects (it mishandles character ranges, and it's
> noticeably slower for some other things I do), and I'd rather not.
> The main systems I use these days are Ubuntu 12.10, Fedora 17, and
> RHEL 6.3; these are all the latest stable versions, and they
> all work this way.  I wouldn't be surprised if the latest OS X
> release worked this way too.

At least for Ubuntu 12.10 you could try LC_ALL=C.utf8 for the others 

> For this particular case, the fix is simple: translate the text
> into English (it's an English-language database, after all).
> Names can be a bit trickier, but again, things are simpler
> (at least for this maintainer) if the commentary is in ASCII.

Fair enough, but transliterating personal names into plain ASCII is 

-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti at mev.co.uk>        )=-
-=( Tel: +44 (0)161 477 1898   FAX: +44 (0)161 718 3587         )=-
