factory zone abbreviation

Zefram zefram at fysh.org
Thu Aug 26 21:59:56 UTC 2010


Marc Lehmann wrote:
>Maybe I am dense, but where does POSIX actually oblige implementations
>to only accept the POSIX forms for TZ?

In one sense it doesn't, because it has an explicit escape hatch: any
TZ value beginning with a colon has implementation-defined meaning.

In another sense, accepting the colon form as a "POSIX form", the
standard that I pointed at contains an explicit obligation.  It says "The
contents of the environment variable named TZ shall be used ... by various
utilities, to override the default timezone.  The value of TZ has one of
the two forms ...".  So it obliges certain functions and utilities to use
TZ, and says what values in TZ (when they don't start with a colon) mean.

And in yet another sense it doesn't, because the description of the two
forms of TZ can be read as an obligation on the *setter* of TZ.  Setting TZ
to a non-conforming value would be a violation of that obligation,
presumably invoking undefined behaviour from the various utilities that
are obliged to pay attention to TZ.
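
To make the two forms concrete, here is a minimal sketch in Python of the
first decision any TZ consumer has to make.  The function name and the
return labels are my own invention for illustration; nothing here comes
from POSIX beyond the colon rule itself.

```python
def classify_tz(value: str) -> str:
    """Say which of the two TZ forms a value claims to be (hypothetical helper)."""
    if value.startswith(":"):
        # Escape hatch: whatever follows the colon means whatever the
        # implementation says it means.
        return "implementation-defined"
    # Anything else is supposed to follow the POSIX std/offset/dst syntax.
    # This sketch does not check that it actually does.
    return "posix"
```

Note that a value such as "Some words with spaces" lands in the "posix"
branch here, which is exactly the case where the setter has broken the
rules and the consumer's behaviour is anyone's guess.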

But this is slightly beside the point.  We're not actually talking
about a value of a TZ variable.  We're talking about a field in
tzfiles, which is a format not defined by POSIX but by tzfile(5).
That says "After the second header and data comes a newline-enclosed,
POSIX-TZ-environment-variable-style string ...".  It explicitly
incorporates by reference the POSIX definition of the meaning of TZ
values.  I based my implementation on that.
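
For anyone who wants to look at that field directly, the following is a
rough Python sketch of pulling it out of a version 2 (or later) tzfile,
following the layout described in tzfile(5).  It is not my implementation;
the names read_footer, _block_size and the count variable names are
made up for this example, and error handling is minimal.

```python
import struct

# Header: magic, version byte, 15 reserved bytes, six 32-bit counts.
HEADER = struct.Struct(">4sc15x6I")

def _block_size(counts, time_size):
    """Size in bytes of the data block that follows a header."""
    isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt = counts
    return (timecnt * time_size          # transition times
            + timecnt                    # transition type indices
            + typecnt * 6                # local time type records
            + charcnt                    # abbreviation characters
            + leapcnt * (time_size + 4)  # leap second records
            + isstdcnt + isutcnt)        # standard/wall and UT/local flags

def read_footer(data: bytes):
    """Return the newline-enclosed trailing string, or None for version 1."""
    magic, version, *counts = HEADER.unpack_from(data, 0)
    if magic != b"TZif":
        raise ValueError("not a tzfile")
    if version == b"\0":
        return None  # version 1: no second header, no trailing string
    # Skip the first header and its data block (4-byte times).
    pos = HEADER.size + _block_size(counts, 4)
    magic, _version, *counts = HEADER.unpack_from(data, pos)
    if magic != b"TZif":
        raise ValueError("second header missing")
    # Skip the second header and its data block (8-byte times).
    pos += HEADER.size + _block_size(counts, 8)
    if data[pos:pos + 1] != b"\n":
        raise ValueError("trailing string does not start with a newline")
    end = data.index(b"\n", pos + 1)
    return data[pos + 1:end].decode("ascii")
```

Feeding it the bytes of a compiled zone file (for instance one read from
wherever your system keeps its zoneinfo tree) gives back the string that
is supposed to be a POSIX TZ value.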

Actually I've now modified my implementation of the tzfile format to
grandfather that specific space-containing value for this final field.
If the last local time type in the file is UT with the funky abbreviation,
and the `POSIX-TZ' field has the funky non-POSIX value, then it uses the
last local time type, and refrains from passing the non-POSIX `POSIX-TZ'
value on to the POSIX-TZ parsing code.  This hack is in the same vein
as one I already had, where a local time type of UT with abbreviation
"zzz" is treated as not defining local time (which for this code *is*
distinct from being defined as UT with abbreviation "zzz").  I've made
both of these deviations from tzfile(5) in order to better handle the
tzfiles that are actually found in the wild.
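
In outline, the two special cases look something like the Python sketch
below.  This is an illustration of the logic only, not the actual code:
LocalTimeType, future_rule and defines_local_time are hypothetical names,
the funky abbreviation and funky field value are passed in as parameters
rather than spelled out, and "UT" is assumed to mean simply an offset of
zero.

```python
from typing import NamedTuple

class LocalTimeType(NamedTuple):
    utoff: int    # offset from UT in seconds
    is_dst: bool
    abbrev: str

def future_rule(last_type, footer, funky_abbrev, funky_footer, parse_posix_tz):
    """Decide what governs times after the last transition in the file."""
    if (last_type.utoff == 0
            and last_type.abbrev == funky_abbrev
            and footer == funky_footer):
        # Grandfathered case: use the last local time type and never
        # hand the non-POSIX value to the POSIX-TZ parser.
        return last_type
    # Normal case: the field is taken to be a POSIX TZ value.
    return parse_posix_tz(footer)

def defines_local_time(t):
    """A UT type abbreviated "zzz" is treated as not defining local time."""
    return not (t.utoff == 0 and t.abbrev == "zzz")
```

Both checks are deliberately narrow: they match only the exact
combination seen in the wild, so any other malformed value still reaches
the parser and gets rejected there.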

So, erm, attempting to get back to some kind of point, I see two main
reasons to pay attention to POSIX's rules here.  Firstly, because
tzfile(5) says that that field is for a POSIX TZ value, so it's a good
idea to make sure that what's put in there really is a POSIX TZ value,
so that you can hand it off to any code (not necessarily your own)
that parses POSIX TZ values.  You could of course change tzfile(5)
so that it no longer claims that non-empty values in that field are
always POSIX TZ values, but this would dilute the value of the field.
It's *useful* to conform to a standard protocol in this field.

Secondly, a direct implication of the POSIX rules on TZ is that timezone
abbreviations are expected to be composed from a very limited set of
characters, in particular not including space.  This too is a protocol,
and a useful one, as I alluded to in my previous message.  Even if you're
not handling TZ values, you may be handling local time abbreviations,
and knowing what they'll look like is useful.  So even if tzfile(5)
didn't invoke the POSIX TZ rules at all, it would still be a good idea
for all the abbreviations in a tzfile to satisfy POSIX's rules for local
time abbreviations.
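
As a rough illustration of what "satisfying POSIX's rules" means for an
abbreviation, here is a small Python check.  It assumes the most
permissive character set POSIX allows for a name (the one for the quoted
<...> form: alphanumerics plus '+' and '-') and the minimum length of
three.  The function name is hypothetical, and the upper length limit
(TZNAME_MAX) varies by system, so it is left as an optional parameter.

```python
import re

# At least three characters, each an ASCII letter, digit, '+' or '-'.
_ABBREV = re.compile(r"[A-Za-z0-9+-]{3,}")

def is_posix_abbreviation(abbrev, max_len=None):
    """Check an abbreviation against the POSIX TZ name rules (sketch)."""
    if max_len is not None and len(abbrev) > max_len:
        return False
    return _ABBREV.fullmatch(abbrev) is not None
```

Running this over every abbreviation in a tzfile before writing it out
would catch anything containing a space.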

-zefram


