Time zone: the next generation
Ken Pizzini
tz. at explicate.org
Sun Mar 6 03:16:49 UTC 2005
On Thu, Mar 03, 2005 at 02:49:27PM -0500, Olson, Arthur David (NIH/NCI) wrote:
> But the zone information compiler (zic) still produces binary files with
> 32-bit transition time values. Something's gotta give.
>
> As long as we're making changes, it's best to do as much as possible
> (to avoid the need for further change down the road).
>
> I've listed problems, approaches, and questions below.
> Much of this material is related to general matters of time rather than
> specific matters of time zones; my apologies.
My opinion is that opening up the scope of the TZ database to
include all historic calendars used before the introduction
of standardized time zones in the 19th century is too ambitious,
as is including non-earth-referenced clocks (e.g., Martian time).
While I agree that some flexibility about what calendar is used
is in order (so that, for example, we might handle the new
Israeli time zones by reference to the Hebrew calendar), I think
the primary problem that this code needs to address is the
simpler, but already complex, one of:
Given:
  * a location on earth [probably specified broadly, as with the
    current tz zone-name based approach, but perhaps we can include
    polygon information allowing a latitude+longitude based
    selection?],
  * a "Modified Julian Date" (Gregorian 1970-01-01 CE == MJD 40587)
    [specified relative to a convenient-to-the-program location
    on earth, not necessarily the location mentioned above],
  * a reference clock (TAI/GPS, UTC/UTS/UT1, other),
  * and a count-of-seconds-since-epoch on that reference clock
    [note that TAI and GPS based counters would count SI seconds,
    but UTC and UTS based ones would only count "non-leap" seconds,
    and I guess a UT1 clock would probably have "seconds" of
    irregular lengths?]
return:
  * the local MJD
  * the local (zone-adjusted) time
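
A rough sketch of the core mapping described above, in Python for brevity.
It assumes the reference clock is UTC (counting only non-leap seconds), that
the counter's epoch is 1970-01-01 (an assumption; only the MJD of that date
is given above), and that the UTC offset for the location has already been
looked up. The names LocalTime and to_local are made up for illustration.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400
MJD_OF_EPOCH = 40587  # Gregorian 1970-01-01, assumed here to be the counter's epoch


@dataclass(frozen=True)
class LocalTime:
    mjd: int            # local Modified Julian Date
    second_of_day: int  # seconds since local midnight


def to_local(seconds_since_epoch: int, utc_offset: int) -> LocalTime:
    """Map a non-leap-second count on a UTC clock to local MJD and time of day.

    utc_offset is in seconds east of UTC and is assumed to be already
    resolved for the location and instant in question.
    """
    # divmod floors toward negative infinity, so instants before the
    # epoch still yield a second_of_day in the range 0..86399.
    days, second_of_day = divmod(seconds_since_epoch + utc_offset, SECONDS_PER_DAY)
    return LocalTime(MJD_OF_EPOCH + days, second_of_day)
```

A TAI or GPS reference clock would need a leap-second table consulted before
this step; that part is deliberately left out of the sketch.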
Beyond that, we will of course want to comply with the C and POSIX
time APIs, and so will need to at least translate between MJD values
and proleptic Gregorian calendar dates. Slightly more ambitious,
but still within the bounds of reason, we may wish to add support for
the Hebrew calendar (which we would need to do internally anyway,
if we wish to support the new time zones in Israel without having
to resort to the current solution of special-casing every year).
We may also wish to add ephemeris calculations so that we can correctly
adjust for local-sun-time, whether for Saudi Arabia in the late 1980s,
or for pre-timezone locales (or not).
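
For what it's worth, the MJD <-> proleptic Gregorian translation is small.
The sketch below leans on Python's datetime.date, which already uses the
proleptic Gregorian calendar; the function names are made up, and the range
is limited to what datetime.date supports (years 1 through 9999).

```python
from datetime import date

# date.toordinal() counts days in the proleptic Gregorian calendar,
# with 0001-01-01 as day 1. Anchor the offset on the known pair
# Gregorian 1970-01-01 == MJD 40587.
_MJD_TO_ORDINAL = date(1970, 1, 1).toordinal() - 40587


def mjd_to_gregorian(mjd: int) -> date:
    """Return the proleptic Gregorian date for an integer MJD."""
    return date.fromordinal(mjd + _MJD_TO_ORDINAL)


def gregorian_to_mjd(d: date) -> int:
    """Return the integer MJD for a proleptic Gregorian date."""
    return d.toordinal() - _MJD_TO_ORDINAL
```

A Hebrew-calendar translation would sit beside these two functions and work
from the same MJD value.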
We should certainly leave the derived MJD exposed for other APIs to
translate into dates on other calendars, but once again I claim that,
just as we are less complete about time zone definitions prior to 1970,
this code base need not directly cater to other calendar expressions.
As to "zic file" format: I agree that we should revisit the choice
of compiling the zone files to a binary format. Absent a good
and still valid reason for it, I'd much prefer that we go to a
"termcap-like" model. I'd say that the primary human-maintained
tables would include many (all?) of the features that we currently
have (such as local-time based transition references, aliases/links,
and shared zone-transition rules), but the installed run-time version,
while still text that can be edited (so that local installations
can make tweaks before official updates are available), should
be pre-processed as much as practical (e.g., eliminating external
cross-references and converting all times to UTC).
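
As an illustration of the "converting all times to UTC" step, here is a
minimal Python sketch of how a preprocessor might rewrite a transition that
the human-maintained table gives in local wall-clock time. The function name
and the (day shift, second of day) return shape are assumptions of mine, not
anything zic does today.

```python
SECONDS_PER_DAY = 86400


def local_transition_to_utc(local_second_of_day: int, utc_offset: int) -> tuple[int, int]:
    """Convert a local wall-clock transition time to UTC.

    utc_offset is the offset in effect just before the transition, in
    seconds east of UTC. Returns (day_shift, utc_second_of_day), where
    day_shift is -1, 0 or +1 and tells the preprocessor to move the
    rule's day (and day-of-week) by that many days.
    """
    day_shift, utc_second = divmod(local_second_of_day - utc_offset, SECONDS_PER_DAY)
    return day_shift, utc_second
```

For example, a transition at 02:00 local time (7200) under an offset of
+10800 comes out as (-1, 82800): 23:00 UTC on the previous day, so the day
rule has to be shifted back by one as well.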
To that end, I'll toss out a "pre-alpha" idea about what an entry
might hold (omitting a fair amount of complexity that will eventually
be required):
tzversion:tzcode-zic/2005f
name:Asia/Jerusalem
clock:UTC
valid_start:53430
valid_end:open
standard_abbr:IST
standard_offset:+7200
daylight_abbr:IDT
daylight_offset:+10800
daylight_start_day:dow=5 & ((mon=4g & mday=1g) | (mon=3g & 25g<mday))
daylight_start_time:0
daylight_end_day:dow=5 & mon=1H & 1H < mday & mday<9H
daylight_end_time:82800
Note that while in Israel the daylight_end_day would be on the
*Saturday* (dow=6) preceding 10 Tishri (2H<mday & mday<10H),
this entry has been preprocessed to reference the date/time
in UTC. Also, the use of "g" and "H" suffixes to reference the
Gregorian and Hebrew calendars is almost certainly a bad notation,
but I needed something to use for my example...
(NB: MJD 53430 = 2005-03-01, when the law passed in the Knesset;
a different start date might be more appropriate, I'm not sure.)
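
To show how little code a run-time reader for such an entry would need, here
is a minimal Python sketch. It handles only flat "key:value" lines, converts
fields ending in _offset or _time to integers, and leaves everything else
(including the day rules) as strings; parse_entry and that suffix convention
are my own assumptions, and the sample repeats only a few of the fields.

```python
def parse_entry(text: str) -> dict:
    """Parse one 'key:value' entry into a dict.

    Values of keys ending in '_offset' or '_time' become ints (seconds);
    all other values are kept as strings.
    """
    int_suffixes = ("_offset", "_time")
    entry = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first ':' only, so values may contain ':' themselves.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in line: {line!r}")
        entry[key] = int(value) if key.endswith(int_suffixes) else value
    return entry


SAMPLE = """\
name:Asia/Jerusalem
clock:UTC
standard_abbr:IST
standard_offset:+7200
daylight_abbr:IDT
daylight_offset:+10800
daylight_end_time:82800
"""

entry = parse_entry(SAMPLE)
```

Evaluating the day-rule expressions is the hard part and is not attempted
here.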
Thinking about the broader problem a little more, perhaps it would
make sense to use XML for the run-time format? One of the first
problems I see with the notation I used above is that it gets
quite verbose and redundant for zones which have had many changes
only in the start-day/end-day rules. Two good things about XML
are that there are some good and fast parsers out there, and it
is a well-known standard, allowing other applications to easily
leverage our data. The bad thing is that it would either add an
external dependency to the code, or require that we bundle a parser.
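
To make the XML idea slightly more concrete, here is a Python sketch using
the standard library's xml.etree.ElementTree. The element and attribute
names are invented for this example, and "Example/Zone" with its offsets,
MJD ranges and day rules is made-up placeholder data, not a real zone. The
point is only that fields which never change are stated once, and each
period carries just its own day rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: shared fields on <zone>, <standard> and <daylight>;
# one <period> per set of start-day/end-day rules.
SAMPLE_XML = """\
<zone name="Example/Zone" clock="UTC">
  <standard abbr="XST" offset="3600"/>
  <daylight abbr="XDT" offset="7200"/>
  <period valid_start="50000" valid_end="51000" start_day="mon=4g" end_day="mon=10g"/>
  <period valid_start="51000" valid_end="open" start_day="mon=3g" end_day="mon=11g"/>
</zone>
"""


def load_zone(xml_text: str) -> list:
    """Expand a zone into one flat dict per period, repeating the shared fields."""
    root = ET.fromstring(xml_text)
    shared = {
        "name": root.get("name"),
        "clock": root.get("clock"),
        "standard_offset": int(root.find("standard").get("offset")),
        "daylight_offset": int(root.find("daylight").get("offset")),
    }
    # Period attributes stay as strings; only the shared offsets are converted.
    return [{**shared, **period.attrib} for period in root.findall("period")]


periods = load_zone(SAMPLE_XML)
```

One wrinkle: rule expressions written with "&" and "<", as in my notation
above, would have to be escaped (&amp;, &lt;) inside XML attributes.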
Anyway, there's my first 0.02 euros on the subject.
--Ken Pizzini