Time zone: the next generation

Olson, Arthur David (NIH/NCI) olsona at dc37a.nci.nih.gov
Thu Mar 3 19:49:27 UTC 2005


Time zone package: the next generation

Over the last year we've done as much as possible,
within the existing time zone package framework,
to cope with systems with 64-bit time_t values.
But the zone information compiler (zic) still produces binary files with
32-bit transition time values. Something's gotta give.

As long as we're making changes, it's best to do as much as possible
(to avoid the need for further change down the road).

I've listed problems, approaches, and questions below.
Much of this material is related to general matters of time rather than
specific matters of time zones; my apologies.

PROBLEMS

*	Future transition times/past transition times
	The binary files produced by zic record transition times as 32-bit
	values; times after 2038 (or before 1901) cannot be represented.
	(The future limit can be extended to 2106 by treating the values as
	unsigned, but if that's done times before 1970 cannot be
	represented.)
*	Transitions in Israel
	Israel now goes back to standard time in the fall on the Saturday
	before Yom Kippur; there's no convenient way to represent this in
	the input to zic.
*	Julian-Gregorian transition
	Signed 32-bit time_t values can only represent years going back to 
	1901; this means that for most areas of the world the Gregorian
	calendar is in effect for all times representable by such a time_t.
	Signed 64-bit time_t values have a far greater range; the range
	always includes all instants when areas switched from Julian to
	Gregorian. There's no provision for handling the switch in the
	time zone package.
	(The transition happens at different times in different places;
	in addition to handling the jump over a number of days, there's
	also the matter of figuring out whether a year ending in 00 is
	a leap year in a particular place.)
*	Year zero
	Some year numbering schemes skip over the year zero; others do not.
	There's no provision for specifying whether or not to skip in the
	time zone package.
*	Early years of Julian calendar
	Leap years were inserted every three (rather than four) years
	early in the life of the Julian calendar; some leap years were
	skipped later to make up for this. Month lengths (and names) were
	also in flux for a while. Documentation of the glitches is shaky;
	there's no way to reflect what documentation we do have.
*	Pre-Julian calendars
	The time zone package cannot handle information about the Roman
	Republic calendar or any of its predecessors.
*	Non-Julian-Gregorian calendars
	The time zone package cannot handle information about
	non-Julian-Gregorian time schemes (Mayan, Martian, and so on).
*	Big Bang/Big Crunch
	Signed 64-bit time_t values have enough range to go back to the
	theorized time of the Big Bang origin of the universe and thus back
	to the start of time itself. Some folks might want to fold all
	instants "before" the Big Bang into that instant. At the other end,
	advocates of the Big Crunch theory might want to treat time_t values
	greater than the predicted Crunch instant as if they were the Crunch
	instant. There's no way to do such pegging in the time zone package.
*	Creation/Apocalypse
	Some folks might want to peg past times at a predicted time of
	Creation and peg future times at a predicted Apocalypse.
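
The range limits, the leap-year difference, and the pegging idea above can
be illustrated with a short sketch. This is Python for illustration only,
not code from the time zone package; the names utc_from_seconds,
is_leap_year, and peg are hypothetical, and the Julian rule shown is the
plain every-fourth-year rule that ignores the early irregularities.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def utc_from_seconds(seconds):
    """Convert seconds since 1970-01-01 00:00:00 UTC to a datetime."""
    return EPOCH + timedelta(seconds=seconds)

# Limits of a signed 32-bit transition time.
SIGNED_MIN = utc_from_seconds(-2**31)      # falls in December 1901
SIGNED_MAX = utc_from_seconds(2**31 - 1)   # falls in January 2038
# Upper limit if the same 32 bits are read as unsigned
# (at the cost of losing every instant before 1970).
UNSIGNED_MAX = utc_from_seconds(2**32 - 1)  # falls in February 2106

def is_leap_year(year, gregorian):
    """Leap-year test under the Gregorian or the (regular) Julian rule."""
    if gregorian:
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    return year % 4 == 0

def peg(t, earliest, latest):
    """Clamp a time value into [earliest, latest] (assumed pegging rule)."""
    return max(earliest, min(t, latest))

if __name__ == "__main__":
    print(SIGNED_MIN.isoformat(), SIGNED_MAX.isoformat())
    print(UNSIGNED_MAX.isoformat())
    # A year ending in 00 is where the two calendars disagree.
    print(is_leap_year(1900, gregorian=True),
          is_leap_year(1900, gregorian=False))
```

Which rule applies to a given year ending in 00 depends on when a place
made the switch, which is exactly the per-place information the package
has no way to record today.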

APPROACHES

*	Do nothing
	Since things don't get sticky until at least 2037,
	it's possible to wait (at least for a while) before taking action.
*	Tweak the binary file format
	At the least this would involve widening stored transition times
	beyond 32 bits. It might also be necessary to widen offsets as a
	way of coping with Julian/Gregorian shifts and year zero skips.
*	Abandon binary files
	We're now operating on the "terminfo" model in which
	human-readable descriptions are converted to binary form (with some
	precomputation done) for use by programs. We could shift to the
	earlier "termcap" model, simply copying files such as "asia" and
	"northamerica" to a public directory and interpreting them at run
	time. This eliminates the need to change binary file formats (since
	such files disappear); there might still be a need to change the
	source file format if we wanted to do things such as handle
	Julian/Gregorian transitions. Responsibly taking this approach might
	involve learning why the termcap-to-terminfo transition occurred,
	and whether the reasons are still applicable in today's computing
	environment.
*	Preprocess
	We could change "zic" so that for each zone it outputs a file with
	only those Rule and Zone lines required for the zone; there could be
	some simplification of the output (such as expressing all times in
	UTC) to ease interpretation. Again there would be run-time
	interpretation, but the job would be scaled down (by
	pre-identification of relevant data) and simplified.
*	Change zic's output to another format
	We could take the vzic route of reading the existing source files
	and producing VZONEINFO format output. We might want to extend the
	VZONEINFO format (for example, to handle leap seconds). We might
	want to produce output in some other existing format, or in a newly
	designed format.
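
To make the "tweak the binary file format" approach concrete, here is a
minimal sketch of what widening a stored transition time means. It is
Python for illustration, the function name pack_transition is
hypothetical, and the one-value-per-record layout is an assumption; it
is not the layout zic actually writes.

```python
import struct

def pack_transition(t, wide):
    """Pack one transition time, big-endian, as 32 or 64 signed bits."""
    fmt = ">q" if wide else ">l"   # q: 8-byte signed, l: 4-byte signed
    return struct.pack(fmt, t)

def fits_in_32_bits(t):
    """Report whether a transition time survives the narrow encoding."""
    try:
        pack_transition(t, wide=False)
    except struct.error:
        return False
    return True

if __name__ == "__main__":
    first_unrepresentable = 2**31   # one second past the signed 32-bit limit
    print(fits_in_32_bits(first_unrepresentable - 1))
    print(fits_in_32_bits(first_unrepresentable))
    print(len(pack_transition(first_unrepresentable, wide=True)))
```

The wide form doubles the space taken by each stored time, which is one
of the costs to weigh against the other approaches listed.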
	
QUESTIONS

*	Do we handle Julian/Gregorian transitions?
*	Do we allow control of skipping the year zero?
*	Do we handle early-Julian leap year variations?
*	Do we handle pegging of far-past and far-future times?
*	For each of the above:
	*	What default assumption should be used in zic
		and in run-time software?
	*	Can the assumption be overridden in a time zone source file?
		If so, how?
	*	Can the assumption be overridden using an environment
		variable?
		If so, how?
	*	Can the assumption be overridden with function calls?
		If so, how?
*	Do we simplify handling of transitions tied to
	non-Gregorian-calendar events (such as Yom Kippur)?
*	Do we handle pre-Julian or non-Julian-Gregorian time schemes?
*	When Sherman types...
		%horton TZ=Europe/Rome wayback March 15 -44
	...will he and Mr. Peabody witness an assassination?
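
The year-zero question bears directly on Sherman's command. A minimal
sketch, in Python for illustration, of how the same label "-44" names
different years under the two numbering schemes; the function names are
hypothetical, and "astronomical" here means the scheme in which 1 BC is
year 0.

```python
def label_to_astronomical(year, skips_zero):
    """Map a year label to astronomical numbering (1 BC == year 0)."""
    if not skips_zero:
        return year                 # label already counts through zero
    if year == 0:
        raise ValueError("this scheme has no year zero")
    # In a zero-skipping scheme, -1 is 1 BC, which is astronomical 0.
    return year + 1 if year < 0 else year

def astronomical_to_bc(year):
    """Return the BC year for an astronomical year <= 0."""
    if year > 0:
        raise ValueError("not a BC year")
    return 1 - year

if __name__ == "__main__":
    for skips_zero in (True, False):
        astro = label_to_astronomical(-44, skips_zero)
        print(skips_zero, astro, astronomical_to_bc(astro), "BC")
```

Only the zero-skipping reading lands on 44 BC; the other reading
arrives a year early.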

So...before discussing any of the above in detail...are the problems,
approaches, and questions above correct and complete? If not, what should
be changed?

				--ado


