Time zone: the next generation

Robert Elz kre at munnari.OZ.AU
Sun Mar 6 15:25:56 UTC 2005


    Date:        Thu, 3 Mar 2005 14:49:27 -0500 
    From:        "Olson, Arthur David (NIH/NCI)" <olsona at dc37a.nci.nih.gov>
    Message-ID:  <75DDD376F2B6B546B722398AC161106C74038F at nihexchange2.nih.gov>

  | But the zone information compiler (zic) still produces binary files with
  | 32-bit transition time values. Something's gotta give.

Yes, a revision makes sense.

But

  | As long as we're making changes, it's best to do as much as possible
  | (to avoid the need for further change down the road).

No, please try to avoid typical 2nd system effects, with the grand
temptation to add everything that anyone can imagine might possibly
be of some interest to someone, somewhere, sometime.   Change only
what absolutely needs changing because of demonstrated need now.  What
exists is currently pretty good.   The 64 bit issue is certainly going to
bite sometime, so that one ought to be fixed; there's nothing else (or
very little) so seriously wrong with what exists now that it is
important to change, I suspect.

If sometime in the future someone has a problem that is reasonably
solved in this set of code, then we (or someone) can solve their
concrete problem at the time it is presented.   Until there's a real
problem to solve, any solutions adopted are more likely to be problems
than answers.

  | *	Future transition times/past transition times
  | 	The binary files produced by zic record transition times as 32-bit
  | 	values; times after the 2038 (or before 1901) cannot be represented.
  | 	(The future limit can be extended to 2106 by treating the values as
  | 	unsigned, but if that's done times before 1970 cannot be
  | 	represented.)

Yes, a wider range makes sense.
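
For reference, the limits quoted above follow directly from the width of
the stored value.   A minimal sketch (the helper name utc_from_seconds is
made up for illustration, it is not part of zic or the tz code) that
converts the extreme 32-bit values to UTC dates:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_from_seconds(seconds):
    """Convert seconds since 1970-01-01 UTC to a datetime (negative allowed)."""
    return EPOCH + timedelta(seconds=seconds)


SIGNED_32_MIN = -2**31        # earliest signed 32-bit value
SIGNED_32_MAX = 2**31 - 1     # latest signed 32-bit value
UNSIGNED_32_MAX = 2**32 - 1   # latest value if the field is read as unsigned

for label, value in [
    ("signed min", SIGNED_32_MIN),
    ("signed max", SIGNED_32_MAX),
    ("unsigned max", UNSIGNED_32_MAX),
]:
    print(label, utc_from_seconds(value).isoformat())
# signed min 1901-12-13T20:45:52+00:00
# signed max 2038-01-19T03:14:07+00:00
# unsigned max 2106-02-07T06:28:15+00:00
```

Reading the field as unsigned moves the upper limit to 2106, but zero
becomes the lowest value, which is why nothing before 1970 fits.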

  | *	Transitions in Israel
  | 	Israel now goes back to standard time in the fall on the Saturday
  | 	before Yom Kippur; there's no convenient way to represent this in
  | 	the input to zic.

In the input language, perhaps - but all that's needed there is a
slightly more flexible version of the script processing that's needed
now for US presidential election years (or once was), and for some of
the wacky rules that have been used in parts of Australia.   The actual
(binary) zone file format for this is just fine.

  | *	Julian-Gregorian transition

Forget it.   Those interested in archeology/anthropology/astronomy can use
their own calendar methods.   All that needs to be dealt with here is the
current time, and reasonable timestamps for events that have occurred in the
computer age, and are reasonably likely to still exist.  Going back
to 1970 is plenty early enough (I'm not sure we need anything more than
that).

Don't attempt to solve everyone's problems.   Pick ours, and solve that
one, and leave all the rest alone.   (Note, this isn't meant to demean
the importance of everyone else's issues, just to limit our effort to
what we know how to handle properly - guessing what someone else might
find useful is just plain dumb.)


  | APPROACHES
  | 
  | *	Do nothing
  | 	Since things don't get sticky until at least 2037,
  | 	it's possible to wait (at least for a while) before taking action.

No, we need enough future time for planning purposes, so something will
have to be done before (at about the latest) 2025.   Then we need deployment
time before that (time for everything to get upgraded, and all the old
stuff that doesn't get upgraded to die away).  That means the new stuff
needs to be ready to be shipped by 2015 or so, I think.   When in the next
10 years the decisions get made and the code written probably doesn't
matter all that much.

  | *	Tweak the binary file format

Yes.

  | *	Abandon binary files

No.   The text files require too much knowledge and processing.   As long
as the binary file model exists, it doesn't matter how long that processing
takes (it is only really necessary to ever process ascii->binary once a year
or so), or what resources are needed to perform the conversion.  So, it
is entirely reasonable to run a program (for every year data is being 
generated) to calculate Israeli -> Gregorian conversions.   It isn't
reasonable to do that in every program that wants a time_t -> struct tm
conversion.
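
A rough sketch of why the compiled form keeps the run-time side cheap:
once the rules have been expanded into a sorted list of transition
times, finding the offset for a timestamp is a single binary search.
The table contents and the names below are invented for illustration;
this is not the actual zone file layout or the tz library code.

```python
from bisect import bisect_right

# Hypothetical precomputed table, as a compiler pass might emit it:
# (transition time in seconds since the epoch, UTC offset in seconds
# that applies from that instant onward).  Values are made up.
TRANSITIONS = [
    (1000, 3600),
    (2000, 0),
    (3000, 3600),
]
TRANSITION_TIMES = [when for when, _ in TRANSITIONS]
INITIAL_OFFSET = 0  # offset in force before the first transition


def utc_offset_at(timestamp):
    """Return the UTC offset in effect at timestamp using binary search."""
    index = bisect_right(TRANSITION_TIMES, timestamp)
    if index == 0:
        return INITIAL_OFFSET
    return TRANSITIONS[index - 1][1]


print(utc_offset_at(1500))  # 3600
print(utc_offset_at(2500))  # 0
```

All the expensive calendar work happens when the table is built, so the
lookup needs no knowledge of the rules that produced it.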

  | *	Change zic's output to another format

The current format seems to work pretty well.   The only thing beyond
field widths (and that's a trivial change) that you might want to do
is profile some programs that do a lot of date conversions (ls -l on
a huge directory or something) and see if the file format is adding
unnecessary overhead.   If it is (and only if it is), consider whether
some optimisation might be made that could improve access in the
common cases.   Perhaps "this year" (the year in which the
ascii->binary conversion is done) could be made really fast to
access, on the assumption that the current year is accessed more
often than any other, but only (*only*) if profiling suggests a
detectable win will be possible by adopting this approach.

  | *	Do we handle Julian/Gregorian transitions?
No.

  | *	Do we allow control of skipping the year zero?
No.

  | *	Do we handle early-Julian leap year variations?
No.

  | *	Do we handle pegging of far-past and far-future times?
If you're going to have a very wide allowable range, putting some
limits on what gets converted makes sense.  Pretending we know how
times and calendars will be done for thousands of years into the
future is absurd, just look at what has changed in the past few
hundred years - and decades for DST.   Handle times forward based
upon current assumptions for a couple of hundred years at most, and
treat all the rest as someone else's problem.   Backwards, 1970 is
far enough to be accurate.
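
One way to read "pegging" is as a simple clamp on the value before it
is converted.   A minimal sketch, assuming a lower bound of 1970 and an
upper bound of the start of the year 100 years after "now"; the
function name, the parameter names and the choice of a year boundary
are all assumptions made for the example:

```python
from datetime import datetime, timezone


def peg_timestamp(timestamp, now, horizon_years=100):
    """Clamp timestamp (seconds since the epoch) to [1970, now + horizon]."""
    lower = 0  # 1970-01-01 00:00:00 UTC
    # Use the start of the target year so that Feb 29 needs no special case.
    horizon = datetime(now.year + horizon_years, 1, 1, tzinfo=timezone.utc)
    upper = int(horizon.timestamp())
    return max(lower, min(timestamp, upper))


now = datetime(2005, 3, 6, tzinfo=timezone.utc)
print(peg_timestamp(-86400, now))         # 0, pegged to 1970
print(peg_timestamp(1_000_000_000, now))  # 1000000000, inside the range
print(peg_timestamp(2**40, now) == peg_timestamp(2**50, now))  # True
```

Anything outside the window collapses onto the nearest edge, so the
conversion code never has to pretend it knows the rules out there.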

  | *	For each new of the above:
  | 	*	What default assumption should be used in zic
  | 		and in run-time software?
Back to 1970, forward to today + 100 years (maybe 200, no more).
 
  | 	*	Can the assumption be overridden in a time zone source file?
Not worth the effort.  Aside from anything else, it means that applications
behave differently in different environments, and that's to be avoided if
at all possible.

  | 	*	Can the assumption be overridden using an environment
  | 		variable?
Definitely not.

  | 	*	Can the assumption be overridden with function calls?
No.   Any overriding means the data has to exist.   The data doesn't
exist (for either the distant future or the distant past).   We're
still guessing just what rules some parts of the world used for daylight
saving in the past few years...

  | *	Do we simplify handling of events tied to
  | 	non-Gregorian-calendar-related events (such as Yom Kippur)?
By allowing an external script (and hence program) to supply rule
info when zic runs.   That's all that's needed.
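
As a sketch of what such an external script could look like: given a
holiday date worked out elsewhere, find the Saturday before it and
print a one-year Rule line for zic to read.   The rule name, the 2:00
switch time, the SAVE and LETTER values and the input date are all
placeholders chosen for illustration; a real script would get the
holiday date from a proper Hebrew-calendar calculation, and how zic
would invoke the script is left open here.

```python
from datetime import date, timedelta

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def saturday_before(day):
    """Return the last Saturday strictly before day."""
    # date.weekday(): Monday is 0, so Saturday is 5.
    days_back = (day.weekday() - 5) % 7
    if days_back == 0:
        days_back = 7  # day is itself a Saturday; go back a full week
    return day - timedelta(days=days_back)


def rule_line(name, transition, at="2:00", save="0", letter="S"):
    """Format a single-year Rule line: NAME FROM TO TYPE IN ON AT SAVE LETTER."""
    fields = ["Rule", name, str(transition.year), "only", "-",
              MONTHS[transition.month - 1], str(transition.day),
              at, save, letter]
    return "\t".join(fields)


# Illustrative input only; the holiday date comes from the caller.
holiday = date(2005, 10, 13)
print(rule_line("Zion", saturday_before(holiday)))
```

The point is only that the odd calendar arithmetic lives in a script
run at compile time, and zic itself sees nothing but ordinary Rule
lines.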

  | *	Do we handle pre-Julian or non-Julian-Gregorian time schemes?
No.

kre



