new time conversion interface

Nathan Myers ncm at best.com
Tue Nov 3 21:51:21 UTC 1998


Thank you, Dan, Paul, and Ian, for your comments.
 
DJB wrote (in his acerbic way):
> Nathan Myers writes:
> > I believe that conversions should yield a struct containing
> > a reference timestamp, followed by up to four offsets tagged with
> > official wall-clock offset, interpretation, and confidence values.  
> 
> How exactly do you expect programs to use this information?
> 
> I can imagine a warning such as ``The time is jumping from 2:00 CDT back
> to 1:00 CST; I assume you meant 1:30 CDT; if you actually wanted 1:30
> CST, type 1:30 CST.'' But this doesn't use your vague ``confidence
> values''; it uses hard data about the time zone.

Current programs handle time and time zones badly in large part
because programmers understand the problems poorly or not at all.
_That won't change._  Exposing programmers to more complexity than 
they understand, such as handing them a time zone transition database 
to explore (to see if their apparently successful conversion was in
fact dubious), will not help matters appreciably.  At best, it will
slow down conversions in correct programs by orders of magnitude.

The goal is to increase the number of programs that do something
meaningful, and the way to get there is to encapsulate complexity.
Certainly any program that needs to delve into a transition database
should do so, but most should not.  Users presented with such a level
of detail are ill-equipped to evaluate it anyway.  We improve matters
by capturing our understanding in the code, and presenting a summary
of meaningful results.

Yes, the annotations I wrote were vague -- as noted in the posting,
and as is appropriate in the sketch for a design.  In a firm design 
things become a lot more precise.  (Where I was too precise, e.g. 
"+/- 43200 seconds", it created a distraction.)

Different programs will use the information differently; that's
the *point*.  A program that just needs a low-precision timestamp 
can use t0 and ignore the quibbles.  An interactive program can 
present a confirmation query to the user.  Non-interactive programs 
might log parts of the conversion report along with the timestamp,
or use the report as a clue that they need to spend the time 
digging into the transition database -- otherwise a waste of time, 
in the common case.
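
As a rough illustration only (Python for brevity; every name here is
hypothetical and none of it is a firm design), the report and the
three kinds of caller described above might be sketched like this.
The field names follow the annotations used in the examples quoted
further down.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Alternative:
    offset: int             # seconds to add to t0 for this reading
    wall_clock_offset: int  # official wall-clock offset, in seconds
    interpretation: str     # e.g. "official choice", "suggested substitute"
    confidence: str         # e.g. "certain", "doubtful"


@dataclass
class ConversionReport:
    t0: int                 # reference timestamp
    alternatives: List[Alternative] = field(default_factory=list)


def low_precision_timestamp(report: ConversionReport) -> int:
    """A caller that does not care about the quibbles just takes t0."""
    return report.t0


def needs_confirmation(report: ConversionReport) -> bool:
    """An interactive caller asks the user when several readings exist."""
    return len(report.alternatives) > 1


def log_line(report: ConversionReport) -> str:
    """A non-interactive caller records the report beside the timestamp."""
    notes = "; ".join(
        f"{a.offset:+d}s {a.interpretation} ({a.confidence})"
        for a in report.alternatives
    )
    return f"t0={report.t0} [{notes}]"


# Case 3 from the quoted list: autumn, 01:30 occurs twice.
autumn = ConversionReport(
    t0=123456789,
    alternatives=[
        Alternative(0, 3600, "ambiguous choice", "equal alternative"),
        Alternative(3600, 0, "ambiguous choice", "equal alternative"),
    ],
)
```

The conversion function fills in the report once; each caller then
reads only as much of it as it needs.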

Some programs "know" that the time being entered wasn't just 
read off a wall clock, and can ignore the likelihood of the clock 
not having been reset.  The conversion function doesn't know that, 
but the program can use the fact in its interpretation of the 
conversion result.
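
A minimal sketch of that filtering, assuming the alternatives arrive
as tagged tuples and that the "clock not reset" guesses can be
recognized by their interpretation tag.  The tag set and the function
name are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Alternative(NamedTuple):
    offset: int          # seconds relative to the reference timestamp
    interpretation: str  # free-form tag, as in the quoted examples
    confidence: str


# Hypothetical tags marking "the clock may not have been reset" guesses.
UNRESET_CLOCK_TAGS = {"suggested substitute", "unofficial choice"}


def relevant_alternatives(alternatives: List[Alternative],
                          read_from_wall_clock: bool) -> List[Alternative]:
    """Drop unreset-clock guesses when the caller knows they cannot apply."""
    if read_from_wall_clock:
        return list(alternatives)
    return [a for a in alternatives
            if a.interpretation not in UNRESET_CLOCK_TAGS]


# Case 4 from the quoted list: autumn, 02:30 on the morning of the change.
case4 = [
    Alternative(0, "official choice", "nominally unambiguous"),
    Alternative(3600, "unofficial choice", "possible alternative"),
]
```

The conversion function reports everything it found; the decision to
discard an alternative stays with the program that knows where the
time came from.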

Paul Hill wrote: 
> Nathan has provided an interesting list of ambiguities, but I see
> a few problems with the list.  Nathan Myers wrote:
> 
> > 0.  Unambiguous time, e.g. 04:30 morning of a time change
> >
> >   t0=123456789
> >   offset 0: 0 sec; wall-clock offset: -3600 sec;
> >      interpretation: unambiguous; confidence: certain
> >   offset 1: -3600 sec; wall-clock offset: -3600 sec.
> >      interpretation: suggested substitute; confidence: doubtful
> >
> > 1. Spring ambiguity, enter 02:30 when it doesn't exist because
> >    civil time proceeded 01:59:59 -> 03:00:00.  (Or 02:00:00 to
> >    03:00:01?  I don't know.)
> >
> >   t0=123456789
> >   offset 0: 0 sec; wall-clock offset: -3600 sec.
> >      interpretation: suggested substitute; confidence: doubtful
> >
> > 2. Spring ambiguity, enter 02:30 when it doesn't exist because
> >    civil time proceeded 01:59:59 -> 03:00:00 (or whatever).
> >
> >   t0=123456789
> >   offset 0: 0 sec; wall-clock offset: 0 sec.
> >      interpretation: official choice; confidence: Nominally unambiguous
> >   offset 1: -3600 sec; wall-clock offset: 0 sec.
> >      interpretation: suggested substitute; confidence: doubtful
> >
> > 3. Autumn ambiguity, enter 01:30 on morning when civil time proceeds
> >    from 01:59:59 to 01:00; is it the first or second 01:30 event?
> >
> >   t0=123456789
> >   offset 0: 0 sec; wall-clock offset: 3600 sec.
> >      interpretation: ambiguous choice; confidence: equal alternative
> >   offset 1: 3600 sec; wall-clock offset: 0 sec.
> >      interpretation: ambiguous choice; confidence: equal alternative
> >
> > 4. Autumn, enter 02:30, same morning as above; did they mean the
> >    official 02:30, or did they mean the second 01:30 because they
> >    failed to reset their clock?
> >
> >   t0=123456789
> >   offset 0: 0 sec; wall-clock offset: 0 sec.
> >      interpretation: official choice; confidence: Nominally unambiguous
> >   offset 1: 3600 sec; wall-clock offset: 0 sec.
> >      interpretation: unofficial choice; confidence: Possible alternative
> 
> "Because they failed to reset their clock"!  That possibility could
> apply to all times both DLS and non-DLS during all days of the year
> (or at least during some fuzzy set period in and around each time
> change), so the additional entry in #4 (caused by reading from an
> unofficial clock that wasn't changed) should be the same as the
> additional entry in #0 (caused by reading from an unofficial clock
> that wasn't changed). 

We're dealing with _humans_, here.  It's a reasonable goal to try to 
increase the reliability of time entries by noting likely errors.  
The reality is that it is extremely common for clocks to go unchanged 
for a few hours (or even a day or two) after the "official" time change.  
(I have shown up for work an hour early myself, as a result; I never 
look at a clock most Sundays.)

A note that the time changed recently is *extremely* helpful when 
getting confirmation of a time entry, but most programmers are not
equipped or inclined to root about in a transition database, 
especially when the conversion function has just done so and is 
far better-equipped to report what it found.

> In the #0 your second possibility is "doubtful
> ...  substitution", in #4 you have "possible ... unofficial".  I
> don't see them as different.  What are you trying to suggest by
> having so many categories? How can you really differentiate between
> them?

I agree that the two examples are too similar.  Originally I had
#0 as the canonical "certain" time, and then realized, as you
noted, that this condition is rare.  At some point the likelihood
of an error due to a transition becomes smaller than that of a 
simple typo; where that point lies can only be determined empirically. 

> At a minimum this appears to suggest that there is a redundancy in
> your proposed categories, but maybe I don't completely understand
> the use of the two offsets, which also differ between the two
> possibilities, but I can't see why they would.

Consider #0, then, to be the case the day after the time change.
Two days after, we can say that a simple data entry "typo" is 
equally likely and drop the alternative.
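
That rule of thumb could be sketched as below.  The two-day cutoff is
an assumption taken from the sentence above; the real value would
have to be measured.  The function name is hypothetical, and both
times are plain wall-clock readings.

```python
from datetime import datetime, timedelta

# Assumed cutoff: keep the alternative the day after the change, drop it
# two days after.  The right value is an empirical question.
UNRESET_CLOCK_WINDOW = timedelta(days=2)


def offer_unreset_clock_alternative(
        entered: datetime,
        transition: datetime,
        window: timedelta = UNRESET_CLOCK_WINDOW) -> bool:
    """True if the entered time falls soon enough after the transition."""
    elapsed = entered - transition
    return timedelta(0) <= elapsed < window


# Illustrative transition: 02:00 on Sunday 25 October 1998.
change = datetime(1998, 10, 25, 2, 0)
```

With these values, 04:30 on the 26th still gets the alternative and
04:30 on the 27th does not.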
 
> Speaking of ambiguity, could you explain what you are really trying
> to capture in #1 and #2, because your descriptions are of the same
> circumstance.  Maybe you meant #1 is one half hour after the time
> change ("2:30" is a Standard Time) and #2 is one half-hour before
> the time change ("2:30" is a DLS Time). If so, these are also the
> result of reading a clock that wasn't reset correctly and putting
> that value in the tm struct, so why are your return results different
> by more than just a sign in the offset.

Er, that's a typo.  (Note the hour of the posting. Ironic, isn't
it? :-)  My apologies for the confusion.  In the original posting 
I had two lists and it was correct in the first list, but transcribed 
wrong.

The intended entry for #2 was 03:30.  This is a valid "official"
time, unlike 02:30.  However, it is very likely to be wrong.
 
> This seems to leave us with only one other possibility, #3, the
> one hour of time that really does correctly exist twice on a wall
> clock running in both Standard and DLS time correctly.
 
Yes, this is a case that unambiguously needs attention, and is
frequently assumed to be the only such case.  As Dan mentioned, a 
program that needs precision must sometimes root around in the 
transition database.  A conversion function can offer a starting 
point, and can tell us whether we might need to look.  Anything
it can do to make looking unnecessary is good if it doesn't 
add too much complexity.
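
For the "official" part of the question, a minimal sketch, assuming a
day with a single transition and times given as seconds after local
midnight.  The function names and the three labels are hypothetical.

```python
def classify_wall_time(wall: int, transition: int, shift: int) -> str:
    """Classify a wall-clock reading on a day with one transition.

    wall       -- entered time, seconds after local midnight
    transition -- wall-clock time at which the clocks change, same units
    shift      -- seconds the clocks jump: +3600 forward, -3600 back
    """
    if shift > 0 and transition <= wall < transition + shift:
        return "nonexistent"   # skipped by the jump forward
    if shift < 0 and transition + shift <= wall < transition:
        return "ambiguous"     # occurs once before and once after the jump
    return "unique"


def hms(h: int, m: int = 0, s: int = 0) -> int:
    """Convert hours, minutes, seconds to seconds after midnight."""
    return h * 3600 + m * 60 + s


spring = classify_wall_time(hms(2, 30), hms(2), 3600)   # 02:30, clocks forward
autumn = classify_wall_time(hms(1, 30), hms(2), -3600)  # 01:30, clocks back
```

"unique" here means only that the official reading is unique.  The
doubt about a clock that was never reset is a separate matter that
this check does not address.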

Cases that are identical as far as the program is concerned
(time is an hour off because of a clock not reset) are quite
different in how the user experiences them.  That distinction
is worth preserving when presenting alternatives to the user
who entered a datum.

Nathan Myers
ncm at cantrip.org



