Question on abbreviations

Mark Davis mark.davis at icu-project.org
Wed Sep 27 22:26:59 UTC 2006


Thanks for the explanations, that helps. Could the explanation of
transitions and the default for LETTER/S before the first year be added to
zic.8.txt?

As to UNTIL, it isn't just a confusion. The documentation says:

     Input lines are made up of fields.  Fields are separated
     from one another by any number of white space characters.
     Leading and trailing white space on input lines is ignored.

Thus saying that UNTIL is a single field with interior spaces contradicts
this (and is also clumsier to parse and harder to explain).
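
To make the parsing point concrete, here is a minimal sketch that treats UNTIL as zero to four ordinary trailing fields, so a Zone line can be split purely on white space. The helper name `parse_zone_fields` and the dictionary keys are my own inventions for illustration, not anything from zic, and the sketch ignores comments and quoted strings.

```python
UNTIL_PARTS = ("year", "month", "day", "time")  # illustrative names only


def parse_zone_fields(line):
    """Split a Zone line (or a continuation line) on runs of white space."""
    fields = line.split()  # leading and trailing white space is ignored
    name = None
    if fields[:1] == ["Zone"]:
        if len(fields) < 2:
            raise ValueError("Zone line has no NAME field")
        name, fields = fields[1], fields[2:]
    # GMTOFF RULES FORMAT are required; UNTIL contributes 0 to 4 more fields.
    if not 3 <= len(fields) <= 7:
        raise ValueError("expected GMTOFF RULES FORMAT plus 0 to 4 UNTIL fields")
    gmtoff, rules, fmt = fields[:3]
    # zip() stops at the shorter input, which gives truncation from the end.
    until = dict(zip(UNTIL_PARTS, fields[3:]))
    return {"name": name, "gmtoff": gmtoff, "rules": rules,
            "format": fmt, "until": until}
```

With this reading, the continuation line `0:19:32 Neth %s 1937 Jul 1` yields an UNTIL of year 1937, month Jul, day 1, and no time, without any special case for interior spaces.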

Mark

On 9/27/06, Ken Pizzini <tz. at explicate.org> wrote:
>
> On Wed, Sep 27, 2006 at 02:37:58PM -0700, Mark Davis wrote:
> > I share your confusion. If Paul (Eggert's) description is right, then I have
> > to ignore the TO field in some circumstances which are entirely unclear to
> > me. I would much rather see the TO field corrected. That is, if TO=1942 is
> > ignored, and 1945 is the real date, then the line should be corrected to
> > TO=1945.
>
> The key to understanding is that the rules describe a list of
> *transitions*.
>
> After a transition, the described zone offset and abbreviation
> *remain* in effect until the next transition.  The "TO" part of a rule is
> used to enable a shorthand for a _recurring_ transition, such as "first
> Tuesday of February", for all years within the range.  If "to" is
> "only", then the *transition* being documented is a singleton, but
> the transitioned-into offset/abbreviation remains in effect until the
> _next_ transition, no matter how far in the future.
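
A rough sketch of that reading, simplified to whole years: FROM/TO only decides in which years a transition fires, and a lookup returns whatever the most recent transition left behind. The function names and the sample data are hypothetical, and a TO of "max" is not handled here.

```python
import bisect


def expand_years(from_year, to):
    """Years in which a rule fires; "only" means the FROM year alone."""
    last = from_year if to == "only" else int(to)
    return list(range(from_year, last + 1))


def state_at(transitions, year):
    """Return the (save, letter) left in effect by the latest transition
    at or before `year`, or None if no transition has happened yet."""
    years = [t[0] for t in transitions]
    i = bisect.bisect_right(years, year) - 1
    return None if i < 0 else transitions[i][1:]


# Hypothetical data: one singleton transition in 1940, the next in 1945.
transitions = sorted(
    [(y, "1:00", "S") for y in expand_years(1940, "only")]
    + [(y, "0", "-") for y in expand_years(1945, "only")]
)
```

Here `state_at(transitions, 1944)` still reports the 1940 transition's values, even though that rule's TO is "only", because nothing has replaced them before 1945.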
>
>
> > There are other failures in the parsing. My error messages are:
> ...
> > I looked into why this is happening, and found:
> >
> > Zone Europe/Amsterdam    0:19:32 -    LMT    1835
> >            0:19:32    Neth    %s    1937 Jul  1
>
> > But the first LETTER/S defined by Neth is in 1916, so during the range from
> > 1835 to 1916 this is undefined. If the LETTER/S are magically also defined
> > *before* the first FROM, that should be described in the specification.
>
> Yes, this is a failure of the documentation.  If a Zone refers to a time
> within a Rule that is before the first transition mentioned for that rule,
> then the _oldest_standard_time_ "Letter/s" is used.  In this case, AMT.
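
The fallback as described above could be sketched like this: among the rules with a SAVE of zero, take the one with the earliest FROM year and use its LETTER/S. This only restates the description in this thread and is not taken from the zic source; the function name and the rule tuples are assumptions loosely modeled on the Neth example.

```python
def letter_before_first_transition(rules):
    """rules: iterable of (from_year, save, letter) tuples.

    Returns the LETTER/S of the oldest standard-time rule (SAVE of zero),
    or None if the rule set has no standard-time entry.
    """
    standard = [r for r in rules if r[1] in ("0", "0:00")]
    if not standard:
        return None
    # min() keeps the first entry when several share the earliest year.
    return min(standard, key=lambda r: r[0])[2]


# Illustrative values only; the real Neth rules are in the europe file.
neth_like = [
    (1916, "1:00", "NST"),
    (1916, "0", "AMT"),
    (1917, "1:00", "NST"),
]
```

Substituting the result into the Zone's FORMAT of `%s` gives AMT for the years before 1916, matching the answer above.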
>
>
>
> > BTW, the documentation was at first a bit confusing to me, since it says that
> > fields are delimited by spaces, and lists a single Zone UNTIL field.
> > However, if you look carefully at the documentation, there are really 4
> > fields:
> >
> > UNTIL_YEAR UNTIL_IN UNTIL_ON UNTIL_AT
> >
> > which are optional [but only in "truncation" from the end: that is, it
> > corresponds to the (Perl) regex (UNTIL_YEAR (UNTIL_IN (UNTIL_ON
> > (UNTIL_AT)?)?)?)?].
> >
> > I'm not the only one to have initially made this mistake: the proposed XML
> > format for the TZ database makes the same mistake.
>
> Confusing: granted.  Whether "Until" is one or multiple fields is a
> matter of interpretation.  The _traditional_ understanding is that it
> is a *single* "timestamp field" which may happen to have spaces within
> it.  BTW the subfields aren't "YEAR IN ON AT", but "YEAR MONTH DAY TIME".
>
> In this regard, a recent addition to the tzcode tarball is zoneinfo2tdf.pl,
> which translates the more free-with-spaces zone tzdata into a form which
> strictly uses a single tab between fields.  This may make life easier
> for some by simplifying their parser's requirements.  (Or not.)
>
>                 --Ken Pizzini
>