[tz] 2021e Asia/Dushanbe

Nick Deguillaume nickledeg at gmail.com
Fri May 13 20:51:06 UTC 2022


Thanks for the message.

Yes, you were correct in thinking that
*If the FORMAT field contains either a "%s" or a '/' then the RULES field
must contain a named rule.*
 was a trial rule of my own invention for my own parser implementation. I
saw a pattern in the historic data and was curious as to whether it applied
everywhere. I have now modified my checks so that all files pass.
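
A check of this kind might be sketched as follows. This is only an illustration, not the actual parser code: the function names are hypothetical, the sample field values are made up, and "named rule" is taken to mean a RULES value that does not start with an ASCII digit, "+" or "-". As discussed in the quoted reply below, a violation is at most worth a warning, never a fatal error.

```python
from typing import Optional


def is_named_rule(rules: str) -> bool:
    """Return True if RULES looks like a rule name, not "-" or a fixed offset."""
    # Assumption: a rule name never starts with an ASCII digit, "+" or "-".
    return bool(rules) and rules[0] not in "0123456789+-"


def trial_rule_warning(rules: str, fmt: str) -> Optional[str]:
    """Return a warning if FORMAT has "%s" or "/" but RULES is not a named rule."""
    if ("%s" in fmt or "/" in fmt) and not is_named_rule(rules):
        return f"FORMAT {fmt!r} contains '%s' or '/' but RULES {rules!r} is not a named rule"
    return None


# Illustrative values only: a constant 1:00 in RULES combined with a '/' FORMAT.
print(trial_rule_warning("1:00", "+05/+06"))
```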

To be clear, I am not suggesting that the zic compiler mishandles any old
data files. Neither am I suggesting that there are any errors in the zic
documentation.
When I referred to the data being at slight variance with the documentation,
the documentation I meant was:
https://data.iana.org/time-zones/tz-link.html and
https://data.iana.org/time-zones/tz-how-to.html
I now recognise that I would have been better off using the zic
documentation as my primary source.

Nonetheless, here are a few things I have found:

1. tz-link.html states that:
*Sources for the tz database are UTF-8 text files... *
Some comments in some of the old files contain non-UTF-8 single-byte
representations of accented letters. Since these occur only in comments,
they do not affect the parsed data.

2. tz-how-to.html states that:
*Prior to the 2020b release, it was called the TYPE field, though it was
never used in the main data ...*
However, some of the old data in https://data.iana.org/time-zones/releases/
contains "even" and "odd" to account for the Adelaide festival. (I got
round this by excluding the versions of Australia/Adelaide that exhibit
"even" and "odd".)

3. tz-how-to.html states that:

*The FORMAT column specifies the usual abbreviation of the time zone name.
It can have one of three forms: a string of three or more characters that
are either ASCII alphanumerics, “+”, or “-”, in which case that’s the
abbreviation ...*
I had to allow an underscore and a space for all the files to pass. In
the case of St. Helena I also had to allow a '?' as the first character.
Further, I had to allow an abbreviation in a '/'-separated format to be
only two characters. (I recognise that this is not technically in violation
of the statement above.)
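
The relaxations described above can be sketched as a strict and a lenient validator. This is only an illustration with a hypothetical function name. It covers the plain and '/'-separated forms but not "%s", and in strict mode it applies the quoted three-character rule to each half of a '/' pair, which is an assumption rather than something the documentation states.

```python
import re


def format_ok(fmt: str, lenient: bool = False) -> bool:
    """Check a FORMAT value in plain or '/'-separated form ("%s" is not handled)."""
    parts = fmt.split("/")
    if len(parts) > 2:
        return False
    # Strict: ASCII alphanumerics, "+" and "-". Lenient: also "_" and space.
    allowed = "A-Za-z0-9+\\-" + ("_ " if lenient else "")
    # Lenient mode accepts two-character halves in the '/' form.
    min_len = 2 if lenient and len(parts) == 2 else 3
    # Lenient mode accepts a single leading "?".
    prefix = r"\??" if lenient else ""
    pattern = re.compile(rf"{prefix}[{allowed}]{{{min_len},}}")
    return all(pattern.fullmatch(part) is not None for part in parts)


# Made-up abbreviations, chosen only to exercise each relaxation.
for fmt in ["LMT", "+05/+06", "A_B", "?ABC", "AB/CD"]:
    print(fmt, format_ok(fmt), format_ok(fmt, lenient=True))
```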

4. I can see that some of the older files use a '?' where the more modern
files use '%s'. This is not mentioned in the tz-how-to.html documentation,
although I recognise that putting such obscurities in the document may not
be a good idea.

As you can see these are all very minor things. I appreciate your quick
responses.

Regards

Nick

On Fri, 13 May 2022 at 20:20, Paul Eggert <eggert at cs.ucla.edu> wrote:

> On 5/13/22 09:35, Tim Parenti via tz wrote:
>
> > I'm not sure where your "must contain a named rule" quote is coming from
>
> I imagine this was a style rule of Nick's invention. Violating the rule
> might issue a warning but it shouldn't be a fatal error, as the 'asia'
> file was correct as-is.
>
> > in this case the "rule" is not named, *per se*, but is rather a constant
> > 1:00.  The relevant thing is that the RULES field is not "-".
>
> Or more precisely it's that the RULES column is neither "-", nor a
> suffixless zero offset, nor an offset with an "s" suffix. We don't use
> any of these more-obscure features in TZDB data but they're in the .zi
> format.
>
> Since TZDB consistently avoids '/' in the many other places where this
> situation arises, it should avoid '/' here for stylistic consistency. So
> I installed the attached proposed patches. The 1st patch omits the '/'
> in question; the 2nd documents that STDOFF columns don't have suffixes
> (this wasn't clear in the man page, and I discovered this while looking
> into the 3rd patch), and the 3rd adds a style check for this.
>
> None of these patches affect the TZif output files.