[tz] Proposal: validation text file with releases

Howard Hinnant howard.hinnant at gmail.com
Sat Jul 18 22:37:01 UTC 2015


On Jul 18, 2015, at 6:16 PM, Jon Skeet <skeet at pobox.com> wrote:
> 
> On 18 July 2015 at 23:01, Howard Hinnant <howard.hinnant at gmail.com> wrote:
>> On Jul 18, 2015, at 3:40 PM, Jon Skeet <skeet at pobox.com> wrote:
>> >
>> > Next update: I've improved the zdump-based generation of the data, and put the data in the current format for all the tz data releases I can find (from 1996 onwards) at http://nodatime.org/tzvalidate/
>> 
>> I’ve generated a version of tzdata2015e-tzvalidate.txt.zip from my code here:
>> 
>> http://howardhinnant.github.io/tzdata2015e-tzvalidate.txt.zip
>> 
> I saw your earlier message and hoped you were reading this thread too. Supporting code such as yours is precisely the motivation for this endeavour.

<nod> The exercise found some bugs in my code already. :-)

https://github.com/HowardHinnant/date/commit/a431164fcd79a43ba1f70c9ba70659a662e29875

>  
>> There appear to be two kinds of differences:
>> 
>> 1.  I appear to start earlier than you. For example, I have:
>> 
>> Africa/Algiers
>> 1891-03-14T23:48:48Z +00:09:21 standard PMT
>> 
>> and you do not.
>> 
> That much is simple to explain - the format I'm currently generating explicitly starts in 1905 and ends in 2035. The 1905 start is there because an earlier version of zdump I was using was limited to 1900.
> As per Paul's messages earlier in the thread, eventually we'll want to expose more data - although it's not clear how late it's worth going. (I doubt that it's worth extending beyond 2100 for example.)

Fwiw, I started at 1834-01-01T00:00:00 UTC.  I doubt going beyond 2038 would be worthwhile.
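
Until the two ranges agree, the files can still be compared by clipping both to a shared window first. A minimal sketch in Python, assuming the line layout shown in the examples in this message (instant, offset, standard/daylight, abbreviation). The helper names and the exact window boundaries (1905-01-01 up to 2035-01-01) are my assumptions, not something either generator defines:

```python
from datetime import datetime, timezone

# Assumed comparison window; the exact boundary instants are a guess.
WINDOW_START = datetime(1905, 1, 1, tzinfo=timezone.utc)
WINDOW_END = datetime(2035, 1, 1, tzinfo=timezone.utc)


def parse_transition(line):
    """Split 'instant offset standard|daylight abbreviation' into a tuple."""
    instant_text, offset, kind, abbreviation = line.split()
    instant = datetime.strptime(instant_text, "%Y-%m-%dT%H:%M:%SZ")
    return instant.replace(tzinfo=timezone.utc), offset, kind, abbreviation


def clip_to_window(lines, start=WINDOW_START, end=WINDOW_END):
    """Keep only the transition lines whose instant lies in [start, end)."""
    return [line for line in lines if start <= parse_transition(line)[0] < end]


algiers_line = "1891-03-14T23:48:48Z +00:09:21 standard PMT"
barbados_line = "1924-01-01T03:58:29Z -03:58:29 standard BMT"
kept = clip_to_window([algiers_line, barbados_line])
```

With this window the 1891 Africa/Algiers line is dropped before the diff, so only differences inside the shared range remain.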

>  
>> 2.  This one has me more concerned:  When a zone specifies a rule/date combination and the date falls off the beginning of the rule table, I assume an empty (“”) variable part, where you appear to assume an “S” variable part.  For example, I have:
>> 
>> America/Barbados
>> 1924-01-01T03:58:29Z -03:58:29 standard BMT
>> 1932-01-01T03:58:29Z -04:00:00 standard AT
>> 1977-06-12T06:00:00Z -03:00:00 daylight ADT
>> 
>> And you have:
>> 
>> America/Barbados
>> 1924-01-01T03:58:29Z -03:58:29 standard BMT
>> 1932-01-01T03:58:29Z -04:00:00 standard AST
>> 1977-06-12T06:00:00Z -03:00:00 daylight ADT
>> 
> Just to be clear, this isn't "me" so much as "zic and then zdump". It happens that Noda Time (which is more "my" code) does the same thing though :)

Ok.

>  
>> The America/Barbados Zone switches to the Barb Rule on 1932-01-01T03:58:29Z, using the format A%sT.  But the first Barb Rule is 1977-06-12 2:00.  I looked for documentation for what is supposed to happen in a situation like this, but didn’t find anything.
>> 
> I think AST makes sense here (as it's standard time) but I agree that it's not clearly documented.
> 
> In Noda Time, if I don't find a rule leading "into" the transition period, I take the name of the first rule with no daylight savings.
> See https://github.com/nodatime/nodatime/blob/20d57967e04f1b57a10c00910f337a1c3caf7522/src/NodaTime.TzdbCompiler/Tzdb/DateTimeZoneBuilder.cs#L127 for the code involved.
> 
> zic appears to implement equivalent behaviour, although I wouldn't like to pin down where.
> 
> I'd be interested in seeing whether your understanding of the data in natural language ties in with the comments expressed in DateTimeZoneBuilder at the link above, by the way.

I didn’t really have an understanding, and so guessed at {00:00, “”}.  I’m happy to implement whatever rule should be here (including the one you’ve described), and only ask that it be documented somewhere (besides the zic source code).  I checked the zic man page, but didn’t see it (I might have missed it).
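
To make the two candidate behaviours concrete, here is a minimal sketch in Python. It is not the zic, Noda Time, or date library code. The `Rule` type, the `initial_letter` and `abbreviation` helpers, and the policy names are hypothetical, and the `BARB` table is a simplified stand-in for the real rule table (only the 1977-06-12 daylight entry is taken from the discussion above; the second entry is illustrative):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    start: tuple        # (year, month, day) the rule first takes effect
    save_minutes: int   # 0 means standard time
    letter: str         # text substituted for %s in the zone format


def initial_letter(rules, policy="first_standard"):
    """Pick the letter to use before the first rule in the table applies."""
    if policy == "empty":
        # My original guess: no saving and an empty variable part.
        return ""
    # The behaviour Jon describes: the earliest rule with no daylight saving.
    for rule in sorted(rules, key=lambda r: r.start):
        if rule.save_minutes == 0:
            return rule.letter
    return ""


def abbreviation(zone_format, letter):
    """Expand a zone format such as 'A%sT' with the chosen letter."""
    return zone_format.replace("%s", letter)


# Simplified stand-in for the Barb rules; the second entry is illustrative.
BARB = [
    Rule((1977, 6, 12), 60, "D"),
    Rule((1977, 10, 2), 0, "S"),
]

guessed = abbreviation("A%sT", initial_letter(BARB, policy="empty"))
described = abbreviation("A%sT", initial_letter(BARB))
```

Under these assumptions `guessed` is "AT", matching my file, and `described` is "AST", matching the zic/zdump output for 1932.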

Fwiw, here is the program I used to generate my version of this validation file:

http://codepad.org/JethTWsl

Howard


