[tz] Extra transition for Europe/London with 2023d

Brooks Harris brooks at edlmax.com
Tue Jan 2 21:53:32 UTC 2024


On 1/2/2024 2:29 PM, Derick Rethans via tz wrote:
> On Tue, 2 Jan 2024, brian.inglis--- via tz wrote:
>
>> On 2024-01-02 04:29, Derick Rethans via tz wrote:
>>> Hi,
>>> I have just updated the tzdb for PHP, and one of our tests started
>>> failing, and it turned out due to an unexpected data change:
>>> Previously, the following transitions existed:
>>> 1994-03-27 01:00:00 UT (           764730000) =   1 [  3600 1   4 'BST' (0,0)]
>>> 1994-10-23 01:00:00 UT (           782874000) =   2 [     0 0   8 'GMT' (0,0)]
>>> 1995-03-26 01:00:00 UT (           796179600) =   1 [  3600 1   4 'BST' (0,0)]
>>> 1995-10-22 01:00:00 UT (           814323600) =   2 [     0 0   8 'GMT' (0,0)]
>>> 1996-03-31 01:00:00 UT (           828234000) =   1 [  3600 1   4 'BST' (0,0)]
>>>
>>>                                              POSIX string: GMT0BST,M3.5.0/1,M10.5.0
>>>                                              std:   2 [     0 0   8 'GMT' (0,0)]
>>>                                              dst:   1 [  3600 1   4 'BST' (0,0)]
>>>
>>> But now, they include an extra one for Jan 1st, 1996, with the March 31st
>>> one now not being the last one:
>>> 1994-03-27 01:00:00 UT (           764730000) =   1 [  3600 1   4 'BST' (0,0)]
>>> 1994-10-23 01:00:00 UT (           782874000) =   2 [     0 0   8 'GMT' (0,0)]
>>> 1995-03-26 01:00:00 UT (           796179600) =   1 [  3600 1   4 'BST' (0,0)]
>>> 1995-10-22 01:00:00 UT (           814323600) =   2 [     0 0   8 'GMT' (0,0)]
>>> 1996-01-01 00:00:00 UT (           820454400) =   2 [     0 0   8 'GMT' (0,0)]
>>>
>>>                                              POSIX string: GMT0BST,M3.5.0/1,M10.5.0
>>>                                              std:   2 [     0 0   8 'GMT' (0,0)]
>>>                                              dst:   1 [  3600 1   4 'BST' (0,0)]
>>> I couldn't find anywhere in tzfile.5 or theory.html whether the last
>>> generated transition must match a transition as specified with the POSIX
>>> string (as it did with 2023c and earlier), but I vaguely remember having
>>> read such a thing when I implemented the POSIX string parsing logic.
>>> As far as I know so-far, the only effect it has on PHP users is
>>> that they will now see an extra transition when they enumerate them (the
>>> 1996-01-01 is inserted).
>>> I think I am mostly flagging this up because this was an unexpected change.
>> Check your installed data or paths and conversion code!
> I am not using any installed data, and both of these were created by zic, which
> is what I would consider the reference implementation.
>
>> There was a leap second at that time, and regularly during the 1990s, so you
>> seem to be using right/Europe/London:
> No, I am not.
>
> The new rule for 1996-01-01 says:
>
>>> 1996-01-01 00:00:00 UT (           820454400) =   2 [     0 0   8 'GMT' (0,0)]
> The first "2" is the "tzh_typecount" value. It is 2, just like in the previous
> entry for 1995-10-22 01:00:00 UT:
>
>>> 1995-10-22 01:00:00 UT (           814323600) =   2 [     0 0   8 'GMT' (0,0)]
> You can also see that the offset stays "0" for both, and both have the
> abbreviation "GMT".
>
> In the 2023c data file, that entry correctly has typecount 1 (for 1996-03-31):
>
>>> 1996-03-31 01:00:00 UT (           828234000) =   1 [  3600 1   4 'BST' (0,0)]
> It is distinctly a change in data as output by zic, as my diff of the created
> binary also show:
>
> The change from (0x31, 0x5D, 0xD9, 0x10) 828234000 to (0x30, 0xE7, 0x24, 0x00)
> 820454400 is exactly as what my tool
> (https://github.com/derickr/timelib/blob/master/docs/show-tzinfo.c) shows, the
> change from 1996-03-31 to 1996-01-01:
>
> -0x00, 0x00, 0x00, 0x31, 0x5D, 0xD9, 0x10, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02,
> +0x00, 0x00, 0x00, 0x30, 0xE7, 0x24, 0x00, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02,
>
>
> And the other change changes the associated typecount from 0x01 to 0x02:
>
> -0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0xFF, 0xFF, 0xFF, 0xB5, 0x00, 0x00, 0x00, 0x00, 0x0E, 0x10,
> +0x02, 0x01, 0x02, 0x01, 0x02, 0x02, 0xFF, 0xFF, 0xFF, 0xB5, 0x00, 0x00, 0x00, 0x00, 0x0E, 0x10,
>
>
> No where comes the 'right''s leap second into play. If I turn these on by
> setting -L leapseconds when calling `zic`, then the output changes to the
> following, as expected (well, except for 1996-01-01 vs 1996-03-31):
>
> 1994-10-23 01:00:19 UT (           782874019) =   2 [     0 0   8 'GMT' (0,0)]
> 1995-03-26 01:00:19 UT (           796179619) =   1 [  3600 1   4 'BST' (0,0)]
> 1995-10-22 01:00:19 UT (           814323619) =   2 [     0 0   8 'GMT' (0,0)]
> 1996-01-01 00:00:20 UT (           820454420) =   2 [     0 0   8 'GMT' (0,0)]
> 1994-07-01 00:00:18 UT (           773020818) = 19
> 1996-01-01 00:00:19 UT (           820454419) = 20
> 1997-07-01 00:00:20 UT (           867715220) = 21
>
> <snip>
>
>> but there is no change visible with zdump on default or POSIX 2023d:
>>
>> $ zdump -Vc1994,1998 Europe/London
>> zdump -Vc1997,2025 -Vc1994,1998 Europe/London
> zdump apparently doesn't show this behaviour.
>
> I'm fairly certain that the output of zic itself changed. If I replace the
> "zic.c" from 2023d with 2023c (and associated source files), the data that my
> tool shows indeed reverts back to the expected:
>
> 1995-03-26 01:00:00 UT (           796179600) =   1 [  3600 1   4 'BST' (0,0)]
> 1995-10-22 01:00:00 UT (           814323600) =   2 [     0 0   8 'GMT' (0,0)]
> 1996-03-31 01:00:00 UT (           828234000) =   1 [  3600 1   4 'BST' (0,0)]
>
>
> cheers,
> Derick
>
In my (as yet unpublished) work I've found that zic.c omits some
transitions from its output.
Europe/London at 1996 is but one example.

# Zone    NAME        STDOFF    RULES FORMAT    [UNTIL]
Zone    Europe/London    -0:01:15 -    LMT    1847 Dec  1  0:00s
              0:00    GB-Eire    %s    1968 Oct 27
              1:00    -    BST    1971 Oct 31  2:00u
              0:00    GB-Eire    %s    1996
              0:00    EU    GMT/BST

There is certainly a transition at 1996-01-01 00:00:00, from RULES 
GB-Eire to EU and FORMAT from %s to GMT/BST:
             0:00    GB-Eire    %s    1996
             0:00    EU    GMT/BST

zic.c misses this transition, which matters to the work I'm doing: in
some circumstances it's necessary to look up the *previous* transition,
and if this transition is missing, the previous transition returned is
incorrect.
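
A minimal Python sketch of that lookup problem follows. The transition
times are copied from the dumps quoted above; the helper name
previous_transition and the query date (1996-02-01) are assumptions made
for illustration only, and none of this is code from zic.c or zdump.

```python
from bisect import bisect_right
from datetime import datetime, timezone

# Transition times (seconds since the epoch), copied from the dumps in this thread.
WITHOUT_1996 = [814323600, 828234000]             # 1995-10-22, 1996-03-31
WITH_1996 = [814323600, 820454400, 828234000]     # also contains 1996-01-01

def previous_transition(transitions, t):
    """Return the latest transition at or before t, or None if there is none."""
    i = bisect_right(transitions, t)
    return transitions[i - 1] if i else None

# Hypothetical query time: 1996-02-01 00:00:00 UTC.
query = int(datetime(1996, 2, 1, tzinfo=timezone.utc).timestamp())

print(previous_transition(WITHOUT_1996, query))  # 814323600 (1995-10-22)
print(previous_transition(WITH_1996, query))     # 820454400 (1996-01-01)
```

With the 1996-01-01 entry absent, the same query is attributed to the
1995-10-22 transition, which belongs to the earlier GB-Eire era.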

I've found it necessary to refine zic.c, in particular portions of
outzone() and writezone(), including the code block commented as "**
Optimize", so that this transition (and others like it) is included
in the resulting TZif files.

With this change, my adapted version of zdump produces:
[157]
   814323599 1995-10-22 01:59:59 isdst 1 gmtoff   3600 stdoff 0 BST
   814323600 1995-10-22 01:00:00 isdst 0 gmtoff      0 stdoff 0 GMT
[158]
   820454399 1995-12-31 23:59:59 isdst 0 gmtoff      0 stdoff 0 GMT
   820454400 1996-01-01 00:00:00 isdst 0 gmtoff      0 stdoff 0 GMT
[159]
   828233999 1996-03-31 00:59:59 isdst 0 gmtoff      0 stdoff 0 GMT
   828234000 1996-03-31 02:00:00 isdst 1 gmtoff   3600 stdoff 0 BST

Note this 1996 transition does not produce a discontinuity in the YMDhms 
sequence, which rolls over normally: 1995-12-31 23:59:59 to 1996-01-01 
00:00:00. But its "at" time (820454400) and metadata are important.
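
As a quick cross-check of that "at" value, the sketch below (plain
Python, standard library only; the helper name utc_string is an
assumption) converts the two timestamps around the boundary to UTC and
decodes the four big-endian bytes from the quoted binary diff. These are
POSIX timestamps, so no leap seconds are counted.

```python
from datetime import datetime, timezone

def utc_string(ts):
    """Format a POSIX timestamp as YYYY-MM-DD hh:mm:ss in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# The second before the transition and the transition's "at" time itself.
for ts in (820454399, 820454400):
    print(ts, utc_string(ts))

# The four big-endian bytes from the quoted binary diff decode to the same value.
at_bytes = bytes([0x30, 0xE7, 0x24, 0x00])
print(int.from_bytes(at_bytes, "big"))
```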

There are many examples of this throughout the TzDb source data,
wherever a time zone "era" [UNTIL] is given as a year only, as in this
London example: 0:00    GB-Eire    %s    1996.
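
One rough way to find such lines is sketched below in Python. It assumes
the input contains only Zone lines and their continuation lines (no Rule
or Link lines), and that the UNTIL field is whatever follows STDOFF,
RULES and FORMAT; the function name year_only_untils is hypothetical and
this is not how zic.c itself parses the data.

```python
# Zone lines for Europe/London, copied from this message.
ZONE_TEXT = """\
Zone    Europe/London    -0:01:15 -    LMT    1847 Dec  1  0:00s
              0:00    GB-Eire    %s    1968 Oct 27
              1:00    -    BST    1971 Oct 31  2:00u
              0:00    GB-Eire    %s    1996
              0:00    EU    GMT/BST
"""

def year_only_untils(text):
    """Return the UNTIL years of zone lines whose UNTIL field is a bare year."""
    years = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()   # strip comments, split on whitespace
        if not fields:
            continue
        if fields[0] == "Zone":
            fields = fields[2:]                  # drop "Zone" and NAME
        until = fields[3:]                       # what follows STDOFF, RULES, FORMAT
        if len(until) == 1:                      # a year with no month, day or time
            years.append(int(until[0]))
    return years

print(year_only_untils(ZONE_TEXT))  # [1996]
```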

Now, in this special case of London in winter time, that time-point falls
immediately after a leap-second, so the YMDhms sequence must go:
1995-12-31 23:59:59
1995-12-31 23:59:60  << leap-second
1996-01-01 00:00:00
In this case London is at the same STDOFF as UTC (0:00), so this sequence
holds for both methods of introducing leap-seconds: a) local
simultaneous with UTC (like tzdb "right") and b) "rolling leap-seconds",
both of which my work supports.

I suggest TzDb may want to have a look at this topic.  I think that if
these improvements were made, they would not alter the typical current
behavior of localtime(); the YMDhms representations and sequences would
remain the same. But adding these transitions would make the output more
complete and more faithful to the underlying TzDb source data, and this
is important for some types of extended functionality I'm pursuing.

Thanks,
-Brooks

"Prediction is difficult, especially about the future."



