[tz] Extra transition for Europe/London with 2023d
Brooks Harris
brooks at edlmax.com
Tue Jan 2 21:53:32 UTC 2024
On 1/2/2024 2:29 PM, Derick Rethans via tz wrote:
> On Tue, 2 Jan 2024, brian.inglis--- via tz wrote:
>
>> On 2024-01-02 04:29, Derick Rethans via tz wrote:
>>> Hi,
>>> I have just updated the tzdb for PHP, and one of our tests started
>>> failing, and it turned out due to an unexpected data change:
>>> Previously, the following transitions existed:
>>> …
>>> 1994-03-27 01:00:00 UT ( 764730000) = 1 [ 3600 1 4 'BST' (0,0)]
>>> 1994-10-23 01:00:00 UT ( 782874000) = 2 [ 0 0 8 'GMT' (0,0)]
>>> 1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
>>> 1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
>>> 1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]
>>>
>>> POSIX string: GMT0BST,M3.5.0/1,M10.5.0
>>> std: 2 [ 0 0 8 'GMT' (0,0)]
>>> dst: 1 [ 3600 1 4 'BST' (0,0)]
>>>
>>> But now, they include an extra one for Jan 1st, 1996, with the March 31st
>>> one now not being the last one:
>>> …
>>> 1994-03-27 01:00:00 UT ( 764730000) = 1 [ 3600 1 4 'BST' (0,0)]
>>> 1994-10-23 01:00:00 UT ( 782874000) = 2 [ 0 0 8 'GMT' (0,0)]
>>> 1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
>>> 1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
>>> 1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]
>>>
>>> POSIX string: GMT0BST,M3.5.0/1,M10.5.0
>>> std: 2 [ 0 0 8 'GMT' (0,0)]
>>> dst: 1 [ 3600 1 4 'BST' (0,0)]
>>> I couldn't find anywhere in tzfile.5 or theory.html whether the last
>>> generated transition must match a transition as specified with the POSIX
>>> string (as it did with 2023c and earlier), but I vaguely remember having
>>> read such a thing when I implemented the POSIX string parsing logic.
>>> As far as I know so-far, the only effect it has on PHP users is
>>> that they will now see an extra transition when they enumerate them (the
>>> 1996-01-01 is inserted).
>>> I think I am mostly flagging this up because this was an unexpected change.
>> Check your installed data or paths and conversion code!
> I am not using any installed data, and both of these were created by zic, which
> is what I would consider the reference implementation.
>
>> There was a leap second at that time, and regularly during the 1990s, so you
>> seem to be using right/Europe/London:
> No, I am not.
>
> The new rule for 1996-01-01 says:
>
>>> 1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]
> The first "2" is the "tzh_typecount" value. It is 2, just like in the previous
> entry for 1995-10-22 01:00:00 UT:
>
>>> 1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
> You can also see that the offset stays "0" for both, and both have the
> abbreviation "GMT".
>
> In the 2023c data file, that entry correctly has typecount 1 (for 1996-03-31):
>
>>> 1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]
> It is distinctly a change in data as output by zic, as my diff of the created
> binary also show:
>
> The change from (0x31, 0x5D, 0xD9, 0x10) 828234000 to (0x30, 0xE7, 0x24, 0x00)
> 820454400 is exactly as what my tool
> (https://github.com/derickr/timelib/blob/master/docs/show-tzinfo.c) shows, the
> change from 1996-03-31 to 1996-01-01:
>
> -0x00, 0x00, 0x00, 0x31, 0x5D, 0xD9, 0x10, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02,
> +0x00, 0x00, 0x00, 0x30, 0xE7, 0x24, 0x00, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02,
>
> …
>
> And the other change changes the associated typecount from 0x01 to 0x02:
>
> -0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0xFF, 0xFF, 0xFF, 0xB5, 0x00, 0x00, 0x00, 0x00, 0x0E, 0x10,
> +0x02, 0x01, 0x02, 0x01, 0x02, 0x02, 0xFF, 0xFF, 0xFF, 0xB5, 0x00, 0x00, 0x00, 0x00, 0x0E, 0x10,
>
>
> No where comes the 'right''s leap second into play. If I turn these on by
> setting -L leapseconds when calling `zic`, then the output changes to the
> following, as expected (well, except for 1996-01-01 vs 1996-03-31):
>
> 1994-10-23 01:00:19 UT ( 782874019) = 2 [ 0 0 8 'GMT' (0,0)]
> 1995-03-26 01:00:19 UT ( 796179619) = 1 [ 3600 1 4 'BST' (0,0)]
> 1995-10-22 01:00:19 UT ( 814323619) = 2 [ 0 0 8 'GMT' (0,0)]
> 1996-01-01 00:00:20 UT ( 820454420) = 2 [ 0 0 8 'GMT' (0,0)]
> …
> 1994-07-01 00:00:18 UT ( 773020818) = 19
> 1996-01-01 00:00:19 UT ( 820454419) = 20
> 1997-07-01 00:00:20 UT ( 867715220) = 21
> …
>
> <snip>
>
>> but there is no change visible with zdump on default or POSIX 2023d:
>>
>> $ zdump -Vc1994,1998 Europe/London
>> zdump -Vc1997,2025 -Vc1994,1998 Europe/London
> zdump apparently doesn't show this behaviour.
>
> I'm fairly certain that the output of zic itself changed. If I replace the
> "zic.c" from 2023d with 2023c (and associated source files), the data that my
> tool shows indeed reverts back to the expected:
>
> …
> 1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
> 1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
> 1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]
>
>
> cheers,
> Derick
>
In my (as yet unpublished) work I've discovered that zic.c omits some
transitions from its output.
The Europe/London 1996 case is but one example.
# Zone  NAME           STDOFF    RULES    FORMAT   [UNTIL]
Zone    Europe/London  -0:01:15  -        LMT      1847 Dec 1 0:00s
                       0:00      GB-Eire  %s       1968 Oct 27
                       1:00      -        BST      1971 Oct 31 2:00u
                       0:00      GB-Eire  %s       1996
                       0:00      EU       GMT/BST
There is certainly a transition at 1996-01-01 00:00:00, where RULES
changes from GB-Eire to EU and FORMAT changes from %s to GMT/BST:
                       0:00      GB-Eire  %s       1996
                       0:00      EU       GMT/BST
zic.c misses this transition, and that matters to the work I'm doing
because it's necessary to look up the *previous* transition in some
circumstances, and if this transition is missing, the returned previous
transition is incorrect.
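
To make the lookup problem concrete, here is a minimal sketch (not code from
zic.c or from my own work) of a "previous transition" search over the
transition instants quoted in this thread. The list names and the function
name are illustrative assumptions only.

```python
from bisect import bisect_right

# Transition instants (seconds since the epoch) and abbreviations, copied from
# the dumps in this thread. The list and function names are illustrative only.
WITHOUT_1996 = [(814323600, "GMT"), (828234000, "BST")]
WITH_1996 = [(814323600, "GMT"), (820454400, "GMT"), (828234000, "BST")]

def previous_transition(transitions, t):
    """Return the last (at, abbr) entry with at <= t, or None if there is none."""
    ats = [at for at, _ in transitions]
    i = bisect_right(ats, t)
    return transitions[i - 1] if i > 0 else None

t = 825000000  # an instant in February 1996
print(previous_transition(WITHOUT_1996, t))  # falls back to 1995-10-22
print(previous_transition(WITH_1996, t))     # finds the 1996-01-01 boundary
```

Both lists give the same offset and abbreviation for t, but only the second
one reports the start of the EU era as the previous transition.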
I've found it necessary to refine zic.c, in particular portions of
outzone() and writezone(), including the code block commented as "**
Optimize", so that this transition (and others like it) is included
in the resulting TZif files.
With these changes, my adapted version of zdump produces:
[157]
814323599 1995-10-22 01:59:59 isdst 1 gmtoff 3600 stdoff 0 BST
814323600 1995-10-22 01:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
[158]
820454399 1995-12-31 23:59:59 isdst 0 gmtoff 0 stdoff 0 GMT
820454400 1996-01-01 00:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
[159]
828233999 1996-03-31 00:59:59 isdst 0 gmtoff 0 stdoff 0 GMT
828234000 1996-03-31 02:00:00 isdst 1 gmtoff 3600 stdoff 0 BST
Note this 1996 transition does not produce a discontinuity in the YMDhms
sequence, which rolls over normally: 1995-12-31 23:59:59 to 1996-01-01
00:00:00. But its "at" time (820454400) and metadata are important.
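
As a quick check of that continuity, the following sketch formats the two
instants on either side of the boundary at a fixed offset of 0 seconds, the
gmtoff shown in the dump. The helper name local_ymdhms is hypothetical, and
the code deliberately ignores leap seconds (POSIX-style counting).

```python
from datetime import datetime, timedelta, timezone

def local_ymdhms(t, gmtoff):
    """Format instant t (seconds since the epoch) at a fixed UT offset in seconds."""
    utc = datetime.fromtimestamp(t, tz=timezone.utc)
    return (utc + timedelta(seconds=gmtoff)).strftime("%Y-%m-%d %H:%M:%S")

AT = 820454400  # the "at" time of the 1996 era boundary
before = local_ymdhms(AT - 1, 0)
after = local_ymdhms(AT, 0)
print(before, "->", after)
```

The local clock simply ticks over the year boundary; only the metadata
attached to the instant changes.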
There are many examples of this throughout the TzDb source files,
wherever a time zone "era" [UNTIL] is specified with only a year,
like this London example, 0:00 GB-Eire %s 1996.
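
One rough way to find such eras is to scan Zone lines for an [UNTIL] field
that consists of a bare year. The sketch below does this for the London
entry quoted earlier; it is a simplified whitespace split, not the zic
parser, and the function name is an assumption. The conversion to an "at"
time at the end is only valid here because the offset in effect is 0:00.

```python
import calendar

LONDON = """\
Zone Europe/London -0:01:15 - LMT 1847 Dec 1 0:00s
0:00 GB-Eire %s 1968 Oct 27
1:00 - BST 1971 Oct 31 2:00u
0:00 GB-Eire %s 1996
0:00 EU GMT/BST
"""

def year_only_untils(zone_text):
    """Return (RULES, FORMAT, year) for each era whose UNTIL is just a year."""
    found = []
    for line in zone_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on blanks
        if not fields:
            continue
        if fields[0] == "Zone":
            fields = fields[2:]  # skip the "Zone" keyword and the zone NAME
        until = fields[3:]       # whatever follows STDOFF, RULES, FORMAT
        if len(until) == 1 and until[0].isdigit():
            found.append((fields[1], fields[2], int(until[0])))
    return found

eras = year_only_untils(LONDON)
print(eras)

# With a 0:00 offset in effect, local midnight on January 1 equals UT midnight.
at = calendar.timegm((eras[0][2], 1, 1, 0, 0, 0))
print(at)
```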
Now in this special case of London in winter time that time-point falls
immediately after a leap-second. So the YMDhms sequence must go:
1995-12-31 23:59:59
1995-12-31 23:59:60 << leap-second
1996-01-01 00:00:00
In this case London is at the same STDOFF as UTC (0:00) so this sequence
is true for both methods of introducing leap-seconds; a) local
simultaneous with UTC (like tzdb "right") and b) "rolling leap-seconds",
both of which my work supports.
I suggest TzDb may want to have a look at this topic. I think that if these
improvements were made they would not alter the typical current behavior
of localtime(); the YMDhms representations and sequences would remain
the same. But including these transitions would be more complete and
more faithful to the underlying TzDb source data, and this is important
for some types of extended functionality I'm pursuing.
Thanks,
-Brooks
"Prediction is difficult, especially about the future."