[tz] Extra transition for Europe/London with 2023d

Derick Rethans derick at derickrethans.nl
Thu Jan 4 10:59:48 UTC 2024


On Wed, 3 Jan 2024, brian.inglis--- via tz wrote:

> On 2024-01-02 12:29, Derick Rethans wrote:
> > On Tue, 2 Jan 2024, brian.inglis--- via tz wrote:
> > 
> > > On 2024-01-02 04:29, Derick Rethans via tz wrote:
> > > > Hi,
> > > > I have just updated the tzdb for PHP, and one of our tests started
> > > > failing, and it turned out due to an unexpected data change:
> > > > Previously, the following transitions existed:

<snip>

> > > > I couldn't find anywhere in tzfile.5 or theory.html whether the last
> > > > generated transition must match a transition as specified with the
> > > > POSIX string (as it did with 2023c and earlier), but I vaguely
> > > > remember having read such a thing when I implemented the POSIX
> > > > string parsing logic. As far as I know so-far, the only effect it
> > > > has on PHP users is that they will now see an extra transition when
> > > > they enumerate them (the 1996-01-01 is inserted).
> > > >
> > > > I think I am mostly flagging this up because this was an unexpected
> > > > change.
> > >
> > > Check your installed data or paths and conversion code!
> > 
> > I am not using any installed data, and both of these were created by
> > zic, which is what I would consider the reference implementation.
> > 
> > > There was a leap second at that time, and regularly during the 1990s,
> > > so you seem to be using right/Europe/London:
> 
> > No, I am not.
> 
> Generated data rather than installed, and what selections, options, and
> parameters are you using to generate that data, including those for make
> and zic?
> 
> I am using tzcode 2023d zdump and zic, and tzdata 2023d make and zic
> parameters:
> 
> 	make DATAFORM=rearguard PACKRATDATA=backzone PACKRATLIST=zone.tab \
> 		VERSION_DEPS= tzdata.zi
>         mkdir -p zoneinfo/ zoneinfo/posix/ zoneinfo/right/
>         zic -b fat -d zoneinfo       -L /dev/null   tzdata.zi
>         zic -b fat -d zoneinfo/posix -L /dev/null   tzdata.zi
>         zic -b fat -d zoneinfo/right -L leapseconds tzdata.zi
> 
> Which data format version(s) are you reading and listing?

Just "-b slim", and none of the other environment vars (repro script below).
 
> > It is distinctly a change in data as output by zic, as my diff of the
> > created binary also show:
> > 
> > The change from (0x31, 0x5D, 0xD9, 0x10) 828234000 to (0x30, 0xE7, 0x24,
> > 0x00) 820454400 is exactly as what my tool
> > (https://github.com/derickr/timelib/blob/master/docs/show-tzinfo.c)
> > shows, the change from 1996-03-31 to 1996-01-01:
> >
> > -0x00, 0x00, 0x00, 0x31, 0x5D, 0xD9, 0x10, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02,
> > +0x00, 0x00, 0x00, 0x30, 0xE7, 0x24, 0x00, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02,
> > …
> > And the other change changes the associated typecount from 0x01 to 0x02:
> >
> > -0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0xFF, 0xFF, 0xFF, 0xB5, 0x00, 0x00, 0x00, 0x00, 0x0E, 0x10,
> > +0x02, 0x01, 0x02, 0x01, 0x02, 0x02, 0xFF, 0xFF, 0xFF, 0xB5, 0x00, 0x00, 0x00, 0x00, 0x0E, 0x10,
>
> > > but there is no change visible with zdump on default or POSIX 2023d:
> > > $ zdump -Vc1994,1998 Europe/London
> > > zdump -Vc1997,2025 -Vc1994,1998 Europe/London
> 
> > zdump apparently doesn't show this behaviour.
> >
> > I'm fairly certain that the output of zic itself changed. If I replace
> > the "zic.c" from 2023d with 2023c (and associated source files), the
> > data that my tool shows indeed reverts back to the expected:
> > …
> > 1995-03-26 01:00:00 UT (           796179600) =   1 [  3600 1   4 'BST' (0,0)]
> > 1995-10-22 01:00:00 UT (           814323600) =   2 [     0 0   8 'GMT' (0,0)]
> > 1996-03-31 01:00:00 UT (           828234000) =   1 [  3600 1   4 'BST' (0,0)]
> 
> It is possible with your make and zic selections, options, and parameters,
> that a bug generates unnecessary extra transition(s).
> 
> What generates your 2023.4 data file?

This is the repro set-up, with no non-default arguments and none of those
environment variables set. I use the 2023.4 data file, and the 2023c/2023d
code releases.

---- >8 ---------- tzdata-repro.sh ------------------------------------------

# Needs bash: the diff commands below use process substitution.
mkdir -p /tmp/tzdata-repro
cd /tmp/tzdata-repro

# Download and extract code (2023c and 2023d):
wget https://data.iana.org/time-zones/releases/tzcode2023c.tar.gz
mkdir code-2023c && cd code-2023c && tar xvzf ../tzcode2023c.tar.gz && cd ..

wget https://data.iana.org/time-zones/releases/tzcode2023d.tar.gz
mkdir code-2023d && cd code-2023d && tar xvzf ../tzcode2023d.tar.gz && cd ..

# Download and extract data (2023d):
wget https://data.iana.org/time-zones/releases/tzdata2023d.tar.gz
cd code-2023c && tar xvzf ../tzdata2023d.tar.gz && cd ..
cd code-2023d && tar xvzf ../tzdata2023d.tar.gz && cd ..

# Build code
cd code-2023c && make zic && cd ..
cd code-2023d && make zic && cd ..

# Create data files for Europe
mkdir -p data-files/2023c data-files/2023d
./code-2023c/zic code-2023c/europe -d data-files/2023c -b slim
./code-2023d/zic code-2023d/europe -d data-files/2023d -b slim

# Show difference
diff <(xxd -g 1 data-files/2023c/Europe/London) <(xxd -g 1 data-files/2023d/Europe/London)

# Result:
# 86c86
# < 00000550: 00 00 00 31 5d d9 10 02 01 02 01 02 01 02 01 02  ...1]...........
# ---
# > 00000550: 00 00 00 30 e7 24 00 02 01 02 01 02 01 02 01 02  ...0.$..........
# 96c96
# < 000005f0: 02 01 02 01 02 01 ff ff ff b5 00 00 00 00 0e 10  ................
# ---
# > 000005f0: 02 01 02 01 02 02 ff ff ff b5 00 00 00 00 0e 10  ................

# And if you have timelib's show-tzinfo:
diff -u <(~/dev/derickr-timelib/docs/show-tzinfo Europe/London `pwd`/data-files/2023c) <(~/dev/derickr-timelib/docs/show-tzinfo Europe/London `pwd`/data-files/2023d)

# Result:
# --- /dev/fd/63	2024-01-04 10:47:50.296038022 +0000
# +++ /dev/fd/62	2024-01-04 10:47:50.296038022 +0000
# @@ -171,7 +171,7 @@
#  1994-10-23 01:00:00 UT (           782874000) =   2 [     0 0   8 'GMT' (0,0)]
#  1995-03-26 01:00:00 UT (           796179600) =   1 [  3600 1   4 'BST' (0,0)]
#  1995-10-22 01:00:00 UT (           814323600) =   2 [     0 0   8 'GMT' (0,0)]
# -1996-03-31 01:00:00 UT (           828234000) =   1 [  3600 1   4 'BST' (0,0)]
# +1996-01-01 00:00:00 UT (           820454400) =   2 [     0 0   8 'GMT' (0,0)]
# 
#                                             POSIX string: GMT0BST,M3.5.0/1,M10.5.0
#                                             std:   2 [     0 0   8 'GMT' (0,0)]

---- >8 ---------------------------------------------------------------------

This shows the difference in output between zic-2023c and zic-2023d, both using
data-2023d. The difference is exactly as I described earlier: the transition
time changes from (0x31, 0x5D, 0xD9, 0x10) 828234000 to (0x30, 0xE7, 0x24, 0x00)
820454400, and the type for the last entry changes from 01 to 02.
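
For anyone who wants to check that arithmetic by hand, here is a minimal
sketch. It joins the four big-endian bytes into one integer and converts the
result to a UT date. The helper name be_bytes_to_int is made up for this
example, and the date calls assume GNU date (for "-d @SECONDS").

```shell
#!/usr/bin/env bash

# Hypothetical helper: join big-endian hex bytes into one decimal integer.
be_bytes_to_int() {
    local hex="" b
    for b in "$@"; do
        hex+="${b#0x}"          # drop the leading "0x" from each byte
    done
    printf '%d\n' "0x${hex}"
}

be_bytes_to_int 0x31 0x5D 0xD9 0x10    # prints 828234000
be_bytes_to_int 0x30 0xE7 0x24 0x00    # prints 820454400

# Assumes GNU date; other date implementations use different flags.
date -u -d @828234000 '+%Y-%m-%d %H:%M:%S UT'    # 1996-03-31 01:00:00 UT
date -u -d @820454400 '+%Y-%m-%d %H:%M:%S UT'    # 1996-01-01 00:00:00 UT
```

The four bytes are the low-order half of a 64-bit transition time, so the
leading zero bytes in the dump do not change the value.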

> It might be useful for zdump to support a rawer -d debug/dump format showing
> rawer data in useful radixes to diagnose these cases.

Yes, that is why I wrote show-tzinfo when I first wrote the tzdata reader for
PHP in 2005:
https://github.com/php/php-src/commit/4fb4cac65c735a9253d7b77f17468a5768a7de13#diff-acfda8fcf0da62c66aa9d348e344be9d41cfd5e75f5ead54bafc62069c282cedR143
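
For a quick look without a dedicated tool, a rough sketch follows. It assumes
a TZif version 2 or later file, where the footer is the POSIX TZ string
enclosed between two newlines at the very end of the file. The helper names
tzif_footer and tzif_version are made up for this example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the footer (POSIX TZ string) of a TZif v2+ file.
# The footer is the last newline-terminated line, so tail -n 1 is enough.
tzif_footer() {
    tail -n 1 -- "$1"
}

# Hypothetical helper: print the 4-byte magic "TZif" plus the version byte.
# Version 1 files carry a NUL byte there instead of an ASCII digit.
tzif_version() {
    head -c 5 -- "$1"
    echo
}
```

Running tzif_footer on data-files/2023d/Europe/London from the repro script
should print the same POSIX string that show-tzinfo reports. It does not
decode the transition table, so it is no substitute for a real dump tool.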

cheers,
Derick
-- 
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support
Host of PHP Internals News: https://phpinternals.news

mastodon: @derickr at phpc.social @xdebug at phpc.social
twitter: @derickr and @xdebug

