[tz] Issues with pre-1970 information in TZDB
Brooks Harris
brooks at edlmax.com
Thu Sep 23 18:28:01 UTC 2021
On 2021-09-23 12:47 PM, Paul Eggert wrote:
> On 9/22/21 11:31 AM, Brooks Harris via tz wrote:
>> Indeed the changes would have significant consequences to my current
>> (in development) tzdb parser which reads the source files directly
>> with no modifications to accumulate all time zones that have existed.
>
> If the goal is to accumulate all timezones that have ever existed in
> tzdb, the parser should read 'backzone', as 'backzone' has for some
> time been the repository for entries that were formerly Zones but are
> now Links in the default database.
>
> If the parser reads 'backzone', you shouldn't notice effects due to
> the recent alike-since-1970 changes. Otherwise the parser should be
> changed to read 'backzone', regardless of whether the recent
> alike-since-1970 changes are present.
>
> Even 'backzone' won't suffice to achieve the goal I mentioned, as I
> vaguely recall some deleted Zones never made it to backzone way back
> when. I don't recall the details, unfortunately. You can get most of
> the details by looking at the Git history, I expect, but it'd be a bit
> of a job. If you find out anything from that search, please let us
> know, as I expect these old deleted Zones should be put into
> 'backzone' though this is low priority.
>
>
Thanks Paul.
Yes, I'm reading backzone.
As mentioned, my approach is to first accumulate ALL the information
and to leave it to higher layers of the client to make choices for their
specific purposes. A particular example I'm concerned with is filtering
the tzdb data through CLDR windowsZones.xml. Other clients may make
other choices for other target OSs or applications.
I have a question. What are the criteria for qualifying a zone or
rule set as "pre-1970"?
For many time zones, the zone era or rule set that is in effect at the
start of 1970 itself begins before 1970. For example:
# Zone  NAME              STDOFF    RULES  FORMAT  [UNTIL]
Zone    America/New_York  -4:56:02  -      LMT     1883 Nov 18 12:03:58
                          -5:00     US     E%sT    1920
                          -5:00     NYC    E%sT    1942
                          -5:00     US     E%sT    1946
                          -5:00     NYC    E%sT    1967
                          -5:00     US     E%sT
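As a minimal sketch of how a parser might work out where each era of a
Zone entry like the one above begins, the Python below splits the entry
into eras and takes each era's start year from the previous era's UNTIL
field. The function names (parse_zone_eras, era_start_years) are
hypothetical, and the code assumes a single well-formed Zone entry in
which every line after the "Zone" line is a continuation line. It only
reports start years; it says nothing about what tzdb itself treats as
"pre-1970".

```python
# Illustrative sketch only: function names are hypothetical, and the input
# is assumed to be one well-formed Zone entry from a tzdb source file.
ZONE_TEXT = """\
Zone America/New_York -4:56:02 - LMT 1883 Nov 18 12:03:58
    -5:00 US E%sT 1920
    -5:00 NYC E%sT 1942
    -5:00 US E%sT 1946
    -5:00 NYC E%sT 1967
    -5:00 US E%sT
"""


def parse_zone_eras(text):
    """Return (zone_name, eras); each era holds STDOFF, RULES, FORMAT, UNTIL."""
    name = None
    eras = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if fields[0] == "Zone":
            name = fields[1]
            fields = fields[2:]  # the rest is laid out like a continuation line
        stdoff, rules, fmt = fields[:3]
        until = fields[3:]  # an empty list means the era is still in force
        eras.append(
            {"stdoff": stdoff, "rules": rules, "format": fmt, "until": until}
        )
    return name, eras


def era_start_years(eras):
    """Each era starts at the previous era's UNTIL year; the first has none."""
    starts = [None]
    for era in eras[:-1]:
        starts.append(int(era["until"][0]))
    return starts


name, eras = parse_zone_eras(ZONE_TEXT)
starts = era_start_years(eras)
last_start = starts[-1]
print(name, starts, last_start)
```

A client layer could then compare last_start against 1970 (or any other
cutoff it cares about) without the parser having to decide anything.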
The last zone era begins in 1967. Other time zones may have much earlier
'starting points'. For instance:
# Rule  NAME   FROM  TO    TYPE  IN   ON      AT     SAVE  LETTER/S
Rule    Japan  1948  only  -     May  Sat>=1  24:00  1:00  D
Rule    Japan  1948  1951  -     Sep  Sat>=8  25:00  0     S
Rule    Japan  1949  only  -     Apr  Sat>=1  24:00  1:00  D
Rule    Japan  1950  1951  -     May  Sat>=1  24:00  1:00  D

# Zone  NAME        STDOFF   RULES  FORMAT  [UNTIL]
Zone    Asia/Tokyo  9:18:59  -      LMT     1887 Dec 31 15:00u
                    9:00     Japan  J%sT
That has a single zone era starting in 1887, and its latest DST rule is from 1951.
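The "latest DST rule" observation can be made mechanical in the same
way. The sketch below scans Rule lines for a named rule set and returns
the latest year any of them covers. The function name last_rule_year is
hypothetical, and the code assumes a numeric FROM field and the
unabbreviated keywords "only" and "max" in the TO field. It is just one
way to compute the number, not a statement of how tzdb classifies rule
sets.

```python
# Illustrative sketch only: last_rule_year is a hypothetical helper that
# assumes numeric FROM values and unabbreviated "only"/"max" TO keywords.
import math

RULE_TEXT = """\
Rule Japan 1948 only - May Sat>=1 24:00 1:00 D
Rule Japan 1948 1951 - Sep Sat>=8 25:00 0 S
Rule Japan 1949 only - Apr Sat>=1 24:00 1:00 D
Rule Japan 1950 1951 - May Sat>=1 24:00 1:00 D
"""


def last_rule_year(text, rule_name):
    """Latest year covered by the named rule set.

    Returns None if the rule set does not appear in the text, and
    math.inf if any of its lines is open-ended (TO is "max").
    """
    latest = None
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) < 4 or fields[0] != "Rule" or fields[1] != rule_name:
            continue
        from_year = int(fields[2])
        to_field = fields[3]
        if to_field == "max":
            return math.inf
        to_year = from_year if to_field == "only" else int(to_field)
        latest = to_year if latest is None else max(latest, to_year)
    return latest


japan_last = last_rule_year(RULE_TEXT, "Japan")
print(japan_last)
```

Returning math.inf for open-ended rules keeps a later comparison such as
"japan_last < 1970" meaningful for rule sets that are still in force.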
It seems these time zones could not be changed or merged into backzone. So
what constitutes "pre-1970" and justifies a move to backzone? I gather
this 'merge' procedure began in 2015? Is there documentation or
discussion of that decision?
-Brooks