[tz] Extra transition for Europe/London with 2023d

Guy Harris gharris at sonic.net
Fri Jan 5 23:31:17 UTC 2024


On Jan 5, 2024, at 2:16 PM, Stephen Colebourne via tz <tz at iana.org> wrote:

> I suspect that most people write TZDB source parsers because they want
> access to more data than the binary format provides. The source files
> are a wealth of information, which cannot be obtained in any other
> way.

Which information is that (other than transition dates/times and rules, which are mentioned later in your message)?

> For example, modern Java uses a list of historic transitions and
> encoded rules for future transitions. But some others prefer a list of
> transitions into the future (to some future year). I suspect the new
> format would supply both the rules and resolved transitions for future
> dates.

So what is the definition of a "transition" here?

The binary file obviously allows code that reads it to get information of the form "at date/time DT, one or more of {the offset from UTC, whether tm_isdst should be zero or non-zero, the time zone abbreviation} changes".  The tzcode doesn't happen to have APIs to *provide* that information, but that's a different matter.

Is there software that needs to know about transitions that change none of those?
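
To make that definition concrete, here is a minimal Python sketch, not anything that tzcode itself provides. It decodes only the version-1 data block of a TZif file, using the layout given in RFC 8536, and then drops any transition that changes none of the three items. The names Transition, read_v1_transitions and drop_noop_transitions are made up for illustration. The state in effect before the first transition is passed in by the caller (RFC 8536 says it is time type 0). Version 2 and later files also carry a second block with 64-bit times, which this sketch ignores.

```python
import struct
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):      # hypothetical record type, not a tzcode API
    at: int       # transition instant, seconds since the Unix epoch
    utoff: int    # offset from UT, in seconds, after the transition
    isdst: bool   # whether tm_isdst would be non-zero afterwards
    abbr: str     # time zone abbreviation afterwards


def read_v1_transitions(data: bytes) -> List[Transition]:
    """Decode the transitions in the version-1 data block of a TZif file."""
    if data[:4] != b"TZif":
        raise ValueError("not a TZif file")
    # Header: magic (4), version (1), reserved (15), then six 32-bit counts.
    isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt = struct.unpack(
        ">6L", data[20:44])
    pos = 44
    times = struct.unpack(">%dl" % timecnt, data[pos:pos + 4 * timecnt])
    pos += 4 * timecnt
    type_indexes = data[pos:pos + timecnt]      # one byte per transition
    pos += timecnt
    # Each local time type record is: int32 utoff, uint8 isdst, uint8 abbrind.
    types = [struct.unpack(">lBB", data[pos + 6 * i:pos + 6 * i + 6])
             for i in range(typecnt)]
    pos += 6 * typecnt
    abbrevs = data[pos:pos + charcnt]

    def abbr_at(index: int) -> str:
        # Abbreviations are NUL-terminated strings in one shared byte array.
        return abbrevs[index:abbrevs.index(b"\0", index)].decode("ascii")

    result = []
    for at, idx in zip(times, type_indexes):
        utoff, isdst, abbrind = types[idx]
        result.append(Transition(at, utoff, bool(isdst), abbr_at(abbrind)))
    return result


def drop_noop_transitions(initial: Tuple[int, bool, str],
                          transitions: List[Transition]) -> List[Transition]:
    """Keep only transitions that change utoff, isdst or the abbreviation."""
    kept = []
    state = initial
    for t in transitions:
        new_state = (t.utoff, t.isdst, t.abbr)
        if new_state != state:
            kept.append(t)
            state = new_state
    return kept
```

Software that consumes only the output of something like drop_noop_transitions would never see a transition of the kind asked about above.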

> See https://github.com/jodastephen/tzdiff/blob/master/data/Europe-London.txt
> for the kind of data Java needs (transitions and rules).

So are there Java classes that read those files and use them?

Or are they files produced by Java code that *uses* the data?

That file appears to list:

	the first entry in Zone Europe/London ("LMT: -00:01:15");

	a bunch of transitions, the first one being at 1847-12-01T00:00-00:01:15 and the last one being at 1997-10-26T02:00+01:00, with the date and time shown in ISO 8601 format (so 1997-10-26T02:00+01:00 means year 1997, month October, day 26, at 2:00 local time, with local time one hour ahead of UTC), "Gap" presumably meaning "the clock is turned forward", "Overlap" presumably meaning "the clock is turned back", and "to XXX" meaning "the offset from (proleptic?) UTC switches to XXX", with "Z" meaning "the offset from UTC is zero";

	two rules that are, I guess, presumed to cover all times after 1997-10-26T02:00+01:00.
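
Under that reading of the file, the "Gap"/"Overlap" label follows from comparing the offset before the transition with the offset after it. A minimal Python sketch, assuming the timestamp carries the old offset and the "to XXX" part gives the new one; the function name classify_transition is hypothetical and is not taken from tzdiff or from Java:

```python
from datetime import datetime, timedelta, timezone


def classify_transition(stamp: str, new_offset: timedelta):
    """Classify a transition as a gap or an overlap.

    stamp is an ISO 8601 local date/time with the offset in effect *before*
    the transition; new_offset is the offset in effect *after* it.
    Returns (kind, instant in UTC, local date/time just after the change).
    """
    before = datetime.fromisoformat(stamp)
    old_offset = before.utcoffset()
    if old_offset is None:
        raise ValueError("timestamp must include a UTC offset")
    delta = new_offset - old_offset
    if delta > timedelta(0):
        kind = "Gap"        # the clock is turned forward
    elif delta < timedelta(0):
        kind = "Overlap"    # the clock is turned back
    else:
        kind = "None"       # the offset does not change
    instant = before.astimezone(timezone.utc)
    after = before.astimezone(timezone(new_offset))
    return kind, instant, after


# Assumption: the last listed transition goes from +01:00 to Z.
kind, instant, after = classify_transition("1997-10-26T02:00+01:00",
                                           timedelta(0))
print(kind, instant.isoformat(), after.isoformat())
```

If that assumption holds, the 1997 entry is an overlap: 02:00 at +01:00 and 01:00 at Z are the same instant, so the local hour from 01:00 to 02:00 occurs twice.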

Most of those transitions are generated by rules for GB-Eire, but those rules are *not* in that file, even though they *are* in the europe source file.  What is the criterion for when it switches from showing transitions to showing rules?

> Issues such as the negative daylight savings flag go away. The
> alternate format would simply supply both flags. eg. for Europe/Dublin
> winter would have something like "dstLegal=true" and
> "dstSummer=false".

So what do the (presumed) Booleans "dstLegal" and "dstSummer" mean here?

> Note that it would basically need to expose all data in the source
> files (otherwise people will keep on parsing the source files).

"Data" presumably meaning "not comments".

> Ideally, the final TZif binary format would be derived from the new
> alternate format, thus the flow would be TZ source files (intended for
> internal TZDB use only) -> TZ JSON -> TZif binary

"Internal TZDB use" presumably meaning "kept in the TZDB repository and shipped as part of a TZDB 'source code release', but not installed on systems using the tzdb"?  They also serve as human-readable text; JSON is "human-readable" in that it's a textual format, but I'm not sure I'd call it "human-readable" in the sense of "every bit as easy for a human to read as zic source is".

I'd be more inclined to treat the JSON format as an alternative compiled format.

