[tz] Use cases for source data

Gilmore Davidson gilmoreorless at gmail.com
Fri Oct 8 01:33:06 UTC 2021


Stephen Colebourne's recent attempts to define requirements and use cases
for tzdb got me thinking about the source data files.

There have been several discussions over the years regarding the intended
stability of the source format, separate from TZif binary files. These
generally arise when the answer to a data query is "use this special
compile-time flag". The most recent example is from Stephen at
http://mm.icann.org/pipermail/tz/2021-September/030561.html

> The subtlety is in how the data set is consumed. While many downstream
> projects use the makefile, not all do. A significant portion of
> downstream users make use of the source files directly, with their own
> parsers. ie. there is no ability to use a compile-time option. Those
> parsers are not setup to use backzone even if it were a valid option

I think some of these discussions come about because there's disagreement
about the intended use of the source files. The impression I've got from
this list over the years is that the source files weren't intended to be a
stable API; only the compiled TZif binary files were intended to be
consumed downstream (and in later years, the .zi text representations). But
many projects do use the source files as their primary interface, leading
to tension when the format changes.

So this is my attempt to list the use cases that people have for the source
data files. This could prompt a wider discussion about what counts as a
defined, stable interface for tzdb. Not every consumer of tzdb data is
relying solely on the pre-compiled binary files.

NOTE: This is a separate issue from the recent pre-/post-1970 data
discussions. I'm trying to stay away from wading into that topic.

-----------------------

Consumers of the source data text files:

1) The zic compiler for producing TZif binary files and .zi text files in
an official tzdb release.

2) People reading the file comments to understand the history of certain
zones.

3) Libraries/projects that need to parse/compile the data into a different
representation than that provided by zic. This is often, but not
exclusively, to preserve the distinction between Link and Zone entries,
either for API or data compression reasons.

4) Data analysis or visualisations that care about knowing the higher-level
Rule definitions ("transition on the last Sunday of October every year")
rather than the exact transition timestamps for each year.

5) IDE or text editor plugins that provide syntax highlighting for the
source files.

All of these are based only on my own experience and on lurking on this
list. Any corrections/additions are welcome.
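
As a rough illustration of use cases 3 and 4 above, here is a minimal Python
sketch of a consumer that reads the source text directly and keeps Link
entries, Zone entries and Rule definitions separate, instead of working from
compiled transition timestamps. Everything named here is an assumption for
illustration: parse_tz_source and SAMPLE are hypothetical, the names
Example/City and Example/Alias are made up, and the sample lines are only
written in the style of the source files, not copied from a release. The
sketch also assumes that continuation lines start with whitespace and it
ignores quoted fields and abbreviated keywords, both of which a real parser
would need to handle.

```python
from collections import defaultdict

# Hypothetical sample in the style of the tzdb source files (not real data).
SAMPLE = """\
# Rule  NAME  FROM  TO   -  IN   ON       AT     SAVE  LETTER/S
Rule    EU    1981  max  -  Mar  lastSun  1:00u  1:00  S
Rule    EU    1996  max  -  Oct  lastSun  1:00u  0     -
# Zone  NAME          STDOFF   RULES  FORMAT  [UNTIL]
Zone    Example/City  0:17:30  -      LMT     1880
                      1:00     EU     CE%sT
# Link  TARGET        LINK-NAME
Link    Example/City  Example/Alias
"""


def parse_tz_source(text):
    """Split source text into Zone eras, Link aliases and Rule lines."""
    zones = defaultdict(list)  # zone name -> list of era field lists
    links = {}                 # link name -> target zone name
    rules = defaultdict(list)  # rule name -> list of rule field lists
    current_zone = None

    for raw in text.splitlines():
        # Simplification: treat every '#' as the start of a comment.
        line = raw.split("#", 1)[0].rstrip()
        if not line.strip():
            continue
        fields = line.split()

        # Assumption: an indented line continues the preceding Zone entry.
        if line[0].isspace():
            if current_zone is not None:
                zones[current_zone].append(fields)
            continue

        keyword = fields[0]
        if keyword == "Zone":
            current_zone = fields[1]
            zones[current_zone].append(fields[2:])
        elif keyword == "Link":
            links[fields[2]] = fields[1]
            current_zone = None
        elif keyword == "Rule":
            rules[fields[1]].append(fields[2:])
            current_zone = None

    return zones, links, rules


zones, links, rules = parse_tz_source(SAMPLE)
```

With this shape of output, the Link/Zone distinction survives in links and
zones, and the higher-level "last Sunday of October" definition stays
available as the lastSun field in rules["EU"], rather than being expanded
into one timestamp per year.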