[tz] input needed on creation of a new sub-package for raw zone data

Paul Eggert eggert at cs.ucla.edu
Tue May 23 08:29:02 UTC 2017


Patsy Franklin wrote:

> We are planning to ship a new subpackage for users who want to have access
> to the raw zone data files e.g. leapseconds

This is a good idea overall; thanks. Here are some comments and suggestions for 
improvement.

First, as a terminology issue, we need a better name than "raw zone data". The 
files we're talking about are ordinary text files, and "raw" has the wrong 
connotation for text. Also, the package name "tzdata-zonedata" is repetitive and 
somewhat confusing. Instead, how about a package name like "tzdata-info" or 
"tzdata-src" or something like that?

> Just as an example we would ship the following files:
> LICENSE

The LICENSE file conveys misleading information for the files in question, as 
they are all public domain, so let's not install it. Of course if you want to 
install all the source files as a package, then LICENSE should be included along 
with all the other files in the tzdb tarball; but as I understand it, the goal 
here is to install only the data source.

> africa
> antarctica
> asia
> australasia
> europe
> northamerica
> southamerica
> pacificnew
> etcetera
> backward
> systemv
> factory
> backzone

The installed source data should match the installed binary data, so the above 
list of files needs to be adjusted to match what's installed as binary data. For 
example, by default 'backzone' should be omitted since its data items are 
normally not installed.

Also, that's a long list of file names. I would rather not propagate 
implementation details like this list into the installation directory. Although 
the intent may be that "the raw zone data format may change", in practice what 
happens is that people depend on the format. So we might as well use a simple 
format rather than a complicated one; see below for a specific proposal.

> iso3166.tab
> zone1970.tab
> zone.tab.

These files are already installed, and installing copies of them in a different 
directory would lead to operational problems. How about if we just leave them 
where they already are?

> leapseconds
> leap-seconds.list

We need not and probably should not ship two text files that contain the same 
leap-second info in different representations. As we're considering removing 
leap-seconds.list anyway, let's just install 'leapseconds' and skip 
leap-seconds.list.

> version

I would rather that we didn't recommend installing this file in the tzdb source, 
as that would be a maintenance hassle and anyway the file is not needed to 
generate the binary data. Similarly, I don't think the installation directory's 
name should contain the tzdb version number, as others have proposed. Versioning 
should be an independent aspect of operations, and it should not be our job.


With the above in mind, here's a simpler proposal: We optionally install two 
text files: 'leapseconds' and a new file 'tzdata.zi' containing the parts of 
asia, australasia, etc. that are actually used to create the binary data.
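
As a rough illustration of what producing such a combined file might involve, here 
is a minimal Python sketch. It assumes that "the parts that are actually used" can 
be approximated by dropping comments and blank lines and concatenating what is 
left; the proposal does not specify the procedure, and the names strip_comment and 
combine_sources are hypothetical.

```python
from typing import Iterable


def strip_comment(line: str) -> str:
    """Drop a trailing '#' comment, ignoring any '#' inside double quotes."""
    in_quotes = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "#" and not in_quotes:
            return line[:i]
    return line


def combine_sources(texts: Iterable[str]) -> str:
    """Concatenate zic source texts, keeping only lines that carry data.

    Assumption: comment-only and blank lines contribute nothing to the
    binary output, so they can be omitted from the combined file.
    """
    kept = []
    for text in texts:
        for raw in text.splitlines():
            line = strip_comment(raw).rstrip()
            if line.strip():
                kept.append(line)
    return "\n".join(kept) + "\n"
```

A caller would read the selected source files (asia, australasia, and so on, 
minus anything not installed as binary data) and write the result of 
combine_sources to a single output file.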

The idea is that 'zic tzdata.zi' exactly re-creates the installed binary data 
files, and that 'zic -L leapseconds tzdata.zi' does the same for data with leap 
seconds. Programs that want text rather than binary data can read tzdata.zi (and 
optionally, 'leapseconds'). Because tzdata.zi uses the documented zic format, 
third-party tools can parse it. (".zi" stands for "zoneinfo": ".zi" is to zic as 
".c" is to cc.)
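
To give a feel for what a third-party reader of the zic format might look like, 
here is a minimal Python sketch. It is not part of the proposal. It assumes the 
keywords Rule, Zone and Link are spelled out in full, treats every '#' as the start 
of a comment (so it ignores quoting), and does not handle 'leapseconds' lines; the 
function name parse_zic_text is hypothetical.

```python
def parse_zic_text(text):
    """Group zic-format lines into rules, zones, and links.

    Returns (rules, zones, links) where:
      rules is a list of field lists (everything after the "Rule" keyword),
      zones maps a zone name to its list of eras, each a field list of
            STDOFF RULES FORMAT followed by any UNTIL fields,
      links maps a link name to its target zone name.
    """
    rules = []
    zones = {}
    links = {}
    open_zone = None  # zone that still expects continuation lines

    for raw in text.splitlines():
        # Simplification: cut at the first '#', without regard to quotes.
        fields = raw.split("#", 1)[0].split()
        if not fields:
            continue

        if open_zone is not None:
            # Continuation line: same columns as a Zone line minus
            # the keyword and the zone name.
            era = fields
        elif fields[0] == "Zone":
            open_zone = fields[1]
            zones[open_zone] = []
            era = fields[2:]
        elif fields[0] == "Rule":
            rules.append(fields[1:])
            continue
        elif fields[0] == "Link":
            # Link lines are "Link TARGET LINK-NAME".
            links[fields[2]] = fields[1]
            continue
        else:
            raise ValueError("unrecognized line: %r" % raw)

        zones[open_zone].append(era)
        # An era without an UNTIL column is the zone's last line.
        if len(era) <= 3:
            open_zone = None

    return rules, zones, links
```

The continuation handling follows the zic convention that a zone line with an 
UNTIL column is followed by another line for the same zone. A real reader would 
also need to interpret the offsets, rule dates and format strings, which this 
sketch leaves as plain strings.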

We can install these two text files by default into the same directory as the 
already-installed text files iso3166.tab, zone1970.tab, and zone.tab.

