[tz] draft of change summary for next tz release

Stephen Colebourne scolebourne at joda.org
Wed Sep 18 12:22:11 UTC 2013

On 18 September 2013 10:59, Meno Hochschild <mhochschild at gmx.de> wrote:
> So I support the removal, since I think the discussed data obviously appear
> to be of very questionable nature. We cannot even consider the discussed
> data as "our best estimate".  Of course, there is no 100% guarantee - no
> black and white. If someone can know it better then it is easy to add the
> lost data again. But we should really not let the users in the state of
> wrong assumptions. Stability of data should not be the primary concern,
> rather correctness. And most users (near 100%) don't care about old that is
> to say archeological timezone data.

Pre-1970 data matters to some more than others. I can see a range of positions:
a) delete all pre-1970 data
b) only have Zones for areas distinct after 1970, other IDs are Links,
full data where available for each Zone
c) only have IDs for areas distinct after 1970, full data where
available for each ID
d) create new IDs where data only differs before 1970
I'm arguing for (c), which I previously believed was the tzdb's goal.
The data deletion is based on (b).
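
To make the difference between (b) and (c) concrete, here is a rough sketch in Python. It is only an illustration: the Zone and Link classes and the resolve() helper are hypothetical, the history strings are placeholders rather than real tz data, and it assumes that under (c) an existing ID such as Guadeloupe keeps its own pre-1970 data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """An ID that carries its own data (hypothetical model, not tz code)."""
    name: str
    pre1970: str  # placeholder label standing in for pre-1970 rules


@dataclass(frozen=True)
class Link:
    """An ID that is only an alias for another Zone."""
    name: str
    target: str


def resolve(db, name):
    """Return the Zone that actually supplies the data for an ID."""
    entry = db[name]
    if isinstance(entry, Link):
        entry = db[entry.target]  # a Link borrows the target's history
    return entry


# Position (b): Guadeloupe becomes a Link to Port of Spain.
db_b = {
    "America/Port_of_Spain": Zone("America/Port_of_Spain", "guess for Port of Spain"),
    "America/Guadeloupe": Link("America/Guadeloupe", "America/Port_of_Spain"),
}

# Position (c), as assumed here: Guadeloupe keeps its own guess.
db_c = {
    "America/Port_of_Spain": Zone("America/Port_of_Spain", "guess for Port of Spain"),
    "America/Guadeloupe": Zone("America/Guadeloupe", "guess for Guadeloupe"),
}

print(resolve(db_b, "America/Guadeloupe").pre1970)
print(resolve(db_c, "America/Guadeloupe").pre1970)
```

Under (b) the lookup for Guadeloupe returns the guess made for Port of Spain; under (c) it returns the guess made for Guadeloupe itself.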

The quality of the deleted data is also valued differently by different
people. I'll try to explain it in a different way...

We know that the quality of the historic data for the Caribbean is
dubious. Let's give it an accuracy rating of 20%. One argument is that
removing data with 20% accuracy is a good thing, and that is an
understandable position. However, it's important to look at the
consequences of the deletion. Previously, location A (e.g. Guadeloupe)
had a 20% accuracy rating for its historic data. After the change it
still has a 20% accuracy rating, just with different data. But that
20% accuracy rating refers to location B (e.g. Port of Spain). From the
perspective of location A, the accuracy is now lower, say 5%, because
20% accurate data for location B clearly is even less accurate for
location A.

I understand that the distinction here is fine. But it's rather like
saying "we published a guess 10 years ago for when the first factory
opened in Brussels, but it is now OK to replace that data with the guess
we made for when the first factory opened in London" (assuming we
recorded factory opening dates). I value each guess being distinct for
each location in the absence of better information.

