[tz] Dealing with Pre-1970 Data

Stephen Colebourne scolebourne at joda.org
Sat Aug 31 23:39:27 UTC 2013

On 31 August 2013 22:06, Paul Eggert <eggert at cs.ucla.edu> wrote:
> Lester Caine wrote:
>> having a single repository for ALL available data should be a goal?
> Sure, it's a worthy goal, even if pre-1970 timestamps are
> currently out of scope for the current database.  We could
> collect all the data that we can for an extended database
> that contains new zones that differ from existing ones only
> for pre-1970 timestamps.  We could then derive the current
> database by applying a filter to the extended database,
> along the lines that Zefram suggested.  This filtering could
> be done automatically and at the source level, so existing
> tz source file readers would not need to be changed, and we
> wouldn't have to maintain two copies of the database.

I can see no reason why an optional additional file "extended" could
not exist for new zone IDs that only exist to record history before
1970. Such additional zones would not appear in zone.tab. Most people
would just ignore the "extended" file.

However, I would argue that any zone ID that already exists (or is
newly created) should have its full pre-1970 history retained and
enhanced within the main tzdb files, so all current consumers simply
pick up the enhancements.

> As I understand it, Stephen wouldn't oppose the existence of
> a filter per se, but is uneasy about having the default
> filter being set to 1970.  But I'm afraid the filtering
> approach won't work unless we continue to filter at 1970
> as we have regularly done in the past.  Too much
> existing practice is based in the 1970 cutoff, and (as now
> explained in the Theory file) the 1970 cutoff is not really
> that arbitrary -- rather, it corresponds roughly with the
> advent of computerized timekeeping and of a greater need for
> standardized civil time.

Filtering data applies to database consumers via zic, not the database
itself. The database itself should not be limited to storing data
after 1970. If someone makes a contribution to an existing ID before
1970, that data should be included in the main files. Whether that
contribution causes a Link to become a full Zone should never be a
relevant factor.
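
To make the consumer-side idea concrete, here is a rough sketch of a
filter that takes full-history data and derives the post-cutoff view,
turning zones that become identical into links. This is only an
illustration, not how zic works: the zone IDs, the data layout and the
names filter_zone and derive_links are all made up for the example.

```python
from datetime import datetime, timezone

UTC = timezone.utc
CUTOFF = datetime(1970, 1, 1, tzinfo=UTC)

# Hypothetical full-history data: zone ID -> list of
# (transition instant in UTC, UTC offset in seconds from that instant on).
EXTENDED = {
    "Zone/A": [
        (datetime(1920, 1, 1, tzinfo=UTC), 3600),
        (datetime(1980, 4, 6, tzinfo=UTC), 7200),
    ],
    "Zone/B": [
        (datetime(1945, 5, 1, tzinfo=UTC), 7200),
        (datetime(1950, 1, 1, tzinfo=UTC), 3600),
        (datetime(1980, 4, 6, tzinfo=UTC), 7200),
    ],
}


def filter_zone(transitions, cutoff=CUTOFF):
    """Return (offset in effect at the cutoff, transitions at or after it)."""
    ordered = sorted(transitions)
    before = [offset for when, offset in ordered if when < cutoff]
    initial = before[-1] if before else None
    later = tuple((when, offset) for when, offset in ordered if when >= cutoff)
    return (initial, later)


def derive_links(extended, cutoff=CUTOFF):
    """Split zone IDs into full zones and links to an identical zone.

    Two zones count as identical when they agree from the cutoff onwards.
    The alphabetically first ID of each group is kept as the full zone.
    """
    zones, links, seen = {}, {}, {}
    for zone_id in sorted(extended):
        key = filter_zone(extended[zone_id], cutoff)
        if key in seen:
            links[zone_id] = seen[key]
        else:
            seen[key] = zone_id
            zones[zone_id] = key
    return zones, links


zones, links = derive_links(EXTENDED)
print(sorted(zones), links)
```

With the 1970 cutoff the two made-up zones collapse into one zone and
one link; with an earlier cutoff (say 1900) both stay as full zones.
The point is that the stored data is the same in both cases and only
the consumer's view changes.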

It might well be that this is slightly broader than previous practice, but
it is not overly onerous. There are relatively few Links in the tzdb
that exist to share data (as opposed to simple renames). Only if
someone researches the history of one of those locations is there a
need to convert the Link to a Zone.

i.e. the notion that "it's before 1970 so we don't care" needs to be
toned down a little. It is still a useful guide, but if real data is
available pre-1970, it is only responsible to keep and retain it.

Final note. LMT can now be safely ignored in this discussion.

