Yap decision

Paul Eggert eggert at CS.UCLA.EDU
Wed Feb 22 17:58:40 UTC 2006


Jesper Norgaard Welen <jnorgard at prodigy.net.mx> writes:

> From tz database version 2005l to current version (2006a) the timezone
> Indian/Yap was removed.

I assume you mean "Pacific/Yap"?  I don't see any record of any "Indian/Yap".

In 2005l Pacific/Yap was moved to the 'backward' file.  This was
suggested by Mark Davis, who noted that it was a duplicate entry with
Pacific/Truk (as it had the same time zone history since 1970).

This was noted in my 2005-08-17 proposed tz changes.  I also CC'ed a
more-detailed email about this to <tz at lecserver.nci.nih.gov>, but
perhaps it got lost because I should have used
<tz at elsie.nci.nih.gov>.  I enclose another copy of the earlier message
below.  Note that some of the discussion is obsolete now (e.g., the
stuff about EST5EDT was fixed in a different way, in December).


From: Paul Eggert <eggert at cs.ucla.edu>
Date: Tue, 16 Aug 2005 11:12:45 -0700
To: "Mark Davis" <mark.davis at icu-project.org>
Cc: <tz at lecserver.nci.nih.gov>
Subject: Re: Request from CLDR committee

> From: "Mark Davis" <mark.davis at icu-project.org>
> Sent: Tuesday, August 09, 2005 07:28
>
> 1. Missing Country Codes.
> ...
> Suggested changes:
> To antarctica, add
> Zone    Atlantic/Bouvet 0:00     -           GMT
> Zone    Pacific/Heard    5:00     -           GMT

The problem with this approach, as I see it, is that the data are
purely invented.  Heard Island is not 5 hours ahead of GMT, since
nobody lives there.  In a practical sense, local time is undefined for
uninhabited locations, and it would be misleading for the tz database
to imply otherwise.

I suppose we could work around this by using a "zzz" entry, e.g.:

Zone Indian/Heard 0:00 - zzz

but I wonder whether this might cause more troubles than it'd cure.
"zzz" currently is used to denote local time for locations that are
sometimes inhabited and sometimes uninhabited.  However, this is a
hack: what is really wanted is a relation where the entries for local
time are simply absent while the location is uninhabited.

As far as I can tell, the country codes HM and BV are ISO 3166
curiosities.  For example, www.bv is now owned by Black & Veatch, a
privately-held firm that two of my cousins work for -- it has nothing
to do with Bouvet Island.  I wouldn't worry too much about these
glitches in the ISO database.

> 2. Enabling Canonical IDs
>
> One dependency we have is that the last field in the canonical ID be unique.
> That is, we can't have both a Europe/London and an America/London.

I wouldn't rely on this.  It's too constraining on the database.
Part of the point of having a tree-structured name space is the
desire to avoid constraints like these.

> This appears to be the practice in
> the TZ database, as evidenced by the following in southamerica:
>
> # Bahia (BA)
> # There are too many Salvadors elsewhere, so use America/Bahia instead
> # of America/Salvador.

That comment refers to the other Salvadors in America.  If there were
a Salvador in Africa (or even in America/Argentina) it wouldn't be a
problem.
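
If you want to see how far a given list of names is from satisfying
that uniqueness constraint, a check along the following lines would
do.  This is only a rough sketch: the function name is made up for
illustration, and "America/London" is the hypothetical name from the
example quoted above, not a real entry.

```python
from collections import defaultdict


def duplicate_last_fields(tzids):
    """Return {last_field: [ids...]} for every last field shared by 2+ IDs."""
    groups = defaultdict(list)
    for tzid in tzids:
        # The "last field" is whatever follows the final slash.
        last_field = tzid.rsplit("/", 1)[-1]
        groups[last_field].append(tzid)
    return {leaf: ids for leaf, ids in groups.items() if len(ids) > 1}


# "America/London" is hypothetical and is here only to trigger a clash.
sample = ["Europe/London", "America/London", "America/Bahia"]
print(duplicate_last_fields(sample))
```

An empty result means the last fields happen to be unique in that
list; it says nothing about whether the database promises to keep
them that way.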

> We also depend on the feature that every equivalence class (except Etc/...)
> has exactly one member in zone.tab.

That sounds reasonable.  How about if we add something to Theory
saying that zone.tab "is intended to be an exhaustive list of
canonical names for geographic regions."
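
The "exactly one member in zone.tab" property could be checked
mechanically, roughly as follows.  This is a sketch under stated
assumptions, not a description of any existing tool: the function
names are invented, equivalence classes are taken to be "a Zone plus
whatever is linked to it", and the sample data are made up so that
one class fails the check.

```python
def classes_from_links(zones, links):
    """Build {canonical: members} from Zone names and an alias->target map."""
    classes = {zone: {zone} for zone in zones}
    for alias, target in links.items():
        # Follow chains of links, guarding against accidental cycles.
        seen = set()
        while target in links and target not in seen:
            seen.add(target)
            target = links[target]
        classes.setdefault(target, {target}).add(alias)
    return classes


def zone_tab_violations(classes, zone_tab):
    """Return classes (other than Etc/...) without exactly one zone.tab name."""
    bad = {}
    for canonical, members in classes.items():
        if canonical.startswith("Etc/"):
            continue
        listed = sorted(members & zone_tab)
        if len(listed) != 1:
            bad[canonical] = listed
    return bad


# Illustrative data only; this zone_tab set deliberately lists both FM names.
zones = ["Pacific/Truk", "Europe/London", "Etc/GMT"]
links = {"Pacific/Yap": "Pacific/Truk", "Europe/Belfast": "Europe/London"}
zone_tab = {"Pacific/Truk", "Pacific/Yap", "Europe/London"}
print(zone_tab_violations(classes_from_links(zones, links), zone_tab))
```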

> List A. TZIDs that are not linked, but are the same
>
>             001      Etc/GMT
>             001      Etc/UTC
>             001      Etc/UCT -- not linked, identical

They are not identical, since they use different abbreviations.  Hence
they cannot be linked.  For example, on my Debian GNU/Linux 3.1 box:

  $ TZ=Etc/GMT date
  Tue Aug 16 17:31:23 GMT 2005
  $ TZ=Etc/UTC date
  Tue Aug 16 17:31:27 UTC 2005

The difference in output is intentional.

How about if you simplify your life by simply ignoring the Etc/* names?

> List B. TZIDs that are not linked, are different locations, but are the same
> since 1970
>
>             AQ      Antarctica/Mawson
>             AQ      Antarctica/Vostok -- same since 57

As far as I know it's merely a coincidence that the two bases use the
same time zone.  Most crucially the bases belong to different
countries (so to some extent the "different country" rule applies,
though admittedly Antarctica is special) and use different supply
lines.  I'd rather leave them alone for now.

>             FM      Pacific/Truk
>             FM      Pacific/Yap -- same since 70
>             GB       Europe/Belfast
>             GB       Europe/London -- same since 68
>             ML      Africa/Bamako
>             ML      Africa/Timbuktu -- same since 60

OK, let's merge those.  It is more consistent to do so.  We'll keep
backward-compatibility entries, of course.  Sigh.  I sort of liked
having Timbuktu in the database.
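
To illustrate what the backward-compatibility entries buy you, here
is a small sketch of resolving an old name through "Link" lines.  The
three Link lines are my illustration of what such entries would look
like (in the usual "Link TARGET ALIAS" form), and the helper names
are invented for the example.

```python
def parse_links(text):
    """Parse 'Link TARGET ALIAS' lines into an alias->target map."""
    links = {}
    for raw in text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, split on blanks
        if len(fields) == 3 and fields[0] == "Link":
            _, target, alias = fields
            links[alias] = target
    return links


def canonical(name, links):
    """Follow links until reaching a name that is not itself an alias."""
    seen = set()
    while name in links and name not in seen:
        seen.add(name)
        name = links[name]
    return name


# Illustrative entries only, written for this example.
BACKWARD = """\
Link Pacific/Truk   Pacific/Yap
Link Europe/London  Europe/Belfast
Link Africa/Bamako  Africa/Timbuktu
"""

links = parse_links(BACKWARD)
print(canonical("Pacific/Yap", links))   # an old name resolves to its target
print(canonical("Pacific/Truk", links))  # a canonical name is left alone
```

With entries like these, software that still asks for the old name
keeps working, while only the target name needs to be treated as
canonical.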

> List C. TZIDs that are linked, but refer to different locations.

As a rule, these locations have identical time zone histories except
before the advent of standard time; and LMT (by definition) is
approximate, so I don't see the harm in keeping them linked.
If they are trouble for your system, perhaps you can keep a
list of exceptions and filter them out.

That being said, perhaps some of the "Link" lines should be moved from
northamerica (etc.) to backward.  For example, the line

Link America/New_York EST5EDT

is in the "northamerica" file, but it's really there only for backward
compatibility with System V, so perhaps we should move it to
"backward".  Arthur, would you be open to that sort of thing?


