FW: Request from CLDR committee

Olson, Arthur David (NIH/NCI) olsona at dc37a.nci.nih.gov
Tue Aug 9 18:48:48 UTC 2005


Mark Davis is not on the time zone mailing list (at least not at the address
below); direct replies appropriately.

				--ado

----- Original Message -----
From: "Mark Davis" <mark.davis at icu-project.org>
To: "Tz (tz at elsie.nci.nih.gov)" <tz at lecserver.nci.nih.gov>
Cc: "Olson, Arthur David (NIH/NCI)" <olsona at dc37a.nci.nih.gov>
Sent: Tuesday, August 09, 2005 07:28
Subject: Request from CLDR committee


The CLDR technical committee has been looking at issues that have come up in
connection with timezones, and has the following requests for the TZ group.

Background.

As discussed previously on this list, the CLDR project supplies
localizations for timezone identifiers, based on the TZ IDs from the TZ
database. A localization can be an explicit string; if none is available,
we use the translated country name (where a country has only a single
timezone) or fall back to the last field of the TZID plus the translated
country name. Thus in German:

America/Havana => "Kuba"
Europe/Moscow => "Moskau (Russische Föderation)"
America/Los_Angeles => "Los Angeles (Vereinigte Staaten)"

The process is somewhat more complicated than this description, but this
provides the gist. For other examples, see
http://unicode.org/cldr/data/common/test/ with different locales.
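
The fallback order described above can be sketched roughly as follows. This
is only an illustration in Python, not CLDR's actual code: the function name
display_name and all four lookup tables are hypothetical, and the real
process has more steps.

```python
def display_name(tzid, explicit, zone_country, country_names, city_names):
    """Pick a display name for tzid using the fallback order described above.

    All four tables are hypothetical stand-ins for locale data:
      explicit      -- TZID -> explicit localized string
      zone_country  -- TZID -> ISO 3166 country code
      country_names -- country code -> translated country name
      city_names    -- last TZID field -> translated city name
    """
    # 1. An explicit localization wins outright.
    if tzid in explicit:
        return explicit[tzid]

    country = zone_country[tzid]
    country_name = country_names[country]

    # 2. A country with a single zone is named by the country alone.
    zones_in_country = [z for z, c in zone_country.items() if c == country]
    if len(zones_in_country) == 1:
        return country_name

    # 3. Otherwise fall back to "<last field> (<country>)".
    last_field = tzid.rsplit("/", 1)[-1]
    city = city_names.get(last_field, last_field.replace("_", " "))
    return f"{city} ({country_name})"


# Toy data, just enough to reproduce the three German examples.
zone_country = {
    "America/Havana": "CU",
    "Europe/Moscow": "RU",
    "Asia/Vladivostok": "RU",
    "America/Los_Angeles": "US",
    "America/New_York": "US",
}
country_names = {
    "CU": "Kuba",
    "RU": "Russische Föderation",
    "US": "Vereinigte Staaten",
}
city_names = {"Moscow": "Moskau"}
```

With this toy data, display_name("America/Havana", {}, zone_country,
country_names, city_names) takes the single-zone branch, while the Moscow
and Los Angeles IDs take the last-field branch.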

(BTW, a corrigendum was issued for the GMT problem raised earlier; see
http://unicode.org/cldr/corrigenda.html).


Requests.

1. Missing Country Codes.

There are two missing ISO country codes. While these are uninhabited rocks,
they should be added according to Theory, which says:

"       Include at least one location per time zone rule set per country.
               One such location is enough.  Use ISO 3166 (see the file
               iso3166.tab) to help decide whether something is a country.
"

There are also good, practical reasons to do this; an implementation that
maps from country codes to sets of zones needs to have some value for all
ISO country codes.
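
To make the practical problem concrete, here is a minimal sketch of such a
country-to-zones mapping and of a check for ISO codes that end up with no
zone at all. It assumes the tab-separated layouts of iso3166.tab (code,
name) and zone.tab (code, coordinates, TZ, optional comments) with '#'
comment lines; the function names and the sample text are made up for the
illustration.

```python
def zones_by_country(zone_tab_text):
    """Map each country code in zone.tab-style text to its list of TZIDs."""
    mapping = {}
    for line in zone_tab_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        # Assumed columns: code, coordinates, TZ, optional comments.
        mapping.setdefault(fields[0], []).append(fields[2])
    return mapping


def countries_without_zones(iso3166_text, zone_tab_text):
    """Return the sorted ISO codes that have no zone.tab entry."""
    covered = zones_by_country(zone_tab_text)
    missing = []
    for line in iso3166_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        code = line.split("\t")[0]
        if code not in covered:
            missing.append(code)
    return sorted(missing)


# Tiny made-up excerpts, not the real files.
ISO_SAMPLE = (
    "# code\tname\n"
    "AU\tAustralia\n"
    "BV\tBouvet Island\n"
    "HM\tHeard Island and McDonald Islands\n"
    "NO\tNorway\n"
)
ZONE_SAMPLE = (
    "AU\t-3352+15113\tAustralia/Sydney\tNew South Wales - most locations\n"
    "NO\t+5955+01045\tEurope/Oslo\n"
)
```

On these excerpts, countries_without_zones(ISO_SAMPLE, ZONE_SAMPLE) reports
BV and HM, which is the gap this request is about.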

The two ISO codes are:

HM  53 06 S, 72 31 E    Heard Island and McDonald Islands
BV  54 26 S,  3 24 E    Bouvet Island

(locations are from
http://www.cia.gov/cia/publications/factbook/geos/bv.html
http://www.cia.gov/cia/publications/factbook/geos/hm.html )

Suggested changes:

To zone.tab, add
HM      -5306+07231    Pacific/Heard
BV      -5426+00324    Atlantic/Bouvet

To antarctica, add
Zone    Atlantic/Bouvet 0:00     -           GMT
Zone    Pacific/Heard    5:00     -           GMT
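
For reference, the coordinate column of zone.tab uses the ISO 6709
sign-degrees-minutes form, +-DDMM+-DDDMM, so longitude degrees are
zero-padded to three digits. A minimal sketch of converting factbook-style
coordinates such as "53 06 S, 72 31 E" into that form; the function name is
hypothetical.

```python
def zone_tab_coordinate(lat_deg, lat_min, lat_hemi, lon_deg, lon_min, lon_hemi):
    """Format degrees/minutes plus hemisphere letters as +-DDMM+-DDDMM."""
    lat_sign = "-" if lat_hemi == "S" else "+"
    lon_sign = "-" if lon_hemi == "W" else "+"
    # Latitude degrees take two digits, longitude degrees take three.
    return (
        f"{lat_sign}{lat_deg:02d}{lat_min:02d}"
        f"{lon_sign}{lon_deg:03d}{lon_min:02d}"
    )


heard = zone_tab_coordinate(53, 6, "S", 72, 31, "E")
bouvet = zone_tab_coordinate(54, 26, "S", 3, 24, "E")
```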


2. Enabling Canonical IDs

The Link commands in the database establish equivalence classes between
TZIDs (aka "location names"). For an implementation like CLDR, it is
important that there be a completely stable canonical TZID that represents
any of those equivalents. Based on feedback from this list, we chose it to
be:
a) the TZID in zone.tab as of 2004a, or
b) any new TZID in a later version of zone.tab that is not equivalent to a
TZID introduced in an earlier version.

That is, we use America/Buenos_Aires since it was in 2004a, and
America/Argentina/Tucuman since it was introduced later.
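
Rules (a) and (b) amount to "the first zone.tab TZID ever seen for an
equivalence class stays canonical". A minimal sketch under that reading; the
function name, the data layout, and the "later" version label are
assumptions made for the illustration, not part of CLDR or the TZ database.

```python
def canonical_ids(zone_tab_by_version, class_of):
    """Return {equivalence class: canonical TZID}.

    zone_tab_by_version -- list of (version label, set of TZIDs in that
                           version's zone.tab), oldest first, starting at 2004a
    class_of            -- TZID -> frozenset of all TZIDs equivalent to it
    """
    canonical = {}
    for _version, tzids in zone_tab_by_version:
        # sorted() only makes the result deterministic; one version should
        # never list two equivalent TZIDs in the first place.
        for tzid in sorted(tzids):
            cls = class_of.get(tzid, frozenset([tzid]))
            # The first zone.tab TZID seen for a class stays canonical;
            # later equivalents never displace it.
            canonical.setdefault(cls, tzid)
    return canonical


# Toy data mirroring the two examples above.
buenos_aires = frozenset(
    {"America/Buenos_Aires", "America/Argentina/Buenos_Aires"}
)
class_of = {tzid: buenos_aires for tzid in buenos_aires}
versions = [
    ("2004a", {"America/Buenos_Aires"}),
    # "later" is a placeholder label, not a real release name.
    ("later", {"America/Argentina/Buenos_Aires", "America/Argentina/Tucuman"}),
]
result = canonical_ids(versions, class_of)
```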

One dependency we have is that the last field in the canonical ID be unique.
That is, we can't have both a Europe/London and an America/London. Now, if
the TZ database ever added a TZID that was not unique in this sense, we
could add our own canonical ID outside of the TZ database, but we would
much prefer not to. Keeping the last field unique already appears to be the
practice in the TZ database, as evidenced by the following in southamerica:

# Bahia (BA)
# There are too many Salvadors elsewhere, so use America/Bahia instead
# of America/Salvador.
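
A check for this dependency is easy to sketch. The following is a minimal
illustration, meant to be run over the canonical IDs only (equivalents such
as renamed IDs may legitimately share a last field); the function name is
hypothetical.

```python
from collections import defaultdict


def duplicate_locations(tzids):
    """Return {last field: sorted TZIDs} for last fields used more than once."""
    by_last_field = defaultdict(list)
    for tzid in tzids:
        by_last_field[tzid.rsplit("/", 1)[-1]].append(tzid)
    return {
        last: sorted(ids)
        for last, ids in by_last_field.items()
        if len(ids) > 1
    }


# America/London and America/Canada/London are invented for the example.
clashes = duplicate_locations(
    ["Europe/London", "America/London", "America/Canada/London", "America/Bahia"]
)
```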

We also depend on the feature that every equivalence class (except Etc/...)
has exactly one member in zone.tab.

To avoid having to hack around problems in the future, we would like this to
be captured in Theory as requirements for the construction of future IDs.
This is in no way a functional restriction, just a restriction on the choice
of names. Thus, we propose the addition of something like the following to
the "rules used for choosing location names" in Theory.

        All locations (the final field in a location name) must be unique.
        Thus one cannot have Europe/London and either America/London or
        America/Canada/London.

        Two location names that appear in zone.tab cannot be Linked
        together, either directly or through a chain of Links. Conversely,
        every location name (except for those starting with "Etc") must
        either appear in zone.tab or be Linked to a location name that does.
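
The second rule can be checked mechanically by building the equivalence
classes from the Link lines and counting zone.tab members per class. A
minimal sketch using a small union-find; the function names and the toy
data are assumptions for the illustration and do not reflect a real
snapshot of the database.

```python
def equivalence_classes(zones, links):
    """Group names into classes; links are (target, link_name) pairs."""
    parent = {}

    def find(name):
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for zone in zones:
        find(zone)  # register every Zone, even if nothing links to it
    for target, link_name in links:
        root_a, root_b = find(target), find(link_name)
        if root_a != root_b:
            parent[root_b] = root_a  # chains of Links end up in one class

    classes = {}
    for name in list(parent):
        classes.setdefault(find(name), set()).add(name)
    return list(classes.values())


def zone_tab_violations(classes, zone_tab_ids):
    """Return classes (as sorted lists) without exactly one zone.tab member."""
    bad = []
    for cls in classes:
        if any(name.startswith("Etc/") for name in cls):
            continue  # Etc/... is exempt
        if len(cls & zone_tab_ids) != 1:
            bad.append(sorted(cls))
    return sorted(bad)


# Toy data only.
zones = ["America/Denver", "Europe/London", "Europe/Belfast", "Etc/GMT"]
links = [("America/Denver", "America/Shiprock")]
classes = equivalence_classes(zones, links)
```

A class with no zone.tab member and a class with two zone.tab members are
both reported, since either one breaks the "exactly one" expectation.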


3. Definitional Links.

For the purpose of something like CLDR, it is important to separate out the
*definitional* equivalents from the *incidental* equivalents (equivalencies
that happen to be true for now, but could change in the future). You don't
want to include two TZIDs in the same definitional equivalence class if they
are ever different, or could be in the future, because then comparisons
between TZIDs (as equivalent) could be true now, but fail in the future.
After looking at the equivalence classes established by Link, it turns out
that there are a few anomalies.

Now, Theory says:

"       If all the clocks in a country's region have agreed since 1970,
               don't bother to include more than one location
               even if subregions' clocks disagreed before 1970.
               Otherwise these tables would become annoyingly large.
"

This makes a great deal of sense. After all, if we went back to when daylight
saving time started, every location on earth (that didn't share a longitude
with another location) would be a separate TZID.

However, there are a small number of anomalous cases. List A below contains
items that should be Linked, since they are, and always will be, equivalent
(as far as TZ calculations go). List B contains cases that have been the
same since 1970, but are not Linked, so they appear to violate the condition
above in Theory. Conversely, List C contains cases that are Linked but
clearly reference different locations, which therefore had different offsets
(sun time) before timezones were introduced. So if the same criteria were
applied as in List B, they would be unlinked.
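
The "same since 1970" test used for List B can be sketched as a comparison
of two offset histories clipped at a cutoff. This is only an illustration:
the (start, offset) representation, the function names, and the sample
histories are invented, and abbreviations and DST flags are ignored.

```python
def clipped(transitions, cutoff):
    """Keep only the part of an offset history at or after cutoff.

    transitions -- sorted list of (start, utc_offset_seconds), where start
                   is seconds since 1970-01-01 UTC and the first entry
                   starts at float("-inf")
    """
    result = []
    for i, (start, offset) in enumerate(transitions):
        end = transitions[i + 1][0] if i + 1 < len(transitions) else float("inf")
        if end <= cutoff:
            continue  # this segment ended before the cutoff
        if result and result[-1][1] == offset:
            continue  # same offset as the previous segment: merge
        result.append((max(start, cutoff), offset))
    return result


def same_since(history_a, history_b, cutoff=0):
    """True if both histories agree from cutoff (default 1970) onward."""
    return clipped(history_a, cutoff) == clipped(history_b, cutoff)


# Invented histories: A and B differ only before 1970; C changes afterwards.
history_a = [(float("-inf"), -1000), (-500_000_000, 0)]
history_b = [(float("-inf"), -700), (-300_000_000, 0)]
history_c = [(float("-inf"), 0), (100, 3600)]
```

Here same_since(history_a, history_b) holds while same_since(history_a,
history_c) does not. Note that agreement under this test is exactly the
kind of incidental equivalence that could change in the future.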

So we request that the items in List A be linked, and each of the pairs in
List B and C be treated consistently:

Option 1. Leave (or make) the pair Linked, and pick one item in each pair,
and document that it is obsolete, and will never be unlinked from the other.

Option 2. Leave (or make) the pair Unlinked. If it was previously Linked,
then according to #2 above, one of the pair would be added to zone.tab.

And add to Theory, under "rules used for choosing location names":

        As of version X, whenever two location names have been linked in the
        past, for stability they will remain linked forever.

=============

List A. TZIDs that are not linked, but are the same

            001      Etc/GMT
            001      Etc/UTC
            001      Etc/UCT -- not linked, identical

List B. TZIDs that are not linked, are different locations, but are the same
since 1970

            AQ      Antarctica/Mawson
            AQ      Antarctica/Vostok -- same since 1957

            FM      Pacific/Truk
            FM      Pacific/Yap -- same since 1970

            GB      Europe/Belfast
            GB      Europe/London -- same since 1968

            ML      Africa/Bamako
            ML      Africa/Timbuktu -- same since 1960

List C. TZIDs that are linked, but refer to different locations.
(This was derived by inspection; if there are other cases of IDs that refer
to different locations, please let us know.)

            AQ      Antarctica/McMurdo
            AQ      Antarctica/South_Pole -- linked, different places

            SJ      Arctic/Longyearbyen
            SJ      Atlantic/Jan_Mayen -- linked, different places

            US      America/Denver
            US      America/Shiprock -- linked, different places

            AR      America/Argentina/Cordoba
            AR      America/Rosario -- linked, different places

            AU      Australia/Sydney
            AU      Australia/Canberra -- linked, different places

            BR      America/Rio_Branco
            BR      America/Porto_Acre -- linked, different places
            BR      Brazil/Acre ?

            IL      Asia/Jerusalem
            IL      Asia/Tel_Aviv -- linked, different places

            MD      Europe/Chisinau
            MD      Europe/Tiraspol -- linked, different places

            MX      America/Tijuana
            MX      America/Ensenada -- linked, different places

            US      America/Indianapolis
            US      America/Fort_Wayne -- linked, different places










