FW: Request from CLDR committee
Olson, Arthur David (NIH/NCI)
olsona at dc37a.nci.nih.gov
Tue Aug 9 18:48:48 UTC 2005
Mark Davis is not on the time zone mailing list (at least not at the address
below); direct replies appropriately.
----- Original Message -----
From: "Mark Davis" <mark.davis at icu-project.org>
To: "Tz (tz at elsie.nci.nih.gov)" <tz at lecserver.nci.nih.gov>
Cc: "Olson, Arthur David (NIH/NCI)" <olsona at dc37a.nci.nih.gov>
Sent: Tuesday, August 09, 2005 07:28
Subject: Request from CLDR committee
The CLDR technical committee has been looking at issues that have come up in
connection with timezones, and has the following requests of the TZ group.
As discussed previously on this list, the CLDR project supplies
localizations for timezone identifiers, based on the TZ IDs from the TZ
database. The localizations can either be explicit strings, or if
unavailable, use the translated country name (where a country has only a
single timezone) or fall back to the last field of the TZID and the
translated country name. Thus in German:
America/Havana => "Kuba"
Europe/Moscow => "Moskau (Russische Föderation)"
America/Los_Angeles => "Los Angeles (Vereinigte Staaten)"
The process is somewhat more complicated than this description, but this
provides the gist. For other examples, see
http://unicode.org/cldr/data/common/test/ with different locales.
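As a rough illustration of the fallback order described above, here is a minimal Python sketch. It is not the CLDR implementation (which, as noted, is more complicated); the function name display_name and the small lookup tables are hypothetical and exist only for this example.

```python
# Minimal sketch of the fallback order described above (not the CLDR code).
# All tables below are hypothetical sample data for illustration only.

def display_name(tzid, explicit, country_of, country_names, zone_count):
    """Return a localized label for tzid using the three-step fallback."""
    # 1. An explicit localized string wins if one is available.
    if tzid in explicit:
        return explicit[tzid]
    country = country_of[tzid]
    country_name = country_names[country]
    # 2. A country with a single timezone is labeled by the country name.
    if zone_count[country] == 1:
        return country_name
    # 3. Otherwise use the last field of the TZID plus the country name.
    last_field = tzid.rsplit("/", 1)[-1].replace("_", " ")
    return f"{last_field} ({country_name})"


# Sample data (German country names, as in the examples above).
EXPLICIT = {}
COUNTRY_OF = {"America/Havana": "CU", "America/Los_Angeles": "US"}
COUNTRY_NAMES = {"CU": "Kuba", "US": "Vereinigte Staaten"}
ZONE_COUNT = {"CU": 1, "US": 2}  # any count above 1 triggers step 3

if __name__ == "__main__":
    for tzid in COUNTRY_OF:
        print(tzid, "=>", display_name(
            tzid, EXPLICIT, COUNTRY_OF, COUNTRY_NAMES, ZONE_COUNT))
```

The sketch does not translate the last field itself, so it reproduces the Havana and Los Angeles examples but not "Moskau", which needs a translated city name.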
(BTW, a corrigendum was issued for the GMT problem raised earlier; see
1. Missing Country Codes.
There are two missing ISO country codes. While these are uninhabited rocks,
they should be added according to Theory, which says:
" Include at least one location per time zone rule set per country.
One such location is enough. Use ISO 3166 (see the file
iso3166.tab) to help decide whether something is a country.
There are also good, practical reasons to do this; an implementation that
maps from country codes to sets of zones needs to have some value for all
ISO country codes.
The two ISO codes are:
HM 53 06 S, 72 31 E Heard Island and McDonald Islands
BV 54 26 S, 3 24 E Bouvet Island
(locations are from
To zone.tab, add
HM -5306+7231 Pacific/Heard
BV -5426+0324 Atlantic/Bouvet
To antarctica, add
Zone Atlantic/Bouvet 0:00 - GMT
Zone Pacific/Heard 5:00 - GMT
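To show why an implementation cares about this, here is a small Python sketch of a country-code-to-zones map built from zone.tab-style lines, with a check for ISO codes that have no zone at all. The function names are hypothetical, the sample lines are only a tiny illustrative fragment, and fields are split on any whitespace as a simplifying assumption.

```python
# Sketch: map ISO country codes to TZIDs and report codes with no zone.
# Function names and the sample fragment are assumptions for illustration.

def zones_by_country(zone_tab_text):
    """Parse 'code coordinates TZID [comments]' lines into {code: [TZID, ...]}."""
    zones = {}
    for line in zone_tab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # not a data line
        code, _coordinates, tzid = fields[:3]
        zones.setdefault(code, []).append(tzid)
    return zones


def countries_without_zones(zone_tab_text, iso_codes):
    """Return the ISO codes that have no entry in the zone table, sorted."""
    zones = zones_by_country(zone_tab_text)
    return sorted(code for code in iso_codes if code not in zones)


SAMPLE = """# illustrative fragment only
CU +2308-08222 America/Havana
NO +5955+01045 Europe/Oslo
"""
# The two lines proposed above, copied as written.
PROPOSED = "HM -5306+7231 Pacific/Heard\nBV -5426+0324 Atlantic/Bouvet\n"
ISO_CODES = {"CU", "NO", "HM", "BV"}

if __name__ == "__main__":
    print(countries_without_zones(SAMPLE, ISO_CODES))
    print(countries_without_zones(SAMPLE + PROPOSED, ISO_CODES))
```

With only the sample fragment, HM and BV come back as uncovered; with the two proposed lines appended, nothing is missing.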
2. Enabling Canonical IDs
The Link commands in the database establish equivalence classes between
TZIDs (aka "location names"). For an implementation like CLDR, it is
important that there be a completely stable canonical TZID that represents
any of those equivalents. Based on feedback from this list, we chose it to be:
a) the TZID in zone.tab as of 2004a, or
b) any new TZID in a later version of zone.tab that is not equivalent to a
TZID introduced in an earlier version.
That is, we use America/Buenos_Aires since it was in 2004a, and
America/Argentina/Tucuman since it was introduced later.
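The selection rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual CLDR tooling: canonical_ids is a hypothetical name, zone.tab versions are passed as plain sets ordered oldest first (2004a first), Link lines are passed as pairs, and the Link shown between the two Buenos Aires names is assumed for the sake of the example.

```python
# Sketch of the canonical-ID rule: first TZID seen per equivalence class wins.
# Names and sample data are assumptions for illustration.

def canonical_ids(versions, links):
    """versions: list of sets of zone.tab TZIDs, oldest (2004a) first.
    links: iterable of (a, b) pairs made equivalent by Link lines."""
    parent = {}

    def find(name):
        # Union-find lookup with path halving.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in links:
        parent[find(a)] = find(b)

    canonical = {}  # equivalence-class root -> canonical TZID
    for tzids in versions:
        for tzid in sorted(tzids):
            # Keep a TZID only if no equivalent TZID appeared earlier.
            # Assumes each class has at most one member per zone.tab version.
            canonical.setdefault(find(tzid), tzid)
    return set(canonical.values())


VERSIONS = [
    {"America/Buenos_Aires"},                       # zone.tab as of 2004a
    {"America/Argentina/Buenos_Aires",              # a later zone.tab
     "America/Argentina/Tucuman"},
]
LINKS = [("America/Argentina/Buenos_Aires", "America/Buenos_Aires")]

if __name__ == "__main__":
    print(sorted(canonical_ids(VERSIONS, LINKS)))
```

Here America/Buenos_Aires stays canonical because it was present in 2004a, while America/Argentina/Tucuman becomes canonical because nothing equivalent preceded it.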
One dependency we have is that the last field in the canonical ID be unique.
That is, we can't have both a Europe/London and an America/London. Now, if
the TZ database ever added a TZID that was not unique in this sense, we
could add our own canonical ID outside of the TZ database, but that is
clearly not our preference, not at all. This appears to be the practice in
the TZ database, as evidenced by the following in southamerica:
# Bahia (BA)
# There are too many Salvadors elsewhere, so use America/Bahia instead
# of America/Salvador.
We also depend on the feature that every equivalence class (except Etc/...)
has exactly one member in zone.tab.
To avoid having to hack around problems in the future, we would like this to
be captured in Theory as requirements for the construction of future IDs.
This is in no way a functional restriction, just a restriction on the choice
of names.
Thus, we propose the addition of something like the following to the "rules
used for choosing location names" in Theory.
All locations (the final field in a location name) must be unique.
Thus one cannot have Europe/London and either America/London or
Two location names that appear in zone.tab cannot be Linked
together, either directly or through a chain of Links. Conversely, every
location (except for those starting with "Etc") must be Linked to a location
name in zone.tab.
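A checker for the proposed rules could look roughly like the Python sketch below. It is only an illustration: check_names is a hypothetical name, the inputs are plain sets and pairs rather than parsed database files, and the uniqueness rule is applied to the zone.tab names only (an assumption, since older Linked aliases can share a last field with their newer names).

```python
# Sketch: check the proposed naming rules over a set of location names.
# check_names and its inputs are assumptions for illustration.
from collections import defaultdict


def check_names(all_names, zone_tab, links):
    """Return a list of human-readable rule violations (empty if none)."""
    problems = []

    # Rule 1: the final field of each zone.tab name must be unique.
    by_last = defaultdict(list)
    for name in sorted(zone_tab):
        by_last[name.rsplit("/", 1)[-1]].append(name)
    for last, names in sorted(by_last.items()):
        if len(names) > 1:
            problems.append(f"duplicate location {last}: {', '.join(names)}")

    # Build equivalence classes from the Link pairs (union-find).
    names = set(all_names) | set(zone_tab)
    for a, b in links:
        names.update((a, b))
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in links:
        parent[find(a)] = find(b)
    classes = defaultdict(list)
    for name in sorted(names):
        classes[find(name)].append(name)

    # Rule 2: exactly one zone.tab name per class, except for Etc/ names.
    for members in classes.values():
        listed = [n for n in members if n in zone_tab]
        if len(listed) > 1:
            problems.append(f"linked zone.tab names: {', '.join(listed)}")
        elif not listed:
            for n in members:
                if not n.startswith("Etc/"):
                    problems.append(f"no zone.tab name for {n}")
    return problems


if __name__ == "__main__":
    names = {"Europe/London", "America/Bahia", "GB", "Etc/UCT"}
    print(check_names(names, {"Europe/London", "America/Bahia"},
                      [("GB", "Europe/London")]))
```

Returning a list of messages rather than raising on the first failure makes it easy to run the check over a whole release and see every violation at once.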
3. Definitional Links.
For the purpose of something like CLDR, it is important to separate out the
*definitional* equivalents from the *incidental* equivalents (equivalencies
that happen to be true for now, but could change in the future). You don't
want to include two TZIDs in the same definitional equivalence class if they
are ever different, or could be in the future, because then comparisons
between TZIDs (as equivalent) could be true now, but fail in the future.
After looking at the equivalence classes established by Link, it turns out
that there are a few anomalies.
Now, Theory says:
" If all the clocks in a country's region have agreed since 1970,
don't bother to include more than one location
even if subregions' clocks disagreed before 1970.
Otherwise these tables would become annoyingly large.
This makes a great deal of sense. After all, if we go back to when daylight
savings started, then every location on earth (that didn't share a longitude
with another location) would be a separate TZID.
However, there are a small number of anomalous cases. List A below contains
items that should be Linked, since they are always and will always be
equivalent (as far as TZ calculations go). List B contains cases that have
been the same since 1970, but are not Linked. So they appear to violate the
condition above in Theory. Conversely, List C contains cases that clearly
reference different locations, and thus before timezones were added, they
had different offsets (sun time). So if the same criteria are applied as in
List B, they would be unlinked.
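The "same since 1970" test used for List B can be sketched in Python as below. This is a simplified illustration, not how the lists were actually derived: each zone is modeled as a hypothetical sorted list of (start, UTC offset) pairs in seconds, with the first offset also applying before its start, and abbreviations and DST flags are ignored.

```python
# Sketch: do two zones agree on UTC offset from a cutoff onward?
# The transition-list model and the sample data are assumptions.
from bisect import bisect_right


def offset_at(transitions, t):
    """transitions: list of (start, offset) sorted by start.
    Returns the offset in force at time t."""
    starts = [start for start, _ in transitions]
    index = bisect_right(starts, t) - 1
    return transitions[max(index, 0)][1]


def same_since(a, b, cutoff):
    """True if zones a and b have equal offsets at every time >= cutoff."""
    # Offsets are piecewise constant, so it is enough to compare at the
    # cutoff and at every transition of either zone after the cutoff.
    points = {cutoff}
    for start, _ in a + b:
        if start > cutoff:
            points.add(start)
    return all(offset_at(a, t) == offset_at(b, t) for t in points)


# Made-up data: times are POSIX seconds, so 0 is the start of 1970.
ZONE_A = [(-10**9, 0), (-5 * 10**8, 3600)]
ZONE_B = [(-10**9, 1800), (-2 * 10**8, 3600)]

if __name__ == "__main__":
    print(same_since(ZONE_A, ZONE_B, 0))            # agree from 1970 on
    print(same_since(ZONE_A, ZONE_B, -3 * 10**8))   # differ before that
```

Moving the cutoff earlier than the last disagreement flips the answer, which is the distinction between List B (same since some year) and zones that are identical throughout.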
So we request that the items in List A be linked, and each of the pairs in
List B and C be treated consistently:
Option 1. Leave (or make) the pair Linked, and pick one item in each pair,
and document that it is obsolete, and will never be unlinked from the other.
Option 2. Leave (or make) the pair Unlinked. If it was previously Linked,
then, according to #2 above, one of the pair would be added to zone.tab.
Also add to Theory, under "rules used for choosing location names":
As of version X, whenever two location names have been linked in the
past, for stability they will remain linked forever.
List A. TZIDs that are not linked, but are the same
001 Etc/UCT -- not linked, identical
List B. TZIDs that are not linked, are different locations, but are the same
AQ Antarctica/Vostok -- same since 57
FM Pacific/Yap -- same since 70
GB Europe/London -- same since 68
ML Africa/Timbuktu -- same since 60
List C. TZIDs that are linked, but refer to different locations.
(This was derived by inspection; if there are other cases of IDs that refer
to different locations, please let us know.)
AQ Antarctica/South_Pole -- linked, different places
SJ Atlantic/Jan_Mayen -- linked, different places
US America/Shiprock -- linked, different places
AR America/Rosario -- linked, different places
AU Australia/Canberra -- linked, different places
BR America/Porto_Acre -- linked, different places
BR Brazil/Acre ?
IL Asia/Tel_Aviv -- linked, different places
MD Europe/Tiraspol -- linked, different places
MX America/Ensenada -- linked, different places
US America/Fort_Wayne -- linked, different places