Time Zone Localizations

Mark Davis mark.davis at jtcsv.com
Fri Jun 11 18:49:03 UTC 2004


Comments interleaved below.

Mark
__________________________________
http://www.macchiato.com
► शिष्यादिच्छेत्पराजयम् ◄

----- Original Message ----- 
From: "Chuck Soper" <chucks at lmi.net>
To: <tz at lecserver.nci.nih.gov>
Sent: Fri, 2004 Jun 11 00:21
Subject: Re: Time Zone Localizations


> I think that the tables listed at this link could
> be revised to improve clarity:
>
> http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/formatting/zone_log.html
>
> Are these tables informational or vital to the
> process? For example, will the Aliases table be
> used to "Canonicalize the Olson ID" (step 1 of
> the Fallback procedure)? If so, then the table
> should be based on the current time zone data
> files instead of "current Java data". This could
> turn into a serious maintenance issue. Ideally, I
> think that a script could be run on the time zone
> data files to generate a new aliases table each
> time the time zone data files are updated
> (currently tzdata2004a).

The tables are informational, just to provide a different view of the data and
provide background for the issues involved.

The end goal is to work off of the current time zone data, as provided on
ftp://elsie.nci.nih.gov/pub/. It may be necessary to supplement that data, e.g.
with a list of 'outmoded codes' like WET, CET, MET, EET, Asia/Riyadh87,
Asia/Riyadh88, Asia/Riyadh89, or with IDs for missing country codes, but we would
much rather that all the data came from ftp://elsie.nci.nih.gov/pub/.
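
For what it's worth, the script Chuck suggests could be quite small. The
following is only a sketch, assuming the data files keep their
"Link TARGET ALIAS" lines. The names parse_links and canonicalize and the
sample text are made up for illustration; they are not part of ICU, CLDR, or
the tz distribution.

```python
def parse_links(text):
    """Build {alias: target} from tz source text that contains Link lines."""
    aliases = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        # A link line has the form: Link TARGET ALIAS
        if len(fields) >= 3 and fields[0] == "Link":
            aliases[fields[2]] = fields[1]
    return aliases


def canonicalize(zone_id, aliases):
    """Follow alias links until reaching an ID that is not itself an alias."""
    seen = set()
    while zone_id in aliases and zone_id not in seen:
        seen.add(zone_id)  # guard against accidental cycles
        zone_id = aliases[zone_id]
    return zone_id


# Illustrative sample only: two equivalences mentioned in this thread,
# written in Link format. Real input would be the tz data files themselves.
SAMPLE = """\
# sample lines, not a real file
Link\tAmerica/Denver\tAmerica/Shiprock
Link\tAntarctica/McMurdo\tAntarctica/South_Pole
"""

ALIASES = parse_links(SAMPLE)
print(canonicalize("America/Shiprock", ALIASES))
print(canonicalize("Europe/Belgrade", ALIASES))
```

Rerunning something like this on each new tzdata release would keep the aliases
view in step with the data, which is the maintenance concern raised above.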

>
> ### Aliases
> The Aliases table appears to contain time zone
> names from several different tzdata2004a files:
> backward, etcetera, and systemv. If you use
> several different tables such as backward
> aliases, etcetera aliases, and systemv aliases
> then I expect that they would be easier to
> maintain as the time zone database is updated.

Good. Part of our supplementary data could be which whole tables to exclude.

>
> I suggest avoiding terms like 'Bogus' and 'Real'
> because they're not very descriptive. Here are
> some possible new terms:
>    'Bogus' to 'Obsolete Olson ID from tzdata2004a/backward'
>    'Real' to 'Valid Olson ID as of tzdata2004a'
>
> The 'Bogus' column in the Aliases table contains
> both obsolete and valid time zone names.
> Antarctica/South_Pole and America/Shiprock are
> valid; they are both listed in the
> tzdata2004a/zone.tab file. These files correspond
> to (are equal to) Antarctica/McMurdo and
> America/Denver respectively but that doesn't
> necessarily mean that they are invalid or bogus.

You're right: Bogus is an overly familiar term.

However, to reduce the translation requirements and make the data more
manageable, we do want to set up some uniqueness criteria. If two IDs have
exactly the same behavior since the time when time zones were adopted, and have
always been in the same country over that period, we only want one of them to be
in the main list. The other can be an alternate -- and still work -- but we would
recommend an extremely low priority on translation.
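
To make the uniqueness criterion concrete, here is a rough sketch. It assumes
each ID comes with its country code and some hashable representation of its
full history; the strings "history-A" and "history-B" are placeholders, and
picking the alphabetically first ID as the main one is an arbitrary choice made
only for this example.

```python
from collections import defaultdict


def split_main_and_alternates(zones):
    """zones maps zone ID -> (country code, history), both hashable.

    IDs sharing the same (country, history) pair form one group. The
    alphabetically first ID becomes the main entry (an arbitrary tie-break
    chosen for this sketch); the rest become alternates pointing at it.
    """
    groups = defaultdict(list)
    for zone_id in sorted(zones):
        groups[zones[zone_id]].append(zone_id)

    main, alternates = [], {}
    for ids in groups.values():
        main.append(ids[0])
        for other in ids[1:]:
            alternates[other] = ids[0]
    return sorted(main), alternates


# Placeholder histories: the strings stand in for full transition lists.
ZONES = {
    "America/Denver": ("US", "history-A"),
    "America/Shiprock": ("US", "history-A"),
    "America/Los_Angeles": ("US", "history-B"),
}

MAIN, ALTERNATES = split_main_and_alternates(ZONES)
print(MAIN)
print(ALTERNATES)
```

The alternates still resolve to a working zone; they would simply be the ones
given low translation priority.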

>
> Where did AET (line 2, Aliases table) for
> Australia/Sydney come from? I can't find any
> reference to it in tzdata2004a.

There are some aliases that come from Java. That is noted in the document, but
probably not clearly enough.

>
> ### Country to Zones
> The 'Country to Zones' looks good, yet it
> contains at least one obsolete country code, YU
> for Yugoslavia. I realize that this is a known
> issue. You can use the tzdata2004a/zone.tab file
> to generate an updated 'Country to Zones' table.
>
> For obsolete ISO 3166 country codes such as YU I
> think that ISO 3166-3 could be referenced. ISO
> 3166-3 represents codes for formerly used names
> of countries:
> http://www.niso.org/standards/resources/3166.html
>
> http://www.iso.org/iso/en/prods-services/iso3166ma/04background-on-iso-3166/iso3166-3.html
> Each ISO 3166-3 entry has several fields
> including a four letter code where the first two
> letters are the formerly used code and the last
> two letters are the code that replaced it. The
> four letter code for Yugoslavia is 'YUCS'.
>
> The ISO web site says that ISO 3166-3 was first
> published in 1998 (or maybe 1999), but I cannot
> find the original document. Yes, it would be
> ironic if the standard for formerly used names is
> formerly used.

There is a separate discussion on the email list on the instability of ISO
codes. Unfortunately, the ISO committee makes no stability guarantees about
3-letter codes either.
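
As an aside, the four-letter structure Chuck describes for ISO 3166-3 is easy
to take apart. This is a minimal sketch based only on that description (first
two letters the former code, last two the replacement); split_former_code is a
made-up name, and entries that have no single successor would need special
handling that is not attempted here.

```python
def split_former_code(code4):
    """Split a four-letter ISO 3166-3 style code into (former, replacement)."""
    well_formed = (
        len(code4) == 4
        and code4.isascii()
        and code4.isalpha()
        and code4.isupper()
    )
    if not well_formed:
        raise ValueError(f"expected four uppercase letters, got {code4!r}")
    return code4[:2], code4[2:]


print(split_former_code("YUCS"))
```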

>
> Should CLDR (or the time zone database) maintain
> a list of formerly used country codes? This would
> be similar to the backward file to maintain
> obsolete time zone names.

We would probably use the same mechanism as RFC 3066bis: once a country code
is introduced by ISO, we never retract it. If ISO later introduces a different
meaning for that code, we don't follow them, and instead use the UN code.
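
A sketch of how that policy might look in code, under my reading that the UN
code is used for the new meaning while the old code keeps its old meaning. The
class StableCodes and its methods are hypothetical, and the numeric UN codes in
the example are included for illustration and should be checked before anyone
relies on them.

```python
class StableCodes:
    """Append-only code table: an assigned meaning is never changed or removed."""

    def __init__(self):
        self._meaning = {}

    def assign(self, iso_code, meaning, un_code):
        """Record a meaning and return the code that will be used for it.

        If iso_code is unused, or already carries this meaning, use it.
        If it already carries a different meaning, leave that alone and
        register the new meaning under un_code instead.
        """
        current = self._meaning.get(iso_code)
        if current is None or current == meaning:
            self._meaning[iso_code] = meaning
            return iso_code
        self._meaning.setdefault(un_code, meaning)
        return un_code

    def meaning(self, code):
        return self._meaning[code]


codes = StableCodes()
# Illustrative values: verify the numeric UN codes before relying on them.
first = codes.assign("CS", "Czechoslovakia", "200")
second = codes.assign("CS", "Serbia and Montenegro", "891")
print(first, second, codes.meaning("CS"))
```

Stored data that used the original code keeps its meaning, which is the whole
point of the stability policy.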

>
> ### Countries that are missing Zones
> Could this table be renamed to 'Country Codes that do not have Zones'?
> Of course, Yugoslavia shouldn't be in the table.

Yes, and you are correct on Yugoslavia. (Apparently the Java implementation
filters that out.)

>
> ### Zones that are missing Countries
> Could this table be renamed to "Zones that do not map to specific Countries"?

Yes
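
Both tables could in principle be regenerated from zone.tab, as suggested
earlier. A minimal sketch, assuming the tab-separated layout (country code,
coordinates, zone ID, optional comment) and using the ZZ pseudo-country that
the proposal suggests for zones with no country. The function names are
invented for this example, and the sample rows, including their coordinates,
are illustrative rather than copied from tzdata2004a.

```python
from collections import defaultdict

NO_COUNTRY = "ZZ"  # private-use code proposed for "no country"


def parse_zone_tab(text):
    """Return {country code: [zone IDs]} from zone.tab-style text."""
    country_to_zones = defaultdict(list)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blanks and comment lines
        fields = line.split("\t")
        if len(fields) >= 3:
            country, _coordinates, zone_id = fields[:3]
            country_to_zones[country].append(zone_id)
    return dict(country_to_zones)


def country_for_zone(zone_id, country_to_zones):
    """Look up a zone's country, falling back to the pseudo-country ZZ."""
    for country, zone_ids in country_to_zones.items():
        if zone_id in zone_ids:
            return country
    return NO_COUNTRY


SAMPLE = (
    "# country\tcoordinates\tTZ\tcomments\n"
    "CS\t+4450+02030\tEurope/Belgrade\n"
    "US\t+394421-1045903\tAmerica/Denver\tMountain Time\n"
    "US\t+364708-1084111\tAmerica/Shiprock\tMountain Time - Navajo\n"
)
TABLE = parse_zone_tab(SAMPLE)
print(country_for_zone("Europe/Belgrade", TABLE))
print(country_for_zone("Etc/GMT+5", TABLE))
```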

>
> Europe/Belgrade is in this list, yet it's listed
> in the tzdata2004a/zone.tab file with a country
> code of CS. I suppose this is a known issue
> related to the YU country code for Yugoslavia.

Yes

>
> I don't think that Asia/Riyadh87, Asia/Riyadh88,
> and Asia/Riyadh89 belong in this list. They
> correspond to the tzdata2004a files: solar87,
> solar88, and solar89. Should they go in the alias
> table(s)?

They really sound like items we should just ignore for the purposes of this
document, since they are not really useful.

>
> ### Windows IDs
> This table seems to imply that there is a
> one-to-one relationship between Olson IDs and
> Windows IDs. Since you only have 75 Windows IDs
> listed, I suspect that one Windows ID may map to
> one or more Olson IDs.

There should be more explanation. These are, to the best of our knowledge, the
appropriate mappings to use for Windows IDs, but the table is here only for
information. Windows doesn't try to do historic time zones, nor does it cover
all of the modern time zones completely.

>
> ### Equivalent Modern Zones
> My first reaction whenever I see a significant
> effort to simplify something for the user is to
> think that it could lead to problems. How did you
> determine that "A lot of people just don't care
> about historic differences"? Do most people use
> time zones merely to keep track of the current
> time or current differences between another time
> zone? How else do users use time zones?

Many (I would dare say the vast majority) of end users just don't care now that
there was once a difference between Dawson, Whitehorse and Los Angeles. When
they pick a timezone in some preferences dialog (on their machine, in a website
preferences page, etc) they just want to see one choice for that zone, not three
different ones that they have to think about. The UI might have an advanced
button (as the text discusses) for someone who does really care, but that will
be a very small proportion of users.
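
Roughly, the preferences dialog could work like the sketch below. It assumes
each zone has some comparable signature of its present-day rules; the
signatures here are placeholders, the function name is invented, and the order
of the dictionary is taken as the order of preference for which ID represents
a group.

```python
def picker_choices(zones, advanced=False):
    """Return the zone IDs to show in a picker.

    zones maps zone ID -> signature of its present-day rules, listed in
    priority order. In the default view only the first ID per signature is
    shown; the advanced view shows every ID.
    """
    if advanced:
        return list(zones)
    choices, seen = [], set()
    for zone_id, modern_rules in zones.items():
        if modern_rules not in seen:
            seen.add(modern_rules)
            choices.append(zone_id)
    return choices


# Placeholder signatures; per the discussion above the first three zones
# differ only historically, so they share one present-day signature here.
ZONES = {
    "America/Los_Angeles": "modern-rules-1",
    "America/Whitehorse": "modern-rules-1",
    "America/Dawson": "modern-rules-1",
    "America/Denver": "modern-rules-2",
}

print(picker_choices(ZONES))
print(picker_choices(ZONES, advanced=True))
```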

>
> That's all for now. I hope that my comments are useful.
> Chuck

Absolutely!

Actually, another question. We have traditionally referred to the timezone IDs
in ftp://elsie.nci.nih.gov/pub/ as "Olson IDs". What is the best way to refer to
them?

>
>
> At 6:40 PM -0700 6/10/04, Mark Davis wrote:
> >Thanks for your feedback.
> >
> >>  Bouvet Island - an uninhabited volcanic island, almost entirely
> >...
> >>  Etc/GMT{[+-]N} are just for fixed GMT offsets; they don't correspond to
> >>  countries.
> >
> >Yes, we realize that Bouvet Island and Heard Island and McDonald Islands are
> >completely obscure places; it is more for a
> >matter of API/testing completeness.
> >Understood that Etc/GMT... don't correspond to
> >countries. But in an API and for
> >translation, it is useful to have everything attached to a country, even if it
> >is a pseudo-country. That's why the suggestion
> >in the document is to use ZZ for
> >them, which is a private-use ISO country code, which can be translated as "no
> >country".
> >
> >As to Yugoslavia, that is a real mess, because the ISO committee just doesn't
> >care about stability of identifiers. You can have a database set up with
> >someone's country of birth stored as CS. All of a sudden by some whim of ISO,
> >that data is invalidated. More on that at
> >http://www.unicode.org/consortium/utc-positions.html#2stability.
> >
> >>  Asia/Riyadh{87,88,89}: Saudi Arabia, SA - those are historical, from
> >>  an era when Saudi Arabia used solar time, and apply only to Riyadh
> >>  (and, if you're really fussy, to a particular location in Riyadh, I
> >>  guess), so they're not appropriate for Saudi Arabia as a whole.  I
> >>  don't know what names you'd give them.
> >>  ...
> >>  WET, CET, MET, and EET "are for backward compatibility with older
> >>  versions"; various Europe/XXX rules should presumably be used instead -
> >>  I guess you could pick cities for each of them.
> >
> >For these, I guess my recommendation would be to
> >not bother translating them at
> >all -- they are all compatibility orphans, one wouldn't encourage their use.
> >
> >Mark
> >__________________________________
> >http://www.macchiato.com
> >► शिष्यादिच्छेत्पराजयम् ◄
> >
> >----- Original Message -----
> >From: "Guy Harris" <guy at alum.mit.edu>
> >To: "Mark Davis" <mark.davis at jtcsv.com>
> >Cc: <tz at lecserver.nci.nih.gov>
> >Sent: Thu, 2004 Jun 10 18:11
> >Subject: Re: Time Zone Localizations
> >
> >
> >>
> >>  On Jun 10, 2004, at 12:04 PM, Mark Davis wrote:
> >>
> >>  > I'd very much appreciate any feedback on the proposal.
> >>
> >>  Some of the countries listed as missing zones are:
> >>
> >>  Bouvet Island - an uninhabited volcanic island, almost entirely
> >>  covered by glaciers, controlled by Norway, and designated as a nature
> >>  reserve, according to
> >>
> >>  http://www.cia.gov/cia/publications/factbook/geos/bv.html
> >>
> >>  I don't know if the automated meteorological station on the island
> >>  cares about time zones or not.
> >>
> >>  Heard Island and McDonald Islands - uninhabited, barren, sub-Antarctic
> >>  islands now controlled by Australia, designated as a nature preserve,
> >>  according to
> >>
> >>  http://www.cia.gov/cia/publications/factbook/geos/hm.html
> >>
> >>  They don't even mention any automated meteorological stations, just
> >>  seals and birds.
> >>
> >>  Yugoslavia - it's now Serbia and Montenegro.  Europe/Belgrade is the
> >>  correct zone for it.
> >>
> >>  Some of the time zones listed as missing countries are:
> >>
> >>  Europe/Belgrade: Serbia and Montenegro, which has the ISO 3166-1
> >>  Alpha-2 code CS, according to
> >>
> >>  http://www.iso.org/iso/en/prods-services/iso3166ma/01whats-new/2003-07-23_statement_cs.html
> >>
> >>  Asia/Riyadh{87,88,89}: Saudi Arabia, SA - those are historical, from
> >>  an era when Saudi Arabia used solar time, and apply only to Riyadh
> >>  (and, if you're really fussy, to a particular location in Riyadh, I
> >>  guess), so they're not appropriate for Saudi Arabia as a whole.  I
> >>  don't know what names you'd give them.
> >>
> >>  Etc/GMT{[+-]N} are just for fixed GMT offsets; they don't correspond to
> >>  countries.
> >>
> >>  WET, CET, MET, and EET "are for backward compatibility with older
> >>  versions"; various Europe/XXX rules should presumably be used instead -
> >>  I guess you could pick cities for each of them.
> >>
> >>
>
>
>



