[tz] Classifying IDs

Paul Eggert eggert at cs.ucla.edu
Fri Oct 8 07:33:34 UTC 2021

On 10/6/21 10:08 AM, Stephen Colebourne via tz wrote:

As far as the categorization of IDs goes, I think I'd categorize them 
somewhat differently:

* The names defined by the PRIMARY_YDATA files.

* The names defined by 'backward'. (Though this category now tends to 
blur into the previous one, as the use of 'backward' has become more 
popular.)

* The names defined by 'etcetera'.

* The names defined by 'backzone' but not by the other files.

This categorization uses maintainer lingo. But it determines what names 
end users see, so it's a valid categorization for end users too.
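
As a rough illustration of the file-based categorization above, here is a 
minimal Python sketch. It assumes only the standard tzdb source syntax, in 
which a "Zone NAME ..." line defines NAME and a "Link TARGET LINK-NAME" 
line defines LINK-NAME. The excerpts are simplified placeholders rather 
than real tzdb data, "Example/Backzone_Only" is a hypothetical name, and 
the helper names defined_names and classify are made up for this sketch.

```python
def defined_names(text):
    """Return the names that tzdb-format source text defines."""
    names = set()
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) >= 2 and fields[0] == "Zone":
            names.add(fields[1])   # Zone NAME STDOFF RULES FORMAT [UNTIL]
        elif len(fields) >= 3 and fields[0] == "Link":
            names.add(fields[2])   # Link TARGET LINK-NAME
    return names


def classify(sources):
    """Map each name to the first category (in dict order) that defines it."""
    category_of = {}
    for category, text in sources.items():
        for name in defined_names(text):
            category_of.setdefault(name, category)
    return category_of


# Simplified, illustrative excerpts. Only the keyword and name fields matter
# here; the remaining Zone fields are placeholders, not real tzdb data.
SOURCES = {
    "primary": "Zone Europe/Berlin 1:00 - CET\n",
    "backward": "Link Asia/Kathmandu Asia/Katmandu  # old spelling\n",
    "etcetera": "Zone Etc/UTC 0 - UTC\n",
    "backzone": (
        "Zone Europe/Berlin 1:00 - CET\n"
        "Zone Example/Backzone_Only 0 - UTC\n"  # hypothetical name
    ),
}

CATEGORIES = classify(SOURCES)
for name, category in sorted(CATEGORIES.items()):
    print(name, category)
```

Listing 'backzone' last means a name lands in that category only when no 
other file defines it, which matches the fourth bullet.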

There's another way to categorize names, which might be better in the 
long run than what we have now. We could categorize them as follows:

* The names currently defined by 'etcetera'.

* Names for each set of clocks that are planned to agree in the future. 
This is useful for applications like planning calendars, setting 
thermostats, etc.

* Names for each set of clocks that have agreed since 1970 and are 
planned to agree in the future. (This category includes the previous 
category.) Current tzdb Zones approximate this set (though we still have 
20-odd Zones too many).

* Backward-compatibility aliases for the above.

* Other names (outside the scope of tzdb, so 'backzone' stuff).

OK, getting back to your classification:

> Examples: Portugal, NZ-CHAT, Navajo, Libya.
> Proposal: Provide 3-6 months notice, then move obsolete IDs to a new
> file "obsolete" which downstream projects are strongly encouraged not
> to include. (I would argue that the time has come to properly remove
> these IDs, which are very inconsistent in terms of which are provided
> and which not, eg Portugal, but not Spain)

Inconsistent they definitely are. And it might make sense to remove old 
names that are rarely used, and that are so inconsistent that they cause 
more problems (via confusion) than they cure (by supporting old TZ 
settings). However, I would think we'd need more than a few months' notice.

> Deprecated, same location
> ------------------------------------
> IDs that have been deprecated with a single clear alternative ID being
> provided. Both IDs represent the same physical location/city.
> Spelling changes: Asia/Katmandu (replaced by Asia/Kathmandu),
> Asia/Rangoon (replaced by Asia/Yangon)
> ID structure changes: America/Louisville (replaced by
> America/Kentucky/Louisville)
> Proposal: Ensure all of these are in `backward`
> Consider: Is there any way to move these IDs to the obsolete file?
> Maybe after 5 years? Or do we just accept backwards compatibility
> restrictions on these?

I'd say that there's less of an argument for removing these names, as 
the confusion is surely less.

> Legally described mega-zones
> -----------------------------------------
> IDs for locations where a federal or supra-national body defines
> rules, eg the EU or US DOT.
> Examples: US/Mountain, CET, WET
> Consider: Can we write down a rule to identify when something like
> this should be included? Then move the matching IDs to the main files
> (eg. are the EU and US DOT the only two examples here?)

Although there are some other examples like this in the world, I think 
we're better off not pursuing this: it would duplicate existing 
functionality (thus causing confusion), it'd be a pain to nail down 
exactly what a "mega-zone" would be (or not be), and it'd give 
real-world politics more opportunity to strike.

> Examples: Europe/Berlin, America/New_York, Africa/Abidjan
> Proposal: Ensure all of these are in the main files.
> Consider: Should there be new IDs for each of these abstract regions
> to indicate they are a separate and distinct concept? eg.
> "Region/Berlin".

I'd rather avoid having yet more names for the same thing. We already 
have so many aliases that there's some confusion.

> IDs for locations that are not region IDs. Each ID will have the same
> wall clock since 1970 as one of the region IDs.
> Examples: Europe/Oslo, Europe/Amsterdam, Atlantic/Reykjavik
> Consider: Can we write down a rule that covers which IDs are included
> here?

Yes: a backward-compatibility rule. If we had the name in previous 
releases, we should keep the name. This rule is simple and clear, and 
helps avoid name proliferation.
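
The rule can be checked mechanically. The following is a minimal Python 
sketch, assuming the sets of names in the previous and current releases 
have already been collected by some other means. The function name 
dropped_names and the two sample sets are hypothetical and only 
illustrate the check.

```python
def dropped_names(previous, current):
    """Return names present in the previous release but missing now."""
    return sorted(set(previous) - set(current))


# Hypothetical name sets standing in for two consecutive releases.
PREVIOUS = {"Europe/Berlin", "Europe/Oslo", "Asia/Rangoon"}
CURRENT = {"Europe/Berlin", "Europe/Oslo", "Asia/Rangoon", "Asia/Yangon"}

missing = dropped_names(PREVIOUS, CURRENT)
if missing:
    print("backward-compatibility rule violated:", ", ".join(missing))
else:
    print("all previously published names are still present")
```

New names may be added freely under this check; only the removal of a 
previously published name is flagged.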

> these non-region location IDs are actively
> used in downstream business applications

That's fine, as these business applications should continue to work 
because the old IDs will continue to be maintained. If a new ISO country 
code is established but no ID is created (because there's no timekeeping 
need for one), the applications can continue to use the same IDs they 
were using before.
