[tz] Converting cities to tz identifiers (tangent)

Paul Ganssle pganssle at gmail.com
Tue Feb 20 22:07:13 UTC 2018

If what you *want* is "the time zone in Boston", you need to store something that means "the time zone in Boston" rather than eagerly doing the lookup. You'll have to look up the mapping between Boston <-> America/* every time, because the tzdb zones do *not* represent rules for a given geographical region; they represent historically coherent rulesets.

The main problem is that doing this is *way* more complicated than picking a single time zone from the enumerable list of tzdb zones and storing *that*. If you pick a zone and happen to be in the city it is named after, you'll get the right thing, but otherwise all bets are off. It's further complicated by the fact that even the geographical location <-> time zone ruleset mapping is upset by the existence of things like Asia/Urumqi, where whether you use the ruleset is a cultural rather than geographical question.
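
To make the "store the place, look up the zone late" idea concrete, here is a minimal sketch in Python, assuming Python 3.9+ with the standard zoneinfo module. The PLACE_TO_ZONE table, the StoredEvent class, and both function names are hypothetical, made up for illustration; a real place <-> zone mapping would have to come from some maintained geographic data source and be refreshed whenever tzdb changes. It also does nothing about the Asia/Urumqi style of problem, where the right zone is not a function of location alone.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Hypothetical lookup table (an assumption, not part of tzdb). It is the
# only thing that needs updating if tzdb later splits a zone.
PLACE_TO_ZONE = {"Boston, MA, US": "America/New_York"}


@dataclass(frozen=True)
class StoredEvent:
    place: str           # what we actually mean: "the time zone in Boston"
    wall_time: datetime  # naive local wall-clock time at that place


def resolve_zone_key(place, table=PLACE_TO_ZONE):
    """Look up the tzdb key for a place at use time, not at storage time."""
    return table[place]


def to_utc(event, table=PLACE_TO_ZONE):
    """Resolve the zone lazily, then convert the stored wall time to UTC."""
    zone = ZoneInfo(resolve_zone_key(event.place, table))
    return event.wall_time.replace(tzinfo=zone).astimezone(timezone.utc)
```

The stored record never contains "America/New_York", so if Boston were ever split off into its own zone, only the table would change and the stored events would pick up the new rules on the next lookup.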

In any case, I sympathize with the question of "US/Eastern" vs. "America/New_York". It does seem like there are times where the time zone you want is "the US Eastern time zone", regardless of whether that represents the time in New York or not, in which case you *can* eagerly map to "US/Eastern", whereas if you want "the current time in <some city>", for the most part you *cannot* eagerly map to a given time zone. This leads to the perverse result that there are times when it makes sense to store "US/Eastern", but "America/New_York" is usually at best an approximation of what you actually want (leaving aside the "coincidence" case of time in New York).


On 02/20/2018 04:36 PM, John Hawkinson wrote:
> Paul.Koning at dell.com <Paul.Koning at dell.com> wrote on Tue, 20 Feb 2018
> at 20:10:33 +0000 in <4FEED0AD-A1BA-4D23-B461-CB07F0F85045 at dell.csom>:
>> I would not recommend that.  "backward" exists, as the name suggests, only for backward compatibility with old naming conventions.
> I'm not sure we've made such a statement of deprecation. I think a lot of people use the US/* style, even though it's not what the tz db prefers. I think a lot of people would be up-in-arms if those identifiers were deprecated anytime in the foreseeable life of the project (measured in decades). But perhaps I am wrong (or too conservative).
>> It may seem harder than necessary, but that may be because you haven't fully realized the depths of confusion that politicians will go to.
> Nah, I'm well aware of it. But the fact of the matter is, in the United States, for non-historical timestamps, very few of the gotchas pop up. For sure, there are states that have multiple zones in them, but that's not (at least in my view) a "gotcha."
>> Poking around Indiana will give you a good idea of why this stuff is
>> harder than you might expect.  There you will find some per-county
>> rules, and at some point in time county A might match county B while
>> at another point in history it uses different rules.
> See above. If we're dealing with present day (as I am), this is not a real concern.
>> So yes, the answer really is as hard as it seems to be.
> I think you mistake me.
>> Well, US/x is just a link to America/y, for suitable y.  So there isn't any actual difference.
> That's objectively false. The difference is what happens in the future when tz database breaks the link. 
>> The real problem is this: If some of the places in an existing zone
>> change their rules to differ from those of the rest of the zone, the
>> new definition will have two zones where there used to be just one.
> Correct. And that's why it matters what I choose. And that's also why
> my inclination is to choose the US/* form.
>> Consider a slightly stranger example [...]
> It's because that's such an unlikely case that I'm not particularly worried about it. In my situation, I'm making a reasonable judgment between two choices that are unlikely to matter, but if they do, I'd like to choose the most maintainable and understandable choice that reduces future maintenance. Strange examples are good to be aware of, but aren't directly helpful.
>> So to answer the question of the most stable identifier to use --
>> there isn't one.
> Oh, that's not true. There are lots of reasons to make choices about which identifier to use, and lots of ways to define stability. I offered some of my reasoning, and I'm curious what other people think (especially the argument for avoiding the "backward" zones, which I've always thought to be relatively weak arguments in the United States, but a lot stronger elsewhere).
>> But that doesn't help predict what might happen next year, because
>> you'd just be guessing what politicians might do and in general that
>> isn't feasible.
> False. Just because we cannot accurately predict the future does not mean we may not attempt it, and any choice in this space (America/New_York or US/Eastern) is such a prediction. We can't do it with perfect confidence, but certainly can do it, and arguably we must make such a choice. (Of course, we can make the choice by Rule instead of by evaluation of stability, and our Rule might be "Don't use the backward id's because *secret magic reason*." But it's still a choice.)
> --jhawk at mit.edu
>   John Hawkinson
