[tz] Recommendations for stable list of timezone ids
thobes at gmail.com
Mon Feb 12 13:51:55 UTC 2018
Condensing my reply to two people into one email, in order to not generate too much traffic on the list.
> On 9 Feb 2018, at 14:54 , Philip Paeps <philip at trouble.is> wrote:
> On 2018-02-09 13:44:03 (+0100), Tobias Lindaaker wrote:
>> I am trying to compile a flat list of all tzids in the Time Zone Database in a way such that the position of a tzid remains the same in the list even in the face of future changes to the Time Zone Database.
>> I would like to know if anyone could provide me with some insight into how (a prefix of) such a list could be generated from any version of the Time Zone Database. Or would this be something I would have to maintain myself from release to release of the Time Zone Database?
> I think this is something you'll have to maintain yourself. As you discovered, timezones sometimes get deleted so you'll need to be aware of the data files and the backzone file.
And in the case of tzids such as Asia/Alma-Ata, Asia/Ishigaki, and Canada/East-Saskatchewan I need to just manually keep track of them because they were completely removed from the tz database.
>> Finally some background on why (I think) I need to generate this type of list of tzids: I am improving the support for storing timestamp values in a database management system (Neo4j), and would like to avoid having to store the timezone component of zoned timestamps as a (variable length) text component, but rather store it as a fixed size integer offset into a table of tzids. Thus I need such a stable list of tzids.
> I wonder why you're storing timezone information in the database? Is there any reason you can't simply store UTC timestamps and show them in the correct timezone in the presentation layer?
I do store UTC timestamps, but I also store the origin time zone in order to be able to show the timestamp correctly in the presentation layer.
I don't need to store any of the definition of the time zones, only the tzid itself. I completely rely on the tzdb for being able to convert the UTC timestamp to the correct local time for the zone that the timestamp originated from.
> On 9 Feb 2018, at 19:26 , Paul.Koning at dell.com wrote:
>> On Feb 9, 2018, at 1:23 PM, Paul Koning <paul.koning at dell.com> wrote:
>>> On Feb 9, 2018, at 7:44 AM, Tobias Lindaaker <thobes at gmail.com> wrote:
>>> I am trying to compile a flat list of all tzids in the Time Zone Database in a way such that the position of a tzid remains the same in the list even in the face of future changes to the Time Zone Database.
>> Isn't this already in place? Timezones are identified by <continent>/<city>. While the primary name for a zone may change, for example if the "best known" spelling of the city changes, the previous name in such cases is preserved as a link.
>> So you can read the current tzdata to find all the names. Those that aren't links are today's TZ identifiers. In the future, if a new name shows up, it's either a renamed zone (if a previously known zone name is now a link to that new name) or a newly defined zone (if no such link exists).
>> If that isn't sufficient, what's missing?
> Oh, I missed something. You want them sorted and for the positions to remain fixed. That's clearly not possible. And why do you need that? The strings are stable identifiers as I described. If you don't like keying lots of records by strings, which is reasonable, you can create an auxiliary table that maps the strings into small integers which are then used in the other tables. You can do all this today.
Yes. What I am asking is whether somebody already has such an auxiliary table defined somewhere, or whether anyone has hints on the best strategy for creating one.
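As a rough illustration of the approach Paul describes (read the names from tzdata and tell zones apart from links), here is a minimal sketch. It assumes the tzdata source syntax, where a zone starts with `Zone NAME ...` and an alias is written `Link TARGET LINK-NAME`. The function name `collect_tzids` and the sample lines are made up for illustration and are not real tzdata content.

```python
def collect_tzids(lines):
    """Collect zone names and link names from tzdata source lines.

    Returns (zones, links), where links maps each alias to its target.
    """
    zones = set()
    links = {}
    for raw in lines:
        # Drop comments, which run from '#' to the end of the line.
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "Zone":
            zones.add(fields[1])
        elif len(fields) >= 3 and fields[0] == "Link":
            # Link TARGET LINK-NAME: the alias is the third field.
            links[fields[2]] = fields[1]
        # Zone continuation lines carry no keyword and are skipped.
    return zones, links


# Made-up lines in tzdata source syntax, not real data.
SAMPLE = """\
# comment line
Zone Example/City 1:00 - LMT 1900
            1:00 - CET
Link Example/City Example/Old_City
"""

zones, links = collect_tzids(SAMPLE.splitlines())
all_ids = sorted(zones | set(links))
```

Running this over the data files and the backzone file of one release gives the ids known to that release. Ids that were dropped entirely in an earlier release would still have to come from somewhere else.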
My original plan was to dynamically generate something similar to a system table for mapping between the string identifiers and small integers. For various technical reasons that I don't think I need to bore this list with, a number of my colleagues were against this idea and proposed instead generating a static mapping table and shipping it as part of our code. After some investigation into how to best generate such a mapping table in a stable way, which ended with me bumping into the removal of Canada/East-Saskatchewan (which meant it would be impossible to generate a stable list based on only a single release of tzdata), I decided to ask this list for input.
I would still very much like it if I could maintain a dynamic mapping based on the tzids in actual use rather than a static pre-generated table, but in order for the feature to make the next release window I need to at least investigate both options in parallel. To be honest, the static-table-in-code approach is not looking as bad as I first thought, even if I do have to continuously track new tzdata releases, in particular because I would need to do some of that anyhow in order to handle things like Canada/East-Saskatchewan.
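One way to keep such a static table stable across releases is to treat it as append-only: existing positions never change, ids that disappear from tzdata keep their slot, and ids not seen before are appended at the end. The sketch below assumes that scheme. The helper name `extend_table`, the two input lists, and the id `Example/New_Zone` are hypothetical and do not correspond to real tzdata releases.

```python
def extend_table(table, release_ids):
    """Return a new table with unseen ids appended; existing positions are untouched."""
    known = set(table)
    # Sort only the new ids so the result does not depend on input order.
    new_ids = sorted(set(release_ids) - known)
    return list(table) + new_ids


# Hypothetical id lists from an older and a newer release.
older_release = ["Europe/Stockholm", "Canada/East-Saskatchewan", "Asia/Tokyo"]
newer_release = ["Asia/Tokyo", "Europe/Stockholm", "Example/New_Zone"]

table = extend_table([], older_release)
table = extend_table(table, newer_release)

# The integer stored in the database is the position in the table.
tzid_to_index = {tzid: i for i, tzid in enumerate(table)}
```

With this scheme the removed id keeps its index, so values already stored remain decodable, and the table only needs regenerating when a new tzdata release is adopted.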