[tz] Preparing to fork tzdb

Steffen Nurpmeso steffen at sdaoden.eu
Thu Sep 23 22:26:46 UTC 2021


Guy Harris wrote in
 <B97CABBF-BFEA-4CE5-9D05-9663170ABBE0 at sonic.net>:
 |On Sep 23, 2021, at 5:41 AM, Steffen Nurpmeso <steffen at sdaoden.eu> wrote:
 |> Guy Harris wrote in
 |> <8C6BA6DA-AC7D-4FCE-97F0-1B277A981B55 at sonic.net>:
 |> ...
 |>|So one difference here appears to be that the pre-1970 data for Europe/O\
 |>|slo may be accurate while the pre-1970 data for America/Montreal may \
 |>|be inaccurate.

And the README even says

  This database of historical local time information has several goals:

   * Provide a compendium of data about the history of civil time that
     is useful even if not 100% accurate.

 |> But isn't the drive to backzone going on for longer, yes hasn't it
 |> been introduced for exactly the purpose in 2014?
 |
 |I.e., one to two years longer, given that America/Montreal was introduced \
 |in 2015 (as I believe I noted when I first brought up America/Montreal \
 |in this context).
 |
 |> Ever since i am on this ML it was discussed often to possibly
 |> replace the identifiers with anonymized strings, i wonder how
 |> anyone could create an interface where one name maps to another
 |> name.
 |
 |If I open up System Preferences on my machine, select the "Date and \
 |Time" entry, select the "Time Zone" tab, enable editing via a couple \
 |of steps, and type "Oxnard" into the "Closest City" box, under the \
 |hood it maps "Oxnard, CA" to "America/Los_Angeles", setting the current \
 |system tzdb region to the latter.  (And then I go back to "figure it \
 |out automatically based on where I'm currently located", but I digress....)
 |
 |Is that what you mean by "one name maps to another name"?  There's \
 |no "America/Oxnard", so it needs to figure out that Oxnard is in the \
 |tzdb region that's named "America/Los_Angeles".

This is malicious hairsplitting for nothing.
I mean a programming interface that gives you one TZ name for
another TZ name, where both are effectively indistinguishable and
select the same data.  The first name had to be selected in the
first place, so why would you want it to be mapped?  The names in
backward do not time out, i think.
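
To illustrate what such a name-to-name mapping amounts to, here is
a minimal Python sketch.  It assumes the "Link TARGET LINK-NAME"
line format that zic reads; the sample lines and the names
parse_links, canonical and ALIASES are made up for illustration and
are not part of tzdb or of any library.

```python
# Sample lines in the style of tzdb's "backward" file.  They are
# illustrative only; check the real file for the actual entries.
SAMPLE_LINKS = """\
# comment lines and blank lines are ignored
Link America/Toronto America/Montreal
Link Europe/Oslo Arctic/Longyearbyen
"""


def parse_links(text):
    """Return {link_name: target} for every Link line in text."""
    aliases = {}
    for line in text.splitlines():
        # Drop trailing comments, then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "Link":
            aliases[fields[2]] = fields[1]
    return aliases


def canonical(name, aliases):
    """Follow links until reaching a name that is not itself a link."""
    seen = set()
    while name in aliases and name not in seen:
        seen.add(name)  # guards against an accidental link cycle
        name = aliases[name]
    return name


ALIASES = parse_links(SAMPLE_LINKS)
```

With this sample, canonical("America/Montreal", ALIASES) gives
"America/Toronto", while a name without a Link line, such as
"Europe/Vienna", comes back unchanged.  Both the alias and its
target select the same data, which is the whole point.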

 |> And what for?  This is not just "super correct" and "i give
 |> my users the full knowledge of the real thing, of anything", that
 |> is just broken.  It is anyway a programmer-only thing, maybe also
 |> a UNIX command line power user thing -- but to get it _really_
 |> right for _users_, you had to use ICU data and use their approach
 |> of time zones, that includes translation then also, does it?
 |
 |macOS *does* include ICU, but the string "Oxnard" appears nowhere in \
 |CLDR as of cldr-28, so I suspect it uses more than just ICU data.
 |
 |> You know, as a human german speaking being i would _never_ assume
 |> Europe/Vienna and do "$ TZ=Europe/Vienna date" really, that is
 |> a nerd assumption.  Maybe "$ date --country=AUstrIA" or
 |> "$ date --city Würzburg" or "$ date --state Baden-Württemberg",
 |> and i only write this for case-messing out-of-ASCII.
 |
 |Sounds good.  That would require data of the sort that macOS uses; \
 |could OpenStreetMaps data be used?

Well _i_ do not think that sounds good, i think reality is you go
to a mega company's app store, download an app, and then input or
even speak a name.

 |Anyway:

Yes, please!

 |> Sorry for getting bold, but for one i really did not make that
 |> error (!), and then i also find it absurd to claim to offer
 |> a timezone interface that covers all the date and time data, and
 |> then to exclude backzone.
 |
 |I think we need, as some have suggested, to figure out what to do about \
 |pre-1970 data.  I think that includes somehow making it clear that \
 |people who expect complete, accurate, and unchanging pre-1970 data \
 |are expecting something that
 |
 | 1) is currently self-contradictory (Do you want "accurate" or do you \
 | want "unchanging"?  Some of the data we have is probably inaccurate, \
 | so that making it accurate will involve change!)
 |
 | 2) would involve a lot of effort (sometimes it's even hard to get \
 | information about upcoming time changes from governments; finding \
 | historical information might be a lot harder, especially to people \
 | who aren't familiar with the language and culture of the location \
 | for which you're trying to get historical time information).
 |
 |It also includes deciding whether to offer all the data by default \
 |(as in "not in backzone"), to provide a way to offer both sets of data \
 |separately, or whatever.

My personal opinion is that the data should not be split at all.
Instead, tooling should cut the data wherever the line, if any, is
drawn; the makefile is capable of doing this.  I cannot quote it
right now, but IANA TZ is supposed to offer post-1970 data by
default.

 |> On the other hand the announced release strategy felt offending to
 |> me.
 |
 |I think that given the current level of controversy, the next release \
 |should be "2021a+Samoa and any other updates" rather than "current \
 |tip of the main branch", so that we get something out that includes \
 |the Samoa updates but doesn't include the controversial changes.

Or the build should be changed to either cut the data by default,
or to include backzone by default.  In both cases nothing except
a surely unused, broken interface of two downstream libraries
should notice a difference, unless they want to use pre-1970 dates.
It is just sad that for black and other coloured people no one
spoke up against splitting to backzone, but now suddenly there is
an uproar.
Given the length of the thread, maybe your way is the best.
But if the years-long and now standardized way is continued,
including the release, then downstream consumers should be made
aware of the changes, for example by letting an unchanged build
recipe fail.

 --End of <B97CABBF-BFEA-4CE5-9D05-9663170ABBE0 at sonic.net>

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

