[tz] An alternate framing of timezone maintenance
mikeadouglass at gmail.com
Wed Sep 22 15:57:55 UTC 2021
This is pretty much what I've been suggesting (though not in as much
detail).
I would also suggest we (re)consider distribution of the resulting data.
Asking people to discover and run code to produce the format they desire
seems unnecessary. Providing the data in a number of (popular)
downloadable formats would seem reasonable.
Timezone distribution service?
On 9/22/21 06:52, Russ Allbery via tz wrote:
> Over the past few days, I've felt like the framing of the discussion
> hasn't taken into account Paul's clearly expressed desire for the part of
> maintenance he wants to focus on, and has not attempted to incorporate
> that into a design that would preserve other properties that other mailing
> list participants are interested in. I've also wondered if all parties
> are making unnecessarily strong assumptions about the nature of tz
> maintenance that exclude potentially useful designs.
>
> In the hope of applying the maxim that all problems in computer science
> can be solved by adding a level of indirection, here's a wild proposal
> that, even if not workable as-is, might help in looking at the discussion
> from a different angle.
>
> One can think of the tz database as two layers. The first is a collection
> of rulesets that represent rules for clock changes in particular regions.
> Call that the timekeeping data set. The second is a many-to-one
> assignment of names to those rulesets. Call that the naming layer.
>
> The scheme used for the naming layer attempted to avoid politicization of
> that layer by using the continent and largest city approach. This was
> largely successful, particularly by the standards of attempts of this
> sort, but not entirely so.
>
> For years now, the tz project has in essence asked people to treat the
> zone names as opaque identifiers and not imbue them with political
> meaning. Unfortunately, because those identifiers embed real-world names
> with other meanings in other contexts, I believe this effort is doomed to
> never fully succeed. The names and spellings of cities are political.
> The choice of continent to which to assign a city can be political.
> Population counts are political. Readers of the mailing list can fill in
> more examples.
>
> However, the timekeeping data set, divorced from the naming layer, is as
> close to apolitical as anything involving laws and human practice could
> be. Putting aside timezone abbreviations, nearly all of the political
> conflict is over the naming layer, not the timekeeping data set.
>
> I believe Paul has clearly indicated that the part of the work that he
> wants to focus on is maintenance of the timekeeping data set. I would
> characterize his recent proposed changes as attempts to make the naming
> layer less political to reduce political arguments and thus allow more
> time and attention to be spent on the timekeeping data set, which is where
> the primary value of the project lies. The stability concerns that have
> prompted most of the recent discussion are almost entirely about the
> naming layer.
>
> Suppose we resurrect the idea of opaque timezone identifiers.
> Specifically, suppose that we *add* a new, random identifier, something
> like TZ0045 with random digits, to all existing rulesets in either the
> main database or backzone. These identifiers would be unique identifiers
> for the dataset itself, independent of any other names. These identifiers
> would immediately have some useful properties:
>
> 1. Historic times for a given identifier would change only if we
> discovered that the previous times were clearly erroneous. Apart from
> fixing discovered errors, historic times would be stable for any given
> identifier.
>
> 2. Looking forward, new identifiers may be added if portions of an
> existing region diverge in their timekeeping practices or if someone
> gathers new historical information that would prompt the creation of a
> new backzone ruleset, but that's the only possible change. Identifiers
> will never change or be retired.
>
> 3. These identifiers carry absolutely no additional political content on
> top of the rules themselves. In other words, they add no new political
> problems not inherent and unavoidable in the data itself.
>
> Adding these identifiers would nearly double the number of names in the
> current tz database, which is unfortunate, but certainly far less
> disruptive than the sorts of changes that have recently been considered.
>
> Once these identifiers exist, the combination of those identifiers and the
> timekeeping data set form a nearly apolitical collection of data to which
> a naming layer can be cleanly applied. One can, for example, define a
> naming layer that exactly corresponds to the naming in use in the previous
> release of the tz database. With the exception of the implementation
> detail that the previous names become links to a new canonical identifier,
> the combination of that naming layer and that conception of the
> timekeeping data set is functionally identical to the previous tz release
> (except for the normal sorts of modifications for on-the-ground
> timekeeping changes).
>
> This may sound like a lot of work just to get back to where we already
> are, but with a pile of new, ugly names. But the point of such a change
> is that it now permits a separation of concerns and even potentially a
> separation of maintenance.
>
> The timekeeping data set is now a separate artifact that those whose
> primary interest is in timekeeping data can focus on without having to get
> involved in political naming discussions. It achieves the goal that Paul
> has been working towards (but which is impossible to fully achieve with
> the current naming) of separating the data from political and historical
> decisions about who got a timezone name and who didn't. And (very slowly,
> of course) there is now the possibility for consumers of the tz database
> to opt out of the naming conventions. One could, for instance, choose a
> timezone based on selection from a map and have that correspond to the
> unique, permanent timezone identifier.
>
> Meanwhile, clearly there is a strong interest in the naming layer and a
> strong desire to continue to maintain it along lines that Paul is not
> entirely comfortable with. Recently, that discussion has focused on
> naming stability, but other parties have expressed other interests in the
> past (adding new spellings of cities, ensuring a name exists for every
> ISO-recognized country, ensuring a name exists for regional capitals that
> are commonly referenced locally as the name for a timezone, etc.).
> Nothing is going to make those discussions go away, as the past many years
> of discussions here have shown, but now they are separable from the
> timekeeping data set and participants can decide which part of the
> maintenance they're interested in.
>
> If Paul (or any other contributor) wished, he could choose to focus on the
> part of the project that he finds the most interesting and leave
> maintenance of the naming layer largely to other parties. Given recent
> mailing list traffic, there is obviously substantial interest in that
> naming layer and thus I'm sure there will be no shortage of volunteers to
> help maintain it. And those who make decisions about the naming layer can
> then also absorb the consequences of those decisions, such as handling
> arguments over the spelling of cities. It would even be possible
> (although not necessary) to move discussion of the naming layer to a
> separate mailing list to more clearly separate political discussion from
> ruleset maintenance and technical work on the associated code libraries.
> The naming layer, which is now nearly devoid of technical decisions, could
> even be delegated to a more political body that deals with these sorts of
> conflicts constantly and is thus better equipped to handle them than the
> tz mailing list. Numerous options like that become possible.
>
> Even if a maintenance split doesn't happen, I think everyone may benefit
> from cleanly separating the spectacularly high-quality resource of
> rulesets and their accompanying exhaustive references, discussion, and
> human-readable descriptions of applicable regions from the politically
> fraught but technically quite small and simple naming layer.
>
> This idea may not be workable for reasons that aren't obvious to me at
> nearly 4:00am, but hopefully it will at least provide a different angle
> from which to look at the current arguments and possibly achieve some
> clarity about which portions of the overall tz project people are
> interested in working on and where the exact controversy lies.
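
For concreteness, the two-layer split described in the quoted proposal
might be sketched roughly as below. This is only an illustration under
stated assumptions, not anything that exists in the tz code or data: the
class and method names (Ruleset, TimezoneData, link_name, resolve,
names_for) are hypothetical, the "Example/..." zone names are made up, and
the rules field is an empty placeholder rather than real transition data.
Only the TZ0045-style identifier format comes from the proposal itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ruleset:
    """One entry in the timekeeping data set (hypothetical shape)."""
    ident: str           # opaque identifier, e.g. "TZ0045"
    description: str     # human-readable note on the region covered
    rules: tuple = ()    # placeholder for the actual clock-change rules


class TimezoneData:
    """Timekeeping data set plus a separate many-to-one naming layer."""

    def __init__(self):
        self._rulesets = {}  # opaque identifier -> Ruleset
        self._names = {}     # name -> opaque identifier (naming layer)

    def add_ruleset(self, ruleset):
        # Identifiers are never reused or retired, so refuse duplicates.
        if ruleset.ident in self._rulesets:
            raise ValueError(f"identifier already exists: {ruleset.ident}")
        self._rulesets[ruleset.ident] = ruleset

    def link_name(self, name, ident):
        # A name is only a link; it must point at an existing ruleset.
        if ident not in self._rulesets:
            raise KeyError(f"unknown identifier: {ident}")
        self._names[name] = ident

    def resolve(self, key):
        # Accept either a name from the naming layer or an opaque identifier.
        ident = self._names.get(key, key)
        return self._rulesets[ident]

    def names_for(self, ident):
        # All names currently linked to one ruleset, in sorted order.
        return sorted(n for n, i in self._names.items() if i == ident)


# Made-up data: two spellings of one name linked to the same ruleset.
data = TimezoneData()
data.add_ruleset(Ruleset("TZ0045", "hypothetical region A"))
data.link_name("Example/City_A", "TZ0045")
data.link_name("Example/City_A_Alt", "TZ0045")
```

In a model like this, adding or respelling a name touches only the naming
table and never the ruleset it points to, which is the separation of
concerns the proposal is after. A consumer that opts out of names entirely
would just call resolve() with the opaque identifier.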