[tz] Pre-1970 data
brian at xparks.net
Fri Nov 5 15:26:38 UTC 2021
On Thu, Nov 4, 2021 at 10:11 PM Philip Paeps <philip at trouble.is> wrote:
> On 2021-11-05 12:17:34 (+0800), Brian Park via tz wrote:
> > I get the impression that this debate is caused by the existence of 2
> > different schools of thought: [...]
> > I want to suggest that it may be possible for these 2 views to
> > coexist.
> They de facto coexist right now. The overwhelming majority of the data
> are descriptive. Only recent efforts have made some of the post-1970
> data appear more prescriptive.
They coexist in an ad hoc manner right now, and that seems to be one of the
causes for the contention. I am suggesting that we formalize the
separation, so that both groups are happier.
> > could create a new file, e.g. call it 'countryzone', which contains a
> > set of Links organized in a hierarchical tree by country, pointing to
> > the Core zones.
> I strongly believe we should continue to carefully avoid attempting to
> group data by country. [I would even avoid using the word "country"
> wherever possible.]
Can you explain why? Because it will cause arguments about disputed places?
I think only a small minority of places around the world are disputed. By
separating these ISO-country timezones into a 'countryzone' file, perhaps
we can confine the debate to a smaller section of the TZDB. We could
create duplicate entries (e.g. Country1/City, Country2/City), or create a
pseudo-country called "Disputed" (e.g. Disputed/City). The point is, we can
create policies that govern these disputed regions.
Could we move 'countryzone' into a separate project? Probably, but some
amount of initial coordination and refactoring would be required to resolve
conflicting zone identifiers.
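As a rough sketch of what a 'countryzone' file might contain, the Python
below generates Link lines from a country-to-zone mapping. Everything in it
is hypothetical (the countryzone_links helper, the mapping, and the
Country1/Country2 placeholders); only the "Link TARGET LINK-NAME" line
syntax is the one TZDB already uses.

```python
def countryzone_links(core_zones_by_country):
    """Yield TZDB 'Link' lines that alias each Core zone under a country prefix.

    core_zones_by_country is a hypothetical mapping: country name -> list of
    existing Core zone identifiers.
    """
    for country in sorted(core_zones_by_country):
        for core_zone in sorted(core_zones_by_country[country]):
            # Reuse the last path component of the Core zone as the city name.
            city = core_zone.rsplit("/", 1)[-1]
            yield f"Link {core_zone} {country}/{city}"


# Illustrative input only. Country1/Country2 stand in for a disputed place
# that gets duplicate entries, one per claimant.
example = {
    "Canada": ["America/Toronto"],
    "Country1": ["Region/City"],
    "Country2": ["Region/City"],
}

lines = list(countryzone_links(example))
print("\n".join(lines))
```

A "Disputed" pseudo-country would just be one more key in the mapping, so
the policy question stays separate from the file format.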
Overall, I feel like the TZDB data should lean a bit more towards matching
how end-users think about timezones in the real world (Prescriptive), and
lean slightly less on treating timezones as a clustering problem
(Descriptive). But I can see pros and cons of both approaches. Which is why
I am suggesting ways to make the 2 approaches interoperate better.
> > For the pre-1970 data, it is my understanding that the 'backzone' file
> > contains Zone records which should replace ONLY the LinkMerged records
> > found in the other files. I propose that all LinkMerged records be
> > extracted into a separate file (let's call it 'mergedzone') so that
> > there is a clear symmetry between 'backzone' and 'mergedzone', which
> > allows them to be substituted for each other. The dependency diagram
> > looks something like this:
> As I've suggested before in another thread, I think we should consider
> undoing the split into backzone. I really liked Stephen's phrasing
> earlier in this thread: acceptably accurate, not outrageously wrong. We
> started moving data to backzone to limit the scope of 'active'
> maintenance to post-1970 data. That artificial split led us towards a
> more prescriptive worldview. It seems clear that prescriptive simply
> does not work for a real world with people on it.
I think Paul Eggert has made it clear that he does not want to maintain
this data. My proposed refactoring of this info into the 'backzone' /
'mergedzone' pair makes it easy for downstream libraries to add back the
'backzone' data if they want. The 'make PACKRATDATA=backzone' hack does not
help downstream libraries which do not use TZif or the Makefile.
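To illustrate the kind of substitution I have in mind, here is a minimal
Python sketch of what a downstream library could do. It assumes the proposed
(not yet existing) 'mergedzone' file, and it simplifies heavily: it ignores
Rule lines and any Links inside 'backzone'. The function names and the
Example/* zone data are made up for illustration.

```python
def defined_names(text, keyword):
    """Return the identifiers defined by 'Zone' or 'Link' lines in TZDB source text."""
    names = set()
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 3 and fields[0] == keyword:
            # 'Zone NAME STDOFF ...' defines NAME; 'Link TARGET NAME' defines NAME.
            names.add(fields[1] if keyword == "Zone" else fields[2])
    return names


def is_symmetric(mergedzone_text, backzone_text):
    """True if every merged Link has a matching backzone Zone, and vice versa."""
    return defined_names(mergedzone_text, "Link") == defined_names(backzone_text, "Zone")


def assemble(core_text, mergedzone_text, backzone_text, want_pre1970):
    """Combine the Core data with exactly one of the two interchangeable files."""
    if not is_symmetric(mergedzone_text, backzone_text):
        raise ValueError("mergedzone and backzone define different identifiers")
    return core_text + (backzone_text if want_pre1970 else mergedzone_text)


# Made-up sample data, not real TZDB records.
CORE = "Zone Example/Kept 1:00 - +01\n"
MERGED = "Link Example/Kept Example/Merged\n"
BACK = "Zone Example/Merged 0:30 - +0030 1960\n\t\t\t1:00 - +01\n"

print(assemble(CORE, MERGED, BACK, want_pre1970=True))
```

The symmetry check is the point: if the two files always define the same
set of identifiers, swapping one for the other cannot add or drop a zone.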
> > If there is any chance that this will result in being able to type
> > "Canada/Toronto" instead of "America/Toronto", that would resolve an
> > annoyance that has lasted some 30-35 years.
> In this context, America refers to the landmass, not to the political
> entity occupying a large chunk of it. [Canada/Eastern etc moved to
> backward around 1993, as far as I can tell.]
Virtually no one in the world thinks of "America" as referring to all of
"North America" and "South America".