[tz] Pre-1970 data

dpatte dpatte at relativedata.com
Fri Nov 5 22:35:40 UTC 2021


I will be contributing a longer message on this whole subject within a few days, but I want to state upfront that country/city is insufficient in many countries for a fully expandable location database.

Off the top of my head, I can think of three different Richmonds in Canada: one in Quebec, one in Ontario, and one in BC. I know there are many Springfields in many USA states, and let's not even talk about the potential USA/Columbus.

The key of a database should be unique and recognizable. For many countries, multiple levels of government are the only way to properly differentiate cities. Canada/British_Columbia/Richmond would be more appropriate.

This allows the pre-1970 db to be both infinitely expandable and user friendly to the point of being fully obvious to all users.
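
To make the uniqueness point concrete, here is a minimal sketch in Python. The place list and the helper names (`find_collisions`, `two_level_key`, `three_level_key`) are hypothetical and only for illustration; they are not part of tzdb or any existing tool. It groups places by a candidate key and reports which keys are shared by more than one place.

```python
from collections import defaultdict

# Hypothetical sample data as (country, region, city); illustrative only.
PLACES = [
    ("Canada", "Quebec", "Richmond"),
    ("Canada", "Ontario", "Richmond"),
    ("Canada", "British_Columbia", "Richmond"),
    ("Canada", "Ontario", "Toronto"),
]


def find_collisions(places, key_func):
    """Return only the keys that more than one place maps to."""
    groups = defaultdict(list)
    for place in places:
        groups[key_func(place)].append(place)
    return {key: members for key, members in groups.items() if len(members) > 1}


def two_level_key(place):
    """Build a country/city key, ignoring the region."""
    country, _region, city = place
    return f"{country}/{city}"


def three_level_key(place):
    """Build a country/region/city key."""
    return "/".join(place)


two_level = find_collisions(PLACES, two_level_key)
three_level = find_collisions(PLACES, three_level_key)

print(sorted(two_level))  # keys that are ambiguous under country/city
print(three_level)        # collisions remaining under country/region/city
```

With this sample data, the only ambiguous country/city key is Canada/Richmond, and adding the region level removes the collision.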
-------- Original message --------
From: Brian Park via tz <tz at iana.org>
Date: 2021-11-05 11:27 (GMT-05:00)
To: Philip Paeps <philip at trouble.is>
Cc: Stephen Colebourne <scolebourne at joda.org>, Time Zone Mailing List <tz at iana.org>
Subject: Re: [tz] Pre-1970 data

On Thu, Nov 4, 2021 at 10:11 PM Philip Paeps <philip at trouble.is> wrote:

On 2021-11-05 12:17:34 (+0800), Brian Park via tz wrote:
> I get the impression that this debate is caused by the existence of 2
> different schools of thought: [...]
>
> I want to suggest that it may be possible for these 2 views to coexist.

They de facto coexist right now.  The overwhelming majority of the data 
are descriptive.  Only recent efforts have made some of the post-1970 
data appear more prescriptive.

They coexist in an ad hoc manner right now, and that seems to be one of the causes for the contention. I am suggesting that we formalize the separation, so that both groups are happier.

> We could create a new file, e.g. call it 'countryzone', which contains a
> set of Links organized in a hierarchical tree by country, pointing to the
> Core zones.

I strongly believe we should continue to carefully avoid attempting to 
group data by country.  [I would even avoid using the word "country" 
wherever possible.]

Can you explain why? Because it will cause arguments about disputed places? I think only a small minority of places around the world are disputed. By separating these ISO-country timezones into a 'countryzone' file, perhaps we can confine the debate into a smaller section of the TZDB. We could create duplicate entries (e.g. Country1/City, Country2/City), or create a pseudo-country called "Disputed" (e.g. Disputed/City). The point is, we can create policies that govern these disputed regions.

Could we move 'countryzone' into a separate project? Probably, but some amount of initial coordination and refactoring would be required to resolve conflicting zone identifiers.

Overall, I feel like the TZDB data should lean a bit more towards matching how end-users think about timezones in the real world (Prescriptive), and lean slightly less on treating timezones as a clustering problem (Descriptive). But I can see pros and cons of both approaches. Which is why I am suggesting ways to make the 2 approaches interoperate better.
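
As a rough sketch of how a downstream library might consume such a file: the 'countryzone' file does not exist, so its contents below are hypothetical, as are the helper names `parse_links` and `resolve` and the placeholder identifiers such as Region/City. The only thing taken from tzdb is the existing "Link TARGET LINK-NAME" line syntax. The sample shows both the duplicate-entry policy and the "Disputed" pseudo-country policy pointing at the same core zone.

```python
# Hypothetical contents of a 'countryzone' file. The entries are placeholders,
# apart from the Canada/Toronto example raised in this thread.
COUNTRYZONE_TEXT = """\
# Link  TARGET            LINK-NAME
Link    America/Toronto   Canada/Toronto
Link    Region/City       Country1/City
Link    Region/City       Country2/City
Link    Region/City       Disputed/City
"""


def parse_links(text):
    """Parse 'Link TARGET LINK-NAME' lines into a {link_name: target} dict."""
    links = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3 or fields[0] != "Link":
            raise ValueError(f"unsupported line: {raw_line!r}")
        _keyword, target, link_name = fields
        if link_name in links:
            raise ValueError(f"conflicting zone identifier: {link_name}")
        links[link_name] = target
    return links


def resolve(name, links):
    """Map a country-style name to its core zone; pass other names through."""
    return links.get(name, name)


links = parse_links(COUNTRYZONE_TEXT)
print(resolve("Canada/Toronto", links))
print(resolve("Disputed/City", links))
```

Rejecting a repeated link name is one possible way to surface the conflicting identifiers mentioned above; a real policy could just as well pick a winner.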
> For the pre-1970 data, it is my understanding that the 'backzone' file
> contains Zone records which should replace ONLY the LinkMerged records
> found in the other files. I propose that all LinkMerged records be
> extracted into a separate file (let's call it 'mergedzone') so that there
> is a clear symmetry between 'backzone' and 'mergedzone', which allows them
> to be substituted for each other. The dependency diagram looks something
> like this:

As I've suggested before in another thread, I think we should consider 
undoing the split into backzone.  I really liked Stephen's phrasing 
earlier in this thread: acceptably accurate, not outrageously wrong.  We 
started moving data to backzone to limit the scope of 'active' 
maintenance to post-1970 data.  That artificial split led us towards a 
more prescriptive worldview.  It seems clear that prescriptive simply 
does not work for a real world with people on it.

I think Paul Eggert has made it clear that he does not want to maintain this data. My proposed refactoring of this info into the 'backzone' / 'mergedzone' pair makes it easy for downstream libraries to add back the 'backzone' data if they want. The 'make PACKRATDATA=backzone' hack does not help downstream libraries which do not use TZif or the Makefile.
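
A minimal sketch of what the substitution could look like for a library that reads the source files directly. Everything here is an assumption made for illustration: the dictionaries stand in for parsed files, the zone names are placeholders, `build_dataset` is a hypothetical helper, and the rule that 'mergedzone' and 'backzone' define exactly the same set of names is my reading of the "symmetry" in the proposal.

```python
# Hypothetical, heavily simplified stand-ins for parsed tzdb source files.
CORE_ZONES = {"Example/Main": "Zone (full history)"}
MERGEDZONE = {"Example/Merged": "Link -> Example/Main"}
BACKZONE = {"Example/Merged": "Zone (own pre-1970 history)"}


def build_dataset(core, mergedzone, backzone, use_backzone):
    """Combine core zones with exactly one of the two interchangeable files."""
    if set(mergedzone) != set(backzone):
        # Assumed symmetry: every merged Link has a backzone Zone, and vice versa.
        raise ValueError("mergedzone and backzone must define the same names")
    dataset = dict(core)
    dataset.update(backzone if use_backzone else mergedzone)
    return dataset


default_data = build_dataset(CORE_ZONES, MERGEDZONE, BACKZONE, use_backzone=False)
full_data = build_dataset(CORE_ZONES, MERGEDZONE, BACKZONE, use_backzone=True)

print(default_data["Example/Merged"])
print(full_data["Example/Merged"])
```

Either choice yields the same set of identifiers; only the records behind the merged names differ.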
> If there is any chance that this will result in being able to type
> "Canada/Toronto" instead of "America/Toronto", that would resolve an
> annoyance that has lasted some 30-35 years.
In this context, America refers to the landmass, not to the political 
entity occupying a large chunk of it.  [Canada/Eastern etc moved to 
backward around 1993, as far as I can tell.]

Virtually no one in the world thinks of "America" as referring to all of "North America" and "South America".

Brian
