[tz] Pre-1970 data

Philip Paeps philip at trouble.is
Sat Nov 6 06:55:02 UTC 2021


On 2021-11-06 05:53:23 (+0800), Brian Park via tz wrote:
> On Fri, Nov 5, 2021 at 12:01 PM Brian Park <brian at xparks.net> wrote:
>> I agree that it is conceptually cleaner if the Core TZDB identifiers were
>> internal only. But I understand that some people would consider ISO-country
>> identifiers to be out of scope of this project, although there are many ad
>> hoc ones currently in the database. I think a file like 'countryzone'
>> should be added only if there are people willing to maintain such a list.
>> It may need to be a separate project, to avoid forcing the TZ Coordinator
>> to pick up the slack if those maintainers drop off.
>
> Following up my own post, I took an initial stab at what this 'countryzone'
> file would look like, and immediately ran into problems that convince me
> that this does *not* belong in the TZDB project. The scope seems too large,
> so it seems better as a separate project.

I'm glad we can agree on this. :)

> 2) If we shorten some countries, like "Bosnia and Herzegovina" to just
> "Bosnia" for convenience, are we going to offend people? I don't know
> anyone from Bosnia and Herzegovina, so I have no idea. Each country that we
> shorten needs to be researched carefully.

That's just a subset of the problems you'll encounter.  Referring to 
certain regions assigned ISO two-letter codes as countries will cause 
considerable awkwardness too.  Even if you wisely avoid using the word 
country, your suggestion to abbreviate (or not) will be deeply 
controversial.  You'll face an uphill struggle defending each decision.

You might reasonably (from a technical perspective) suggest mechanically
abbreviating (truncating) to the first whitespace or punctuation mark.
That'll give you seven regions named "Saint" and that will probably be the
least of your problems.
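
A minimal sketch of that mechanical truncation, to show the collision. The
name list is a hand-typed sample of English short names, not the full ISO
3166 list, and truncate() and SAMPLE_NAMES are hypothetical names used only
for illustration:

```python
import re
from collections import Counter

# Hand-typed sample of English short names (assumption: NOT the complete
# ISO 3166 list, just enough entries to show the collision).
SAMPLE_NAMES = [
    "Bosnia and Herzegovina",
    "Canada",
    "Saint Barthélemy",
    "Saint Helena, Ascension and Tristan da Cunha",
    "Saint Kitts and Nevis",
    "Saint Lucia",
    "Saint Martin (French part)",
    "Saint Pierre and Miquelon",
    "Saint Vincent and the Grenadines",
]


def truncate(name: str) -> str:
    """Keep everything before the first whitespace or punctuation mark."""
    # \W matches any character that is not a letter, digit or underscore,
    # which covers both whitespace and punctuation.
    return re.split(r"\W", name, maxsplit=1)[0]


# Count how many sample names collapse to the same truncated form.
counts = Counter(truncate(name) for name in SAMPLE_NAMES)
collisions = {short: n for short, n in counts.items() if n > 1}

print(collisions)
```

Within this sample, every "Saint ..." entry collapses to the same
identifier, so the truncated names alone can no longer tell them apart.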

> 3) At least 5 countries have non-ASCII characters in their ISO names:
> "Côte d'Ivoire", "Curaçao", "Åland Islands", "Saint Barthélemy", "Réunion".
> Personally, I would like to use only ASCII characters because they are the
> lowest common denominator that is guaranteed to work, outside of mainframes
> using EBCDIC. If we remove these non-ASCII characters, are we going to
> offend the people of those countries, even though these are supposed to be
> English versions of their country names?

I don't believe the spelling will be nearly as controversial as 
referring to most of the regions in that list as countries.  There is 
prior art in the tzdb for ASCII-fying accented letters.
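
For illustration only, one mechanical way to ASCII-fy the five names quoted
above is Unicode decomposition followed by dropping the combining marks.
This is a sketch of one possible approach, not a description of how the
existing tzdb spellings were chosen; asciify() is a hypothetical helper:

```python
import unicodedata


def asciify(name: str) -> str:
    """Strip accents by decomposing characters and dropping combining marks.

    Assumption: this only handles letters that decompose into a base letter
    plus combining marks. Letters without such a decomposition (for example
    "ø") pass through unchanged and would need a hand-written mapping.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The five names quoted in the message above.
NAMES = [
    "Côte d'Ivoire",
    "Curaçao",
    "Åland Islands",
    "Saint Barthélemy",
    "Réunion",
]

for name in NAMES:
    print(f"{name} -> {asciify(name)}")
```

The apostrophe and spaces survive untouched, so turning the result into an
identifier would still need a separate decision about those characters.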

> 4) So maybe the solution is to use 2-letter or 3-letter ISO codes, instead
> of the shortened, quasi-English versions of the country names. So we get
> things like "CA/Eastern" or "CAN/Eastern", instead of "Canada/Eastern". Not
> very satisfying for Canadians or many other countries (except for Americans
> whose ISO codes "US" and "USA" match their colloquial usage perfectly).

That would be somewhat less controversial.  Though note Clive's 
observation about GB/UK.

Philip

-- 
Philip Paeps
Senior Reality Engineer
Alternative Enterprises


