[tz] CLDR UN/LOCODE usage
mail.2012 at tobiasconradi.com
Sun Apr 21 11:24:34 UTC 2013
Was: Re: [tz] Proposal to change Macquarie Island to be Australian territory
A reply to: http://mm.icann.org/pipermail/tz/2013-April/019135.html
First LOCODE message in the thread was at
On Sat, Apr 20, 2013 at 3:53 PM, Mark Davis ☕ <mark at macchiato.com> wrote:
> On Fri, Apr 19, 2013 at 12:35 PM, Tobias Conradi
> <mail.2012 at tobiasconradi.com> wrote:
>> On Fri, Apr 19, 2013 at 11:17 AM, Mark Davis ☕ <mark at macchiato.com> wrote:
>> >> Variable length and inconsistent country code usage:
>> > This is a misunderstanding.
>> Of the UN LOCODE system, on the side of CLDR designers.
>> > CLDR is specified to use 5 letter UN LOCODEs where they exist. Where
>> > they do
>> > not exist, it is specified to use a non-5-letter code, precisely so that
>> > they do not overlap with future UN LOCODEs.
>> That could have been easily achieved with 5-letter codes, as has been
>> shown at:
>> http://mm.icann.org/pipermail/tz/2012-May/017974.html and
>> If overlap prevention was the goal, then there was a misunderstanding of the
>> UN LOCODE system on the side of CLDR.
> The only way that there would be a problem for CLDR is if the UN LOCODEs
> could be other than 5 alphanumerics.
It may be true that there is no /problem for CLDR/, but the claim was
that CLDR has "Variable length and inconsistent country code usage".
> According to
> http://www.unece.org/fileadmin/DAM/cefact/locode/unlocode_manual.pdf, the UN
> LOCODE would be 5 letters and possibly numbers (in positions 3-5). As long
> as any additional CLDR identifiers did not match that, there wouldn't be an
> So I don't understand your contention. Are you saying that UN LOCODEs, can
> in fact consist of other than 5 alphanumerics?
I don't say so, and AFAICS never said so.
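
To make the shape under discussion concrete, here is a minimal sketch. It assumes the form as summarized in the quoted paragraph (two letters, then three letters or digits); it is not the official UN/LOCODE validation rule, and the name is_locode_shaped is hypothetical.

```python
import re

# Assumption: the shape as summarized in the quoted text above, i.e. two
# letters followed by three letters or digits. This is an illustration,
# not the validation rule from the UN/LOCODE manual itself.
LOCODE_SHAPE = re.compile(r"[A-Za-z]{2}[A-Za-z0-9]{3}")


def is_locode_shaped(identifier: str) -> bool:
    """Return True if the identifier has the 5-character shape described above."""
    return LOCODE_SHAPE.fullmatch(identifier) is not None
```

Under this check "ushnl" is LOCODE-shaped, while a longer identifier such as "usnavajo" is not, so the two sets cannot overlap whatever length the second set uses, as long as it avoids this one shape.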
>> > When the codes are not 5
>> > letters, the first two letters have no meaning.
>> Except where they have. And for some from the US, even the first 4
>> have a meaning:
>> debsngn: de = Germany
>> gldkshvn: gl = Greenland
>> mxstis: mx = Mexico
>> rukhndg: ru = Russia
>> ruunera: ru = Russia
>> usinvev: us = US (usin = Indiana)
>> usnavajo: us = US
>> usndcnt: us = US (usnd = North Dakota)
>> usndnsl: us = US (usnd = North Dakota)
> I should have said: When the codes are not 5 alphanumerics, the first two
> letters do not *necessarily* denote a country code.
That was my starting claim: "inconsistent country code usage"
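
A rough sketch of what the corrected rule means for a consumer of the identifiers: only for 5-alphanumeric codes may the first two letters be read as a country code. The function name country_hint and the regular expression are assumptions for illustration; the example codes and their country readings are the ones quoted above.

```python
import re
from typing import Optional

# Assumption: "5 alphanumerics" is taken as two letters plus three letters
# or digits, following the description quoted earlier in this message.
_FIVE_ALNUM = re.compile(r"[A-Za-z]{2}[A-Za-z0-9]{3}")


def country_hint(identifier: str) -> Optional[str]:
    """Return the leading two letters only when the rule allows reading them."""
    if _FIVE_ALNUM.fullmatch(identifier):
        return identifier[:2].lower()
    return None  # for other lengths the prefix is not guaranteed to mean anything


# Codes and country readings exactly as listed in the quoted text above.
QUOTED_EXAMPLES = {
    "debsngn": "de", "gldkshvn": "gl", "mxstis": "mx",
    "rukhndg": "ru", "ruunera": "ru", "usinvev": "us",
    "usnavajo": "us", "usndcnt": "us", "usndnsl": "us",
}

# Codes whose prefix visibly is a country code, yet must be ignored under the rule.
discarded = {code: cc for code, cc in QUOTED_EXAMPLES.items()
             if country_hint(code) is None and code.startswith(cc)}
```

Every one of the quoted examples ends up in discarded: the prefix carries a country meaning in practice, but a consumer following the rule cannot rely on it.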
>> > The codes are stabilized, meaning that they will not change no matter
>> > what
>> > changes happen in the base codes. So if Hawaii leaves the US and joins
>> > Canada as a new province, "ushnl" would not change in CLDR even if the
>> > UN
>> > LOCODE changes to "cahnl" or something else.
>> And combining this with "When the codes are not 5 letters, the first
>> two letters have no meaning.", assuming that, if they have 5 letters
>> they /may/ have a meaning, leads to the fact that the meaning in CLDR
>> would be different from the meaning in the UN LOCODE system.
> I don't understand what you are saying: that would only be an issue if there
> were overlaps.
What don't you understand in: "the meaning in CLDR would be different
from the meaning in the UN LOCODE system"?
The meaning of the first two letters in the UN/LOCODE system, except
for private range codes, is to give the currently valid ISO 3166-1
alpha-2 code of the country in which the location identified by the
UN/LOCODE lies.
>> A statement on why a country relation was not seen suitable for time
>> zone identifiers can be found at:
>> "Be robust in the presence of political changes."
> That's fine for TZ codes. As far as CLDR is concerned, however, the main
> goal is to have an unambiguous, stable ID that is from 3-8 ASCII
> alphanumerics long and maps to TZ codes. The 3-8 ASCII limitation is imposed
> by BCP47, so we could not use the TZ codes unmodified. We could have chosen
> to simply number them, but decided that the UN LOCODEs would be a bit
> simpler to deal with.
A "3-8 ASCII limitation [...] imposed by BCP47" would have allowed for
fixed-length 5-character codes.
As emailed here:
"It is common for identifiers created by ISO to have fixed length, e.g.
ISWC, ISSN, ISMN, ISRC, ISBN, ISNI, ISAN, ISO country codes, ISO
But CLDR, for an unstated reason, chose variable length.
If one lists them in a table, the column containing them is
unnecessarily wider. If one puts them as labels on a map, they are
harder to fit.
In both cases, the non-5-character codes draw extra attention from the
reader.
Maybe all new assignments can be restricted to 5-character codes?
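
The table-width point can be sketched as follows. This is only an illustration: the helper names column_width and padding_cells are hypothetical, and the codes are the ones quoted earlier in this message.

```python
from typing import List

# Codes taken from the quoted text earlier in this message.
MIXED_CODES: List[str] = [
    "debsngn", "gldkshvn", "mxstis", "rukhndg", "ruunera",
    "usinvev", "usnavajo", "usndcnt", "usndnsl", "ushnl",
]


def column_width(codes: List[str]) -> int:
    """A plain-text table column must be as wide as its longest entry."""
    return max(len(code) for code in codes)


def padding_cells(codes: List[str]) -> int:
    """Total number of blank cells needed to pad every entry to the column width."""
    width = column_width(codes)
    return sum(width - len(code) for code in codes)


def render_column(codes: List[str]) -> List[str]:
    """Left-justify every code to the common column width."""
    width = column_width(codes)
    return [code.ljust(width) for code in codes]
```

For the mixed list the column has to be 8 characters wide because of the longest entries, whereas a column holding only 5-character codes would need 5 and no padding at all.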