[tz] [PATCH] Change abbreviation of Sri Lanka standard time to SLST

Wed Oct 19 09:40:17 UTC 2016

> On Oct 19, 2016, at 1:39 AM, Pavel V. Rochnyack <rpv at nikolas.ru> wrote:
> 
> 19.10.2016 14:41, Guy Harris wrote:
>> On Oct 18, 2016, at 10:41 PM, Pavel V. Rochnyack <rpv at nikolas.ru>
>>  wrote:
>> 
>> 
>>> 19.10.2016 12:20, Paul Eggert wrote:
>>> 
>>>> Sadika Sumanapala wrote:
>>>> 
>>>>> Sri Lanka standard time is SLST.
>>>>> 
>>>> We can switch to "SLST" later if it catches on in the broader English-language community.
>>>> 
>>>  ... and if your country is not English-speaking you have to go away with your wishes (like Russia and rest of exUSSR).
>>> 
>> If your country is not English-speaking, your software vendor should be providing their own translations of time zone abbreviations
>> 
>> 
> 
> Sadika Sumanapala, did you get the answer?  If you want abbreviation to be changed, then your software vendor should provide their own time zone database.
> Did you create support ticket already?
> 
>> If your country is not English-speaking, your software vendor should be providing their own translations of time zone abbreviations
> 
> Whom do you mean by "your software vendor"? If I'm using Debian, then you propose Debian community to build localized tzdb versions for different countries?

No, I propose that they (and every other UN*X on the planet) have the code that provides time zone abbreviations use the user's locale to look up the abbreviation in the CLDR, and only use the tzdb's abbreviation if they can't find one in the CLDR.

> Later, if some country will change its timezone offset then you also will answer "your software vendor should be providing their own ..." ?

No, I wouldn't.  Time zone offsets and DST rules don't require localization; time zone abbreviations do.

>>  or using the Unicode CLDR for translated abbreviations.  
>> 
>> It's not the job of the tzdb maintainers to provide non-English-language abbreviations; 
> 
> Then tzdb should not provide abbreviations at all.

Ideally, no, it shouldn't; as far as I'm concerned, the only reason for providing them in tzdb is that there are UN*X APIs that supply them, so the tzcode sample implementation of the UN*X time zone APIs needs some way of providing them.

Actual UN*Xes that provide those APIs should use the CLDR (or, for people who like reinventing the wheel, use their own tables) to get the time zone abbreviation, and only use the tzdb abbreviations if looking up the locale's abbreviation in the CLDR fails.
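A minimal sketch of that lookup order, in Python for brevity.  The SAMPLE_CLDR table and the abbreviation() helper are hypothetical stand-ins, not real CLDR data structures or any existing API; a real implementation would read the CLDR data for the user's locale instead of a hand-written dictionary.

```python
# Hypothetical stand-in for CLDR data: locale -> tzid -> standard-time abbreviation.
# Only the Slovak entry is taken from the CLDR; the layout is an assumption.
SAMPLE_CLDR = {
    "sk": {"Europe/Berlin": "SEČ"},
}


def abbreviation(locale, tzid, tzdb_abbr, cldr=SAMPLE_CLDR):
    """Prefer the locale's abbreviation; fall back to the tzdb one."""
    return cldr.get(locale, {}).get(tzid, tzdb_abbr)


print(abbreviation("sk", "Europe/Berlin", "CET"))  # found in the sample table
print(abbreviation("si", "Asia/Colombo", "IST"))   # not found, tzdb value is used
```

The tzdb abbreviation is passed in as the default, so the tzdb value is only ever seen when the locale lookup fails.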

This does, however, raise the question of whether UNIX(R) systems are allowed to provide anything other than characters in the "portable character set".  (Non-UNIX(R) systems, such as the *BSDs and Linux, aren't required to impose such a limitation, but macOS/Solaris/AIX/HP-UX are.)  The Single UNIX Specification page for tzname:

	http://pubs.opengroup.org/onlinepubs/9699919799/functions/tzname.html

says

	The tzset() function shall use the value of the environment variable TZ to set time conversion information used by ctime, localtime, mktime, and strftime. If TZ is absent from the environment, implementation-defined default timezone information shall be used.

	The tzset() function shall set the external variable tzname as follows:

	tzname[0] = "std";
	tzname[1] = "dst";

	where std and dst are as described in XBD Environment Variables.

and XBD Environment Variables:

	http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08

says

	TZ
	     This variable shall represent timezone information. The contents of the environment variable named TZ shall be used by the ctime(), ctime_r(), localtime(), localtime_r(), strftime(), and mktime() functions, and by various utilities, to override the default timezone. The value of TZ has one of the two forms (spaces inserted for clarity):

	     :characters

	     or:

	     std offset dst offset, rule

	     If TZ is of the first format (that is, if the first character is a <colon>), the characters following the <colon> are handled in an implementation-defined manner.

	     The expanded format (for all TZs whose value does not have a <colon> as the first character) is as follows:

	     stdoffset[dst[offset][,start[/time],end[/time]]]

	     Where:

	     std and dst

	          Indicate no less than three, nor more than {TZNAME_MAX}, bytes that are the designation for the standard (std) or the alternative (dst - such as Daylight Savings Time) timezone. Only std is required; if dst is missing, then the alternative time does not apply in this locale.

	          Each of these fields may occur in either of two formats quoted or unquoted:

	               * In the quoted form, the first character shall be the <less-than-sign> ( '<' ) character and the last character shall be the <greater-than-sign> ( '>' ) character. All characters between these quoting characters shall be alphanumeric characters from the portable character set in the current locale, the <plus-sign> ( '+' ) character, or the <hyphen-minus> ( '-' ) character. The std and dst fields in this case shall not include the quoting characters.

	               * In the unquoted form, all characters in these fields shall be alphabetic characters from the portable character set in the current locale.

	          The interpretation of these fields is unspecified if either field is less than three bytes (except for the case when dst is missing), more than {TZNAME_MAX} bytes, or if they contain characters other than those specified.

and the Portable Character Set:

	http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html

is a subset of ASCII, so no accented Roman-alphabet letters, no non-Roman-alphabet letters, no CJK characters, etc.
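To make the quoted rules concrete, here is a rough sketch of a checker for a single std or dst field.  The function name is_valid_designation is made up for illustration, and the default tzname_max of 6 is an assumption (it is the minimum POSIX guarantees for {TZNAME_MAX}; real systems may allow more).  Where the checker returns False, the spec only says the interpretation is unspecified, not that the value must be rejected.

```python
import string

# Alphabetic / alphanumeric characters of the portable character set (ASCII only).
_LETTERS = set(string.ascii_letters)
_QUOTED_OK = set(string.ascii_letters + string.digits + "+-")


def is_valid_designation(field, tzname_max=6):
    """Check one std/dst field against the quoted and unquoted forms."""
    if len(field) >= 2 and field.startswith("<") and field.endswith(">"):
        # Quoted form: the quoting characters are not part of the name.
        name, allowed = field[1:-1], _QUOTED_OK
    else:
        # Unquoted form: alphabetic characters only.
        name, allowed = field, _LETTERS
    if not all(ch in allowed for ch in name):
        return False
    # All characters are ASCII here, so characters and bytes coincide.
    return 3 <= len(name) <= tzname_max


for candidate in ["SLST", "<+0530>", "+0530", "AB", "SEČ"]:
    print(candidate, is_valid_designation(candidate))
```

Note that the check deliberately uses an explicit ASCII set rather than str.isalpha(), which would accept letters such as "Č".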

However, the "std" and "dst" there refer to the contents of the TZ environment variable when it's a standard POSIX setting.  For a POSIX setting, the tzdb abbreviations aren't used; the values from the environment variable are used instead.
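This is easy to observe on a typical UN*X.  The sketch below uses Python's time.tzset(), which is only available on UN*X-like platforms; the helper name tzname_for is made up, and the TZ values are arbitrary examples rather than anything taken from tzdb.

```python
import os
import time


def tzname_for(tz_value):
    """Return time.tzname as seen with TZ set to tz_value, then restore TZ."""
    saved = os.environ.get("TZ")
    os.environ["TZ"] = tz_value
    time.tzset()
    try:
        return time.tzname
    finally:
        # Put the environment back the way we found it.
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()


# POSIX offsets are positive west of Greenwich, so UTC+5:30 is written -5:30.
print(tzname_for("SLST-5:30"))
# Made-up designations, to show they come straight from the variable.
print(tzname_for("ABC5DEF,M3.2.0,M11.1.0"))
```

With a POSIX-style value, whatever designations appear in TZ are what the time zone APIs report, regardless of what tzdb calls the zone.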

So it's not clear what constraints, if any, are placed on the members of tzname[] if TZ is set to :{tzid} or just to {tzid}, e.g. :Europe/Berlin or Europe/Berlin.  If the "portable character set" limitation applies only to POSIX settings of TZ, tzname could be set to a string with arbitrary characters, such as "SEČ" (which comes from the CLDR file for Slovak; it's the abbreviation for Central European Standard Time, although the Slovak for "standard" isn't part of the abbreviation).