[tz] OpenJDK/CLDR/ICU/Joda issues with Ireland change
scolebourne at joda.org
Thu Jan 25 11:25:43 UTC 2018
On 24 January 2018 at 18:44, Guy Harris <guy at alum.mit.edu> wrote:
> On Jan 23, 2018, at 11:55 AM, Yoshito Umaoka <yoshito_umaoka at us.ibm.com> wrote:
>> CLDR does not determine offsets.
> Stephen Colebourne claimed that CLDR determines whether to use the standard or daylight time strings by comparing the "raw offset" (presumably meaning "the offset during standard time") with the "actual offset" (presumably meaning "the offset during daylight savings time").
raw-offset = the base/standard offset from TZDB (GMTOFF)
actual-offset = the actual offset a person sees at a given instant
(GMTOFF + SAVE)
Saying (raw-offset == actual-offset) is the same as saying (SAVE == 0).
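To make the terminology concrete, here is a minimal sketch using java.time, where ZoneRules.getStandardOffset() corresponds to raw-offset and ZoneRules.getOffset() to actual-offset. The class and method names (OffsetTerms, rawOffset, actualOffset, save) are mine, purely for illustration. I use America/New_York because its data is uncontroversial; what Europe/Dublin reports depends on which TZDB version the JDK bundles and how it was compiled.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.zone.ZoneRules;

final class OffsetTerms {
    // raw-offset: the base/standard offset in force at the instant (GMTOFF).
    static ZoneOffset rawOffset(ZoneRules rules, Instant at) {
        return rules.getStandardOffset(at);
    }

    // actual-offset: what a person sees at the instant (GMTOFF + SAVE).
    static ZoneOffset actualOffset(ZoneRules rules, Instant at) {
        return rules.getOffset(at);
    }

    // SAVE as java.time reports it: actual-offset minus raw-offset.
    static Duration save(ZoneRules rules, Instant at) {
        return rules.getDaylightSavings(at);
    }

    public static void main(String[] args) {
        // Illustrative zone and instants only (assumption: any zone with DST would do).
        ZoneRules rules = ZoneId.of("America/New_York").getRules();
        Instant[] samples = {
            Instant.parse("2018-01-25T12:00:00Z"),
            Instant.parse("2018-07-25T12:00:00Z")
        };
        for (Instant at : samples) {
            boolean sameOffset = rawOffset(rules, at).equals(actualOffset(rules, at));
            boolean saveIsZero = save(rules, at).isZero();
            // The two booleans always agree: (raw == actual) is the same as (SAVE == 0).
            System.out.println(at + " raw=" + rawOffset(rules, at)
                    + " actual=" + actualOffset(rules, at)
                    + " raw==actual:" + sameOffset + " SAVE==0:" + saveIsZero);
        }
    }
}
```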
> Therefore, it *must* know those offsets, otherwise it cannot compare them.
> So let me rephrase the question:
> How does CLDR obtain those offsets?
Mostly answered by Yoshito/Mark.
- TZDB provides data on how offsets change over time, indicating a
base/raw/standard offset and an adjustment/SAVE when DST applies.
- CLDR provides data on zone names, keyed by "generic", "standard", "daylight".
- For Ireland, CLDR states that "standard" = winter = "Greenwich Mean Time"
- For Ireland, CLDR states that "daylight" = summer = "Irish Standard Time"
- Some piece of code has to decide whether to pick the "standard" or
"daylight" CLDR key based on the TZDB data.
- ICU & OpenJDK both parse the source TZDB files (as data is lost in
the conversion to binary)
- ICU & OpenJDK use this mechanism to pick the key, treating
(raw-offset == actual-offset) as "standard".
- For Ireland, TZDB currently indicates (raw-offset == actual-offset) in winter
- For Ireland, TZDB is proposing to indicate (raw-offset ==
actual-offset) in summer
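The key selection described in the list above can be sketched as follows. This is not the ICU or OpenJDK code; ZoneNameKeySketch, pickKey and displayName are hypothetical names, and offsets are plain seconds. The Ireland figures are my reading of the thread: today GMTOFF is 0 with SAVE +1h in summer, and the proposal is GMTOFF +1h with SAVE -1h in winter.

```java
import java.util.Map;

final class ZoneNameKeySketch {
    // CLDR names for Ireland as they stand today, per the list above.
    static final Map<String, String> OLD_CLDR_IRELAND = Map.of(
            "standard", "Greenwich Mean Time",
            "daylight", "Irish Standard Time");

    // Hypothetical helper: pick the CLDR key by comparing raw and actual offset.
    static String pickKey(int gmtoffSeconds, int saveSeconds) {
        int rawOffset = gmtoffSeconds;
        int actualOffset = gmtoffSeconds + saveSeconds;
        return rawOffset == actualOffset ? "standard" : "daylight";
    }

    // Look up the display name for the key chosen from the TZDB values.
    static String displayName(Map<String, String> names, int gmtoffSeconds, int saveSeconds) {
        return names.get(pickKey(gmtoffSeconds, saveSeconds));
    }

    public static void main(String[] args) {
        // Current TZDB: raw == actual in winter.
        System.out.println("current  winter: " + displayName(OLD_CLDR_IRELAND, 0, 0));
        System.out.println("current  summer: " + displayName(OLD_CLDR_IRELAND, 0, 3600));
        // Proposed TZDB (assumed values): raw == actual in summer, negative SAVE in winter.
        System.out.println("proposed winter: " + displayName(OLD_CLDR_IRELAND, 3600, -3600));
        System.out.println("proposed summer: " + displayName(OLD_CLDR_IRELAND, 3600, 0));
    }
}
```

With the old CLDR names, the two "proposed" lines come out swapped relative to the "current" ones, because the key flips while the text behind each key does not.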
To adjust to the Ireland proposal, ICU & OpenJDK code (and all similar
code) would need to handle negative SAVE values. The evidence so far
is that this task is not complex, and already works in many cases.
To adjust to the Ireland proposal, CLDR would have to change the text
associated with the keys "standard" and "daylight" to the opposite of
what they are today.
Therefore, there are 8 possible combinations to consider:
- new code, new TZDB, new CLDR - works fine
- new code, new TZDB, old CLDR - wrong names
- new code, old TZDB, new CLDR - wrong names
- new code, old TZDB, old CLDR - works fine
- old code, new TZDB, new CLDR - code may fail
- old code, new TZDB, old CLDR - code may fail & wrong names
- old code, old TZDB, new CLDR - wrong names
- old code, old TZDB, old CLDR - works fine
All of these combinations can occur in the wild. It is
not possible to ensure that only working combinations exist
(especially considering the old code cases). Of the four cases where
TZDB changes, three result in failure.
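The eight outcomes follow from two simple rules, which can be sketched as below. The rules are my summary of the list, not anything taken from a library: old code may fail whenever it meets new TZDB, and the names are wrong whenever the TZDB and CLDR versions disagree. CombinationMatrix and its methods are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

final class CombinationMatrix {
    // Outcome for one combination, using the two rules stated in the lead-in.
    static String outcome(boolean newCode, boolean newTzdb, boolean newCldr) {
        boolean codeMayFail = !newCode && newTzdb; // old code meets negative SAVE
        boolean wrongNames = newTzdb != newCldr;   // key meaning and text disagree
        if (codeMayFail && wrongNames) return "code may fail & wrong names";
        if (codeMayFail) return "code may fail";
        if (wrongNames) return "wrong names";
        return "works fine";
    }

    static String label(boolean isNew) {
        return isNew ? "new" : "old";
    }

    // All eight rows, in the same order as the list in the message.
    static List<String> allRows() {
        List<String> rows = new ArrayList<>();
        boolean[] newFirst = { true, false };
        for (boolean code : newFirst) {
            for (boolean tzdb : newFirst) {
                for (boolean cldr : newFirst) {
                    rows.add(label(code) + " code, " + label(tzdb) + " TZDB, "
                            + label(cldr) + " CLDR - " + outcome(code, tzdb, cldr));
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : allRows()) {
            System.out.println("- " + row);
        }
    }
}
```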
And note that this only discusses one piece of new code. In reality,
ICU, OpenJDK, Joda-Time, ThreeTen-Backport, Android and other
libraries all exist. Each of these can be in new-code vs old-code
form, so instead of this being 8 combinations, it could easily be 16,
32, 64 or more.
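As a rough count, assuming each library is independently either old or new, and TZDB and CLDR are each either old or new, the number of combinations is 2^(n+2) for n libraries. A small sketch (CombinationCount is an illustrative name):

```java
final class CombinationCount {
    // Two states each for TZDB and CLDR, and two for every library (assumed independent).
    static long combinations(int libraries) {
        if (libraries < 0 || libraries > 60) {
            throw new IllegalArgumentException("libraries must be between 0 and 60");
        }
        return 1L << (libraries + 2);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " libraries: " + combinations(n) + " combinations");
        }
    }
}
```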
Perhaps now, readers can see why I say this is not just a code bug
that can be fixed. It is the interplay between old and new versions of
code and data that makes the change impossible. (It simply isn't
possible to update everything in lock-step).
Finally, the Ireland situation has been known about in TZDB since 2005:
Common sense prevailed back then, with the SAVE value remaining
positive. (The zic binary output doesn't care whether SAVE is positive
or negative, other than the tm_isdst flag, which everyone here seems to
think is an anachronism in zic.)