[tz] [PATCH 2/3] Replace some zones with links when that doesn't lose non-LMT info.

Stephen Colebourne scolebourne at joda.org
Wed Sep 4 16:24:54 UTC 2013


On 4 September 2013 15:32, Paul Eggert <eggert at cs.ucla.edu> wrote:
>> ubiquity is a key value of the data. The same data is used
>> everywhere from Unix to Java to mobile phones.
>
> No, it's pretty routinely filtered before it hits many
> platforms.  One example: QNX has unsigned time_t, which by
> design filters out all data before 1970.
>
> Furthermore, there is an inevitable delay in propagating
> changes to the field.  Even if we're talking a single host
> with 64-bit signed time_t (so that it matches Java's
> 'long'), I've seen situations where Java's copy of the data
> disagrees with the POSIX copy.  And certainly a distributed
> application cannot assume ubiquity, as the client and server
> may be updated at different times.  So, for various reasons
> unrelated to the proposed changes, it's already the case
> that applications cannot assume that the data are ubiquitous
> and that the same data are used everywhere.

Not the ubiquity I meant. I meant that if there exists a fork of tzdb,
then it will over time diverge. If Scotland has its own time different
from England, then one tzdb might name it Edinburgh and the forked
tzdb uses Glasgow. That divergence, or non-ubiquity, would be very
unhelpful to everyone that needs time-zone data.


>> I'm not speaking on behalf of myself, but on behalf of Java
>> development generally.
> These comments would have more weight if they pointed to
> user problems that occurred when we made similar changes in
> the past.  Based on my experience I'm skeptical that there
> were significant user problems.  I've asked the list for
> reports of problems but nobody else has reported problems
> either.  This suggests that the concerns are misplaced.

The data is used far more widely than just zic. Some of those uses
work directly from the source tzdb data. Some of those uses expose
source tzdb data that is not exposed via the zic binary. Thus, it
certainly is the case that there are people affected by each change.
Chances are they won't know it until a month or two down the line.

For example, the removal of "Castries Mean Time" and "Kingstown Mean
Time" will be visible in Joda-Time, and the change to the end of LMT
will be visible in Joda-Time and JSR-310.

Most people will adapt, not complain. But it is clearly a fact to say
that data has been deleted and that deletion is observable to
consumers of the data. The only argument is whether that data was
sufficiently in error to warrant deletion.


> On this list I have also noted that the changes promise to
> make life easier for users in some cases, by omitting
> irrelevant choices.  This is a real advantage that should
> trump stability concerns.

There are other ways (winnowing) to reduce the selection problem,
because that problem is based off zic. For those of us parsing the
source tzdb files directly, any data loss is data loss.


>> zone ID merging that loses the start date of offsets or abbreviations,
>> even if those are guesswork/invented (because the replacement is not
>> an enhancement, it's worse).
>
> I've had quite a bit of experience in dealing with the
> Shanks data.  From my experience the proposed change is a
> fairer representation of what we know than the previous
> version was.  You're right that we don't know that the new
> version is correct and the old is wrong (both are guesses),
> but it's not right to say that the new version is worse.

> Changes like this are a longstanding part of maintenance,
> and I'm becoming inclined to think that we shouldn't
> discontinue this practice purely from a desire to not
> change things.

Again, I don't want to stop enhancements, that would be
counter-productive to all.

However, by making the change you are asserting that it is more true
that all 10 locations in the Caribbean have had the same local time
since the year dot than that they had different times (that we just
don't know anything about). That is, the fact that you have made a
change at all implies that the data should now be more reliable. Yet
you accept that both the before and after are guesses.

So, if you can reply to this email and say "with additional research I
can say with reasonable certainty that all 10 locations have always
had the same local time" then the change is entirely justified. If you
can't, then the change should be reverted as not based on enough
evidence. (To put it another way, the barrier required for changing
the ID like this is higher than the barrier was when the ID was
created in the first place.)

For example, America/Curacao and America/Aruba have had exactly the
same time since the year dot apart from the LMT value (they do have the
same LMT end date). As such, a Link is appropriate.

Whereas, America/Tortola and America/Port_of_Spain have different LMT
end dates. This means that Joda-Time users will see a change in local
time when querying between 1911-07-01 and 1912-03-02.
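
The distinction between those two pairs can be sketched in a few lines.
This is only an illustration in Python, not Joda-Time or JSR-310 code.
The zone tables are hand-written assumptions: only the two LMT end
dates for Tortola and Port_of_Spain come from this message, while the
offsets, the Curacao/Aruba dates and the "ANT" abbreviation are
illustrative values rather than authoritative tzdb data. The helper
names are hypothetical, and "until" values are treated as UTC instants
for simplicity.

```python
from datetime import datetime, timedelta

# Each zone is an ordered list of (until, utc_offset, abbreviation) eras.
# until=None marks the final, open-ended era.
# ASSUMPTION: all offsets below are illustrative; only the 1911-07-01 and
# 1912-03-02 LMT end dates are taken from the message.
AST = timedelta(hours=-4)

TORTOLA = [
    (datetime(1911, 7, 1), timedelta(hours=-4, minutes=-18, seconds=-28), "LMT"),
    (None, AST, "AST"),
]
PORT_OF_SPAIN = [
    (datetime(1912, 3, 2), timedelta(hours=-4, minutes=-6, seconds=-4), "LMT"),
    (None, AST, "AST"),
]
CURACAO = [
    (datetime(1912, 2, 12), timedelta(hours=-4, minutes=-35, seconds=-47), "LMT"),
    (datetime(1965, 1, 1), timedelta(hours=-4, minutes=-30), "ANT"),
    (None, AST, "AST"),
]
ARUBA = [
    (datetime(1912, 2, 12), timedelta(hours=-4, minutes=-40, seconds=-24), "LMT"),
    (datetime(1965, 1, 1), timedelta(hours=-4, minutes=-30), "ANT"),
    (None, AST, "AST"),
]


def offset_at(zone, instant):
    """Return the UTC offset of the first era that has not ended at instant."""
    for until, offset, _abbr in zone:
        if until is None or instant < until:
            return offset
    raise ValueError("zone has no open-ended final era")


def same_apart_from_lmt_offset(a, b):
    """True if both zones share every era boundary and abbreviation, and
    their offsets differ, if at all, only in eras labelled LMT."""
    if len(a) != len(b):
        return False
    for (until_a, off_a, abbr_a), (until_b, off_b, abbr_b) in zip(a, b):
        if until_a != until_b or abbr_a != abbr_b:
            return False
        if abbr_a != "LMT" and off_a != off_b:
            return False
    return True


if __name__ == "__main__":
    print("Curacao/Aruba:", same_apart_from_lmt_offset(CURACAO, ARUBA))
    print("Tortola/Port_of_Spain:",
          same_apart_from_lmt_offset(TORTOLA, PORT_OF_SPAIN))
    probe = datetime(1911, 10, 1)  # falls between the two LMT end dates
    print(offset_at(TORTOLA, probe), offset_at(PORT_OF_SPAIN, probe))
```

Under these assumed tables, the first pair differs only in the LMT
offset itself, whereas the second pair disagrees about when LMT ended,
so a query inside that window returns a different offset depending on
which zone's history is used.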

More broadly, I'd suggest it would have been wiser to only suggest
this patch once the current mess has settled down.

Stephen

