[tz] Proposed reversions, for moving forward

Tim Parenti tim at timtimeonline.com
Fri Aug 8 22:25:11 UTC 2014


I'm going to attempt to synthesize a lot of the recent discussion with
respect to where I stand, as well as one way we could proceed...

On 8 August 2014 02:25, Alan Barrett <apb at cequrux.com> wrote:

> Yes, valuing correctness over stability is good, even when the new data is
> not 100% correct, provided it is more correct than the old data.
>
> The stability-related complaints have been about cases where the "more
> correct than the old data" condition was not perceived to be satisfied.
>
> I am gradually coming round to the opinion that the new data is probably
> more correct than the old data, but that is not clear to all observers.
>

I, too, am gradually coming around to this stance, and for the same
reasons.  Among the reasons I'm not yet fully on board:

On 5 August 2014 12:07, Paul Eggert <eggert at cs.ucla.edu> wrote:

> Marc Lehmann wrote:
>
>> I haven't seen anybody argue the new data is better.
>>
>
> It appears you overlooked some arguments in that direction; see <
> http://mm.icann.org/pipermail/tz/2014-August/021283.html> for example.


This post only addresses the changes to a few of the zones.  If you assert
that the rest of the changes are also better for similar reasons, that's
one thing, but to date, I don't think this has been done.  Depending on the
nature of the assertion(s), they may or may not require fuller
documentation to become convincing.

On 6 August 2014 14:32, Paul Eggert <eggert at cs.ucla.edu> wrote:

> Lester Caine wrote:
>
>> If it is proven wrong because there is a
>> proven correct version then OK, but switching one unproven fact with
>> another ...
>>
>
> Those changes mostly remove dubious data, rather than replacing one
> dubious datum with another.


I believe the relevant point made by objectors here is that, short of
deleting the identifiers altogether, there is no such thing as "removing"
data from an end user's perspective, because the format minimally requires
that something be there.  Date and time tools using tz will still output a
wall clock time for a given tz identifier and historic UNIX timestamp, and
the new assertions in this space are the ones which (in many cases) have no
more proven merit than the old versions.  I believe the distinction between
the actual Zone line data (and zic'd binaries) that we input and the
end-user "data" (wall clock times) that tools output is an important one to
make here, as it lies at the heart of much of the objection to these
changes:

On 6 August 2014 14:58, Lester Caine <lester at lsces.co.uk> wrote:

> If the removing of dubious data results in the answers generated
> changing then that is the stability that is objected to. These changes
> resulting new output that is only changing because of two lots of
> dubious states is the problem
>
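
To make the input/output distinction concrete, here is a minimal Python
sketch.  It is only an illustration under stated assumptions: the two
offsets below are invented, not real tz data, and the helper name
wall_clock is hypothetical.  The point is that a tool always returns some
wall clock time for a historic timestamp, so changing what an identifier
resolves to changes the answer rather than removing it.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical pre-1970 UTC offsets for one identifier under two releases.
# These values are invented for illustration; they are NOT real tz data.
OLD_RELEASE_OFFSET = timedelta(hours=1, minutes=23, seconds=45)  # zone's own early entry
NEW_RELEASE_OFFSET = timedelta(hours=1)  # offset inherited after becoming a link

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wall_clock(unix_seconds: int, utc_offset: timedelta) -> str:
    """Render a UNIX timestamp as wall clock time at a fixed UTC offset."""
    instant = EPOCH + timedelta(seconds=unix_seconds)
    return instant.astimezone(timezone(utc_offset)).strftime("%Y-%m-%d %H:%M:%S")


HISTORIC_TIMESTAMP = -1_577_923_200  # 1920-01-01 00:00:00 UTC

old_answer = wall_clock(HISTORIC_TIMESTAMP, OLD_RELEASE_OFFSET)
new_answer = wall_clock(HISTORIC_TIMESTAMP, NEW_RELEASE_OFFSET)

print(old_answer)  # 1920-01-01 01:23:45
print(new_answer)  # 1920-01-01 01:00:00
```

Neither release leaves the question unanswered; the user simply sees a
different answer for the same identifier and timestamp.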

As for the scope of the disruption caused by these changes, in general, I
find it difficult to buy either argument: that the supposed disruption is
(a) so large as to prohibit the change, or (b) so small as to override all
other concerns.  Due to the age of the timestamps affected, I'm more
inclined to lean toward the "small" side of this debate, but due to my
cautious nature I'm also inclined to overestimate the impacts by an order
of magnitude or two.  Further, the fact that no one has complained to us
does not mean that issues don't exist, or even that users have discovered
the ones that do.  In the end, I think reality is somewhere in the middle,
and that these are weak arguments on both sides.

* * *

On 6 August 2014 14:32, Paul Eggert <eggert at cs.ucla.edu> wrote:

> In the long run it'd be better to remove dubious data, or at least move it
> to a "dubious" area optionally available to users who prefer it; but one
> step at a time.


I think this is the direction we should move toward at this juncture.
Perhaps frustratingly, the first task would be to restore the zone data
(and associated commentary) removed in 2013e and 2014f to this new area.
(I would have suggested "attic" for this file, but "dubious" is more
straightforward and also avoids another file starting with "a").

From a build procedure standpoint, and to avoid disrupting the main
database, I think the simplest approach would be to add a Makefile target
which compiles the standard files as usual, then compiles the dubious data
with a separate call to zic.  If I am not mistaken, this would simply
overwrite the binaries created from links in the standard files with
binaries representing the more suspect data.  It's a bit of unnecessary
work for the compiler, yes, but this would simply factor into an
administrator's decision whether to include this data.
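
As a rough sketch of that two-pass build (written in Python rather than as
an actual Makefile target, purely for illustration): the file name
"dubious" is the hypothetical new file proposed above, the list of standard
files is a subset chosen for brevity, and the function names are made up.
Whether the second pass really overwrites the binaries produced from links
in the first pass is the assumption stated above, not something this sketch
verifies.

```python
import subprocess
from typing import List

# A subset of the standard tz source files, for brevity.
STANDARD_FILES = [
    "africa", "antarctica", "asia", "australasia",
    "europe", "northamerica", "southamerica",
]
# Hypothetical file holding the relegated zones; it does not exist today.
DUBIOUS_FILE = "dubious"


def zic_commands(dest_dir: str, include_dubious: bool) -> List[List[str]]:
    """Return the zic invocations for a build, in the order they should run."""
    commands = [["zic", "-d", dest_dir, *STANDARD_FILES]]
    if include_dubious:
        # Separate, later pass, so the main database is compiled untouched.
        commands.append(["zic", "-d", dest_dir, DUBIOUS_FILE])
    return commands


def build(dest_dir: str, include_dubious: bool = False) -> None:
    """Run each zic pass, stopping at the first failure."""
    for command in zic_commands(dest_dir, include_dubious):
        subprocess.run(command, check=True)


# Show what an administrator opting in to the dubious data would run.
for command in zic_commands("/tmp/zoneinfo", include_dubious=True):
    print(" ".join(command))
```

An administrator who does not want the suspect data would call
build(dest_dir) with the default, which runs only the first pass.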

On 5 August 2014 11:40, Stephen Colebourne <scolebourne at joda.org> wrote:

> Simply ploughing on with the changes, just in smaller batches, does
> not actually make the objectors happy
>

In terms of maintenance procedure, I think we need to be just as careful to
observe due diligence when relegating data as when adding new data.  The
work takes on a different tone when relegating dubious data, but it should
still be of importance to the project.

On 8 August 2014 05:07, John Hawkinson <jhawk at mit.edu> wrote:

> Perhaps this work should continue in bite-size portions on a branch
> until it is finally done, and only then should that branch be merged
> to the trunk and released.
>

And I think that, once the new file and build procedure are in place, this
is one reasonable way (among many) to move forward with this plan.

--
Tim Parenti