[tz] Fractional seconds in zic input
howard.hinnant at gmail.com
Mon Feb 5 17:27:35 UTC 2018
On Feb 5, 2018, at 12:02 PM, Paul Eggert <eggert at cs.ucla.edu> wrote:
> On 02/05/2018 04:55 AM, Howard Hinnant wrote:
>> 2. Doing this without specifying a maximum precision will mean the substantial breakage I speak of in 1) will happen every time the precision is increased.
> What sort of breakage do you see? Is the problem that different downstream users will compare calculations and disagree about the exact results because they use differing precisions? But we already have that problem, as at least one downstream user already discards sub-minute information, namely Kerry Shetline's recently-discussed tzdata compressor.
A tzdb compiler will either exactly represent all of the data contained in the tzdb or it won’t. Let’s assume for the moment that the tzdb compiler desires to exactly represent all of the data contained in the tzdb. To do that, it will have to exactly represent the UTC offsets to whatever precision is in the database. To do so in a practical way, the compiler is likely to choose a precision that is at least as fine as the finest precision UTC offset (today that is seconds precision). The compiler _could_ choose to represent precisions finer than the current finest precision offset, but such a choice is not free: it costs range. So there is pressure to design the compiler to not represent uselessly fine precisions.
Given that there is upward pressure on the finest precision that the compiler can handle, one must assume that at least some compilers, if not all of them, will be designed around whatever is the current finest precision in the database (today, seconds). Modifying a compiler to handle a precision finer than it was designed for is a moderate-sized rewrite, likely to break API and ABI in its interface if said compiler is in library form (as mine is, and others are).
So you’re going to break me (and most others) in the move from seconds to centiseconds. If a year from now you again move from centiseconds to milliseconds, you’re going to break me just as badly as the seconds to centiseconds move. If you keep breaking me, I’m eventually going to give up on you being a reliable source of data because I won’t be able to afford the maintenance. It won’t be that I won’t be able to keep up with the work, it will be that my customers won’t put up with my passing along your breakage to them in the form of API/ABI changes.
Seconds to centiseconds (or whatever) is going to be a huge amount of breakage for a very limited amount of benefit. It would be a mistake to do it once. It would be a colossal mistake to _plan_ on doing it multiple times.
But now let's take the second choice: the tzdb compiler does not exactly represent all of the data contained in the tzdb.
Now if two computers are given the same UTC time point, say to microseconds precision, and both computers map that time point to local time using the same time zone specification from the tzdb, they are no longer guaranteed to have equal local times when they communicate with each other (comparing their computed local times) over HTTP. This is a broken invariant that will inevitably lead to run-time errors.
So either tzdb compilers must universally represent all data in the tzdb exactly, or tzdb compilers must universally agree on the subset of data to extract from the tzdb so that they all produce the same mapping (identical mapping was also essentially the motivation for the relatively recently introduced machine-readable versioning). In the latter case, the portion of the data in the tzdb that is universally ignored by all tzdb compilers has zero benefit and non-zero cost: the programming effort to ignore it, and the risk of accidentally not ignoring it. Zero benefit over non-zero cost is a horrible benefit/cost ratio.