[tz] [PROPOSED] Improve leap second table truncation doc

Paul Eggert eggert at cs.ucla.edu
Thu Sep 16 07:10:39 UTC 2021

* NEWS: Mention possible successor to RFC 8536, and relaxation of
TZif reader’s restriction on gaps between leaps.
* tzfile.5: Improve wording on leap second expiration.  Specify
correction before truncated leap second table more accurately,
given how localtime.c behaves; for example, the first leap second
is considered to be positive if and only if its correction is
positive.  Say what to do after leap second table expires.  Add
TZ="XXX3EDT4,0/0,J365/23" example.  Say that positive leap seconds
not at end of localtime minute have not been a practical problem yet.
* tzfile.5, zic.8: Use \- (current font minus) instead of \(mi (math
font minus).
 NEWS     |  6 ++++-
 tzfile.5 | 77 ++++++++++++++++++++++++++++++++++++++------------------
 zic.8    |  2 +-
 3 files changed, 58 insertions(+), 27 deletions(-)

diff --git a/NEWS b/NEWS
index cc69995..00dea3c 100644
--- a/NEWS
+++ b/NEWS
@@ -86,7 +86,8 @@ Unreleased, experimental changes
     clients (including tzdb 2017c through 2021a) reject it, so
     "Expires" directives are currently disabled by default.  To enable
     them, set the EXPIRES_LINE Makefile variable.  If a TZif file uses
-    this new feature it is marked with a new TZif version number 4.
+    this new feature it is marked with a new TZif version number 4,
+    a format intended to be documented in a successor to RFC 8536.
     zic -L LEAPFILE -r @LO no longer generates an invalid TZif file
     that omits leap second information for the range LO..B when LO
@@ -98,6 +99,9 @@ Unreleased, experimental changes
     correction other than -1 or +1, and to contain adjacent
     transitions with equal corrections.  This supports TZif version 4.
+    The TZif reader now lets leap seconds occur less than 28 days
+    apart.  This supports possible future TZif extensions.
     Fix bug that caused 'localtime' etc. to crash when TZ was
     set to a all-year DST string like "EST5EDT4,0/0,J365/25" that does
     not conform to POSIX but does conform to Internet RFC 8536.
diff --git a/tzfile.5 b/tzfile.5
index bab6390..b1932bd 100644
--- a/tzfile.5
+++ b/tzfile.5
@@ -142,16 +142,26 @@ pairs of four-byte values, written in network byte order;
 the first value of each pair gives the nonnegative time
 (as returned by
 .BR time (2))
-at which a leap second occurs;
+at which a leap second occurs or at which the leap second table expires;
 the second is a signed integer specifying the correction, which is the
 .I total
 number of leap seconds to be applied during the time period
 starting at the given time.
-The pairs of values are sorted in ascending order by time.
-Each transition is for one leap second, either positive or negative;
-transitions always separated by at least 28 days minus 1 second.
-The first entry's correction is +1 (or \(mi1, for a hypothetical leap
-second table where the first leap second was negative).
+The pairs of values are sorted in strictly ascending order by time.
+Each pair denotes one leap second, either positive or negative,
+except that if the last pair has the same correction as the previous one,
+the last pair denotes the leap second table's expiration time.
+Each leap second is at the end of a UTC calendar month.
+The first leap second is positive if and only if its correction is positive,
+and the correction for each leap second after the first differs
+from the previous leap second by either 1 for a positive leap second,
+or \-1 for a negative leap second.
+If the leap second table is empty, the leap-second correction is zero
+for all timestamps;
+otherwise, for timestamps before the first correction time,
+the leap-second correction is zero if the first pair's correction is 1 or \-1,
+and is unspecified otherwise (which can happen only in files
+truncated at the start).
 .IP *
 .B tzh_ttisstdcnt
 standard/wall indicators, each stored as a one-byte boolean;
@@ -201,7 +211,7 @@ POSIX-TZ-environment-variable-style string for use in handling instants
 after the last transition time stored in the file
 or for all instants if the file has no transitions.
 The POSIX-style TZ string is empty (i.e., nothing between the newlines)
-if there is no POSIX representation for such instants.
+if there is no POSIX-style representation for such instants.
 If nonempty, the POSIX-style TZ string must agree with the local time
 type after the last transition time if present in the eight-byte data;
 for example, given the string
@@ -225,8 +235,8 @@ January 1 at 00:00 and ends December 31 at 24:00 plus the difference
 between daylight saving and standard time.
 .SS Version 4 format
 For version-4-format TZif files,
-the first leap second transition can have a correction that is neither
-+1 nor \(mi1, to support TZif files with reduced timestamp range.
+the first leap second record can have a correction that is neither
++1 nor \-1, to represent truncation of the TZif file at the start.
 Also, if two or more leap second transitions are present and the last
 entry's correction equals the previous one, the last entry
 denotes the expiration of the leap second table instead of a leap second;
@@ -237,32 +247,37 @@ the added leap seconds will change how post-expiration timestamps are treated.
 Future changes to the format may append more data.
 Version 1 files are considered a legacy format and
-should be avoided, as they do not support transition
+should not be generated, as they do not support transition
 times after the year 2038.
-Readers that only understand Version 1 must ignore
+Readers that understand only Version 1 must ignore
 any data that extends beyond the calculated end of the version
 1 data block.
 Other than version 1, writers should generate
 the lowest version number needed by a file's data.
 For example, a writer should generate a version 3 file
-if the file does not contain a truncated leap second table
+if its leap second table neither expires nor is truncated at the start
 and so does not use version 4 features, but
 TZ string extensions are necessary to accurately
 model transition times so the file does need version 3 features.
 The sequence of time changes defined by the version 1
-header and data block should be a contiguous subsequence
+header and data block should be a contiguous sub-sequence
 of the time changes defined by the version 2+ header and data
 block, and by the footer.
 This guideline helps obsolescent version 1 readers
 agree with current readers about timestamps within the
-contiguous subsequence.  It also lets writers not
+contiguous sub-sequence.  It also lets writers not
 supporting obsolescent readers use a
 .B tzh_timecnt
 of zero
 in the version 1 data block to save space.
+When a TZif file contains a leap second table expiration
+time, TZif readers should either refuse to process
+post-expiration timestamps, or process them as if the expiration
+time did not exist (possibly with an error indication).
 Time zone designations should consist of at least three (3)
 and no more than six (6) ASCII characters from the set of
@@ -272,7 +287,7 @@ and
 This is for compatibility with POSIX requirements for
 time zone abbreviations.
-When reading a version 2+ file, readers
+When reading a version 2 or higher file, readers
 should ignore the version 1 header and data block except for
 the purpose of skipping over them.
@@ -306,7 +321,7 @@ design goal has been that a reader can successfully use a TZif
 file even if the file is of a later TZif version than what the
 reader was designed for.
 When complete compatibility was not achieved, an attempt was
-made to limit glitches to rarely-used timestamps, and to allow
+made to limit glitches to rarely used timestamps and allow
 simple partial workarounds in writers designed to generate
 new-version data useful even for older-version readers.
 This section attempts to document these compatibility issues and
@@ -323,24 +338,33 @@ version 2+ data even if the reader's native timestamps have only
 32 bits.
 .IP *
 Some readers designed for version 2 might mishandle
-timestamps after a version 3 file's last transition, because
+timestamps after a version 3 or higher file's last transition, because
 they cannot parse extensions to POSIX in the TZ-like string.
 As a partial workaround, a writer can output more transitions
 than necessary, so that only far-future timestamps are
 mishandled by version 2 readers.
 .IP *
 Some readers designed for version 2 do not support
-permanent daylight saving time, e.g., a TZ string
+permanent daylight saving time with transitions after 24:00
+\(en e.g., a TZ string
 .q "EST5EDT,0/0,J365/25"
-denoting permanent Eastern Daylight Time (\-04).
-As a partial workaround, a writer can substitute standard time
-for the next time zone east, e.g.,
+denoting permanent Eastern Daylight Time
+(\-04).
+As a workaround, a writer can substitute standard time
+for two time zones east, e.g.,
+.q "XXX3EDT4,0/0,J365/23"
+for a time zone with a never-used standard time (XXX, \-03)
+and negative daylight saving time (EDT, \-04) all year.
+Alternatively,
+as a partial workaround a writer can substitute standard time
+for the next time zone east \(en e.g.,
 .q "AST4"
-for permanent Atlantic Standard Time (\-04).
+for permanent
+Atlantic Standard Time (\-04).
 .IP *
-Some readers designed for earlier versions reject version 4 files,
-because they require complete leap second tables that do not record
-expiration dates.
+Some readers designed for version 2 or 3, and that require strict
+conformance to RFC 8536, reject version 4 files whose leap second
+tables are truncated at the start or that end in expiration times.
 .IP *
 Some readers ignore the footer, and instead predict future
 timestamps from the time type of the last transition.
@@ -413,6 +437,9 @@ a positive leap second 78796801 (1972-06-30 23:59:60 UTC), they
 map both 78796800 and 78796801 to 01:23:45 local time the next day
 instead of mapping the latter to 01:23:46, and they map 78796815 to
 01:23:59 instead of to 01:23:60.
+This has not yet been a practical problem, since no civil authority
+has observed such UTC offsets since leap seconds were
+introduced in 1972.
 Some interoperability problems are reader bugs that
 are listed here mostly as warnings to developers of readers.
diff --git a/zic.8 b/zic.8
index 3149440..ea7bab9 100644
--- a/zic.8
+++ b/zic.8
@@ -550,7 +550,7 @@ using the shortest form that does not lose information, where
 .IR mm ,
 .I ss
-are the hours, minutes, and seconds east (+) or west (\(mi) of UT.
+are the hours, minutes, and seconds east (+) or west (\-) of UT.
 a slash (/)
 separates standard and daylight abbreviations.
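
For readers who want to check their understanding of the revised
tzfile.5 wording on the leap second table, here is a minimal Python
sketch of one way a reader might interpret the (time, correction)
pairs. It is not taken from localtime.c; the names split_leap_table
and correction_at are hypothetical, and the sketch assumes the pairs
have already been decoded from the file into integers. It takes the
second of the two options the patch gives for post-expiration
timestamps, processing them as if the expiration time did not exist.

```python
from typing import List, Optional, Tuple

# (time, total correction), already decoded from the TZif leap second records.
Pair = Tuple[int, int]


def split_leap_table(pairs: List[Pair]) -> Tuple[List[Pair], Optional[int]]:
    """Return (leap_seconds, expiration_time); expiration_time is None if absent."""
    # Times must be sorted in strictly ascending order.
    for (t0, _), (t1, _) in zip(pairs, pairs[1:]):
        if t1 <= t0:
            raise ValueError("times must be strictly ascending")
    leaps = list(pairs)
    expires = None
    # A last pair that repeats the previous correction denotes the table's
    # expiration time rather than a leap second.
    if len(leaps) >= 2 and leaps[-1][1] == leaps[-2][1]:
        expires = leaps.pop()[0]
    # Each leap second after the first changes the correction by +1 or -1.
    for (_, c0), (_, c1) in zip(leaps, leaps[1:]):
        if c1 - c0 not in (1, -1):
            raise ValueError("each leap second must change the correction by 1")
    return leaps, expires


def correction_at(pairs: List[Pair], t: int) -> Optional[int]:
    """Return the leap-second correction at time t, or None if unspecified.

    Post-expiration timestamps are treated as if no expiration time existed.
    """
    leaps, _ = split_leap_table(pairs)
    if not leaps:
        return 0  # empty table: zero correction for all timestamps
    first_time, first_corr = leaps[0]
    if t < first_time:
        # Zero before an untruncated table; unspecified if truncated at start.
        return 0 if first_corr in (1, -1) else None
    corr = first_corr
    for time, c in leaps:
        if time <= t:
            corr = c
    return corr
```

For a hypothetical table [(100, 1), (200, 2), (300, 2)], split_leap_table
treats 300 as the expiration time and the first two pairs as positive
leap seconds. For a table truncated at the start, such as [(200, 2)],
correction_at returns None for timestamps before 200, reflecting the
"unspecified" case in the new text.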
