[tz] data not represented in tzfiles

Paul Eggert eggert at cs.ucla.edu
Sun Sep 8 18:25:21 UTC 2013


Zefram wrote:

> What you've actually
> written there, because of the way the transition time-of-day gets
> interpreted, has an hour each year of standard time.  If you fix that
> (glossing over the question of whether it can be fixed), you're calling
> for two transitions to occur simultaneously, the behaviour of which is
> not well defined.

Good catch.  Since POSIX doesn't say what to do when the
end-DST and start-DST transitions are simultaneous, this
idea relies on an extension to POSIX.  Since we're already
relying on other extensions to POSIX as part of the recent changes,
it should be OK to rely on this one as well, so long as we
document what we're doing.  Here's a further patch to do that.
I plan to look into the other points your email
mentions after addressing Arthur David Olson's points.
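
To see why the two transitions in the new string coincide, here is a
small C sketch of the arithmetic.  It is illustrative only and is not
part of tzcode or of the patch: transition_utc and the other helper
names are made up for this example.  It assumes the usual POSIX
conventions that a start time is given in standard time, an end time
in daylight saving time, a missing time defaults to 02:00, and DST is
one hour ahead of standard time when no DST offset is given.

```c
#include <assert.h>
#include <stdbool.h>

enum { SECS_PER_HOUR = 3600, SECS_PER_DAY = 86400 };

/* Return the number of seconds after 00:00:00 UTC on January 1 at
   which a transition occurs.  DAY0 is the zero-based day of the year,
   TOD is the wall-clock time of day in seconds, and WEST is the UTC
   offset (seconds west of Greenwich, as in POSIX TZ strings) of the
   time in effect just before the transition.  */
static long
transition_utc(int day0, long tod, long west)
{
	assert(0 <= day0);
	return (long) day0 * SECS_PER_DAY + tod + west;
}

/* Return true if "WART4WARST,0/1,0" starts and stops DST at the
   same instant.  */
static bool
new_string_is_perpetual_dst(void)
{
	long std_west = 4L * SECS_PER_HOUR;		/* WART4 */
	long dst_west = std_west - SECS_PER_HOUR;	/* WARST, default */
	long dst_start = transition_utc(0, 1L * SECS_PER_HOUR, std_west); /* "0/1" */
	long dst_end = transition_utc(0, 2L * SECS_PER_HOUR, dst_west);   /* "0" */
	return dst_start == dst_end;
}

/* Return the seconds of standard time per year implied by the old
   "WARST4WARST,J1/0,J365/24", assuming a non-leap year.  */
static long
old_string_std_gap(void)
{
	long std_west = 4L * SECS_PER_HOUR;
	long dst_west = std_west - SECS_PER_HOUR;
	long year_secs = 365L * SECS_PER_DAY;
	long dst_end = transition_utc(364, 24L * SECS_PER_HOUR, dst_west); /* "J365/24" */
	long next_start = year_secs + transition_utc(0, 0, std_west);	   /* "J1/0" */
	return next_start - dst_end;
}
```

With the new string both transitions land at 05:00 UTC on January 1,
so no standard time remains.  With the old string the end-of-DST
transition precedes the next start by 3600 seconds, which is the
stray hour you noticed.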

From 88e130ed2f5a9e13310f38d08428f1230067d568 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert at cs.ucla.edu>
Date: Sun, 8 Sep 2013 07:49:22 -0700
Subject: [PATCH] Improve the support for perpetual DST.

Problem reported by Zefram in
<http://mm.icann.org/pipermail/tz/2013-September/020059.html>.
* localtime.c (tzparse): Elide simultaneous entries out of and
into DST.  Since this optimization can elide all entries, avoid
looping forever looking for entries that will never arrive.  While
we're at it, fix another portability bug where the code assumed
wraparound on signed integer overflow.  If the DST stop and start
times are simultaneous, assume perpetual DST; the old version
of this code did this for San Luis but I suspect it might not have
done so for hypothetical examples.
* newtzset.3, tzfile.5: Mention that as an extension to POSIX,
if DST stops and starts at the same instant, it's assumed to be
in effect all year.  Give an example.  Also, mention the old
POSIX limit of 23 hours rather than 24.
* zic.c (stringrule): Omit the "J" in January and February,
as this can save a byte or two in the output.
(rule_cmp): New function.
(stringzone): Do a better job of constructing the standard-time
abbreviation when there is perpetual DST.  Defer to the new
stringrule to construct the times for perpetual DST.
Fix bug noted by Zefram, which caused a stray hour of standard
time to be inserted in an otherwise perpetual DST.
Previously, this code generated "WARST4WARST,J1/0,J365/24"
for the San Luis example; now it generates "WART4WARST,0/1,0".
Besides fixing the bug, the new string is a bit shorter and is likely to
work better with non-tzcode implementations that mistakenly treat this
as specifying standard time all year.
---
 localtime.c | 47 ++++++++++++++++++++++++---------------------
 newtzset.3  | 16 ++++++++++++++--
 tzfile.5    |  9 ++++++---
 zic.c       | 63 +++++++++++++++++++++++++++++++++++++++++++++----------------
 4 files changed, 93 insertions(+), 42 deletions(-)

diff --git a/localtime.c b/localtime.c
index 91a3171..eb9a1a6 100644
--- a/localtime.c
+++ b/localtime.c
@@ -1008,6 +1008,7 @@ tzparse(const char *name, register struct state *const sp,
 			struct rule	start;
 			struct rule	end;
 			register int	year;
+			register int	yearlim;
 			register time_t	janfirst;
 			time_t		starttime;
 			time_t		endtime;
@@ -1035,35 +1036,39 @@ tzparse(const char *name, register struct state *const sp,
 			atp = sp->ats;
 			typep = sp->types;
 			janfirst = 0;
-			sp->timecnt = 0;
-			for (year = EPOCH_YEAR;
-			    sp->timecnt + 2 <= TZ_MAX_TIMES;
-			    ++year) {
-			    	time_t	newfirst;
+			yearlim = EPOCH_YEAR + YEARSPERREPEAT;
+			for (year = EPOCH_YEAR; year < yearlim; year++) {
+				int_fast32_t yearsecs;
 
 				starttime = transtime(janfirst, year, &start,
 					stdoffset);
 				endtime = transtime(janfirst, year, &end,
 					dstoffset);
-				if (starttime > endtime) {
-					*atp++ = endtime;
-					*typep++ = 1;	/* DST ends */
-					*atp++ = starttime;
-					*typep++ = 0;	/* DST begins */
-				} else {
-					*atp++ = starttime;
-					*typep++ = 0;	/* DST begins */
-					*atp++ = endtime;
-					*typep++ = 1;	/* DST ends */
+				if (starttime != endtime) {
+					if (&sp->ats[TZ_MAX_TIMES - 2] < atp)
+						break;
+					yearlim = year + YEARSPERREPEAT + 1;
+					if (starttime > endtime) {
+						*atp++ = endtime;
+						*typep++ = 1;	/* DST ends */
+						*atp++ = starttime;
+						*typep++ = 0;	/* DST begins */
+					} else {
+						*atp++ = starttime;
+						*typep++ = 0;	/* DST begins */
+						*atp++ = endtime;
+						*typep++ = 1;	/* DST ends */
+					}
 				}
-				sp->timecnt += 2;
-				newfirst = janfirst;
-				newfirst += year_lengths[isleap(year)] *
-					SECSPERDAY;
-				if (newfirst <= janfirst)
+				yearsecs = (year_lengths[isleap(year)]
+					    * SECSPERDAY);
+				if (time_t_max - janfirst < yearsecs)
 					break;
-				janfirst = newfirst;
+				janfirst += yearsecs;
 			}
+			sp->timecnt = atp - sp->ats;
+			if (!sp->timecnt)
+				sp->typecnt = 1;	/* Perpetual DST.  */
 		} else {
 			register int_fast32_t	theirstdoffset;
 			register int_fast32_t	theirdstoffset;
diff --git a/newtzset.3 b/newtzset.3
index bb40c01..b05a6a3 100644
--- a/newtzset.3
+++ b/newtzset.3
@@ -108,7 +108,8 @@ follows
 summer time is assumed to be one hour ahead of standard time.  One or
 more digits may be used; the value is always interpreted as a decimal
 number.  The hour must be between zero and 24, and the minutes (and
-seconds) \(em if present \(em between zero and 59.  If preceded by a
+seconds) \(em if present \(em between zero and 59.  (Older versions
+of POSIX do not allow the hour to be 24.)  If preceded by a
 .RB `` \(mi '',
 the time zone shall be east of the Prime Meridian; otherwise it shall be
 west (which may be indicated by an optional preceding
@@ -132,6 +133,9 @@ describes when the change back happens.  Each
 .I time
 field describes when, in current local time, the change to the other
 time is made.
+As an extension to POSIX, if daylight saving time stops and
+starts at the same instant of time, daylight saving time is
+assumed to be in effect all year.
 .IP
 The format of
 .I date
@@ -183,7 +187,7 @@ or
 .RB `` \(pl '').
 As an extension to POSIX, the hours part of
 .I time
-can range from \(mi167 to 167; this allows for unusual rules such
+can range from \(mi167 through 167; this allows for unusual rules such
 as "the Saturday before the first Sunday of March".  The default, if
 .I time
 is not given, is
@@ -212,6 +216,14 @@ stands for Israel standard time (IST) and Israel daylight time (IDT),
 fourth Thursday in March (i.e., 02:00 on the first Friday on or after
 March 23), and fall back at 02:00 on the last Sunday in October.
 .TP
+.B WART4WARST,0/1,0
+stands for Western Argentina Summer Time (WARST), 3 hours behind UTC.
+There is a dummy transition to standard time on January 1 at 02:00
+daylight saving time, and a simultaneous transition back to DST at
+01:00 standard time, so DST is in effect all year and the initial
+.B WART
+is a placeholder.
+.TP
 .B WGT3WGST,M3.5.0/\(mi2,M10.5.0/\(mi1
 stands for Western Greenland Time (WGT) and Western Greenland Summer
 Time (WGST), 3 hours behind UTC, where clocks follow the EU rules of
diff --git a/tzfile.5 b/tzfile.5
index c7bd40e..e69cf3e 100644
--- a/tzfile.5
+++ b/tzfile.5
@@ -145,10 +145,13 @@ POSIX-TZ-environment-variable-style string for use in handling instants
 after the last transition time stored in the file
 (with nothing between the newlines if there is no POSIX representation for
 such instants).
-This string may use a minor extension to the POSIX TZ format: the
-hours part of its transition times may be signed and range from
+As described in
+.IR newtzset (3),
+this string may use two minor extensions to the POSIX TZ format.
+First, the hours part of its transition times may be signed and range from
 \(mi167 through 167 instead of the POSIX-required unsigned values
-from 0 through 24.
+from 0 through 24 (formerly 23).  Second, if DST stops and starts
+at the same time, it is assumed to be in effect all year.
 .SH SEE ALSO
 newctime(3), newtzset(3)
 .\" This file is in the public domain, so clarified as of
diff --git a/zic.c b/zic.c
index 502d81e..cd787b0 100644
--- a/zic.c
+++ b/zic.c
@@ -1804,7 +1804,11 @@ stringrule(char *result, const struct rule *const rp, const zic_t dstoff,
 		total = 0;
 		for (month = 0; month < rp->r_month; ++month)
 			total += len_months[0][month];
-		(void) sprintf(result, "J%d", total + rp->r_dayofmonth);
+		/* Omit the "J" in Jan and Feb, as that's shorter.  */
+		if (rp->r_month <= 1)
+		  (void) sprintf(result, "%d", total + rp->r_dayofmonth - 1);
+		else
+		  (void) sprintf(result, "J%d", total + rp->r_dayofmonth);
 	} else {
 		register int	week;
 		register int	wday = rp->r_wday;
@@ -1842,6 +1846,20 @@ stringrule(char *result, const struct rule *const rp, const zic_t dstoff,
 	return 0;
 }
 
+static int
+rule_cmp(struct rule const *a, struct rule const *b)
+{
+	if (!a)
+		return -!!b;
+	if (!b)
+		return 1;
+	if (a->r_hiyear != b->r_hiyear)
+		return a->r_hiyear < b->r_hiyear ? -1 : 1;
+	if (a->r_month - b->r_month != 0)
+		return a->r_month - b->r_month;
+	return a->r_dayofmonth - b->r_dayofmonth;
+}
+
 static void
 stringzone(char *result, const struct zone *const zpfirst, const int zonecount)
 {
@@ -1851,6 +1869,7 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount)
 	register struct rule *		dstrp;
 	register int			i;
 	register const char *		abbrvar;
+	struct rule			stdr, dstr;
 
 	result[0] = '\0';
 	zp = zpfirst + zonecount - 1;
@@ -1874,19 +1893,17 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount)
 	if (stdrp == NULL && dstrp == NULL) {
 		/*
 		** There are no rules running through "max".
-		** Let's find the latest rule.
+		** Find the latest std rule in stdabbrrp
+		** and latest rule of any type in stdrp.
 		*/
+		register struct rule *stdabbrrp = NULL;
 		for (i = 0; i < zp->z_nrules; ++i) {
 			rp = &zp->z_rules[i];
-			if (stdrp == NULL || rp->r_hiyear > stdrp->r_hiyear ||
-				(rp->r_hiyear == stdrp->r_hiyear &&
-				(rp->r_month > stdrp->r_month ||
-				(rp->r_month == stdrp->r_month &&
-				rp->r_dayofmonth > stdrp->r_dayofmonth))))
-					stdrp = rp;
+			if (rp->r_stdoff == 0 && rule_cmp(stdabbrrp, rp) < 0)
+				stdabbrrp = rp;
+			if (rule_cmp(stdrp, rp) < 0)
+				stdrp = rp;
 		}
-		if (stdrp != NULL && stdrp->r_stdoff != 0)
-			dstrp = stdrp; /* We end up in DST.  */
 		/*
 		** Horrid special case: if year is 2037,
 		** presume this is a zone handled on a year-by-year basis;
@@ -1894,6 +1911,24 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount)
 		*/
 		if (stdrp != NULL && stdrp->r_hiyear == 2037)
 			return;
+
+		if (stdrp != NULL && stdrp->r_stdoff != 0) {
+			/* Perpetual DST.  */
+			stdr.r_month = dstr.r_month = TM_JANUARY;
+			stdr.r_dycode = dstr.r_dycode = DC_DOM;
+			stdr.r_dayofmonth = dstr.r_dayofmonth = 1;
+			stdr.r_tod = 2 * SECSPERHOUR;
+			dstr.r_tod = stdr.r_tod - stdrp->r_stdoff;
+			stdr.r_todisstd = dstr.r_todisstd = FALSE;
+			stdr.r_todisgmt = dstr.r_todisgmt = FALSE;
+			stdr.r_stdoff = 0;
+			dstr.r_stdoff = stdrp->r_stdoff;
+			stdr.r_abbrvar
+			  = (stdabbrrp ? stdabbrrp->r_abbrvar : "");
+			dstr.r_abbrvar = stdrp->r_abbrvar;
+			stdrp = &stdr;
+			dstrp = &dstr;
+		}
 	}
 	if (stdrp == NULL && (zp->z_nrules != 0 || zp->z_stdoff != 0))
 		return;
@@ -1913,16 +1948,12 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount)
 				return;
 		}
 	(void) strcat(result, ",");
-	if (dstrp == stdrp)
-		(void) strcat(result, "J1/0");
-	else if (stringrule(result, dstrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) {
+	if (stringrule(result, dstrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) {
 		result[0] = '\0';
 		return;
 	}
 	(void) strcat(result, ",");
-	if (dstrp == stdrp)
-		(void) strcat(result, "J365/24");
-	else if (stringrule(result, stdrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) {
+	if (stringrule(result, stdrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) {
 		result[0] = '\0';
 		return;
 	}
-- 
1.8.1.2
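
As a postscript, the overflow fix in tzparse boils down to the
following pattern, sketched here as standalone C.  The name
add_checked and the use of int64_t in place of time_t are assumptions
made for illustration; the patch itself compares against time_t_max
and uses an int_fast32_t increment.  The idea is to test whether the
addition would overflow before doing it, rather than adding first and
checking whether the result wrapped around, since signed overflow has
undefined behavior in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Add DELTA to *ACC if the sum does not exceed MAX, and return true.
   Otherwise leave *ACC alone and return false.  The caller must
   ensure 0 <= *ACC <= MAX and 0 <= DELTA, so that MAX - *ACC cannot
   itself overflow.  */
static bool
add_checked(int64_t *acc, int64_t delta, int64_t max)
{
	assert(0 <= *acc && *acc <= max && 0 <= delta);
	if (max - *acc < delta)
		return false;
	*acc += delta;
	return true;
}
```

A loop that advances a January 1 timestamp one year at a time can
then stop cleanly when add_checked returns false, much as the
patched loop breaks out when time_t_max - janfirst < yearsecs.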



