Addition to Arthur Olson/4.3bsd table-driven ctime

Steve Summit seismo!rutgers!lll-lcc!cae780!tektronix.TEK.COM!copper.TEK.COM!stevesu
Mon Mar 30 20:07:02 UTC 1987


(This is essentially the article I just posted to
comp.unix.wizards and comp.bugs.4bsd.)

First, let me thank you guys for the work that has gone into this
new ctime.  A table-driven approach is clearly the only way to
go, and this one works admirably well.  (I can't believe how fast
it is, and how small both ctime.o and the zoneinfo files are.)
I foresee one problem with its use, however, which I'll describe,
along with a proposed solution.

Suppose I'm a software vendor (which I am, or at least I work for
one), and I ship binary executables to my customers (which I do),
executables which reference ctime and friends (which they do).
What, if anything, should I do to ctime on my development system?
I cannot be sure that my customers will install the necessary 
timezone support files.  Therefore, any copy of ctime linked into
my programs must work "correctly" when transported to a machine
without /etc/zoneinfo in place.

The version posted to net.sources defaulted to GMT in the absence
of /etc/zoneinfo.  The "Official 4.3BSD" version posted last week
arranges that local timezone correction be applied in this
situation, based on the kernel's notion of the timezone.  I
insist that both timezone _and_ DST corrections be applied, even in
the absence of /etc/zoneinfo, which means that ctime must carry
some DST information along in its data segment, so it can perform
at least as well as it used to.

The next question is, should any DST information hard-compiled
into ctime be fixed to reflect recent changes?  The surprising
answer is "no."  For sites which have installed /etc/zoneinfo,
the up-to-date tables found there will take precedence.  At sites
without /etc/zoneinfo, the safest assumption is that they have
not really dealt with the ctime problem at all, but are tweaking
the system clock (in the case of the upcoming change in the US,
setting the clock ahead one hour on April 5, and setting it back
an hour on April 26, when the old ctime thinks DST kicks in).
Such a strategy is reasonable, and works fine as long as the DST
correction applied by date(1) when the internal GMT is set
matches the correction applied by each and every program when
GMT is converted back to local time.  An executable with a
"fixed" version of ctime would in fact behave incorrectly at such
a site.
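
The arithmetic behind that claim can be checked with a few lines of
C.  This is only a sketch for the spring 1987 change in the US; the
names display_error, clock_fudge and dst_offset are made up for the
illustration and appear nowhere in ctime.c, and days are 0-origin
day-of-year numbers (April 5 is day 94, April 26 is day 115).

```c
#include <assert.h>

#define HOUR 3600L

/* 0-origin day-of-year on which each rule starts DST in 1987. */
enum {
    NEW_RULE_START = 94,    /* April 5: first Sunday in April */
    OLD_RULE_START = 115    /* April 26: last Sunday in April */
};

/* DST correction, in seconds, applied by a ctime whose rule starts
 * on rule_start. */
long dst_offset(int rule_start, int yday)
{
    return yday >= rule_start ? HOUR : 0;
}

/* Hand adjustment at a site without /etc/zoneinfo: the clock runs an
 * hour ahead from April 5 until the old rule takes over on April 26. */
long clock_fudge(int yday)
{
    return (yday >= NEW_RULE_START && yday < OLD_RULE_START) ? HOUR : 0;
}

/* Error, in seconds, of the local time a program displays at such a
 * site, relative to legal time (which follows the new rule). */
long display_error(int program_rule_start, int yday)
{
    long shown, legal;

    assert(yday >= 0 && yday < 365);
    shown = clock_fudge(yday) + dst_offset(program_rule_start, yday);
    legal = dst_offset(NEW_RULE_START, yday);
    return shown - legal;
}
```

On day 100 (April 11), display_error(OLD_RULE_START, 100) is 0, while
display_error(NEW_RULE_START, 100) is a full hour: the "fixed" program
adds its own DST correction on top of the hand-adjusted clock.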

Therefore, the DST information hard-coded into any improved
version of ctime must be exactly as broken as the old, "standard"
one.  In that way, it will work equally well on systems that have
done no more to deal with the DST change than changing their
clocks for three weeks, and on systems that have adopted the
proposed fix.

(The only flaw in this theory is the possibility that people on
some systems have prolonged their agony by going to the trouble
of relinking everything, but with a "fixed" ctime that still
relies only on compiled-in tables.  This approach is in fact the
one suggested by Keith Bostic's "ARTICLE 13."  If anybody is
considering this, take my advice and don't.  If you're in a
position to relink at all, it's really no more trouble to use the
table-driven ctime, and that way you won't have to worry the next
time the DST rules change.)

Herewith are context diffs, against the 4.3 version posted by
Keith Bostic last week, of a version of the Arthur Olson ctime
that works in a backwards-compatible way in the absence of
/etc/zoneinfo.  (I am also posting the complete, modified version
to net.sources.)  The code actually uses the same DST tables as
the old 4.1/4.2 ctime, to guarantee compatibility.  The old-style
tables are automatically converted into the internal state lists
needed by the new ctime's algorithm.
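
For readers who want the gist of that conversion without wading
through the diffs, here is a rough standalone sketch.  It is not the
code from the patch: the function names are invented for the example,
the leap-year bump of the table entries is assumed to have been done
already, and the weekday is computed as +4 (Thursday) rather than the
patch's equivalent -3.

```c
#include <assert.h>

int is_leap(int y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Days from January 1, 1970 to January 1 of year y. */
long days_before_year(int y)
{
    long d = 0;
    int i;

    assert(y >= 1970);
    for (i = 1970; i < y; i++)
        d += is_leap(i) ? 366 : 365;
    return d;
}

/* Given a 0-origin day-of-year limit from an old-style table, return
 * the day-of-year of the Sunday on or before it. */
int sunday_on_or_before(int y, int yday)
{
    /* January 1, 1970 was a Thursday (4, with Sunday == 0). */
    int wday = (int)((4 + days_before_year(y) + yday) % 7);

    return yday - wday;
}

/* Transition instant in seconds since the epoch: 2 a.m. local
 * standard time on that Sunday.  minuteswest is as in struct
 * timezone. */
long transition_time(int y, int yday, int minuteswest)
{
    long day = days_before_year(y) + sunday_on_or_before(y, yday);

    return day * 86400L + minuteswest * 60L + 2 * 3600L;
}
```

For 1987 and the "all other years" entry 119 (April 30),
sunday_on_or_before returns 115, which is April 26; with
minuteswest = 300 the transition lands at 546418800 seconds, i.e.
07:00 GMT that day.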

(These diffs still reflect some debugging code I added, to verify
that my new code was building correct internal data structures.
I did regression tests against real state lists created by zic
from a kludged-up description file reflecting the old transition
dates.)

I should point out that I have not tested the new code in the
southern hemisphere case, and I suspect that it will get things
wrong there for the first four months of 1970.  (This code only
attempts to perform DST correction after 1970, anyway.  In that
respect, it may be incompatible with the old ctime, which applied
DST in years prior to 1970 if you passed it a negative time.
The type of the internal variables has been bouncing back and
forth between `long' and `unsigned long' as people try to decide
how it should work.)

While working over ctime, I came up with a couple of questions:

	In asctime, shouldn't the year really be printed with %4d
	(or maybe %-4d) so that the returned string is guaranteed
	to have its advertised 26-character length?  (I realize
	that the code is lifted directly from the X3J11 draft
	standard, and that %d will only get it wrong for dates in
	the Middle Ages that a 32-bit time_t can't begin to reach.
	On the other hand, asctime gets handed a broken-down tm
	struct, so early years are quite possible.)

	Shouldn't the offtime() routine be declared static?
	It's not a publicized interface.
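
To make the first question concrete, here is a small sketch of the
length difference.  It assumes a modern C library with snprintf, and
the function name asctime_like and the fixed "Sun Jan  1 00:00:00"
fields are placeholders for the example, not anything from ctime.c.

```c
#include <assert.h>
#include <stdio.h>

/* Format an asctime-style line for the given calendar year and return
 * its length, not counting the trailing NUL.  pad_year selects %4d
 * instead of %d for the year field. */
int asctime_like(char *buf, size_t size, int pad_year, int year)
{
    int n = snprintf(buf, size,
        pad_year ? "%.3s %.3s%3d %.2d:%.2d:%.2d %4d\n"
                 : "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
        "Sun", "Jan", 1, 0, 0, 0, year);

    assert(n >= 0 && (size_t)n < size);
    return n;
}
```

With year 1987 either format yields 25 characters (26 with the NUL).
With year 987, %d yields only 24, while %4d keeps the advertised
length.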

                                           Steve Summit
                                           stevesu at copper.tek.com

*** ctime.orig.c	Sun Mar 29 16:11:46 1987
--- ctime.c	Mon Mar 30 11:27:22 1987
***************
*** 215,220
  #endif /* USG_COMPAT */ 
  		}
  	}
  	return 0;
  }
  

--- 215,223 -----
  #endif /* USG_COMPAT */ 
  		}
  	}
+ #ifdef DEBUG
+ 	printstate();
+ #endif
  	return 0;
  }
  
***************
*** 218,223
  	return 0;
  }
  
  static
  tzsetkernel()
  {

--- 221,244 -----
  	return 0;
  }
  
+ #ifdef DEBUG
+ 
+ printstate()
+ {
+ 	int i;
+ 
+ 	printf("TS/DST info state:\n");
+ 	for(i = 0; i < s.timecnt; i++)
+ 		printf("time %d: %ld %d\n", i, s.ats[i], s.types[i]);
+ 	for(i = 0; i < s.typecnt; i++)
+ 		printf("type %d: %ld %d %d (%s)\n", i,
+ 			s.ttis[i].tt_gmtoff, s.ttis[i].tt_isdst,
+ 				s.ttis[i].tt_abbrind,
+ 					s.chars + s.ttis[i].tt_abbrind);
+ }
+ 
+ #endif
+ 
  static
  tzsetkernel()
  {
***************
*** 224,229
  	struct timeval	tv;
  	struct timezone	tz;
  	char	*tztab();
  
  	if (gettimeofday(&tv, &tz))
  		return -1;

--- 245,251 -----
  	struct timeval	tv;
  	struct timezone	tz;
  	char	*tztab();
+ 	static dstsetkernel();
  
  	if (gettimeofday(&tv, &tz))
  		return -1;
***************
*** 227,233
  
  	if (gettimeofday(&tv, &tz))
  		return -1;
! 	s.timecnt = 0;		/* UNIX counts *west* of Greenwich */
  	s.ttis[0].tt_gmtoff = tz.tz_minuteswest * -SECS_PER_MIN;
  	s.ttis[0].tt_abbrind = 0;
  	(void)strcpy(s.chars, tztab(tz.tz_minuteswest, 0));

--- 249,257 -----
  
  	if (gettimeofday(&tv, &tz))
  		return -1;
! 	s.timecnt = 0;
! 	s.typecnt = 1;
! 			/* UNIX counts *west* of Greenwich */
  	s.ttis[0].tt_gmtoff = tz.tz_minuteswest * -SECS_PER_MIN;
  	s.ttis[0].tt_isdst = 0;
  	s.ttis[0].tt_abbrind = 0;
***************
*** 229,234
  		return -1;
  	s.timecnt = 0;		/* UNIX counts *west* of Greenwich */
  	s.ttis[0].tt_gmtoff = tz.tz_minuteswest * -SECS_PER_MIN;
  	s.ttis[0].tt_abbrind = 0;
  	(void)strcpy(s.chars, tztab(tz.tz_minuteswest, 0));
  	tzname[0] = tzname[1] = s.chars;

--- 253,259 -----
  	s.typecnt = 1;
  			/* UNIX counts *west* of Greenwich */
  	s.ttis[0].tt_gmtoff = tz.tz_minuteswest * -SECS_PER_MIN;
+ 	s.ttis[0].tt_isdst = 0;
  	s.ttis[0].tt_abbrind = 0;
  	(void)strcpy(s.chars, tztab(tz.tz_minuteswest, 0));
  	tzname[0] = tzname[1] = s.chars;
***************
*** 236,241
  	timezone = tz.tz_minuteswest * 60;
  	daylight = tz.tz_dsttime;
  #endif /* USG_COMPAT */
  	return 0;
  }
  

--- 261,274 -----
  	timezone = tz.tz_minuteswest * 60;
  	daylight = tz.tz_dsttime;
  #endif /* USG_COMPAT */
+ 
+ 	if(tz.tz_dsttime)
+ 		dstsetkernel(&tz);
+ 
+ #ifdef DEBUG
+ 	printstate();
+ #endif
+ 
  	return 0;
  }
  
***************
*** 386,389
  	tmp->tm_zone = "";
  	tmp->tm_gmtoff = offset;
  	return tmp;
  }

--- 419,615 -----
  	tmp->tm_zone = "";
  	tmp->tm_gmtoff = offset;
  	return tmp;
+ }
+ 
+ /*
+  *  Backwards-compatible DST information tables.
+  *
+  *  The tables give the day number of the first day after the
+  *  Sunday of the change.
+  *
+  *  DO NOT FIX THESE TABLES.
+  *  Yes, they're wrong in several ways, including 1987 and beyond
+  *  in the United States, but they happen to match the old ctime.c
+  *  that is compiled into virtually all programs under 4.2bsd-
+  *  derived systems.  This is important if programs compiled with
+  *  this version of ctime are to work correctly when shipped (in
+  *  binary form) to systems which have not upgraded to the
+  *  /etc/zoneinfo scheme.
+  *
+  *  These hardwired tables are only used when /etc/zoneinfo cannot
+  *  be accessed.  A system which does not have /etc/zoneinfo is
+  *  probably handling DST fluctuations by changing the system clock.
+  *  Therefore, all programs on such a system (whether linked with
+  *  old or new versions of ctime) must use the same DST rules.
+  *  On a system with old-fashioned versions of ctime, handling
+  *  DST fluctuations by changing the system clock, a "correct"
+  *  version of ctime would in fact display incorrect results.
+  */
+ 
+ struct dstab {
+ 	int	dayyr;
+ 	int	daylb;
+ 	int	dayle;
+ };
+ 
+ static struct dstab usdaytab[] = {
+ 	1974,	5,	333,	/* 1974: Jan 6 - last Sun. in Nov */
+ 	1975,	58,	303,	/* 1975: Last Sun. in Feb - last Sun in Oct */
+ 	0,	119,	303,	/* all other years: end Apr - end Oct */
+ };
+ 
+ static struct dstab ausdaytab[] = {
+ 	1970,	400,	0,	/* 1970: no daylight saving at all */
+ 	1971,	303,	0,	/* 1971: daylight saving from Oct 31 */
+ 	1972,	303,	58,	/* 1972: Jan 1 -> Feb 27 & Oct 31 -> dec 31 */
+ 	0,	303,	65,	/* others: -> Mar 7, Oct 31 -> */
+ };
+ 
+ /*
+  * The European tables ... based on hearsay
+  * Believed correct for:
+  *	WE:	Great Britain, Ireland, Portugal
+  *	ME:	Belgium, Luxembourg, Netherlands, Denmark, Norway,
+  *		Austria, Poland, Czechoslovakia, Sweden, Switzerland,
+  *		DDR, DBR, France, Spain, Hungary, Italy, Jugoslavia
+  * Eastern European dst is unknown, we'll make it ME until someone speaks up.
+  *	EE:	Bulgaria, Finland, Greece, Rumania, Turkey, Western Russia
+  */
+ 
+ static struct dstab wedaytab[] = {
+ 	1983,	86,	303,	/* 1983: end March - end Oct */
+ 	1984,	86,	303,	/* 1984: end March - end Oct */
+ 	1985,	86,	303,	/* 1985: end March - end Oct */
+ 	0,	400,	0,	/* others: no daylight saving at all ??? */
+ };
+ 
+ static struct dstab medaytab[] = {
+ 	1983,	86,	272,	/* 1983: end March - end Sep */
+ 	1984,	86,	272,	/* 1984: end March - end Sep */
+ 	1985,	86,	272,	/* 1985: end March - end Sep */
+ 	0,	400,	0,	/* others: no daylight saving at all ??? */
+ };
+ 
+ static struct dayrules {
+ 	int		dst_type;	/* number obtained from system */
+ 	int		dst_hrs;	/* hours to add when dst on */
+ 	struct	dstab *	dst_rules;	/* one of the above */
+ 	enum {STH,NTH}	dst_hemi;	/* southern, northern hemisphere */
+ } dayrules [] = {
+ 	DST_USA,	1,	usdaytab,	NTH,
+ 	DST_AUST,	1,	ausdaytab,	STH,
+ 	DST_WET,	1,	wedaytab,	NTH,
+ 	DST_MET,	1,	medaytab,	NTH,
+ 	DST_EET,	1,	medaytab,	NTH,	/* XXX */
+ 	-1,
+ };
+ 
+ static
+ dstsetkernel(tzp)
+ struct timezone *tzp;
+ {
+ 	struct dayrules *drp;
+ 	int tabsize;
+ 	int timei;
+ 	int y;
+ 	int yleap;
+ 	int i;
+ 	int d, di;
+ 	time_t t;
+ 	int daylb, dayle;
+ 	char *p;
+ 
+ 	for(drp = dayrules; drp->dst_type >= 0; drp++)
+ 		if(drp->dst_type == tzp->tz_dsttime)
+ 			break;
+ 
+ 	if(drp->dst_type < 0)
+ 		return;
+ 
+ 	/* this ends up computing tabsize - 1, but that's what we want */
+ 
+ 	for(tabsize = 0; drp->dst_rules[tabsize].dayyr > 0; tabsize++)
+ 		;
+ 
+ 	/* 2038 is the year that signed 32 bit time_t's give out */
+ 
+ 	for(y = 1970, d = 0, t = 0, timei = 0; y < 2038; y++) {
+ 		daylb = drp->dst_rules[tabsize].daylb;
+ 		dayle = drp->dst_rules[tabsize].dayle;
+ 
+ 		for(i = 0; i < tabsize; i++)
+ 			if(y == drp->dst_rules[i].dayyr) {
+ 				daylb = drp->dst_rules[i].daylb;
+ 				dayle = drp->dst_rules[i].dayle;
+ 				break;
+ 			}
+ 
+ 		yleap = isleap(y);
+ 
+ 		if(yleap) {
+ 			if(daylb >= 58)
+ 				daylb++;
+ 
+ 			if(dayle >= 58)
+ 				dayle++;
+ 		}
+ 
+ 		/*
+ 		 *  January 1, 1970 was a Thursday (-3 == 4 mod 7).
+ 		 *  d is the difference between January 1 of the loop
+ 		 *  year (y) and January 1, 1970, in days.
+ 		 *  daylb and dayle are (0-origin) day offsets with
+ 		 *  respect to January 1.
+ 		 *  So (d + dayl[be] - 3) % 7 is the day (0 == Sunday)
+ 		 *  of daylb or dayle.
+ 		 *  That's also the number to subtract from daylb or
+ 		 *  dayle to get the day number (since January 1 of
+ 		 *  the loop year) of the preceding Sunday.
+ 		 */
+ 
+ 		daylb -= (d + daylb - 3) % 7;
+ 		dayle -= (d + dayle - 3) % 7;
+ 
+ 		s.ats[timei] = t + SECS_PER_DAY * daylb
+ 				+ tzp->tz_minuteswest * SECS_PER_MIN
+ 							+ 2 * SECS_PER_HOUR;
+ 
+ 		s.ats[timei + 1] = t + SECS_PER_DAY * dayle
+ 				+ tzp->tz_minuteswest * SECS_PER_MIN
+ 					+ (drp->dst_hemi == NTH ? 1 : 2)
+ 							* SECS_PER_HOUR;
+ 
+ 		if(drp->dst_hemi == NTH) {
+ 			s.types[timei] = tzp->tz_dsttime;
+ 			s.types[timei + 1] = 0;
+ 		} else {
+ 			s.types[timei] = 0;
+ 			s.types[timei + 1] = tzp->tz_dsttime;
+ 		}
+ 
+ 		timei += 2;
+ 
+ 		di = year_lengths[yleap];
+ 
+ 		d += di;
+ 		t += di * SECS_PER_DAY;
+ 	}
+ 
+ 	s.timecnt = timei;
+ 
+ 	s.ttis[1].tt_gmtoff = tzp->tz_minuteswest * -SECS_PER_MIN
+ 						+ drp->dst_hrs * SECS_PER_HOUR;
+ 
+ 	s.ttis[1].tt_isdst = tzp->tz_dsttime;
+ 
+ 	for(p = s.chars; *p != '\0'; p++)
+ 		;
+ 
+ 	(void)strcpy(++p, tztab(tzp->tz_minuteswest, tzp->tz_dsttime));
+ 
+ 	s.ttis[1].tt_abbrind = p - s.chars;
+ 
+ 	tzname[1] = p;
+ 
+ 	s.typecnt = 2;
  }


