comments on draft ISO C9x changes to <time.h>
Paul Eggert
eggert at twinsun.com
Fri Jun 12 22:38:32 UTC 1998
The ISO committee in charge of the C language has issued a draft for
C9x, the next major revision to C. A copy of this (large) document is
available in:
http://osiris.dkuug.dk/JTC1/SC22/open/2620/n2620/
Section 7.16 of this draft C standard proposes a major overhaul of the
functions and datatypes defined in <time.h>. It adds a new data type
`struct tmx' that is struct tm extended with the following members:
    int tm_version;     // version number
    int tm_zone;        // time zone offset in minutes from UTC [-1439,+1439]
    int tm_leapsecs;    // number of leap seconds applied
    void *tm_ext;       // extension block
    size_t tm_extlen;   // size of extension block
Also, a struct tmx's tm_isdst is the positive number of minutes of
offset if DST is in effect. The new functions mkxtime and strfxtime
use struct tmx instead of struct tm, and a new function
    struct tmx *zonetime (const time_t *timer, int zone);
does for struct tmx roughly what localtime and gmtime do for struct tm.
I've submitted the following comments to the ISO committee for their
review. A copy of these comments (along with all other US public
comments on Committee Draft 1) can be found in:
http://osiris.dkuug.dk/JTC1/SC22/WG14/www/docs/n834.htm
Category: Feature that should be removed
Committee Draft subsection: 7.16
Title: changes to <time.h> need a lot of work and should be withdrawn for now
Detailed description:
Background and comments
Draft C9X introduced a new time struct tmx, new macros
_NO_LEAP_SECONDS and _LOCALTIME, and new functions mkxtime,
zonetime, and strfxtime.
These new functions seem to be an invention of the committee;
they are not based on existing practice, and in some cases
even ignore longstanding existing practice. The new functions
do not address many of the common problems observed with the
C89 primitives, notably with mktime. Nor do they add much
functionality.
For example, reentrant versions of localtime, gmtime, etc. are a
common extension to C, now required by POSIX.1. They fill a
genuine need, but draft C9X does not address it.
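To make the need concrete: gmtime returns a pointer to storage that a later call may overwrite, so code that wants two broken-down times at once has to copy the first result. POSIX.1's gmtime_r writes into a caller-supplied struct tm instead. Here is a minimal sketch, assuming a POSIX host; the helper name year_difference is made up for illustration and is not part of any standard.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* ask for gmtime_r on POSIX hosts */
#endif
#include <time.h>

/* Hypothetical helper: difference in calendar years (UTC) between
   two time stamps.  Each conversion uses its own caller-supplied
   buffer, so neither result can clobber the other. */
int year_difference(time_t a, time_t b)
{
    struct tm ta, tb;
    if (!gmtime_r(&a, &ta) || !gmtime_r(&b, &tb))
        return 0;   /* conversion failed; this sketch just reports 0 */
    return tb.tm_year - ta.tm_year;
}
```

The same code written with plain gmtime would need an explicit copy of the first result before the second call.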
There are also other genuine needs that are not addressed; just
look at, say, the harsh words about mktime expressed by the author
of the tide-calculation program XTide in its source code
<http://www.universe.digex.net/~dave/files/xtide-1.6.2.tar.gz>.
Draft C9X addresses few of the needs expressed by this author.
Here are some more detailed comments on technical shortcomings
in this area.
Section 7.16.1 paragraph 3.
The tm_zone member is an integer number of minutes. However,
common practice (e.g. SunOS 4.x, BSD/OS, Linux) is to have a
member named tm_gmtoff that is a long number of seconds. This
is required for proper support of POSIX.1, which lets the user
specify UTC offset to the second; it is also required for
proper support of historical applications. For example, the
UTC offset of Liberia was 44 minutes and 30 seconds until May
1972, and any program running on, say, Linux with the TZ
environment variable set to "Africa/Monrovia" cannot operate
correctly if the UTC offset is required to be a multiple
of 60 seconds.
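The arithmetic behind the Liberia example can be sketched as follows. Monrovia's historical offset was west of Greenwich, i.e. -0:44:30. The function name fits_in_whole_minutes is hypothetical; it only checks whether an offset in seconds could be stored without loss in a minutes-only member such as the draft's tm_zone.

```c
/* Monrovia's historical UTC offset, -0:44:30, in seconds. */
enum { MONROVIA_GMTOFF_SECONDS = -(44 * 60 + 30) };

/* Hypothetical check: returns 1 if the offset is a whole number
   of minutes (so a minutes-only member can hold it), else 0. */
int fits_in_whole_minutes(long gmtoff_seconds)
{
    return gmtoff_seconds % 60 == 0;
}
```

A tm_gmtoff-style member counted in seconds holds -2670 exactly; a member counted in minutes has to round it to -44 or -45.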
The tm_ext and tm_extlen members are an unprecedented kludge
in the standard library spec. This is not C++! If the
specification for struct tmx is incomplete, this suggests that
the editorial work is not done and this type should be
withdrawn from the standard.
Section 7.16.2.3 paragraph 4.
Here, draft C9X added the following new specification for mktime:
If the call is successful, a second call to the mktime
function with the resulting struct tm value shall always
leave it unchanged and return the same value as the first
call. (*)
This specification is reasonable for mkxtime, but for mktime
it requires changes to existing practice in a way that breaks
existing software. Existing software often assumes that
tm_isdst is either negative, 0, or 1; C89 does not guarantee
this, but it is common existing practice, so software that
makes this assumption is portable in practice.
Unfortunately, specification (*) cannot be satisfied without
either adding hidden members to struct tm (which breaks binary
compatibility) or by stuffing more information into tm_isdst
(which breaks the programs described above).
Granted, programs shouldn't assume that a positive tm_isdst
is 1, but it's very common in POSIX.1 programs to see
expressions like `tzname[tm->tm_isdst]', and these expressions
won't work if tm_isdst contains large values.
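As an illustration of the breakage: POSIX.1's tzname array has exactly two elements, so a tm_isdst of, say, 60 would index far past its end. A program that wanted to survive such values would have to collapse tm_isdst first. The helper below is a hypothetical sketch of that workaround, not something existing programs do.

```c
/* Hypothetical helper: map any tm_isdst value to a valid index
   into POSIX's two-element tzname array.  Positive values (DST in
   effect, however large) map to 1; zero and negative values
   (standard time or "unknown") map to 0 in this sketch. */
int tzname_index(int isdst)
{
    return isdst > 0;
}
```

A caller would then write tzname[tzname_index(tm->tm_isdst)] rather than tzname[tm->tm_isdst]; the point of the comment above is that existing programs do not do this.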
Section 7.16.2.4 paragraph 3.
If tm_zone is _LOCALTIME, and if tm_isdst is preposterous
(e.g. negative, or INT_MAX), this specification is unclear
about what to do. The comments in 7.16.2.6 don't help much.
Section 7.16.2.6 paragraph 1.
The specification for tm_isdst does not allow for negative
daylight-saving time. I don't know of any historical practice
for this, but POSIX.1 allows it, and implementations that
support POSIX.1 have to allow for it.
Section 7.16.2.6 paragraph 2.
The limits on ranges for struct tmx members are unreasonable.
Common existing practice, for example, is to invoke mktime
with a large value for tm_sec to compute a time stamp at some
distance from the POSIX.1 epoch. If int and long are the same
size, this runs afoul of the new restriction in this section,
which limits tm_sec to one-eighth of the potential range.
With this limitation I cannot even use mktime to compute
today's date on my Unix host from today's time_t value!
The other limits are also unnecessary. A well-written mktime
should work in the presence of arbitrary values in struct
tm members; similarly for mkxtime.
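The existing practice described above can be sketched like this. The function name normalize_from_epoch is made up for illustration. The sketch assumes a POSIX host where setenv and tzset can force TZ to UTC, an int wide enough to hold the seconds count, a mktime that normalizes out-of-range members as most implementations do, and a time_t that does not count leap seconds. Note that it changes the process's TZ setting as a side effect.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* ask for setenv and tzset */
#endif
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: break down a count of seconds since the
   POSIX.1 epoch by giving mktime a deliberately huge tm_sec and
   letting it normalize the other members.
   Returns 0 on success, -1 if mktime reports failure. */
int normalize_from_epoch(int seconds, struct tm *out)
{
    setenv("TZ", "UTC0", 1);   /* assumption: local time forced to UTC */
    tzset();

    struct tm tm = {0};
    tm.tm_year = 70;           /* 1970 */
    tm.tm_mon = 0;             /* January */
    tm.tm_mday = 1;
    tm.tm_sec = seconds;       /* far outside the usual [0, 60] */
    tm.tm_isdst = 0;

    /* (time_t)-1 is also the valid result for seconds == -1. */
    if (mktime(&tm) == (time_t)-1 && seconds != -1)
        return -1;
    *out = tm;
    return 0;
}
```

Under those assumptions, passing 897691112 should yield the date and time at the top of this message, 1998-06-12 22:38:32 UTC. That tm_sec value is exactly the kind the draft's new limits would rule out.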
Section 7.16.2.6 paragraph 3.
There are so many errors in this section that it is hard to
determine what is intended. But from what I can tell, the
intent is wrong. For example, it seems to be saying that if
the implementation supports leap seconds, and if local time is
UTC, and if I have a struct tmx that corresponds to 1997-06-30
00:00:00, and then add 1 to tm_mday and invoke mkxtime, I
should get 1997-06-30 23:59:60 due to the intervening leap
second. This is not what I, the programmer, want or expect!
The first sentence in this paragraph reads ``Values S and D
shall be determined as follows''. But the rules that follow
do not _determine_ S and D; they merely place _constraints_
on S and D. This is because the implementation has some leeway
in choosing X1 and X2.
It's not clear in this paragraph whether we're looking at C
code or mathematics. Are we supposed to be using all the C
rules for promotion, conversion, and overflow, or are the
calculations to be done using mathematical integer arithmetic?
The last sentence in the comment about X1 and X2 is
incoherent; I really can't make out what it means.
For the implementation to determine X1 and X2, it needs to
know what D and S are. But D and S are computed from X1 and
X2! More explanation is needed before I can really figure out
what's intended here.
The definition of D is completely unmotivated, and does not
obey the rules of the Gregorian calendar. Among other things,
it uses / and % in places where it should use QUOT and REM.
(And it can't possibly be right without a `100' in it
somewhere. :-) The definition should be rewritten to be
something like the following. (Sorry, I haven't tested this,
as it's less than 30 minutes before the deadline for
submitting comments in the US as this sentence is being
written.)
    D = // day offset since 0000-03-01
        // contribution from year
        Z*365            // number of non-leap days since 0000-03-01
        + QUOT(Z, 4)     // Every 4 years ends in a leap year.
        - QUOT(Z, 100)   // Every 100 years ends in a nonleap year.
        + QUOT(Z, 400)   // Every 400 years ends in a leap year.
        // contribution from month; note we start from 03-01
        + ((int []){ ...yday offsets, starting in March ...})
            [REM(M - 2, 12)]
        // contribution from day of month
        + tm_mday - 1
        // contribution from time of day
        + QUOT(SS, 86400)
except of course that the expression QUOT(SS, 86400) mishandles
leap seconds as described above.
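For readers who want to try the formula above, here is one way it might be filled in as compilable C. This is only a sketch under stated assumptions: the function name day_offset and the offsets table are mine, year is the full proleptic Gregorian year, mon is a tm_mon-style month in [0, 11], the time-of-day term QUOT(SS, 86400) is left out, and QUOT and REM are the floor-style macros for a positive divisor.

```c
/* Floor-style quotient and remainder; valid for b > 0 and relying
   on integer division that truncates toward zero. */
#define QUOT(a,b) ((a)/(b) - ((a)%(b) < 0))
#define REM(a,b)  ((a)%(b) + (b) * ((a)%(b) < 0))

/* Hypothetical sketch: day offset since 0000-03-01 in the
   proleptic Gregorian calendar.  mon is 0-based (0 = January)
   and assumed to lie in [0, 11]; mday is 1-based. */
long day_offset(long year, int mon, int mday)
{
    /* Days before each month, counting from March 1. */
    static const int offsets[12] = {
        0, 31, 61, 92, 122, 153, 184, 214, 245, 275, 306, 337
    };
    long z = year - (mon < 2);   /* January, February belong to the previous March-based year */

    return z * 365               /* non-leap days */
         + QUOT(z, 4)            /* every 4 years ends in a leap year */
         - QUOT(z, 100)          /* every 100 years ends in a nonleap year */
         + QUOT(z, 400)          /* every 400 years ends in a leap year */
         + offsets[REM(mon - 2, 12)]
         + mday - 1;
}
```

With these definitions 0000-03-01 maps to 0 and 1970-01-01 to 719468, so subtracting 719468 gives a day count relative to the POSIX.1 epoch.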
Section 7.16.3.5
This new function zonetime is of only marginal use; it seems to
be present mostly as a way of defining how mkxtime works.
The definition of leap seconds is incorrect. Leap seconds are
not a UTC-UT1 offset. The absolute value of the difference
between UTC and UT1 is at most 0.9 seconds, by definition.
The changes to 7.16 seem to be hastily edited: there are a number
of what seem to be typographical errors. The changed text is not
explained, and the typos make it hard to understand what was
intended. Here are some of the typos that I spotted despite these
problems:
Section 7.16.1 paragraph 2. _LOCALTIME ``must be outside the
range [-14400, +14400].'' Presumably this should be [-1440,
+1440], i.e. one day's worth not ten.
Section 7.16.2.6 paragraph 3.
The definition for QUOT yields numerically incorrect results
if (b)-(a) or (b)-(a)-1 overflows. I suggest replacing it
with the following definition, which is clearer and free of
problems with overflow. This definition relies on C9X's new
guarantees about integer division.
#define QUOT(a,b) ((a)/(b) - ((a)%(b) < 0))
Similarly, REM can overflow if (b)*QUOT(a,b) overflows. Here
is a better version.
#define REM(a,b) ((a)%(b) + (b) * ((a)%(b) < 0))
The definition of Z can be written more compactly as:
Z = Y - (M < 2);
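The suggested QUOT and REM definitions can be exercised with a few negative operands to confirm that they round toward negative infinity. This is a small sketch; the wrapper names quot_floor and rem_floor are hypothetical, and the macros are meant only for a positive divisor b.

```c
/* The suggested definitions: floor-style quotient and remainder,
   relying on integer division that truncates toward zero.
   Intended for b > 0. */
#define QUOT(a,b) ((a)/(b) - ((a)%(b) < 0))
#define REM(a,b)  ((a)%(b) + (b) * ((a)%(b) < 0))

/* Hypothetical wrappers so the macros can be called as functions. */
int quot_floor(int a, int b) { return QUOT(a, b); }
int rem_floor(int a, int b)  { return REM(a, b); }
```

For example, quot_floor(-1, 4) is -1 and rem_floor(-1, 4) is 3, whereas plain -1/4 and -1%4 give 0 and -1 under truncating division.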
Section 7.16.3.6 paragraph 5.
``If this value is outside the normal range, the characters stored
are unspecified.'' What is the ``normal range''? The range as
output by localtime, the range of the Gregorian calendar, or
the limits as specified in 7.16.2.6?
Suggestion
Drop all changes to the <time.h> section for this revision of
the C Standard.
Bring in experts in this area for the next revision of the
C Standard. I suggest working together with the members of the
Time Zone Mailing list <tz at elsie.nci.nih.gov>.
Build on existing practice rather than relying on committee
inventions, which have been error-prone in this area.
If these suggestions are not followed, a lot of changes are
needed to this section, as suggested by the above discussion;
please contact me if you need more details.