Summary of problems with draft C9x <time.h>, and a proposed fix
Paul Eggert
eggert at twinsun.com
Tue Sep 15 00:12:31 UTC 1998
Rex Jaeschke, NCITS/J11 chair, recently sent me a copy of document
J11/98-048, which contains the disposition by J11 of my public review
comments regarding the C9x standard Committee Draft (CD) 1 (CD 9899,
Ballot Document N2620). One of my comments was not satisfactorily
resolved, and I'd like to follow up in the hope of improving the
eventual standard.
I'm referring to Comment 14 in US0011 (1998-03-04), my comment about
problems in the struct-tmx-related changes by CD 1 to <time.h>.
Unfortunately, many of these problems remain in CD 2, and I've since
learned of other problems. I summarize these remaining problems in
Appendix 1 below.
Also, Clive Feather (who I understand is responsible for most of the
<time.h> changes in CDs 1 and 2) has proposed that a new <time.h>
section be written to address these problems. I welcome this
proposal, and would like to contribute. However, I believe that it's
too late in the standardization process to introduce major
improvements to <time.h>, as there will be insufficient time to gain
implementation experience with these changes, experience that is
needed for proper review.
Instead, I propose that <time.h>'s problems be fixed by removing the
struct-tmx-related changes to <time.h>, reverting to the current
ISO C standard (C89); we can then come up with a better <time.h> for
the next standard (C0x). In other words, I propose the following:
* Change <time.h> to define only the types and functions that
were defined in C89's <time.h>, and to remove a new requirement
on mktime. Appendix 2 gives the details.
* Work with Clive Feather and other interested parties to
write and test a revised <time.h> suitable for inclusion in C0x.
Please let me know of any way that I can further help implement this
proposal.
------------------------------------------------------------
Appendix 1. Problems in the struct-tmx-related part of <time.h>
Here is a summary of technical problems in the struct-tmx-related part
of CD 2 (1998-08-03), section 7.23. The problems fall into two basic areas:
* struct tmx is not headed in the right direction.
The struct-tmx-related changes do not address several well-known
problems with C89 <time.h>, and do not form a good basis for
addressing these problems. These problems include the following.
- Lack of precision. The standard does not require precise
timekeeping; typically, time_t has only 1-second precision.
- Inability to determine properties of time_t. There's no
portable way to determine the precision or range of time_t.
- Poor arithmetic support for the time_t type. difftime is not
enough for many practical applications.
- The new interface is not reentrant. A common extension to C89
is the support of reentrant versions of functions like
localtime. This extension is part of POSIX.1. There's no good
reason (other than historical practice) for time-related
functions to rely on global state; any new extensions should be
reentrant.
- No control over time zones. There's no portable way for an application
to inquire about the time in New York, for example, even if the
implementation supports this.
- Missing conversions. There's no way to convert between UTC and TAI,
or between times in different time zones, or to determine which time
zone is in use.
- No reliable interval time scale. If the clock is adjusted to keep
in sync with UTC, there's no reliable way for a program to ignore
this change.
- One cannot apply strftime to the output of gmtime,
as the %Z and %z formats may be misinterpreted.
(Credit: I've borrowed many of the above points from discussions by
Clive Feather and Markus Kuhn.)
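To illustrate the reentrancy point above, here is a minimal sketch of the
caller-supplied-storage style used by the POSIX.1 extension. It relies on
gmtime_r, which is POSIX.1 and not part of C89; the helper names utc_fields
and same_utc_day are hypothetical, introduced only for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* expose gmtime_r (POSIX.1, not C89) */
#include <assert.h>
#include <time.h>

/* Hypothetical helper: break T down as UTC into caller-owned storage.
   Return 0 on success, -1 on failure.  No static buffer is involved,
   so one caller cannot overwrite another caller's result.  */
int
utc_fields (time_t t, struct tm *out)
{
  assert (out != NULL);
  return gmtime_r (&t, out) != NULL ? 0 : -1;
}

/* Hypothetical helper: return 1 if A and B fall on the same UTC
   calendar day, 0 if they do not, -1 if either conversion fails.
   Both broken-down times are alive at once, which plain gmtime does
   not guarantee because its results may share one static object.  */
int
same_utc_day (time_t a, time_t b)
{
  struct tm ta, tb;
  if (utc_fields (a, &ta) != 0 || utc_fields (b, &tb) != 0)
    return -1;
  return ta.tm_year == tb.tm_year && ta.tm_yday == tb.tm_yday;
}
```

Written with plain gmtime, same_utc_day would have to copy the first result
before making the second call, since both pointers may refer to the same
static object.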
* struct tmx has several technical problems of its own.
Even on its own terms, struct tmx has several technical problems
that would need to be fixed before being made part of a standard.
These problems include the following.
- In 7.23.1 paragraph 5, struct tmx's tm_zone member counts
minutes. This disagrees with common practice, which is to
extend struct tm by adding a new member tm_gmtoff that is UTC
offset in seconds. The extra precision is needed to support
old time stamps -- UTC offsets that were not a multiple of
one minute used to be quite common, and in at least one locale
this practice did not die out until 1972.
- The tm_leapsecs member defined by 7.23.1 paragraph 5 is an integer,
but it is supposed to represent TAI - UTC, and this value is not
normally an integer for time stamps before 1972. Also, it's not
clear what this value should be for time stamps before the introduction
of TAI in the 1950s.
- The tm_ext and tm_extlen members defined by 7.23.1 paragraph 5
use a new method to allow for future extensions. This method
has never before been tried in the C Standard, and is likely to
lead to problems in practice.
For example, the draft makes no requirement on the storage
lifetime of storage addressed by tm_ext. This means that an
application cannot reliably dereference the pointer returned by
zonetime, because it has no way of knowing when the tm_ext
member points to freed storage.
- 7.23.2.3 paragraph 4 adds the following requirement for mktime
not present in C89:
If the call is successful, a second call to the mktime
function with the resulting struct tm value shall always leave
it unchanged and return the same value as the first call.
This requirement was inspired by the struct-tmx-related changes
to <time.h>, but it requires changes to existing practice, and
it cannot be implemented without hurting performance or breaking
binary compatibility.
For example, suppose I am in Sri Lanka, and invoke mktime on the
equivalent of 1996-10-26 00:15:00 with tm_isdst==0. There are
two distinct valid time_t values for this input, since Sri Lanka
moved the clock back from 00:30 to 00:00 that day, permanently.
There is no way to select the time_t by inspecting tm_isdst,
since both times are standard time.
On examples like these, C89 allows mktime to return different
time_t values for the same input at different times during the
execution of the program. This is common existing practice,
but it is prohibited by this new requirement.
It's possible to satisfy this new requirement by adding a new
struct tm member, which specifies the UTC offset. However, this
would break binary compatibility. It's also possible to satisfy
this new requirement by always returning the earlier time_t
value in ambiguous cases. However, this can greatly hurt
performance, as it's not easy for some implementations to
determine that the input is ambiguous; it would require scouting
around each candidate returned value to see whether the value
might be ambiguous, and this step would be expensive.
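The ambiguity can be sketched with a toy model of the example above. This is
not a mktime implementation: the zone rule is hard-coded, instants are counted
in seconds relative to the transition instant rather than as time_t values,
and all names (toy_offset, toy_local, toy_candidates) are hypothetical.

```c
/* Toy zone modelled on the example: local time is UTC+06:30 before
   the transition and UTC+06:00 from the transition onward.  Instants
   are seconds relative to the transition instant (negative means
   before it); a local clock reading is the instant plus the offset in
   effect.  This scale is an assumption of the sketch.  */
#define OLD_OFFSET 23400L       /* +06:30 in seconds */
#define NEW_OFFSET 21600L       /* +06:00 in seconds */

static long
toy_offset (long t)
{
  return t < 0 ? OLD_OFFSET : NEW_OFFSET;
}

long
toy_local (long t)
{
  return t + toy_offset (t);
}

/* Store in OUT, earlier first, every instant whose local clock
   reading equals LOCAL, and return how many there are (0, 1 or 2).
   Each possible offset yields one guess, and each guess must be
   converted back to check it; this is the "scouting around" cost.  */
int
toy_candidates (long local, long out[2])
{
  long guess[2];
  int i, n = 0;
  guess[0] = local - OLD_OFFSET;
  guess[1] = local - NEW_OFFSET;
  for (i = 0; i < 2; i++)
    if (toy_local (guess[i]) == local)
      out[n++] = guess[i];
  return n;
}
```

On this scale the clock reads 21600 at local midnight, so 00:15 is 22500.
toy_candidates (22500, out) finds two instants, 900 seconds before and 900
seconds after the transition, and nothing like tm_isdst tells them apart.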
- The limits on ranges for struct tmx members in 7.23.2.6
paragraph 2 are unreasonably tight. For example, they disallow
the following program on a POSIX.1 host with a 32-bit `long',
since `time (0)' currently returns values above 900000000 on
POSIX.1 hosts, which is well above the limit LONG_MAX/8 ==
268435455 imposed by 7.23.2.6.
#include <stdio.h>
#include <time.h>
struct tmx tm;
int main (void)
{
  char buf[1000];
  time_t t = 0;
  /* Add current time to POSIX.1 epoch, using mkxtime. */
  tm.tm_version = 1;
  tm.tm_year = 1970 - 1900;
  tm.tm_mday = 1;
  tm.tm_sec = time (0);
  if (mkxtime (&tm) == (time_t) -1)
    return 1;
  strfxtime (buf, sizeof buf, "%Y-%m-%d %H:%M:%S", &tm);
  puts (buf);
  return 0;
}
The limits in 7.23.2.6 are not needed. A mktime implementation
need not check for overflow on every internal arithmetic
operation; instead, it can cheaply check for overflow by doing a
relatively simple test at the end of its calculation.
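One way to do such an end-of-calculation check is to accumulate in a wider
type and test the range once. The sketch below is only an illustration, not
how any particular mktime works: it assumes that int is at most 32 bits wide
and that long long is at least 64 bits, it uses int32_t to stand in for a
32-bit result type, and the name toy_seconds is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: combine a day count and time-of-day fields
   into a count of seconds.  Every field may hold any int value.
   Assuming int is at most 32 bits wide, none of the intermediate
   long long products or sums can overflow, so the only test needed
   is the single range check at the end.  Return 1 and store the
   total in *RESULT if it fits in 32 bits; otherwise return 0.  */
int
toy_seconds (int days, int hour, int min, int sec, int32_t *result)
{
  long long total = days * 86400LL + hour * 3600LL + min * 60LL + sec;
  if (total < INT32_MIN || total > INT32_MAX)
    return 0;
  *result = (int32_t) total;
  return 1;
}
```

With this approach a tm_sec value of 900000000 is accepted, since the total
still fits, while a day count of 30000 is rejected by the final test.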
- 7.23.2.6 paragraph 3 contains several technical problems:
. In some cases, it requires mkxtime to behave as if each day
contains 86400 seconds, even if the implementation supports
leap seconds. For example, if the host supports leap seconds
and uses Japan time, then using mkxtime to add 1 day to
1999-01-01 00:00:00 must yield 1999-01-01 23:59:59, because
there's a leap second at 08:59:60 that day in Japan. This
is not what most programmers will want or expect.
. The explanation starts off with ``Values S and D shall be
determined as follows'', but the code that follows does not
_determine_ S and D; it consults an oracle to find X1 and
X2, which means that the code merely places _constraints_ on
S and D. A non-oracular implementation cannot in general
determine X1 and X2 until it knows S and D, so the code,
if interpreted as a definition, is a circular one.
. The code suffers from arithmetic overflow problems. For
example, suppose tm_hour == INT_MAX && INT_MAX == 32767.
Then tm_hour*3600 overflows, even though tm_hour satisfies
the limits of paragraph 2.
. The code does not declare the types of SS, M, Y, Z, D, or S,
thus leading to confusion. Clearly these values cannot be
of type `int', due to potential overflow problems like the
one discussed above. It's not clear what type would suffice.
. The definition for QUOT yields numerically incorrect results
if either (b)-(a) or (b)-(a)-1 overflows. Similarly, REM
yields incorrect results if (b)*QUOT(a,b) overflows.
. The expression Y*365 + (Z/400)*97 + (Z%400)/4 doesn't match
the Gregorian calendar, which has special rules for years
that are multiples of 100.
. The code is uncommented, so it's hard to understand and evaluate.
For example, the epoch (D=0, S=0) is not described; it
appears to be (-0001)-12-31 Gregorian, but this should be
cleared up.
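Two of the arithmetic points above can be sketched in C. The draft's own
QUOT and REM definitions and its exact meaning for Z are not reproduced here,
so this is an illustration under stated assumptions: z is taken to be a
nonnegative count of years, "/" and "%" truncate toward zero as in C9x, and
the function names are hypothetical.

```c
/* Floor division for b > 0.  The truncated quotient is adjusted
   afterwards, so the code never forms b - a or b - a - 1 and cannot
   overflow for any a.  */
long
floor_quot (long a, long b)
{
  long q = a / b;
  return a % b < 0 ? q - 1 : q;
}

/* Matching remainder for b > 0, always in the range 0 .. b-1.  */
long
floor_rem (long a, long b)
{
  long r = a % b;
  return r < 0 ? r + b : r;
}

/* Leap-day term as quoted above: 97 per 400 years, plus one for
   every 4th year of the remainder, with no century rule.  */
long
draft_leap_days (long z)
{
  return (z / 400) * 97 + (z % 400) / 4;
}

/* Gregorian count of leap years among years 1 .. z: every 4th year,
   except every 100th, except every 400th.  */
long
gregorian_leap_days (long z)
{
  return z / 4 - z / 100 + z / 400;
}
```

The two leap-day counts agree at z == 400 (both give 97) but differ at
z == 100, where the quoted term gives 25 and the Gregorian rule gives 24.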
- 7.23.3.7 says that the number of leap seconds is the ``UTC-UT1 offset''.
It should say ``UTC - TAI''.
------------------------------------------------------------
Appendix 2. Details of proposed change to <time.h>
Here are the details about my proposed change to <time.h>. This
change reverts the <time.h> part of the standard to define only the
types, functions, and macros that were defined in C89's <time.h>.
It also removes the hard-to-implement requirement in 7.23.2.3 paragraph 4.
* 7.23.1 paragraph 2. Remove the macros _NO_LEAP_SECONDS and _LOCALTIME.
* 7.23.1 paragraph 3. Remove the type `struct tmx'.
* 7.23.1 paragraph 5 (struct tmx). Remove this paragraph.
* 7.23.2.3 paragraph 3 (mktime normalization). Remove this paragraph.
* 7.23.2.3 paragraph 4. Remove the phrase ``and return the same value''.
It's not feasible to return the same value in some cases;
see the discussion of 7.23.2.3 paragraph 4 above.
* 7.23.2.4 (mkxtime). Remove this section.
* 7.23.2.6 (normalization of broken-down times). Remove this section;
this means footnote 252 will be removed.
* 7.23.3 paragraph 1. Remove the reference to strfxtime.
* 7.23.3.6 (strfxtime). Remove this section.
* 7.23.3.7 (zonetime). Remove this section.