Summary of problems with draft C9x <time.h>, and a proposed fix

Paul Eggert eggert at twinsun.com
Tue Sep 15 00:12:31 UTC 1998


Rex Jaeschke, NCITS/J11 chair, recently sent me a copy of document
J11/98-048, which contains the disposition by J11 of my public review
comments regarding the C9x standard Committee Draft (CD) 1 (CD 9899,
Ballot Document N2620).  One of my comments was not satisfactorily
resolved, and I'd like to follow up in the hope of improving the
eventual standard.

I'm referring to Comment 14 in US0011 (1998-03-04), my comment about
problems in the struct-tmx-related changes by CD 1 to <time.h>.
Unfortunately, many of these problems remain in CD 2, and I've since
learned of other problems.  I summarize these remaining problems in
Appendix 1 below.

Also, Clive Feather (who I understand is responsible for most of the
<time.h> changes in CDs 1 and 2) has proposed that a new <time.h>
section be written to address these problems.  I welcome this
proposal, and would like to contribute.  However, I believe that it's
too late in the standardization process to introduce major
improvements to <time.h>, as there will be insufficient time to gain
implementation experience with these changes, experience that is
needed for proper review.

Instead, I propose that <time.h>'s problems be fixed by removing the
struct-tmx-related changes to <time.h>, reverting to the current
ISO C standard (C89); we can then come up with a better <time.h> for
the next standard (C0x).  In other words, I propose the following:

 * Change <time.h> to define only the types and functions that
   were defined in C89's <time.h>, and to remove a new requirement
   on mktime.  Appendix 2 gives the details.

 * Work with Clive Feather and other interested parties to
   write and test a revised <time.h> suitable for inclusion in C0x.

Please let me know of any way that I can further help implement this
proposal.


------------------------------------------------------------
Appendix 1.  Problems in the struct-tmx-related part of <time.h>

Here is a summary of technical problems in the struct-tmx-related part
of CD 2 (1998-08-03), section 7.23.  The problems fall into two basic areas:

 * struct tmx is not headed in the right direction.

   The struct-tmx-related changes do not address several well-known
   problems with C89 <time.h>, and do not form a good basis for
   addressing these problems.  These problems include the following.

    - Lack of precision.  The standard does not require precise
      timekeeping; typically, time_t has only 1-second precision.

    - Inability to determine properties of time_t.  There's no
      portable way to determine the precision or range of time_t.
    
    - Poor arithmetic support for the time_t type.  difftime is not
      enough for many practical applications.

    - The new interface is not reentrant.  A common extension to C89
      is the support of reentrant versions of functions like
      localtime.  This extension is part of POSIX.1.  There's no good
      reason (other than historical practice) for time-related
      functions to rely on global state; any new extensions should be
      reentrant.
    
    - No control over time zones.  There's no portable way for an application
      to inquire about the time in New York, for example, even if the
      implementation supports this.

    - Missing conversions.  There's no way to convert between UTC and TAI,
      or between times in different time zones, or to determine which time
      zone is in use.

    - No reliable interval time scale.  If the clock is adjusted to keep
      in sync with UTC, there's no reliable way for a program to ignore
      this change.

    - One cannot apply strftime to the output of gmtime,
      as the %Z and %z formats may be misinterpreted.

   (Credit: I've borrowed many of the above points from discussions by
   Clive Feather and Markus Kuhn.)
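
   As an illustration of the time_t points above, here is a minimal
   sketch of roughly what a portable program can probe: whether time_t
   is an integer type and whether it is signed.  The function names are
   made up for this note and are not part of any standard.

   ```c
   #include <time.h>

   /* Nonzero if time_t is an integer type: converting 0.5 truncates to 0.  */
   int time_t_is_integer(void)
   {
     return (time_t) 0.5 == 0;
   }

   /* Nonzero if time_t can hold negative values.  */
   int time_t_is_signed(void)
   {
     return (time_t) -1 < 0;
   }
   ```

   Neither answer reveals the clock's precision or the usable range of
   time_t, which is the gap the list above describes.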

 * struct tmx has several technical problems of its own.

   Even on its own terms, struct tmx has several technical problems
   that would need to be fixed before being made part of a standard.
   These problems include the following.

    - In 7.23.1 paragraph 5, struct tmx's tm_zone member counts
      minutes.  This disagrees with common practice, which is to
      extend struct tm by adding a new member tm_gmtoff that holds the UTC
      offset in seconds.  The extra precision is needed to support
      old time stamps -- UTC offsets that were not a multiple of
      one minute used to be quite common, and in at least one locale
      this practice did not die out until 1972.
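
      The loss can be sketched as follows.  The helper name is
      hypothetical, and the offset of -0:44:30 used in the note after
      the code is only an illustrative value of the kind described
      above.

      ```c
      /* Seconds that are lost when a UTC offset given in seconds is
         stored in a field that counts whole minutes.  Relies on C99
         division, which truncates toward zero.  */
      long seconds_lost_in_minutes(long gmtoff_seconds)
      {
        long minutes = gmtoff_seconds / 60;
        return gmtoff_seconds - minutes * 60;
      }
      ```

      Here seconds_lost_in_minutes (-2670) is -30, since -0:44:30 is
      -2670 seconds, whereas an offset such as 19800 seconds (+5:30)
      loses nothing.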

    - The tm_leapsecs member defined by 7.23.1 paragraph 5 is an integer,
      but it is supposed to represent TAI - UTC, and this value is not
      normally an integer for time stamps before 1972.  Also, it's not
      clear what this value should be for time stamps before the introduction
      of TAI in the 1950s.

    - The tm_ext and tm_extlen members defined by 7.23.1 paragraph 5
      use a new method to allow for future extensions.  This method
      has never before been tried in the C Standard, and is likely to
      lead to problems in practice.

      For example, the draft makes no requirement on the storage
      lifetime of storage addressed by tm_ext.  This means that an
      application cannot reliably dereference the pointer returned by
      zonetime, because it has no way of knowing when the tm_ext
      member points to freed storage.

    - 7.23.2.3 paragraph 4 adds the following requirement for mktime
      not present in C89:

	If the call is successful, a second call to the mktime
        function with the resulting struct tm value shall always leave
        it unchanged and return the same value as the first call.

      This requirement was inspired by the struct-tmx-related changes
      to <time.h>, but it requires changes to existing practice, and
      it cannot be implemented without hurting performance or breaking
      binary compatibility.

      For example, suppose I am in Sri Lanka, and invoke mktime on the
      equivalent of 1996-10-26 00:15:00 with tm_isdst==0.  There are
      two distinct valid time_t values for this input, since Sri Lanka
      moved the clock back from 00:30 to 00:00 that day, permanently.
      There is no way to select the time_t by inspecting tm_isdst,
      since both times are standard time.

      On examples like these, C89 allows mktime to return different
      time_t values for the same input at different times during the
      execution of the program.  This is common existing practice,
      but it is prohibited by this new requirement.

      It's possible to satisfy this new requirement by adding a new
      struct tm member, which specifies the UTC offset.  However, this
      would break binary compatibility.  It's also possible to satisfy
      this new requirement by always returning the earlier time_t
      value in ambiguous cases.  However, this can greatly hurt
      performance, as it's not easy for some implementations to
      determine that the input is ambiguous; it would require scouting
      around each candidate returned value to see whether the value
      might be ambiguous, and this step would be expensive.
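
      A minimal sketch of what the new requirement asks of a single
      input follows.  The names same_tm and mktime_is_stable are
      hypothetical, and only the struct tm members that C89 defines
      are compared.

      ```c
      #include <time.h>

      /* Compare the members of struct tm that C89 defines.  */
      static int same_tm(const struct tm *a, const struct tm *b)
      {
        return a->tm_sec == b->tm_sec && a->tm_min == b->tm_min
            && a->tm_hour == b->tm_hour && a->tm_mday == b->tm_mday
            && a->tm_mon == b->tm_mon && a->tm_year == b->tm_year
            && a->tm_wday == b->tm_wday && a->tm_yday == b->tm_yday
            && a->tm_isdst == b->tm_isdst;
      }

      /* Return 1 if a second mktime call leaves the normalized value
         unchanged and returns the same time_t, 0 if it does not, and
         -1 if the first call fails.  */
      int mktime_is_stable(struct tm input)
      {
        struct tm first = input;
        struct tm second;
        time_t t1, t2;

        t1 = mktime(&first);
        if (t1 == (time_t) -1)
          return -1;
        second = first;
        t2 = mktime(&second);
        return t2 == t1 && same_tm(&first, &second);
      }
      ```

      A result of 1 for one input says nothing about ambiguous inputs
      such as the Sri Lanka case, where the two calls are allowed by
      C89 to pick different time_t values.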

    - The limits on ranges for struct tmx members in 7.23.2.6
      paragraph 2 are unreasonably tight.  For example, they disallow
      the following program on a POSIX.1 host with a 32-bit `long',
      since `time (0)' currently returns values above 900000000 on
      POSIX.1 hosts, which is well above the limit LONG_MAX/8 ==
      268435455 imposed by 7.23.2.6.

	#include <stdio.h>
	#include <time.h>

	/* File scope, so all members start out zero.  */
	struct tmx tm;

	int main(void)
	{
	  char buf[1000];

	  /* Add current time to POSIX.1 epoch, using mkxtime.  */
	  tm.tm_version = 1;
	  tm.tm_year = 1970 - 1900;
	  tm.tm_mday = 1;
	  tm.tm_sec = time (0);	/* exceeds LONG_MAX/8 when long is 32 bits */
	  if (mkxtime (&tm) == (time_t) -1)
	    return 1;

	  strfxtime (buf, sizeof buf, "%Y-%m-%d %H:%M:%S", &tm);
	  puts (buf);
	  return 0;
	}

      The limits in 7.23.2.6 are not needed.  A mktime implementation
      need not check for overflow on every internal arithmetic
      operation; instead, it can cheaply check for overflow by doing a
      relatively simple test at the end of its calculation.
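
      One way such an end-of-calculation test might look is sketched
      below.  It assumes that int is at most 32 bits wide, that long
      long is at least 64 bits wide, and that the result type is
      modeled by int32_t; the function name is hypothetical, and this
      is not necessarily the test that existing mktime implementations
      use.

      ```c
      #include <stdint.h>

      /* Combine days, hours, minutes and seconds into a seconds count.
         Return 1 and store the count in *result if it fits in 32 bits;
         return 0 otherwise.  With int no wider than 32 bits, no
         intermediate step can overflow the 64-bit accumulator, so a
         single range test at the end suffices.  */
      int seconds_to_int32(int days, int hour, int min, int sec,
                           int32_t *result)
      {
        long long total = (long long) days * 86400
                        + (long long) hour * 3600
                        + (long long) min * 60
                        + sec;
        if (total < INT32_MIN || total > INT32_MAX)
          return 0;
        *result = (int32_t) total;
        return 1;
      }
      ```

      With this approach a seconds value of 900000000 is accepted
      without any per-member limit, and only results that truly do not
      fit are rejected.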

    - 7.23.2.6 paragraph 3 contains several technical problems:

       . In some cases, it requires mkxtime to behave as if each day
	 contains 86400 seconds, even if the implementation supports
	 leap seconds.  For example, if the host supports leap seconds
	 and uses Japan time, then using mkxtime to add 1 day to
	 1999-01-01 00:00:00 must yield 1999-01-01 23:59:59, because
	 there's a leap second at 08:59:60 that day in Japan.  This
	 is not what most programmers will want or expect.

       . The explanation starts off with ``Values S and D shall be
	 determined as follows'', but the code that follows does not
	 _determine_ S and D; it consults an oracle to find X1 and
	 X2, which means that the code merely places _constraints_ on
	 S and D.  A non-oracular implementation cannot in general
	 determine X1 and X2 until it knows S and D, so the code,
	 if interpreted as a definition, is a circular one.

       . The code suffers from arithmetic overflow problems.  For
	 example, suppose tm_hour == INT_MAX && INT_MAX == 32767.
	 Then tm_hour*3600 overflows, even though tm_hour satisfies
	 the limits of paragraph 2.

       . The code does not declare the types of SS, M, Y, Z, D, or S,
	 thus leading to confusion.  Clearly these values cannot be
	 of type `int', due to potential overflow problems like the
	 one discussed above.  It's not clear what type would suffice.

       . The definition for QUOT yields numerically incorrect results
	 if either (b)-(a) or (b)-(a)-1 overflows.  Similarly, REM
	 yields incorrect results if (b)*QUOT(a,b) overflows.
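
         Assuming QUOT and REM are meant to be the floored quotient
         and remainder (the draft's macro text is not reproduced in
         this message), one formulation that avoids those overflows
         is sketched below.  The names floor_quot and floor_rem are
         hypothetical.  Both functions assume b != 0 and exclude the
         case a == LONG_MIN with b == -1.

         ```c
         /* Floored quotient of a divided by b.  C99 division truncates
            toward zero, so step down by one when the remainder is
            nonzero and its sign differs from the divisor's.  */
         long floor_quot(long a, long b)
         {
           long q = a / b;
           long r = a % b;
           if (r != 0 && (r < 0) != (b < 0))
             q--;
           return q;
         }

         /* Floored remainder: zero, or the same sign as b.  */
         long floor_rem(long a, long b)
         {
           long r = a % b;
           if (r != 0 && (r < 0) != (b < 0))
             r += b;
           return r;
         }
         ```

         For example, floor_quot (-1, 400) is -1 and floor_rem (-1, 400)
         is 399, and no product or difference of the operands is ever
         formed.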

       . The expression Y*365 + (Z/400)*97 + (Z%400)/4 doesn't match
	 the Gregorian calendar, which has special rules for years
	 that are multiples of 100.
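
         The mismatch can be sketched by comparing the leap-day part
         of that expression with the usual Gregorian count.  This
         treats Z as a non-negative number of years, which is an
         assumption, since the draft's exact definition of Z is not
         restated here; both function names are hypothetical.

         ```c
         /* Leap-day term as written in the draft expression.  */
         long draft_leap_days(long z)
         {
           return (z / 400) * 97 + (z % 400) / 4;
         }

         /* Gregorian rule: every 4th year is a leap year, except every
            100th, except every 400th.  */
         long gregorian_leap_days(long z)
         {
           return z / 4 - z / 100 + z / 400;
         }
         ```

         The two agree for z == 99 (24) and for z == 400 (97), but for
         z == 100 the draft term gives 25 where the Gregorian rule
         gives 24.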

       . The code is uncommented, so it's hard to understand and evaluate.
         For example, the epoch (D=0, S=0) is not described; it
	 appears to be (-0001)-12-31 Gregorian, but this should be
	 cleared up.
      
    - 7.23.3.7 says that the number of leap seconds is the ``UTC-UT1 offset''.
      It should say ``UTC - TAI''.


------------------------------------------------------------
Appendix 2.  Details of proposed change to <time.h>

Here are the details about my proposed change to <time.h>.  This
change reverts the <time.h> part of the standard to define only the
types, functions, and macros that were defined in C89's <time.h>.
It also removes the hard-to-implement requirement in 7.23.2.3 paragraph 4.

  * 7.23.1 paragraph 2.  Remove the macros _NO_LEAP_SECONDS and _LOCALTIME.
  * 7.23.1 paragraph 3.  Remove the type `struct tmx'.
  * 7.23.1 paragraph 5 (struct tmx).  Remove this paragraph.
  * 7.23.2.3 paragraph 3 (mktime normalization).  Remove this paragraph.
  * 7.23.2.3 paragraph 4.  Remove the phrase ``and return the same value''.
	It's not feasible to return the same value in some cases;
	see the discussion of 7.23.2.3 paragraph 4 above.
  * 7.23.2.4 (mkxtime).  Remove this section.
  * 7.23.2.6 (normalization of broken-down times).  Remove this section;
	this means footnote 252 will be removed.
  * 7.23.3 paragraph 1.  Remove the reference to strfxtime.
  * 7.23.3.6 (strfxtime).  Remove this section.
  * 7.23.3.7 (zonetime).  Remove this section.


