ISO C strftime() extension for ISO 8601

Markus G. Kuhn kuhn at cs.purdue.edu
Mon Jun 30 20:09:55 UTC 1997


"Clive D.W. Feather" wrote on 1997-06-30 13:14 UTC:
> You may be interested to know that WG14 adopted last week working paper
> N733, which adds the following items to the strftime() function:
> 
>    %f   is replaced by the weekday as a decimal number (1-7), where
>         Monday is 1 (the ISO 8601 weekday number).
>    %F   is equivalent to "%Y-%m-%d" (the ISO 8601 date format).
>    %T   is equivalent to "%H:%M:%S" (the ISO 8601 time format).
>    %V   is replaced by the ISO 8601 week number of the year (weeks
>         begin on a Monday, and week 1 is the week that includes both
>         January 4th and the first Thursday of the year) as a decimal
>         number (00-53).

Thanks a lot for forwarding this interesting working paper.  Please forward
the following comments to the authors of N733 and whoever else might be
interested.  If possible, I would like to review the final text of the
new strftime() definition (will the current draft be on the Web
somewhere?).

Comments on WG14 proposal N733
------------------------------

Markus Kuhn <kuhn at cs.purdue.edu> -- 1997-06-30 (= 1997-W27-1)

I appreciate this proposal and I have the following related comments.

  a) The valid range for %V numbers is 01-53. There is no week 00 in the
     ISO 8601 week numbering scheme.  The week before a week 01 is either
     week 52 or week 53 of the previous year.

  b) %V alone is not sufficient for using the ISO week numbering
     system. Each week is associated with a year, namely the year in
     which the majority of the days of this week fall, and this is not
     necessarily the year in which every day of the week falls.  For
     instance, the week 1999-W52 goes from 1999-12-27 to 2000-01-02; in
     other words, the day 2000-01-02 has a week notation of the form
     1999-W52-7. There should be another format descriptor for the year
     to which the current ISO week belongs, preferably in both 4-digit
     (%G) and 2-digit (%g) form.  An algorithm for %V yields the value
     of %G as an easy by-product anyway, so it should be made available
     to the strftime() user.

  c) Existing practice: The Olson tzcode package <ftp://elsie.nci.nih.gov/pub/>
     contains a widely used strftime() implementation that already supports:

       %u  ISO 8601 weekday number (1 = Monday, 7 = Sunday)
       %V  ISO 8601 week number (01-53)
       %G  ISO 8601 year of current week, 4 digits
       %g  ISO 8601 year of current week, 2 digits

     Unless there is a good rationale for the characters suggested
     by N733, I suggest sticking with %u instead of %f for the
     weekday number, and I hope that you will add %G and %g as used
     in the Olson package and Arnold Robbins' strftime version 3.0.

     With these descriptors, formatting the date 1977-01-02 with the
     string "%G-W%V-%u" should give 1976-W53-7, and formatting
     1975-12-29 should give 1976-W01-1.
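
The relationship between %V, %G and %u described in b) and c) can be
sketched in a few lines of C.  This is only an illustration, not the code
from tzcode or N733: struct iso_week and iso_week_from_tm() are
hypothetical names, and the sketch assumes that tm_year, tm_yday and
tm_wday of the struct tm are already filled in consistently.  It locates
the Thursday of the week, because the year containing that Thursday is the
year containing the majority of the week's days.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical result type: the three values that %G, %V and %u print. */
struct iso_week {
    int year;  /* week-based year (what %G would print) */
    int week;  /* week number 1..53 (what %V would print) */
    int wday;  /* weekday 1..7, Monday = 1 (what %u would print) */
};

/* Number of days in a Gregorian calendar year. */
static int days_in_year(int y)
{
    return ((y % 4 == 0 && y % 100 != 0) || y % 400 == 0) ? 366 : 365;
}

/* Derive the ISO 8601 week date from tm_year, tm_yday and tm_wday. */
static struct iso_week iso_week_from_tm(const struct tm *tm)
{
    struct iso_week r;
    int thu;  /* zero-based day of year of this week's Thursday */

    assert(tm != NULL);
    r.year = tm->tm_year + 1900;
    r.wday = (tm->tm_wday + 6) % 7 + 1;   /* Sunday 0 becomes 7 */
    thu = tm->tm_yday + (4 - r.wday);

    if (thu < 0) {
        /* Thursday lies in the previous year: last week of that year. */
        r.year--;
        thu += days_in_year(r.year);
    } else if (thu >= days_in_year(r.year)) {
        /* Thursday lies in the next year: week 01 of that year. */
        thu -= days_in_year(r.year);
        r.year++;
    }
    r.week = thu / 7 + 1;   /* never 0, in line with comment a) */
    return r;
}
```

For 2000-01-02 (tm_year = 100, tm_yday = 1, tm_wday = 0) this gives year
1999, week 52, weekday 7, which matches the 1999-W52-7 example in b).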

Not directly related to N733, but affecting the same part of the standard,
I have a number of other suggestions:

  d) The range for %S and tm_sec is currently defined to be 00-61 to
     provide for "as many as two leap seconds".  This was based on a
     serious misunderstanding: there can never be two leap seconds
     per day, as becomes obvious from reading ITU-R Recommendation
     TF.460-4 (I can send you a copy if you are interested).  Since
     this 00-61 range is being widely quoted in other standards, this
     error should be fixed just to stop spreading this serious
     misconception of how leap seconds work.  The correct range is 00-60.
     This is not an interoperability problem, but fixing this would make
     WG14 look like they know what they are doing, and it is therefore
     a good idea.

  e) I wonder whether %W is used anywhere and whether this field could
     be dropped to simplify implementation and reduce memory cost.
     Countries that start the week with Monday normally use ISO 8601
     week numbers (%V) and not the scheme defined by %W.  I suspect %W
     was defined based on a misconception of how week numbers work in
     Europe.  Unless anyone can come up with an example where %W is used
     or needed, I suggest dropping it as it looks completely useless to
     me (and please don't quote standards that just copied %W from ISO C).

  f) In the definition of %y and %Y, the first two digits of a four-digit
     year are referred to as the "century", which is problematic, since
     the years 1999 and 2000 belong to the 20th century, but 2001 belongs
     to the 21st century. Suggested better wording: "%y is replaced by
     the last two digits of the year as a decimal number (00-99)".
     Again, not a serious interoperability problem, but it makes WG14
     look like they know what they are doing.

  g) mktime() is the inverse function of localtime(), but there exists
     no portable inverse function for gmtime() that converts a struct tm
     given in UTC into time_t.  This is a serious problem, and the addition
     of a new function (mkgmtime() might be a possible name) should
     be considered seriously.  It is not possible to invert gmtime()
     in a 100% portable way in an application program, and in practice I
     have encountered awful hacks like binary searches over the time_t
     range to invert gmtime() as portably as possible.
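
To illustrate why g) needs library support, here is a minimal sketch of
what applications end up writing instead of a search over the time_t
range.  It is not portable ISO C: it assumes that time_t is an arithmetic
count of seconds since 1970-01-01 00:00:00 UTC without leap seconds (the
POSIX convention, which ISO C does not guarantee), and that the struct tm
fields are already within their normal ranges.  The names
days_from_civil() and my_mkgmtime() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Days from 1970-01-01 to the proleptic Gregorian date y-m-d (m = 1..12).
   The year is shifted to start in March so that the leap day comes last. */
static long days_from_civil(long y, int m, int d)
{
    long era, yoe, doy, doe;

    y -= (m <= 2);
    era = (y >= 0 ? y : y - 399) / 400;                 /* 400-year cycle  */
    yoe = y - era * 400;                                /* year of cycle   */
    doy = (153L * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1; /* day of year,
                                                           from 1 March   */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;        /* day of cycle    */
    return era * 146097L + doe - 719468L;
}

/* Sketch of an inverse of gmtime(), ASSUMING a POSIX-style time_t.
   Unlike mktime(), it does not normalize out-of-range fields. */
static time_t my_mkgmtime(const struct tm *tm)
{
    long days;

    assert(tm != NULL);
    days = days_from_civil(tm->tm_year + 1900L, tm->tm_mon + 1, tm->tm_mday);
    return (time_t)days * 86400
         + tm->tm_hour * 3600L + tm->tm_min * 60L + tm->tm_sec;
}
```

On a system where the assumption holds, my_mkgmtime(gmtime(&t)) should
return t again.  On a system with a different time_t encoding the sketch
is simply wrong, which is exactly the portability gap described in g).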

See <http://www.ft.uni-erlangen.de/~mskuhn/iso-time.html> for further info.

Markus

-- 
Markus G. Kuhn, Computer Science grad student, Purdue
University, Indiana, USA -- email: kuhn at cs.purdue.edu




