ISO C strftime() extension for ISO 8601
Markus G. Kuhn
kuhn at cs.purdue.edu
Mon Jun 30 20:09:55 UTC 1997
"Clive D.W. Feather" wrote on 1997-06-30 13:14 UTC:
> You may be interested to know that WG14 adopted last week working paper
> N733, which adds the following items to the strftime() function:
> %f is replaced by the weekday as a decimal number (1-7), where
> Monday is 1 (the ISO 8601 weekday number).
> %F is equivalent to "%Y-%m-%d" (the ISO 8601 date format).
> %T is equivalent to "%H:%M:%S" (the ISO 8601 time format).
> %V is replaced by the ISO 8601 week number of the year (weeks
> begin on a Monday, and week 1 is the week that includes both
> January 4th and the first Thursday of the year) as a decimal
> number (00-53).
Thanks a lot for forwarding this interesting working paper. Please forward
the following comments to the authors of N733 and whoever else might be
interested. If possible, I would like to review the final text of the
new strftime() definition (will the current draft be on the Web
Comments on WG14 proposal N733
Markus Kuhn <kuhn at cs.purdue.edu> -- 1997-06-30 (= 1997-W27-1)
I appreciate this proposal and I have the following related comments.
a) The valid range for %V numbers is 01-53. There is no week 00 in the
ISO 8601 week numbering scheme. The week before a week 01 is either
week 52 or week 53 of the previous year.
b) %V alone is not sufficient to be able to use the ISO week numbering
system. Each week is associated with a year, the year in which the
majority of the days of this week fall, and this is not necessarily
the year in which all days of the week fall. For instance: The
week 1999-W52 goes from 1999-12-27 to 2000-01-02, in other
words, the day 2000-01-02 has a week notation of the
form 1999-W52-7. There should be another format descriptor for
the year to which the current ISO week belongs, preferably both
in 4-digit (%G) and 2-digit (%g) form.
If you implement an algorithm for %V, you'll get the value of %G
anyway as a by-product very easily, and therefore it should be
made available to the strftime() user.
c) Existing practice: The Olson tzcode package <ftp://elsie.nci.nih.gov/pub/>
contains a widely used strftime() implementation that already supports:
%u ISO 8601 weekday number (1 = Monday, 7 = Sunday)
%V ISO 8601 week number (01-53)
%G ISO 8601 year of current week, 4-digits
%g ISO 8601 year of current week, 2-digits
Unless there is a good rationale for the characters suggested
by N733, I would suggest sticking with %u instead of %f for the
weekday number, and I hope that you will add %G and %g as used
in the Olson package and Arnold Robbins' strftime version 3.0.
Therefore, if I evaluate the format string "%G-W%V-%u" on 1977-01-02, I
should get 1976-W53-7, and on 1975-12-29 I should get 1976-W01-1.
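
The claim in b) that %G falls out of the %V computation as a by-product
can be illustrated with a short sketch. This is not the tzcode
implementation; the type iso_week_date and the functions
iso_week_from_tm() and iso_week_format() are hypothetical names chosen
for this example. It relies on the fact that an ISO week belongs to the
year that contains its Thursday, and it assumes that tm_year, tm_yday
and tm_wday in the struct tm are mutually consistent.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical result type; not part of any standard library. */
struct iso_week_date {
    int year;  /* what %G would print */
    int week;  /* what %V would print, 1..53 */
    int wday;  /* what %u would print, 1 = Monday .. 7 = Sunday */
};

static int days_in_year(int y)
{
    return ((y % 4 == 0 && y % 100 != 0) || y % 400 == 0) ? 366 : 365;
}

/* Uses only tm_year, tm_yday and tm_wday of the broken-down time. */
static struct iso_week_date iso_week_from_tm(const struct tm *tm)
{
    struct iso_week_date r;
    int thursday;

    r.year = tm->tm_year + 1900;
    r.wday = (tm->tm_wday + 6) % 7 + 1;   /* Sunday 0 becomes 7 */

    /* 1-based day of year of the Thursday in the same Mon..Sun week. */
    thursday = tm->tm_yday + 1 + (4 - r.wday);

    if (thursday < 1) {
        /* The Thursday lies in the previous year. */
        r.year--;
        thursday += days_in_year(r.year);
    } else if (thursday > days_in_year(r.year)) {
        /* The Thursday lies in the next year. */
        thursday -= days_in_year(r.year);
        r.year++;
    }

    /* The week number is the count of Thursdays so far in that year. */
    r.week = (thursday - 1) / 7 + 1;
    return r;
}

/* Writes the equivalent of "%G-W%V-%u" into buf. */
static int iso_week_format(const struct tm *tm, char *buf, size_t size)
{
    struct iso_week_date d = iso_week_from_tm(tm);
    return snprintf(buf, size, "%04d-W%02d-%d", d.year, d.week, d.wday);
}
```

The year adjustment and the week number come out of the same
"ordinal of the Thursday" value, so an implementation that computes one
has the other at no extra cost. For a struct tm describing 1977-01-02
(tm_year = 77, tm_yday = 1, tm_wday = 0), iso_week_format() produces
1976-W53-7, matching the example above.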
Not directly related to N733, but affecting the same part of the standard,
I have a number of other suggestions:
d) The range for %S and tm_sec is currently defined to be 00-61 to
provide for "as many as two leap seconds". This was based on a
serious misunderstanding: there can never be two leap seconds
in one day, as becomes obvious from reading ITU-R Recommendation
TF.460-4 (I can send you a copy if you are interested). Since
this 00-61 range is being widely quoted in other standards, this
error should be fixed just to stop the spread of this serious
misconception of how leap seconds work. The correct range is 00-60.
This is not an interoperability problem, but fixing this would make
WG14 look like they know what they are doing, and it is therefore
a good idea.
e) I wonder whether %W is used anywhere and whether this field could
be dropped to simplify implementation and reduce memory cost. Countries
that start the week with Monday normally use ISO 8601 week numbers
(%V) and not the scheme defined by %W. I suspect %W was defined based
on a misconception of how week numbers work in Europe. Unless
anyone can come up with an example where %W is used or needed, I
suggest dropping it, as it looks completely useless to me (and please
don't quote standards that just copied the %W from ISO C).
f) In the definition of %y and %Y, the first two digits of a four-digit
year are referred to as the "century", which is problematic, since
the years 1999 and 2000 belong to the 20th century, but 2001 belongs
to the 21st century. Suggested better wording: "%y is replaced by
the last two digits of the year as a decimal number (00-99)".
Again, not a serious interoperability problem, but it makes WG14
look like they know what they are doing.
g) mktime() is the inverse function of localtime(), but there exists
no portable inverse function for gmtime() that converts a struct tm
given in UTC into time_t. This is a serious problem, and the addition
of a new function (e.g., mkgmtime() might be a possible name) should
be considered seriously. It is not possible to invert gmtime()
in a 100% portable way in an application program, and in practice I
have encountered awful hacks like binary searches over the time_t
range to invert gmtime() as portably as possible.
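
To make the problem in g) concrete, here is a minimal sketch of what
applications typically end up writing instead of a binary search. The
names days_from_civil() and sketch_mkgmtime() are hypothetical, and the
sketch rests on an assumption that ISO C does not guarantee: that time_t
counts seconds since 1970-01-01 00:00:00 UTC without leap seconds (as
POSIX systems do). That assumption is exactly why this cannot be called
100% portable. It returns long long rather than time_t to stay clear of
time_t range questions, and it does not normalise out-of-range fields
the way mktime() does.

```c
#include <time.h>

/*
 * Days between 1970-01-01 and the given proleptic Gregorian date.
 * y is the full year, m is 1..12, d is 1..31.
 * The year is treated as starting on March 1 so that the leap day
 * falls at the end of the shifted year.
 */
static long long days_from_civil(long long y, int m, int d)
{
    long long era, yoe, doy, doe;

    y -= (m <= 2);                               /* Jan/Feb belong to prior shifted year */
    era = (y >= 0 ? y : y - 399) / 400;          /* 400-year cycle index */
    yoe = y - era * 400;                         /* year within cycle, 0..399 */
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  /* day within shifted year */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; /* day within cycle */
    return era * 146097 + doe - 719468;          /* 719468 = days from 0000-03-01 to 1970-01-01 */
}

/*
 * Hypothetical inverse of gmtime(), ASSUMING a POSIX-style epoch.
 * Reads tm_year, tm_mon, tm_mday, tm_hour, tm_min and tm_sec only.
 */
static long long sketch_mkgmtime(const struct tm *tm)
{
    long long days = days_from_civil(tm->tm_year + 1900LL,
                                     tm->tm_mon + 1, tm->tm_mday);
    return ((days * 24 + tm->tm_hour) * 60 + tm->tm_min) * 60 + tm->tm_sec;
}
```

On a system where the stated assumption holds, casting the result to
time_t and passing it to gmtime() should reproduce the input fields. On
a system with a different time_t encoding, only a function supplied by
the implementation itself can do this correctly, which is the argument
for standardising one.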
See <http://www.ft.uni-erlangen.de/~mskuhn/iso-time.html> for further info.
Markus G. Kuhn, Computer Science grad student, Purdue
University, Indiana, USA -- email: kuhn at cs.purdue.edu