strftime %y and negative years

Thu Aug 19 22:57:13 UTC 2004

"Olson, Arthur David (NIH/NCI)" <olsona at dc37a.nci.nih.gov> writes:

> ...the suggested change to strftime is trying to get %y to
> produce the "year within the century" as output.

Yes, that's correct: it's trying to produce the year modulo 100.

> But the latest IEEE Std 1003.1 calls for...
>
> 	%y	Replaced by the last two digits of the year as a decimal
>		number [00,99].

Read literally, this would have undefined behavior for the years -9
through 9 (since they don't have two digits), and would generate "10"
for the year -10, and so forth, which obviously disagrees with the
proposed "year modulo 100" semantics.
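
To make the two readings concrete, here is a minimal C sketch. The helper
names year_mod_100, last_two_digits and format_yy are made up for
illustration and are not taken from any of the implementations discussed.
The sketch assumes C99 integer division (truncation toward zero) and relies
on 1900 being a multiple of 100, so that tm_year modulo 100 equals the year
modulo 100.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: year modulo 100, always in [0, 99].
   tm_year counts years since 1900, and 1900 is a multiple of 100,
   so tm_year mod 100 equals (tm_year + 1900) mod 100.  Working on
   tm_year directly avoids overflow in tm_year + 1900. */
static int year_mod_100(int tm_year)
{
    int r = tm_year % 100;   /* C99: truncates toward zero, so -100 < r < 100 */
    return r < 0 ? r + 100 : r;
}

/* Hypothetical helper: the literal "last two digits" reading, i.e.
   the last two digits of the year's decimal representation. */
static int last_two_digits(int tm_year)
{
    long long year = (long long)tm_year + 1900;  /* widen to avoid overflow */
    if (year < 0)
        year = -year;
    return (int)(year % 100);
}

/* Format the modulo result as exactly two digits into buf (size >= 3). */
static void format_yy(char *buf, size_t size, int tm_year)
{
    assert(size >= 3);
    snprintf(buf, size, "%02d", year_mod_100(tm_year));
}
```

For the year -10 (tm_year == -1910), year_mod_100 yields 90 while
last_two_digits yields 10, which is the disagreement described above.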

However, I just checked 3 implementations and found that nobody obeys
the literal behavior for years before -9, that there's disagreement
about negative single-digit years, and that at least one traditional
Unix implementation mishandles years before 1900 (not too surprising,
since the Unix Version 7 ctime allowed only years in the range
1900..2099).

Here's what I found.

year (i.e.,    glibc 2.2.5   OpenBSD 3.4   Solaris 9
tm_year+1900)  (Debian                     patch 112874-29
               2.2.5-11.5)                 (64-bit sparc)
-101              99            -1           0/
-100              00            00           00
 -99              01           -99           '' (i.e., two apostrophes)
 ...
  -2              98            -2           0.
  -1              99            -1           0/
   0              00            00           00
   1              01            01           ''
 ...
   9              09            09           '/
  10              10            10           '0
  11              11            11           ('
 ...
  99              99            99           0/
 100              00            00           00
 101              01            01           ''
 ...
1899              99            99           0/
1900              00            00           00
 ...
The implementations agreed for years after 1899, until they
got to the year 2**31:
2**31 - 1         47            47           47
2**31             48           -48           48
2**31 + 1         49           -47           49
2**31 + 2         50           -46           50
 ...
2**31 + 1897      45           -51           45
2**31 + 1898      46           -50           46
2**31 + 1899      47           -49           47

So it appears that, in practice, the behavior of strftime %y is
undefined when tm_year is negative, or when tm_year+1900 exceeds
INT_MAX.
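
Anybody who wants to try this on another implementation could use something
like the following C sketch. The names probe_y and print_probe_table are
hypothetical, and the sketch assumes that an otherwise zeroed struct tm with
tm_mday set to 1 is acceptable input to strftime. The output depends entirely
on the local C library, so no particular result is assumed.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical probe: format the given calendar year with "%y".
   Returns the number of bytes strftime wrote (0 on failure). */
static size_t probe_y(long long year, char *buf, size_t size)
{
    struct tm tm;
    long long tm_year = year - 1900;

    /* The year must be representable as tm_year, which is an int. */
    assert(INT_MIN <= tm_year && tm_year <= INT_MAX);
    memset(&tm, 0, sizeof tm);
    tm.tm_mday = 1;               /* a valid day of month: January 1 */
    tm.tm_year = (int)tm_year;
    return strftime(buf, size, "%y", &tm);
}

/* Print one row per year; the strings shown are whatever the local
   C library produces. */
static void print_probe_table(void)
{
    static const long long years[] = {
        -101, -100, -99, -2, -1, 0, 1, 9, 10, 11, 99, 100, 101,
        1899, 1900, 2147483647LL, 2147483648LL, 2147483649LL
    };
    char buf[64];
    size_t i;

    for (i = 0; i < sizeof years / sizeof years[0]; i++) {
        size_t n = probe_y(years[i], buf, sizeof buf);
        printf("%12lld  \"%.*s\"\n", years[i], (int)n, buf);
    }
}
```

Calling print_probe_table() from main prints one line per probed year. The
last three entries assume a 32-bit int, so that they sit around the point
where tm_year+1900 no longer fits in an int.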

I don't know whether the standards committee would consider all this to be
a bug in the standard or in the implementations, but here's what I think.
The Solaris behavior is clearly buggy for years before 1900.
The OpenBSD behavior is clearly buggy for years after 2**31 - 1.
For years before 0, I suppose the choice between the OpenBSD 3.4 and
glibc 2.2.5 behaviors is debatable.  However, I'd say that the glibc 2.2.5
behavior is cleaner, since it's more regular, it always outputs two bytes,
and it doesn't output "-".

PS.  Disclaimer: I wrote that part of glibc 2.2.5 so my opinion is
hardly impartial.

PPS.  Ironic, isn't it?  The consensus is that new code shouldn't use
ctime, as it's obsolete and is undefined for years out of the
traditional range, and that new code should use strftime.  But in
practice, strftime has problems too.

PPPS.  I'll CC: this to the tz mailing list to see if anybody else has
experiences with other implementations in this area, or strong
opinions on the subject.