[tz] [PROPOSED 2/2] strftime: conform better to POSIX+

Wed Aug 12 13:25:03 UTC 2020

    Date:        Tue, 11 Aug 2020 19:05:39 -0700
    From:        Paul Eggert <eggert at cs.ucla.edu>
    Message-ID:  <85aad8e3-99dd-d428-947b-3f53b30be64f at cs.ucla.edu>

  | It's too late for that. Many APIs have already been designed.

APIs that are ever used in a way that the ambiguous case can arise??

I think it might be just %Z (rarely) and %p (sometimes) which ever
produce an empty string.   How many of those other APIs would ever
be given either "%Z" or "%p" as their arg?  If that doesn't happen
they don't have a problem.

  | The POSIX draft is already doing that to strftime, by requiring EOVERFLOW
  | where many implementations don't do EOVERFLOW,

As I'm sure you have seen on the Austin list, I agree with your point
there: that should not be required.   But to add it as a "may fail"
all that is needed is any one implementation that does that.

  | and similarly for EINVAL for the odd 
  | implementations that don't support negative time_t.

Same there.

  | As long as it's specifying 
  | errno de novo anyway, it might as well fix this longtime botch.

The problem with this one is that all implementations need to support it,
or it is worthless.   If code has to deal with the possibility that the
implementation doesn't return ERANGE (and of course, most don't currently)
then the fact that it might happen to be running on one that does is largely
immaterial.   The burden of adding this kind of change is considerably greater.

  | I have a similar beef with mktime, by the way: there's not a
  | well-designed way to distinguish success from failure.

It isn't trivial, but if -1 is returned, it isn't all that difficult to
check whether there's any possibility the tm might be set to 23:59:59 Dec 31
1969 (UTC).  In most cases it is safe to simply assume that won't be the
case, and treat -1 as the error indicator only, and not as a possibly
valid time_t, but when more is needed, the check is not all that difficult.

After a successful return (which is what it would be if the -1 indicates
that 1969 date) the tm has been normalised.  Dec 31, 1969 was a Wednesday,
so set tm_wday to 0 before the call, and then check if on return it has
become 3 or 4 (4 in the case of a timezone east of UTC).  If not, then we
know the -1 means error.

After that there are just two cases to check for.  If tm_wday is 3, then
tm_mon would be 11, tm_year 69, tm_mday 31, and tm_sec 59 (leave hour and
min unchecked initially to avoid needing to worry about the zone offset).
If tm_wday is 4, then it should be tm_mon 0, tm_mday 1, tm_year 70, tm_sec 59.
If any of those checks fail then the -1 was an error return.   After that, if
you really wanted, you could look at the zone offset, and check tm_hour and
tm_min as well, but I think it would be reasonable in that case to
assume that the -1 really did mean 31 Dec 1969 23:59:59.
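
The tm_wday check above might be sketched roughly as follows.  The
function names are hypothetical, and the sketch assumes (as the
description does) that a failing mktime() leaves tm_wday as it was
set before the call, which not every implementation need guarantee.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper: given the struct tm as mktime() left it after
 * returning -1, with tm_wday preset to 0 before the call, decide
 * whether the -1 can be the real time 23:59:59 UTC on 31 Dec 1969.
 * tm_hour and tm_min are deliberately left unchecked, so the zone
 * offset does not need to be known.
 */
static bool minus_one_is_real_time(const struct tm *tm)
{
    if (tm->tm_wday == 3)       /* Wednesday: zones at or west of UTC */
        return tm->tm_year == 69 && tm->tm_mon == 11 &&
               tm->tm_mday == 31 && tm->tm_sec == 59;
    if (tm->tm_wday == 4)       /* Thursday: zones east of UTC */
        return tm->tm_year == 70 && tm->tm_mon == 0 &&
               tm->tm_mday == 1 && tm->tm_sec == 59;
    return false;               /* tm_wday not 3 or 4: error return */
}

/*
 * Hypothetical wrapper: calls mktime() and reports through *failed
 * whether a -1 result should be treated as an error.
 */
static time_t mktime_checked(struct tm *tm, bool *failed)
{
    time_t t;

    tm->tm_wday = 0;            /* sentinel, overwritten on success */
    t = mktime(tm);
    *failed = (t == (time_t)-1) && !minus_one_is_real_time(tm);
    return t;
}
```

A caller would test *failed rather than comparing the result with -1.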

Another way, if -1 is returned, is to take the original struct tm,
do tm_sec++, and then mktime() again.  If you now get 0 as the result
then the original -1 was the 31 Dec 1969 version, otherwise it was
an error (whether the 2nd mktime() returns -1 or not - anything but 0).
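
A minimal sketch of that second approach, again with a hypothetical
function name.  It keeps a copy of the caller's struct tm from before
the first call, and, as a simplification of my own, treats tm_sec ==
INT_MAX as an error rather than incrementing it.

```c
#include <limits.h>
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical wrapper: calls mktime(), and if it returns -1, retries
 * with tm_sec one second later.  Only a result of exactly 0 from the
 * retry means the first -1 was the real time 31 Dec 1969 23:59:59 UTC.
 */
static time_t mktime_retry(struct tm *tm, bool *failed)
{
    struct tm saved = *tm;      /* the original, pre-call fields */
    time_t t = mktime(tm);

    *failed = false;
    if (t == (time_t)-1) {
        if (saved.tm_sec == INT_MAX) {
            *failed = true;     /* cannot do tm_sec++ without overflow */
        } else {
            saved.tm_sec++;
            *failed = (mktime(&saved) != 0);
        }
    }
    return t;
}
```

This costs a second mktime() call, but only on the rare -1 path, and
it does not depend on what a failing mktime() does to the struct tm.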

kre