[tz] strftime %s

Tue Jan 16 00:09:55 UTC 2024

On 2024-01-15 01:38, Robert Elz wrote:

>    | Although it's overkill for the POSIX strftime spec to require
>    | struct tm components to be set when they don't need to be set,
> 
> You're missing the point of it all.  The components that it lists
> are what some correct implementations need to be set to function.

I'm not missing the point; I'm trying to clarify and/or fix it.

POSIX has never required implementations to look at every listed struct 
tm member to function correctly in every use of the corresponding 
conversion spec. Nor has it required that the result of the conversion 
spec be completely determined by the contents of the struct tm members. 
The relation between the struct tm member values and the conversion 
outputs is more subtle than that.

>    | it's not that big a deal. In practice nearly
>    | every app calls strftime on the result of localtime etc.
> 
> Is there some evidence to support that?

Sure, look at tzcode. Or at coreutils. Or at Emacs. Or at 'tar'. Or at 
pretty much every app that uses strftime. Although it's theoretically 
possible that there are exceptions for the edge cases we're talking 
about, so far in this thread we've seen zero real-world examples.

> why would you penalise those other apps which don't?

I'm not trying to *penalize* any unusual code that trips over these edge 
cases. I'm trying to *help* it. If code relies on bugs or incorrect 
interpretations of odd corners of the POSIX spec, it'll get wrong 
answers. Standards should be worded clearly to help prevent this sort of 
confusion.

> But yes, once the next POSIX is published, then the tm_gmtoff field
> will be available to %z and tm_zone to %Z, and simply using those
> will be easy to do.   Of course, if you do it that way, you're
> breaking any existing applications which were written to either
> conform to the C standards (any of them)

No, it doesn't break conforming C programs that use these oddball edge 
cases. The C standard doesn't specify how the implementation determines 
the timezone. It can be TZ or it can be something else. So even if the 
behavior changes, it's not a violation of the C standard, as a 
conforming C program will still work as the C standard requires.

>    | All that's needed is for strftime
>    | to compute seconds since the Epoch in the usual way (i.e., using the
>    | Gregorian calendar and ignoring leap seconds),
> 
> But aside from correcting for out of range values, which strftime is
> not required to do, that's exactly what mktime() is specified to do.

That's a pretty big aside....

The intent of this part of the strftime spec, as I see it, is to say 
that strftime should use the standard POSIX way of breaking down time 
(Gregorian, no leap seconds) - not all the other mktime machinery.
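
To make that concrete, here is a rough sketch of the arithmetic I have in
mind. It is only an illustration, not proposed spec wording or tzcode's
actual code. The names days_from_civil and epoch_seconds are made up for
this message, and passing the UTC offset as a separate argument is my
assumption about where it would come from (e.g., tm_gmtoff).

```c
#include <time.h>

/* Days from 1970-01-01 to YEAR-MONTH-DAY in the proleptic Gregorian
   calendar.  MONTH is 1..12.  Hypothetical helper for illustration.  */
static long long
days_from_civil (long long year, int month, int day)
{
  year -= month <= 2;  /* count Jan and Feb as the end of the prior year */
  long long era = (year >= 0 ? year : year - 399) / 400;
  int yoe = (int) (year - era * 400);                 /* 0..399 */
  int doy = (153 * (month + (month > 2 ? -3 : 9)) + 2) / 5 + day - 1;
  int doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;    /* 0..146096 */
  return era * 146097 + doe - 719468;
}

/* Seconds since the Epoch for the broken-down time *TM, assuming it is
   UTC_OFFSET seconds east of UTC.  No leap seconds, no TZ lookup, and
   no writing back of normalized fields as mktime would do.  */
long long
epoch_seconds (const struct tm *tm, long utc_offset)
{
  long long days = days_from_civil (tm->tm_year + 1900LL,
                                    tm->tm_mon + 1, tm->tm_mday);
  return (days * 86400 + tm->tm_hour * 3600L + tm->tm_min * 60
          + tm->tm_sec - utc_offset);
}
```

For example, the fields for 2024-01-16 00:09:55 with an offset of 0 give
1705363795, and the same fields with an offset of -28800 (eight hours west
of UTC) give 1705392595.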

> That UTC offset *must* come from the TZ value, such that if TZ is
> altered to refer to some other offset, then the result from mktime()
> (and hence from strftime(%s)) must change.

This is incorrect for two reasons. First, changing TZ need not alter 
mktime's result. Second and more important, even if you *don't* change 
TZ, two calls to mktime can yield different answers for the same 
in-range inputs, so the POSIX 202x/D4 spec does not completely specify 
strftime's output.

This second property is inherent to the inadequacy of mktime's API. And 
it undercuts any argument that strftime %s and mktime must always 
produce exactly the same output.
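
One way the ambiguity can arise, sketched under some assumptions: during a
fall-back transition two different time_t values map to the same
year/month/day/hour/minute/second, so mktime with tm_isdst = -1 has no
unique answer to give. The TZ string below and the function names are
picked just for this illustration, and this is not necessarily the
scenario used in any defect report.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* 2023-11-05 01:30:00 occurs twice under US Pacific rules.  */
#define FIRST_0130  ((time_t) 1699173000)  /* 08:30 UTC, 01:30 PDT */
#define SECOND_0130 ((time_t) 1699176600)  /* 09:30 UTC, 01:30 PST */

/* Use a POSIX-style TZ string so no tzdata files are needed.  */
static void
use_example_zone (void)
{
  setenv ("TZ", "PST8PDT,M3.2.0,M11.1.0", 1);
  tzset ();
}

/* Return 1 if the two instants have identical calendar fields and
   differ only in tm_isdst.  */
int
wall_clock_is_ambiguous (void)
{
  time_t t1 = FIRST_0130, t2 = SECOND_0130;
  struct tm a, b;
  use_example_zone ();
  if (!localtime_r (&t1, &a) || !localtime_r (&t2, &b))
    return 0;
  return (a.tm_year == b.tm_year && a.tm_mon == b.tm_mon
          && a.tm_mday == b.tm_mday && a.tm_hour == b.tm_hour
          && a.tm_min == b.tm_min && a.tm_sec == b.tm_sec
          && a.tm_isdst > 0 && b.tm_isdst == 0);
}

/* Ask mktime to resolve 01:30 with tm_isdst = -1.  Either instant is a
   valid answer, and the caller cannot say which one it wants.  */
time_t
mktime_guess (void)
{
  struct tm tm = {0};
  use_example_zone ();
  tm.tm_year = 2023 - 1900;
  tm.tm_mon = 10;   /* November */
  tm.tm_mday = 5;
  tm.tm_hour = 1;
  tm.tm_min = 30;
  tm.tm_isdst = -1;
  return mktime (&tm);
}
```

Here wall_clock_is_ambiguous should return 1, while mktime_guess may
legitimately return either of the two instants. An API that takes only
those fields cannot distinguish them.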

> But by all means, submit a defect report, and see how far that gets
> you.

OK, I've done that here:

https://www.austingroupbugs.net/view.php?id=1797

It uses an example that is a bit sharper than what we've discussed so 
far, in that the example exploits the abovementioned inadequacy of mktime's API.