[tz] strftime %s

Robert Elz kre at munnari.OZ.AU
Mon Jan 15 08:06:03 UTC 2024


    Date:        Sun, 14 Jan 2024 20:33:05 -0500
    From:        Steve Summit via tz <tz at iana.org>
    Message-ID:  <2024Jan14.2033.scs.0003 at tanqueray.home>

  | But what I was convincing myself of was precisely, as you put
  | it, that the number generated by strftime %s:
  |
  | > ...is *not* the time_t value that produced this
  | > struct tm - it cannot be, as no such thing need exist.

Note you need to keep the correct standards in your head, and
know what each requires, and what each specifies.

What I wrote there is just to make it clear that one may do:

    #include <time.h>

    const char *
    func(void)
    {
	static char buf[80];

	struct tm T = {
		.tm_year = 2024 - 1900,
		.tm_mon = 1 - 1,
		.tm_mday = 15,
		.tm_hour = 12,
		.tm_min = 55,
		.tm_sec = 26,
		.tm_isdst = -1	/* or 0, or 1, your choice */
	};

	strftime(buf, sizeof buf, "%s", &T);
	return buf;
    }

then in my timezone, in a POSIX environment, func()
is required to return a pointer to the string "1705298126".

The value you'll get will be different, as it depends upon
your local time zone, but for any constant local timezone
(that is, if you do not alter TZ), the result is the same
string; it does not depend upon the current date or time
in any way at all.

If you do alter TZ you can discover the time_t value for
that particular instant of local time in various different
time zones, as many as you desire.
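
A minimal sketch of that, using mktime() rather than "%s" since the
two are specified to give the same value.  The helper name
local_to_epoch() and the POSIX-style TZ strings passed to it are
assumptions made for illustration only, not anything from a standard.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for setenv() and tzset() */
#endif
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: treat 2024-01-15 12:55:26 as local time in
 * the zone named by tz, and return the corresponding time_t.
 */
time_t
local_to_epoch(const char *tz)
{
	struct tm T = {
		.tm_year = 2024 - 1900,
		.tm_mon = 1 - 1,
		.tm_mday = 15,
		.tm_hour = 12,
		.tm_min = 55,
		.tm_sec = 26,
		.tm_isdst = -1
	};

	setenv("TZ", tz, 1);	/* select the zone for this conversion */
	tzset();
	return mktime(&T);	/* no time_t was used to build T */
}
```

With tz set to "<+07>-7" (a fixed UTC+0700 zone) this should return
1705298126, and with "UTC0" it should return 1705323326, exactly
seven hours later.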

[Aside: I did not compile test that, apologies for any random
syntax errors, etc, I introduced .. I did however validate the
struct tm data to time_t conversion for my timezone.]

The point here is that the %s conversion isn't giving the
time_t value that was used to generate T as no time_t value
was used for that, just the C code written above.  And
since the answer depends upon what timezone you execute it
in, there is no one right answer, expecting one is a mistake.

Anyone who believes that %s (or mktime()) is required to (or
even just should, with the standard changed to allow it)
return a particular value for a given struct tm
needs to explain how that is to work and keep code like the
above functioning correctly.   And note, code just like that
has always worked for mktime() (since the dark ages when mktime()
was first invented, before tzcode or tm_gmtoff existed) and
thus by extension for strftime("%s") which traditionally just
did mktime() on a copy of the struct tm handed to strftime()
and converted the result to a string (using snprintf probably).
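
A rough sketch of that traditional approach, not taken from any
particular libc.  The name epoch_seconds_string() is hypothetical,
and it assumes that long long is wide enough to hold a time_t.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: produce the text a "%s" conversion would,
 * by running mktime() on a copy of *tm and printing the result.
 * Returns what snprintf() returns.
 */
int
epoch_seconds_string(char *buf, size_t len, const struct tm *tm)
{
	struct tm copy = *tm;		/* mktime() may modify its argument */
	time_t t = mktime(&copy);

	/* assumption: long long can represent every time_t value */
	return snprintf(buf, len, "%lld", (long long)t);
}
```

The caller's struct tm is left untouched, and the string depends only
on the fields mktime() examines and on the current TZ setting.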

Note that T is a local variable in func()'s stack space,
the fields of the struct tm that are not explicitly set
will contain whatever stack garbage was there before the
call to func() and as we can sprinkle calls to func()
throughout our code, after any other random function has
just put who knows what on the stack, that stack garbage
can vary from call to call.   Hence, if the implementation
were to (say) use the value of tm_gmtoff (which is not
initialised above) in any way at all to compute the value
returned by the %s conversion, then the result would not always
be the same.  But it must be, there is only one time_t value which
you can pass to localtime which will generate 2024-01-15 12:55:26
in your local timezone -- except if that local time happens to
be in the overlap period when summer time has just ended and
local times run twice - though for that to happen would be
unusual indeed, that weird hour (or however long it happens
to be) typically happens in the middle of the night, and not
in the middle of the day on a Monday.
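
To make that exception concrete, here is a small sketch.  The zone
(US Eastern rules written as a POSIX TZ string), the function names
and the chosen instant are all assumptions picked for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* setenv(), tzset(), localtime_r() */
#endif
#include <stdlib.h>
#include <time.h>

/* Return 1 if a and b convert to the same local date and time of day. */
int
same_local_time(time_t a, time_t b)
{
	struct tm ta, tb;

	if (localtime_r(&a, &ta) == NULL || localtime_r(&b, &tb) == NULL)
		return 0;
	return ta.tm_year == tb.tm_year && ta.tm_mon == tb.tm_mon &&
	    ta.tm_mday == tb.tm_mday && ta.tm_hour == tb.tm_hour &&
	    ta.tm_min == tb.tm_min && ta.tm_sec == tb.tm_sec;
}

/* Hypothetical demo: the repeated hour when summer time ends. */
int
overlap_demo(void)
{
	time_t first = 1730611800;	/* 2024-11-03 05:30:00 UTC */

	setenv("TZ", "EST5EDT,M3.2.0,M11.1.0", 1);
	tzset();
	/* One hour apart, yet both read 01:30:00 on the local clock. */
	return same_local_time(first, first + 3600);
}
```

Only tm_isdst differs between the two struct tm values, which is why
it is the one field that can disambiguate such a local time.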

  | But before I convinced myself of this, whenever I used to see
  | that 'date +%s' was for printing what is, for all intents and
  | purposes, a raw time_t value, I imagined that's what it did:

It is what it does.   But remember that "date" is a POSIX command,
and obeys the POSIX standards, the C standard does not specify any
commands at all, just the language often used to write them.
So date(1) knows it is in a POSIX environment (anywhere else, and
what it does with a '+xxxx' operand, and how that would even be
specified to it if a date command even exists, is all someone
else's problem - some other standard, or some vendor's proprietary
specification, or whatever) and so time_t is an integer specifying
seconds since the epoch, and so that is what date +%s is
guaranteed to print (and given no other args to vary it, seconds
since the epoch for "now" when the command is issued).

You can rely upon that.

  | print the raw time_t value.  So it follows that if someone is
  | trying to implement all of date(1)'s '+' options using strftime,
  | with the implication that strftime has to be able to do %s, it
  | further follows that strftime has to be able to -- somehow --
  | access that raw time_t value.

Of course it can.   A time_t is a numeric value (even in C); it
isn't a struct or union, or something like that, so for a particular
environment there is some printf format conversion that we can
hand to sprintf() to convert the value to a string.   It might
be necessary to cast it to a long double or something first, but it
can always be done.

  | But, hang on, don't jump down my throat and correct me again,
  | because, I know: that's wrong.

Not really.

  | is to remind myself that strftime's computation of %s is *not*
  | a simple operation:

Not completely trivial no.   But not complex like attempting to
measure time at the quantum level, or anything like that either.

  | it's a complex transformation, more or less
  | exactly equivalent to mktime.

Yes, that is exactly what it is, which is why they are both
specified to generate the same results.

  | It's potentially lossy,

I'm not sure what that means - mktime() is (unfortunately, I have
been trying hard to get this changed to something rational, but the
POSIX people simply refuse to understand the issues) always defined
to return a value, except when the year (or in very unlikely cases
year and other fields combined) has an absolute value so large that
a time_t doesn't have enough bits to represent it ... which is
impossible with a 64 bit POSIX time_t in the common case where "int"
(which is what tm_year is) is just 32 bits.

Hence strftime(%s) always is as well.

  | it does what you expect (if your expectation is even correct) only
  | if you use it very carefully, paying attention to subtle facets
  | of the documentation which are easy to overlook or misinterpret.
  | One which has been mentioned is that TZ has not changed.

NO!

You can change TZ however you like.  If you do the result will
differ - it is intended to, that's why if you run the above code
fragment in your timezone you'll get a different value than I
get (presumably, unless you're in UTC+0700).

That's intended, and the way things are intended to work.   I
still think you're hung up on the notion that struct tm must always
come from a call to one of the *time() functions which return
such a struct (or a pointer to one) given a time_t input, and that
the value obtained should be that particular time_t value.

Stop believing that, that's not how it has ever worked, or is
intended to work.

  | Another is that tzset either has or has not been called.

How does that affect anything?

  | Yet another (which I don't think has been mentioned yet) is
  | that tm_isdst is set correctly.

Not that either - tm_isdst should be just a hint to mktime()
(and consequently to strftime(%s)) for the ambiguous cases.
Unfortunately, the bizarre desire to use localtime() and mktime()
to allow arithmetic operations on C time_t's has the POSIX
people demanding that tm_isdst be an instruction, rather than
the presumption that the standards have always previously said
it was (a presumption which can be rebutted if it turns out to
be incorrect - but to be useful for arithmetic, it needs to be
mandatory, and override local conventions).

Exactly how that is supposed to work still baffles me, as how
can someone possibly know what offset would be applied were
summer time in effect on some date right in the middle of winter
in some jurisdiction which has never had any summer time at
all (like where I am, yet if I set tm_isdst to 1 in the above
fragment, they require mktime to apply a dst correction which
is of unknown magnitude and unknown sign)?

  | But, yes, if you're careful of all those things, %s will work
  | correctly.  But will it do what you want?

That depends upon what you want.  Obviously.   If I want it to
magically inflate my bank account, then I will probably be
disappointed.   If you don't happen to want what it is specified
to do, you might be as well.   But if you simply want it to
behave as it is specified to do, then it should always work.

Wanting things to do other than what they are defined to do
(like wanting your car to operate as a submarine when submerged
in water) is a nice fantasy, but one that only ever seems to
work in movies.

  | there might be bugs in your understanding of what %s does,

Of course, and you fix that by learning.   Everyone starts out
knowing almost nothing about almost everything, and learns over
time.   However, your objective should be to really learn, not
guess, experiment a little and "confirm" the guess, and then
proclaim your guess to be the rule.   Unfortunately, that's
what far too many people do, with the "experiment a little"
often being "I tried it once and it worked".   And this doesn't
just apply here, it applies to everything we believe we know
to be true.   Make sure, don't just believe because it's
easier, and you're less likely to be eventually proven wrong.

  | > | > (Which brings me back to my conclusion that %s
  | > | > shouldn't exist, because it's impossible to implement correctly.
  | >
  | > Nonsense.   It is trivial to implement correctly.
  |
  | A laughable conclusion, given the complexity of this thread!

Not at all.   I have done it.   It isn't hard at all.   What is
hard is convincing people that their long held belief of
just what must be correct (because they never happen to have
observed anything different) is in fact wrong.   That is hard.

  | But I think you mean, the long and the short of a proper %s
  | implementation is to call mktime on the struct tm handed to
  | strftime, and interpolate the result.

It would have to be on a copy of the struct tm, not the actual
one, as mktime() might modify it, and we don't want that for
strftime - there might be other conversions still coming
in the format string, and we need to use the original values
for those, not ones altered by mktime().   But you only need to
worry about that if you're implementing strftime().
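
A small sketch of why the copy matters: mktime() rewrites the struct
it is given.  The function name and the fixed "UTC0" zone are
assumptions chosen only to keep the demonstration simple, and the out
of range day is used purely to make the modification easy to see.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for setenv() and tzset() */
#endif
#include <stdlib.h>
#include <time.h>

/* Hand mktime() "January 32" and return the struct as it left it. */
struct tm
normalised_jan_32(void)
{
	struct tm T = {
		.tm_year = 2024 - 1900,
		.tm_mon = 1 - 1,
		.tm_mday = 32,		/* deliberately out of range */
		.tm_hour = 12,
		.tm_isdst = -1
	};

	setenv("TZ", "UTC0", 1);	/* fixed zone, for the demo only */
	tzset();
	(void)mktime(&T);		/* rewrites T in place */
	return T;
}
```

After the call the struct describes 1 February 2024, with tm_wday and
tm_yday filled in to match, so a later %d or %B conversion applied to
the same struct would no longer see the values the caller supplied.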

That's certainly an easy way - but as mktime() first goes about
validating the ranges of all the (relevant) struct tm fields,
and adjusting them (and then others to compensate) and also
setting up the other fields in the struct to agree (tm_wday,
tm_yday etc) that strftime() doesn't need to do - it can assume
that all the fields are already within range, as its result is
unspecified if the user doesn't guarantee that.   So it can
simply do the latter half of what mktime() does, perhaps using
some private internal function which both mktime() and strftime()
use, or perhaps just duplicating the code, or using a different
algorithm which produces the same result.   That's the implementor's
choice, and users should not worry about it.

  | (But it's like that old joke about the lecturer who,
  | after being questioned about whether a certain result is truly
  | "obvious", spends half an hour alternately deep in thought or
  | scribbling abstrusely on the chalkboard, before triumphantly
  | concluding, "Yes, I was right, it is obvious.")

Yes, heard that before - and people who believe it is humorous
do not understand what it means to be "obvious" - which just
means that the result is guaranteed from known facts, and cannot
be different, not that the process of determining that is quick.

This is another of the things where common use of a word has
lost its true meaning - another is "theory" where people will
say "my theory is that ..." where they mean (at best) "my hypothesis..."
and far more often "my unsupported random guess..."

But theory is from the same root as theorem, and means proven.
Not a guess.   Of course, someone might, one day, find a flaw
in the proof, but until that happens, a theory should be regarded
as a fact.   But in the common mindset, it tends to suggest it
is just a guess, as that's how people misuse the word all the time.

  | An implementation that perfectly implements a
  | useless specification isn't useful.

True.   But it must have been useful to someone, sometime,
for them to have implemented and specified it that way.
That it doesn't meet your particular need doesn't mean
it isn't useful to anyone, just not useful to you.  You
might need something different - just don't break what
other people need because it isn't what you need.

  | No, I know, but if the implementor of date(1) has a specification
  | of the format specifiers accepted by '+', it might be prudent to
  | vet that list against the specification of the strftime call
  | that's about to be used.

Why?   That's strftime()'s job, it is what really knows, and more
specifications can be added over time, without needing to go
fiddle with the internals of some command (like date) which just
happens to use it.   strftime() will return "" if there is a
problem in the format string - that's why we leave the '+' in
the format that date passes to strftime - that way if date gets
a "" result, it knows there was an error, and can print a
diagnostic.  If the result starts with '+', which it must if
no error occurred, as that will (for date(1)) always be the
first char of the format string passed in, and only the conversions,
which always start with a %, are modified, then date knows that
strftime() worked, and can simply print the result (without that
'+' which is not intended to appear - the user's strftime() format
is what followed that '+' in date's arg list).
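
That technique might be sketched as below.  This is not the code of
any real date(1); the name format_date_operand() and the 256 byte
buffer are hypothetical, and it assumes that strftime() reports
failure by returning 0.

```c
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: arg is date's operand, including the leading
 * '+'.  Returns 0 and stores the text to print in out, or returns
 * -1 if something went wrong.
 */
int
format_date_operand(const char *arg, const struct tm *tm,
    char *out, size_t outlen)
{
	char buf[256];
	size_t n;

	if (arg[0] != '+')
		return -1;
	/*
	 * Keep the '+': any successful result then has at least one
	 * character, so a return of 0 can only mean failure.
	 */
	n = strftime(buf, sizeof buf, arg, tm);
	if (n == 0 || buf[0] != '+' || n > outlen)
		return -1;
	memcpy(out, buf + 1, n);	/* n - 1 chars plus the NUL */
	return 0;
}
```

A bare "+" operand then succeeds with an empty string to print, which
is distinguishable from the failure case, and that is the whole point
of leaving the '+' in place.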

  | And of course that's precisely how some implementations of
  | mktime *do* work!

Yes, I know, I did the first of those (not my idea of how to
implement it, that I was told about, but I wrote that code) -
long long long ago (the actual code has been much improved over
time, so I doubt you'll see any of my actual text, unless you look
at some ancient archive - and there's no reason to do that).

kre


