[tz] Extra transition for Europe/London with 2023d

Guy Harris gharris at sonic.net
Tue Jan 9 22:57:23 UTC 2024


On Jan 9, 2024, at 1:10 PM, Brooks Harris <brooks at edlmax.com> wrote:

> On 1/7/2024 7:51 PM, Guy Harris wrote:
>> On Jan 7, 2024, at 12:01 PM, Brooks Harris <brooks at edlmax.com> wrote:
>> 
>> 
>>> Right, but I think there should be. Posix cannot distinguish an stdoff shift independent from a gmtoff shift.
>>> 
>> So presumably:
>> 
>> "stdoff shift" is short for "a shift in the offset between UTC and standard time", "standard time" being what is specified as such by law;
>> 
>> "gmtoff shift" is short for "a shift in the offset between civil time and standard time".
>> 
>> Given that, not all shifts in "the offset between UTC and civil time" are "gmtoff shifts", which is a bit confusing, given that "gmtoff" sounds as if it's an offset from "GMT" or UTC.
> 
> stdoff is the "standard" offset from UT or UTC *without DST*.  gmtoff is the offset from UT or UTC *with DST*. There is no "dstoff" to signal the DST value in effect, which is usually 1-hour but can be negative (Dublin), "double summertime" or possibly some other value. This is where the isdst flag is insufficient to cover those non-1-hour cases.

In TZif files, there is neither stdoff nor dstoff; there's just gmtoff.

In zic *source* files, there's stdoff (in the STDOFF column) and dstoff (in the SAVE column).  There is no gmtoff; that's calculated by zic and put into the TZif file.
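As a rough illustration of that calculation (a sketch only, not zic's actual code), the offset that ends up in the TZif file is the sum of the STDOFF and SAVE values.  The helper names parse_hms and gmtoff are made up for this example.  The -6:00 and 1:00 values are the America/Chicago STDOFF and the "US" rule SAVE from the northamerica file; the Dublin values assume the main (non-rearguard) form of the data, where Irish winter time is expressed as a negative SAVE.

```python
def parse_hms(field: str) -> int:
    """Convert a zic-style [-]h[:mm[:ss]] field to seconds.

    Hypothetical helper for illustration; it does not handle the
    suffix letters that zic accepts on some fields.
    """
    sign = -1 if field.startswith("-") else 1
    parts = [int(p) for p in field.lstrip("-").split(":")]
    parts += [0] * (3 - len(parts))  # pad missing minutes/seconds
    hours, minutes, seconds = parts
    return sign * (hours * 3600 + minutes * 60 + seconds)


def gmtoff(stdoff: str, save: str) -> int:
    """Total offset from UT in seconds: STDOFF column plus SAVE column."""
    return parse_hms(stdoff) + parse_hms(save)


chicago_standard = gmtoff("-6:00", "0")     # CST
chicago_daylight = gmtoff("-6:00", "1:00")  # CDT
dublin_winter = gmtoff("1:00", "-1:00")     # assumed negative-SAVE data form
```

Only the sum survives in the compiled file, which is why a reader of a TZif file cannot tell a STDOFF change from a SAVE change without extra information.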

> The term "standard" is somewhat ambiguous in general but understood within the TzDb context as the "normal" or "base" offset from UT or UTC

In particular, it's what the relevant laws deem to be "standard time"; that's why standard time is in effect in summer in Ireland.

>> Not all "gmtoff shifts" constitute transitions to or from "daylight saving time", as Morocco, for example, has on some occasions introduced daylight saving time, but also shifts the clock during Ramadan.
>> 
> Yes. Which brings up the terms "spring forward" and "fall back" which imply 1-hour shifts in the spring and fall, but this doesn't work for the four transitions in Morocco and elsewhere or for negative DST. 

So those terms should be avoided.  (They don't imply 1-hour shifts; "spring forward an hour" and "fall back an hour" would.)

>> I don't know what program produced the output you're showing; how did it incorrectly infer that the offset between standard time and UTC before the transition was 0 rather than +1 hour (3600 seconds)?
>> 
> This is output from my modified version of zdump reading the TZif output of my modified version of zic. These include a modified version of struct tm which I've renamed "struct tztm", to which I've added long int tm_stdoff which is populated by the values of STDOFF from the TzDb source files. This tm_stdoff value is also added to the TZif file data.

I.e., it's a modified version of zdump reading the *modified-TZif* output of your modified version of zic.

That should have been stated up front.

> Right. Perhaps "custom" is a misleading word. I meant that Posix seems to support USA rules ok

*Current* USA rules; for earlier USA rules, see below....

The POSIX API has no problems with converting "seconds since the Epoch" to year/month/day/hour/minute/second in local time, even for local time in, for example, Morocco; the only issue with Ireland is "what does tm_isdst mean" - does it mean "time is shifted from standard time" or does it mean "time is set ahead for the summer"?

The problems that the POSIX API has are with the time zone designation strings and time offsets.  Those are *not* stored in the POSIX struct tm; they are, instead, stored in global variables.  The tzdb code makes an attempt to handle that, by setting the global variables when a time is converted.

Setting global variables that way is also very much not thread-safe.

The current draft of the next revision of POSIX has tm_zone and tm_gmtoff, which addresses those problems (except that it says that, if a program calls localtime() or even localtime_r() in one thread, and another thread changes the value of the TZ environment variable between the time when localtime()/localtime_r() filled in the structure and when tm_zone is used in that structure, the result is undefined).
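To make the difference concrete, here is a small sketch using Python's time module, which exposes both the process-wide values and the per-result tm_gmtoff/tm_zone fields on platforms that provide them.  The TZ string is an assumption chosen so that the example doesn't depend on installed tzdata files, and time.tzset() is Unix-only.

```python
import os
import time

# Assumption: a POSIX-style TZ string approximating current US Eastern rules.
os.environ["TZ"] = "EST5EDT,M3.2.0,M11.1.0"
time.tzset()  # Unix only

# 2022-11-06 04:00:00 UTC, two hours before the change back to standard time.
t = 1667707200
tm = time.localtime(t)

# Process-wide values: they describe the zone in general and say nothing
# about which offset or designation applies to this particular time.
global_view = (time.timezone, time.altzone, time.tzname)

# Per-result fields: the offset and designation in effect at time t.
per_result = (tm.tm_gmtoff, tm.tm_zone, tm.tm_isdst)
```

Note that the global values use the opposite sign convention (seconds west of UTC) from tm_gmtoff (seconds east of UTC).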

> but is not complete for many other time zones that act differently than the rules we US-based people are most familiar with.

Which means "if the standard offset from UTC changes or the time zone designation string changes", *both* of which have happened in the US in the past.  For example, the "US" rules in the northamerica file are

	# Rule	NAME	FROM	TO	-	IN	ON	AT	SAVE	LETTER/S
	Rule	US	1918	1919	-	Mar	lastSun	2:00	1:00	D
	Rule	US	1918	1919	-	Oct	lastSun	2:00	0	S
	Rule	US	1942	only	-	Feb	9	2:00	1:00	W # War
	Rule	US	1945	only	-	Aug	14	23:00u	1:00	P # Peace
	Rule	US	1945	only	-	Sep	30	2:00	0	S
	Rule	US	1967	2006	-	Oct	lastSun	2:00	0	S
	Rule	US	1967	1973	-	Apr	lastSun	2:00	1:00	D
	Rule	US	1974	only	-	Jan	6	2:00	1:00	D
	Rule	US	1975	only	-	Feb	lastSun	2:00	1:00	D
	Rule	US	1976	1986	-	Apr	lastSun	2:00	1:00	D
	Rule	US	1987	2006	-	Apr	Sun>=1	2:00	1:00	D
	Rule	US	2007	max	-	Mar	Sun>=8	2:00	1:00	D
	Rule	US	2007	max	-	Nov	Sun>=1	2:00	0	S

Note that the "LETTER/S" changed to something other than "S" or "D", so that the designation strings for US time zones changed from the usual EST/EDT, CST/CDT, ... PST/PDT pattern.
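A minimal sketch of how the LETTER/S value combines with a Zone line's FORMAT, using the "C%sT" format from America/Chicago; the designation helper is a made-up name, and the real zic handles more FORMAT variants than a plain "%s" substitution.

```python
def designation(fmt: str, letters: str) -> str:
    """Substitute a rule's LETTER/S value into a Zone FORMAT containing %s."""
    return fmt.replace("%s", letters)


# LETTER/S values that appear in the "US" rules above.
us_letters = ["S", "D", "W", "P"]
central = [designation("C%sT", letter) for letter in us_letters]
# central is ['CST', 'CDT', 'CWT', 'CPT']
```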

And as for America/Chicago, well:

	# Zone	NAME		STDOFF	RULES	FORMAT	[UNTIL]
	Zone America/Chicago	-5:50:36 -	LMT	1883 Nov 18 18:00u
				-6:00	US	C%sT	1920
				-6:00	Chicago	C%sT	1936 Mar  1  2:00
				-5:00	-	EST	1936 Nov 15  2:00
				-6:00	Chicago	C%sT	1942
				-6:00	US	C%sT	1946
				-6:00	Chicago	C%sT	1967
				-6:00	US	C%sT

We'll ignore the LMT entry, which merely serves to indicate when standard time began (the STDOFF for that entry is the offset from GMT at some particular place in Chicago).  That particular IANA timezone apparently shifted to *Eastern Standard* time - no DST - between 1936-03-01 at 2:00 local time and 1936-11-15 at 2:00 local time.

>> (Leap seconds, and the POSIX choice to mandate 86400-second days, made monotonicity a bit tricky, but I digress....)
> Quite. The leap-second is evil.

Note, BTW, that zic, the TZif file format, and the TZDB code all handle leap seconds.  Zic can be told to put leap second transitions, from the leapseconds file, into a TZif file and, if the TZDB code is pointed at a TZif file with the leap second information, it will treat a time_t value as being seconds that have elapsed since the Epoch rather than as "seconds since the Epoch", i.e. on a transition from 23:59:59 to 23:59:60, the time_t value increases by one.  It will convert a time_t corresponding to a 23:59:60 time to a struct tm with a tm_sec value of 60.

> Yes. Imagine for example a news organization collecting news feeds from cameras in Los Angeles, Washington DC,  Johannesburg, Taipei, or anywhere else. You really need to know the local time when and where an event happened, not just its UTC time.

"2023-07-01 00:01:23 -7:00" is sufficient to allow that - that's 2023-07-01 07:01:23 UTC.  A tzid is not necessary for that, nor is representing anything other than the offset from UTC in effect at that point in time.
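A quick sketch of that conversion, using only a fixed offset and no tzid or tzdb lookup:

```python
from datetime import datetime, timedelta, timezone

# Local time plus the offset from UTC in effect at that instant.
local = datetime(2023, 7, 1, 0, 1, 23, tzinfo=timezone(timedelta(hours=-7)))

# The offset alone is enough to recover the UTC instant.
utc = local.astimezone(timezone.utc)
```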

>> (Presumably "DST" means "not STD" rather than "daylight saving time", as 1) Morocco shifts its clocks for Ramadan but that's not DST and 2) Ireland's *summer* time is standard time.)
> Right. To my point that "standard" is somewhat ambiguous and not all time zones behave in the ways  familiar to many of us, like in the USA. I've learned to be very careful not to impose these familiar biases on my implementations of local time. It just doesn't work the same way in many places.

So why is it necessary to indicate why the time was shifted, by some amount (not necessarily one hour) from "standard" time at that point in time?  Is that due to timestamps possibly *not* matching local time due to a local time shift in the middle of a video segment, so that an SMPTE timestamp, at least, can't show the results of that shift?

>> However, it reports "EEST", not "EEDT", so *that's* probably a bug. tzname[] is set to { "MSK", "EEST" } after localtime() is called on 670374000, so the timezone code in macOS 13.6, at least, has this bug in it. It should probably have been set to { "EEST", "EEDT" }.
> 
> Maybe. But that's what comes out today, and that seems sufficient for the Posix purposes. It might not be "correct", I guess?

It turns out it wasn't a bug - "EEST" is "Eastern European *Summer* Time".

>> Why? Why does the local time, plus the offset from UTC of local time at that instant, not suffice to represent any time?
> 
> With SMPTE timecode one can represent any time-point within the 24-hour range.  SMPTE timecode includes flags for the "count mode" (non-drop-frame and drop-frame). There is no discontinuity in the hh:mm:ss:frames counting sequence.

Unless you're using TAI for time codes, there will eventually be discontinuities, for leap seconds if nothing else, and, if the timecode is local time, for any shifts in local time.

Presumably that's where the jamming comes in, so that...

> So with a single SMPTE timecode an application can "trim" forward or back and calculate durations from point to point along the 24-hour timeline. There is often no relation to actual local time-of-day, just a count from zero to 24 hours. This is very typical in many scenarios, especially post-production (editing). There is no need for access to any other metadata.

...the time codes have no discontinuities within a given sequence of video.

So is the issue that, as a result of discontinuities within local time but not within SMPTE timecodes, sometimes the timecode is out of sync with local time, and there needs to be some information to indicate the delta?

If so, why is it not sufficient to provide that delta, rather than, for example, a "STD" vs. "DST" indication?

> I extend this idea to local time time-of-day representation. Some days have transitions, so my timestamp design carries sufficient information to signal "this day is a transition day", "it is a transition of x value (often DST shifts, sometimes STDOFF shifts)", and "the transition occurs at this time-of-day". Thus, an application that does not have access to any metadata (TZif or TzDb source file) can accurately represent any point during that day.

Unless something changes twice within the day, which neither the zic source file format nor the TZif compiled format excludes.

> Example - A "fall back" DST transition in America/New_York:
> 
> D2022-11-06T00:00:00U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX   UTC 01667707227
> D2022-11-06T01:59:59U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX   UTC 01667714426
> D2022-11-06T01:00:00U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX   UTC 01667714427
> D2022-11-06T23:59:59U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX   UTC 01667797226
>                                                       ^^^^^^^^^^
>                                                 DST transition metadata  

Is there any need to have anything other than the current dstoff there?  A transition that affects the "standard time offset" would be indicated by the "offset from UTC" changing and the "additional offset to the standard time offset" not changing.

(What does "UTC NNNNNNNNNNN" mean here? Those aren't POSIX-time values, as 01667707227 is 1977-11-28 02:27:35 UTC.)
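For reference, a sketch of the two ways that field can be read.  The 1977 date comes from treating the leading zero as a C-style octal prefix; read as decimal, the value lands 27 seconds after 2022-11-06 04:00:00 UTC.  That might mean the count includes leap seconds, but that is a guess on my part, not something the format description states.

```python
from datetime import datetime, timezone

field = "01667707227"  # first "UTC" value from the example above

# Leading zero read as a C-style octal prefix.
as_octal = datetime.fromtimestamp(int(field, 8), tz=timezone.utc)

# Same digits read as a zero-padded decimal number.
as_decimal = datetime.fromtimestamp(int(field, 10), tz=timezone.utc)

# Seconds past the instant the timestamp's date, time and offset give,
# 2022-11-06T00:00:00-04:00, i.e. 04:00:00 UTC.
expected = datetime(2022, 11, 6, 4, 0, 0, tzinfo=timezone.utc)
excess_seconds = int((as_decimal - expected).total_seconds())
```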

> So with a single timestamp on a given day all points of that entire day can be represented without access to any additional metadata.

I.e., it's putting information from the America/New_York TZif file's transition entry that covers that particular time stamp, so if all that's to be done is to do conversions on *that particular timestamp*, the software doesn't need to look at that file.

By the way, it would probably be best not to convert the tzid to all lower case, as not all file systems are case-insensitive.  (They're also not all case-sensitive, so having two different tzids that differ only in case would be a mistake.)


