[tz] timezone DB distribution

Guy Harris guy at alum.mit.edu
Wed Aug 19 03:28:09 UTC 2020

On Aug 17, 2020, at 12:01 PM, Juergen Naeckel via tz <tz at iana.org> wrote:
> First of all, I would like to thank you.  I have to implement something in JavaScript that uses timezones.  However, I am using an older JS version that does not have the flexibility like today’s JS.  So I was looking for a repository holding all current timezones and rules for it, rather then me, checking time and again how which timezone is configured 😉  This really could help me.  Reading through the files, it sometimes made me chuckle and I was actually surprised how fluent timezones are handled.  Changes almost every year…
> I would like to recommend some improvements.  I know you have pretty stable release by now.  I am aware that changes probably to the structure might affect a lot of people/projects.  However…
> First of all, a tar.gz is Linux specific.

Or, rather, UN*X specific; Linux Torvalds was about 10 years old when tar was first broadly available (with V7 in 1979, I think), and gzip came out a little more than a year after somebody announced that they were "doing a (free) operating system (just a hobby, won't be big and professional like gnu) for 386(486) AT clones", although it did support bash and gcc when that announcement was made. :-)


> True, you could install additional Windows software.  But, that might not go well with customers of mine.  I think a ZIP would be acceptable for both worlds.

...offering both a tarball and a zipball might be a good idea (zip exists as a UN*X command, and ships with at least UN*Xes, but UN*X users may be less used to it).

> Finally, I got one more recommendation/question.  I had read the readme file but it didn’t explain the data I saw in the files.

Perhaps we should also 1) split the part of the zic man page that describes the file format into a separate man page and 2) provide formatted versions for people not on UN*X systems.

> It took me a while to understand the concept of the data structure.  Some info of what to see in the file, and how to read it would help.
> And I think the first line of the ZONE definition contains some inconsistency (maybe I still didn’t understand it correctly).  Below is a screenshot.  See the first line for the zones?  It looks mismatched with the New York.  The RULE and the [UNTIL] are probably in the wrong column.  Format is probably missing.

As noted in Tim Parenti's reply, this isn't a strict tab-separated values file for consumption by spreadsheets, it's a file for 1) humans to read (so the columns are lined up visually) and 2) the zic program to read; there may be an arbitrary amount of white space between fields.

> Then I noticed that the open-end validity.  For rules it is denoted as “max” and for zones it is just a <blank>.  Could we get some consistency here?

Consistency between "Zone" and "Rules" lines in the meaning of columns?  They weren't designed to be the same - the file has four different types of lines; Rules lines, Zone lines, continuation lines, and "blank" lines, where "blank" lines are lines that have only 0 or more leading white space characters optionally followed by a comment.  "Zone"/continuation lines and "Rules" lines serve different purposes and have a different syntax.

Here's a formatted version of the section of the zic man page that describes the format of the files.

    Input lines are made up of fields.  Fields are separated from one another
    by any number of white space characters.  Leading and trailing white
    space on input lines is ignored.  An unquoted sharp character (#) in the
    input introduces a comment which extends to the end of the line the sharp
    character appears on.  White space characters and sharp characters may be
    enclosed in double quotes (") if they are to be used as part of a field.
    Any line that is blank (after comment stripping) is ignored.  Non-blank
    lines are expected to be of one of three types: rule lines, zone lines,
    and link lines.

    Names (such as month names) must be in English and are case insensitive.
    Abbreviations, if used, must be unambiguous in context.

    A rule line has the form:
    For example:
	   Rule US   1967 1973 -    Apr  lastSun   2:00 1:00 D

    The fields that make up a rule line are:

	   NAME      Give the (arbitrary) name of the set of rules this rule
		     is part of.

	   FROM      Give the first year in which the rule applies.  Any inte-
		     ger year can be supplied; the Gregorian calendar is
		     assumed.  The word minimum (or an abbreviation) means the
		     minimum year representable as an integer.	The word
		     maximum (or an abbreviation) means the maximum year rep-
		     resentable as an integer.	Rules can describe times that
		     are not representable as time values, with the unrepre-
		     sentable times ignored; this allows rules to be portable
		     among hosts with differing time value types.

	   TO	     Give the final year in which the rule applies.  In addi-
		     tion to minimum and maximum (as above), the word only (or
		     an abbreviation) may be used to repeat the value of the
		     FROM field.

	   TYPE      Give the type of year in which the rule applies.  If TYPE
		     is - then the rule applies in all years between FROM and
		     TO inclusive.  If TYPE is something else, then zic exe-
		     cutes the command yearistype year type to check the type
		     of a year: an exit status of zero is taken to mean that
		     the year is of the given type; an exit status of one is
		     taken to mean that the year is not of the given type.

	   IN	     Name the month in which the rule takes effect.  Month
		     names may be abbreviated.

	   ON	     Give the day on which the rule takes effect.  Recognized
		     forms include:

			   5	    the fifth of the month
			   lastSun  the last Sunday in the month
			   lastMon  the last Monday in the month
			   Sun>=8   first Sunday on or after the eighth
			   Sun<=25  last Sunday on or before the 25th

		     Names of days of the week may be abbreviated or spelled
		     out in full.  Note that there must be no spaces within
		     the ON field.

	   AT	     Give the time of day at which the rule takes effect.
		     Recognized forms include:

			   2	    time in hours
			   2:00     time in hours and minutes
			   15:00    24-hour format time (for times after noon)
			   1:28:14  time in hours, minutes, and seconds

		     where hour 0 is midnight at the start of the day, and
		     hour 24 is midnight at the end of the day.  Any of these
		     forms may be followed by the letter `w' if the given time
		     is local ``wall clock'' time, `s' if the given time is
		     local ``standard'' time, or `u' (or `g' or `z') if the
		     given time is universal time; in the absence of an indi-
		     cator, wall clock time is assumed.

	   SAVE      Give the amount of time to be added to local standard
		     time when the rule is in effect.  This field has the same
		     format as the AT field (although, of course, the `w' and
		     `s' suffixes are not used).

	   LETTER/S  Give the ``variable part'' (for example, the ``S'' or
		     ``D'' in ``EST'' or ``EDT'') of time zone abbreviations
		     to be used when this rule is in effect.  If this field is
		     -, the variable part is null.

    A zone line has the form:
    For example:
	   Zone Australia/Adelaide  9:30 Aus  CST  1971 Oct 31 2:00
    The fields that make up a zone line are:

    NAME    The name of the time zone.  This is the name used in creating the
	     time conversion information file for the zone.

    GMTOFF  The amount of time to add to UTC to get standard time in this
	     zone.  This field has the same format as the AT and SAVE fields
	     of rule lines; begin the field with a minus sign if time must be
	     subtracted from UTC.

	     The name of the rule(s) that apply in the time zone or, alter-
	     nately, an amount of time to add to local standard time.  If this
	     field is - then standard time always applies in the time zone.

    FORMAT  The format for time zone abbreviations in this time zone.	The
	     pair of characters %s is used to show where the ``variable part''
	     of the time zone abbreviation goes.  Alternately, a slash (/)
	     separates standard and daylight abbreviations.

	     The time at which the UTC offset or the rule(s) change for a
	     location.	It is specified as a year, a month, a day, and a time
	     of day.  If this is specified, the time zone information is gen-
	     erated from the given UTC offset and rule change until the time
	     specified.  The month, day, and time of day have the same format
	     as the IN, ON, and AT fields of a rule; trailing fields can be
	     omitted, and default to the earliest possible value for the miss-
	     ing fields.

	     The next line must be a ``continuation'' line; this has the same
	     form as a zone line except that the string ``Zone'' and the name
	     are omitted, as the continuation line will place information
	     starting at the time specified as the until information in the
	     previous line in the file used by the previous line.  Continua-
	     tion lines may contain until information, just as zone lines do,
	     indicating that the next line is a further continuation.

    A link line has the form
    For example:
	   Link Europe/Istanbul     Asia/Istanbul
    The LINK-FROM field should appear as the NAME field in some zone line;
    the LINK-TO field is used as an alternate name for that zone.

    Except for continuation lines, lines may appear in any order in the

    Lines in the file that describes leap seconds have the following form:
    For example:
	   Leap 1974 Dec  31   23:59:60  +    S
    The YEAR, MONTH, DAY, and HH:MM:SS fields tell when the leap second hap-
    pened.  The CORR field should be ``+'' if a second was added or ``-'' if
    a second was skipped.  The R/S field should be (an abbreviation of)
    ``Stationary'' if the leap second time given by the other fields should
    be interpreted as UTC or (an abbreviation of) ``Rolling'' if the leap
    second time given by the other fields should be interpreted as local wall
    clock time.

    Here is an extended example of zic input, intended to illustrate many of
    its features.

      # Rule  NAME  FROM  TO	 TYPE  IN   ON	     AT    SAVE  LETTER/S
      Rule    Swiss 1940  only  -     Nov  2	     0:00  1:00  S
      Rule    Swiss 1940  only  -     Dec  31	     0:00  0	 -
      Rule    Swiss 1941  1942  -     May  Sun>=1   2:00  1:00  S
      Rule    Swiss 1941  1942  -     Oct  Sun>=1   0:00  0
      Rule    EU    1977  1980  -     Apr  Sun>=1   1:00u 1:00  S
      Rule    EU    1977  only  -     Sep  lastSun  1:00u 0	 -
      Rule    EU    1978  only  -     Oct   1	     1:00u 0	 -
      Rule    EU    1979  1995  -     Sep  lastSun  1:00u 0	 -
      Rule    EU    1981  max	 -     Mar  lastSun  1:00u 1:00  S
      Rule    EU    1996  max	 -     Oct  lastSun  1:00u 0	 -

      # Zone  NAME	      GMTOFF   RULES	   FORMAT  UNTIL
      Zone    Europe/Zurich  0:34:08  -	   LMT	   1848 Sep 12
			      0:29:44  -	   BMT	   1894 Jun
			      1:00     Swiss	   CE%sT   1981
			      1:00     EU	   CE%sT

      Link    Europe/Zurich  Switzerland

    In this example, the zone is named Europe/Zurich but it has an alias as
    Switzerland.  Zurich was 34 minutes and 8 seconds west of GMT until
    1848-09-12 at 00:00, when the offset changed to 29 minutes and 44 sec-
    onds.  After 1894-06-01 at 00:00 Swiss daylight saving rules (defined
    with lines beginning with "Rule Swiss") apply, and the GMT offset became
    one hour.	From 1981 to the present, EU daylight saving rules have
    applied, and the UTC offset has remained at one hour.

    In 1940, daylight saving time applied from November 2 at 00:00 to Decem-
    ber 31 at 00:00.  In 1941 and 1942, daylight saving time applied from the
    first Sunday in May at 02:00 to the first Sunday in October at 00:00.
    The pre-1981 EU daylight-saving rules have no effect here, but are
    included for completeness.  Since 1981, daylight saving has begun on the
    last Sunday in March at 01:00 UTC.  Until 1995 it ended the last Sunday
    in September at 01:00 UTC, but this changed to the last Sunday in October
    starting in 1996.

    For purposes of display, "LMT" and "BMT" were initially used, respec-
    tively.  Since Swiss rules and later EU rules were applied, the display
    name for the timezone has been CET for standard time and CEST for day-
    light saving time.

    For areas with more than two types of local time, you may need to use
    local standard time in the AT field of the earliest transition time's
    rule to ensure that the earliest transition time recorded in the compiled
    file is correct.

    If, for a particular zone, a clock advance caused by the start of day-
    light saving coincides with and is equal to a clock retreat caused by a
    change in UTC offset, zic produces a single transition to daylight saving
    at the new UTC offset (without any change in wall clock time).  To get
    separate transitions use multiple zone continuation lines specifying
    transition instants using universal time.

More information about the tz mailing list