[tz] [PATCH 2/3] Convert Theory file to HTML

Paul Eggert eggert at cs.ucla.edu
Mon Oct 2 00:25:34 UTC 2017


* calendars: New file, containing the calendars section of what
used to be the Theory file.
* theory.html: New file, containing the HTMLized equivalent of
the non-calendar part of ...
* Theory: ... this file, which was removed.  Only formatting
was changed, aside from moving the calendrical section to the
new file 'calendars'.
* CONTRIBUTING, zone1970.tab:
* Makefile (COMMON, VERSION_DEPS): Adjust to move.
* NEWS: Document the move.
* tz-link.htm: Link to new file.
---
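Note on the file name rules in the removed Theory text ("Names of time
zone rules"): the first two rules are mechanical enough to check in code.
The following is only a rough sketch in Python, not part of the tz code;
the function name name_problems and its messages are hypothetical, and
the legacy-name exceptions (e.g. 'GMT+0', 'EST5EDT') are deliberately
not modeled.

```python
import re

# Characters allowed inside one file name component, per the first rule:
# ASCII letters, '.', '-' and '_' (digits are excluded).
_COMPONENT_RE = re.compile(r"[A-Za-z._-]+")
MAX_COMPONENT_LEN = 14


def name_problems(name):
    """Return a list of rule violations for a proposed AREA/LOCATION name.

    Hypothetical helper: covers only the first two naming rules and
    ignores the legacy-name exceptions.
    """
    if not name:
        return ["name is empty"]
    problems = []
    if name.startswith("/") or name.endswith("/"):
        problems.append("starts or ends with '/'")
    if "//" in name:
        problems.append("contains '//'")
    for component in name.split("/"):
        if component == "":
            continue  # empty components were reported by the checks above
        if component in (".", ".."):
            problems.append(f"component {component!r} is '.' or '..'")
        if not _COMPONENT_RE.fullmatch(component):
            problems.append(f"component {component!r} has a disallowed character")
        if len(component) > MAX_COMPONENT_LEN:
            problems.append(f"component {component!r} exceeds 14 characters")
        if component.startswith("-"):
            problems.append(f"component {component!r} starts with '-'")
    return problems
```

An empty list means no violation of these two rules was found; the
remaining rules (case-insensitive uniqueness, prefix conflicts, choice of
location) need the full set of names or human judgment.
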
 CONTRIBUTING |    2 +-
 Makefile     |    7 +-
 NEWS         |    4 +
 Theory       |  888 -------------------------------------------------
 calendars    |  173 ++++++++++
 theory.html  | 1034 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tz-link.htm  |    5 +-
 zone1970.tab |    4 +-
 8 files changed, 1222 insertions(+), 895 deletions(-)
 delete mode 100644 Theory
 create mode 100644 calendars
 create mode 100644 theory.html
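
Note on the POSIX TZ 'Mm.n.d' date form described in the removed Theory
text ("Time and date functions"): the rule can be worked through for a
given year as below.  This is a sketch in Python under the stated
definition only (d counts from Sunday = 0, and n = 5 means the last
occurrence in the month); the function name posix_m_date is hypothetical
and is not part of the tz code.

```python
import calendar
import datetime


def posix_m_date(year, m, n, d):
    """Date selected by the POSIX rule Mm.n.d in the given year.

    d is the weekday (0 = Sunday .. 6 = Saturday); n is the week (1..5),
    where 5 means the last occurrence of weekday d in month m.
    """
    if not (1 <= m <= 12 and 1 <= n <= 5 and 0 <= d <= 6):
        raise ValueError("out-of-range Mm.n.d field")
    first = datetime.date(year, m, 1)
    # date.weekday() counts Monday as 0; convert to POSIX's Sunday = 0.
    first_dow = (first.weekday() + 1) % 7
    day = 1 + (d - first_dow) % 7   # first occurrence of weekday d
    day += 7 * (n - 1)              # advance to the nth occurrence
    days_in_month = calendar.monthrange(year, m)[1]
    while day > days_in_month:      # n == 5 can overshoot; fall back a week
        day -= 7
    return datetime.date(year, m, day)
```

For the New Zealand example string, posix_m_date(2017, 9, 5, 0) selects
the last Sunday of September 2017 and posix_m_date(2018, 4, 1, 0) selects
the first Sunday of April 2018.
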

diff --git a/CONTRIBUTING b/CONTRIBUTING
index 6ce6bfd..716f32b 100644
--- a/CONTRIBUTING
+++ b/CONTRIBUTING
@@ -17,7 +17,7 @@ To email small changes, please run a POSIX shell command like
 'diff -u old/europe new/europe >myfix.patch', and attach
 myfix.patch to the email.
 
-For more-elaborate changes, please read the Theory file and browse
+For more-elaborate changes, please read the theory.html file and browse
 the mailing list archives <https://mm.icann.org/pipermail/tz/> for
 examples of patches that tend to work well.  Additions to
 data should contain commentary citing reliable sources as
diff --git a/Makefile b/Makefile
index fa755ad..c92edc0 100644
--- a/Makefile
+++ b/Makefile
@@ -419,7 +419,8 @@ MANTXTS=	newctime.3.txt newstrftime.3.txt newtzset.3.txt \
 			time2posix.3.txt \
 			tzfile.5.txt tzselect.8.txt zic.8.txt zdump.8.txt \
 			date.1.txt
-COMMON=		CONTRIBUTING LICENSE Makefile NEWS README Theory version
+COMMON=		calendars CONTRIBUTING LICENSE Makefile \
+			NEWS README theory.html version
 WEB_PAGES=	tz-art.htm tz-how-to.html tz-link.htm
 DOCS=		$(MANS) date.1 $(MANTXTS) $(WEB_PAGES)
 PRIMARY_YDATA=	africa antarctica asia australasia \
@@ -446,7 +447,7 @@ ENCHILADA=	$(COMMON) $(DOCS) $(SOURCES) $(DATA) $(MISC) $(TZS) tzdata.zi
 # This list is not the same as the output of 'git ls-files', since
 # .gitignore is not distributed.
 VERSION_DEPS= \
-		CONTRIBUTING LICENSE Makefile NEWS README Theory \
+		calendars CONTRIBUTING LICENSE Makefile NEWS README \
 		africa antarctica asctime.c asia australasia \
 		backward backzone \
 		checklinks.awk checktab.awk \
@@ -455,7 +456,7 @@ VERSION_DEPS= \
 		leap-seconds.list leapseconds.awk localtime.c \
 		newctime.3 newstrftime.3 newtzset.3 northamerica \
 		pacificnew private.h \
-		southamerica strftime.c systemv \
+		southamerica strftime.c systemv theory.html \
 		time2posix.3 tz-art.htm tz-how-to.html tz-link.htm \
 		tzfile.5 tzfile.h tzselect.8 tzselect.ksh \
 		workman.sh yearistype.sh \
diff --git a/NEWS b/NEWS
index 660a3fe..7fc32c0 100644
--- a/NEWS
+++ b/NEWS
@@ -136,6 +136,10 @@ Unreleased, experimental changes
 
   Changes to documentation and commentary
 
+    The two new files 'theory.html' and 'calendars' contain the
+    contents of the removed file 'Theory'.  The goal is to document
+    tzdb theory more accessibly.
+
     The zic man page now documents abbreviation rules.
 
     tz-link.htm now covers how to apply tzdata changes to clients.
diff --git a/Theory b/Theory
deleted file mode 100644
index 328423a..0000000
--- a/Theory
+++ /dev/null
@@ -1,888 +0,0 @@
-Theory and pragmatics of the tz code and data
-
-
------ Outline -----
-
-	Scope of the tz database
-	Names of time zone rules
-	Time zone abbreviations
-	Accuracy of the tz database
-	Time and date functions
-	Interface stability
-	Calendrical issues
-	Time and time zones on other planets
-
-
------ Scope of the tz database -----
-
-The tz database attempts to record the history and predicted future of
-all computer-based clocks that track civil time.  To represent this
-data, the world is partitioned into regions whose clocks all agree
-about timestamps that occur after the somewhat-arbitrary cutoff point
-of the POSIX Epoch (1970-01-01 00:00:00 UTC).  For each such region,
-the database records all known clock transitions, and labels the region
-with a notable location.  Although 1970 is a somewhat-arbitrary
-cutoff, there are significant challenges to moving the cutoff earlier
-even by a decade or two, due to the wide variety of local practices
-before computer timekeeping became prevalent.
-
-Clock transitions before 1970 are recorded for each such location,
-because most systems support timestamps before 1970 and could
-misbehave if data entries were omitted for pre-1970 transitions.
-However, the database is not designed for and does not suffice for
-applications requiring accurate handling of all past times everywhere,
-as it would take far too much effort and guesswork to record all
-details of pre-1970 civil timekeeping.
-
-As described below, reference source code for using the tz database is
-also available.  The tz code is upwards compatible with POSIX, an
-international standard for UNIX-like systems.  As of this writing, the
-current edition of POSIX is:
-
-  The Open Group Base Specifications Issue 7
-  IEEE Std 1003.1-2008, 2016 Edition
-  <http://pubs.opengroup.org/onlinepubs/9699919799/>
-
-
-
------ Names of time zone rules -----
-
-Each of the database's time zone rules has a unique name.
-Inexperienced users are not expected to select these names unaided.
-Distributors should provide documentation and/or a simple selection
-interface that explains the names; for one example, see the 'tzselect'
-program in the tz code.  The Unicode Common Locale Data Repository
-<http://cldr.unicode.org/> contains data that may be useful for other
-selection interfaces.
-
-The time zone rule naming conventions attempt to strike a balance
-among the following goals:
-
- * Uniquely identify every region where clocks have agreed since 1970.
-   This is essential for the intended use: static clocks keeping local
-   civil time.
-
- * Indicate to experts where that region is.
-
- * Be robust in the presence of political changes.  For example, names
-   of countries are ordinarily not used, to avoid incompatibilities
-   when countries change their name (e.g. Zaire->Congo) or when
-   locations change countries (e.g. Hong Kong from UK colony to
-   China).
-
- * Be portable to a wide variety of implementations.
-
- * Use consistent naming conventions over the entire world.
-
-Names normally have the form AREA/LOCATION, where AREA is the name
-of a continent or ocean, and LOCATION is the name of a specific
-location within that region.  North and South America share the same
-area, 'America'.  Typical names are 'Africa/Cairo', 'America/New_York',
-and 'Pacific/Honolulu'.
-
-Here are the general rules used for choosing location names,
-in decreasing order of importance:
-
-	Use only valid POSIX file name components (i.e., the parts of
-		names other than '/').  Do not use the file name
-		components '.' and '..'.  Within a file name component,
-		use only ASCII letters, '.', '-' and '_'.  Do not use
-		digits, as that might create an ambiguity with POSIX
-		TZ strings.  A file name component must not exceed 14
-		characters or start with '-'.  E.g., prefer 'Brunei'
-		to 'Bandar_Seri_Begawan'.  Exceptions: see the discussion
-		of legacy names below.
-	A name must not be empty, or contain '//', or start or end with '/'.
-	Do not use names that differ only in case.  Although the reference
-		implementation is case-sensitive, some other implementations
-		are not, and they would mishandle names differing only in case.
-	If one name A is an initial prefix of another name AB (ignoring case),
-		then B must not start with '/', as a regular file cannot have
-		the same name as a directory in POSIX.  For example,
-		'America/New_York' precludes 'America/New_York/Bronx'.
-	Uninhabited regions like the North Pole and Bouvet Island
-		do not need locations, since local time is not defined there.
-	There should typically be at least one name for each ISO 3166-1
-		officially assigned two-letter code for an inhabited country
-		or territory.
-	If all the clocks in a region have agreed since 1970,
-		don't bother to include more than one location
-		even if subregions' clocks disagreed before 1970.
-		Otherwise these tables would become annoyingly large.
-	If a name is ambiguous, use a less ambiguous alternative;
-		e.g. many cities are named San José and Georgetown, so
-		prefer 'Costa_Rica' to 'San_Jose' and 'Guyana' to 'Georgetown'.
-	Keep locations compact.  Use cities or small islands, not countries
-		or regions, so that any future time zone changes do not split
-		locations into different time zones.  E.g. prefer 'Paris'
-		to 'France', since France has had multiple time zones.
-	Use mainstream English spelling, e.g. prefer 'Rome' to 'Roma', and
-		prefer 'Athens' to the Greek 'Αθήνα' or the Romanized 'Athína'.
-		The POSIX file name restrictions encourage this rule.
-	Use the most populous among locations in a zone,
-		e.g. prefer 'Shanghai' to 'Beijing'.  Among locations with
-		similar populations, pick the best-known location,
-		e.g. prefer 'Rome' to 'Milan'.
-	Use the singular form, e.g. prefer 'Canary' to 'Canaries'.
-	Omit common suffixes like '_Islands' and '_City', unless that
-		would lead to ambiguity.  E.g. prefer 'Cayman' to
-		'Cayman_Islands' and 'Guatemala' to 'Guatemala_City',
-		but prefer 'Mexico_City' to 'Mexico' because the country
-		of Mexico has several time zones.
-	Use '_' to represent a space.
-	Omit '.' from abbreviations in names, e.g. prefer 'St_Helena'
-		to 'St._Helena'.
-	Do not change established names if they only marginally
-		violate the above rules.  For example, don't change
-		the existing name 'Rome' to 'Milan' merely because
-		Milan's population has grown to be somewhat greater
-		than Rome's.
-	If a name is changed, put its old spelling in the 'backward' file.
-		This means old spellings will continue to work.
-
-The file 'zone1970.tab' lists geographical locations used to name time
-zone rules.  It is intended to be an exhaustive list of names for
-geographic regions as described above; this is a subset of the names
-in the data.  Although a 'zone1970.tab' location's longitude
-corresponds to its LMT offset with one hour for every 15 degrees east
-longitude, this relationship is not exact.
-
-Older versions of this package used a different naming scheme,
-and these older names are still supported.
-See the file 'backward' for most of these older names
-(e.g., 'US/Eastern' instead of 'America/New_York').
-The other old-fashioned names still supported are
-'WET', 'CET', 'MET', and 'EET' (see the file 'europe').
-
-Older versions of this package defined legacy names that are
-incompatible with the first rule of location names, but which are
-still supported.  These legacy names are mostly defined in the file
-'etcetera'.  Also, the file 'backward' defines the legacy names
-'GMT0', 'GMT-0' and 'GMT+0', and the file 'northamerica' defines the
-legacy names 'EST5EDT', 'CST6CDT', 'MST7MDT', and 'PST8PDT'.
-
-Excluding 'backward' should not affect the other data.  If
-'backward' is excluded, excluding 'etcetera' should not affect the
-remaining data.
-
-
------ Time zone abbreviations -----
-
-When this package is installed, it generates time zone abbreviations
-like 'EST' to be compatible with human tradition and POSIX.
-Here are the general rules used for choosing time zone abbreviations,
-in decreasing order of importance:
-
-	Use three or more characters that are ASCII alphanumerics or '+' or '-'.
-		Previous editions of this database also used characters like
-		' ' and '?', but these characters have a special meaning to
-		the shell and cause commands like
-			set `date`
-		to have unexpected effects.
-		Previous editions of this rule required upper-case letters,
-		but the Congressman who introduced Chamorro Standard Time
-		preferred "ChST", so lower-case letters are now allowed.
-		Also, POSIX from 2001 on relaxed the rule to allow '-', '+',
-		and alphanumeric characters from the portable character set
-		in the current locale.  In practice ASCII alphanumerics and
-		'+' and '-' are safe in all locales.
-
-		In other words, in the C locale the POSIX extended regular
-		expression [-+[:alnum:]]{3,} should match the abbreviation.
-		This guarantees that all abbreviations could have been
-		specified by a POSIX TZ string.
-
-	Use abbreviations that are in common use among English-speakers,
-		e.g. 'EST' for Eastern Standard Time in North America.
-		We assume that applications translate them to other languages
-		as part of the normal localization process; for example,
-		a French application might translate 'EST' to 'HNE'.
-
-	For zones whose times are taken from a city's longitude, use the
-		traditional xMT notation, e.g. 'PMT' for Paris Mean Time.
-		The only name like this in current use is 'GMT'.
-
-	Use 'LMT' for local mean time of locations before the introduction
-		of standard time; see "Scope of the tz database".
-
-	If there is no common English abbreviation, use numeric offsets like
-		-05 and +0830 that are generated by zic's %z notation.
-
-	Use current abbreviations for older timestamps to avoid confusion.
-		For example, in 1910 a common English abbreviation for UT +01
-		in central Europe was 'MEZ' (short for both "Middle European
-		Zone" and for "Mitteleuropäische Zeit" in German).  Nowadays
-		'CET' ("Central European Time") is more common in English, and
-		the database uses 'CET' even for circa-1910 timestamps as this
-		is less confusing for modern users and avoids the need for
-		determining when 'CET' supplanted 'MEZ' in common usage.
-
-	Use a consistent style in a zone's history.  For example, if a zone's
-		history tends to use numeric abbreviations and a particular
-		entry could go either way, use a numeric abbreviation.
-
-    [The remaining guidelines predate the introduction of %z.
-    They are problematic as they mean tz data entries invent
-    notation rather than record it.  These guidelines are now
-    deprecated and the plan is to gradually move to %z for
-    inhabited locations and to "-00" for uninhabited locations.]
-
-	If there is no common English abbreviation, abbreviate the English
-		translation of the usual phrase used by native speakers.
-		If this is not available or is a phrase mentioning the country
-		(e.g. "Cape Verde Time"), then:
-
-		When a country is identified with a single or principal zone,
-			append 'T' to the country's ISO code, e.g. 'CVT' for
-			Cape Verde Time.  For summer time append 'ST';
-			for double summer time append 'DST'; etc.
-		Otherwise, take the first three letters of an English place
-			name identifying each zone and append 'T', 'ST', etc.
-			as before; e.g. 'CHAST' for CHAtham Summer Time.
-
-	Use UT (with time zone abbreviation '-00') for locations while
-		uninhabited.  The leading '-' is a flag that the time
-		zone is in some sense undefined; this notation is
-		derived from Internet RFC 3339.
-
-Application writers should note that these abbreviations are ambiguous
-in practice: e.g. 'CST' has a different meaning in China than
-it does in the United States.  In new applications, it's often better
-to use numeric UT offsets like '-0600' instead of time zone
-abbreviations like 'CST'; this avoids the ambiguity.
-
-
------ Accuracy of the tz database -----
-
-The tz database is not authoritative, and it surely has errors.
-Corrections are welcome and encouraged; see the file CONTRIBUTING.
-Users requiring authoritative data should consult national standards
-bodies and the references cited in the database's comments.
-
-Errors in the tz database arise from many sources:
-
- * The tz database predicts future timestamps, and current predictions
-   will be incorrect after future governments change the rules.
-   For example, if today someone schedules a meeting for 13:00 next
-   October 1, Casablanca time, and tomorrow Morocco changes its
-   daylight saving rules, software can mess up after the rule change
-   if it blithely relies on conversions made before the change.
-
- * The pre-1970 entries in this database cover only a tiny sliver of how
-   clocks actually behaved; the vast majority of the necessary
-   information was lost or never recorded.  Thousands more zones would
-   be needed if the tz database's scope were extended to cover even
-   just the known or guessed history of standard time; for example,
-   the current single entry for France would need to split into dozens
-   of entries, perhaps hundreds.  And in most of the world even this
-   approach would be misleading due to widespread disagreement or
-   indifference about what times should be observed.  In her 2015 book
-   "The Global Transformation of Time, 1870-1950", Vanessa Ogle writes
-   "Outside of Europe and North America there was no system of time
-   zones at all, often not even a stable landscape of mean times,
-   prior to the middle decades of the twentieth century".  See:
-   Timothy Shenk, Booked: A Global History of Time. Dissent 2015-12-17
-   https://www.dissentmagazine.org/blog/booked-a-global-history-of-time-vanessa-ogle
-
- * Most of the pre-1970 data entries come from unreliable sources, often
-   astrology books that lack citations and whose compilers evidently
-   invented entries when the true facts were unknown, without
-   reporting which entries were known and which were invented.
-   These books often contradict each other or give implausible entries,
-   and on the rare occasions when they are checked they are
-   typically found to be incorrect.
-
- * For the UK the tz database relies on years of first-class work done by
-   Joseph Myers and others; see <https://www.polyomino.org.uk/british-time/>.
-   Other countries are not done nearly as well.
-
- * Sometimes, different people in the same city would maintain clocks
-   that differed significantly.  Railway time was used by railroad
-   companies (which did not always agree with each other),
-   church-clock time was used for birth certificates, etc.
-   Often this was merely common practice, but sometimes it was set by law.
-   For example, from 1891 to 1911 the UT offset in France was legally
-   0:09:21 outside train stations and 0:04:21 inside.
-
- * Although a named location in the tz database stands for the
-   containing region, its pre-1970 data entries are often accurate for
-   only a small subset of that region.  For example, Europe/London
-   stands for the United Kingdom, but its pre-1847 times are valid
-   only for locations that have London's exact meridian, and its 1847
-   transition to GMT is known to be valid only for the L&NW and the
-   Caledonian railways.
-
- * The tz database does not record the earliest time for which a zone's
-   data entries are thereafter valid for every location in the region.
-   For example, Europe/London is valid for all locations in its
-   region after GMT was made the standard time, but the date of
-   standardization (1880-08-02) is not in the tz database, other than
-   in commentary.  For many zones the earliest time of validity is
-   unknown.
-
- * The tz database does not record a region's boundaries, and in many
-   cases the boundaries are not known.  For example, the zone
-   America/Kentucky/Louisville represents a region around the city of
-   Louisville, the boundaries of which are unclear.
-
- * Changes that are modeled as instantaneous transitions in the tz
-   database were often spread out over hours, days, or even decades.
-
- * Even if the time is specified by law, locations sometimes
-   deliberately flout the law.
-
- * Early timekeeping practices, even assuming perfect clocks, were
-   often not specified to the accuracy that the tz database requires.
-
- * Sometimes historical timekeeping was specified more precisely
-   than what the tz database can handle.  For example, from 1909 to
-   1937 Netherlands clocks were legally UT +00:19:32.13, but the tz
-   database cannot represent the fractional second.
-
- * Even when all the timestamp transitions recorded by the tz database
-   are correct, the tz rules that generate them may not faithfully
-   reflect the historical rules.  For example, from 1922 until World
-   War II the UK moved clocks forward the day following the third
-   Saturday in April unless that was Easter, in which case it moved
-   clocks forward the previous Sunday.  Because the tz database has no
-   way to specify Easter, these exceptional years are entered as
-   separate tz Rule lines, even though the legal rules did not change.
-
- * The tz database models pre-standard time using the proleptic Gregorian
-   calendar and local mean time (LMT), but many people used other
-   calendars and other timescales.  For example, the Roman Empire used
-   the Julian calendar, and had 12 varying-length daytime hours with a
-   non-hour-based system at night.
-
- * Early clocks were less reliable, and data entries do not represent
-   clock error.
-
- * The tz database assumes Universal Time (UT) as an origin, even
-   though UT is not standardized for older timestamps.  In the tz
-   database commentary, UT denotes a family of time standards that
-   includes Coordinated Universal Time (UTC) along with other variants
-   such as UT1 and GMT, with days starting at midnight.  Although UT
-   equals UTC for modern timestamps, UTC was not defined until 1960,
-   so commentary uses the more-general abbreviation UT for timestamps
-   that might predate 1960.  Since UT, UT1, etc. disagree slightly,
-   and since pre-1972 UTC seconds varied in length, interpretation of
-   older timestamps can be problematic when subsecond accuracy is
-   needed.
-
- * Civil time was not based on atomic time before 1972, and we don't
-   know the history of earth's rotation accurately enough to map SI
-   seconds to historical solar time to more than about one-hour
-   accuracy.  See: Stephenson FR, Morrison LV, Hohenkerk CY.
-   Measurement of the Earth's rotation: 720 BC to AD 2015.
-   Proc Royal Soc A. 2016 Dec 7;472:20160404.
-   http://dx.doi.org/10.1098/rspa.2016.0404
-   Also see: Espenak F. Uncertainty in Delta T (ΔT).
-   https://eclipse.gsfc.nasa.gov/SEhelp/uncertainty2004.html
-
- * The relationship between POSIX time (that is, UTC but ignoring leap
-   seconds) and UTC is not agreed upon after 1972.  Although the POSIX
-   clock officially stops during an inserted leap second, at least one
-   proposed standard has it jumping back a second instead; and in
-   practice POSIX clocks more typically either progress glacially during
-   a leap second, or are slightly slowed while near a leap second.
-
- * The tz database does not represent how uncertain its information is.
-   Ideally it would contain information about when data entries are
-   incomplete or dicey.  Partial temporal knowledge is a field of
-   active research, though, and it's not clear how to apply it here.
-
-In short, many, perhaps most, of the tz database's pre-1970 and future
-timestamps are either wrong or misleading.  Any attempt to pass the
-tz database off as the definition of time should be unacceptable to
-anybody who cares about the facts.  In particular, the tz database's
-LMT offsets should not be considered meaningful, and should not prompt
-creation of zones merely because two locations differ in LMT or
-transitioned to standard time at different dates.
-
-
------ Time and date functions -----
-
-The tz code contains time and date functions that are upwards
-compatible with those of POSIX.
-
-POSIX has the following properties and limitations.
-
-*	In POSIX, time display in a process is controlled by the
-	environment variable TZ.  Unfortunately, the POSIX TZ string takes
-	a form that is hard to describe and is error-prone in practice.
-	Also, POSIX TZ strings can't deal with other (for example, Israeli)
-	daylight saving time rules, or situations where more than two
-	time zone abbreviations are used in an area.
-
-	The POSIX TZ string takes the following form:
-
-		stdoffset[dst[offset][,date[/time],date[/time]]]
-
-	where:
-
-	std and dst
-		are 3 or more characters specifying the standard
-		and daylight saving time (DST) zone names.
-		Starting with POSIX.1-2001, std and dst may also be
-		in a quoted form like "<UTC+10>"; this allows
-		"+" and "-" in the names.
-	offset
-		is of the form '[+-]hh[:mm[:ss]]' and specifies the
-		offset west of UT.  'hh' may be a single digit; 0<=hh<=24.
-		The default DST offset is one hour ahead of standard time.
-	date[/time],date[/time]
-		specifies the beginning and end of DST.  If this is absent,
-		the system supplies its own rules for DST, and these can
-		differ from year to year; typically US DST rules are used.
-	time
-		takes the form 'hh[:mm[:ss]]' and defaults to 02:00.
-		This is the same format as the offset, except that a
-		leading '+' or '-' is not allowed.
-	date
-		takes one of the following forms:
-		Jn (1<=n<=365)
-			origin-1 day number not counting February 29
-		n (0<=n<=365)
-			origin-0 day number counting February 29 if present
-		Mm.n.d (0[Sunday]<=d<=6[Saturday], 1<=n<=5, 1<=m<=12)
-			for the dth day of week n of month m of the year,
-			where week 1 is the first week in which day d appears,
-			and '5' stands for the last week in which day d appears
-			(which may be either the 4th or 5th week).
-			Typically, this is the only useful form;
-			the n and Jn forms are rarely used.
-
-	Here is an example POSIX TZ string for New Zealand after 2007.
-	It says that standard time (NZST) is 12 hours ahead of UTC,
-	and that daylight saving time (NZDT) is observed from September's
-	last Sunday at 02:00 until April's first Sunday at 03:00:
-
-		TZ='NZST-12NZDT,M9.5.0,M4.1.0/3'
-
-	This POSIX TZ string is hard to remember, and mishandles some
-	timestamps before 2008.  With this package you can use this
-	instead:
-
-		TZ='Pacific/Auckland'
-
-*	POSIX does not define the exact meaning of TZ values like "EST5EDT".
-	Typically the current US DST rules are used to interpret such values,
-	but this means that the US DST rules are compiled into each program
-	that does time conversion.  This means that when US time conversion
-	rules change (as in the United States in 1987), all programs that
-	do time conversion must be recompiled to ensure proper results.
-
-*	The TZ environment variable is process-global, which makes it hard
-	to write efficient, thread-safe applications that need access
-	to multiple time zones.
-
-*	In POSIX, there's no tamper-proof way for a process to learn the
-	system's best idea of local wall clock.  (This is important for
-	applications that an administrator wants used only at certain times -
-	without regard to whether the user has fiddled the "TZ" environment
-	variable.  While an administrator can "do everything in UTC" to get
-	around the problem, doing so is inconvenient and precludes handling
-	daylight saving time shifts - as might be required to limit phone
-	calls to off-peak hours.)
-
-*	POSIX provides no convenient and efficient way to determine the UT
-	offset and time zone abbreviation of arbitrary timestamps,
-	particularly for time zone settings that do not fit into the
-	POSIX model.
-
-*	POSIX requires that systems ignore leap seconds.
-
-*	The tz code attempts to support all the time_t implementations
-	allowed by POSIX.  The time_t type represents a nonnegative count of
-	seconds since 1970-01-01 00:00:00 UTC, ignoring leap seconds.
-	In practice, time_t is usually a signed 64- or 32-bit integer; 32-bit
-	signed time_t values stop working after 2038-01-19 03:14:07 UTC, so
-	new implementations these days typically use a signed 64-bit integer.
-	Unsigned 32-bit integers are used on one or two platforms,
-	and 36-bit and 40-bit integers are also used occasionally.
-	Although earlier POSIX versions allowed time_t to be a
-	floating-point type, this was not supported by any practical
-	systems, and POSIX.1-2013 and the tz code both require time_t
-	to be an integer type.
-
-These are the extensions that have been made to the POSIX functions:
-
-*	The "TZ" environment variable is used in generating the name of a file
-	from which time zone information is read (or is interpreted a la
-	POSIX); "TZ" is no longer constrained to be a three-letter time zone
-	name followed by a number of hours and an optional three-letter
-	daylight time zone name.  The daylight saving time rules to be used
-	for a particular time zone are encoded in the time zone file;
-	the format of the file allows U.S., Australian, and other rules to be
-	encoded, and allows for situations where more than two time zone
-	abbreviations are used.
-
-	It was recognized that allowing the "TZ" environment variable to
-	take on values such as "America/New_York" might cause "old" programs
-	(that expect "TZ" to have a certain form) to operate incorrectly;
-	consideration was given to using some other environment variable
-	(for example, "TIMEZONE") to hold the string used to generate the
-	time zone information file name.  In the end, however, it was decided
-	to continue using "TZ": it is widely used for time zone purposes;
-	separately maintaining both "TZ" and "TIMEZONE" seemed a nuisance;
-	and systems where "new" forms of "TZ" might cause problems can simply
-	use TZ values such as "EST5EDT" which can be used both by
-	"new" programs (a la POSIX) and "old" programs (as zone names and
-	offsets).
-
-*	The code supports platforms with a UT offset member in struct tm,
-	e.g., tm_gmtoff.
-
-*	The code supports platforms with a time zone abbreviation member in
-	struct tm, e.g., tm_zone.
-
-*	Since the "TZ" environment variable can now be used to control time
-	conversion, the "daylight" and "timezone" variables are no longer
-	needed.  (These variables are defined and set by "tzset"; however, their
-	values will not be used by "localtime.")
-
-*	Functions tzalloc, tzfree, localtime_rz, and mktime_z for
-	more-efficient thread-safe applications that need to use
-	multiple time zones.  The tzalloc and tzfree functions
-	allocate and free objects of type timezone_t, and localtime_rz
-	and mktime_z are like localtime_r and mktime with an extra
-	timezone_t argument.  The functions were inspired by NetBSD.
-
-*	A function "tzsetwall" has been added to arrange for the system's
-	best approximation to local wall clock time to be delivered by
-	subsequent calls to "localtime."  Source code for portable
-	applications that "must" run on local wall clock time should call
-	"tzsetwall();" if such code is moved to "old" systems that don't
-	provide tzsetwall, you won't be able to generate an executable program.
-	(These time zone functions also arrange for local wall clock time to be
-	used if tzset is called - directly or indirectly - and there's no "TZ"
-	environment variable; portable applications should not, however, rely
-	on this behavior since it's not the way SVR2 systems behave.)
-
-*	Negative time_t values are supported, on systems where time_t is signed.
-
-*	These functions can account for leap seconds, thanks to Bradley White.
-
-Points of interest to folks with other systems:
-
-*	Code compatible with this package is already part of many platforms,
-	including GNU/Linux, Android, the BSDs, Chromium OS, Cygwin, AIX, iOS,
-	BlackBerry 10, macOS, Microsoft Windows, OpenVMS, and Solaris.
-	On such hosts, the primary use of this package
-	is to update obsolete time zone rule tables.
-	To do this, you may need to compile the time zone compiler
-	'zic' supplied with this package instead of using the system 'zic',
-	since the format of zic's input is occasionally extended,
-	and a platform may still be shipping an older zic.
-
-*	The UNIX Version 7 "timezone" function is not present in this package;
-	it's impossible to reliably map timezone's arguments (a "minutes west
-	of GMT" value and a "daylight saving time in effect" flag) to a
-	time zone abbreviation, and we refuse to guess.
-	Programs that in the past used the timezone function may now examine
-	localtime(&clock)->tm_zone (if TM_ZONE is defined) or
-	tzname[localtime(&clock)->tm_isdst] (if HAVE_TZNAME is defined)
-	to learn the correct time zone abbreviation to use.
-
-*	The 4.2BSD gettimeofday function is not used in this package.
-	This formerly let users obtain the current UTC offset and DST flag,
-	but this functionality was removed in later versions of BSD.
-
-*	In SVR2, time conversion fails for near-minimum or near-maximum
-	time_t values when doing conversions for places that don't use UT.
-	This package takes care to do these conversions correctly.
-	A comment in the source code tells how to get compatibly wrong
-	results.
-
-The functions that are conditionally compiled if STD_INSPIRED is defined
-should, at this point, be looked on primarily as food for thought.  They are
-not in any sense "standard compatible" - some are not, in fact, specified in
-*any* standard.  They do, however, represent responses of various authors to
-standardization proposals.
-
-Other time conversion proposals, in particular the one developed by folks at
-Hewlett Packard, offer a wider selection of functions that provide capabilities
-beyond those provided here.  The absence of such functions from this package
-is not meant to discourage the development, standardization, or use of such
-functions.  Rather, their absence reflects the decision to make this package
-contain valid extensions to POSIX, to ensure its broad acceptability.  If
-more powerful time conversion functions can be standardized, so much the
-better.
-
-
------ Interface stability -----
-
-The tz code and data supply the following interfaces:
-
- * A set of zone names as per "Names of time zone rules" above.
-
- * Library functions described in "Time and date functions" above.
-
- * The programs tzselect, zdump, and zic, documented in their man pages.
-
- * The format of zic input files, documented in the zic man page.
-
- * The format of zic output files, documented in the tzfile man page.
-
- * The format of zone table files, documented in zone1970.tab.
-
- * The format of the country code file, documented in iso3166.tab.
-
- * The version number of the code and data, as the first line of
-   the text file 'version' in each release.
-
-Interface changes in a release attempt to preserve compatibility with
-recent releases.  For example, tz data files typically do not rely on
-recently-added zic features, so that users can run older zic versions
-to process newer data files.  The tz-link.htm file describes how
-releases are tagged and distributed.
-
-Interfaces not listed above are less stable.  For example, users
-should not rely on particular UT offsets or abbreviations for
-timestamps, as data entries are often based on guesswork and these
-guesses may be corrected or improved.
-
-
------ Calendrical issues -----
-
-Calendrical issues are a bit out of scope for a time zone database,
-but they indicate the sort of problems that we would run into if we
-extended the time zone database further into the past.  An excellent
-resource in this area is Nachum Dershowitz and Edward M. Reingold,
-Calendrical Calculations: Third Edition, Cambridge University Press (2008)
-<https://www.cs.tau.ac.il/~nachum/calendar-book/third-edition/>.
-Other information and sources are given below.  They sometimes disagree.
-
-
-France
-
-Gregorian calendar adopted 1582-12-20.
-French Revolutionary calendar used 1793-11-24 through 1805-12-31,
-and (in Paris only) 1871-05-06 through 1871-05-23.
-
-
-Russia
-
-From Chris Carrier (1996-12-02):
-On 1929-10-01 the Soviet Union instituted an "Eternal Calendar"
-with 30-day months plus 5 holidays, with a 5-day week.
-On 1931-12-01 it changed to a 6-day week; in 1934 it reverted to the
-Gregorian calendar while retaining the 6-day week; on 1940-06-27 it
-reverted to the 7-day week.  With the 6-day week the usual days
-off were the 6th, 12th, 18th, 24th and 30th of the month.
-(Source: Evitiar Zerubavel, _The Seven Day Circle_)
-
-
-Mark Brader reported a similar story in "The Book of Calendars", edited
-by Frank Parise (1982, Facts on File, ISBN 0-8719-6467-8), page 377.  But:
-
-From: Petteri Sulonen (via Usenet)
-Date: 14 Jan 1999 00:00:00 GMT
-...
-
-If your source is correct, how come documents between 1929 and 1940 were
-still dated using the conventional, Gregorian calendar?
-
-I can post a scan of a document dated December 1, 1934, signed by
-Yenukidze, the secretary, on behalf of Kalinin, the President of the
-Executive Committee of the Supreme Soviet, if you like.
-
-
-
-Sweden (and Finland)
-
-From: Mark Brader
-Subject: Re: Gregorian reform - a part of locale?
-<news:1996Jul6.012937.29190 at sq.com>
-Date: 1996-07-06
-
-In 1700, Denmark made the transition from Julian to Gregorian.  Sweden
-decided to *start* a transition in 1700 as well, but rather than have one of
-those unsightly calendar gaps :-), they simply decreed that the next leap
-year after 1696 would be in 1744 - putting the whole country on a calendar
-different from both Julian and Gregorian for a period of 40 years.
-
-However, in 1704 something went wrong and the plan was not carried through;
-they did, after all, have a leap year that year.  And one in 1708.  In 1712
-they gave it up and went back to Julian, putting 30 days in February that
-year!...
-
-Then in 1753, Sweden made the transition to Gregorian in the usual manner,
-getting there only 13 years behind the original schedule.
-
-(A previous posting of this story was challenged, and Swedish readers
-produced the following references to support it: "Tideräkning och historia"
-by Natanael Beckman (1924) and "Tid, en bok om tideräkning och
-kalenderväsen" by Lars-Olof Lodén (1968).
-
-
-Grotefend's data
-
-From: "Michael Palmer" [with one obvious typo fixed]
-Subject: Re: Gregorian Calendar (was Re: Another FHC related question
-Newsgroups: soc.genealogy.german
-Date: Tue, 9 Feb 1999 02:32:48 -800
-...
-
-The following is a(n incomplete) listing, arranged chronologically, of
-European states, with the date they converted from the Julian to the
-Gregorian calendar:
-
-04/15 Oct 1582 - Italy (with exceptions), Spain, Portugal, Poland (Roman
-                 Catholics and Danzig only)
-09/20 Dec 1582 - France, Lorraine
-
-21 Dec 1582/
-   01 Jan 1583 - Holland, Brabant, Flanders, Hennegau
-10/21 Feb 1583 - bishopric of Liege (Lüttich)
-13/24 Feb 1583 - bishopric of Augsburg
-04/15 Oct 1583 - electorate of Trier
-05/16 Oct 1583 - Bavaria, bishoprics of Freising, Eichstedt, Regensburg,
-                 Salzburg, Brixen
-13/24 Oct 1583 - Austrian Oberelsaß and Breisgau
-20/31 Oct 1583 - bishopric of Basel
-02/13 Nov 1583 - duchy of Jülich-Berg
-02/13 Nov 1583 - electorate and city of Köln
-04/15 Nov 1583 - bishopric of Würzburg
-11/22 Nov 1583 - electorate of Mainz
-16/27 Nov 1583 - bishopric of Strassburg and the margraviate of Baden
-17/28 Nov 1583 - bishopric of Münster and duchy of Cleve
-14/25 Dec 1583 - Steiermark
-
-06/17 Jan 1584 - Austria and Bohemia
-11/22 Jan 1584 - Lucerne, Uri, Schwyz, Zug, Freiburg, Solothurn
-12/23 Jan 1584 - Silesia and the Lausitz
-22 Jan/
-   02 Feb 1584 - Hungary (legally on 21 Oct 1587)
-      Jun 1584 - Unterwalden
-01/12 Jul 1584 - duchy of Westfalen
-
-16/27 Jun 1585 - bishopric of Paderborn
-
-14/25 Dec 1590 - Transylvania
-
-22 Aug/
-   02 Sep 1612 - duchy of Prussia
-
-13/24 Dec 1614 - Pfalz-Neuburg
-
-          1617 - duchy of Kurland (reverted to the Julian calendar in
-                 1796)
-
-          1624 - bishopric of Osnabrück
-
-          1630 - bishopric of Minden
-
-15/26 Mar 1631 - bishopric of Hildesheim
-
-          1655 - Kanton Wallis
-
-05/16 Feb 1682 - city of Strassburg
-
-18 Feb/
-   01 Mar 1700 - Protestant Germany (including Swedish possessions in
-                 Germany), Denmark, Norway
-30 Jun/
-   12 Jul 1700 - Gelderland, Zutphen
-10 Nov/
-   12 Dec 1700 - Utrecht, Overijssel
-
-31 Dec 1700/
-   12 Jan 1701 - Friesland, Groningen, Zürich, Bern, Basel, Geneva,
-                 Turgau, and Schaffhausen
-
-          1724 - Glarus, Appenzell, and the city of St. Gallen
-
-01 Jan 1750    - Pisa and Florence
-
-02/14 Sep 1752 - Great Britain
-
-17 Feb/
-   01 Mar 1753 - Sweden
-
-1760-1812      - Graubünden
-
-The Russian empire (including Finland and the Baltic states) did not
-convert to the Gregorian calendar until the Soviet revolution of 1917.
-
-Source: H. Grotefend, _Taschenbuch der Zeitrechnung des deutschen
-Mittelalters und der Neuzeit_, herausgegeben von Dr. O. Grotefend
-(Hannover: Hahnsche Buchhandlung, 1941), pp. 26-28.
-
-
------ Time and time zones on other planets -----
-
-Some people's work schedules use Mars time.  Jet Propulsion Laboratory
-(JPL) coordinators have kept Mars time on and off at least since 1997
-for the Mars Pathfinder mission.  Some of their family members have
-also adapted to Mars time.  Dozens of special Mars watches were built
-for JPL workers who kept Mars time during the Mars Exploration
-Rovers mission (2004).  These timepieces look like normal Seikos and
-Citizens but use Mars seconds rather than terrestrial seconds.
-
-A Mars solar day is called a "sol" and has a mean period equal to
-about 24 hours 39 minutes 35.244 seconds in terrestrial time.  It is
-divided into a conventional 24-hour clock, so each Mars second equals
-about 1.02749125 terrestrial seconds.
-
-The prime meridian of Mars goes through the center of the crater
-Airy-0, named in honor of the British astronomer who built the
-Greenwich telescope that defines Earth's prime meridian.  Mean solar
-time on the Mars prime meridian is called Mars Coordinated Time (MTC).
-
-Each landed mission on Mars has adopted a different reference for
-solar time keeping, so there is no real standard for Mars time zones.
-For example, the Mars Exploration Rover project (2004) defined two
-time zones "Local Solar Time A" and "Local Solar Time B" for its two
-missions, each zone designed so that its time equals local true solar
-time at approximately the middle of the nominal mission.  Such a "time
-zone" is not particularly suited for any application other than the
-mission itself.
-
-Many calendars have been proposed for Mars, but none have achieved
-wide acceptance.  Astronomers often use Mars Sol Date (MSD) which is a
-sequential count of Mars solar days elapsed since about 1873-12-29
-12:00 GMT.
-
-In our solar system, Mars is the planet with time and calendar most
-like Earth's.  On other planets, Sun-based time and calendars would
-work quite differently.  For example, although Mercury's sidereal
-rotation period is 58.646 Earth days, Mercury revolves around the Sun
-so rapidly that an observer on Mercury's equator would see a sunrise
-only every 175.97 Earth days, i.e., a Mercury year is 0.5 of a Mercury
-day.  Venus is more complicated, partly because its rotation is
-slightly retrograde: its year is 1.92 of its days.  Gas giants like
-Jupiter are trickier still, as their polar and equatorial regions
-rotate at different rates, so that the length of a day depends on
-latitude.  This effect is most pronounced on Neptune, where the day is
-about 12 hours at the poles and 18 hours at the equator.
-
-Although the tz database does not support time on other planets, it is
-documented here in the hopes that support will be added eventually.
-
-Sources:
-
-Michael Allison and Robert Schmunk,
-"Technical Notes on Mars Solar Time as Adopted by the Mars24 Sunclock"
-<https://www.giss.nasa.gov/tools/mars24/help/notes.html> (2012-08-08).
-
-Jia-Rui Chong, "Workdays Fit for a Martian", Los Angeles Times
-<http://articles.latimes.com/2004/jan/14/science/sci-marstime14>
-(2004-01-14), pp A1, A20-A21.
-
-Tom Chmielewski, "Jet Lag Is Worse on Mars", The Atlantic (2015-02-26)
-<https://www.theatlantic.com/technology/archive/2015/02/jet-lag-is-worse-on-mars/386033/>
-
-Matt Williams, "How long is a day on the other planets of the solar
-system?" <https://www.universetoday.com/37481/days-of-the-planets/>
-(2017-04-27).
-
------
-
-This file is in the public domain, so clarified as of 2009-05-17 by
-Arthur David Olson.
-
------
-Local Variables:
-coding: utf-8
-End:
diff --git a/calendars b/calendars
new file mode 100644
index 0000000..8bc7062
--- /dev/null
+++ b/calendars
@@ -0,0 +1,173 @@
+----- Calendrical issues -----
+
+As mentioned in theory.html, although calendrical issues are out of
+scope for tzdb, they indicate the sort of problems that we would run
+into if we extended tzdb further into the past.  The following
+information and sources go beyond theory.html's brief discussion.
+They sometimes disagree.
+
+
+France
+
+Gregorian calendar adopted 1582-12-20.
+French Revolutionary calendar used 1793-11-24 through 1805-12-31,
+and (in Paris only) 1871-05-06 through 1871-05-23.
+
+
+Russia
+
+From Chris Carrier (1996-12-02):
+On 1929-10-01 the Soviet Union instituted an "Eternal Calendar"
+with 30-day months plus 5 holidays, with a 5-day week.
+On 1931-12-01 it changed to a 6-day week; in 1934 it reverted to the
+Gregorian calendar while retaining the 6-day week; on 1940-06-27 it
+reverted to the 7-day week.  With the 6-day week the usual days
+off were the 6th, 12th, 18th, 24th and 30th of the month.
+(Source: Evitiar Zerubavel, _The Seven Day Circle_)
+
+
+Mark Brader reported a similar story in "The Book of Calendars", edited
+by Frank Parise (1982, Facts on File, ISBN 0-8719-6467-8), page 377.  But:
+
+From: Petteri Sulonen (via Usenet)
+Date: 14 Jan 1999 00:00:00 GMT
+...
+
+If your source is correct, how come documents between 1929 and 1940 were
+still dated using the conventional, Gregorian calendar?
+
+I can post a scan of a document dated December 1, 1934, signed by
+Yenukidze, the secretary, on behalf of Kalinin, the President of the
+Executive Committee of the Supreme Soviet, if you like.
+
+
+
+Sweden (and Finland)
+
+From: Mark Brader
+Subject: Re: Gregorian reform - a part of locale?
+<news:1996Jul6.012937.29190 at sq.com>
+Date: 1996-07-06
+
+In 1700, Denmark made the transition from Julian to Gregorian.  Sweden
+decided to *start* a transition in 1700 as well, but rather than have one of
+those unsightly calendar gaps :-), they simply decreed that the next leap
+year after 1696 would be in 1744 - putting the whole country on a calendar
+different from both Julian and Gregorian for a period of 40 years.
+
+However, in 1704 something went wrong and the plan was not carried through;
+they did, after all, have a leap year that year.  And one in 1708.  In 1712
+they gave it up and went back to Julian, putting 30 days in February that
+year!...
+
+Then in 1753, Sweden made the transition to Gregorian in the usual manner,
+getting there only 13 years behind the original schedule.
+
+(A previous posting of this story was challenged, and Swedish readers
+produced the following references to support it: "Tideräkning och historia"
+by Natanael Beckman (1924) and "Tid, en bok om tideräkning och
+kalenderväsen" by Lars-Olof Lodén (1968).
+
+
+Grotefend's data
+
+From: "Michael Palmer" [with one obvious typo fixed]
+Subject: Re: Gregorian Calendar (was Re: Another FHC related question
+Newsgroups: soc.genealogy.german
+Date: Tue, 9 Feb 1999 02:32:48 -800
+...
+
+The following is a(n incomplete) listing, arranged chronologically, of
+European states, with the date they converted from the Julian to the
+Gregorian calendar:
+
+04/15 Oct 1582 - Italy (with exceptions), Spain, Portugal, Poland (Roman
+                 Catholics and Danzig only)
+09/20 Dec 1582 - France, Lorraine
+
+21 Dec 1582/
+   01 Jan 1583 - Holland, Brabant, Flanders, Hennegau
+10/21 Feb 1583 - bishopric of Liege (Lüttich)
+13/24 Feb 1583 - bishopric of Augsburg
+04/15 Oct 1583 - electorate of Trier
+05/16 Oct 1583 - Bavaria, bishoprics of Freising, Eichstedt, Regensburg,
+                 Salzburg, Brixen
+13/24 Oct 1583 - Austrian Oberelsaß and Breisgau
+20/31 Oct 1583 - bishopric of Basel
+02/13 Nov 1583 - duchy of Jülich-Berg
+02/13 Nov 1583 - electorate and city of Köln
+04/15 Nov 1583 - bishopric of Würzburg
+11/22 Nov 1583 - electorate of Mainz
+16/27 Nov 1583 - bishopric of Strassburg and the margraviate of Baden
+17/28 Nov 1583 - bishopric of Münster and duchy of Cleve
+14/25 Dec 1583 - Steiermark
+
+06/17 Jan 1584 - Austria and Bohemia
+11/22 Jan 1584 - Lucerne, Uri, Schwyz, Zug, Freiburg, Solothurn
+12/23 Jan 1584 - Silesia and the Lausitz
+22 Jan/
+   02 Feb 1584 - Hungary (legally on 21 Oct 1587)
+      Jun 1584 - Unterwalden
+01/12 Jul 1584 - duchy of Westfalen
+
+16/27 Jun 1585 - bishopric of Paderborn
+
+14/25 Dec 1590 - Transylvania
+
+22 Aug/
+   02 Sep 1612 - duchy of Prussia
+
+13/24 Dec 1614 - Pfalz-Neuburg
+
+          1617 - duchy of Kurland (reverted to the Julian calendar in
+                 1796)
+
+          1624 - bishopric of Osnabrück
+
+          1630 - bishopric of Minden
+
+15/26 Mar 1631 - bishopric of Hildesheim
+
+          1655 - Kanton Wallis
+
+05/16 Feb 1682 - city of Strassburg
+
+18 Feb/
+   01 Mar 1700 - Protestant Germany (including Swedish possessions in
+                 Germany), Denmark, Norway
+30 Jun/
+   12 Jul 1700 - Gelderland, Zutphen
+10 Nov/
+   12 Dec 1700 - Utrecht, Overijssel
+
+31 Dec 1700/
+   12 Jan 1701 - Friesland, Groningen, Zürich, Bern, Basel, Geneva,
+                 Turgau, and Schaffhausen
+
+          1724 - Glarus, Appenzell, and the city of St. Gallen
+
+01 Jan 1750    - Pisa and Florence
+
+02/14 Sep 1752 - Great Britain
+
+17 Feb/
+   01 Mar 1753 - Sweden
+
+1760-1812      - Graubünden
+
+The Russian empire (including Finland and the Baltic states) did not
+convert to the Gregorian calendar until the Soviet revolution of 1917.
+
+Source: H. Grotefend, _Taschenbuch der Zeitrechnung des deutschen
+Mittelalters und der Neuzeit_, herausgegeben von Dr. O. Grotefend
+(Hannover: Hahnsche Buchhandlung, 1941), pp. 26-28.
+
+-----
+
+This file is in the public domain, so clarified as of 2009-05-17 by
+Arthur David Olson.
+
+-----
+Local Variables:
+coding: utf-8
+End:
diff --git a/theory.html b/theory.html
new file mode 100644
index 0000000..965135d
--- /dev/null
+++ b/theory.html
@@ -0,0 +1,1034 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <title>Theory and pragmatics of the tz code and data</title>
+  <meta charset="UTF-8">
+</head>
+
+<!-- The somewhat-unusual indenting style in this file is intended to
+     shrink the output of the shell command 'diff Theory theory.html',
+     where 'Theory' was the plain text file that this file is derived
+     from.  The 'Theory' file used leading white space to indent, and
+     when possible that indentation is preserved here.  Eventually we
+     may stop doing this and remove this comment.  -->
+
+<body>
+  <h1>Theory and pragmatics of the tz code and data</h1>
+  <h3>Outline</h3>
+  <nav>
+    <ul>
+      <li><a href="#scope">Scope of the tz database</a></li>
+      <li><a href="#naming">Names of time zone rules</a></li>
+      <li><a href="#abbreviations">Time zone abbreviations</a></li>
+      <li><a href="#accuracy">Accuracy of the tz database</a></li>
+      <li><a href="#functions">Time and date functions</a></li>
+      <li><a href="#stability">Interface stability</a></li>
+      <li><a href="#calendar">Calendrical issues</a></li>
+      <li><a href="#planets">Time and time zones on other planets</a></li>
+    </ul>
+  </nav>
+
+
+  <section>
+    <h2 id="scope">Scope of the tz database</h2>
+<p>
+The tz database attempts to record the history and predicted future of
+all computer-based clocks that track civil time.  To represent this
+data, the world is partitioned into regions whose clocks all agree
+about timestamps that occur after the somewhat-arbitrary cutoff point
+of the POSIX Epoch (1970-01-01 00:00:00 UTC).  For each such region,
+the database records all known clock transitions, and labels the region
+with a notable location.  Although 1970 is a somewhat-arbitrary
+cutoff, there are significant challenges to moving the cutoff earlier
+even by a decade or two, due to the wide variety of local practices
+before computer timekeeping became prevalent.
+</p>
+
+<p>
+Clock transitions before 1970 are recorded for each such location,
+because most systems support timestamps before 1970 and could
+misbehave if data entries were omitted for pre-1970 transitions.
+However, the database is not designed for and does not suffice for
+applications requiring accurate handling of all past times everywhere,
+as it would take far too much effort and guesswork to record all
+details of pre-1970 civil timekeeping.
+</p>
+
+<p>
+As described below, reference source code for using the tz database is
+also available.  The tz code is upwards compatible with POSIX, an
+international standard for UNIX-like systems.  As of this writing, the
+current edition of POSIX is:
+  <a href="http://pubs.opengroup.org/onlinepubs/9699919799/">
+  The Open Group Base Specifications Issue 7</a>,
+  IEEE Std 1003.1-2008, 2016 Edition.
+</p>
+  </section>
+
+
+
+  <section>
+    <h2 id="naming">Names of time zone rules</h2>
+<p>
+Each of the database's time zone rules has a unique name.
+Inexperienced users are not expected to select these names unaided.
+Distributors should provide documentation and/or a simple selection
+interface that explains the names; for one example, see the 'tzselect'
+program in the tz code.  The
+<a href="http://cldr.unicode.org/">Unicode Common Locale Data
+Repository</a> contains data that may be useful for other
+selection interfaces.
+</p>
+
+<p>
+The time zone rule naming conventions attempt to strike a balance
+among the following goals:
+</p>
+<ul>
+  <li>
+   Uniquely identify every region where clocks have agreed since 1970.
+   This is essential for the intended use: static clocks keeping local
+   civil time.
+  </li>
+  <li>
+   Indicate to experts where that region is.
+  </li>
+  <li>
+   Be robust in the presence of political changes.  For example, names
+   of countries are ordinarily not used, to avoid incompatibilities
+   when countries change their name (e.g. Zaire&rarr;Congo) or when
+   locations change countries (e.g. Hong Kong from UK colony to
+   China).
+  </li>
+  <li>
+   Be portable to a wide variety of implementations.
+  </li>
+  <li>
+   Use consistent naming conventions over the entire world.
+  </li>
+</ul>
+<p>
+Names normally have the
+form <var>AREA</var><code>/</code><var>LOCATION</var>,
+where <var>AREA</var> is the name of a continent or ocean,
+and <var>LOCATION</var> is the name of a specific
+location within that region.  North and South America share the same
+area, '<code>America</code>'.  Typical names are
+'<code>Africa/Cairo</code>', '<code>America/New_York</code>', and
+'<code>Pacific/Honolulu</code>'.
+</p>
+
+<p>
+Here are the general rules used for choosing location names,
+in decreasing order of importance:
+</p>
+<ul>
+  <li>
+	Use only valid POSIX file name components (i.e., the parts of
+		names other than '<code>/</code>').  Do not use the file name
+		components '<code>.</code>' and '<code>..</code>'.
+		Within a file name component,
+		use only ASCII letters, '<code>.</code>',
+		'<code>-</code>' and '<code>_</code>'.  Do not use
+		digits, as that might create an ambiguity with POSIX
+		TZ strings.  A file name component must not exceed 14
+		characters or start with '<code>-</code>'.  E.g.,
+		prefer '<code>Brunei</code>' to
+		'<code>Bandar_Seri_Begawan</code>'.  Exceptions: see
+		the discussion
+		of legacy names below.
+  </li>
+  <li>
+	A name must not be empty, or contain '<code>//</code>', or
+	start or end with '<code>/</code>'.
+  </li>
+  <li>
+	Do not use names that differ only in case.  Although the reference
+		implementation is case-sensitive, some other implementations
+		are not, and they would mishandle names differing only in case.
+  </li>
+  <li>
+	If one name <var>A</var> is an initial prefix of another
+		name <var>AB</var> (ignoring case), then <var>B</var>
+		must not start with '<code>/</code>', as a
+		regular file cannot have
+		the same name as a directory in POSIX.  For example,
+		'<code>America/New_York</code>' precludes
+		'<code>America/New_York/Bronx</code>'.
+  </li>
+  <li>
+	Uninhabited regions like the North Pole and Bouvet Island
+		do not need locations, since local time is not defined there.
+  </li>
+  <li>
+	There should typically be at least one name for each ISO 3166-1
+		officially assigned two-letter code for an inhabited country
+		or territory.
+  </li>
+  <li>
+	If all the clocks in a region have agreed since 1970,
+		don't bother to include more than one location
+		even if subregions' clocks disagreed before 1970.
+		Otherwise these tables would become annoyingly large.
+  </li>
+  <li>
+	If a name is ambiguous, use a less ambiguous alternative;
+		e.g. many cities are named San José and Georgetown, so
+		prefer '<code>Costa_Rica</code>' to '<code>San_Jose</code>' and '<code>Guyana</code>' to '<code>Georgetown</code>'.
+  </li>
+  <li>
+	Keep locations compact.  Use cities or small islands, not countries
+		or regions, so that any future time zone changes do not split
+		locations into different time zones.  E.g. prefer
+		'<code>Paris</code>' to '<code>France</code>', since
+		France has had multiple time zones.
+  </li>
+  <li>
+	Use mainstream English spelling, e.g. prefer
+		'<code>Rome</code>' to '<code>Roma</code>', and prefer
+		'<code>Athens</code>' to the Greek
+		'<code>Αθήνα</code>' or the Romanized
+		'<code>Athína</code>'.
+		The POSIX file name restrictions encourage this rule.
+  </li>
+  <li>
+	Use the most populous among locations in a zone,
+		e.g. prefer '<code>Shanghai</code>' to
+		'<code>Beijing</code>'.  Among locations with
+		similar populations, pick the best-known location,
+		e.g. prefer '<code>Rome</code>' to '<code>Milan</code>'.
+  </li>
+  <li>
+	Use the singular form, e.g. prefer '<code>Canary</code>' to '<code>Canaries</code>'.
+  </li>
+  <li>
+	Omit common suffixes like '<code>_Islands</code>' and
+		'<code>_City</code>', unless that would lead to
+		ambiguity.  E.g. prefer '<code>Cayman</code>' to
+		'<code>Cayman_Islands</code>' and
+		'<code>Guatemala</code>' to
+		'<code>Guatemala_City</code>', but prefer
+		'<code>Mexico_City</code>' to '<code>Mexico</code>'
+		because the country
+		of Mexico has several time zones.
+  </li>
+  <li>
+	Use '<code>_</code>' to represent a space.
+  </li>
+  <li>
+	Omit '<code>.</code>' from abbreviations in names, e.g. prefer
+		'<code>St_Helena</code>' to '<code>St._Helena</code>'.
+  </li>
+  <li>
+	Do not change established names if they only marginally
+		violate the above rules.  For example, don't change
+		the existing name '<code>Rome</code>' to
+		'<code>Milan</code>' merely because
+		Milan's population has grown to be somewhat greater
+		than Rome's.
+  </li>
+  <li>
+	If a name is changed, put its old spelling in the
+		'<code>backward</code>' file.
+		This means old spellings will continue to work.
+  </li>
+</ul>
+
+<p>
+The file '<code>zone1970.tab</code>' lists geographical locations used
+to name time
+zone rules.  It is intended to be an exhaustive list of names for
+geographic regions as described above; this is a subset of the names
+in the data.  Although a '<code>zone1970.tab</code>' location's longitude
+corresponds to its LMT offset with one hour for every 15 degrees east
+longitude, this relationship is not exact.
+</p>
+
+<p>
+Older versions of this package used a different naming scheme,
+and these older names are still supported.
+See the file '<code>backward</code>' for most of these older names
+(e.g., '<code>US/Eastern</code>' instead of '<code>America/New_York</code>').
+The other old-fashioned names still supported are
+'<code>WET</code>', '<code>CET</code>', '<code>MET</code>', and '<code>EET</code>' (see the file '<code>europe</code>').
+</p>
+
+<p>
+Older versions of this package defined legacy names that are
+incompatible with the first rule of location names, but which are
+still supported.  These legacy names are mostly defined in the file
+'<code>etcetera</code>'.  Also, the file '<code>backward</code>' defines the legacy names
+'<code>GMT0</code>', '<code>GMT-0</code>' and '<code>GMT+0</code>', and the file '<code>northamerica</code>' defines the
+legacy names '<code>EST5EDT</code>', '<code>CST6CDT</code>', '<code>MST7MDT</code>', and '<code>PST8PDT</code>'.
+</p>
+
+<p>
+Excluding '<code>backward</code>' should not affect the other data.  If
+'<code>backward</code>' is excluded, excluding '<code>etcetera</code>' should not affect the
+remaining data.
+</p>
+
+
+  </section>
+  <section>
+    <h2 id="abbreviations">Time zone abbreviations</h2>
+<p>
+When this package is installed, it generates time zone abbreviations
+like '<code>EST</code>' to be compatible with human tradition and POSIX.
+Here are the general rules used for choosing time zone abbreviations,
+in decreasing order of importance:</p>
+<ul>
+  <li>
+	Use three or more characters that are ASCII alphanumerics or
+		'<code>+</code>' or '<code>-</code>'.
+		Previous editions of this database also used characters like
+		'<code> </code>' and '<code>?</code>', but these
+		characters have a special meaning to
+		the shell and cause commands like
+			'<code>set `date`</code>'
+		to have unexpected effects.
+		Previous editions of this rule required upper-case letters,
+		but the Congressman who introduced Chamorro Standard Time
+		preferred "ChST", so lower-case letters are now allowed.
+		Also, POSIX from 2001 on relaxed the rule to allow
+		'<code>-</code>', '<code>+</code>',
+		and alphanumeric characters from the portable character set
+		in the current locale.  In practice ASCII alphanumerics and
+		'<code>+</code>' and '<code>-</code>' are safe in all locales.
+
+		In other words, in the C locale the POSIX extended regular
+		expression <code>[-+[:alnum:]]{3,}</code> should match
+		the abbreviation.
+		This guarantees that all abbreviations could have been
+		specified by a POSIX TZ string.
+  </li>
+  <li>
+	Use abbreviations that are in common use among English-speakers,
+		e.g. 'EST' for Eastern Standard Time in North America.
+		We assume that applications translate them to other languages
+		as part of the normal localization process; for example,
+		a French application might translate 'EST' to 'HNE'.
+  </li>
+  <li>
+	For zones whose times are taken from a city's longitude, use the
+		traditional <var>x</var>MT notation, e.g. 'PMT' for
+		Paris Mean Time.
+		The only name like this in current use is 'GMT'.
+  </li>
+  <li>
+	Use 'LMT' for local mean time of locations before the introduction
+		of standard time; see "<a href="#scope">Scope of the
+		tz database</a>".
+  </li>
+  <li>
+	If there is no common English abbreviation, use numeric offsets like
+		<code>-</code>05 and <code>+</code>0830 that are
+		generated by zic's <code>%z</code> notation.
+  </li>
+  <li>
+	Use current abbreviations for older timestamps to avoid confusion.
+		For example, in 1910 a common English abbreviation for UT +01
+		in central Europe was 'MEZ' (short for both "Middle European
+		Zone" and for "Mitteleuropäische Zeit" in German).  Nowadays
+		'CET' ("Central European Time") is more common in English, and
+		the database uses 'CET' even for circa-1910 timestamps as this
+		is less confusing for modern users and avoids the need for
+		determining when 'CET' supplanted 'MEZ' in common usage.
+  </li>
+  <li>
+	Use a consistent style in a zone's history.  For example, if a zone's
+		history tends to use numeric abbreviations and a particular
+		entry could go either way, use a numeric abbreviation.
+  </li>
+</ul>
+    [The remaining guidelines predate the introduction of <code>%z</code>.
+    They are problematic as they mean tz data entries invent
+    notation rather than record it.  These guidelines are now
+    deprecated and the plan is to gradually move to <code>%z</code> for
+    inhabited locations and to "<code>-</code>00" for uninhabited locations.]
+<ul>
+  <li>
+	If there is no common English abbreviation, abbreviate the English
+		translation of the usual phrase used by native speakers.
+		If this is not available or is a phrase mentioning the country
+		(e.g. "Cape Verde Time"), then:
+	<ul>
+	  <li>
+		When a country is identified with a single or principal zone,
+			append 'T' to the country's ISO code, e.g. 'CVT' for
+			Cape Verde Time.  For summer time append 'ST';
+			for double summer time append 'DST'; etc.
+	  </li>
+	  <li>
+		Otherwise, take the first three letters of an English place
+			name identifying each zone and append 'T', 'ST', etc.
+			as before; e.g. 'CHAST' for CHAtham Summer Time.
+	  </li>
+	</ul>
+  </li>
+  <li>
+	Use UT (with time zone abbreviation '<code>-</code>00') for
+		locations while uninhabited.  The leading
+		'<code>-</code>' is a flag that the time
+		zone is in some sense undefined; this notation is
+		derived from Internet RFC 3339.
+  </li>
+</ul>
+<p>
+Application writers should note that these abbreviations are ambiguous
+in practice: e.g. 'CST' has a different meaning in China than
+it does in the United States.  In new applications, it's often better
+to use numeric UT offsets like '<code>-</code>0600' instead of time zone
+abbreviations like 'CST'; this avoids the ambiguity.
+</p>
+  </section>
+
+
+  <section>
+    <h2 id="accuracy">Accuracy of the tz database</h2>
+<p>
+The tz database is not authoritative, and it surely has errors.
+Corrections are welcome and encouraged; see the file CONTRIBUTING.
+Users requiring authoritative data should consult national standards
+bodies and the references cited in the database's comments.
+</p>
+
+<p>
+Errors in the tz database arise from many sources:
+</p>
+<ul>
+  <li>
+   The tz database predicts future timestamps, and current predictions
+   will be incorrect after future governments change the rules.
+   For example, if today someone schedules a meeting for 13:00 next
+   October 1, Casablanca time, and tomorrow Morocco changes its
+   daylight saving rules, software can mess up after the rule change
+   if it blithely relies on conversions made before the change.
+  </li>
+  <li>
+   The pre-1970 entries in this database cover only a tiny sliver of how
+   clocks actually behaved; the vast majority of the necessary
+   information was lost or never recorded.  Thousands more zones would
+   be needed if the tz database's scope were extended to cover even
+   just the known or guessed history of standard time; for example,
+   the current single entry for France would need to split into dozens
+   of entries, perhaps hundreds.  And in most of the world even this
+   approach would be misleading due to widespread disagreement or
+   indifference about what times should be observed.  In her 2015 book
+   <cite>The Global Transformation of Time, 1870-1950</cite>, Vanessa Ogle writes
+   "Outside of Europe and North America there was no system of time
+   zones at all, often not even a stable landscape of mean times,
+   prior to the middle decades of the twentieth century".  See:
+   Timothy Shenk, <a
+   href="https://www.dissentmagazine.org/blog/booked-a-global-history-of-time-vanessa-ogle">Booked:
+   A Global History of Time</a>. <cite>Dissent</cite> 2015-12-17.
+  </li>
+  <li>
+   Most of the pre-1970 data entries come from unreliable sources, often
+   astrology books that lack citations and whose compilers evidently
+   invented entries when the true facts were unknown, without
+   reporting which entries were known and which were invented.
+   These books often contradict each other or give implausible entries,
+   and on the rare occasions when they are checked they are
+   typically found to be incorrect.
+  </li>
+  <li>
+   For the UK the tz database relies on years of first-class work done by
+   Joseph Myers and others; see
+   "<a href="https://www.polyomino.org.uk/british-time/">History of
+   legal time in Britain</a>".
+   Other countries are not done nearly as well.
+  </li>
+  <li>
+   Sometimes, different people in the same city would maintain clocks
+   that differed significantly.  Railway time was used by railroad
+   companies (which did not always agree with each other),
+   church-clock time was used for birth certificates, etc.
+   Often this was merely common practice, but sometimes it was set by law.
+   For example, from 1891 to 1911 the UT offset in France was legally
+   0:09:21 outside train stations and 0:04:21 inside.
+  </li>
+  <li>
+   Although a named location in the tz database stands for the
+   containing region, its pre-1970 data entries are often accurate for
+   only a small subset of that region.  For example, <code>Europe/London</code>
+   stands for the United Kingdom, but its pre-1847 times are valid
+   only for locations that have London's exact meridian, and its 1847
+   transition to GMT is known to be valid only for the L&amp;NW and the
+   Caledonian railways.
+  </li>
+  <li>
+   The tz database does not record the earliest time for which a zone's
+   data entries are thereafter valid for every location in the region.
+   For example, <code>Europe/London</code> is valid for all locations in its
+   region after GMT was made the standard time, but the date of
+   standardization (1880-08-02) is not in the tz database, other than
+   in commentary.  For many zones the earliest time of validity is
+   unknown.
+  </li>
+  <li>
+   The tz database does not record a region's boundaries, and in many
+   cases the boundaries are not known.  For example, the zone
+   <code>America/Kentucky/Louisville</code> represents a region around
+   the city of
+   Louisville, the boundaries of which are unclear.
+  </li>
+  <li>
+   Changes that are modeled as instantaneous transitions in the tz
+   database were often spread out over hours, days, or even decades.
+  </li>
+  <li>
+   Even if the time is specified by law, locations sometimes
+   deliberately flout the law.
+  </li>
+  <li>
+   Early timekeeping practices, even assuming perfect clocks, were
+   often not specified to the accuracy that the tz database requires.
+  </li>
+  <li>
+   Sometimes historical timekeeping was specified more precisely
+   than what the tz database can handle.  For example, from 1909 to
+   1937 Netherlands clocks were legally UT +00:19:32.13, but the tz
+   database cannot represent the fractional second.
+  </li>
+  <li>
+   Even when all the timestamp transitions recorded by the tz database
+   are correct, the tz rules that generate them may not faithfully
+   reflect the historical rules.  For example, from 1922 until World
+   War II the UK moved clocks forward the day following the third
+   Saturday in April unless that was Easter, in which case it moved
+   clocks forward the previous Sunday.  Because the tz database has no
+   way to specify Easter, these exceptional years are entered as
+   separate tz Rule lines, even though the legal rules did not change.
+  </li>
+  <li>
+   The tz database models pre-standard time using the proleptic Gregorian
+   calendar and local mean time (LMT), but many people used other
+   calendars and other timescales.  For example, the Roman Empire used
+   the Julian calendar, and had 12 varying-length daytime hours with a
+   non-hour-based system at night.
+  </li>
+  <li>
+   Early clocks were less reliable, and data entries do not represent
+   clock error.
+  </li>
+  <li>
+   The tz database assumes Universal Time (UT) as an origin, even
+   though UT is not standardized for older timestamps.  In the tz
+   database commentary, UT denotes a family of time standards that
+   includes Coordinated Universal Time (UTC) along with other variants
+   such as UT1 and GMT, with days starting at midnight.  Although UT
+   equals UTC for modern timestamps, UTC was not defined until 1960,
+   so commentary uses the more-general abbreviation UT for timestamps
+   that might predate 1960.  Since UT, UT1, etc. disagree slightly,
+   and since pre-1972 UTC seconds varied in length, interpretation of
+   older timestamps can be problematic when subsecond accuracy is
+   needed.
+  </li>
+  <li>
+   Civil time was not based on atomic time before 1972, and we don't
+   know the history of earth's rotation accurately enough to map SI
+   seconds to historical solar time to better than about one-hour
+   accuracy.  See: Stephenson FR, Morrison LV, Hohenkerk CY.
+   <a href="http://dx.doi.org/10.1098/rspa.2016.0404">Measurement
+   of the Earth's rotation: 720 BC to AD 2015</a>.
+   <cite>Proc Royal Soc A</cite>. 2016 Dec 7;472:20160404.
+   Also see: Espenak F. <a
+   href="https://eclipse.gsfc.nasa.gov/SEhelp/uncertainty2004.html">Uncertainty
+   in Delta T (ΔT)</a>.
+  </li>
+  <li>
+   The relationship between POSIX time (that is, UTC but ignoring leap
+   seconds) and UTC is not agreed upon after 1972.  Although the POSIX
+   clock officially stops during an inserted leap second, at least one
+   proposed standard has it jumping back a second instead; and in
+   practice POSIX clocks more typically either progress glacially during
+   a leap second, or are slightly slowed while near a leap second.
+  </li>
+  <li>
+   The tz database does not represent how uncertain its information is.
+   Ideally it would contain information about when data entries are
+   incomplete or dicey.  Partial temporal knowledge is a field of
+   active research, though, and it's not clear how to apply it here.
+  </li>
+</ul>
+<p>
+In short, many, perhaps most, of the tz database's pre-1970 and future
+timestamps are either wrong or misleading.  Any attempt to pass the
+tz database off as the definition of time should be unacceptable to
+anybody who cares about the facts.  In particular, the tz database's
+LMT offsets should not be considered meaningful, and should not prompt
+creation of zones merely because two locations differ in LMT or
+transitioned to standard time at different dates.
+</p>
+  </section>
+
+
+  <section>
+    <h2 id="functions">Time and date functions</h2>
+<p>
+The tz code contains time and date functions that are upwards
+compatible with those of POSIX.
+</p>
+
+<p>
+POSIX has the following properties and limitations.
+</p>
+<ul>
+  <li>
+    <p>
+	In POSIX, time display in a process is controlled by the
+	environment variable TZ.  Unfortunately, the POSIX TZ string takes
+	a form that is hard to describe and is error-prone in practice.
+	Also, POSIX TZ strings can't express some daylight saving time
+	rules (for example, Israel's), or situations where more than two
+	time zone abbreviations are used in an area.
+    </p>
+    <p>
+      The POSIX TZ string takes the following form:
+    </p>
+    <p>
+      <var>std</var><var>offset</var>[<var>dst</var>[<var>offset</var>][<code>,</code><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]]]
+    </p>
+    <p>
+	where:
+    <dl>
+      <dt><var>std</var> and <var>dst</var></dt><dd>
+		are 3 or more characters specifying the standard
+		and daylight saving time (DST) zone names.
+		Starting with POSIX.1-2001, <var>std</var>
+		and <var>dst</var> may also be
+		in a quoted form like '<code>&lt;UTC+10&gt;</code>'; this allows
+		"<code>+</code>" and "<code>-</code>" in the names.
+      </dd>
+      <dt><var>offset</var></dt><dd>
+		is of the form
+		'<code>[&plusmn;]<var>hh</var>[:<var>mm</var>[:<var>ss</var>]]</code>'
+		and specifies the offset west of UT.  '<var>hh</var>'
+		may be a single digit; 0&le;<var>hh</var>&le;24.
+		The default DST offset is one hour ahead of standard time.
+      </dd>
+      <dt><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]</dt><dd>
+		specifies the beginning and end of DST.  If this is absent,
+		the system supplies its own rules for DST, and these can
+		differ from year to year; typically US DST rules are used.
+      </dd>
+      <dt><var>time</var></dt><dd>
+		takes the form
+		'<var>hh</var>[<code>:</code><var>mm</var>[<code>:</code><var>ss</var>]]'
+		and defaults to 02:00.
+		This is the same format as the offset, except that a
+		leading '<code>+</code>' or '<code>-</code>' is not allowed.
+      </dd>
+      <dt><var>date</var></dt><dd>
+		takes one of the following forms:
+	<dl>
+	  <dt><code>J</code><var>n</var> (1&le;<var>n</var>&le;365)</dt><dd>
+			origin-1 day number not counting February 29
+          </dd>
+	  <dt><var>n</var> (0&le;<var>n</var>&le;365)</dt><dd>
+			origin-0 day number counting February 29 if present
+          </dd>
+	  <dt><code>M</code><var>m</var><code>.</code><var>n</var><code>.</code><var>d</var> (0[Sunday]&le;<var>d</var>&le;6[Saturday], 1&le;<var>n</var>&le;5, 1&le;<var>m</var>&le;12)</dt><dd>
+			for the <var>d</var>th day of
+			week <var>n</var> of month <var>m</var> of the
+			year, where week 1 is the first week in which
+			day <var>d</var> appears, and '<code>5</code>'
+			stands for the last week in which
+			day <var>d</var> appears
+			(which may be either the 4th or 5th week).
+			Typically, this is the only useful form;
+			the <var>n</var>
+			and <code>J</code><var>n</var> forms are
+			rarely used.
+	  </dd>
+</dl>
+</dd>
+</dl>
+	Here is an example POSIX TZ string for New Zealand after 2007.
+	It says that standard time (NZST) is 12 hours ahead of UTC,
+	and that daylight saving time (NZDT) is observed from September's
+	last Sunday at 02:00 until April's first Sunday at 03:00:
+
+        <pre><code>TZ='NZST-12NZDT,M9.5.0,M4.1.0/3'</code></pre>
+
+	This POSIX TZ string is hard to remember, and mishandles some
+	timestamps before 2008.  With this package you can use this
+	instead:
+
+	<pre><code>TZ='Pacific/Auckland'</code></pre>
+  </li>
+  <li>
+	POSIX does not define the exact meaning of TZ values like
+	"<code>EST5EDT</code>".
+	Typically the current US DST rules are used to interpret such values,
+	which means that the US DST rules are compiled into each program
+	that does time conversion.  When these rules change (as they did
+	in the United States in 1987), all programs that
+	do time conversion must be recompiled to ensure proper results.
+  </li>
+  <li>
+	The TZ environment variable is process-global, which makes it hard
+	to write efficient, thread-safe applications that need access
+	to multiple time zones.
+  </li>
+  <li>
+	In POSIX, there's no tamper-proof way for a process to learn the
+	system's best idea of local wall clock.  (This is important for
+	applications that an administrator wants used only at certain
+	times &ndash;
+	without regard to whether the user has fiddled the TZ environment
+	variable.  While an administrator can "do everything in UTC" to get
+	around the problem, doing so is inconvenient and precludes handling
+	daylight saving time shifts &ndash; as might be required to limit phone
+	calls to off-peak hours.)
+  </li>
+  <li>
+	POSIX provides no convenient and efficient way to determine the UT
+	offset and time zone abbreviation of arbitrary timestamps,
+	particularly for time zone settings that do not fit into the
+	POSIX model.
+  </li>
+  <li>
+	POSIX requires that systems ignore leap seconds.
+  </li>
+  <li>
+	The tz code attempts to support all the <code>time_t</code>
+	implementations allowed by POSIX.  The <code>time_t</code>
+	type represents a nonnegative count of
+	seconds since 1970-01-01 00:00:00 UTC, ignoring leap seconds.
+	In practice, <code>time_t</code> is usually a signed 64- or
+	32-bit integer; 32-bit signed <code>time_t</code> values stop
+	working after 2038-01-19 03:14:07 UTC, so
+	new implementations these days typically use a signed 64-bit integer.
+	Unsigned 32-bit integers are used on one or two platforms,
+	and 36-bit and 40-bit integers are also used occasionally.
+	Although earlier POSIX versions allowed <code>time_t</code> to be a
+	floating-point type, this was not supported by any practical
+	systems, and POSIX.1-2013 and the tz code both
+	require <code>time_t</code>
+	to be an integer type.
+  </li>
+</ul>
+<p>
+These are the extensions that have been made to the POSIX functions:
+</p>
+<ul>
+  <li>
+    <p>
+	The TZ environment variable is used in generating the name of a file
+	from which time zone information is read (or is interpreted a la
+	POSIX); TZ is no longer constrained to be a three-letter time zone
+	name followed by a number of hours and an optional three-letter
+	daylight time zone name.  The daylight saving time rules to be used
+	for a particular time zone are encoded in the time zone file;
+	the format of the file allows U.S., Australian, and other rules to be
+	encoded, and allows for situations where more than two time zone
+	abbreviations are used.
+    </p>
+    <p>
+	It was recognized that allowing the TZ environment variable to
+	take on values such as '<code>America/New_York</code>' might
+	cause "old" programs
+	(that expect TZ to have a certain form) to operate incorrectly;
+	consideration was given to using some other environment variable
+	(for example, TIMEZONE) to hold the string used to generate the
+	time zone information file name.  In the end, however, it was decided
+	to continue using TZ: it is widely used for time zone purposes;
+	separately maintaining both TZ and TIMEZONE seemed a nuisance;
+	and systems where "new" forms of TZ might cause problems can simply
+	use TZ values such as "<code>EST5EDT</code>" which can be used both by
+	"new" programs (a la POSIX) and "old" programs (as zone names and
+	offsets).
+    </p>
+</li>
+<li>
+	The code supports platforms with a UT offset member
+	in <code>struct tm</code>,
+	e.g., <code>tm_gmtoff</code>.
+</li>
+<li>
+	The code supports platforms with a time zone abbreviation member in
+	<code>struct tm</code>, e.g., <code>tm_zone</code>.
+</li>
+<li>
+	Since the TZ environment variable can now be used to control time
+	conversion, the <code>daylight</code>
+	and <code>timezone</code> variables are no longer needed.
+	(These variables are defined and set by <code>tzset</code>;
+	however, their values will not be used
+	by <code>localtime</code>.)
+</li>
+<li>
+	Functions <code>tzalloc</code>, <code>tzfree</code>,
+	<code>localtime_rz</code>, and <code>mktime_z</code> for
+	more-efficient thread-safe applications that need to use
+	multiple time zones.  The <code>tzalloc</code>
+	and <code>tzfree</code> functions allocate and free objects of
+	type <code>timezone_t</code>, and <code>localtime_rz</code>
+	and <code>mktime_z</code> are like <code>localtime_r</code>
+	and <code>mktime</code> with an extra
+	<code>timezone_t</code> argument.  The functions were inspired
+	by NetBSD.
+</li>
+<li>
+	A function <code>tzsetwall</code> has been added to arrange
+	for the system's
+	best approximation to local wall clock time to be delivered by
+	subsequent calls to <code>localtime</code>.  Source code for portable
+	applications that "must" run on local wall clock time should call
+	<code>tzsetwall</code>; if such code is moved to "old" systems that don't
+	provide <code>tzsetwall</code>, you won't be able to generate an executable program.
+	(These time zone functions also arrange for local wall clock time to be
+	used if <code>tzset</code> is called &ndash; directly or indirectly &ndash;
+	and there's no TZ
+	environment variable; portable applications should not, however, rely
+	on this behavior since it's not the way SVR2 systems behave.)
+</li>
+<li>
+	Negative <code>time_t</code> values are supported, on systems
+	where <code>time_t</code> is signed.
+</li>
+<li>
+	These functions can account for leap seconds, thanks to Bradley White.
+</li>
+</ul>
+<p>
+Points of interest to folks with other systems:
+</p>
+<ul>
+  <li>
+	Code compatible with this package is already part of many platforms,
+	including GNU/Linux, Android, the BSDs, Chromium OS, Cygwin, AIX, iOS,
+	BlackBerry 10, macOS, Microsoft Windows, OpenVMS, and Solaris.
+	On such hosts, the primary use of this package
+	is to update obsolete time zone rule tables.
+	To do this, you may need to compile the time zone compiler
+	'<code>zic</code>' supplied with this package instead of using
+	the system '<code>zic</code>', since the format
+	of <code>zic</code>'s input is occasionally extended, and a
+	platform may still be shipping an older <code>zic</code>.
+  </li>
+  <li>
+	The UNIX Version 7 <code>timezone</code> function is not
+	present in this package;
+	it's impossible to reliably map timezone's arguments (a "minutes west
+	of GMT" value and a "daylight saving time in effect" flag) to a
+	time zone abbreviation, and we refuse to guess.
+	Programs that in the past used the timezone function may now examine
+	<code>localtime(&amp;clock)-&gt;tm_zone</code>
+	(if <code>TM_ZONE</code> is defined) or
+	<code>tzname[localtime(&amp;clock)-&gt;tm_isdst]</code>
+	(if <code>HAVE_TZNAME</code> is defined)
+	to learn the correct time zone abbreviation to use.
+  </li>
+  <li>
+	The 4.2BSD <code>gettimeofday</code> function is not used in
+	this package.
+	This formerly let users obtain the current UTC offset and DST flag,
+	but this functionality was removed in later versions of BSD.
+  </li>
+  <li>
+	In SVR2, time conversion fails for near-minimum or near-maximum
+	<code>time_t</code> values when doing conversions for places
+	that don't use UT.
+	This package takes care to do these conversions correctly.
+	A comment in the source code tells how to get compatibly wrong
+	results.
+  </li>
+</ul>
+<p>
+The functions that are conditionally compiled
+if <code>STD_INSPIRED</code> is defined
+should, at this point, be looked on primarily as food for thought.  They are
+not in any sense "standard compatible" &ndash; some are not, in fact,
+specified in <em>any</em> standard.  They do, however, represent responses of
+various authors to
+standardization proposals.
+</p>
+
+<p>
+Other time conversion proposals, in particular the one developed by folks at
+Hewlett Packard, offer a wider selection of functions that provide capabilities
+beyond those provided here.  The absence of such functions from this package
+is not meant to discourage the development, standardization, or use of such
+functions.  Rather, their absence reflects the decision to make this package
+contain valid extensions to POSIX, to ensure its broad acceptability.  If
+more powerful time conversion functions can be standardized, so much the
+better.
+</p>
+  </section>
+
+
+  <section>
+    <h2 id="stability">Interface stability</h2>
+<p>
+The tz code and data supply the following interfaces:
+</p>
+<ul>
+  <li>
+   A set of zone names as per "<a href="#naming">Names of time zone
+   rules</a>" above.
+  </li>
+  <li>
+   Library functions described in "<a href="#functions">Time and date
+   functions</a>" above.
+  </li>
+  <li>
+   The programs <code>tzselect</code>, <code>zdump</code>,
+   and <code>zic</code>, documented in their man pages.
+  </li>
+  <li>
+   The format of <code>zic</code> input files, documented in
+   the <code>zic</code> man page.
+  </li>
+  <li>
+   The format of <code>zic</code> output files, documented in
+   the <code>tzfile</code> man page.
+  </li>
+  <li>
+   The format of zone table files, documented in <code>zone1970.tab</code>.
+  </li>
+  <li>
+   The format of the country code file, documented in <code>iso3166.tab</code>.
+  </li>
+  <li>
+   The version number of the code and data, as the first line of
+   the text file '<code>version</code>' in each release.
+  </li>
+</ul>
+<p>
+Interface changes in a release attempt to preserve compatibility with
+recent releases.  For example, tz data files typically do not rely on
+recently-added <code>zic</code> features, so that users can run
+older <code>zic</code> versions to process newer data
+files.  <a href="tz-link.htm">Sources for time zone and daylight
+saving time data</a> describes how
+releases are tagged and distributed.
+</p>
+
+<p>
+Interfaces not listed above are less stable.  For example, users
+should not rely on particular UT offsets or abbreviations for
+timestamps, as data entries are often based on guesswork and these
+guesses may be corrected or improved.
+</p>
+  </section>
+
+
+  <section>
+    <h2 id="calendar">Calendrical issues</h2>
+<p>
+Calendrical issues are a bit out of scope for a time zone database,
+but they indicate the sort of problems that we would run into if we
+extended the time zone database further into the past.  An excellent
+resource in this area is Nachum Dershowitz and Edward M. Reingold,
+<cite><a href="https://www.cs.tau.ac.il/~nachum/calendar-book/third-edition/">Calendrical
+Calculations: Third Edition</a></cite>, Cambridge University Press (2008).
+Other information and sources are given in the file '<samp>calendars</samp>'
+in the tz distribution.  They sometimes disagree.
+</p>
+  </section>
+
+
+  <section>
+    <h2 id="planets">Time and time zones on other planets</h2>
+<p>
+Some people's work schedules use Mars time.  Jet Propulsion Laboratory
+(JPL) coordinators have kept Mars time on and off at least since 1997
+for the Mars Pathfinder mission.  Some of their family members have
+also adapted to Mars time.  Dozens of special Mars watches were built
+for JPL workers who kept Mars time during the Mars Exploration
+Rovers mission (2004).  These timepieces look like normal Seikos and
+Citizens but use Mars seconds rather than terrestrial seconds.
+</p>
+
+<p>
+A Mars solar day is called a "sol" and has a mean period equal to
+about 24 hours 39 minutes 35.244 seconds in terrestrial time.  It is
+divided into a conventional 24-hour clock, so each Mars second equals
+about 1.02749125 terrestrial seconds.
+</p>
+
+<p>
+The prime meridian of Mars goes through the center of the crater
+Airy-0, named in honor of the British astronomer who built the
+Greenwich telescope that defines Earth's prime meridian.  Mean solar
+time on the Mars prime meridian is called Mars Coordinated Time (MTC).
+</p>
+
+<p>
+Each landed mission on Mars has adopted a different reference for
+solar time keeping, so there is no real standard for Mars time zones.
+For example, the Mars Exploration Rover project (2004) defined two
+time zones "Local Solar Time A" and "Local Solar Time B" for its two
+missions, each zone designed so that its time equals local true solar
+time at approximately the middle of the nominal mission.  Such a "time
+zone" is not particularly suited for any application other than the
+mission itself.
+</p>
+
+<p>
+Many calendars have been proposed for Mars, but none have achieved
+wide acceptance.  Astronomers often use Mars Sol Date (MSD) which is a
+sequential count of Mars solar days elapsed since about 1873-12-29
+12:00 GMT.
+</p>
+
+<p>
+In our solar system, Mars is the planet with time and calendar most
+like Earth's.  On other planets, Sun-based time and calendars would
+work quite differently.  For example, although Mercury's sidereal
+rotation period is 58.646 Earth days, Mercury revolves around the Sun
+so rapidly that an observer on Mercury's equator would see a sunrise
+only every 175.97 Earth days, i.e., a Mercury year is 0.5 of a Mercury
+day.  Venus is more complicated, partly because its rotation is
+slightly retrograde: its year is 1.92 of its days.  Gas giants like
+Jupiter are trickier still, as their polar and equatorial regions
+rotate at different rates, so that the length of a day depends on
+latitude.  This effect is most pronounced on Neptune, where the day is
+about 12 hours at the poles and 18 hours at the equator.
+</p>
+
+<p>
+Although the tz database does not support time on other planets, it is
+documented here in the hopes that support will be added eventually.
+</p>
+
+<p>
+Sources:
+</p>
+<ul>
+  <li>
+Michael Allison and Robert Schmunk,
+"<a href="https://www.giss.nasa.gov/tools/mars24/help/notes.html">Technical
+Notes on Mars Solar Time as Adopted by the Mars24 Sunclock</a>"
+(2012-08-08).
+  </li>
+  <li>
+Jia-Rui Chong,
+"<a href="http://articles.latimes.com/2004/jan/14/science/sci-marstime14">Workdays
+Fit for a Martian</a>", Los Angeles Times
+(2004-01-14), pp A1, A20-A21.
+  </li>
+  <li>
+Tom Chmielewski,
+"<a href="https://www.theatlantic.com/technology/archive/2015/02/jet-lag-is-worse-on-mars/386033/">Jet
+Lag Is Worse on Mars</a>", The Atlantic (2015-02-26)
+  </li>
+  <li>
+Matt Williams,
+"<a href="https://www.universetoday.com/37481/days-of-the-planets/">How
+long is a day on the other planets of the solar system?</a>"
+(2017-04-27).
+  </li>
+</ul>
+  </section>
+
+  <footer>
+    <hr>
+This file is in the public domain, so clarified as of 2009-05-17 by
+Arthur David Olson.
+  </footer>
+</body>
+</html>
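
The M<m>.<n>.<d> date form described under "Time and date functions"
in theory.html above can be illustrated with a small sketch.  The
Python below is not part of the tz code; the function name
posix_m_date is made up for this note, and it resolves only the date
part of a rule (the optional /time suffix is ignored).

```python
import calendar
from datetime import date


def posix_m_date(year, m, n, d):
    """Resolve a POSIX TZ 'Mm.n.d' rule to a calendar date in `year`.

    m is the month (1..12), n the week (1..5, where 5 means the last
    week in which weekday d appears), and d the weekday (0 = Sunday
    through 6 = Saturday).  Hypothetical helper, for illustration only.
    """
    if not (1 <= m <= 12 and 1 <= n <= 5 and 0 <= d <= 6):
        raise ValueError("Mm.n.d field out of range")
    # Python numbers weekdays Monday=0..Sunday=6; POSIX uses Sunday=0.
    first_weekday = (date(year, m, 1).weekday() + 1) % 7
    # Day of month of the first occurrence of weekday d.
    day = 1 + (d - first_weekday) % 7
    # Advance to the nth occurrence.
    day += 7 * (n - 1)
    # Week 5 may overshoot the month; step back to the last occurrence.
    days_in_month = calendar.monthrange(year, m)[1]
    while day > days_in_month:
        day -= 7
    return date(year, m, day)
```

For the New Zealand string quoted in theory.html, M9.5.0 evaluated
for 2017 is 2017-09-24 and M4.1.0 evaluated for 2018 is 2018-04-01.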
diff --git a/tz-link.htm b/tz-link.htm
index 09edf0b..03259d2 100644
--- a/tz-link.htm
+++ b/tz-link.htm
@@ -214,7 +214,9 @@ Studio Code</a>.
 For further information about updates, please see
 <a href="https://tools.ietf.org/html/rfc6557">Procedures for
 Maintaining the Time Zone Database</a> (Internet <abbr
-title="Request For Comments">RFC</abbr> 6557).</p>
+title="Request For Comments">RFC</abbr> 6557). More detail can be
+found in <a href="theory.html">Theory and pragmatics of the tz code and data</a>.
+</p>
 <h2 id="commentary">Commentary on the <code><abbr>tz</abbr></code> database</h2>
 <ul>
 <li>The article
@@ -915,6 +917,7 @@ is called "<abbr>GMT</abbr>".</li>
 </ul>
 <h2 id="see-also">See also</h2>
 <ul>
+<li><a href="theory.html">Theory and pragmatics of the tz code and data</a></li>
 <li><a href="tz-art.htm">Time and the Arts</a></li>
 </ul>
 <hr>
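
One point in the theory.html text above that trips people up is that
a POSIX TZ offset counts hours west of UT, so 'NZST-12' means 12
hours ahead of UTC.  The following Python sketch shows the sign flip.
It is only an illustration: the helper name is hypothetical, and it
assumes two-digit minutes and seconds fields.

```python
import re

# [+-]hh[:mm[:ss]], where hh may be a single digit.
_OFFSET_RE = re.compile(r"([+-]?)(\d{1,2})(?::(\d{2})(?::(\d{2}))?)?")


def posix_offset_to_seconds_east(offset):
    """Convert a POSIX TZ offset string to seconds east of UT.

    Hypothetical helper for illustration; not part of the tz code.
    """
    match = _OFFSET_RE.fullmatch(offset)
    if match is None:
        raise ValueError(f"not a POSIX TZ offset: {offset!r}")
    sign, hh, mm, ss = match.groups()
    hours, minutes, seconds = int(hh), int(mm or 0), int(ss or 0)
    if hours > 24 or minutes > 59 or seconds > 59:
        raise ValueError(f"offset field out of range: {offset!r}")
    seconds_west = hours * 3600 + minutes * 60 + seconds
    if sign == "-":
        seconds_west = -seconds_west
    # POSIX offsets are positive west of UT; negate to get seconds east.
    return -seconds_west
```

With this convention the '-12' in 'NZST-12NZDT' maps to +43200
seconds, and the '5' in 'EST5EDT' maps to -18000 seconds.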
diff --git a/zone1970.tab b/zone1970.tab
index 2bcdc64..8b828e6 100644
--- a/zone1970.tab
+++ b/zone1970.tab
@@ -2,7 +2,7 @@
 #
 # This file is in the public domain.
 #
-# From Paul Eggert (2014-07-31):
+# From Paul Eggert (2017-10-01):
 # This file contains a table where each row stands for a zone where
 # civil time stamps have agreed since 1970.  Columns are separated by
 # a single tab.  Lines beginning with '#' are comments.  All text uses
@@ -15,7 +15,7 @@
 #     either +-DDMM+-DDDMM or +-DDMMSS+-DDDMMSS,
 #     first latitude (+ is north), then longitude (+ is east).
 # 3.  Zone name used in value of TZ environment variable.
-#     Please see the 'Theory' file for how zone names are chosen.
+#     Please see the theory.html file for how zone names are chosen.
 #     If multiple zones overlap a country, each has a row in the
 #     table, with each column 1 containing the country code.
 # 4.  Comments; present if and only if a country has multiple zones.
-- 
2.7.4
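
The 32-bit time_t limit quoted in theory.html (2038-01-19 03:14:07
UTC) can be checked with simple arithmetic.  This Python sketch is
only a sanity check under the assumption of a signed 32-bit count of
seconds since 1970-01-01 00:00:00 UTC with no leap seconds; the
function name is made up for this note.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def time_t_to_utc(seconds):
    """Map a POSIX-style second count to a UTC datetime.

    Adds a timedelta to the epoch rather than calling
    datetime.fromtimestamp, so negative counts work on all platforms.
    """
    return EPOCH + timedelta(seconds=seconds)


# Last and first instants representable in a signed 32-bit time_t.
last_32bit = time_t_to_utc(INT32_MAX)
first_32bit = time_t_to_utc(INT32_MIN)
```

Here last_32bit is 2038-01-19 03:14:07 UTC, matching the text, and
first_32bit is 1901-12-13 20:45:52 UTC.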



