[tz] User time zones

Zefram zefram at fysh.org
Thu Dec 8 11:39:47 UTC 2011


Steven Abner wrote:
>I am not sure of what function Perl would use or your application would use
>if you wished to translate a server in Japan sending a date string of
>"Maintenance will begin at 12:00 JST"
>and desired to display it in a local time string.

That sort of thing is properly tackled in the opposite direction.  When
maintenance will start is a point in time, independent of timezone, and
so should be represented in UT or equivalent.  Then it can be trivially
converted to each user's preferred timezone for display.  For example,

	$ maintdate='2011-12-09 00:00 UT'
	$ for tz in America/New_York America/Sao_Paulo Asia/Tokyo; do
	    TZ=$tz date -d "$maintdate" +'Maintenance will begin at %H:%M %Z on %A.'
	  done
	Maintenance will begin at 19:00 EST on Thursday.
	Maintenance will begin at 22:00 BRST on Thursday.
	Maintenance will begin at 09:00 JST on Friday.

If you really need to go the other way... well, that's pretty bizarre.
You're scraping a human-oriented announcement "Maintenance will begin at
12:00 JST" from someone's server, the format is rigid enough for you to
reliably pick out "12:00 JST" as a time indicator, but the application
isn't specific enough for you to statically configure that this server
uses Tokyo time?  I don't believe it as stated.  There are more plausible
scenarios, though, where you could get "12:00 JST" as a time indicator
and have to automatically interpret it.

In that case, firstly, you don't want to go to the Asia/Tokyo timezone
per se.  The job isn't to guess which geographical zone applies,
it's to guess what "JST" means.  That's not so much of an issue with
"JST", but consider "MST": you don't really care whether it's coming from
America/Denver or America/Phoenix; it could be either and means the same
either way.  If you get "MST" for a date in July then it's probably
America/Phoenix, and you'd go quite wrong if you were to interpret
it as America/Denver and use the offset that Denver uses in July.
(Denver uses the abbreviation "MDT" in July.)

Secondly, you need to accept that you're guessing.  There are ambiguities,
and if you can't have any user input to disambiguate then you're going
to go wrong sometimes.

>Would you scan all the files?
>or search the internet of how to interpret JST?

Once you've got the above issues clear, actually performing the guess
is fairly easy.  As you point out, it's a bit much to search through
every zone file each time, at least if you're doing this regularly, so
it's sensible to build an abbreviation-to-whatever index.  Generating the
index is trivial, by a single pass through every zone file.  Depending on
the application, you might want to limit the indexing to abbreviations
used in the last N years (so no "LMT" outlier).
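
A rough sketch of such an index, in Python rather than anything used
elsewhere in this message.  It assumes Python 3.9 or later with the
standard zoneinfo module, and it approximates "a single pass through
every zone file" by sampling each zone weekly over a range of years
instead of reading the transitions directly, so a change that lasts less
than a week could be missed.  The names observations and build_index are
made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def observations(zone_name, first_year, last_year, step_days=7):
    """Yield (abbreviation, UT offset in seconds) pairs seen in one zone.

    Sampling every step_days is an approximation chosen for brevity.
    """
    tz = ZoneInfo(zone_name)
    moment = datetime(first_year, 1, 1, tzinfo=timezone.utc)
    end = datetime(last_year + 1, 1, 1, tzinfo=timezone.utc)
    while moment < end:
        local = moment.astimezone(tz)
        yield local.tzname(), int(local.utcoffset().total_seconds())
        moment += timedelta(days=step_days)


def build_index(zone_names, first_year, last_year, observe=observations):
    """Return {abbreviation: {offset_seconds: set of zone names}}."""
    index = defaultdict(lambda: defaultdict(set))
    for name in zone_names:
        for abbr, offset in observe(name, first_year, last_year):
            index[abbr][offset].add(name)
    return index
```

With a tz database installed, something like
build_index(zoneinfo.available_timezones(), 2001, 2011) would index the
abbreviations seen in those years.  The keys of index["MST"] are then the
candidate offsets, and the values are the zones behind each offset.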

Looking up an abbreviation in the index will give you a small list of
candidate offsets.  If you want to be clever, you could try narrowing
down the list further by checking the slightly larger list of candidate
geographical timezones, to see whether the abbreviation is meant to be
in use at the time of year that the time expression appears to depict.
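
The narrowing step might look roughly like this in Python (3.9 or later,
standard zoneinfo module assumed).  The function names are hypothetical,
and the time is taken as a wall-clock time with no zone attached, since
the zone is exactly what is being guessed.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def abbreviation_in_zone(zone_name, local_time):
    """Abbreviation that the zone uses at the given wall-clock time."""
    return local_time.replace(tzinfo=ZoneInfo(zone_name)).tzname()


def narrow(abbr, local_time, candidate_zones, lookup=abbreviation_in_zone):
    """Keep only the candidate zones that use abbr at local_time."""
    return sorted(z for z in candidate_zones if lookup(z, local_time) == abbr)
```

For "MST" with a date in July and the candidates America/Denver and
America/Phoenix, this should leave only America/Phoenix, because Denver
is on "MDT" at that time of year.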

>How would one even start querying a user.

If you've got a user to query, you can do this properly.  You usually want
to know a user's timezone for output purposes.  Usually a geographical
civil timezone, and you *do* want to distinguish America/Denver from
America/Phoenix.  The zone.tab file provides a convenient structure
for this, based on contemporary political geography.  You end up with
a dialogue going something like "what continent are you in?" "Asia"
"which of these countries?" "Japan" "right, you'll be wanting Asia/Tokyo".
Countries with multiple timezones get an additional question, "is that
east or west Uzbekistan?".  tzselect.ksh in the tz distribution implements
this system in a simple way.  Some refinements are possible, such as
displaying the current time (and abbreviation) in each candidate zone.
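
A minimal Python sketch of the zone.tab part of that dialogue, not a
reimplementation of tzselect.  It relies on the zone.tab layout of
tab-separated country code, coordinates, zone name and optional comment,
with "#" starting a comment line.  The function names, and the idea of
passing in an ask callback to put the extra question to the user, are
assumptions made for illustration.

```python
from collections import defaultdict


def parse_zone_tab(lines):
    """Map ISO 3166 country code -> list of (zone name, comment)."""
    by_country = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        code, _coordinates, zone = fields[:3]
        comment = fields[3] if len(fields) > 3 else ""
        by_country[code].append((zone, comment))
    return by_country


def pick_zone(by_country, code, ask):
    """Return the zone for a country, calling ask() only if there are several.

    ask receives the list of (zone name, comment) pairs and returns a zone name.
    """
    zones = by_country.get(code)
    if not zones:
        raise KeyError(code)
    if len(zones) == 1:
        return zones[0][0]
    return ask(zones)
```

On many systems the file can be read with
open("/usr/share/zoneinfo/zone.tab"), though that path is an assumption
about the installation.  The country names for the first question come
from iso3166.tab, which this sketch doesn't read.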

If, bizarrely, you've got "JST" from a user, and then want to ask
to disambiguate (or confirm) it, that'll be a much shorter line of
questioning.  Use the index discussed above to get a list of candidate
geographical zones, and then show them to the user (possibly using the
descriptions from zone.tab, though they don't cover all zones in the
database, because that isn't what they're for), and ask the user to
pick one.
Here's a very crude version of this type of search:

	$ for tz in $(comm -23 \
	      =(grep -wl JST /usr/share/zoneinfo/posix/**/*(.) | \
	        sed 's,.*posix/,,' | sort) \
	      =(grep '^Link' ~/tmp/tz/* | awk '{print $3}' | sort)); do
	    TZ=$tz date +'%a %H:%M %Z%t'$tz
	  done
	Thu 20:31 TLT   Asia/Dili
	Thu 19:31 HKT   Asia/Hong_Kong
	Thu 18:31 WIT   Asia/Jakarta
	Thu 19:31 MYT   Asia/Kuala_Lumpur
	Thu 19:31 MYT   Asia/Kuching
	Thu 19:31 CIT   Asia/Makassar
	Thu 19:31 PHT   Asia/Manila
	Thu 18:31 WIT   Asia/Pontianak
	Thu 18:01 MMT   Asia/Rangoon
	Thu 22:31 SAKT  Asia/Sakhalin
	Thu 19:31 SGT   Asia/Singapore
	Thu 20:31 JST   Asia/Tokyo
	Thu 23:31 NRT   Pacific/Nauru

Obviously, grep is just turning up every zone that has ever used "JST"
at all, possibly with some false positives.  You can do better by
properly parsing the tzfiles.  You can limit to zones that have used
the abbreviation recently, and so on.
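
As an illustration of what parsing a tzfile involves, here is a Python
sketch that reads the local time type table of a TZif file (the binary
format described in RFC 8536) and returns every abbreviation in it
together with its offset.  The function names are invented for this
example.  It doesn't look at the transition times, or at the footer TZ
string in version 2+ files, both of which would be needed to limit the
result to recent usage.

```python
import struct

# Header: magic, version, 15 reserved bytes, then six 32-bit counts.
_HEADER = struct.Struct(">4sc15x6l")


def _parse_block(data, pos, time_size):
    """Parse one header + data block; return (set of (abbr, utoff), end)."""
    (magic, _version, isutcnt, isstdcnt, leapcnt,
     timecnt, typecnt, charcnt) = _HEADER.unpack_from(data, pos)
    if magic != b"TZif":
        raise ValueError("not a TZif file")
    pos += _HEADER.size
    # Skip the transition times and their one-byte type indices.
    types_at = pos + timecnt * time_size + timecnt
    chars_at = types_at + 6 * typecnt
    chars = data[chars_at:chars_at + charcnt]
    found = set()
    for i in range(typecnt):
        utoff, _isdst, abbrind = struct.unpack_from(">lBB", data, types_at + 6 * i)
        # Abbreviations are NUL-terminated strings in the designation table.
        abbr = chars[abbrind:chars.index(b"\0", abbrind)].decode("ascii")
        found.add((abbr, utoff))
    end = chars_at + charcnt + leapcnt * (time_size + 4) + isstdcnt + isutcnt
    return found, end


def abbreviations_in_tzfile(data):
    """Return {(abbreviation, UT offset in seconds)} for one tzfile's bytes."""
    found, end = _parse_block(data, 0, 4)
    if data[4:5] >= b"2":
        # Version 2+ files repeat the data with 64-bit times; prefer that block.
        found, _ = _parse_block(data, end, 8)
    return found
```

Unlike grep, this only matches real abbreviation entries, and it gives
you the offset that goes with each one.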

>What if your application shouldn't interact to convert to
>local time display?

It'll still need some kind of input to determine which timezone to use.
A zone abbreviation is a rather unlikely form of such input.  If you
can't get an explicit zone name input then you'll have to guess to some
extent, of course.  GeoIP will give you a decent guess.

-zefram


