Another tz compiler

Brian S O'Neill broneill at earthlink.net
Tue Mar 2 17:13:08 UTC 2004


The main tz project page shows various links to other time zone database 
formats and other tz compilers. I have been working on the Joda-Time 
project, which is designed as a replacement for Java's standard date and 
time classes. It includes a tz compiler and it has its own compact 
binary format for the resulting files. I would be pleased if the 
Joda-Time project were mentioned on the tz page as well.

Joda-Time project: http://joda-time.sourceforge.net/
Compiler (link is unstable): 
http://joda-time.sourceforge.net/api-0.95/org/joda/time/tz/ZoneInfoCompiler.html

Most users will have no need to compile the files themselves, as the 
Joda-Time distribution includes pre-compiled tz files in the jar. The 
DateTimeZone class knows how to load the files and create objects for them.

I finished the tz compiler and new binary format about a year ago. I 
would have just used Java's standard time zone class, but it was not 
fast enough. Sun's Java v1.4 has a time zone implementation that 
retrieves offsets using a binary search. Joda-Time's CachedDateTimeZone 
is faster than a binary search. Caching is handled automatically, and 
time zones with trivial rules are not wrapped with the cached 
implementation. The caching could not be piggybacked onto Java's 
standard time zone class, because that class does not provide a way to 
iterate over offset transitions.

The only documentation on the binary format and the caching is in the 
source code itself. One of the features in the binary format is that it 
stores times with variable precision and size. It can store up to 
millisecond precision, as a 64-bit signed integer. It stores 
precalculated offset transitions up to the point where a simple DST rule 
can fully describe all future transitions.
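
To give a feel for what "variable precision and size" can mean, here is a 
minimal Java sketch. It is not the Joda-Time file format; the tag values, 
the field widths and the class name VarTimeCodecSketch are assumptions 
made up for illustration. The idea shown is only that a time which falls 
on a whole minute or whole second can be written in fewer bytes than a 
full 64-bit millisecond value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical variable-precision time codec; not Joda-Time's format. */
final class VarTimeCodecSketch {
    // Assumed tags: 0 = minutes (int), 1 = seconds (int), 2 = millis (long).
    private static final int TAG_MINUTES = 0;
    private static final int TAG_SECONDS = 1;
    private static final int TAG_MILLIS = 2;

    private static boolean fitsInt(long v) {
        return v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE;
    }

    /** Writes the time using the coarsest unit that loses no precision. */
    static void write(DataOutput out, long millis) throws IOException {
        if (millis % 60000L == 0 && fitsInt(millis / 60000L)) {
            out.writeByte(TAG_MINUTES);
            out.writeInt((int) (millis / 60000L));
        } else if (millis % 1000L == 0 && fitsInt(millis / 1000L)) {
            out.writeByte(TAG_SECONDS);
            out.writeInt((int) (millis / 1000L));
        } else {
            out.writeByte(TAG_MILLIS);
            out.writeLong(millis);   // full 64-bit signed millisecond value
        }
    }

    /** Reads a time written by write() and returns it in milliseconds. */
    static long read(DataInput in) throws IOException {
        int tag = in.readUnsignedByte();
        switch (tag) {
            case TAG_MINUTES: return in.readInt() * 60000L;
            case TAG_SECONDS: return in.readInt() * 1000L;
            case TAG_MILLIS:  return in.readLong();
            default: throw new IOException("unknown tag " + tag);
        }
    }

    /** Convenience wrapper: encode one time into a byte array. */
    static byte[] encode(long millis) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            write(new DataOutputStream(bytes), millis);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Convenience wrapper: decode one time from a byte array. */
    static long decode(byte[] data) {
        try {
            return read(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] samples = {0L, 3600000L, 1001L};
        for (long sample : samples) {
            System.out.println(sample + " ms -> " + encode(sample).length
                    + " bytes, decoded " + decode(encode(sample)));
        }
    }
}
```

In this sketch a whole-minute or whole-second time takes 5 bytes and 
anything else takes 9. A real format could pack the tag into the value's 
high bits to save more space.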

Runtime caching is implemented by breaking the time line down into 
fixed-size regions of 2^32 milliseconds, or about 49.7 days. Offset 
lookup is performed by retrieving one of these regions from the cache. 
The region number is obtained by shifting out the lower 32 bits of the 
64-bit timestamp. This value, modulo 512, is used as an array index to 
retrieve the region info. A hashtable with 512 regions of 49.7 days 
provides collision-free caching within periods of 69.7 years. The region 
info object contains a linked list of offset transition instants. Since 
most regions have fewer than two transitions, the linked list search is 
quite fast.
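
The scheme above can be sketched in Java roughly as follows. This is an 
illustration under stated assumptions, not the CachedDateTimeZone source: 
the class CachedZoneSketch, the ZoneRules interface and the convention 
that nextTransition returns its argument when no later transition exists 
are all hypothetical. The sketch uses a bit mask rather than Java's % 
operator so that negative region numbers (instants before 1970) still 
produce an index from 0 to 511.

```java
/** Illustrative region cache; all names here are hypothetical. */
final class CachedZoneSketch {
    /** Assumed slow source of offsets, e.g. a rule-based zone. */
    interface ZoneRules {
        int offsetAt(long instant);
        /** Next transition strictly after instant, or instant if none. */
        long nextTransition(long instant);
    }

    /** Linked list node: from 'instant' onward, 'offset' applies. */
    private static final class Transition {
        final long instant;
        final int offset;
        Transition next;
        Transition(long instant, int offset) {
            this.instant = instant;
            this.offset = offset;
        }
    }

    /** Cached info for one region of 2^32 milliseconds. */
    private static final class Region {
        final long number;       // instant >> 32
        final int startOffset;   // offset in effect at the region start
        Transition first;        // transitions inside the region, ascending
        Region(long number, int startOffset) {
            this.number = number;
            this.startOffset = startOffset;
        }
        int offsetAt(long instant) {
            int offset = startOffset;
            for (Transition t = first; t != null && t.instant <= instant; t = t.next) {
                offset = t.offset;
            }
            return offset;
        }
    }

    private static final int SLOTS = 512;
    private final ZoneRules rules;
    private final Region[] cache = new Region[SLOTS];

    CachedZoneSketch(ZoneRules rules) {
        this.rules = rules;
    }

    /** Region number modulo 512, always in the range 0..511. */
    static int slotIndex(long instant) {
        return (int) ((instant >> 32) & (SLOTS - 1));
    }

    int offsetAt(long instant) {
        long number = instant >> 32;
        int index = slotIndex(instant);
        Region region = cache[index];
        if (region == null || region.number != number) {
            region = buildRegion(number);   // miss, or slot held another region
            cache[index] = region;
        }
        return region.offsetAt(instant);
    }

    private Region buildRegion(long number) {
        long start = number << 32;
        long end = start | 0xFFFFFFFFL;     // last millisecond of the region
        Region region = new Region(number, rules.offsetAt(start));
        Transition tail = null;
        long t = start;
        while (true) {
            long next = rules.nextTransition(t);
            if (next == t || next > end) {
                break;                      // no more transitions in region
            }
            Transition node = new Transition(next, rules.offsetAt(next));
            if (tail == null) {
                region.first = node;
            } else {
                tail.next = node;
            }
            tail = node;
            t = next;
        }
        return region;
    }

    public static void main(String[] args) {
        // Toy zone with a fixed one hour offset and no transitions.
        ZoneRules fixed = new ZoneRules() {
            public int offsetAt(long instant) { return 3600000; }
            public long nextTransition(long instant) { return instant; }
        };
        CachedZoneSketch zone = new CachedZoneSketch(fixed);
        System.out.println(zone.offsetAt(System.currentTimeMillis()));
    }
}
```

Two instants share a slot only when their region numbers differ by a 
multiple of 512, that is, when they are 2^41 milliseconds (about 69.7 
years) apart. The sketch is not thread safe; a shared cache would need 
some care around the array writes.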


