[tz] A Partly-Baked Idea

Guy Harris gharris at sonic.net
Fri Mar 8 18:10:18 UTC 2024


On Mar 8, 2024, at 7:52 AM, Bill Seymour via tz <tz at iana.org> wrote:

> Could there be a "Version 4" of the compiled TZif files in which the ints and time_ts have the correct endianness for the platform on which they're installed?

In the very early days of the project, the files were in the byte order of the host on which zic ran.

I was in the OS group at Sun at that time; when I discovered the project, I decided to use it in SunOS 4.0, which was then under development.

SunOS 4.0 removed Sun's old ND (network disk) protocol, which was used for the root file system for diskless workstations, replacing it with NFS.

At the time, Sun also decided to reorganize the directory layout of the system; many of the conventions used in most UN*Xes today, such as:

	/sbin and /usr/sbin;

	/usr/share;

	/var;

were introduced in that reorganization.

/usr/share was introduced for files that were platform-independent, so that diskless workstations with different instruction sets could all use the same versions of those files on a file server.

I decided to store the tzdb files under /usr/share.  The machines Sun sold *at that time* were all big-endian, so byte order would not be a problem for them.

*However*, Sun was also developing their 80386-based Sun386i line of workstations; x86 processors are little-endian, so that would *make* byte order a problem.

I decided to change the file format to store multi-byte integral values in network byte order, i.e. big-endian format, and changed the code to support that.  I submitted that patch to Arthur, and it was accepted.

I.e., at the time, there were cases where the platform on which the files were installed, in the sense of "the machine to which the disks on which the files are stored are attached", was not the platform running the code that read the files, and the reading machine might not have had the same byte order as the machine on which the files were stored.

See

	https://mm.icann.org/pipermail/tz/1986-November/000422.html

for the message announcing that:

	The important differences:

	*	There's a new format for the binary versions of time zone information
		files, designed to allow the files to be used by both big-endian and
		little-endian machines in shared file environments.

		...
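
To illustrate the network-byte-order approach described above, here is a minimal sketch of writing and reading a 32-bit value as four big-endian bytes, so that the result is the same on big-endian and little-endian hosts.  The names encode_be32 and decode_be32 are hypothetical, chosen for this example; this is not the actual tz code, just the same general idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value as 4 bytes, most
 * significant byte first (network byte order). */
void encode_be32(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
}

/* Hypothetical helper: read 4 big-endian bytes as a signed 32-bit
 * value.  Only shifts and ORs on individual bytes are used, so the
 * host's own byte order never enters into it. */
int32_t decode_be32(const unsigned char in[4])
{
    assert(in != NULL);
    uint32_t u = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
               | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];

    /* Convert to signed without relying on implementation-defined
     * behavior for out-of-range unsigned-to-signed conversion. */
    if (u <= INT32_MAX)
        return (int32_t)u;
    return (int32_t)(u - 0x80000000u) + INT32_MIN;
}
```

A file written with encode_be32 on a big-endian server can be read with decode_be32 on a little-endian client over NFS, which is the shared-file case the format change was meant to handle.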

Diskless workstation support using NFS isn't much of a thing these days, so that rationale might be less important.  Going with native-byte-order tzdb files *would* mean that they should be moved out of /usr/share on systems that store them there, but my UN*X box, from some obscure UN*X-box company in Cupertino, stores them under /var/db/timezone/zoneinfo/ anyway.

> This wouldn't let users rip out all their endianness code since not all TZif files would be version 4; but it might reduce running time in programs that read lots of TZif files.
> 
> Does this make sense, or is it just premature optimization?

I'd call it premature to the extent that we don't know whether it'd significantly reduce running time in programs of that sort.

Somebody should probably do some tests with both host-native and big-endian files to see what performance difference it makes.
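
A rough sketch of what the decoding half of such a test might look like follows.  The function names (sum_big_endian, sum_native, time_decoder) are made up for this example, and it only times the integer decoding in memory; it assumes nothing about how much of a real reader's time goes to opening, reading, and parsing the files, which a serious test would also have to measure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Sum n 32-bit values stored big-endian in buf, decoding byte by byte. */
uint64_t sum_big_endian(const unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        const unsigned char *p = buf + 4 * i;
        uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
                   | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
        total += v;
    }
    return total;
}

/* Sum n 32-bit values stored in the host's own byte order. */
uint64_t sum_native(const unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t v;
        memcpy(&v, buf + 4 * i, sizeof v);  /* no alignment assumption */
        total += v;
    }
    return total;
}

/* Return the CPU seconds taken by `reps` passes of `decode` over buf.
 * The sums are accumulated into *sink so the work has a visible result
 * and is less likely to be optimized away. */
double time_decoder(uint64_t (*decode)(const unsigned char *, size_t),
                    const unsigned char *buf, size_t n, int reps,
                    uint64_t *sink)
{
    assert(decode != NULL && sink != NULL);
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        *sink += decode(buf, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_decoder once with sum_big_endian and once with sum_native, over the same large buffer and the same number of repetitions, gives two numbers to compare.  A modern compiler may well turn the byte-by-byte decode into a single load plus a byte swap, so the difference could be small.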

