[tz] tabs vs spaces
guy at alum.mit.edu
Thu May 2 23:30:20 UTC 2013
On May 2, 2013, at 4:06 PM, Tobias Conradi <mail.2012 at tobiasconradi.com> wrote:
> On Fri, May 3, 2013 at 12:38 AM, Guy Harris <guy at alum.mit.edu> wrote:
>> On May 2, 2013, at 2:59 PM, Tobias Conradi <mail.2012 at tobiasconradi.com> wrote:
>>> "IANA and the tz database - diverging from Theory"
>> Then perhaps it's time to retire Theory in favor of RFC 6557.
> Or retire the RFC.
Possibly. My preference is to retire Theory.
>>> Where is the "ultimate documentation of the time zone database"?
>> Perhaps nowhere. Perhaps a number of places, such as the various man pages for the technical details of the format of time zone data files and of the binary files produced by zic, and RFC 6557 for policies.
> The man pages cannot be accessed with a browser and cannot be
> html-href-linked, can they?
They currently cannot be directly accessed from
although the versions for at least some flavors of UN*X *can* be, e.g.
There is, obviously, no *technical* reason why they *couldn't* be.
I might suggest that the format of the time zone data files be published in an RFC, and that the format of the binary files produced by zic perhaps be treated as an implementation detail and left in the man page.
>>> OK, that is the current reason. But what might have been the reason
>>> when zone.tab was established with single tab?
>> Code to parse it was a bit quicker to whip up with that limitation, given that it was probably not viewed by its creator as being as "core" to the time zone database as the time zone data files?
> Might contradict the claim by random832
> 1) In most languages, splitting up by
> 'any whitespace' is the simplest thing in the world.
> 2) In C (where nothing is simple), you could
> reuse the code from zic itself.
Part of the problem with zone.tab is that "some whitespace" is allowed in the last field on the line, so, to allow arbitrary whitespace to separate columns, the parsing would have to treat whitespace as a field separator between:
	the country code column and the coordinates column;
	the coordinates column and the TZ column;
	the TZ column and the comments column;
but not treat whitespace *after* that point as a field separator - blanks, at least, in the comments column are part of the entry in that column.
I'm not sure there's a quick-and-dirty way to tell Awk, for example, to do that (and "Awk" means "any of the versions of Awk found in the UN*Xes commonly available in late 1996", as that's when tzselect was written; its use of Korn-shell-isms was itself a source of controversy, at least recently).
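
For illustration only, here is a minimal sketch of that splitting rule in Python rather than Awk (so it says nothing about what a late-1996 Awk could do). It assumes the four-column layout described above, with an optional comments column and "#" comment lines; the function name parse_zone_tab_line is hypothetical and is not part of tzselect or the tz distribution. The idea is to split on runs of whitespace at most three times, so that whatever follows the TZ column is kept as one field, blanks included.

```python
def parse_zone_tab_line(line):
    """Split one zone.tab-style line into (country, coords, tz, comment).

    Hypothetical helper: any run of blanks or tabs separates the first
    three columns, but whitespace inside the comments column is kept.
    Returns None for blank lines and '#' comment lines.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None

    # sep=None splits on runs of whitespace; maxsplit=3 stops after the
    # TZ column, so the rest of the line stays together as one field.
    parts = stripped.split(None, 3)
    if len(parts) < 3:
        raise ValueError("expected at least 3 columns: %r" % line)

    country, coords, tz = parts[:3]
    # The comments column is assumed to be optional.
    comment = parts[3] if len(parts) == 4 else ""
    return country, coords, tz, comment


if __name__ == "__main__":
    sample = "AQ  -6734+06253 \t Antarctica/Mawson   Mawson Station, Holme Bay\n"
    print(parse_zone_tab_line(sample))
```

The limit on the number of splits is what keeps the blanks in "Mawson Station, Holme Bay" from being treated as column separators.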