Inconsistent format of data files?

Colin Bowern Colin.Bowern at officialcommunity.com
Fri Jul 21 13:08:08 UTC 2006


Hi Srdjan,

Thanks, I'll take a look at that.  I've got a suggestion from Arthur I'm
going to try this weekend.  I've almost got it working as far as reading
it in and converting it out the other end to XML.  I'll be posting my
source on CodePlex.com as soon as the project is created.

Cheers,
Colin

-----Original Message-----
From: Srdjan Krajnalic [mailto:ludiskr at yahoo.com] 
Sent: Friday, July 21, 2006 3:49 AM
To: tz at lecserver.nci.nih.gov
Subject: RE: Inconsistent format of data files?

Hi Colin,

The intention was to make zone files human-readable. Presumably there
are users out there who consider it a good read ;) 

I'll write a PHP script to convert zone files into a more
machine-readable format and put it at http://php-tz.110mb.com/.  I just
created the account and will probably have something online by the end
of the week.

One thing you're right about, though: using dashes in date/time stamps
instead of spaces would not decrease file readability but would help
some of us considerably.

Srdjan

 

-----Original Message-----
From: Colin Bowern [mailto:Colin.Bowern at officialcommunity.com]
Sent: Thursday, July 20, 2006 10:29 PM
To: Olson, Arthur David (NIH/NCI) [E]; tz at lecserver.nci.nih.gov
Subject: RE: Inconsistent format of data files?

Hi Arthur,

I've trimmed the leading and trailing spaces from the input in the
latest iteration.  The problem I'm having comes into play when fields
are separated by white space, yet a single field contains several pieces
of data that are themselves separated by spaces.  The zic manual page
says:

"White space characters and sharp characters may be enclosed in double
quotes (") if they're to be used as part of a field."

Given that statement, the problem arises when I see a Zone record such
as:

Zone	Antarctica/Vostok	0	-	zzz	1957 Dec 16

In this example the final Until field contains several spaces but is not
enclosed in double quotes.  If we say that all fields are tab separated
then it's easy to interpret the above line, but if any white space can
separate fields then I would think the Until field should be wrapped in
double quotes.
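
For illustration, here is a minimal Python sketch of one way to read a
line like the one above without requiring quotes.  It assumes, following
the zic manual page's description of Zone lines, that the first five
fields are fixed (the keyword, name, GMT offset, rules, and format) and
that everything after them forms the Until value as up to four separate
fields (year, month, day, time).  The function name parse_zone_line and
the dictionary keys are hypothetical, and the sketch ignores quoting and
comments.

```python
def parse_zone_line(line):
    """Split a Zone line into its fixed fields plus the Until parts."""
    # str.split() with no argument splits on any run of spaces or tabs
    # and drops leading and trailing white space.
    fields = line.split()
    if len(fields) < 5 or fields[0] != "Zone":
        raise ValueError("not a Zone line: %r" % line)
    name, gmtoff, rules, fmt = fields[1:5]
    # Assumption: whatever follows the format field is the Until value,
    # given as separate year, month, day, and time fields.
    until = fields[5:]
    return {"name": name, "gmtoff": gmtoff, "rules": rules,
            "format": fmt, "until": until}


record = parse_zone_line("Zone\tAntarctica/Vostok\t0\t-\tzzz\t1957 Dec 16")
print(record["until"])  # ['1957', 'Dec', '16']
```

Read this way, the Until value is a run of ordinary fields at the end of
the line rather than one field with embedded spaces, so the line parses
the same whether its fields are separated by tabs or by spaces.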

Thoughts?

Thanks,
Colin

-----Original Message-----
From: Olson, Arthur David (NIH/NCI) [E]
[mailto:olsona at dc37a.nci.nih.gov]
Sent: Thursday, July 20, 2006 4:20 PM
To: Colin Bowern; tz at elsie.nci.nih.gov
Subject: RE: Inconsistent format of data files?

Like the good book (the zic manual page) says...
     Input lines are made up of fields.  Fields are separated
     from one another by any number of white space characters.
     Leading and trailing white space on input lines is ignored.
So, by definition, there can't be "extra" tabs at the beginnings of
lines.

While we could make the stuff in the time zone package more consistent,
there are presumably files out in the wild created by other folks that
wouldn't match whatever consistent pattern we settled on. The safest
course for developers is to parse liberally in accordance with the
manual page.
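
As a rough sketch of what parsing liberally might look like, the Python
function below splits a line into fields on any run of white space,
ignores leading and trailing white space, and keeps white space and
sharp characters that appear inside double quotes.  The name
split_fields is hypothetical.  Treating an unquoted sharp character as
the start of a comment is an assumption taken from the same manual page,
and an unterminated quote is simply read to the end of the line.

```python
def split_fields(line):
    """Split one input line into fields on runs of white space."""
    fields = []
    current = []
    in_field = False   # True once the current field has started
    in_quotes = False  # True while inside a pair of double quotes
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
            in_field = True          # a quoted empty string is still a field
        elif in_quotes:
            current.append(ch)       # keep spaces and '#' inside quotes
        elif ch == "#":
            break                    # assumption: unquoted '#' starts a comment
        elif ch.isspace():
            if in_field:             # white space ends the current field
                fields.append("".join(current))
                current, in_field = [], False
        else:
            current.append(ch)
            in_field = True
    if in_field:
        fields.append("".join(current))
    return fields


print(split_fields("\t\tZone\tAntarctica/Vostok\t0\t-\tzzz\t1957 Dec 16"))
# ['Zone', 'Antarctica/Vostok', '0', '-', 'zzz', '1957', 'Dec', '16']
```

With this approach a line that begins with one tab, two tabs, or none
produces the same list of fields.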

				--ado

________________________________

From: Colin Bowern [mailto:Colin.Bowern at officialcommunity.com]
Sent: Wednesday, July 19, 2006 1:37 PM
To: tz at lecserver.nci.nih.gov
Subject: Inconsistent format of data files?



Hi,

 

I'm working on compiling the time zone data into an XML file for easier
handling in a program.  I noticed some inconsistencies in the format of
the files versus the description in zic.8.txt.  For example, in the
latest version of the Africa file there is an extra tab at the beginning
of line 93.  Can anyone confirm this?  I've attached screenshots of
what I am seeing.
 



