strtotm
Bradley White
bww+ at CS.CMU.EDU
Tue Jun 23 02:42:18 UTC 1992
Thanks to Guy Harris and Paul Eggert for their comments, and again to
Paul Eggert for an earlier, private note.
There at least seems to be a consensus that adding a date/time parser
to ado's "tz" package would be a reasonable thing, although it may be
difficult to decide exactly what it should look like. In this note I
will try to summarize what has been said so far in the hope that a
reasonable specification may begin to take shape.
The final goal would be an implementation of sufficient quality and
taste to merit inclusion in the package.
* Prior Art
Perhaps a reasonable candidate already exists, or can be grown from
existing code. These routines (in alphabetical order) have been
mentioned:
    func           who                   where
    -----------------------------------------------------
    dparsetime()   RAND/UCI              MH
    getabsdate()   Moraes                C News
    getdate()      USL                   Sys V Rel 4
    getdate()      Bellovin/Salz/Berets  B News
    parsedate()    Hamey/Accetta         Mach
    partime()      Harrenstien/Eggert    RCS
    strptime()     Harris                SunOS 4.1[.x]
The following evaluation criteria (assuming correctness) have been
mentioned:
- interface (some are more useful than others)
- date/time language
- implementation methods
- ease of internationalization
- default/optional values
- error checking
- speed
* Interface
So far, it seems that we want to return a struct tm (as opposed to
a time_t) plus an indication of how much of the date/time string was
used, leaving the following minimal interface.
    in:   struct tm *   (to fill in [if NULL malloc or static?])
          char *        (date/time string to parse)
    out:  struct tm *   (result [NULL on error?])
          char *        (unconsumed part of string)
What else is needed? Perhaps we could list full declarations for each
of the above routines.
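As a strawman only, the minimal interface above could be declared in the
style of strtol(), with the unconsumed part of the string returned through
an end pointer. The signature, the name of the third parameter, and the
choice to reject a NULL struct tm are all assumptions made for this sketch,
not anything agreed on. To keep it short, the body recognizes just the
"1992-06-22T15:26:09" form.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical interface, not an agreed design: parse a date/time at the
 * start of `str` into `*tm`.  Returns `tm` on success and NULL on error.
 * If `endp` is non-NULL, `*endp` is set to the first unconsumed character
 * (or to `str` itself on error).
 *
 * For illustration, only the form "YYYY-MM-DDTHH:MM:SS" is recognized,
 * and no range checking is done on the fields.
 */
struct tm *strtotm(const char *str, struct tm *tm, char **endp)
{
    int year, mon, mday, hour, min, sec;
    int used = 0;

    if (endp != NULL)
        *endp = (char *)str;
    if (str == NULL || tm == NULL)
        return NULL;

    /* %n records how many characters have been consumed so far. */
    if (sscanf(str, "%4d-%2d-%2dT%2d:%2d:%2d%n",
               &year, &mon, &mday, &hour, &min, &sec, &used) != 6
        || used == 0)
        return NULL;

    memset(tm, 0, sizeof *tm);
    tm->tm_year = year - 1900;   /* struct tm counts years from 1900 */
    tm->tm_mon = mon - 1;        /* and months from 0 */
    tm->tm_mday = mday;
    tm->tm_hour = hour;
    tm->tm_min = min;
    tm->tm_sec = sec;
    tm->tm_isdst = -1;           /* DST status unknown */

    if (endp != NULL)
        *endp = (char *)str + used;
    return tm;
}
```

A caller would pass its own struct tm and a char * to receive the end
pointer, then check the return value against NULL before looking at either.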
* The date/time language
I think we can all agree that we want to be able to parse strings
like ...
Mon Jun 22 15:26:09 EDT 1992
Mon, 22 Jun 92 15:26:09 -0400 (EDT)
06/22/1992 15:26:09 -0400
1992-06-22T15:26:09
... and whatever similar strings are for different locales. However,
some of the above implementations can also handle strings like ...
now
next Friday
three days ago
two years from today
new year's eve, 1999
half-past four the day after tomorrow
8pm US/Pacific on the 1st Tuesday in November, 1996
... but this may seem like kitchen-sink-ism. What language do we
want to accept?
* Implementation Methods
A particular answer to the language question (e.g., the input language
is regular) may suggest a preferred implementation method (e.g., DFAs)
and/or a useful tool (e.g., lex).
Paul points out that some kind of hack may be necessary to allow any
yacc-generated parsers to co-exist. On the other hand, lex, yacc,
and other formal methods are usually easy to extend, and provide a
good level of correctness-confidence.
Should we limit ourselves to regular, LL(1), or LALR(1) languages?
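To make the "regular language, therefore DFA" option concrete, here is a
minimal sketch of a table-driven recognizer for one small piece of such a
language: a clock time written as DD:DD or DD:DD:DD. The function name and
the state numbering are invented for this example; a tool such as lex would
generate a comparable table from a pattern.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the length of the longest prefix of `s`
 * that matches DD:DD or DD:DD:DD (D = decimal digit), or 0 if none does.
 * Only the shape is checked, not whether the numbers are in range.
 */
size_t match_clock_time(const char *s)
{
    enum { DIGIT, COLON, OTHER };
    /* next[state][character class]; -1 means "no transition". */
    static const int next[9][3] = {
        /* 0: start      */ {  1, -1, -1 },
        /* 1: H          */ {  2, -1, -1 },
        /* 2: HH         */ { -1,  3, -1 },
        /* 3: HH:        */ {  4, -1, -1 },
        /* 4: HH:M       */ {  5, -1, -1 },
        /* 5: HH:MM      */ { -1,  6, -1 },   /* accepting */
        /* 6: HH:MM:     */ {  7, -1, -1 },
        /* 7: HH:MM:S    */ {  8, -1, -1 },
        /* 8: HH:MM:SS   */ { -1, -1, -1 },   /* accepting */
    };
    int state = 0;
    size_t i = 0, accepted = 0;

    for (;;) {
        unsigned char c = (unsigned char)s[i];
        int cls = isdigit(c) ? DIGIT : (c == ':' ? COLON : OTHER);
        int to = next[state][cls];

        if (to < 0)              /* also stops at the terminating NUL */
            break;
        state = to;
        i++;
        if (state == 5 || state == 8)
            accepted = i;        /* remember the longest match so far */
    }
    return accepted;
}
```

Given "15:26:09 EDT" this consumes the eight characters of the time and
leaves " EDT" for whatever recognizes zone names. Phrases such as "three
days ago" could be handled the same way, but nesting or long-range agreement
between fields would push the language beyond what a DFA can express.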
* Internationalization
It is clear that there needs to be support for "internationalization"
(like $LANG, $LC_TIME, ..., ???). Depending upon the implementation
method and date/time language, this may be more or less difficult.
Indeed, the presence of features in the target system, like dynamic
loading, may change the best answer. What's the favoured approach?
* Default values
How do you interpret relative times? How do you specify default values
for optional fields? How do you know where default values were used?
Default timezones probably need to be specified with something like
"localtime", "US/Eastern", or another zoneinfo name, so that standard
and daylight offsets can be given at the correct times of year. How
do you indicate this?
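One possible answer to "how do you know where default values were used" is
for the parser to report a bit mask of the fields it actually saw, and for
a separate step to fill in the rest from a caller-supplied struct tm. The
flag names and the function below are hypothetical, and the sketch does not
address default time zones at all.

```c
#include <time.h>

/* Hypothetical flags: one bit per struct tm field the parser can set. */
enum {
    TMF_YEAR = 1 << 0,
    TMF_MON  = 1 << 1,
    TMF_MDAY = 1 << 2,
    TMF_HOUR = 1 << 3,
    TMF_MIN  = 1 << 4,
    TMF_SEC  = 1 << 5
};

/*
 * Copy into *tm every field whose bit is absent from `parsed`, taking the
 * value from *defaults.  Returns the mask of fields that were defaulted,
 * so the caller can tell parsed values from filled-in ones.
 */
unsigned fill_defaults(struct tm *tm, unsigned parsed,
                       const struct tm *defaults)
{
    unsigned defaulted = 0;

    if (!(parsed & TMF_YEAR)) { tm->tm_year = defaults->tm_year; defaulted |= TMF_YEAR; }
    if (!(parsed & TMF_MON))  { tm->tm_mon  = defaults->tm_mon;  defaulted |= TMF_MON;  }
    if (!(parsed & TMF_MDAY)) { tm->tm_mday = defaults->tm_mday; defaulted |= TMF_MDAY; }
    if (!(parsed & TMF_HOUR)) { tm->tm_hour = defaults->tm_hour; defaulted |= TMF_HOUR; }
    if (!(parsed & TMF_MIN))  { tm->tm_min  = defaults->tm_min;  defaulted |= TMF_MIN;  }
    if (!(parsed & TMF_SEC))  { tm->tm_sec  = defaults->tm_sec;  defaulted |= TMF_SEC;  }
    return defaulted;
}
```

For a string like "15:26" the parser would report TMF_HOUR | TMF_MIN, and
the caller could pass today's date (or midnight, or anything else) as the
defaults. Relative times such as "next Friday" would need the same base
value, which suggests the base belongs in the interface either way.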
* Out-of-range values
Should out-of-range numeric values result in a parse error? If so, is
"out-of-range" smart (e.g., knows about month lengths, leap years, leap
seconds, correct day-of-week, ....)?
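For a sense of what a "smart" range check would involve, here is a sketch
that knows about month lengths, Gregorian leap years, and day-of-week
consistency, and that tolerates tm_sec == 60 for a leap second without
checking whether one really occurred. The function names are invented for
this example, and it assumes Gregorian years of 1 or later.

```c
#include <time.h>

/* Gregorian leap-year rule. */
static int is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Number of days in month `mon` (0..11) of `year`. */
static int days_in_month(int year, int mon)
{
    static const int days[12] = {
        31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
    };
    return days[mon] + (mon == 1 && is_leap(year));
}

/* Day of week, 0 = Sunday, for month m (1..12); Sakamoto's method. */
static int day_of_week(int y, int m, int d)
{
    static const int t[12] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };

    if (m < 3)
        y -= 1;
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

/*
 * Hypothetical check: return 1 if the fields of *tm are mutually
 * consistent, 0 otherwise.  tm_wday is compared against the computed
 * day of week only when `check_wday` is nonzero.
 */
int tm_fields_valid(const struct tm *tm, int check_wday)
{
    int year = tm->tm_year + 1900;

    if (tm->tm_mon < 0 || tm->tm_mon > 11)
        return 0;
    if (tm->tm_mday < 1 || tm->tm_mday > days_in_month(year, tm->tm_mon))
        return 0;
    if (tm->tm_hour < 0 || tm->tm_hour > 23)
        return 0;
    if (tm->tm_min < 0 || tm->tm_min > 59)
        return 0;
    if (tm->tm_sec < 0 || tm->tm_sec > 60)   /* 60 permits a leap second */
        return 0;
    if (check_wday &&
        tm->tm_wday != day_of_week(year, tm->tm_mon + 1, tm->tm_mday))
        return 0;
    return 1;
}
```

With this check, "Tue Jun 22 1992" would be rejected because that date was
a Monday, and "Feb 29" would be accepted for 1992 but not for 1993.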
Your comments are appreciated. Is this even worth pursuing?
Brad