strtotm

Bradley White bww+ at CS.CMU.EDU
Tue Jun 23 02:42:18 UTC 1992


Thanks to Guy Harris and Paul Eggert for their comments, and again to
Paul Eggert for an earlier, private note.

There at least seems to be a consensus that adding a date/time parser
to ado's "tz" package would be a reasonable thing, although it may be
difficult to decide exactly what it should look like.  In this note I
will try to summarize what has been said so far in the hope that a
reasonable specification may begin to take shape.

The final goal would be an implementation of sufficient quality and
taste to merit inclusion in the package.

* Prior Art

Perhaps a reasonable candidate already exists, or can be grown from
existing code.  These routines (in alphabetical order) have been
mentioned:

	  func		    who			where
	-----------------------------------------------------
	dparsetime()	RAND/UCI		MH
	getabsdate()	Moraes			C News
	getdate()	USL			Sys V Rel 4
	getdate()	Bellovin/Salz/Berets	B News
	parsedate()	Hamey/Accetta		Mach
	partime()	Harrenstien/Eggert	RCS
	strptime()	Harris			SunOS 4.1[.x]

The following evaluation criteria (assuming correctness) have been
mentioned:

	- interface (some are more useful than others)
	- date/time language
	- implementation methods
	- ease of internationalization
	- default/optional values
	- error checking
	- speed

* Interface

So far, it seems that we want to return a struct tm (as opposed to
a time_t) plus an indication of how much of the date/time string was
used, leaving the following minimal interface.

	in:	struct tm * (to fill in [if NULL malloc or static?])
		char * (date/time string to parse)

	out:	struct tm * (result [NULL on error?])
		char * (unconsumed part of string)

What else is needed?  Perhaps we could list full declarations for each
of the above routines.
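
As a starting point for discussion, here is a rough C sketch of one
possible reading of that minimal interface.  The name strtotm() is just
the subject line of this note.  The argument order, the static fallback
buffer used when the struct tm pointer is NULL, the strtol()-style end
pointer, and the decision to accept only the "1992-06-22T15:26:09" form
are all assumptions made for illustration, not a proposal for the final
shape.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Read exactly `width` decimal digits at *sp; advance *sp on success. */
static int fixed_digits(const char **sp, int width, int *out)
{
    const char *p = *sp;
    int value = 0;

    for (int i = 0; i < width; i++) {
        if (p[i] < '0' || p[i] > '9')
            return 0;
        value = value * 10 + (p[i] - '0');
    }
    *sp = p + width;
    *out = value;
    return 1;
}

/* Consume the single character `c` at *sp, if it is there. */
static int literal(const char **sp, char c)
{
    if (**sp != c)
        return 0;
    (*sp)++;
    return 1;
}

/*
 * Hypothetical interface: fill in *tm (or a static buffer if tm is NULL),
 * store a pointer to the unconsumed text in *endp, return NULL on error.
 * Only "YYYY-MM-DDThh:mm:ss" is recognized, and no range checks are made.
 */
struct tm *strtotm(const char *s, struct tm *tm, char **endp)
{
    static struct tm fallback;
    const char *p = s;
    int year, mon, mday, hour, min, sec;

    if (tm == NULL)
        tm = &fallback;
    if (!fixed_digits(&p, 4, &year) || !literal(&p, '-') ||
        !fixed_digits(&p, 2, &mon)  || !literal(&p, '-') ||
        !fixed_digits(&p, 2, &mday) || !literal(&p, 'T') ||
        !fixed_digits(&p, 2, &hour) || !literal(&p, ':') ||
        !fixed_digits(&p, 2, &min)  || !literal(&p, ':') ||
        !fixed_digits(&p, 2, &sec)) {
        if (endp != NULL)
            *endp = (char *)s;          /* nothing was consumed */
        return NULL;
    }
    memset(tm, 0, sizeof *tm);          /* tm_wday, tm_yday are not computed */
    tm->tm_year = year - 1900;
    tm->tm_mon = mon - 1;
    tm->tm_mday = mday;
    tm->tm_hour = hour;
    tm->tm_min = min;
    tm->tm_sec = sec;
    tm->tm_isdst = -1;                  /* unknown: the string named no zone */
    if (endp != NULL)
        *endp = (char *)p;
    return tm;
}
```

Returning the end pointer separately lets a caller go on to parse a
trailing zone name or report trailing garbage, whichever it prefers.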

* The date/time language

I think we can all agree that we want to be able to parse strings
like ...

	Mon Jun 22 15:26:09 EDT 1992
	Mon, 22 Jun 92 15:26:09 -0400 (EDT)
	06/22/1992 15:26:09 -0400
	1992-06-22T15:26:09

... and whatever the equivalent strings are in other locales.  However,
some of the above implementations can also handle strings like ...

	now
	next Friday
	three days ago
	two years from today
	new year's eve, 1999
	half-past four the day after tomorrow
	8pm US/Pacific on the 1st Tuesday in November, 1996

... but this may seem like kitchen-sink-ism.  What language do we
want to accept?

* Implementation Methods

A particular answer to the language question (e.g., the input language
is regular) may suggest a preferred implementation method (e.g., DFAs)
and/or a useful tool (e.g., lex).
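
To make the "regular language, therefore DFA" point concrete, here is a
small hand-written, table-driven recognizer for just the "hh:mm:ss" part
of the strings above.  It is only a sketch: the function name, the state
numbering, and the three character classes are made up for this example,
and a real recognizer would more likely be generated by a tool like lex.

```c
#include <stddef.h>

enum cls { DIGIT, COLON, OTHER };

/* Map an input character to its character class. */
static enum cls classify(char c)
{
    if (c >= '0' && c <= '9')
        return DIGIT;
    if (c == ':')
        return COLON;
    return OTHER;
}

/*
 * Transition table for "dd:dd:dd".  State n means n characters have been
 * matched; state 8 accepts; -1 means there is no transition (reject).
 */
static const signed char next_state[8][3] = {
    /*        DIGIT COLON OTHER */
    /* 0 */ {   1,   -1,   -1 },
    /* 1 */ {   2,   -1,   -1 },
    /* 2 */ {  -1,    3,   -1 },
    /* 3 */ {   4,   -1,   -1 },
    /* 4 */ {   5,   -1,   -1 },
    /* 5 */ {  -1,    6,   -1 },
    /* 6 */ {   7,   -1,   -1 },
    /* 7 */ {   8,   -1,   -1 },
};

/* Return a pointer just past a leading "dd:dd:dd", or NULL if none. */
const char *match_time_of_day(const char *s)
{
    int state = 0;

    while (state != 8) {
        /* A terminating NUL classifies as OTHER and so rejects. */
        state = next_state[state][classify(*s)];
        if (state < 0)
            return NULL;
        s++;
    }
    return s;
}
```

Note that the table only recognizes the shape of the field.  Whether
"99:99:99" should be accepted is the out-of-range question, not a
language question.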

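Paul points out that some kind of hack may be necessary to allow a
yacc-generated parser to co-exist with any other yacc-generated parser
in the same program (by default each one defines the same global names,
such as yyparse and yylval).  On the other hand, lex, yacc, and other
formal methods are usually easy to extend, and provide a good level of
correctness-confidence.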

Should we limit ourselves to regular, LL(1), or LALR(1) languages?

* Internationalization

It is clear that there needs to be support for "internationalization"
(like $LANG, $LC_TIME, ..., ???).  Depending upon the implementation
method and date/time language, this may be more or less difficult.
Indeed, the presence of features in the target system, like dynamic
loading, may change the best answer.  What's the favoured approach?

* Default values

How do you interpret relative times?  How do you specify default values
for optional fields?  How do you know where default values were used?
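
One possible answer to the last two questions, sketched in C: the parser
reports which fields it actually found as a bitmask, the caller supplies
a reference struct tm to take the rest from, and a second bitmask comes
back saying which fields were defaulted.  The TMF_* flags, the function
name, and the policy of copying every missing field from the reference
are all hypothetical.  Relative times and zones are not addressed here.

```c
#include <time.h>

/* Hypothetical flags, one per field a parser might find in the string. */
enum {
    TMF_YEAR = 1 << 0,
    TMF_MON  = 1 << 1,
    TMF_MDAY = 1 << 2,
    TMF_HOUR = 1 << 3,
    TMF_MIN  = 1 << 4,
    TMF_SEC  = 1 << 5
};

/*
 * Copy every field NOT named in `parsed` from *ref into *tm.
 * Returns the set of fields that were defaulted this way.
 */
unsigned fill_defaults(struct tm *tm, unsigned parsed, const struct tm *ref)
{
    unsigned defaulted = 0;

    if (!(parsed & TMF_YEAR)) { tm->tm_year = ref->tm_year; defaulted |= TMF_YEAR; }
    if (!(parsed & TMF_MON))  { tm->tm_mon  = ref->tm_mon;  defaulted |= TMF_MON;  }
    if (!(parsed & TMF_MDAY)) { tm->tm_mday = ref->tm_mday; defaulted |= TMF_MDAY; }
    if (!(parsed & TMF_HOUR)) { tm->tm_hour = ref->tm_hour; defaulted |= TMF_HOUR; }
    if (!(parsed & TMF_MIN))  { tm->tm_min  = ref->tm_min;  defaulted |= TMF_MIN;  }
    if (!(parsed & TMF_SEC))  { tm->tm_sec  = ref->tm_sec;  defaulted |= TMF_SEC;  }
    return defaulted;
}
```

A caller that wants "8pm" to mean 20:00:00 rather than 20:00 plus the
current seconds could pass a reference with the time fields zeroed, so
the policy stays outside the parser.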

Default timezones probably need to be specified with something like
"localtime", "US/Eastern", or another zoneinfo name, so that standard
and daylight offsets can be given at the correct times of year.  How
do you indicate this?

* Out-of-range values

Should out-of-range numeric values result in a parse error?  If so, is
"out-of-range" smart (e.g., knows about month lengths, leap years, leap
seconds, correct day-of-week, ....)?
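
For reference, a "smart" check along these lines is not much code.  The
sketch below assumes the proleptic Gregorian calendar and struct tm
conventions (tm_year counted from 1900, tm_mon from 0).  The function
names are invented for the example.  It allows tm_sec == 60 so that a
leap second can be represented, but it does not know whether a leap
second really occurred at that instant, and the day-of-week check uses
Sakamoto's well-known method.

```c
#include <time.h>

/* Gregorian leap-year rule. */
static int is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Days in month `mon` (0 = January) of the given calendar year. */
static int days_in_month(int year, int mon)
{
    static const int len[12] = { 31, 28, 31, 30, 31, 30,
                                 31, 31, 30, 31, 30, 31 };
    return len[mon] + (mon == 1 && is_leap(year));
}

/* Day of week (0 = Sunday) for year, month 1..12, day of month. */
int day_of_week(int year, int month, int mday)
{
    static const int t[12] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };

    if (month < 3)
        year -= 1;
    return (year + year / 4 - year / 100 + year / 400
            + t[month - 1] + mday) % 7;
}

/* Nonzero if the date and time fields of *tm are mutually consistent. */
int tm_in_range(const struct tm *tm)
{
    int year = tm->tm_year + 1900;

    if (tm->tm_mon < 0 || tm->tm_mon > 11)
        return 0;
    if (tm->tm_mday < 1 || tm->tm_mday > days_in_month(year, tm->tm_mon))
        return 0;
    if (tm->tm_hour < 0 || tm->tm_hour > 23)
        return 0;
    if (tm->tm_min < 0 || tm->tm_min > 59)
        return 0;
    if (tm->tm_sec < 0 || tm->tm_sec > 60)   /* 60 permits a leap second */
        return 0;
    return 1;
}

/* Nonzero if tm_wday agrees with the calendar date in *tm. */
int weekday_matches(const struct tm *tm)
{
    return tm->tm_wday ==
           day_of_week(tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday);
}
```

Keeping the weekday test separate from the range test lets the parser
apply it only when the string actually named a day of the week.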

Your comments are appreciated.  Is this even worth pursuing?

Brad


