comments on draft-newman-datetime-00.txt

Paul Eggert eggert at twinsun.com
Tue Dec 31 23:18:33 UTC 1996


I liked Markus Kuhn's comments on your Internet Draft
``Date and Time on the Internet''
<URL:ftp://ds.internic.net/internet-drafts/draft-newman-datetime-00.txt>
(December 1996).  I have the following further comments and
suggestions, some of them echoing Kuhn, some providing more details.
Please let me know if you have any questions about these comments,
and I'd appreciate hearing about any updates to your draft.


Section 2

This section should mention that all dates use the Gregorian calendar.


Section 3

Section 3 suggests that 00-49 should map to 2000-2049, and that 50-99
should map to 1950-1999.  This contradicts existing practice in one
important set of implementations: the XPG4 standard says that 00-68
should map to 2000-2068, and 69-99 should map to 1969-1999.  This
accommodates nonnegative Posix timestamps which start at 1970-01-01
00:00:00 UTC (equivalent to 1969-12-31 in some time zones).
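
To make the difference between the two windows concrete, here is a
minimal sketch in C.  The function name `expand_two_digit_year' and
its `pivot' parameter are hypothetical; they come neither from the
draft nor from XPG4 and are used only for illustration.

```c
#include <assert.h>

/* Expand a two-digit year yy (0-99) to four digits.  Values below
   pivot map to 20xx; values at or above pivot map to 19xx.
   pivot = 50 gives the draft's rule, pivot = 69 gives the XPG4 rule.
   (Hypothetical helper, for illustration only.)  */
int expand_two_digit_year (int yy, int pivot)
{
  assert (0 <= yy && yy <= 99);
  return (yy < pivot ? 2000 : 1900) + yy;
}
```

With this helper, expand_two_digit_year (60, 50) yields 1960 while
expand_two_digit_year (60, 69) yields 2060, so the two rules disagree
exactly on the values 50 through 68.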

Any suggestion for mapping 2-digit to 4-digit dates is bound to be
controversial.  Perhaps it would be best to remove this recommendation
entirely.  If you must suggest a range, the XPG4 range is a bit better
than the range in the December draft, as it matches existing practice
better (and it will give you another 19 years to upgrade your software...).

The recommendation ``Three digit years MUST be interpreted by adding
1900'' seems to be designed only for interfacing to buggy software
(e.g. software that uses C's tm_year value without changing it).  It
is also incompatible with ISO 8601, which uses 3-digit strings for
day-of-year.  I suggest removing this recommendation.

Section 3 mentions ``two digit'' years and ``three digit'' years
without making it clear that we are talking about lexical syntax, not
numeric value.  This should be clarified.  For example, the string
`0096' should denote the year 96, not 1996; otherwise there would be
no way to specify the year 96.
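
The lexical reading could be sketched roughly as follows.  The
function name `year_from_field' is hypothetical, and the choice of the
XPG4 window for 2-digit fields is an assumption made only so that the
example is complete; the point is that the number of digits, not the
numeric value, selects the interpretation.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Interpret a year field by its lexical form, not its numeric value.
   A 4-digit field is taken literally, so "0096" is the year 96.
   A 2-digit field is expanded with the XPG4 window (an assumption
   of this sketch).  Return 0 on success, -1 for any other form.  */
int year_from_field (const char *field, int *year)
{
  size_t i, len;
  int value = 0;

  assert (field != NULL && year != NULL);
  len = strlen (field);
  if (len != 2 && len != 4)
    return -1;
  for (i = 0; i < len; i++)
    {
      if (!isdigit ((unsigned char) field[i]))
        return -1;
      value = value * 10 + (field[i] - '0');
    }
  if (len == 2)
    value += value < 69 ? 2000 : 1900;
  *year = value;
  return 0;
}
```

Under these assumptions the field `96' yields 1996 and the field
`0096' yields 96, even though both have the numeric value 96.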


Section 4.1

A good reference for local time zone rules is the tz database
maintained by Olson et al. (<URL:ftp://elsie.nci.nih.gov/pub/>, updated
regularly).


Section 4.2

  When the local offset is unknown, the offset "-00:00" MAY be used to
  indicate that the time is in UTC and the local offset is unknown.

This is worded a little confusingly -- could you please clarify?
Is it common to have situations where UTC is known but local time isn't?
Without more motivation, it's hard to see why this suggestion is needed.


Section 4.3

  An alternative would be to show a list of the timezone labels
  defined in [section XXX].

I'm not sure what is meant here.

Surely by ``timezone label'' you do not mean commonly used strings
like `EST', since such strings are ambiguous in practice.  E.g. `EST'
has one meaning in the US, another in Brazil, and yet another in
Australia (where the meaning of `EST' also depends on whether it is
winter or summer!).  Also, there is controversy about what the time
zone labels ought to be -- e.g. should Canadian Eastern Standard Time
be called `EST' (English) or `HNE' (French)?

If by ``timezone label'' you mean some other identifying label, then I
suggest using the Posix TZ string as extended by the Olson package
(again, see <URL:ftp://elsie.nci.nih.gov/pub/>).  This is common
practice, is widely used, and has multiple implementations;
e.g. the source code to another implementation of the client library
for the Olson TZ extensions can be found in
<URL:ftp://prep.ai.mit.edu/pub/gnu/glibc-1.09.1.tar.gz>.

For example, the Olson TZ string `America/Los_Angeles' stands for
local time in Los Angeles; the Posix TZ string `CST-8' stands for the
time zone named `CST' that is always 8 hours ahead of UTC.  The
advantage of `America/Los_Angeles' is that it can work correctly for
all times in the past (the tz database has entries back to 1883, when
Los Angeles adopted standard time), and requires no user modification
to work correctly in the future, as the tables can be updated
automatically by administrators as needed.
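
As a rough illustration of how a client might use such strings, here
is a sketch built on the Posix TZ environment variable.  The helper
name `format_in_zone' is hypothetical, and the sketch assumes a Posix
host that provides setenv, tzset and localtime_r.  An Olson name such
as `America/Los_Angeles' works only where the tz database is
installed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Format the time stamp t as local time under the TZ setting tz,
   e.g. "America/Los_Angeles" or "CST-8".  Return the number of bytes
   written (excluding the terminating null), or 0 on failure.
   Note that this changes the process-wide TZ setting.
   (Hypothetical helper, for illustration only.)  */
size_t format_in_zone (const char *tz, time_t t, char *buf, size_t size)
{
  struct tm tm;

  assert (tz != NULL && buf != NULL);
  if (setenv ("TZ", tz, 1) != 0)
    return 0;
  tzset ();
  if (localtime_r (&t, &tm) == NULL)
    return 0;
  return strftime (buf, size, "%Y-%m-%d %H:%M:%S %Z", &tm);
}
```

For the time stamp 0, the setting `CST-8' should give
`1970-01-01 08:00:00 CST', since the Posix sign convention makes `-8'
mean 8 hours ahead of UTC.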


Section 5.4

time-hour's range should be 00-23, not 00-24.  ISO 8601 allows 24, but
it sometimes confuses people, makes sorting a bit trickier, and
provides no useful functionality in this context.

time-second allows leap seconds (value 60), but many systems do not
support leap seconds.  (E.g. Posix requires that the host not support
leap seconds in time_t values.)  The RFC should recommend what to do
on a system that lacks leap second support when it is given a time
stamp containing a leap second.  I suggest that such systems treat
values >= 60 as 59.999... (the number of 9s after the decimal point
being the maximum allowed by the host).
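
The suggested treatment might look like the following sketch.  The
type and function names are hypothetical, and nanosecond resolution is
assumed purely for the example; a real host would substitute whatever
resolution it supports.

```c
#include <assert.h>

/* A seconds value split into whole seconds and nanoseconds.
   Nanosecond resolution is an assumption of this sketch.  */
struct seconds_value
{
  int sec;    /* 0-60, where 60 denotes a leap second */
  long nsec;  /* 0-999999999 */
};

/* On a host without leap second support, map any value >= 60 to the
   largest representable value below 60, here 59.999999999.
   Other values are returned unchanged.  */
struct seconds_value clamp_leap_second (struct seconds_value v)
{
  assert (0 <= v.nsec && v.nsec <= 999999999L);
  if (v.sec >= 60)
    {
      v.sec = 59;
      v.nsec = 999999999L;
    }
  return v;
}
```

One property of this choice is that a clamped leap second still sorts
after every ordinary time stamp in the same minute.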

time-secfrac should use "." for the decimal point, as this is more
commonly used in computer applications.  Usurping "," for the decimal
point might cause problems in applications that use "," for other
punctuation.  ISO 8601 allows "." for decimal point.

time-numzone should not require specification of minutes (and seconds)
as they are normally zero.  E.g. change it to:

  time-numzone = ("+" / "-") time-hour [":" time-minute]

date-time should not require a "T" between the date and the time; it
should allow a space as well.  This is easier to read, is allowed by
ISO 8601, and is common practice.

ISO 8601 provides no way to represent years before the year 0000, or
after the year 9999.  This makes it difficult to represent timestamps
in some historical applications.  To fix this, you might extend the
syntax for date-fullyear to:

	date-fullyear = ["-"] 4*DIGIT

where the years are numbered ..., -0002, -0001, 0000, 0001, 0002, ....
(Note that unlike the traditional Julian calendar, which has no year 0,
this numbering of the Gregorian calendar includes a year 0, so that
year 0000 corresponds to 1 BC.)


Section 6

I don't know of any ``IANA Registry of Timezone Names''.  But please
see my comments on Section 4.3 above for more details about an
existing registry that could be used as the basis of an IANA registry.


Appendix A

Why is this section needed?


Appendix B

I don't see why this section is needed, since this draft RFC doesn't
care about the day of the week.  But if you think it's needed, here's
the canonical reference for Zeller's congruence (written in German):

	Chr. Zeller, Kalender-Formeln, Acta Mathematica, Vol. 9, Nov. 1886

And here is code that is derived directly from Zeller's paper and uses
Zeller's notation.

#include <stdio.h>

/* Return day of week, with 0 meaning Saturday and 1 meaning Sunday.
   See Chr. Zeller, Kalender-Formeln, Acta Mathematica, Vol. 9, Nov. 1886.  */
int zeller (int year, int month, int day)
{
  int jan_or_feb = month < 3;
  int y = year - jan_or_feb;
  int J = y / 100;  /* century number */
  int K = y % 100;  /* year number within the same */
  int m = month + 12 * jan_or_feb;  /* month number */
  int q = day;  /* day number in the month */
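  /* Use + 5*J in place of Zeller's - 2*J (the two are congruent
     modulo 7) so that the sum is never negative; in C, % can yield a
     negative result when its left operand is negative.  */
  int h = (q + ((m + 1) * 26) / 10 + K + K/4 + J/4 + 5*J) % 7;
		/* weekday number (1 is Sunday) */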
  return h;
}

char *dayofweek[] = {
  "Saturday", "Sunday", "Monday", "Tuesday", "Wednesday",
  "Thursday", "Friday"
};

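int main (void)
{
  int year, month, day;

  /* Reject unreadable or out-of-range input.  */
  printf("Enter the year (0001-9999): ");
  if (scanf("%d", &year) != 1 || year < 1 || 9999 < year)
    return 1;
  printf("\nEnter the month (1-12): ");
  if (scanf("%d", &month) != 1 || month < 1 || 12 < month)
    return 1;
  printf("\nEnter the day of the month (1-31): ");
  if (scanf("%d", &day) != 1 || day < 1 || 31 < day)
    return 1;
  printf("The day of the week is: %s\n", dayofweek[zeller (year, month, day)]);
  return 0;
}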


