Floating-point time
Monty Solomon
monty at roscom.COM
Fri May 5 12:54:11 UTC 1995
FYI. Excerpts from RISKS DIGEST 17.08, 17.09, 17.10, 17.11
-----------------------------
Date: Sun, 23 Apr 1995 22:02:42 +0059 (EDT)
From: Robert J Horn <rjh at world.std.com>
Subject: Floating-Point Time
The opponents of floating-point representation for time have done an
insufficient analysis. About twenty years ago I was part of a research
group doing extensive time series analysis of weather and related data. We
needed a good way to represent time. Fortunately we had a few astronomers
on the team, so time was reasonably well understood.
We chose "second of century", using a double precision floating point
representation. Analysis showed that this would preserve millisecond
accuracy for the span of interest. (Actually for all of recorded history
and more.) Since we usually were satisfied with one minute accuracy this
seemed sufficient. There was a brief debate about using a better time base,
but 12:00:01 AM GMT, 1 January, 1901 was easy to explain to everyone. There
are a few applications that need better than millisecond precision, but for
most of the world's applications double precision floating point will provide
enough precision for the next few millennia. (A simple test for those who
are unsure about their needs. Do you compensate for the variations in the
rate of the Earth's rotation? If not, you probably don't need millisecond
accuracy.)
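The resolution claim above can be checked by asking how far apart adjacent doubles are at a given distance from the epoch. The following is a minimal sketch, assuming IEEE-754 doubles and a 365.25-day year; the function name is hypothetical, and math.ulp needs Python 3.9 or later. It measures only the spacing of representable values, not whether any particular millisecond value is stored exactly.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86400  # Julian year; an assumption for this estimate


def double_resolution_at(years_from_epoch: float) -> float:
    """Gap, in seconds, between adjacent doubles at this distance from the epoch."""
    return math.ulp(years_from_epoch * SECONDS_PER_YEAR)


for years in (1, 100, 1000, 10000):
    # The gap grows with distance from the epoch but stays far below 1 ms here.
    print(f"{years:>6} years: {double_resolution_at(years):.3e} s")
```

Under these assumptions the spacing stays below a millisecond even ten thousand years from the epoch.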
This notation had some interesting side effects. At the time, floating
point turned out to be somewhat faster than 64-bit integers due to a
quirk of hardware. It also led to excellent compatibility with the
other time series processing. Time was just another well behaved
variable. This notation eliminated a lot of the mistakes made by the
typical programmer who is ignorant of traditional time notations and
their problems. There could have been some round-off issues, but we
rarely did any arithmetic other than addition or subtraction of two
times, where millisecond accuracy is maintained. It even led to a
simple notation for interval time span data, e.g. "0.01 inches of rain
fell between 1633 and 1647 on ...", which is how many meteorological
measurements are made.
The difficult problems were in translation to and from local time. The most
severe problem was the inherent ambiguity of local time in recent decades.
There are two true times corresponding to each time in the one hour of
overlap when Daylight Savings shifts back to Standard. Correctly
resolving this ambiguity was always a headache. Fortunately most
professional measurements have been recorded in UTC, or GMT before UTC
was defined.
A word of caution, double precision floating point is suitable for an
internal representation of UTC, or "absolute" time. You have to do your
own analysis if you are interested in timing relative to some event.
Rob Horn rjh at world.std.com
P.S. The turn of century problem has made The NY Times. It may be so
widely hyped that almost all the problems are fixed by the time it comes.
[Hmm! According to you, it comes at 1/1/01 rather than 1/1/00.
I wonder who agrees with that! PGN]
-----------------------------
Date: Tue, 25 Apr 1995 10:54:01 -0400
From: rjh at world.std.com (Robert J Horn)
Subject: Re: Floating-Point Time
In comp.risks you write:
> [Hmm! According to you, it comes at 1/1/01 rather than 1/1/00.
> I wonder who agrees with that! PGN]
Who am I to argue with astronomers? They work in mathematics and FORTRAN,
so counting starts with one. Therefore the first day of the first century
is 1/1/00000001. C was invented later, and never really accepted by
astronomers. We just moved things closer by starting at 1901.
Rob Horn rjh at world.std.com
[The customary convention was also commented on by
jcs at zoo.bt.co.uk (John Sager),
pbm1001 at thor.cam.ac.uk (Paul Menage),
harper at kauri.vuw.ac.nz (John Harper),
ag325 at detroit.freenet.org (William M. Bickham),
"david (d.p.) woodman" <woodman at bnr.ca>,
rjwells at undergrad.math.uwaterloo.ca (justin wells), and
Greg Lindahl <gl8f at fermi.clas.virginia.edu>, an astronomer who noted
"Astronomers get to go to 2 sets of turn-of-the-century parties...
you nay-sayers only get to go to one." I'll be at the former, when
most of the computer-related risks are likely to begin. PGN]
-----------------------------
Date: Tue, 25 Apr 95 21:02:17 PDT
From: "Peter G. Neumann" <neumann at csl.sri.com>
Subject: Re: Floating-Point Time
Oh, yes, I certainly agree that the astronomers and ephemeris/almanac folks
like 1/1/01 as the century start. However, because there was no year ZERO,
that does not scale backwards. The first century BC clearly began on
1/1/-100, and the first millennium BC on 1/1/-1000. The only SANE way to
handle this is to provide a 99-year first century; just as we have
leap-years and leap-seconds, we could have a backwards leap-century.
Indeed, it is just as well there were no computers in virtual-year 0000, or
the religious wars would have been proven recursively unsolvable. PGN
-----------------------------
Date: Tue, 25 Apr 95 13:53:39 PDT
From: geoff at ficus.CS.UCLA.EDU (Geoff Kuenning)
Subject: Re: Floating-Point Time (Robert J Horn, RISKS-17.08)
> We chose "second of century", using a double precision floating point
> representation. Analysis showed that this would preserve millisecond
> accuracy for the span of interest.
Sigh. There is more to time calculations than just understanding time, and
there is more to numerical analysis than just the range of the mantissa.
I'll try this once more, keeping it brief. This project wouldn't have used
IEEE format 20 years ago, but let's proceed with the analysis under modern
assumptions. The IEEE double-precision floating-point representation
provides 53 bits of significance in the mantissa. Using the approximation
that 2^10 = 10^3, this can be seen to allow a range of about 8x10^15 in the
mantissa, before bits start getting dropped. Since there are about 3x10^7
seconds in a year, or about 10^8 every 3 years, one can represent about
8x16x3 = 384 years to millisecond precision without violating that range,
right?
Wrong. This is true only if you represent time as integer milliseconds.
Since the representation used seconds, the milliseconds were represented
fractionally. There are only four millisecond values that can be
represented accurately in a binary floating-point system: 0, 250, 500, and
750. ANYTHING ELSE is an approximation. This is especially true if your
calculations involve any addition and subtraction of times.
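A quick way to see which whole-millisecond fractions of a second are exact in binary floating point is to compare each double against the exact rational it is meant to represent. This is a sketch assuming IEEE-754 doubles; the function name is hypothetical. By this check the exact values are the multiples of 125 ms (eighths of a second), which includes the four listed above; every other millisecond value is stored as a nearby approximation.

```python
from fractions import Fraction


def exact_millisecond_fractions() -> list[int]:
    """Millisecond counts k (0..999) for which the double k/1000 equals k/1000 exactly."""
    # Fraction(float) recovers the exact binary value stored in the double.
    return [k for k in range(1000) if Fraction(k / 1000) == Fraction(k, 1000)]


print(exact_millisecond_fractions())
```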
> Since we usually were satisfied with one minute accuracy this
> seemed sufficient.
It sounds like this project involved some sort of modeling of the physical
world. In such cases, the loss of a few bits of accuracy may not matter,
although I still think programmers ought to understand numerical analysis
better than most do. But remember that the original discussion of time
representation arose in the context of measuring time in computer and
financial applications, where these bits can matter.
> There are a few applications that need better than millisecond precision,
> but for most of the world's applications double precision floating point
> will provide enough precision for the next few millennia. (A simple test
> for those who are unsure about their needs. Do you compensate for the
> variations in the rate of the Earth's rotation? If not, you probably don't
> need millisecond accuracy.)
This is both a broad and a biased statement. Here's a simple response to
the simple test: I'm not dealing with astronomical phenomena, and so I
couldn't care less about variations in the Earth's rotation. What I care
about is whether I can query the system time, do something, query the time
again, and subtract to get an accurate elapsed time. I care about not only
milliseconds, but microseconds (and in a few years, nanoseconds).
Now it's true that I generally deal with time in a small range.
Unfortunately, the system time is represented in terms of a relatively long
base interval. So I have to be able to take two times measured in years,
subtract them, and get microsecond precision in my results. If the times
are represented as integer microseconds, I can be sure that everything will
"balance," as the accountants say. If a sloppy floating-point
representation insists on giving it to me as integer and fractional seconds,
things won't add up, and my papers may not get published.
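The elapsed-time concern can be illustrated with two timestamps about 95 years from an epoch, one microsecond apart. This is only a sketch: the timestamp values and variable names are made up for illustration, and IEEE-754 doubles are assumed.

```python
# Hypothetical timestamps roughly 95 years after some epoch.
start_s = 3_000_000_000.0          # seconds since the epoch, held as a double
end_s = start_s + 1e-6             # "one microsecond later", rounded to the nearest double
elapsed_float = end_s - start_s    # what a float-seconds clock would report

start_us = 3_000_000_000_000_000   # the same instant as integer microseconds
end_us = start_us + 1
elapsed_int = end_us - start_us    # exactly one tick

print(elapsed_float == 1e-6)  # False: the difference is close to, but not, 1e-06
print(elapsed_int)            # 1
```

The floating-point difference is off by a few percent of a microsecond here, while the integer difference is exact.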
> There could have been some round-off issues, but we
> rarely did any arithmetic other than addition or subtraction of two
> times, where millisecond accuracy is maintained.
But addition and subtraction are precisely the places where numerical errors
are introduced! Millisecond accuracy was *not* maintained in this
situation. Since you only cared about minutes, this hardly mattered, but
it's still an incorrect statement.
> A word of caution, double precision floating point is suitable for an
> internal representation of UTC, or "absolute" time.
It's only appropriate when you're doing modeling of the physical world,
where you don't care about losing a few bits of accuracy in the fraction
because your measurements aren't that good anyway. Double precision is a
*terrible* way to model elapsed time inside a computer system. It has
nothing to do with the nature of time, and everything to do with numerical
analysis.
Geoff Kuenning g.kuenning at ieee.org geoff at ITcorp.com
http://www.cs.ucla.edu/ficus-members/geoff/
-----------------------------
Date: Sat, 29 Apr 1995 19:49:08 GMT
From: dcline at netcom.com (David Cline)
Subject: Re: Floating-Point Time (Kuenning, RISKS-17.09)
> ... Since there are about 3x10^7 seconds in a year, or about 10^8 every
> 3 years, one can represent about 8x16x3 = 384 years to millisecond
> precision without violating that range, right?
Wrong. This confuses milliseconds and microseconds; you can represent 285
years to *microsecond* accuracy in 53 bits. If you only care about
millisecond accuracy, you can represent about 285,000 years. There are also
ways of using the sign bit to double the effective range.
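The figures above can be reproduced by counting how many years of ticks fit in a 53-bit integer range. A rough sketch, assuming a 365.25-day year; the names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 86400   # Julian year; an assumption for this estimate
MANTISSA_VALUES = 2 ** 53           # distinct integer counts a 53-bit significand holds


def years_covered(ticks_per_second: int) -> float:
    """Years spanned by 2**53 ticks at the given tick rate."""
    return MANTISSA_VALUES / ticks_per_second / SECONDS_PER_YEAR


print(round(years_covered(1_000_000)))  # microsecond ticks
print(round(years_covered(1_000)))      # millisecond ticks
```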
Dave Cline Spring Valley Software dcline at netcom.com
[Your moderator is dismayed that this is dragging on so long! PGN]
-----------------------------
Date: Fri, 28 Apr 95 11:12:30 EDT
From: hopkins at VFL.Paramax.COM
Subject: Re: Floating-Point Time
On the year-zero and religious wars: PGN suggests [RISKS-17.09] that
first-century dates (which were, after all, not invented until well after
the fact) would have created religious wars had there been computers to
suggest that there should be a year zero.
Any self-respecting computer, however, would have balked at attempts to
divide the factions by zero.
Bill Hopkins
hopkins at VFL.Paramax.Com Unisys Corporation (Soon to be Loral, they say)
610-648-2854 or 363-7464 Valley Forge Eng'g Ctr, POB 517, Paoli PA 19301
-----------------------------
Date: Tue, 2 May 1995 19:49:51 -0400
From: "Andrew D. Fernandes" <adfernan at cnd.mcgill.ca>
Subject: Re: Floating-point time
It seems to me that the real issue as to whether or not floating point
representations are appropriate for timestamps is not "can I get microsecond
resolution?" but "is it going to work unconditionally?"
Floating point numbers bedevil numerical programmers for several reasons; we
need to know the radix, number of mantissa and exponent digits, behaviour of
rounding, etc. to guarantee "good" results---there is a reason that
textbooks have been written about numerical analysis. Modern computers
generally stick to the IEEE-754 standard for floating point arithmetic, and
thus code is usually fairly interchangeable, but anyone who has ever done
intensive number crunching on old IBMs, VAXen, or CRAYs can tell horror
stories about interchanging floating point programs between systems. Even
going between a SPARCstation and my '486 at home, I can see a few digits
change. "Unconditional portability" is ludicrous under these conditions.
A 64 bit integer can represent 1.84e19 values, which implies about 2
nanosecond resolution of a thousand year interval. Integer math is very
portable across computer systems, or at least is very easy to describe
fully, and 1 + 1 always evaluates as 2, and not 2.0001.
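The resolution estimate for a 64-bit integer is simple division. A small sketch, assuming a 365.25-day year and an unsigned 64-bit count; the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 86400   # Julian year; an assumption for this estimate


def tick_size_seconds(span_years: float, bits: int = 64) -> float:
    """Smallest time step if 2**bits integer values are spread evenly over the span."""
    return span_years * SECONDS_PER_YEAR / 2 ** bits


print(f"{2 ** 64:.3e} values")                   # about 1.84e19
print(f"{tick_size_seconds(1000):.2e} s/tick")   # a little under 2 ns
```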
-Andrew D. Fernandes (adfernan at cnd.mcgill.ca)
-----------------------------
Date: Thu, 4 May 95 11:42 PDT
From: ludemann at netcom.com (Peter Ludemann)
Subject: Re: Floating-point time
In my original note I mentioned something really simple: handling clock
ticks. Not handling time, which is a topic for PhD theses (Stanford is
about to grant a PhD for a thesis on temporal representation in databases).
My problem was simple: I needed to do 48-bit arithmetic on a machine which
had 32-bit integers. I had been given an assembler routine which did the
48-bit arithmetic wrong; and I was fortunate to test it in December when its
accumulated error was about 3 minutes in converting clock ticks to a date
(in January, it had almost no error).
So, what to do? Obviously, writing 48-bit arithmetic in assembler was
error-prone and tedious. Doing it in PL/I (this was before C was generally
available) wasn't much more fun nor less error-prone (anyone who has read
the PL/I manual's description of handling overflow and precisions will agree
that it's not a pretty sight; and I later learned that the PL/I implementors
had got it wrong anyway). And I didn't have access to LISP's
rational-number packages.
In a flash of inspiration, I realized that double-precision floating
point, with 53-bit mantissas, would handle my 48-bit integers with no
round-off errors. Converting between 48-bit integers and floating
point is simple (it's a "well-known assembler idiom" that takes about
3 instructions). And everything would then proceed perfectly (even a
Pentium can get floating-point addition and subtraction right).
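The exactness argument can be checked directly: every integer below 2^53 has an exact double, so 48-bit tick counts survive the round trip, and their sums and differences stay exact as long as the results also fit. A sketch assuming IEEE-754 doubles; the names and sample values are hypothetical.

```python
MAX_48 = 2 ** 48 - 1   # largest 48-bit unsigned value


def roundtrips(n: int) -> bool:
    """True if converting n to a double and back loses nothing."""
    return int(float(n)) == n


a, b = MAX_48, MAX_48 - 12345
print(roundtrips(a))                     # True: 48 bits fit in a 53-bit significand
print(float(a) - float(b) == a - b)      # True: the difference is exact
print(float(a) + float(b) == a + b)      # True: the sum is still below 2**53
print(roundtrips(2 ** 53 + 1))           # False: beyond 53 bits, integers start to collide
```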
At one stroke I had removed a few pages of difficult assembler and
replaced them by a few lines of PL/I. Less code, less chance of error.
[Inspired by this, I wrote a disk accounting package that eschewed
packed decimal arithmetic and instead used floating point (of
course, I calculated everything in cents because 1.01 doesn't
represent exactly but 101 does. Another success.)]
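The point about cents can be seen in a few lines. A sketch assuming IEEE-754 doubles; the variable names and amounts are made up for illustration.

```python
from fractions import Fraction

# 1.01 has no exact binary representation, but 101 does.
print(Fraction(1.01) == Fraction(101, 100))   # False
print(Fraction(101.0) == 101)                 # True

# Ten 10-cent charges, accumulated as float dollars and as float cents.
dollars_total = sum([0.10] * 10)
cents_total = sum([10.0] * 10)
print(dollars_total == 1.0)    # False: rounding error accumulates
print(cents_total == 100.0)    # True: whole cents add exactly
```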
One of my jobs as a programmer is to reduce the risk of error in my system.
I can only test so much. Testing 48-bit integer arithmetic is not fun, at
least for me; and proving the code right is difficult (Knuth allegedly wrote
"Beware of bugs in the above code; I have only proved it correct, not tried
it."). PL/I's implementation of 15-digit packed decimal was buggy. And I
wasn't being paid by the number of lines of code (au contraire: if I
finished early, I took time off).
The lessons for RISKS: before you try to do something complicated, maybe
there's something simple and reliable lying around that you can pervert
slightly into the form you want. [My original RISKS lesson was to be sure
to test time/date conversion routines at many times of the year; if they
work in January, they might not work in December.]
- Peter Ludemann
-----------------------------
Date: Tue, 2 May 1995 18:26:06 GMT0BST
From: Phil Brady <P.R.Brady at swansea.ac.uk>
Subject: Re: Floating-point time
> [Your moderator is dismayed that this is dragging on so long! PGN]
Agreed - the debate has been interesting, but hasn't it run its course yet?
The fact is that every programmer needs to decide whether to use floating,
decimal, integer, string, etc and the appropriate precision for every
variable in every program depending on the application. If they don't know
the appropriate form for time for the problem in hand, then they aren't
programmers!
Phil Brady, University of Wales, Swansea 01792 295160
-----------------------------