[tz] A history maintaining repository for tzdata

Sat Oct 8 18:42:08 UTC 2011

On Sat, Oct 8, 2011 at 08:43, Bill Sommerfeld <sommerfeld at alum.mit.edu> wrote:
> An ancestor of the current tzdata was posted to the usenet mod.sources
> newsgroup with a date of March 7, 1986; I found copies at:

Fantastic.  I got the idea that Usenet was the next place to go, but I
had to catch a bus so I put the initial part up to let people know
what I was working on.

Annoyingly, the reason I threw my work up on github in a hurry last
night was that I was about to leave for Galway.  Now I'm here, and
while I have the CDROM *cases* of the first three volumes of the
Walnut Creek Usenet Source Code archive, I suspect the CDROMs
themselves are back in Dublin.  Argh.

Anyway, thanks for the links.  Any additional early releases I find
will have similar style commits - correct authors, timestamps and
message ids.  I'm downloading the comp.sources.unix archives now and
will get the mod.sources archives next.  Was settz posted on
alt.sources prior to mod.sources?

As for the points raised by other posters - US-based hosts and
preserving SCCS history...

First, I chose github just for a quick place to push the repo.  I
obviously have the full copy (and more importantly the scripts used to
generate it which I'll push up later today) and I'm in Ireland.  It
lives in a VM on my laptop and I've pushed out all the materials I
used to build the repo to a few other machines in other countries.

I have two thoughts on US-based servers.  First, there are a host of
other projects on github that have these files in some form so I'm not
really concerned.  Second, this repository is a work in progress that
folks here can evaluate in terms of its structure (see the README for
the issues I'd like feedback on).  It's my belief that I should send
my repo generation scripts and supporting metadata to kre (or ado or
eggert) and have them generate the repo that people actually use.

Lastly, I completely agree that the SCCS history should be preserved if
possible.  However, I don't think I can get access to it without causing
ado or eggert a lot of problems.  I think I know how to weave SCCS
commits into the resulting git tree.  Essentially the algorithm will
go something like this (in shell/pythony pseudo code):

 for commit, next_commit in list_of_tags; do
   start_date=# get date of this commit
   end_date=# get date of next commit
   # get list of changes in sccs repository between those dates
   for change in changes; do
     # get the date, commit message and diff of this change
     # apply diff to repo, commit with correct date and metadata
   done
   git rebase next_commit
 done

How to do this will depend on what ado actually uses - SCCS or RCS, or
whether he migrated from one to the other at some point.  I also suspect
some tweaks will be required.  For instance, I'm not sure how rebase
deals with empty commits, and if everything is working correctly every
rebase will cause an empty commit.  And obviously when that happens
everyone would need to resync their tree.  But I'm happy to help make
it happen.
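
A minimal sketch of just the date-window step of the loop above, in
Python.  It only shows how changes could be grouped between consecutive
release dates; extracting changes from SCCS/RCS, applying the diffs and
the rebase are left out.  The Release and Change types, the weave()
function and the tag names are hypothetical, and the choice that a change
dated exactly at a release belongs to the window ending at that release
is an assumption, not something settled in this thread.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Release:
    tag: str          # hypothetical tag name for a released tarball
    date: datetime


@dataclass
class Change:
    date: datetime    # date recorded for the individual change
    message: str


def weave(releases, changes):
    """Group changes into the window between consecutive releases.

    Returns one (start_tag, end_tag, changes) tuple per consecutive
    pair of releases.  A change belongs to a window when
    start.date < change.date <= end.date (an assumed convention).
    """
    releases = sorted(releases, key=lambda r: r.date)
    changes = sorted(changes, key=lambda c: c.date)
    dates = [c.date for c in changes]
    windows = []
    for start, end in zip(releases, releases[1:]):
        lo = bisect_right(dates, start.date)  # first change after start
        hi = bisect_right(dates, end.date)    # up to and including end
        windows.append((start.tag, end.tag, changes[lo:hi]))
    return windows


if __name__ == "__main__":
    # Made-up dates, purely for illustration.
    rels = [Release("rel-a", datetime(2000, 1, 1)),
            Release("rel-b", datetime(2000, 2, 1))]
    chs = [Change(datetime(2000, 1, 15), "hypothetical fix")]
    for start_tag, end_tag, found in weave(rels, chs):
        print(start_tag, end_tag, [c.message for c in found])
```

Changes dated before the first release or after the last one fall in no
window and are dropped here; a real run would have to decide what to do
with them.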

In the meantime I'll get together the scripts and data kre needs to
generate a repo that he can make a "blessed" repository for his own use
(and publish wherever he chooses, if anywhere).  Well, first I'll get
the pre-93 sources and the tzcode in there.

Kevin