[tz] [PATCH] New file 'pre1970' for zones that differ only in pre-1970 time stamps.

Paul Eggert eggert at cs.ucla.edu
Fri Sep 6 21:10:48 UTC 2013


I've started to take a look at this, and it appears
to be nice work that will head us in the right direction;
thanks.  I found a problem, though, in that tzwinnow is
not winnowing out as much as I expect.
For example, tzwinnow -a 1970z should find that
Africa/Accra and Africa/Dakar are duplicates, since
they've both been at plain GMT since 1970, but tzwinnow
considers them to be distinct for some reason.

One suggestion: it would be nice for tzwinnow to have
an option where it ignores differences due only to the
time zone abbreviations, for applications that care only
about UTC offsets.

I found the problem with Accra and Dakar by running the
following test script, which is not intended to be portable
or fast but should run on any GNU/Linux host with the necessary
packages installed.

This script found that the version of the tz database that
you used had 417 zones (this does not count links), of
which 190 are duplicates from the year 2013 on.  Hence
it found 227 distinct zones from the year 2013 on, a
considerably smaller number than what you found with
tzwinnow.
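
As a rough illustration of how such counts can be derived, here is a
sketch that assumes fdupes-style output: each set of identical files is
listed one name per line, and sets are separated by blank lines.  The
function name count_dups is made up for this example.  A set of n
identical files contributes n - 1 duplicates.

```shell
# Hypothetical helper: count duplicates in fdupes-style output on stdin.
count_dups() {
  awk '
    NF  { n++ }                          # non-blank line: one more file in this set
    !NF { if (n) dups += n - 1; n = 0 }  # blank line: close the current set
    END { if (n) dups += n - 1; print dups + 0 }
  '
}

# Example: one set of two files and one set of three files gives 3.
printf 'a\nb\n\nc\nd\ne\n\n' | count_dups
```

The number of distinct zones is then the total minus the duplicates,
e.g. 417 - 190 = 227.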

If we ignore time zone abbreviations, a variant of the script
finds 314 duplicates, which means there are 103
distinct zones today.  Having to choose from 103
values should be significantly easier for users
than having to choose from 417.
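
One way a script could ignore abbreviations is sketched here; it is an
illustration under assumptions, not necessarily the variant used for the
numbers above.  It assumes each zdump -V line carries the abbreviation
as the word just before "isdst=", and drops that word so that only the
times and offsets are compared.  The function name strip_abbr is made up
for this example.

```shell
# Hypothetical filter: drop the abbreviation that precedes "isdst=" in
# zdump -V output read from stdin.
strip_abbr() {
  sed 's/ [^ ]* \(isdst=[01]\)/ \1/'
}

# Example: the "PDT" before isdst=1 is removed; the rest is unchanged.
echo 'Sun Mar 10 10:00:00 2013 UT = Sun Mar 10 03:00:00 2013 PDT isdst=1 gmtoff=-25200' |
  strip_abbr
```

The date_format in the script below also ends in %Z, so a variant along
these lines would need to drop that as well.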

#! /bin/sh

TOPDIR=$1
test -f "$TOPDIR/etc/zdump" || { echo >&2 "$0: usage: $0 topdir"; exit 1; }

start_time_t=0
start_year=2013
limit_year=2500

# Prepend "." to the path, since this is meant to be run
# in the source directory, which contains tzwinnow, zdump, and maybe 'date'.
LC_ALL=C
PATH=.:$PATH
TZ=UTC0
export LC_ALL PATH TZ

date_format='%Y-%m-%dT%H:%M:%S %Z'

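# Find an option that makes 'date' print a given time_t value:
# GNU date accepts -d@N, BSD date accepts -r N.
for date_origin_option in '-d@' '-r' ''; do
  test -n "$date_origin_option" ||
    { echo >&2 "$0: cannot find a 'date' option that accepts a time_t"; exit 1; }
  date_output=$(date $date_origin_option$start_time_t "+$date_format")
  [ "$date_output" = '1970-01-01T00:00:00 UTC' ] && break
done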

zonedir=$TOPDIR/etc/zoneinfo

tmp=$(mktemp -d) || exit
# trap 'status=$?; rm -fr $tmp; exit $status' 0
# trap exit 1 2 13 15

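# List each zone file once.  Hard links share an inode number, so after
# sorting by inode, print only the first name seen for each inode.
(cd "$zonedir" &&
    find * ! -name '*.tab' -type f -ls |
    sort |
    awk '{if (inum != $1) print $NF; inum = $1; }' |
    sort
) >"$tmp/names"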

tzwinnow -a ${start_year}z -B ${limit_year}z -z "$zonedir" -l \
  <$tmp/names >$tmp/tzwinnow.out

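# For each zone, record its local time at the start time_t followed by its
# transitions, with the leading zone name stripped so that identical zones
# produce identical files.
for name in $(cat $tmp/names); do
  dest=$tmp/zdump.out/$name
  mkdir -p "$(dirname "$dest")"
  (TZ=$zonedir/$name date $date_origin_option$start_time_t "+$date_format" &&
   zdump -V -c $start_year,$limit_year "$name" | sed 's/^[^ ]* *//'
  ) >"$dest" || break
done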

# Group byte-identical output files; each group is a set of duplicate zones.
(cd $tmp/zdump.out && fdupes -qr . | sed 's@^\./@@') >$tmp/check.out

echo "output is in: $tmp"
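
The link-skipping step in the script relies on hard links sharing an
inode number.  A sketch of the same idea using ls -i instead of
find -ls might look like the following.  The function name
unique_by_inode is made up for this example, and it assumes that file
names contain no newlines.

```shell
# Hypothetical helper: print one name per distinct inode under directory $1,
# skipping *.tab files, so that hard-linked zones are listed only once.
unique_by_inode() {
  (cd "$1" &&
    find . -type f ! -name '*.tab' -exec ls -i {} + |
    sort -k1,1n -k2 |
    awk '{ inum = $1 }
         inum != prev { name = $0; sub(/^ *[0-9]+ +\.\//, "", name); print name }
         { prev = inum }' |
    sort)
}
```

For example, unique_by_inode "$TOPDIR/etc/zoneinfo" would list each
zone file once, keeping the first name in sort order for each inode.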


