[tz] zic tweak to warn about non-ASCII in filenames

Paul Eggert eggert at cs.ucla.edu
Thu Jun 26 16:42:28 UTC 2014


Arthur David Olson wrote:
> I'll advocate for warning about any dots in file names

That's easy enough, and it simplifies both the code and the documentation;
I pushed the attached patches.  The second patch documents four other
exceptional names ('EST5EDT', 'CST6CDT', 'MST7MDT', and 'PST8PDT') that I
found when I ran the new 'zic -v' against the tz database.
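
For readers who want to see the effect of the simplified rule without
reading the diff, here is a rough standalone sketch of the check.  It is
not the zic.c code: the helper names name_is_suspect and
component_is_suspect and the constant COMPONENT_LEN_MAX are hypothetical,
the 14-byte limit is taken from the zic.8 wording, and the sketch returns
a yes/no answer instead of printing warnings.  Only the benign string is
copied from the patched zic.c.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical limit, taken from the 14-byte rule in the zic.8 text. */
enum { COMPONENT_LEN_MAX = 14 };

/* Bytes accepted without a warning, copied from the patched zic.c.
   Note that neither '.' nor the digits appear here. */
static char const benign[] = ("-/_"
			      "abcdefghijklmnopqrstuvwxyz"
			      "ABCDEFGHIJKLMNOPQRSTUVWXYZ");

/* Return 1 if the component [component, component_end) starts with '-'
   or is longer than COMPONENT_LEN_MAX bytes, else 0. */
static int
component_is_suspect(char const *component, char const *component_end)
{
	size_t len = component_end - component;
	return (0 < len && component[0] == '-') || COMPONENT_LEN_MAX < len;
}

/* Return 1 if NAME would draw at least one warning in this sketch,
   else 0. */
static int
name_is_suspect(char const *name)
{
	char const *cp;
	char const *component = name;

	for (cp = name; *cp; cp++) {
		/* Any byte outside the benign set is flagged,
		   which now includes every '.'. */
		if (!strchr(benign, *cp))
			return 1;
		if (*cp == '/') {
			if (component_is_suspect(component, cp))
				return 1;
			component = cp + 1;
		}
	}
	/* Check the final component, which has no trailing '/'. */
	return component_is_suspect(component, cp);
}
```

Under these assumptions 'America/New_York' passes, while 'EST5EDT' is
flagged for its digits and 'Canada/East-Saskatchewan' for its overlength
component, which is why such names need to be documented as exceptions.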
-------------- next part --------------
From 300e008f98cc6c8e0540b051bac5390e3f248f4f Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert at cs.ucla.edu>
Date: Thu, 26 Jun 2014 09:36:50 -0700
Subject: [PATCH 1/2] 'zic -v' now warns about all '.'s in output file names.

* zic.c (componentcheck, namecheck): Warn about all '.'s in
the file name, not merely about "." and ".." file name components.
* zic.8 (DESCRIPTION), NEWS: Document this.
---
 NEWS  | 6 +++---
 zic.8 | 7 +------
 zic.c | 6 +-----
 3 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/NEWS b/NEWS
index 5e5ba8e..80bad54 100644
--- a/NEWS
+++ b/NEWS
@@ -15,9 +15,9 @@ Unreleased, experimental changes
     Error diagnostics of 'zic' and 'yearistype' have been reworded so that
     they no longer use ASCII '-' as if it were a dash.
 
-    'zic -v' now warns about output file names that do not follow POSIX rules,
-    or that contain a digit or a file name component of '.' or '..'.
-    (Thanks to Arthur David Olson for starting the ball rolling on this.)
+    'zic -v' now warns about output file names that do not follow
+    POSIX rules, or that contain a digit or '.'.  (Thanks to Arthur
+    David Olson for starting the ball rolling on this.)
 
     Some lint has been removed when using GCC_DEBUG_FLAGS with GCC 4.9.0.
 
diff --git a/zic.8 b/zic.8
index 95dd038..e22e6cd 100644
--- a/zic.8
+++ b/zic.8
@@ -115,17 +115,12 @@ POSIX requires at least 3.
 .PP
 An output file name contains a byte that is not an ASCII letter, digit,
 .q "-" ,
-.q "." ,
 .q "/" ,
 or
 .q "_" ;
 or it contains a file name component that contains more than 14 bytes
 or that starts with
-.q "-"
-or is
-.q "."
-or
-.q ".." .
+.q "-" .
 .RE
 .TP
 .B \-s
diff --git a/zic.c b/zic.c
index 62c5fd5..ddf764c 100644
--- a/zic.c
+++ b/zic.c
@@ -630,10 +630,6 @@ componentcheck(char const *name, char const *component,
 	if (0 < component_len && component[0] == '-')
 		warning(_("file name '%s' component contains leading '-'"),
 			name);
-	if (0 < component_len && component_len <= 2
-	    && component[0] == '.' && component_end[-1] == '.')
-		warning(_("file name '%s' contains '%.*s' component"),
-			name, (int) component_len, component);
 	if (component_len_max < component_len)
 		warning(_("file name '%s' contains overlength component"
 			  " '%.*s...'"),
@@ -644,7 +640,7 @@ static void
 namecheck(const char *name)
 {
 	register char const *cp;
-	static char const benign[] = ("-./_"
+	static char const benign[] = ("-/_"
 				      "abcdefghijklmnopqrstuvwxyz"
 				      "ABCDEFGHIJKLMNOPQRSTUVWXYZ");
 	register char const *component = name;
-- 
1.9.1
-------------- next part --------------
From ee82eb4a05088238fa5370e5c0c7112a58362c1b Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert at cs.ucla.edu>
Date: Thu, 26 Jun 2014 09:39:13 -0700
Subject: [PATCH 2/2] * Theory, NEWS: Also document EST5EDT etc. as exceptions.

Use the term "legacy names" for exceptions.
---
 NEWS   |  4 ++--
 Theory | 15 ++++++++-------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/NEWS b/NEWS
index 80bad54..e87e2b7 100644
--- a/NEWS
+++ b/NEWS
@@ -23,8 +23,8 @@ Unreleased, experimental changes
 
   Changes affecting documentation and commentary
 
-    The 'Theory' file documents the longstanding exceptions to the
-    POSIX file name rules that are in 'etcetera' and 'backward'.
+    The 'Theory' file documents legacy names, the longstanding
+    exceptions to the POSIX-inspired file name rules.
 
     Documentation and commentary now prefer UTF-8 to US-ASCII,
     allowing the use of proper accents in foreign words and names.
diff --git a/Theory b/Theory
index c31731a..ef26751 100644
--- a/Theory
+++ b/Theory
@@ -406,7 +406,7 @@ in decreasing order of importance:
 		TZ strings.  A file name component must not exceed 14
 		characters or start with '-'.  E.g., prefer 'Brunei'
 		to 'Bandar_Seri_Begawan'.  Exceptions: see the discussion
-		of the 'etcetera' file below.
+		of legacy names below.
 	A name must not be empty, or contain '//', or start or end with '/'.
 	Do not use names that differ only in case.  Although the reference
 		implementation is case-sensitive, some other implementations
@@ -469,12 +469,13 @@ See the file 'backward' for most of these older names
 The other old-fashioned names still supported are
 'WET', 'CET', 'MET', and 'EET' (see the file 'europe').
 
-Older versions of this package defined names that were
-incompatible with POSIX.  These older names are still supported,
-even though they do not conform to first rule of location names.
-These incompatible names are mostly defined in the file 'etcetera'.
-Also, the file 'backward' defines the incompatible names 'GMT0',
-'GMT-0', 'GMT+0', and 'Canada/East-Saskatchewan'.
+Older versions of this package defined legacy names that are
+incompatible with the first rule of location names, but which are
+still supported.  These legacy names are mostly defined in the file
+'etcetera'.  Also, the file 'backward' defines the legacy names
+'GMT0', 'GMT-0', 'GMT+0' and 'Canada/East-Saskatchewan', and the file
+'northamerica' defines the legacy names 'EST5EDT', 'CST6CDT',
+'MST7MDT', and 'PST8PDT'.
 
 Excluding 'backward' should not affect the other data.  If
 'backward' is excluded, excluding 'etcetera' should not affect the
-- 
1.9.1


