[tz] zic tweak to warn about non-ASCII in filenames

Tim Parenti tim at timtimeonline.com
Thu Jun 26 01:04:46 UTC 2014


If you allow dot, you'll want to add checks that "." and ".." are not used
as file name components.
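
As a rough sketch of what such a check could look like (has_dot_component
is a hypothetical name, not something in zic.c, and the patch quoted below
does not do this), one could walk the slash-delimited components and flag
any that is exactly "." or "..":

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
** Hypothetical helper, not part of zic.c: return true if any
** slash-delimited component of name is exactly "." or "..".
*/
static bool
has_dot_component(const char *name)
{
	const char *cp = name;

	assert(name != NULL);
	while (*cp != '\0') {
		size_t len = strcspn(cp, "/");	/* length of this component */

		if ((len == 1 && cp[0] == '.') ||
		    (len == 2 && cp[0] == '.' && cp[1] == '.'))
			return true;
		cp += len;
		if (*cp == '/')
			++cp;	/* step over the delimiter */
	}
	return false;
}
```

A name such as "a.b/c" would pass, since the dot is only part of a
component; "America/../etc" would be flagged.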

Our guidelines in Theory prevent us from doing this to ourselves.  Theory
states, "Within a file name component, use only ASCII letters, [dot,
hyphen, and underscore]."  (Slash delimits components, and plus is used in
some Etc zones.)  However, Theory also says to "[o]mit [dot] from
abbreviations in names".

So the question is: Could we imagine some use case where someone (likely
not us) might want to use a dot that makes this worthwhile?

A related question that arose along those lines: Are we already checking
that components do not start with a hyphen?  Theory specifies that as well.

--
Tim Parenti


On 25 June 2014 20:52, Arthur David Olson <arthurdavidolson at gmail.com>
wrote:

> I limited the BENIGN list to characters other than [a-zA-Z0-9] currently
> used in distribution file names; <dot> could be added.
>
> (The "etcetera" file has lines such as "Zone Etc/GMT+10...")
>
> On an unrelated note: I checked; "zic -v" already checks abbreviations and
> issues warnings.
>
>     --ado
>
>
> On Wed, Jun 25, 2014 at 8:36 PM, Jonathan Leffler <
> jonathan.leffler at gmail.com> wrote:
>
>> Dot is pretty benign in a file name, isn't it?
>>
>> POSIX defines the portable file name character set as:
>>
>> 3.278 Portable Filename Character Set
>>
>> The set of characters from which portable filenames are constructed.
>>
>> A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
>> a b c d e f g h i j k l m n o p q r s t u v w x y z
>> 0 1 2 3 4 5 6 7 8 9 . _ -
>>
>>
>> (
>> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_278
>> )
>> or
>> (http://pubs.opengroup.org/onlinepubs/9699919799/toc.htm and under Base
>> Definitions, section 3 Definitions, and thence to 3.278).
>>
>> Your list omits . <dot> and adds + <plus> (and includes / <slash> the
>> path separator).
>>
>>
>>
>>
>> On Wed, Jun 25, 2014 at 4:37 PM, Arthur David Olson <
>> arthurdavidolson at gmail.com> wrote:
>>
>>> To help ensure that non-ASCII characters don't appear in distribution
>>> filenames,
>>> changes to zic.c so that the "-v" option warns about them. Both attached
>>> and
>>> tab-mangled below.
>>>
>>>     --ado
>>>
>>> *** /tmp/,azic.c    2014-06-25 19:32:44.803874900 -0400
>>> --- /tmp/,bzic.c    2014-06-25 19:32:44.906880800 -0400
>>> ***************
>>> *** 134,139 ****
>>> --- 134,140 ----
>>>   static int    itsdir(const char * name);
>>>   static int    lowerit(int c);
>>>   static int    mkdirs(char * filename);
>>> + static void    namecheck(const char * name);
>>>   static void    newabbr(const char * abbr);
>>>   static zic_t    oadd(zic_t t1, zic_t t2);
>>>   static void    outzone(const struct zone * zp, int ntzones);
>>> ***************
>>> *** 621,632 ****
>>> --- 622,652 ----
>>>       return (errors == 0) ? EXIT_SUCCESS : EXIT_FAILURE;
>>>   }
>>>
>>> + #define BENIGN    "+-_/"
>>> +
>>> + static void
>>> + namecheck(const char * const name)
>>> + {
>>> +     register const char *    cp;
>>> +
>>> +     if (!noise)
>>> +         return;
>>> +     for (cp = name; *cp != '\0'; ++cp)
>>> +         if (!isascii(*cp) ||
>>> +             (!isalnum(*cp) && strchr(BENIGN, *cp) == NULL)) {
>>> +             warning(_("file name %s has non-ASCII-alphanumeric character other than %s"),
>>> +                     name, BENIGN);
>>> +             return;
>>> +         }
>>> + }
>>> +
>>>   static void
>>>   dolink(const char *const fromfield, const char *const tofield)
>>>   {
>>>       register char *    fromname;
>>>       register char *    toname;
>>>
>>> +     namecheck(tofield);
>>>       if (fromfield[0] == '/')
>>>           fromname = ecpyalloc(fromfield);
>>>       else {
>>> ***************
>>> *** 1495,1500 ****
>>> --- 1515,1521 ----
>>>       void *typesptr = ats + timecnt;
>>>       unsigned char *types = typesptr;
>>>
>>> +     namecheck(name);
>>>       /*
>>>       ** Sort.
>>>       */
>>>
>>>
>>>
>>
>>
>> --
>> Jonathan Leffler <jonathan.leffler at gmail.com>  #include <disclaimer.h>
>> Guardian of DBD::Informix - v2013.0521 - http://dbi.perl.org
>> "Blessed are we who can laugh at ourselves, for we shall never cease to
>> be amused."
>>
>
>