[tz] [PROPOSED 1/4] Allow “§” etc. in commentary

Brian Inglis Brian.Inglis at Shaw.ca
Tue Jan 24 02:22:46 UTC 2023

On 2023-01-23 15:32, John Sauter via tz wrote:
> On Mon, 2023-01-23 at 15:28 -0700, Paul Gilmartin via tz wrote:
>> On 1/23/23 13:48:02, Paul Eggert via tz wrote:
>>> * Makefile (UNUSUAL_OK_LATIN_1): Allow all non-alphabetic,
>>> non-ASCII printable characters that are Latin-1.  This is
>>> primarily for “§” and we might as well allow them all
>>> since even XEmacs 21 supports them all.

>>> +UNUSUAL_OK_LATIN_1 = ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿×÷

>> Ouch!  UTF-8 is too pervasive on desktops and WWW for that to be
>> comfortable.

>> And on a UTF-8 desktop, GNU sed strangles on non-UTF-8 strings:
>> 1250 $ printf 'a\xa7b\n' | sed -E 's/(.)(.)(.)/1 \1  2 \2  3 \3/'
>> sed: RE error: illegal byte sequence
>> 1251 $

> I think the intent is to allow non-ASCII characters that are in
> Latin-1, even though the file is coded in UTF-8. That is, not all Unicode
> characters are allowed, only those that appear in Latin-1.
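
A rough way to test that restriction from the shell, offered only as a sketch:
it assumes an iconv that exits non-zero on unconvertible input (glibc's does),
and the helper name latin1_only is made up for illustration. It is looser than
the Makefile's list, since it also accepts Latin-1 letters, and it is not how
the tz Makefile performs its check.

```shell
# Hypothetical helper: succeed only if stdin is valid UTF-8 and every
# character in it also exists in ISO-8859-1 (Latin-1).
# Assumes iconv exits non-zero when a character cannot be converted.
latin1_only() {
    iconv -f UTF-8 -t ISO-8859-1 >/dev/null 2>&1
}

# U+00A7 SECTION SIGN is the UTF-8 bytes C2 A7 (octal 302 247): in Latin-1.
if printf 'commentary with \302\247\n' | latin1_only; then
    echo "ok: only Latin-1 characters"
fi

# U+20AC EURO SIGN is the UTF-8 bytes E2 82 AC: not in Latin-1.
if ! printf 'commentary with \342\202\254\n' | latin1_only; then
    echo "rejected: contains a character outside Latin-1"
fi
```

Octal escapes are used because printf '\xHH' is not portable across shells.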

Nitpick: the ordinal indicators are category Lo (Letter, other), like letters in
non-Latin scripts, and the micro sign is Ll (Letter, lowercase), like Western
letters, so they match [[:alpha:]], not [[:punct:]]:

$ man iso-8859-1 | grep '\s[[:alpha:]]\s' | head -3
        252   170   AA     ª     FEMININE ORDINAL INDICATOR
        265   181   B5     µ     MICRO SIGN
        272   186   BA     º     MASCULINE ORDINAL INDICATOR
$ grep -ah 'ORDINAL\|MICRO SIGN' unicode-symbols.txt \
µ  U+00B5   MICRO SIGN
00AA;FEMININE ORDINAL INDICATOR;Lo;0;L;<super> 0061;;;;N;;;;;
00B5;MICRO SIGN;Ll;0;L;<compat> 03BC;;;;N;;;039C;;039C
00BA;MASCULINE ORDINAL INDICATOR;Lo;0;L;<super> 006F;;;;N;;;;;
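
For reading those semicolon-separated records, the third field is the general
category. A small sketch that pulls it out with awk follows; the function name
unicode_category is just for illustration, and the input is the three records
quoted above rather than a real data file.

```shell
# Hypothetical helper: print code point, general category, and name
# from UnicodeData.txt-style records read on stdin.
# Fields are separated by ';': 1 = code point, 2 = name, 3 = category.
unicode_category() {
    awk -F';' '{ print $1, $3, $2 }'
}

unicode_category <<'EOF'
00AA;FEMININE ORDINAL INDICATOR;Lo;0;L;<super> 0061;;;;N;;;;;
00B5;MICRO SIGN;Ll;0;L;<compat> 03BC;;;;N;;;039C;;039C
00BA;MASCULINE ORDINAL INDICATOR;Lo;0;L;<super> 006F;;;;N;;;;;
EOF
```

Lo and Ll are both letter categories, which is why these three are treated as
alphabetic rather than punctuation.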

Take care. Thanks, Brian Inglis			Calgary, Alberta, Canada

La perfection est atteinte			Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter	not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer	but when there is no more to cut
			-- Antoine de Saint-Exupéry
