[tz] Corrupt output (was: Re: [tz-announce] 2020e release of tz code and data available)

Brian Inglis Brian.Inglis at SystematicSw.ab.ca
Wed Dec 23 19:10:34 UTC 2020


Debian and Cygwin also agree using awk or gawk:

$ awk -v DATAFORM=rearguard -f ziguard.awk africa | grep '^ '
   Interpretation Ordinance (Cap 2)
   1919, s. 2.)"
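
For anyone wanting to see which tests the two lines trip, here is a rough sketch rather than the real script. It copies only the three sub-conditions of Zone_using_Namibia_rule from the ziguard.awk excerpt quoted below, and it hard-codes zone to "Africa/Windhoek" on the assumption that this was the last Zone line seen. The names check_lines, name, year and width are made up for the demo.

```shell
# Sketch only: a cut-down stand-in for the Zone_using_Namibia_rule test,
# run against the two comment lines that lose their "#".
check_lines() {
  LC_ALL=C awk '
  BEGIN { zone = "Africa/Windhoek" }  # assumed: last Zone line seen
  {
    in_comment = /^#/
    # The three alternatives from the quoted rule, evaluated separately.
    name  = ($(in_comment + 2) == "Namibia")
    year  = (1994 <= $(in_comment + 4) && $(in_comment + 4) <= 2017)
    width = (in_comment + 3 == NF)
    printf "NF=%d name=%d year=%d width=%d\n", NF, name, year, width
  }'
}

lines='# Interpretation Ordinance (Cap 2)
# 1919, s. 2.)"'
out=$(printf '%s\n' "$lines" | check_lines)
printf '%s\n' "$out"
```

In the C locale this should print "NF=5 name=0 year=1 width=0" and then "NF=4 name=0 year=0 width=1". The second line has exactly in_comment + 3 fields, as Debbie guessed. The first appears to get through the year-range test: its fifth field is "2)", which is not numeric, so awk compares it as a string, and "2)" happens to sort between "1994" and "2017".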

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]


On 2020-12-23 00:06, Deborah Goldsmith via tz wrote:
> I verified that introducing a rule to set zone to empty when encountering “End of rearguard section” fixes the problem, and does not introduce any other changes to the output.
> 
> Debbie
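
A minimal sketch of that kind of guard, under some loud assumptions: this is not the actual change to ziguard.awk, the Zone line in the sample input is illustrative rather than copied from the africa file, the marker is assumed to appear as a comment containing "End of rearguard section", and only the field-count test from the quoted rule is modelled. The names fix_demo, guard and hit are made up.

```shell
# Sketch only: show the effect of clearing "zone" at the marker comment.
# $1 is 1 to enable the proposed guard, 0 to leave it off.
fix_demo() {
  LC_ALL=C awk -v guard="$1" '
  /^Zone/ { zone = $2 }
  # Proposed guard: forget the current zone once the marker comment is seen.
  /End of rearguard section/ && guard == 1 { zone = "" }
  {
    in_comment = /^#/
    # Simplified stand-in for Zone_using_Namibia_rule (field-count test only).
    hit = (zone == "Africa/Windhoek" && in_comment + 3 == NF)
    label = hit ? "match" : "skip"
    print label ": " $0
  }'
}

# Illustrative input, not copied from the africa file.
input='Zone Africa/Windhoek 1:08:24 - LMT 1892 Feb 8
# End of rearguard section.
# 1919, s. 2.)"'

before=$(printf '%s\n' "$input" | fix_demo 0)
after=$(printf '%s\n' "$input" | fix_demo 1)
printf 'without guard:\n%s\nwith guard:\n%s\n' "$before" "$after"
```

Without the guard the last comment line is reported as a match. With it, zone is already empty by the time that line is read, so nothing matches.
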
> 
>> On Dec 22, 2020, at 10:59 PM, Deborah Goldsmith via tz <tz at iana.org> wrote:
>>
>> OK, I think I (mostly) figured it out. On Darwin (macOS) the default value of FS is “ ” (space).
>>
>> For these two lines:
>> # Interpretation Ordinance (Cap 2)
>> # 1919, s. 2.)"
>>
>> I believe the second one is hitting this rule:
>> 	   || in_comment + 3 == NF))
>> and I’m not sure which rule the first one is hitting.
>>
>> I think a safe fix for this issue (and future such issues) might be to check for “End of rearguard section.” and if found, set the variable zone to an empty string. However, I’ll leave this to more experienced awk aficionados. This seems like a pretty dangerous set of awk rules to leave active over a large section of the input.
>>
>> I suspect that these failures will occur on any system, not just Darwin, but I don’t have access to a non-Darwin system with a working awk at the moment.
>>
>> Debbie
>>
>>> On Dec 22, 2020, at 10:10 PM, Deborah Goldsmith via tz <tz at iana.org> wrote:
>>>
>>> I think I’m getting closer.
>>>
>>> In the africa input file, the Zone directive most nearly preceding the new comment block is Africa/Windhoek, so the “zone” variable contains that. Meanwhile, ziguard.awk contains this:
>>>
>>> # If this line should differ due to Namibia using negative SAVE values,
>>> # uncomment the desired version and comment out the undesired one.
>>> Rule_Namibia = /^#?Rule[\t ]+Namibia[\t ]/
>>> Zone_using_Namibia_rule \
>>>    = (zone == "Africa/Windhoek" \
>>>       && ($(in_comment + 2) == "Namibia" \
>>> 	   || (1994 <= $(in_comment + 4) && $(in_comment + 4) <= 2017) \
>>> 	   || in_comment + 3 == NF))
>>> if (Rule_Namibia || Zone_using_Namibia_rule) {
>>>      if ((Rule_Namibia \
>>> 	   ? ($(in_comment + 9) ~ /^-/ \
>>> 	      || ($(in_comment + 9) == 0 && $(in_comment + 10) == "CAT")) \
>>> 	   : $(in_comment + 1) == "2:00" && $(in_comment + 2) == "Namibia") \
>>> 	  == vanguard) {
>>>      uncomment = in_comment
>>>    } else {
>>>      comment_out = !in_comment
>>>    }
>>> }
>>>
>>> So these rules are in effect while the comment block is being processed. I tried reproducing this on a Raspberry Pi, but on Raspbian awk crashes with a malloc heap corruption, so no luck there:
>>>
>>> pi@raspberrypi:~/source/tz $ make rearguard_tarballs
>>> awk -v DATAFORM=`expr main.zi : '\(.*\).zi'` -f ziguard.awk \
>>> 	  africa antarctica asia australasia europe northamerica southamerica etcetera factory backward  >main.zi.out
>>> malloc(): unsorted double linked list corrupted
>>> Aborted
>>> make: *** [Makefile:604: main.zi] Error 134
>>>
>>> It’s not clear to me why these rules would match the comment lines that are being altered, but this seems like where it must be coming from. In the previous commit, a new Zone directive for Africa/Lagos followed almost immediately, which would change the value of the zone variable. With very few comment lines between Zone Africa/Windhoek and Zone Africa/Lagos, these awk rules had never been tested on a large comment block.
>>>
>>> Debbie
>>>
>>>> On Dec 22, 2020, at 9:48 PM, Deborah Goldsmith via tz <tz at iana.org> wrote:
>>>>
>>>> In the 2020e release, the build process produces a rearguard zic file for africa that does not compile. This is because the rearguard.zi file has that same grammatical problem: missing comment markers.
>>>>
>>>> source africa file:
>>>>
>>>> # In 1919, standard time was changed to GMT+1.
>>>> # Interpretation Ordinance (Cap 2)
>>>> # The Laws of Nigeria, Containing the Ordinances of Nigeria, in Force on the
>>>> # 1st Day of January, 1923, Vol.I [p 16]
>>>> # https://books.google.com/books?id=BOMrAQAAMAAJ&pg=PA16
>>>> # "The expression 'Standard time' means standard time as used in Nigeria:
>>>> # namely, 60 minutes in advance of Greenwich mean time.  (As amended by 18 of
>>>> # 1919, s. 2.)"
>>>> # From Tim Parenti (2020-12-10):
>>>>
>>>> output rearguard.zi (and africa zic file):
>>>>
>>>> # In 1919, standard time was changed to GMT+1.
>>>> Interpretation Ordinance (Cap 2)
>>>> # The Laws of Nigeria, Containing the Ordinances of Nigeria, in Force on the
>>>> # 1st Day of January, 1923, Vol.I [p 16]
>>>> # https://books.google.com/books?id=BOMrAQAAMAAJ&pg=PA16
>>>> # "The expression 'Standard time' means standard time as used in Nigeria:
>>>> # namely, 60 minutes in advance of Greenwich mean time.  (As amended by 18 of
>>>> 1919, s. 2.)"
>>>> # From Tim Parenti (2020-12-10):
>>>>
>>>> This happens on macOS; I don’t know if it happens on other platforms. It looks like something is going awry in ziguard.awk. The only thing I can think of is that the africa file in the source repository has a big new comment block from Tim Parenti (commit 316c1598e166e15c27fe611cacd81aeada2a836d) and the problem is occurring inside that. Apparently there’s something in that comment block that ziguard.awk is mistaking for zone rules that it needs to change. I’m trying to see if I can reproduce this on a non-Apple system.
>>>>
>>>> Debbie