[tz] Converting TZ DB files to perl(1) POD

Steffen Daode Nurpmeso sdaoden at gmail.com
Sun Jan 27 01:05:03 UTC 2013


Steffen "Daode" Nurpmeso <sdaoden at gmail.com> wrote:
 |And then there is the script, rewritten to comply with the new
 |syntax.  And it produces much nicer HTML output, it is oh so
 |pretty now.
 |
 |Of course checking still works, and is also more sophisticated
 |than before.

And then there is the idea that Russ Allbery brought up,
converting the data to POD.
I've not finished that yet, and it's 2 o'clock in the morning ;/.
There are at least three unresolved problems.

The first is that we must not lose the original formatting of
plain text comments; we cannot simply indent them, because indented
(verbatim) text does not get L<> etc. expanded.
The second is that the patch does not yet expand entities.
The third is that, for this to be really neat, we would have to
inject =head lines.
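
A minimal sketch of how the second problem might be approached,
assuming that "expanding entities" means escaping literal "<" and ">"
as E<lt> and E<gt> before any L<> codes are generated.  The helpers
pod_escape() and pod_link() are hypothetical and not part of workht.pl:

```perl
use strict;
use warnings;

# Hypothetical helper (not in workht.pl): escape the characters that
# Pod could otherwise read as part of a formatting code.  A single
# pass ensures the generated E<lt>/E<gt> are not escaped again.
sub pod_escape {
   my ($s) = @_;
   $s =~ s/([<>])/$1 eq '<' ? 'E<lt>' : 'E<gt>'/ge;
   return $s;
}

# Hypothetical helper: build an L<> code.  Inside the link text,
# "|" and "/" are escaped as well, since they are delimiters there.
sub pod_link {
   my ($url, $text) = @_;
   return 'L<' . $url . '>' unless defined $text && length $text;
   $text = pod_escape($text);
   $text =~ s/\|/E<verbar>/g;
   $text =~ s/\//E<sol>/g;
   return 'L<' . $text . '|' . $url . '>';
}

print pod_escape('if a < b and c > d'), "\n";
print pod_link('http://www.iana.org/time-zones', 'IANA <tz> page'), "\n";
```

In such a scheme the escaping would have to run on the raw comment
text before the URLs are wrapped in L<>, otherwise the generated codes
would be escaped, too.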

I'll look at that at a later time, when my small mailer is really
ready :-/, but I'm thankful for any suggestions, of course.

Thanks and ciao

--steffen

From deb0b0ec9f1f31c837deb38969eb7034a7b2d100 Mon Sep 17 00:00:00 2001
Message-Id: <deb0b0ec9f1f31c837deb38969eb7034a7b2d100.1359248654.git.sdaoden at gmail.com>
From: "Steffen \"Daode\" Nurpmeso" <sdaoden at gmail.com>
Date: Sun, 27 Jan 2013 01:00:22 +0100
Subject: [PATCH] workht.pl: intermediate POD mode that does not work

---
 workht.pl |   70 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/workht.pl b/workht.pl
index 56ee83c..c13e665 100644
--- a/workht.pl
+++ b/workht.pl
@@ -4,7 +4,11 @@ require 5.008_001;
 #@ Public domain, 2013, Steffen Nurpmeso.
 #@ Synopsis:
 #@    workht.pl html   < DATA_FILE | elinks -force-html -dump 1
+#@    workht.pl pod    < DATA_FILE | pod2XY
+#@    workht.pl newpod < DATA_FILE | pod2XY
 #@    workht.pl check  < DATA_FILE > NEW_DATA_FILE
+#@ The *pod* and *newpod* modes produce perl(1) Pod output that can be
+#@ converted using any STDIN-aware pod2XY parser; see below for the difference.
 #@ The *check* mode requires an installed curl(1) (<http://curl.haxx.se>);
 #@ Input data notes:
 #@ - Only comment lines (\s*#) are recognized.
@@ -17,8 +21,8 @@ require 5.008_001;
 #@   work.)
 #@ - A link may be followed by WS and a link text in parenthesis ('\([^)]*?\)');
 #@   If no link text exists, the URL is used as the link content, too.
-#@   Note this only works in *html* mode, otherwise it'll always be the URL,
-#@   and the text in parenthesis will be left as is.
+#@   Note this only works in the *html* and *newpod* modes, otherwise it'll
+#@   always be the URL, and the text in parenthesis will be left as is.
 #@ - A link may also be followed by WS, a backslash and a LF ('\s*\\$'),
 #@   in which case the link text in parenthesis may be placed on the very next
 #@   line.
@@ -69,6 +73,8 @@ sub main_fun {
       usage($EX_NOINPUT);
    }
    mode_html() if $ARGV[0] eq 'html';
+   mode_pod(0) if $ARGV[0] eq 'pod';
+   mode_pod(1) if $ARGV[0] eq 'newpod';
    mode_check() if $ARGV[0] eq 'check';
    usage($EX_USAGE);
 }
@@ -77,9 +83,15 @@ sub usage {
    print STDERR <<__EOT__;
 Synopsis:
    workht.pl html   < DATA_FILE | elinks -force-html -dump 1
+   workht.pl pod    < DATA_FILE | pod2XY
+   workht.pl newpod < DATA_FILE | pod2XY
    workht.pl check  < DATA_FILE > NEW_DATA_FILE
 
 The *html* mode generates a very simple HTML page with hyperlinks.
+The *pod* and *newpod* modes produce perl(1) Pod output that can be
+converted using any STDIN-aware pod2XY parser, e.g., pod2text;
+the difference between them is that *newpod* produces L<TEXT|URL>
+markup, whereas *pod* uses the backward compatible L<URL> form only.
 The *check* mode requires an installed curl(1) (<http://curl.haxx.se>).
 __EOT__
 
@@ -153,6 +165,60 @@ __EOT__
    exit($ESTAT)
 }
 
+sub mode_pod {
+   my $newpod = $_[0];
+
+   Line::parse_input();
+
+   die unless print "=head1 IANA TZ database file\n\n";
+
+   my ($lnl, $mode) = (1, 0);
+   while (defined(my $lo = shift @$INPUT)) {
+      if (! $lo->{ISCOMM}) {
+         if ($lo->{DATA} !~ /^\s*$/) {
+            if ($mode != 1) {
+               $mode = 1;
+               if (! $lnl) {
+                  die unless print "\n";
+               }
+            }
+            die unless print "\t";
+            $lnl = 0;
+         } else {
+            $lnl = 1;
+         }
+         die unless print $lo->{DATA}, "\n";
+         next;
+      }
+      if ($mode) {
+         $mode = 0;
+         if (! $lnl) {
+            die unless print "\n";
+         }
+      }
+      $lnl = 0;
+
+      my ($l, $rest) = ('', substr $lo->{DATA}, $lo->{ISCOMM});
+      Line::join_follow(\$lo, \$rest, $INPUT) if $lo->{FOLLOW};
+
+      while ($rest =~ $SCHEME_URL) {
+         $l .= $1 ? $1 : '';
+         $rest = $3;
+         my $url = $2;
+         my $text;
+         if ($newpod && $rest =~ $SCHEME_TEXT) {
+            $rest = $2;
+            $l .= 'L<' . $1 . '|' . $url . '>';
+         } else {
+            $l .= 'L<' . $url . '>';
+         }
+      }
+      $l .= $rest if $rest;
+      die unless print $l, "\n";
+   }
+   exit($ESTAT)
+}
+
 sub mode_check {
    Line::parse_input();
 
-- 
1.7.9.rc2.1.g69204




