[tz] Simplification and unification of scheme:// anchors
abbotti at mev.co.uk
Wed Jan 30 17:27:04 UTC 2013
On 2013-01-30 14:02, Steffen Daode Nurpmeso wrote:
> Ian Abbott <abbotti at mev.co.uk> wrote:
> |On 2013-01-30 11:28, Steffen Daode Nurpmeso wrote:
> |> Ian Abbott <abbotti at mev.co.uk> wrote:
> |>|While on the subject, the backslash escapes at the ends of the lines
> |>|with a <URL> with a parenthesised comment on the following line is kind
> |>|of ugly. I'm sure it must be possible to re-work your script to avoid
> |>|the need for that. (I.e. if a line ends with a <URL> plus optional
> |>|whitespace, check if the following line starts with optional whitespace
> |>|plus parenthesised link text.)
> |> Hmm.
> |> So i've reworked the (Pod-less) script to support multiple follow
> |> lines in the middle of nowhere, and changed the two links from
> |> which i remembered that it did matter.
> |> This updated version also fixes the "trailing empty line after
> |> rules are included in data boxes" issue.
> |> And it uses normal text paragraphs for the comment text, forcing
> |> newline breaks via <br />, instead of using preformatted text for
> |> that, which makes it even nicer, since some of the dramatically
> |> long links will now be wrapped by browsers.
> |Self closing tags such as <br /> are only legal in xhtml, not plain
> |html, so you'll need to output an XML declaration and a DOCTYPE in your
> That is indeed a good point, it must be '<br>'.
That depends what DOCTYPE you decide to use.
There are various other things wrong with the output, such as '&', '<'
and '>' not being turned into the entities '&amp;', '&lt;' and '&gt;'.
Note that if doing that, you'd need to make sure not to convert the
existing entities such as '&aacute;' into '&amp;aacute;'. That would be
easier if the existing HTML entities were converted to UTF-8 sequences
(There are also a few odd-ball bits of mark-up in the original text,
such as <e'> which need to be dealt with by a separate patch to the data
files, e.g. to replace <e'> with the HTML entity &eacute; or by the
UTF-8 sequence é if going down the UTF-8 road.)
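
As a rough illustration of the "decode first, then escape" order, here is a
minimal sketch in Python (not the language of the script under discussion,
as far as I know). The function name normalise_text is made up for this
example; html.unescape and html.escape are from the Python standard library.
It assumes the input is already a decoded text string and that the output
page is served as UTF-8.

```python
import html


def normalise_text(text):
    """Escape '&', '<' and '>' without double-escaping existing entities.

    Hypothetical helper: existing entities such as '&aacute;' are first
    decoded to the characters they stand for, then the three markup
    characters are escaped exactly once.
    """
    decoded = html.unescape(text)
    # quote=False leaves single and double quotes alone.
    return html.escape(decoded, quote=False)


if __name__ == "__main__":
    print(normalise_text("AT&T &aacute; <x>"))
```

One caveat with this approach: anything in the data that merely looks like an
entity would also be decoded, so the result is worth checking by eye.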
Also, validator.w3.org is your friend!
> |> # For more about the first ten years of DST in the United States, see
> |> # Robert Garland's <http://www.clpgh.org/exhibit/dst.html> \
> |> -# (``Ten years of daylight saving from the Pittsburgh standpoint'', \
> |> -# Carnegie Library of Pittsburgh, 1927).
> |> +# (``Ten years of daylight saving from the Pittsburgh standpoint'', \
> |> +# Carnegie Library of Pittsburgh, 1927).
> |It would still be great to get rid of the backslash line continuations
> |and modify the script to work without them.
> I personally like it explicit and would definitely go for the L<><>
> syntax i've used first, since it is completely unambiguous.
There's also the MediaWiki style for external links, e.g.:
[http://www.foobar.org/baz.html Meaningful link text]
which is not too unreadable, but less readable than having the
Meaningful link text in parentheses. For long URLs, it might be split
Meaningful link text]
which should be fine as long as the Meaningful link text contains no ']'
characters (or at least no unmatched ']' characters if matched pairs of
'[' and ']' are to be allowed).
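
To show that this form is easy to recognise mechanically, here is a small
Python sketch covering only the simple case (no ']' at all inside the link
text). The names EXTERNAL_LINK and parse_external_links are invented for the
example, and it assumes any leading '#' comment markers have already been
stripped from the text.

```python
import re

# "[" URL, whitespace (possibly a line break), link text without "]", "]".
EXTERNAL_LINK = re.compile(r"\[(\w+://[^\s\]]+)\s+([^\]]+)\]")


def parse_external_links(text):
    """Return (url, link_text) pairs for MediaWiki-style external links.

    Whitespace inside the link text, including line breaks, is collapsed
    to single spaces.
    """
    return [
        (url, " ".join(label.split()))
        for url, label in EXTERNAL_LINK.findall(text)
    ]


if __name__ == "__main__":
    sample = (
        "See [http://www.foobar.org/baz.html Meaningful link text] and\n"
        "[http://www.foobar.org/baz.html\n"
        "Meaningful link text]"
    )
    for url, label in parse_external_links(sample):
        print(url, "->", label)
```

Allowing matched pairs of '[' and ']' inside the link text would need a small
bracket counter rather than a single regular expression.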
> I would also spend some more time and convert the many "headlines" that
> yet exist in the comments to enough markup to get to something
> real; in fact with not that much effort, maybe a weekend, it would
> be possible to adjust the comments so that the script could use
> indents, lists and normal paragraphs without any <br> at all;
> then the Pod-way (and many others, too) could be pursued,
> also leading to cross-referenced PDF output -- and that is
> something that would surely be interesting for some people, as
> i suppose.
It depends how much mark-up people are willing to put up with in the
tzdata files, but I suspect not very much, if any! The primary method
for viewing the tzdata files should be the plain text originals, not the
output from some fancy converter.
> But the idea you proposed won't work with git(1), since trailing
> whitespace is a no-go; right?
You shouldn't need trailing anything, right? If the line ends with a URL,
see if the next line(s) contain the start of the link text before you
decide to output the <br /> or whatever.
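
Roughly what I have in mind, sketched in Python rather than in the language
of your script. All names here (URL_AT_END, COMMENT_PREFIX, find_links,
collect_link_text) are made up for the illustration, and it assumes comment
lines start with '#' and that the link text is a balanced parenthesised
group.

```python
import re

# A line that ends with <scheme://...> plus optional whitespace.
URL_AT_END = re.compile(r"<(\w+://[^<>\s]+)>\s*$")
# Leading comment marker and surrounding whitespace of a comment line.
COMMENT_PREFIX = re.compile(r"^\s*#?\s*")


def collect_link_text(following):
    """Return the parenthesised text that starts on the first line given.

    Returns None if that line does not start with '(' or if the
    parentheses never balance.
    """
    text = ""
    depth = 0
    for line in following:
        body = COMMENT_PREFIX.sub("", line.rstrip(), count=1)
        if not text and not body.startswith("("):
            return None
        for ch in body:
            text += ch
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    return text[1:-1]  # drop the outer parentheses
        text += " "  # a line break inside the link text becomes a space
    return None


def find_links(lines):
    """Return (url, link_text) pairs; link_text is None if none follows."""
    links = []
    for i, line in enumerate(lines):
        match = URL_AT_END.search(line)
        if match:
            links.append((match.group(1), collect_link_text(lines[i + 1:])))
    return links
```

With no backslashes and no trailing whitespace in the input, the Garland
reference in the data would be picked up as one URL with its two-line link
text, and a URL followed by ordinary text would come back with None, which
is the point at which the line break can be output.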
-=( Ian Abbott @ MEV Ltd. E-mail: <abbotti at mev.co.uk> )=-
-=( Tel: +44 (0)161 477 1898 FAX: +44 (0)161 718 3587 )=-