[tz] Simplification and unification of scheme:// anchors
Ian Abbott
abbotti at mev.co.uk
Wed Jan 30 17:27:04 UTC 2013
On 2013-01-30 14:02, Steffen Daode Nurpmeso wrote:
> Ian Abbott <abbotti at mev.co.uk> wrote:
> |On 2013-01-30 11:28, Steffen Daode Nurpmeso wrote:
> |> Ian Abbott <abbotti at mev.co.uk> wrote:
> |>|While on the subject, the backslash escapes at the ends of the lines
> |>|with a <URL> with a parenthesised comment on the following line is kind
> |>|of ugly. I'm sure it must be possible to re-work your script to avoid
> |>|the need for that. (I.e. if a line ends with a <URL> plus optional
> |>|whitespace, check if the following line starts with optional whitespace
> |>|plus parenthesised link text.)
> |>
> |> Hmm.
> |> So i've reworked the (Pod-less) script to support multiple follow
> |> lines in the middle of nowhere, and changed the two links from
> |> which i remembered that it did matter.
> |>
> |> This updated version also fixes the "trailing empty line after
> |> rules are included in data boxes" issue.
> |> And it uses normal text paragraphs for the comment text, forcing
> |> newline breaks via <br />, instead of using preformatted text for
> |> that, which makes it even nicer, since some of the dramatically
> |> long links will now be wrapped by browsers.
> |
> |Self closing tags such as <br /> are only legal in xhtml, not plain
> |html, so you'll need to output a XML declaration and a DOCTYPE in your
> |script.
>
> That is indeed a good point, it must be '<br>'.
That depends what DOCTYPE you decide to use.
There are various other things wrong with the output, such as '&', '<'
and '>' not being turned into the entities '&amp;', '&lt;' and '&gt;'.
Note that if doing that, you'd need to make sure not to convert the
existing entities such as '&aacute;' into '&amp;aacute;'. That would be
easier if the existing HTML entities were converted to UTF-8 sequences
first!
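A minimal sketch of that order of operations, in Python purely for illustration (the converter script itself is not shown in this thread, and the function name escape_comment_text is hypothetical). It decodes any existing entities to real characters first, then escapes '&', '<' and '>' exactly once:

```python
import html


def escape_comment_text(text: str) -> str:
    """Escape '&', '<' and '>' without double-escaping existing entities."""
    # Step 1: turn existing entities such as '&aacute;' into the real
    # character, so that no entity is left for step 2 to mangle.
    decoded = html.unescape(text)
    # Step 2: escape the three special characters once. quote=False leaves
    # quotation marks alone, which is enough for element content.
    return html.escape(decoded, quote=False)


if __name__ == "__main__":
    print(escape_comment_text("Bogot&aacute; & <URL>"))
```

This assumes the output is then written as UTF-8 and declared as such. Note that html.unescape follows the HTML5 rules, so a few legacy entity names are recognised even without the trailing semicolon; the input would need checking for accidental matches.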
(There are also a few odd-ball bits of mark-up in the original text,
such as <e'> which need to be dealt with by a separate patch to the data
files, e.g. to replace <e'> with the HTML entity &eacute; or by the
UTF-8 sequence é if going down the UTF-8 road.)
Also, validator.w3.org is your friend!
> |> # For more about the first ten years of DST in the United States, see
> |> # Robert Garland's <http://www.clpgh.org/exhibit/dst.html> \
> |> -# (``Ten years of daylight saving from the Pittsburgh standpoint'', \
> |. Carnegie Library of Pittsburgh, 1927).
> |> +# (``Ten years of daylight saving from the Pittsburgh standpoint'', \
> |> +# Carnegie Library of Pittsburgh, 1927).
> |
> |It would still be great to get rid of the backslash line continuations
> |and modify the script to work without them.
>
> :)
> I personally like it explicit and would definitely go for the L<><>
> syntax i've used first, since it is completely unambiguous.
There's also the MediaWiki style for external links, e.g.:
[http://www.foobar.org/baz.html Meaningful link text]
which is not too unreadable, but less readable than having the
Meaningful link text in parentheses. For long URLs, it might be split
like this:
[http://www.foobar.org/baz.html
Meaningful link text]
or even:
[http://www.foobar.org/baz.html Meaningful
link
text]
which should be fine as long as the Meaningful link text contains no ']'
characters (or at least no unmatched ']' characters if matched pairs of
'[' and ']' are to be allowed).
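As a rough illustration of how such links could be recognised, here is a small Python sketch. It assumes the simple case only (no ']' at all inside the link text), that any leading '#' comment markers have already been stripped, and that HTML escaping is done separately; the names WIKI_LINK and wiki_links_to_html are made up for this example:

```python
import re

# '[' + URL + whitespace (which may include newlines) + link text + ']'.
# The link text may not contain ']', matching the simple case above.
WIKI_LINK = re.compile(r"\[(\w+://[^\s\]]+)\s+([^\]]+)\]")


def wiki_links_to_html(text: str) -> str:
    """Replace [URL link text] with an HTML anchor."""

    def repl(match: re.Match) -> str:
        url = match.group(1)
        # Collapse line breaks and runs of spaces inside the link text.
        label = " ".join(match.group(2).split())
        return f'<a href="{url}">{label}</a>'

    return WIKI_LINK.sub(repl, text)


if __name__ == "__main__":
    sample = "[http://www.foobar.org/baz.html Meaningful\nlink\ntext]"
    print(wiki_links_to_html(sample))
```

Bracketed text without a scheme:// prefix, such as a footnote marker like [1], is left untouched by this pattern.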
> I would also spend some more time and convert the many "headlines" that
> yet exist in the comments to enough markup to get to something
> real; in fact with not that much effort, maybe a weekend, it would
> be possible to adjust the comments so that the script could use
> indents, lists and normal paragraphs without any <br> at all;
> then the Pod-way (and many others, too) could be pursued,
> also leading to cross-referenced PDF output -- and that is
> something that would surely be interesting for some people, as
> i suppose.
It depends how much mark-up people are willing to put up with in the
tzdata files, but I suspect not very much, if any! The primary method
for viewing the tzdata files should be the plain text originals, not the
output from some fancy converter.
> But the idea you proposed won't work with git(1), since trailing
> whitespace is a no-go; right?
You shouldn't need trailing anything, right? If the line ends with a URL,
see if the next line(s) contains the start of the link text before you
decide to output the <br /> or whatever.
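A sketch of that look-ahead check, in Python for illustration only. It assumes the input is a list of tzdata comment lines, each optionally starting with '#', and that URLs are written as <scheme://...>; the helper names are hypothetical and this is not the actual script:

```python
import re

# A <URL> followed only by optional whitespace at the end of the line.
URL_AT_END = re.compile(r"<(\w+://[^>\s]+)>\s*$")
# Optional whitespace followed by an opening parenthesis at the start.
PAREN_AT_START = re.compile(r"^\s*\(")


def strip_comment(line: str) -> str:
    """Drop a leading '#' comment marker, if there is one."""
    return line[1:] if line.startswith("#") else line


def link_text_follows(lines: list[str], i: int) -> bool:
    """True if line i ends with a <URL> and line i+1 starts with '('."""
    if i + 1 >= len(lines):
        return False
    current = strip_comment(lines[i])
    following = strip_comment(lines[i + 1])
    return bool(URL_AT_END.search(current) and PAREN_AT_START.match(following))


if __name__ == "__main__":
    lines = [
        "# Robert Garland's <http://www.clpgh.org/exhibit/dst.html>",
        "# (``Ten years of daylight saving from the Pittsburgh standpoint'',",
        "# Carnegie Library of Pittsburgh, 1927).",
    ]
    print(link_text_follows(lines, 0))
```

With a check like this, neither a trailing backslash nor trailing whitespace is required; the converter only has to delay its line-break decision by one line.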
--
-=( Ian Abbott @ MEV Ltd. E-mail: <abbotti at mev.co.uk> )=-
-=( Tel: +44 (0)161 477 1898 FAX: +44 (0)161 718 3587 )=-