[tz] TZ file comments UTF-8? Bastardized HTML? (was Re: Busingen revisited)

Tue Jan 15 01:17:42 UTC 2013

On 14 January 2013 18:50, gunther Vermeir wrote:

> >> # History of legal time in Britain
> >> # http://student.cusu.cam.ac.uk/~jsm28/british-time/.
> >
> > Well is that punctuation at the end of your URL part of the URL or
> > not?  It's fairly obviously not in that case, but not always so.  The
> > brackets help to delimit the URL.
> >
> Thunderbird seems to think it's not part :)
> It is for humans in any case; maybe simply "require" that a URL
> always be surrounded by 2 spaces (U+0020)?

+1 on "comments are for humans"; unless and until we push the comments out
into standalone documents (which I don't see as likely), we don't
particularly need to use any markup in the official database for URLs.

Honestly, all but the shortest URLs are long enough to be given their own
line anyway, so why not just do that for all of them?  It's about as
unambiguous as one can reasonably get while keeping things simple for
readability's sake, and you can name links (for those that need names) by
just putting the title followed by a colon (:) on the previous line(s).
Maybe indent the URL with a couple of spaces for stylistic/parsing reasons,
but nothing fancier than that.
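
For illustration only, here is a minimal Python sketch of how a script might
read such links back out of comment lines. It assumes a convention that is
only a suggestion here and not anything the database currently follows: each
URL sits alone on its own comment line, and an optional title ending in a
colon sits on the single comment line directly above it. The function name
extract_links and both regular expressions are hypothetical, and multi-line
titles are not handled.

```python
import re

# Assumed convention (a suggestion, not an existing tz rule):
#   # Some title:
#   #   http://host/path
URL_LINE = re.compile(r"^#\s+(https?://\S+)$")
TITLE_LINE = re.compile(r"^#\s*(.*\S)\s*:$")


def extract_links(lines):
    """Return (title, url) pairs; title is None when no title line precedes the URL."""
    links = []
    previous = None
    for raw in lines:
        line = raw.rstrip()  # drop the newline and any trailing whitespace
        url_match = URL_LINE.match(line)
        if url_match:
            title = None
            if previous is not None:
                title_match = TITLE_LINE.match(previous)
                if title_match:
                    title = title_match.group(1)
            links.append((title, url_match.group(1)))
        previous = line
    return links


if __name__ == "__main__":
    sample = [
        "# History of legal time in Britain:",
        "#   http://student.cusu.cam.ac.uk/~jsm28/british-time/",
    ]
    for title, url in extract_links(sample):
        print(title, "->", url)
```

Because the URL is the only thing on its line, nothing has to guess whether
trailing punctuation belongs to it.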

On 14 January 2013 18:50, gunther Vermeir wrote:

> +1, also for UTF-8 (no BOM) in comments only without the HTML entities /
> tags etc etc and non-comment parts of the files to be restricted to
> plain-old ASCII until there is a compelling need.

+1 on UTF-8 for comments, ASCII elsewhere, if it's feasibly implementable.
I'm not certain how one would actually go about supporting that,
but such an approach would allow each user to choose how to handle and/or
view the non-ASCII characters in comments while not affecting how the
data/code itself is interpreted... again, unless and until there's need to
support UTF-8 there, too.
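
One way the "UTF-8 in comments, ASCII elsewhere" rule might at least be
checked is sketched below in Python. This is not an existing tz tool. The
function name check_tz_encoding is hypothetical, and the sketch makes the
simplifying assumption that the first "#" on a line always starts a comment,
ignoring any quoting rules the real file format may have.

```python
def check_tz_encoding(data: bytes):
    """Return a list of (line_number, message) problems found in raw file bytes."""
    problems = []
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append((1, "file starts with a UTF-8 BOM"))
    # Splitting on raw bytes is safe: UTF-8 multi-byte sequences never
    # contain the ASCII bytes for newline or '#'.
    for number, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            problems.append((number, "line is not valid UTF-8"))
            continue
        # Assumption: everything before the first '#' is data, the rest is comment.
        code = raw.split(b"#", 1)[0]
        if not code.isascii():
            problems.append((number, "non-ASCII byte outside a comment"))
    return problems


if __name__ == "__main__":
    # Illustrative input only, not real database content.
    text = "# Büsingen is fine in a comment\nZone\tEurope/Büsingen\t1:00\n"
    for number, message in check_tz_encoding(text.encode("utf-8")):
        print(f"line {number}: {message}")
```

A check like this leaves the comment text itself alone, so each user's tools
can still display or transliterate the non-ASCII characters however they
prefer.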

Tim Parenti
