[vip] Suggested meta-questions to think about

Cary Karp ck at nic.museum
Fri Jun 24 13:55:53 UTC 2011


Quoting Vladimir:

> Why can't we just say that a variant is whatever the registry wants to
> be a variant? The registry only needs to define a unique way of finding
> out whether String1 is indeed a variant of String2 or not.

I suspect that we are going to need to spend a bit of time sorting out
the relationship between this study and the ICANN policy development
process. To be sure, the former is intended to inform the latter, but
our contribution is the objective consideration and description of
issues attaching to "variance" as we end up defining that concept.
Policy itself is, however, not made here.

Notwithstanding, a look at policies already in effect -- many of which
are well entrenched -- might lend some useful focus to our effort. Here
is a snippet from the narrative adjunct to the Latin script table
provided jointly by .SE (the host organization for the Latin study) and
.MUSEUM:

"The repertoire supports numerous languages written using the Latin
alphabet and is intended to permit the representation of names derived
from European languages, using their native orthographies to the fullest
extent possible. There is, however, neither a requirement nor an
expectation that a label in a domain name will correspond to a proper
name or dictionary word in any language, and many labels deliberately do
not have any such attributes.

There is therefore no basis for determining the extent to which any
word-based restrictions or other language-specific orthographic
conventions can be applied here and, in consequence, all registration
policies are script based. Any permissible character may appear at any
point in a string, with the exception of digits and the hyphen, which
may not be in the initial or final positions in a label. [The positional
constraint on digits has since been rescinded.] The holder of an IDN is
responsible for the orthographic rigor of any proper words or names used
as labels. Each representation of a label in an alternative orthography
requires separate registration. For example, the prospective holder of
the label 'lättöl' is free to register the correlate 'lattol', without
either form imposing any restriction on the availability of the other,
or on any further variants using the more than twenty diacritically
marked forms of the base 'a' in the Unicode chart, or the similar number
of marked forms of 'o'.

This also applies to marked or ligated characters that can alternately
be represented as digraphs. It is again up to the prospective name
holder to make an individual determination as to whether or not there is
an equivalence between an umlauted 'ä', and an 'ae' digraph or an 'æ'
ligature, or if the 'ä' can acceptably also be indicated with an 'a'.
Even if lexicographic rules might be contemplated for reducing the
inherent ambiguity, their automated implementation would easily be
stymied by reasonable differences between the representations of both
proper names and dictionary words: 'encyclopaedia' and 'encyclopædia'
could be treated as identical, but 'mueller' and 'müller' cannot, and
'öresund' and 'øresund' can be argued either way."
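
To make the difficulty concrete, here is a minimal sketch of one
mechanical approach: stripping combining marks after Unicode canonical
decomposition. It assumes Python's standard unicodedata module. The
function name fold_diacritics is hypothetical and is not part of any
registry's policy or tooling. It illustrates the problem rather than
proposing a variant rule.

```python
import unicodedata


def fold_diacritics(label: str) -> str:
    """Strip combining marks after canonical decomposition (NFD).

    Hypothetical helper for illustration only; no registry is known
    to define variants this way.
    """
    decomposed = unicodedata.normalize("NFD", label)
    # Drop every code point that is a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The umlauted letters decompose into a base letter plus a combining mark,
# so the marked form folds to the unmarked one.
print(fold_diacritics("lättöl"))        # lattol

# The same rule yields 'muller', never the digraph form 'mueller'.
print(fold_diacritics("müller"))        # muller

# 'ø' and 'æ' have no canonical decomposition, so they pass through as-is.
print(fold_diacritics("øresund"))       # øresund
print(fold_diacritics("encyclopædia"))  # encyclopædia
```

Decomposition alone therefore relates 'lättöl' to 'lattol' but says
nothing about 'ae', 'æ', or 'ø'. Any rule covering those cases would
have to be added by hand, which is where the language-specific
disagreements described above come in.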

/Cary

