[gnso-contactinfo-pdp-wg] Wednesday 12 November 23:59 UTC soft deadline for comments

Dillon, Chris c.dillon at ucl.ac.uk
Tue Nov 11 11:57:29 UTC 2014


Dear Emily,

I would like to thank you on behalf of the Group for this large amount of work both in summarizing colleagues’ comments and providing your own.

I hope both that it will make the non-mandatory arguments stronger and stimulate more discussion of the mandatory arguments on this list and in the meetings.

With all best wishes,

Chris.
--
Research Associate in Linguistic Computing, Centre for Digital Humanities, UCL, Gower St, London WC1E 6BT Tel +44 20 7679 1599 (int 31599) www.ucl.ac.uk/dis/people/chrisdillon

From: Emily Taylor [mailto:emily.taylor at netistrar.com]
Sent: 11 November 2014 11:20
To: Dillon, Chris
Cc: Lars Hoffmann; gnso-contactinfo-pdp-wg at icann.org
Subject: Re: [gnso-contactinfo-pdp-wg] Wednesday 12 November 23:59 UTC soft deadline for comments

Dear Chris

Thank you for this timely reminder.  Over the past few days, I have been gathering input from colleagues in the Registrar Stakeholder group.  There was a rich discussion on the list, with many participants.  These are less comments on the paper itself than contributions to the general discussion of the issues.

Here is a synthesis of the comments. I hope that they will be useful in cross-checking against the "arguments opposing mandatory transformation" on pages 11-12:

1. Costs:  This proposal essentially externalises translation costs from LEA/IP to Registrars, and none of the commentators were convinced that the costs for contracted parties are justified by benefits to others.  Those requesting the data can pay for the translation.

2. Scale:  Why translate/transliterate all WHOIS data, rather than simply those names that are of interest on-the-fly?  The status quo is several orders of magnitude more efficient.

3. Accuracy and responsibility: If the premise of  WHOIS data is that it is provided (and declared accurate) by the Registrant, then who accepts responsibility if Registrars are required to alter that data? How would the proposals impact whois data accuracy complaints and whois verification requirements?

4. Data integrity: The whois should be displaying what the client entered.   Our trying to interpret that only leads to more data errors, and less accurate data. If we change what the client enters it will only lead to errors:

a.       Will there be rules on how to transliterate non-ASCII characters so that it can be done programmatically? Is there some standard system to be used, or are we all just counting on Google Translate?

b.       If human judgment is required, who is responsible for doing it?

c.       If the registrant is responsible, what if they do not know what it should be?

d.       What if a third-party disagrees with the accuracy of a transliteration?

e.       Is the registrant’s consent required before a transliteration is published in the whois?

f.       Can a registrant withhold consent?

g.       What if a registrant wants to change an “approved” transliteration?

h.       Is a whois verification required every time one of these transliterated fields is updated?

i.       Where does the requirement for data transformation end? Could Chinese LEA require a contracted party to translate/transliterate existing English contact details into Mandarin? Or, what if the original registration was in a third language/script (Russian Cyrillic), would that skip English and go directly to Chinese?

5.  Compliance: "who will and how will this be policed?"  If ICANN are making cutbacks in their budget, how are they going to afford the human resources to check that every Whois transliteration is correct? It doesn’t make much operational sense, and will likely end up with the registrant paying higher fees for something that they never asked for.

6. Internationalisation: The concept starts to erode the “my language, my Internet” / IDN principle of ICANN, by compelling the use of English/Latin/ASCII by people and locations not using those language/script combinations.  One commentator put it as: "Sadly, it is North American thinking I suspect. 'We must translate everything into English'."

7. Competition: If a contracted party does not want to support a language, that should be their prerogative. They can turn away business if they decide that they won’t be able to service that customer appropriately.

---------------

General comments

Taking into account the above input, I have the following observations to make on the draft paper.

First, thank you Chris and the ICANN team for your work in the unenviable task of fairly summarising the arguments on both sides.  I appreciate that it is an important step in the process to try and understand the arguments on both sides.

A general point: I have no sense from the paper, or from the discussions in the group, of the scale of the problem we are addressing here.  Do we have any stats for the following:

(1) a breakdown of WHOIS data by country of registrant - and can we infer what language WHOIS data is likely to be in?  The nearest I can get to is this map from OII which shows the predominance of Latin script / English language countries in the current domain market (http://geography.oii.ox.ac.uk/?page=geography-of-top-level-domain-names).  However, if you look at growth potential, clearly that is not the case.  And IDN registrations by country show a different pattern (see page 17 at http://www.eurid.eu/files/publ/IDNWorldReport2014_Interactive.pdf).

(2) an estimate of what is likely to be the language of WHOIS data if multiple languages were enabled in these fields.  For example, we could perhaps draw some inferences from the IDN registrations in ASCII TLDs.  Approximately 1% of .com and .net registrations are IDNs, and the majority of those are Latin script.  This may not be representative in that the Latin script ending for .com is more likely to be attractive to Latin script IDNs than, say, right to left scripts or pictograms.  There are currently just shy of 900,000 Russian ccTLD IDNs.  Of these, over 800,000 have a registrant based in Russia, and uptake in other countries is low (even former Soviet Union).  See http://statdom.ru/tld/%D1%80%D1%84/report/summary/. There are approximately 12,000 IDNs in Arabic script ccTLDs.  Uptake of IDN new gTLDs has been fairly limited.  I don't think that anyone is claiming that the IDN market has even nearly fulfilled its market potential, but can we have some statement of the scale of the problem?

(3) Do we have a sense of how many WHOIS look-ups are performed by law enforcement and IP interests, what percentage that represents of all WHOIS look-ups, and how many prove to be problematic in terms of language of contact?  On the other hand, what problems are currently created by not having the ability to record contact details in the script of the domain name (eg for IDNs)?

(4) There have been a number of studies on different aspects of WHOIS data in the last couple of years - do any of these help to guide us?



Specific comments

Page 11 - as you say there is disagreement on "ease" of search.  If you're English mother tongue, then it might be "easier" to understand the output of a search, but any string is searchable, and you can interpret the search results whatever their script/language.

I find the first bullet point unconvincing - it's like saying "why doesn't everyone just learn English?  It's such a mess having all these languages"

On the second bullet point, p11 - I appreciate that a counter argument is stated to the "transformation will to some extent facilitate communication" argument.  The communication argument is a difficult one.  On one level - as demonstrated within this working group and many others - we default to English in order to communicate with one another across different languages.  However, this is also (to some extent) a factor that deters input from those who are not confident in English as a second language - who may be able to give valuable insights into the debate.  I believe that this is captured in "to some extent" but would welcome more acknowledgement that this cuts both ways.

The third bullet point does not explain why it is also necessary to transliterate/translate *all* data for this benefit to be felt. We need some consideration of proportionality here.

Fourth bullet - define "least translatable" - for whom? Is this truly posed as a barrier to law enforcement and others?

To balance the "cyberflight" argument in the fourth bullet point, could we also point out that in general people tend to register and host locally?  This is perhaps a surprising phenomenon given the strength of some registrars internationally.  For example, on page 5 of http://www.eurid.eu/files/publ/IDNWorldReport2014_Interactive.pdf we have an analysis of country of hosting for gTLD IDNs plus .eu IDNs.  This was done based on the IP ranges associated with the domain names.  You can see that countries and regions with strong international registrars (eg North America, UK) don't really show any "winner" script.  In contrast, Chinese script, Cyrillic, Han (plus Katakana, Hiragana), Thai, Hangul, Arabic script domains tend to be hosted in countries where associated languages are spoken.

Could I also add that you can see within large IDN namespaces which offer multiple scripts (eg .com and .net) that registrations cluster strongly around popular scripts.  There are very small numbers indeed outside of them.  I can produce some more analysis on that point if people like.

I hope these inputs are helpful to the working group in its deliberations, and I look forward to joining the discussions.



Best wishes,

Emily