[gnso-rds-pdp-wg] Question for Stephanie

Stephanie Perrin stephanie.perrin at mail.utoronto.ca
Thu Dec 8 17:29:50 UTC 2016


The answer is yes, and thanks for the question.  I was going to jump in 
earlier and challenge the same assertion, but figured I had said enough 
recently. :-)

Furthermore, even the datestamp and registrar-generated data may reveal 
an association among domains that leads you to the registrant.  Let's say 
I register ten names on one day with the same registrar, one of which is 
Stephanieperrin.com and another canadianconvertstoterrorism.com.  Is it 
not possible to find that cluster of registrations and associate all of 
the domains with me?  The data commissioners pointed out many years ago 
(2003 I think, I can check) that they had a problem with the reverse 
directory capability of the WHOIS, because it was not at all necessary 
for the functioning of the domain system, or at least ICANN had never 
made the argument. They did not think WHOIS should offer the capability 
of searching by registrant name.  I would argue further, these days, 
that publication of other data should not make registrant identity 
reasonably retrievable.
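
To make the clustering concern concrete, here is a minimal sketch of how 
published thin data alone might be grouped.  The two domain names come 
from the example above; the other names, the registrar IDs, the 
timestamps, the five-minute window and the function name are all 
assumptions for illustration, not a description of any real tool.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical thin-data records: (domain, registrar IANA ID, creation time).
# IDs and timestamps are invented for this illustration.
RECORDS = [
    ("stephanieperrin.com", 9999, "2016-12-08T14:01:12Z"),
    ("canadianconvertstoterrorism.com", 9999, "2016-12-08T14:01:47Z"),
    ("unrelated-example.com", 9999, "2016-12-08T19:30:00Z"),
    ("another-example.com", 1234, "2016-12-08T14:02:05Z"),
]


def cluster_registrations(records, window_seconds=300):
    """Group domains created at the same registrar within a short window."""
    by_registrar = defaultdict(list)
    for domain, registrar_id, created in records:
        ts = datetime.strptime(created, "%Y-%m-%dT%H:%M:%SZ")
        by_registrar[registrar_id].append((ts, domain))

    clusters = []
    for registrar_id, items in by_registrar.items():
        items.sort()  # chronological order within one registrar
        current = [items[0]]
        for ts, domain in items[1:]:
            gap = (ts - current[-1][0]).total_seconds()
            if gap <= window_seconds:
                current.append((ts, domain))
            else:
                # Only groups of two or more domains count as a cluster.
                if len(current) > 1:
                    clusters.append((registrar_id, [d for _, d in current]))
                current = [(ts, domain)]
        if len(current) > 1:
            clusters.append((registrar_id, [d for _, d in current]))
    return clusters


if __name__ == "__main__":
    for registrar_id, domains in cluster_registrations(RECORDS):
        print(registrar_id, domains)
```

No registrant name appears anywhere in the input, yet the two names 
registered seconds apart at the same registrar end up in one group.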

I have a question in return.  I presume that much of the current 
configuration and policy of WHOIS and its data elements is the result of 
simply building on a flimsy foundation.  A primary driver has been to 
keep the costs to the registrars/registries down, since human 
intervention is too expensive and the appetite for data is proving to 
be insatiable.  I don't think either of those parties was keen on 
publishing the personal data of their customers, but the alternatives 
were not at all attractive.  I realize costs are to be dealt with later, 
but to what extent has the technical capability increased to the point 
where we can stop caring about whether the registrar/registry actually 
publishes the data, or merely allows (for instance) a duly authorized 
law enforcement agent in the appropriate jurisdiction (i.e., one with a 
valid warrant or other judicial authorization) to have access to the 
data in their files?  I realize we talked about this concept of tiered 
access extensively in the EWG, but at least one member of that group 
(me) never understood whether the tiered access we were specifying is 
something that is technically possible but financially, legally and 
operationally infeasible.  [Shortly before I retired, I had to deal with 
a lot of breathless enthusiasm about what "big data" was going to do to 
transform our risk management in benefits programs.  Totally infeasible, 
in my view, given the state of our data systems, the accuracy rate, the 
available budget, and the availability of investigators to act on 
findings (a critical factor; once you know you have fraud you have to do 
something about it if you are a government). We won't even talk about 
whether such risk assessment is constitutionally acceptable.]

We have a similar situation here in my view.  Much of what is going on 
now violates data protection law; we have plenty of input from the DPAs 
pointing that out.  A new system ought to be attentive to that point.  In 
the example I cited, the relevant law enforcement authority would have 
no legal trouble getting access to all data related to the registrant of 
canadianconvertstoterrorism.com, in my view.  The operative questions are 
how fast they can do it, what authority they have to show and how, 
and what mechanisms the registrar/registry has to build in order 
to permit this access securely (from all three perspectives: registrar, 
registrant, and LEA) and at reasonable cost.  The same applies to 
others with less compelling interests (i.e., domain speculators, IP and 
trademark owners, etc.), and here we run into complex cost and authority 
issues, in my view.
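
For illustration only, the tiered-access idea might be sketched as 
below.  The thin field names follow the list quoted later in this 
thread; the contact fields, the tier names, the `Requestor` type and the 
`disclose` function are assumptions, and the sketch says nothing about 
how a warrant would actually be verified or what that would cost, which 
is the open question.

```python
from dataclasses import dataclass

# Thin fields, following the list quoted in this thread (key names assumed).
THIN_FIELDS = {
    "domain_name", "registrar", "registrar_iana_id", "name_servers",
    "status", "created", "updated", "expires",
}


@dataclass(frozen=True)
class Requestor:
    name: str
    tier: str = "public"            # hypothetical tiers: "public", "authorized"
    jurisdiction: str = ""
    has_valid_warrant: bool = False  # assumed to be verified elsewhere


def disclose(record, requestor, registrant_jurisdiction):
    """Return only the fields the requestor may see under the assumed policy."""
    authorized = (
        requestor.tier == "authorized"
        and requestor.has_valid_warrant
        and requestor.jurisdiction == registrant_jurisdiction
    )
    if authorized:
        return dict(record)  # full record, including contact data
    # Everyone else gets the thin data only.
    return {k: v for k, v in record.items() if k in THIN_FIELDS}


if __name__ == "__main__":
    record = {
        "domain_name": "example.com",
        "registrar": "Example Registrar",
        "registrant_name": "Jane Doe",        # invented contact data
        "registrant_email": "jane@example.com",
    }
    print(disclose(record, Requestor("anyone"), "CA"))
    lea = Requestor("LEA", tier="authorized", jurisdiction="CA",
                    has_valid_warrant=True)
    print(disclose(record, lea, "CA"))
```

The filtering itself is trivial; the hard parts are the ones left 
outside the function, namely who sets `has_valid_warrant` and how 
quickly and securely that can be done.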

Cheers Stephanie


On 2016-12-08 06:17, Michael D. Palage wrote:
>
> Greg,
>
> Again I am not trying to be confrontational, but I would respectfully 
> disagree with you on Thin Data never containing PII.
>
> Take for example the very domain name that I am using on this email, 
> PALAGE.COM.  I believe it is possible for PII to be contained in the 
> very domain name itself.
>
> Take for example the following three domain name examples
>
> FirstName_SurName.CHRISTIAN
>
> FirstName_SurName.HIV
>
> FirstName_SurName.LGBT
>
> I believe that any information that discloses a person’s religious 
> affiliation, sexual orientation or medical condition, could be deemed 
> PII in certain jurisdictions.  I will defer to Stephanie on this 
> question, however, I believe the answer is yes.
>
> So NOW let's come to a point where I think “we” can find some agreement.
>
> I believe that all Thin Data (which I previously defined as all data 
> elements necessary for the minimum operation of a gTLD SRS – including 
> status) should be made available even if it does contain PII in the 
> domain name itself or the domain name of the name servers.
>
> Domain Name:
>
> Registrar:
>
> Sponsoring Registrar IANA ID:
>
> Whois Server:
>
> Referral URL:
>
> Name Server:
>
> Name Server:
>
> Status:
>
> Updated Date:
>
> Creation Date:
>
> Expiration Date:
>
> Notwithstanding the fact that PII may be contained in the domain name 
> or the name server domain, I believe that this “thin” data is so 
> necessary that it MUST be disclosed and there is no situation that I 
> can foresee where this “thin” data can be withheld. Again however, I 
> will let Stephanie answer this question.  If we can all agree on this 
> “thin” data question that could be an important first building block 
> toward consensus.
>
> Best regards,
>
> Michael
>
> *From:* Greg Aaron [mailto:gca at icginc.com]
> *Sent:* Wednesday, December 7, 2016 7:30 PM
> *To:* Michael D. Palage <michael at palage.com>; 'Gomes, Chuck' 
> <cgomes at verisign.com>; gnso-rds-pdp-wg at icann.org
> *Subject:* RE: [gnso-rds-pdp-wg] key concepts: say "contact data" when 
> that is what we mean
>
> BTW, much of the thin data in WHOIS is not even “collected” from or 
> provided by the registrant. Much of it is generated automatically at 
> the registry, as a key registry function/responsibility.  When you 
> register a domain:
>
> ·the registry knows what registrar is creating the domain, and records 
> that and associates the registrar’s IANA ID.  The registry then 
> displays those in WHOIS.
>
> ·policy dictates what initial domain statuses there are.
>
> ·the registrar indicates how many years the registrant wants, but the 
> create/updated/expiration timestamps are generated and maintained by 
> the registry.
>
> ·Nameserver data is provided by the registrant.  (Unless he or she 
> didn’t specify any, in which case the registrar often provides defaults.)
>
> ·Domain statuses can be manipulated after the domain’s out of AGP. 
> Depending on the status type and the situation, they can be added and 
> deleted by the registrant, the registrar, and/or by the registry.
>
> None of these thin  data fields are sensitive info AFAIK.
>
> All best,
>
> --Greg
>
> *From:* Michael D. Palage [mailto:michael at palage.com]
> *Sent:* Wednesday, December 7, 2016 5:04 PM
> *To:* 'Gomes, Chuck' <cgomes at verisign.com>; Greg Aaron 
> <gca at icginc.com>; gnso-rds-pdp-wg at icann.org
> *Subject:* RE: [gnso-rds-pdp-wg] key concepts: say "contact data" when 
> that is what we mean
>
> Chuck,
>
> This is where a choice/orientation of words may have significant legal 
> distinction.
>
> (My text) - All data associated with a domain name registration
>
> (WG Text) – Registration Data
>
> I am taking a much more expansive view of data associated with a 
> domain name registration, to include data potentially NOT originally 
> provided by a registrant at the time of registration, versus the 
> potentially more restrictive definition of only data provided by the 
> Registrant to the Registrar at the time of registration.
>
> Take for example a .BRAND registry where licensees of that trademark 
> owner are permitted to register in that .BRAND TLD. As part of 
> promoting awareness to consumers, the registry operator (trademark 
> owner) may desire to include/append authoritative data associated with 
> each licensee's consumer ranking (e.g. rating 1 thru 5 stars) so that 
> consumers can better choose which licensee to conduct business with. 
> Because this ranking may change over time, the Registrant/Licensee is 
> NOT in a position to provide this data as it appears in the RDS/WHOIS 
> output. Only the Registry Operator (trademark owner) would be best 
> positioned to include this authoritative data in the RDS/Whois output.
>
> The point I am trying to make is that innovation has only just begun 
> in connection with the new gTLD expansion. While I respect the rights 
> of privacy advocates to safeguard registrant PII, I do not want broad 
> policy statements to have unintended consequences in impeding future 
> innovation.
>
> Best regards,
>
> Michael
>
> *From:* Gomes, Chuck [mailto:cgomes at verisign.com]
> *Sent:* Wednesday, December 7, 2016 4:34 PM
> *To:* michael at palage.com; gca at icginc.com; 
> gnso-rds-pdp-wg at icann.org
> *Subject:* RE: [gnso-rds-pdp-wg] key concepts: say "contact data" when 
> that is what we mean
>
> Thanks Mike.  I am glad to see this discussion going on in advance of 
> considering the first users/purposes question: “*Should gTLD 
> registration data be accessible for any purpose or only for specific 
> purposes?*”
>
> Chuck
>
> *From:* Michael D. Palage [mailto:michael at palage.com]
> *Sent:* Wednesday, December 07, 2016 4:13 PM
> *To:* Gomes, Chuck <cgomes at verisign.com>; gca at icginc.com; 
> gnso-rds-pdp-wg at icann.org
> *Subject:* [EXTERNAL] RE: [gnso-rds-pdp-wg] key concepts: say "contact 
> data" when that is what we mean
>
> Chuck,
>
> I appreciate Greg’s historical context of Whois data primarily being 
> for purposes of “contacting” the registrant of a domain name using 
> those data fields with personally identifying information. However, I 
> think introducing/relying upon the concept of “CONTACT DATA” as 
> proposed by Greg while well intentioned will only lead to greater 
> confusion.
>
> First Greg acknowledges that not ALL data other than the thin 
> technical data falls within his CONTACT DATA definition (trademark, 
> nexus, reseller, etc). So we begin today with a model that is less 
> than 100% inclusive and will likely become less inclusive as more 
> innovative uses of the RDS and Whois data are created.
>
> Second, the use of this terminology ignores the reality in the 
> marketplace that Registrant data is widely relied upon to make legal 
> determinations (i.e. ownership, authority to transfer a domain name, 
> infringement, etc.). When law enforcement is trying to shut down a 
> counterfeit operation, they are not looking to use this data to 
> “contact” the registrant, but instead to “arrest” him/her.
>
> I understand how the term “contact data” provides a certain comfort 
> level to Stephanie and the valid concerns she has.  However, as 
> someone that is involved in making legal determinations regarding the 
> ownership rights (property/service contract) concerning domain name 
> registrations on a regular basis, this  concept of “Contact Data” will 
> just lead to a lot of confusion.
>
> The whole legal construct (private contractual rights) upon which the 
> domain name system is based recognizes the Registrant and the 
> Registrant Data that it provides. In fact ICANN’s Whois web page makes 
> the following statement: “ICANN's WHOIS Lookup gives you the ability 
> to lookup any generic domains, such as "icann.org" _to find out the 
> registered domain owner_.” (emphasis added) Again this data by ICANN’s 
> own admission is relied upon to make “ownership” decisions NOT mere 
> “contact” information.
>
> So I think we stick to one of the first things I learned as a young 
> engineer. Keep It Simple Stupid (KISS)
>
> Thin Data – the minimum technical data necessary for a registry to 
> perform its function as a registry operator in a shared registry system.
>
> Thick Data – All data associated with a domain name registration made 
> available via Whois/RDS, which may include Personal Identifying 
> Information (PII)
>
> Again I appreciate the constructive efforts of Greg, Stephanie and 
> others, but I just do not see this concept scaling meaningfully.
>
> Best regards,
>
> Michael
>
> *From:* gnso-rds-pdp-wg-bounces at icann.org 
> [mailto:gnso-rds-pdp-wg-bounces at icann.org] *On Behalf Of *Gomes, Chuck
> *Sent:* Wednesday, December 7, 2016 10:20 AM
> *To:* gca at icginc.com; gnso-rds-pdp-wg at icann.org
> *Subject:* Re: [gnso-rds-pdp-wg] key concepts: say "contact data" when 
> that is what we mean
>
> Thanks Greg for the helpful suggestion.  I have one question for you 
> and others: If we exclude THIN DATA, is there any data we will need to 
> consider that could not be accurately classified as CONTACT DATA?  If 
> not, then dividing data into these two categories should suffice.
>
> Chuck
>
> *From:* gnso-rds-pdp-wg-bounces at icann.org 
> [mailto:gnso-rds-pdp-wg-bounces at icann.org] *On Behalf Of *Greg Aaron
> *Sent:* Wednesday, December 07, 2016 9:55 AM
> *To:* gnso-rds-pdp-wg at icann.org
> *Subject:* [EXTERNAL] [gnso-rds-pdp-wg] key concepts: say "contact 
> data" when that is what we mean
>
> Speaking of key concepts…  people often say “registration data” when 
> they really mean “contact data.”  Being plain and specific here can 
> help discussion in our group.  The concept will come up in next week’s 
> discussion.
>
> There are basically two kinds of “registration data”.  The first is 
> called the *THIN DATA*. This is the basic data about a domain name 
> registration: the domain name, the sponsoring registrar name and ID, 
> the domain’s status(es), created-updated-expiration dates, and 
> nameservers.  
> (https://whois.icann.org/en/what-are-thick-and-thin-entries )  This 
> data is factual, accurate, is not personally identifiable, and I think 
> is completely noncontroversial.
>
> The second kind of registration data is *CONTACT DATA* – contact 
> names, postal and email addresses, phone numbers.   Contact data 
> raises issues of privacy and data protection.  Contact data can be 
> (and regularly is)  inaccurate because it’s ultimately supplied by the 
> registrants.  When people talk about “registration data accuracy” and 
> “registration data validation” they are really talking about the 
> accuracy of *CONTACT DATA*, not all “registration data.”
>
> In the coming discussions, one approach could be: There are good 
> reasons to publish the thin data … is there any compelling reason 
> /not/ to publish it?   If we can take care of this low-hanging fruit, 
> we will solve part of the puzzle and we can concentrate on the issues 
> around contact data.  This is not a proposal to publish thin data 
> only.  It’s an attempt to disentangle concepts and find a way forward. 
> Not all data is the same, so let’s stop treating all data the same.  
> We may not have to iterate repeatedly about thin data.
>
> Even the EWG’s language wasn’t always clear and specific in this area. 
> Here’s the question we will begin with next week:
>
> /Should gTLD registration data be accessible for any purpose or only 
> for specific purposes?/
>
> /“The EWG unanimously recommends abandoning today’s WHOIS model of 
> giving every user the same entirely anonymous public access to (often 
> inaccurate) gTLD registration data. Instead, the EWG recommends a 
> paradigm shift to a next-generation RDS that collects, validates and 
> discloses gTLD registration data for permissible purposes only./
>
> /While basic data would remain publicly available, the rest would be 
> accessible only to accredited requestors who identify themselves, 
> state their purpose, and agree to be held accountable for appropriate 
> use.”/
>
> What the EWG really meant was:
>
> ·Give public, anonymous access to the THIN data.  (“Basic data” as the 
> EWG called it.)
>
> ·Don’t give every user the same anonymous public access to (“often 
> inaccurate”) gTLD CONTACT DATA.
>
> ·Shift to an RDS that collects, validates and discloses gTLD CONTACT 
> DATA for permissible purposes only.
>
> All best,
>
> --Greg
>
> **********************************
>
> Greg Aaron
>
> Vice-President, Product Management
>
> iThreat Cyber Group / Cybertoolbelt.com
>
> mobile: +1.215.858.2257
>
> **********************************
