[gnso-rds-pdp-wg] Apologies, and some reflections on requirements

Thu Jun 30 08:20:50 UTC 2016

I would be happy to be wrong about the need for a charter change, so we will explore that further. If the main thing we are talking about is Federated v. Distributed, then I don't think a charter change would be needed. Am I correct that this is the main issue, or is there more to what Andrew is suggesting?

Andrew?

Chuck

Sent from my iPhone

On Jun 30, 2016, at 3:40 AM, Greg Aaron <gca at icginc.com> wrote:

I think Andrew is saying two things.  1) Fundamental relationships and business processes have been around for many years and probably can’t be blown up.  A reality is that the registrant-registrar-registry model exists (and for some good reasons – remember that one of ICANN’s missions is to promote competition in the registrar and TLD spaces).  2) Any centralized system introduces additional risks (which SSAC has noted).  That goes for a system in which all gTLD registration data would be collected into one centralized super-mega-registry (ARDS), or some centralized point that would be used to regulate access to the various registries.
A “decentralized” system could be one in which each registry continues to hold and provide access to data related to the domains in its registry.  (With each registry being thick, not thin.  A PDP has already decided that.)
Andrew, please correct me if I’m wrong.

Responding to Chuck’s note: I don’t see how anything in Andrew’s note suggests or requires a charter change.  For example, the ARDS is not a requirement that our group must accede to.  Federated is an option, synchronized is an option, and siloed registries (with perhaps access requirements) are an option, and there may be others.  We figure out the technical model after figuring out fundamental questions such as what data should be collected and who should have access to it.

Responding to Stephanie’s note: yes, the EWG should have been conducted differently.  But Fadi created something that circumvented established, transparent community processes, and thus our PDP group is tasked with doing a lot of the same work and having some of the same debates the EWG did.  That’s simply the situation we are in.  A corollary is that nothing the EWG decided is written in stone; it is this PDP WG’s prerogative to like or dislike anything the EWG did.

All best,
--Greg

**********************************
Greg Aaron
Vice-President, Product Management
iThreat Cyber Group / http://Cybertoolbelt.com
mobile: +1.215.858.2257
**********************************

From: gnso-rds-pdp-wg-bounces at icann.org [mailto:gnso-rds-pdp-wg-bounces at icann.org] On Behalf Of Stephanie Perrin
Sent: Thursday, June 30, 2016 1:59 AM
To: gnso-rds-pdp-wg at icann.org
Subject: Re: [gnso-rds-pdp-wg] Apologies, and some reflections on requirements

I have not yet responded to Andrew's excellent post either.  As a member of the EWG, representing the privacy rights constituency, I must say that I was faced with an embarrassment of riches on that group in terms of things to argue about.  Fellow EWG members will likely be willing to back me up that I gave it my best shot in terms of pushback.  The central model was not one I spent much energy on, being non-technical and thus requiring backup support which was difficult to arrange given our confidential operations.  By meeting number 2 I had given up on that one.  In particular, since I did not understand the rationale that was driving the thick vs thin work that was ongoing and out of scope for our discussions (done deal), the actual mechanism that would drive us to a thin model, which seemed to me desirable from a privacy perspective, appeared to be out of reach.  Similarly, since the 2013 RAA was out of scope and contained masses of embedded policy decisions that had never gone through a proper PDP process that I could find, tiered access using RDAP appeared to be the only hope.  Having worked on authentication issues for a long time, I have some notion about the cost and complexity issues for what I would want under this model, which is why I will be harping on costs and who pays (file under fair warning): I don't like working on something for five years only to be told at the end of it, "Oops, we can't afford this, who is going to pay?  So sorry....."

However, we are supposed to be the group that is looking at the issue de novo.  We are supposed to be paying attention to the SSAC and Article 29 documents which, coincidentally, both tell us to "figure out what the purpose of collecting, using, and storing registration data is and then we will tell you how to build and what to build" (loosely translated).  "What to build" is a very important trio of words for those of us who are policy oriented and not technology oriented, because the concept of what we are doing creates a gestalt that influences policy choices, mostly negatively in my view.  If I may give another example in the privacy field, we still talk about "video surveillance" to describe our security camera technologies, summoning up images of the decades-old video cameras and tapes we used to have.  This prevents deeper questioning of data flows and ancillary inputs, making the privacy choices seem deceptively simple.  This is why I am so enthused about Andrew's post: it helps me understand how to go back to a de novo approach.

In my view, the thorough debates of the EWG on various issues should have been done in public.  The presentations we gave were cheery but lean on detail and choices, in my view.  Furthermore, as the lone privacy representative, I could have used a Caspar Bowden or a Ross Anderson or a Kyle Rannenberg or an Ian Goldberg in my corner, to name just a couple of the folks I yearned to consult but could not (and not to mention Andrew Sullivan whom I do not know personally).

I would be happy to introduce the Charter change proposal at Council, once we have figured out what the desirable change might be. Let us please face the reality that we do not have the number of multidisciplinary lateral thinkers in our PDP that it would take to figure this all out, and not be embarrassed about going back and forth on the Charter.  No aspersions being cast on the Charter drafting group, or our collective failure to comment beyond putting markers in for our own siloed issues, and our own siloed understanding of "requirements".

Stephanie Perrin

On 2016-06-29 21:04, Gomes, Chuck wrote:

Andrew,

I am sorry to take so long to respond to your very thoughtful message but as you know I have been pretty busy here in Helsinki.  It seems to me personally that you make some suggestions that could possibly be constructive to the work ahead but I have two primary concerns:

1. I am pretty sure that it would require a charter change.  To do that would require going back to the GNSO Council with the proposed changes and seeking their approval.  That is something that is not out of the question but it could cause some delays and I would want to make sure that there is strong WG support for doing so.  Also, I think we need to remember that a lot of very smart people spent quite a bit of time developing the framework that resulted in the charter so I think we should consider possible changes with that in mind.

2. My understanding is that the EWG debated things like you are suggesting quite intensely.  As you know I was not a member of the EWG, but Lisa has provided some thoughts about that, which I include below.

" It might be useful to reflect upon the EWG's experience with system modeling. After starting with use cases, some EWG members needed a system model against which to test principles on purposes, data needs, and associated privacy, access, and accuracy issues. This led to the EWG's Initial Report proposing both a set of principles and an Aggregated RDS system model to support those principles - but without much explanation of the ARDS. Over the year that followed, the EWG evaluated half a dozen system models, drilling deeper into two (Federated and Synchronized) to examine feasibility and costs before recommending the SRDS. Both SRDS and FRDS models use RDAP; neither stores data in a single physical location. While the SRDS is a "thick" storage model where queries are served from synchronized data, the runner-up FRDS actually uses "thin" registries, querying data from registrars and validators in real-time.

"While some possible requirements may reflect a particular system model - for example, those drawn from today's WHOIS policies -- our PDP WG has yet to consider whether to recommend a next-gen system. But no matter what model we recommend, perhaps we can learn from the EWG's experience. First, while envisioning a possible new model early on was helpful to some, reaching agreement on a recommended model was not possible until the EWG was nearly finished, following feasibility and cost analysis. Second, while each had pros/cons, both models were found to be capable of supporting the EWG's principles. In other words, model choice did not drive the EWG's principles - principles and criteria such as cost drove the EWG's choice of model."

I want to add to Lisa's thoughts my own personal opinion:  I don't think the issue of Federated v. Synchronized is a closed issue.  My understanding is that the final recommendation in the EWG report could have been more the result of the desire to finish the work than a strong consensus.  Whether I am right on that or not, our WG can consider both and make our own decision between either one or some variation.

Finally, I want to encourage all WG members to share your thoughts on Andrew's comments and on my responses above.

Chuck

-----Original Message-----
From: gnso-rds-pdp-wg-bounces at icann.org [mailto:gnso-rds-pdp-wg-bounces at icann.org] On Behalf Of Andrew Sullivan
Sent: Friday, June 24, 2016 10:04 PM
To: gnso-rds-pdp-wg at icann.org
Subject: [gnso-rds-pdp-wg] Apologies, and some reflections on requirements

Dear colleagues,

Apologies first.  I'm not going to be in Helsinki.  I'm in the middle of a move from NH back to Toronto, and it turns out that my movers' understanding of, "I need to leave on $date," entails arranging things such that goods will arrive after $date.  Alas, in this case the goods arrive Monday.  I will attempt to follow the ICANN meetings remotely next week, but I expect it will be tricky.

I have been deeply dissatisfied with the way the work is going, and I believe it is because I see a mismatch between what we are trying to do and the kind of system we are trying to do it to.  In particular, I think we are trying to treat the RDS as a single monolithic system, and attempting to build "requirements" that match that assumption.  Here is an effort to sketch why I think that.  I didn't have time to write a short note, &c. &c.  Sorry this is long.

Since the very introduction of the competitive-registrar model (and arguably before that), the RDS has been a distributed database.  It is far less successful than the other distributed database we all know and love -- DNS -- but it is nevertheless distributed.

The distribution comes from different parties having various parts of the data.  In so-called "thin" registries, this was always the case.  The registry has names and nameservers, and since the invention of registrars knows who the registrar is.  But if you wanted to know certain kinds of data, you had to ask the registrar in question.
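
The two-step lookup that a thin registry implies can be sketched roughly as follows.  This is only an illustration under stated assumptions: the in-memory tables, the registrar names, and the lookup() helper are all hypothetical and do not correspond to any real registry or registrar interface.

```python
# Hypothetical stand-ins: a thin registry knows only nameservers and the
# sponsoring registrar; contact data lives with the registrar.
REGISTRY = {
    "example.org": {"nameservers": ["ns1.example.net"], "registrar": "registrar-a"},
}
REGISTRARS = {
    "registrar-a": {
        "example.org": {"registrant": "Example Org", "email": "admin@example.org"},
    },
    "registrar-b": {},
}


def lookup(domain):
    """Ask the registry first, then follow its referral to the registrar."""
    thin = REGISTRY.get(domain)
    if thin is None:
        return None  # the registry does not know the name at all
    # The registry tells us *whom* to ask; the registrar holds the rest.
    contact = REGISTRARS[thin["registrar"]].get(domain, {})
    return {"domain": domain, **thin, **contact}
```

A client that skips the first step and guesses which registrar to ask is exactly the kind of client that ends up querying the wrong server.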

Because in (say) 1999-2001 nobody had anything better than the whois/rwhois/whois++ protocol(s) to deliver this kind of data, a whole bunch of bad compromises got enshrined in policy.  First, we continued to use whois and its descendants (anything on port 43) as the model for all of this.  The plain fact is that whois was obsolete nearly at birth.  It's a terrible protocol, and should be taken behind the ice house and put out of its misery.

Second, in order to "fix up" whois, clients were created all over the Internet that built in a bunch of assumptions about whom to ask for what data.  The consequence of this was that clients routinely got bad data as they queried the wrong server.  Old registrar data hung around even after a transfer.  When I worked on the org transition from Verisign to PIR in 2003 (?), it took a long time before whois clients stopped asking Verisign about org data.  And so on.

Third, in an attempt to hack around the above technical flaws in an already-obsolete protocol, "thick whois" gained popularity in possibly the worst possible arrangement known to data science.  Instead of insisting that registries hold the data and that registrars and everyone else treat the registry data as The Truth, we created "thick" whois in registries _without allowing registrars to stop their service_.  Any half-competent database person will tell you that storing "the same data" in two places that don't have tight connections is an excellent way to create data inconsistency, but is not a good way to arrive at the truth.  (Latterly, as though illustrating the tendency of people to double down on bad ideas, there have been suggestions that ICANN should run the One Giant RDS of the Universe and hold all the data in a central place.  What could possibly go wrong?)
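
The inconsistency problem with two loosely connected copies can be illustrated with a minimal sketch.  The record fields and the find_conflicts() helper below are hypothetical, chosen only to show how an update applied to one copy leaves the two stores disagreeing with no way to tell which is right.

```python
def find_conflicts(registry_copy, registrar_copy):
    """Return {field: (registry_value, registrar_value)} where the copies disagree."""
    fields = registry_copy.keys() | registrar_copy.keys()
    return {
        f: (registry_copy.get(f), registrar_copy.get(f))
        for f in sorted(fields)
        if registry_copy.get(f) != registrar_copy.get(f)
    }


# Both stores start out with "the same data".
registry_copy = {"registrant": "Example Org", "email": "admin@example.org"}
registrar_copy = dict(registry_copy)

# The registrant updates a contact at the registrar; nothing pushes the
# change to the registry, so the two answers now differ.
registrar_copy["email"] = "new-admin@example.org"

conflicts = find_conflicts(registry_copy, registrar_copy)
```

Detecting the disagreement is the easy part; neither copy carries enough information to say which value is authoritative.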

The thread running through this history of error is the idea that the RDS is one system.  But like the DNS, it only appears to be one system.  It's actually a "distributed database", where in this case the distribution is separable along organizational lines.  That is, registries -- including ICANN, who can be thought of in this case as both the registry and registrar for the root zone -- have some data.  Registrars have some other data.  Resellers and privacy/proxy services have yet other data.  In many cases, the data does not need to be shared across these organizational lines to make it queryable by humans.

The reason that isn't clear to most of us is that whois -- the RDS we use today -- _was_ designed as a monolithic system.  It was designed that way because back when it was created -- RFC 812 is from _1982_! -- the database _was_ a monolithic database.  Whois (the protocol and the client program) continues to have all the deficiencies for distributed use that you might expect of a program or protocol designed to talk to exactly one authoritative service.

Whois++ and rwhois attempted to graft on to this basic protocol some distributed operation, but the graft didn't really take and the ornamental shrub now looks like a weed.

People have nevertheless internalized the whois-based thinking, which is why we keep asking things like, "What data should be collected?"  In a distributed system like this, that's barely interesting, for the commercial interests in this case all militate against collecting data that nobody needs for any function.  Instead, we should ask what data should be collected _by different actors_.  This implicitly involves describing what those actors are doing to require the data.

The nice thing, of course, is that protocol designers have done _a lot_ of this work for us, when they were working on RDAP.  They did this because they were trying to come up with use cases for the protocol, which finally did away with the monolithic-system thinking of whois and offers us a protocol designed precisely to work in the distributed-database environment that is the actual registration system.  That we even still have a work step that involves evaluating what protocol we're going to use for all this makes me a little ill.
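
As a rough illustration of how RDAP fits a distributed arrangement, a client first works out which server is responsible for a name and only then asks it.  The sketch below assumes a hypothetical bootstrap table (the .test base URLs are made up); a real client would load the bootstrap registry that IANA publishes rather than hard-coding entries.  The "domain/<name>" path segment follows the RDAP query format.

```python
# Hypothetical bootstrap table: TLD -> RDAP base URL (made-up hosts).
BOOTSTRAP = {
    "org": "https://rdap.example-registry.test/",
    "com": "https://rdap.another-registry.test/",
}


def rdap_domain_url(domain, bootstrap=BOOTSTRAP):
    """Build the RDAP lookup URL for a domain from a TLD -> base URL table."""
    name = domain.rstrip(".").lower()
    tld = name.rsplit(".", 1)[-1]
    base = bootstrap.get(tld)
    if base is None:
        raise LookupError(f"no RDAP service known for .{tld}")
    # RDAP domain lookups use the path segment "domain/<name>".
    return base.rstrip("/") + "/domain/" + name
```

For example, rdap_domain_url("Example.ORG") yields a URL under the hypothetical .org server, and a name under an unlisted TLD raises LookupError instead of guessing a server.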

It seems to me that we can just say that we have to embrace the distributed-database fact.  First, it's a fact of how registration actually works now.  If we don't agree with that, I think we should give up.  Second, it's consistent with how every single other thing on the Internet that has not crashed and burned works.  The Internet cannot scale depending on monolithic systems.  And nobody has the power to impose one anyway.

Once we have done that, there are still important policy issues about what data ought to be collected by anyone, under what conditions they might reveal it to someone else (and who that someone else is), and so on.  But there are empirical tests for whether some of the answers people are proposing really match the distributed nature of the system.  If they don't, we can close off those avenues of inquiry, because they'll never be productive.

Best regards,

A

--

Andrew Sullivan

ajs at anvilwalrusden.com

_______________________________________________

gnso-rds-pdp-wg mailing list

gnso-rds-pdp-wg at icann.org

https://mm.icann.org/mailman/listinfo/gnso-rds-pdp-wg
