[gnso-rds-pdp-wg] Using the GDPR as a basis for RDS Policy

Thu Feb 15 14:43:50 UTC 2018

Hi,

On Thu, Feb 15, 2018 at 02:03:49PM +0000, Greg Aaron wrote:
> Well... no.  We can certainly agree that a move to RDAP is sorely
> needed.  But deficiencies in the WHOIS protocol were not the
> problem.

Actually, your quote shows otherwise.  See below.

> Rather it was failure by many registrars to implement
> properly and uniformly -- not just "bad actors" but the many more
> that were inattentive or not competent.

I used "bad actors" loosely, to include registrars who didn't do their
job.  I can't tell whether a registrar who doesn't do its job is
incompetent, lazy, or malicious.  And I don't care.

> "Historically, the centralized databases of thick Whois registries
>are operated under a single administrator that sets conventions and
>standards for submission and display,

The mere fact that this talks about display _at all_ is evidence that
the whois protocol itself was indeed part of the problem.  Display in
a data system should not be under the control of the data source, but
under some formatting system.  This is the same reason that my user
agent (browser) is responsible for formatting things on the web
according to the css file sent by the server (or, perhaps, some other
css file I can use locally to override that stylesheet).
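
To make the separation concrete, here is a minimal sketch in Python.
The record, its field names, and the two render functions are made up
for illustration; they are not any registry's schema or any real
client.  The point is only that the source emits data, and each client
decides how to display it.

```python
import html

# Hypothetical record; the field names are illustrative only.
RECORD = {
    "domain": "example.com",
    "registrar": "Example Registrar, Inc.",
    "status": ["clientTransferProhibited"],
}


def _as_text(value):
    """Flatten list values so both renderers treat them the same way."""
    return ", ".join(value) if isinstance(value, list) else str(value)


def render_text(record):
    """One possible presentation: aligned plain text for a terminal."""
    width = max(len(key) for key in record)
    return "\n".join(
        f"{key.ljust(width)} : {_as_text(value)}" for key, value in record.items()
    )


def render_html(record):
    """Another presentation of the same data: an HTML definition list."""
    items = "".join(
        f"<dt>{html.escape(key)}</dt><dd>{html.escape(_as_text(value))}</dd>"
        for key, value in record.items()
    )
    return f"<dl>{items}</dl>"


if __name__ == "__main__":
    print(render_text(RECORD))
    print(render_html(RECORD))
```

Neither renderer needs the cooperation of the data source, which is
the property the whois protocol lacks.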

The submission standards are a different problem, because in fact
there are agreements that _already_ govern such submissions.  There is
no need for a central database to get the submission correct, unless
some participants in the system are just not doing their job.  The
answer to that, of course, is market discipline, including either
reputation system counter-bias or deaccreditation.  ICANN's dependence
on fees from the registry business makes it an ineffective
agent of deaccreditation, of course.

>  The thin model is thus criticized for introducing variability among
>Whois services, which can be problematic for legitimate forms of
>automation.

This is again evidence that whois the protocol was part of the
problem.  You can't automate against whois because the only way to do
it is to scrape screens, and that is unreliable.  Indeed, in a
properly formatted data output, even missing data isn't as great a
problem, because your automation can cope with the missing data
precisely because it is formatted for machine consumption.
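
A rough sketch of the difference, in Python.  Everything here is
hypothetical: the two whois texts, the JSON documents, and the field
name "registrar" are invented for illustration and are not the actual
RDAP response format.

```python
import json
import re

# Two hypothetical whois responses carrying the same fact under different labels.
WHOIS_A = "Domain Name: EXAMPLE.COM\nRegistrar: Example Registrar, Inc.\n"
WHOIS_B = "domain: example.com\nSponsoring Registrar: Example Registrar, Inc.\n"

# Hypothetical structured responses; the second simply omits the registrar.
JSON_FULL = '{"domain": "example.com", "registrar": "Example Registrar, Inc."}'
JSON_PARTIAL = '{"domain": "example.com"}'


def scrape_registrar(text):
    """Screen-scrape: guess that the label is exactly 'Registrar:' at line start."""
    match = re.search(r"^Registrar:\s*(.+)$", text, re.MULTILINE)
    return match.group(1).strip() if match else None


def structured_registrar(document):
    """Structured lookup: a missing key is unambiguous missing data."""
    return json.loads(document).get("registrar")


if __name__ == "__main__":
    print(scrape_registrar(WHOIS_A))
    print(scrape_registrar(WHOIS_B))
    print(structured_registrar(JSON_FULL))
    print(structured_registrar(JSON_PARTIAL))
```

The scraper comes back empty for WHOIS_B even though the registrar is
right there, so the caller cannot tell "absent" from "labelled
differently".  The structured lookup comes back empty only when the
field really is absent.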

> In other words: security, stability, and usability reasons.

Those may be the reasons people selected this path, but it has always
been evident to many of us that the mistake was in relying on a
protocol misfit to the purpose.  We didn't get greater security from
it: data leaks like crazy, there is no authentication of who is
requesting, and people lie about their data for the perfectly
reasonable end of not getting doxed just because of having a domain
name.  We didn't get greater stability, either, because we have
increased the data maintenance burden on registries for no obvious
benefit, and have increased the probability of data mismatches across
two different "sources of truth" (as they say, "The man with two
watches never knows what time it is").  And usability was not
improved, either, because whois can't do internationalization, can't
give you only the data you want, and can't do referrals reliably or
effectively.

> The accuracy of the data is a completely separate matter.

It is not.  A significant reason for data problems is manifestly the
bad protocol, which creates incentives for white lies.

> A distributed system relies on the competence, robustness, and good
> faith of all the parties involved.  Centralizing some aspects can
> mitigate failures, incompetence, and bad faith.

It seems to me that the current whois is, quite literally, a
counterexample to your claim, whereas the DNS and its actual operation
on the Internet suggests to me that when the incentives are correctly
aligned a distributed system works well.

I can buy the argument that the R/R/R model was dumb, and that
registrars are a needless wheel that does no work in the registration
system.  _That_ is a reason to centralise all data in the registry.
But I doubt we are headed in that direction.

Alternatively, I can buy the argument that the registry should be the
only source of data having to do with any registration, and that
registrars are basically just authorized agents of the registry and
must provide passthrough access to such data as is related to domain
name registrations.  _That_ could be a reason to centralise all data
in the registry, too.  It leads pretty quickly to pretty serious
questions (the ones we have been debating) about exactly which data
the registry really needs in support of domain name registration, and
it also leads to additional questions (not yet discussed) about
whether registrars may retain any of that data in their own
repositories when they are collected only for domain name
registration.  I suspect the answer is no, at least not without
consent.  (Registrars would have access to it anyway, through the same
SRS where the data would be stored.  But I can't imagine any registrar
being comfortable with nailing their own uptime to that of every
registry in the world.)

I cannot buy, however, any claim that centralising the data
necessarily makes things better for the Internet.  I don't think the
claim has been demonstrated, and I can think of lots of ways in which
it is obviously false.

Best regards,

A

-- 
Andrew Sullivan
ajs at anvilwalrusden.com