[gnso-rds-pdp-wg] authoritative

Mon May 1 14:21:12 UTC 2017

On Mon, May 01, 2017 at 08:38:18AM -0400, Sam Lanfranco wrote:
> Chuck,
> 
> Not Quite. There are two parts (as per the programming tradition).
> 
> 1. The "Data of Record" (or what Andrew Sullivan calls "Authenticatable
>    Data of Record" in his posting) is the data.
> 2. The data source (SOR or SSoR) is where it is kept, and by whom.
> 
> In my opinion, essential and useful properties to this are:
> 
>  * The Data of Record (DOR) is authoritative because it comes directly
>    from the agreed upon system of record (SOR), (e.g. ICANN, or a trustee)

Well, yes, but if you don't define DOR this way then you don't need
SOR at all.  I think under these circumstances, the SOR is a
distraction, because it's talking about _who operates_ the SOR.  But
we don't have to care about that if we pay attention to the DOR.  This
makes quite clear the point that the SOR could be SORs, because
the database is actually distributed.

Today it happens that we are trying to centralise more, with "thick
whois", but in my opinion that has turned out to be a tactical error:
we're getting registrars entering bad data precisely because they
don't want the registry to have the data, and so on.  There turn out
to be a lot of policies about this, and yet the registrars now have
data that the registry will never have, but that is potentially of
interest to, say, law enforcement.  Think how convenient it would be,
for instance, for a LEA to have automatic access to data if it could
provide authenticatable proof of a subpoena for the relevant
jurisdiction.  That's not something available _today_, but if we write
policy that would need to be changed once such a thing becomes
available, we're doing this wrong.

So I think it would be better to define DOR as the data set at a given
time relevant to a given registration object that expresses the data
provided in the then-current registration for that object, as found in
the source repository for the object in question.  (This is "object"
because it's not only data about domain names that counts.  Nameservers
are separate objects in most EPP deployments, for instance, and the
"expiration date" controversy is an example of where both registry and
registrar repositories are relevant.)
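
A minimal sketch of that definition, in Python, may make the shape
clearer.  Everything named here (Snapshot, data_of_record, the
"registry"/"registrar" repository labels, the sample dates) is
hypothetical and only for illustration; it is not how any particular
registry stores data.  The point is just that DOR is selected by
object, by repository, and by time:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    object_id: str        # hypothetical handle: a domain name, a nameserver, etc.
    repository: str       # which source repository holds this copy
    valid_from: datetime  # when this registration data became current
    data: dict            # the registration data itself


def data_of_record(snapshots: Iterable[Snapshot], object_id: str,
                   repository: str, at: datetime) -> Optional[Snapshot]:
    """Return the then-current snapshot for one object in one repository."""
    matching = [s for s in snapshots
                if s.object_id == object_id
                and s.repository == repository
                and s.valid_from <= at]
    # The most recent snapshot not later than `at` is the data of record.
    return max(matching, key=lambda s: s.valid_from, default=None)


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Invented sample data: the registry and registrar disagree on expiry.
SNAPSHOTS = [
    Snapshot("example.com", "registry", utc(2016, 3, 1), {"expires": "2017-03-01"}),
    Snapshot("example.com", "registry", utc(2017, 2, 20), {"expires": "2018-03-01"}),
    Snapshot("example.com", "registrar", utc(2016, 3, 1), {"expires": "2017-03-01"}),
    Snapshot("ns1.example.com", "registry", utc(2016, 3, 1), {"addresses": ["192.0.2.1"]}),
]

registry_view = data_of_record(SNAPSHOTS, "example.com", "registry", utc(2017, 5, 1))
registrar_view = data_of_record(SNAPSHOTS, "example.com", "registrar", utc(2017, 5, 1))
```

In this invented data the two repositories give different expiration
dates for the same name at the same moment, and each answer is the
data of record for its own repository.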

>  * Other data sources within the Internet ecosystem (cached data,
>    extracted data, etc.) are not equivalent to Data of Record and "user
>    be(a)ware"

Cached data _might_ be equivalent to DOR, which is why I liked the
ADOR instead.  In RDAP, you can get ADOR by asking the RDAP server
over https.  It might also be that we could get ADOR using RDAP and
web caches, by signing data about each object.  That would not be so
hard in RDAP, though I don't believe it's ready today.  ADOR is
impossible on whois, because there is no data authenticity proof
(https offers this by authenticating the server and protecting the
data transmission; neither happens in whois).  
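
To illustrate what "signing data about each object" could look like,
here is a rough Python sketch.  It is an assumption-laden toy, not an
RDAP feature: the helper names are made up, the record is an invented
fragment that borrows RDAP-style field names, and HMAC-SHA256 from the
standard library stands in for a real signature scheme.  HMAC needs a
shared secret, so an actual design would presumably use public-key
signatures, so that anyone holding a cached copy could verify it
without being able to forge it.

```python
import hashlib
import hmac
import json


def canonical_bytes(obj: dict) -> bytes:
    # Serialise deterministically so the same data always signs the same way.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_object(obj: dict, key: bytes) -> str:
    # Stand-in for a real signature: HMAC-SHA256 over the canonical form.
    return hmac.new(key, canonical_bytes(obj), hashlib.sha256).hexdigest()


def verify_object(obj: dict, signature: str, key: bytes) -> bool:
    # Constant-time comparison of the expected and presented signatures.
    return hmac.compare_digest(sign_object(obj, key), signature)


KEY = b"demo-key-not-for-real-use"  # hypothetical key, illustration only

# Invented record using RDAP-style field names.
record = {"objectClassName": "domain", "ldhName": "example.com"}
signature = sign_object(record, KEY)

# A cache hands back the record and its signature unchanged: still verifiable.
cached_copy = json.loads(json.dumps(record))
cached_ok = verify_object(cached_copy, signature, KEY)

# Any alteration in the cache breaks verification.
tampered = dict(cached_copy, ldhName="example.net")
tampered_ok = verify_object(tampered, signature, KEY)
```

The useful property is that authenticity travels with the data rather
than with the connection, so a copy from a web cache can be checked
just like one fetched directly from the server.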

>  * Data errors are a separate issue (errors in collection,
>    transcription, or deliberate) with complaints going to the data
>    gatherers (Registrars/ISPs, etc.)

Yes -- that's why this should be nailed to what is in the registration
repository relevant to the object.

Best regards,

A

-- 
Andrew Sullivan
ajs at anvilwalrusden.com