[rssac-caucus] FOR REVIEW: Elements of Potential Root Operators

Shane Kerr shane at time-travellers.org
Thu Sep 8 03:52:57 UTC 2016


Wes,

[ I considered breaking this up into separate mails since there are a
  few topics here, but I am feeling lazy. Sorry.  ]

At 2016-09-07 08:20:39 -0700
Wes Hardaker <hardaker at isi.edu> wrote:

> Shane Kerr <shane at time-travellers.org> writes:
> 
> > One thing that I would like to see is a request that all of the
> > information in this document be public.  For example, descriptions of
> > zone distribution architecture.  
> 
> I'm confused about whether you're asking about the internal
> documentation about a particular operator (many (most?) operators of any
> networks don't like advertising their exact internal architectures for
> security-through-obscurity related reasons, which we shouldn't argue
> about here), or are you asking about the root zone distribution system,
> which is how IANA/ICANN/Verisign/and-the-roots get data changes
> propagated through the system.  This last system is fairly well
> documented publicly I think, no?  I would need to go search for the
> document that describes it, but I'm sure there is one (including how
> zone changes are reviewed, etc).

Yes, I'm asking about the internal documentation... or rather,
documentation about the architecture and design. For example, it
appears that some root operators use some sort of push mechanism to
get new versions of the root zone out, while others seem to rely on
SOA timers. It looks like maybe some have a cron job for this purpose,
or maybe they have a push mechanism with some slow-running validation
process. It's impossible to know from the outside, and I think
publishing this could be useful, both because it puts more eyeballs on
the design and because it helps in understanding the behavior of the
system.
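
To illustrate the kind of guesswork involved, here is a minimal sketch
of how an outside observer might try to tell the two styles apart from
measured update delays alone. Everything in it is an assumption made
for illustration: the function name, the labels, the 10 second
"push" threshold, and the 1800 second refresh default are hypothetical
and do not describe what any root operator actually does.

```python
def guess_update_mechanism(delays_s, refresh_s=1800.0, push_threshold_s=10.0):
    """Crude heuristic label for a list of observed update delays (seconds).

    refresh_s and push_threshold_s are illustrative assumptions, not
    values taken from any operator's configuration.
    """
    if not delays_s:
        raise ValueError("need at least one observed delay")
    worst = max(delays_s)
    if worst <= push_threshold_s:
        # Every update arrived almost immediately: looks like a push
        # (for example a NOTIFY-triggered transfer).
        return "push-like"
    if worst <= refresh_s:
        # Delays stay within one refresh interval: consistent with SOA
        # timer polling, a cron job, or a slow validation step.
        return "timer-or-batch-like"
    # Longer than one refresh interval: cannot say anything useful.
    return "unclear"
```

Even when the label is right, it says nothing about why a server
behaves that way, which is the part that documentation would answer.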

I'm actually not aware of documentation about the
IANA/ICANN/Verisign/all-the-roots setup. A quick search didn't really
turn up anything more detailed than this:

https://www.ntia.doc.gov/legacy/DNS/CurrentProcessFlow.pdf

This information may exist though! I would love to see some better
pointers (I have already shown that I am crap at finding ICANN
documents, so there is a good chance that it is just me)....

> > As someone without access to this for current root server operators I
> > have to infer this - I often don't know whether something is broken in
> > the root server system or whether they are merely acting in ways that
> > I didn't expect because I don't know what's going on "under the hood".  
> 
> I'd love to hear an example problem where you don't know whether
> something is broken or not based on some external test.

For example, K root normally gets a new copy of the root zone within a
few seconds of it being available to the other roots. However, for a 5
day period (2016-07-25 to 2016-07-29) this jumped to being delayed by 1
to 6 minutes. The same seemed to happen to several others... D and I,
and even B which is usually updated within a couple seconds... around
the same time. I haven't done a full analysis, but it does seem like A
did not have the same delay.
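
A rough sketch of how raw observations could be reduced to per-server
delays like the ones above. This is not the actual measurement code;
the function name and the (server, serial, timestamp) input format are
assumptions made for illustration. The delay for each server is taken
relative to whichever server was first seen with that serial.

```python
from collections import defaultdict


def propagation_delays(observations):
    """Map serial -> {server: seconds behind the first server seen}.

    observations is an iterable of (server, serial, seen_at) tuples,
    where seen_at is a timestamp in seconds. This format is hypothetical.
    """
    first_seen = defaultdict(dict)  # serial -> server -> earliest timestamp
    for server, serial, seen_at in observations:
        previous = first_seen[serial].get(server)
        if previous is None or seen_at < previous:
            first_seen[serial][server] = seen_at

    delays = {}
    for serial, by_server in first_seen.items():
        baseline = min(by_server.values())
        delays[serial] = {s: t - baseline for s, t in by_server.items()}
    return delays
```

With made-up observations for a single serial, for example
[("a", 2016072500, 0.0), ("b", 2016072500, 2.0), ("k", 2016072500, 3.0)],
the result gives a delay of 0.0 for "a", 2.0 for "b", and 3.0 for "k".
Note that the baseline is only as good as the polling interval of the
probe, so small delays are within measurement noise.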

Since I don't know the details of how the zone is published between
Verisign and itself, or between Verisign and the other roots, or
between the other roots and their various sites... it's impossible to
know what happened here. It could be measurement error on my side (I
need to dig into lookup error reports, although clustered errors like
that for specific root servers would also be interesting to note), it
could be some test of the root operators, it could be some operational
change that was reverted or fixed, it could be FSB upgrading their wire
taps... without knowing the architecture or design it's just guesswork.

> >     3.3.3 Addressing Resources
> >
> >     The candidate operator MUST have its own AS number(s) and IPv4 and
> >     IPv6 address allocations. It is assumed that IP anycast will be
> >     used. If IP anycast will not be used, a technology providing
> >     similar or better service levels SHOULD be specified. Provider
> >     address space or addresses that cannot be used with anycast are
> >     undesirable. The expected production IPv4 and IPv6 address blocks
> >     MUST be severable from the candidate’s organization to facilitate
> >     emergency or planned transfers.
> >
> > Some RIRs have policies to allocate addresses for "critical use":
> >
> > https://www.nro.net/rir-comparative-policy-overview/rir-comparative-policy-overview-2016-02#2-4-2
> >
> > This means that potential root server operators probably do not need to
> > have address space in advance of being approved for being a root server
> > operator.  
> 
> I think the paragraph is trying to say "the candidate MUST have address
> space before becoming an operator" not "before getting approved to be
> one".  But it could probably be made more clear?

Perhaps something like:

    The candidate operator MUST obtain its own AS number and IPv4 and
    IPv6 address allocations for operating a root server. ...

> > My concern is that this will naturally lead to "diversity inflation"
> > where each root server operator runs multiple versions of everything
> > without any real benefit, and in fact a potential reduction in
> > reliability of the overall system.  
> 
> There is certainly a large amount of debate to be had about how much
> diversity is a good thing.  Single points of failure are known to be
> bad, and it would be bad if everyone ran the exact same version of bind
> (eg) and FreeBSD (eg).  But too much diversity leads to more points of
> failure, which I suppose could be bad.  Though if the system could lose
> X% of its deployed infrastructure without visibly affecting the service
> itself, then high diversity is likely to be helpful since it's unlikely
> that the entire system could go down if the diversity ensured it would
> take multiple vulnerabilities to pass the X% threshold. 

I think it should just be clear that applications will be evaluated on
how they fit in the diversity of the root server system, rather than
specifically on how much diversity they can provide within a single
root server.

Cheers,

--
Shane