[ksk-rollover] Starting discussion on acceptable criteria for proceeding with the root KSK roll

Doug Barton dougb at dougbarton.email
Fri Jan 5 07:49:34 UTC 2018


On 01/04/2018 04:19 PM, Geoff Huston wrote:
> When we roll the clock back to September 2017, the cited reason for the deferral of the root roll was the existence of data from resolvers that supported RFC8145 signalling that a pool of these resolvers had not loaded KSK-2017 into their local trusted key store.
> 
> Carlos, (I’m asking because you posted a "me too") what is the data set you are using to justify this call to be “over soon”? It seems to me that in the absence of new data, the only changed factor is your own appetite for risk. Without additional data, your tolerance for risk appears to increase over time (*). But is this altered personal perception of the risk sufficient motivation to proceed? Objectively, if the numbers in September 2017 gave sufficient grounds to pause, and the numbers haven't changed (**) then surely the grounds for pausing the operation are as strong now as they were in September (***).

The grounds for pausing have the same strength that they did in 
September, yes. Which is to say, very limited compared to the overall 
risk of not doing the roll.

Since a little before September, when the RFC 8145 data started rolling 
in, all I've heard discussed is the risk to the deployed base if we do 
the roll and their stuff breaks. But there is another, arguably greater 
risk that is not being discussed: what happens if we get ourselves into 
a position where we are forced to do an emergency roll? (The common 
scenarios for that are key compromise, which is very unlikely but not 
impossible, and algorithm failure.) We aren't planning to do the roll 
for the fun of it. We are planning to do the roll because at SOME point 
in the future, it will be necessary to do one. Every day that we don't 
roll the key adds to that risk, since we already know, for sure, that 
rolling the key NOW will break stuff.

There are only two conditions that can be true at this point:

1. DNSSEC, while deployed to a non-trivial degree, has little actual 
utility at this time. That is, nothing mission-critical depends on it 
now, or in the near future.

OR

2. DNSSEC is an essential service that many organizations depend on.

If #1 is true we should do the roll ASAP because any fallout from 
breakage will be minimal, and hopefully have a net positive benefit when 
people update their broken stuff.

If #2 is true we should do the roll ASAP because we need to demonstrate 
that we can, and so that any breakage can be dealt with in a somewhat 
controlled environment with lots of eyeballs and resources dedicated to 
it. This ultimately will make the system more robust because people will 
fix their broken stuff, and develop confidence in the system regarding 
any possible rolls in the future.

Either way, Jacques is right, we need to make like Nike and "Just Do It."

Doug

