[ksk-change] planned vs. emergency (was Re: [ksk-rollover] root zone KSK ...)

Tue Sep 23 17:48:00 UTC 2014

On 9/22/2014 1:46 AM, David Conrad wrote:
> Mike,
>
> On Sep 21, 2014, at 9:02 PM, Michael StJohns <msj at nthpermutation.com> wrote:
>> *sigh*
> Unhelpful.
>
>>> Or do you believe we should revise the key handling policies and processes to roll _much_ more frequently?
>> I'd suggest 1-2 years.  Or basically once every 4-8 times you do a ZSK replacement.
> The current DPS has 5 years. On what analysis do you base your suggestion?

Skill degradation, memory loss, personnel turnover.  I'd do it every 
1-2 years or never.

>
>> I think you're underestimating by perhaps several orders of magnitude the cost of a "full trust reboot".
> Actually, not. I believe we have to be prepared for a "full trust reboot" _regardless of 5011 support_ and part of the exercise with the key change exercise we're discussing/planning a workshop for is to ensure that preparation.

To clarify this:  I believe you need to retain the capability to do a 
"full trust reboot" for the life of DNSSEC.  I also believe that if you 
ever have to do it, the results will be catastrophic.  My third belief 
is that the process for doing the FTR (new acronym as of now) will need 
to be maintained and updated, and probably won't be maintained adequately.
>
>> The comment about irrelevancy of the CA model is that none of these are universal global roots of trust.  They compete and mostly that causes really interesting interoperability problems.  Failure of one of them is not going to have the universal/broad impact that the failure of the single DNSSEC root of trust would have.
> Which would seem to argue that one must be extremely careful, much more careful than CAs, if you have a single root of trust and not expose that trust to potential risk unnecessarily. You appear to be arguing that rolling the key every 1-2 years would not increase risk over rolling it less frequently. I do not agree.

I understand your lack of agreement.  However, there is risk to 
everything.  As I said above, having to do an FTR will be catastrophic.  
That could change over time if you socialize it and keep socializing it, 
so that every 5-7 years when you do it, people understand why it's 
necessary and "nothing bad" (tm) happens.  The risk you have with the 
status quo is that a completely unlikely set of events happens (e.g. 
root compromise) and you're midway through your cycle.  No one knows 
where the knobs are to replace their root trust anchor configuration, 
everyone yells, and the root gets taken away from ICANN because it 
hasn't been a good caretaker.

The major risk in putting together a key replacement cycle will come when 
you revoke the current sole root-of-trust key.  That's when 
things have the most potential to break because it will be the first 
time we've done it.  That applies both to 5011 and FTR.  Get past that 
and a 1-2 year replacement cycle that's handled on an automated basis 
near universally is pretty much risk negative.

>
>> Going back to trust reboot - think about the timeline used for the original key creation and signing ceremonies.  Pretend a compromise happens "now".  How long until DNSSEC is back up using the trust reboot process?  Oh yeah - the compromise happened because the HSM you're using  was found to be insecure.  Ready,..... GO!
> What is the difference in this scenario without 5011 and with 5011 if you assume a compromise of all keys?

You keep over-constraining things.  There are at least four different 
scenarios:

1) 5011, multiple roots of trust and a trust anchor replacement cycle.
2) 5011, multiple roots of trust and an N-1 key compromise.
3) 5011 or FTR-only and a 100% compromise.
4) FTR-only and a planned key replacement.

In the first one, I can schedule the addition of the new key and the 
revocation of the old one to coincide with one of the ZSK ceremonies.  
All actions are ICANN's. Updates occur in the field automatically.
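
As a rough illustration of what "updates occur in the field
automatically" means on the resolver side, here is a minimal Python
sketch of the RFC 5011 key states (Start, AddPend, Valid, Missing,
Revoked).  It is a simplification, not a full implementation: the names
TrackedKey and step are hypothetical, the hold-down is fixed at the
30-day floor (the RFC also takes the RRset TTL into account), and the
Removed state and the signature validation itself are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumption: fixed 30-day add hold-down (RFC 5011 uses the greater of
# 30 days and the original TTL of the DNSKEY RRset).
ADD_HOLD_DOWN = timedelta(days=30)


@dataclass
class TrackedKey:
    """Hypothetical per-key record kept by a 5011-aware resolver."""
    label: str
    state: str = "Start"
    first_seen: Optional[datetime] = None


def step(key: TrackedKey, now: datetime, present: bool,
         revoked: bool = False, validated: bool = True) -> str:
    """Advance one key's state after a fresh look at the DNSKEY RRset.

    present   -- the key appears in the RRset
    revoked   -- the key appears with its REVOKE bit set
    validated -- the RRset validated against a currently trusted anchor
    """
    if not validated:
        return key.state  # ignore RRsets that do not validate
    if key.state == "Start":
        if present:
            key.state, key.first_seen = "AddPend", now
    elif key.state == "AddPend":
        if not present:
            # Key vanished during the hold-down: start over.
            key.state, key.first_seen = "Start", None
        elif now - key.first_seen >= ADD_HOLD_DOWN:
            key.state = "Valid"
    elif key.state in ("Valid", "Missing"):
        if present and revoked:
            key.state = "Revoked"
        elif present:
            key.state = "Valid"
        else:
            key.state = "Missing"
    return key.state
```

A new key that stays in the RRset for the whole hold-down becomes Valid
with no operator action; one that disappears during the hold-down drops
back to Start.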

In the second one, I need to get the signers together quickly to revoke 
the old one.  (And I can actually figure out a way to do that with the 
signers being remote instead of the current process).  All the actions 
take place on behalf of ICANN.

In the third one, with 5011 I can revoke all the old keys so at least 
they can't be used - with FTR, I'm limited by how fast I can pass the 
word and how much people are paying attention.  ICANN actions are 
dwarfed by the number of manual changes and updates necessary by 
resolver managers.  DNSSEC goes down for the time needed to reboot it.  
Anything that relies on DNSSEC is now insecure.
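
Revocation under 5011 is signalled in the DNSKEY flags field: the REVOKE
bit has value 128, so a KSK published with flags 257 shows up as 385
once revoked.  A small sketch of that check, assuming a hypothetical
mapping of labels to flags values (the labels and function names are
made up for illustration, and a real resolver would also require the
revoked key to have signed the RRset itself):

```python
# DNSKEY flag values (RFC 4034 and RFC 5011).
ZONE_KEY = 0x0100  # 256
SEP = 0x0001       # 1, "secure entry point", set on KSKs
REVOKE = 0x0080    # 128


def revoke(flags: int) -> int:
    """Return the flags value with the REVOKE bit set."""
    return flags | REVOKE


def is_revoked(flags: int) -> bool:
    """True if the REVOKE bit is set in a DNSKEY flags value."""
    return bool(flags & REVOKE)


def usable_anchors(keys: dict) -> list:
    """Labels of keys that may still be used as trust anchors.

    `keys` is a hypothetical mapping of label -> DNSKEY flags.
    """
    return [label for label, flags in keys.items() if not is_revoked(flags)]
```

If every configured key comes back revoked, usable_anchors returns an
empty list, which is the 100% compromise case: the old keys can no
longer be used, but the resolver still needs new anchors from somewhere.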

In the fourth one, you can generate the keys in advance, publish them 
and wait for 6 months for everyone to get the word and do the update.  
At some point you have to stop signing with the old key. At that point 
some chaos ensues because Comcast missed all the resolvers on the east 
coast, or an update process failed or the guy who was responsible for 
doing the updates was laid off.  And this will happen without fail 
every time an FTR is done.  If you do it frequently enough, people will 
learn (but that applies to 5011 as well), but mostly there will always 
be someone that didn't get the word, or someplace where no one was 
responsible for doing the update actions.

I want humans out of the loop to the greatest extent possible on the 
resolver side.  It's the only way to scale this.

>   Do you believe we do NOT need to be prepared for the latter?

Have I said anywhere anything that would lead you to make that 
statement?  Shit happens.  The world can come to an end, and we may 
need to do an FTR.  It doesn't mean I like the idea, nor does it mean 
that I think we should settle for that as our only tool in the toolbox.  
To be blunt and clear - _*Yes, I think we need to be prepared to do an 
FTR*_.

>
> You also seem to assume there will be universal deployment of 5011 and that everyone will allow 5011 to operate on their infrastructure.  Neither of these assumptions seem realistic to me.

This is where the *sigh* creeps back in.  ICANN has specified in the DPS 
that 5011 is the method for doing key replacements.  If operators don't 
want to do 5011, you've pointed them to where the root key files are and 
they're responsible for tracking them manually.  That doesn't require 
that everyone has 5011; it does require that everyone be responsible for 
their own deployments.

If you now change the DPS and say "we were only kidding, we're not going 
to use 5011", then you run into the whole problem of systems that were 
"relying" (legal term) on your assertions and not realizing they need to 
do something different when you update the root sets.

>
>> By the way, your attacker *is* using 5011.  Since he now has access to the trust anchor private key, he's using it to place new trust anchors for the BOA, Google and the IRS local resolvers by intercepting and replacing root zone queries.
> _Exactly_ the reason I would see some folks choosing not to support 5011.
Hmm - how is this any different from a resolver automatically accepting 
new zone keys for, say, .COM because they were updated and re-signed and 
the new chains go through new keys?

The design of DNSSEC is that keys, contents and signatures can change 
over time and that resolvers won't burp.  Why do you think that 5011 is 
so different from that?

  Regards, -drc
