[Comments-proposal-future-rz-ksk-rollovers-01nov19] Feedback on proposal for Future Root Zone KSK Rollovers

Roland van Rijswijk-Deij roland at nlnetlabs.nl
Fri Jan 24 08:24:40 UTC 2020


Thank you for this proposal that takes some of the community's feedback on the past KSK rollover process into consideration. Please find my feedback on the proposal below:

My first point concerns the proposal's stance on algorithm rollover. You write that the current signing infrastructure is capable of transitioning the signing algorithm for the root zone from RSA/SHA256 (algorithm 8) to an ECDSA-based algorithm (13 or 14). You then go on to argue that such a rollover is not on the table in this proposal, citing that "[...] a comprehensive approach needs to consider the readiness and effects on all the other components in the ecosystem, including global resolver behavior". You also state that the current setup, which uses 2048-bit RSA keys, is not known to have exploitable vulnerabilities. I see issues with both statements, and these lead me to urge you to reconsider and to explicitly include an ambition to change the root's signing algorithm in the next or possibly a subsequent rollover. My reasons for suggesting this are as follows:

- RSA 2048, which currently indeed has no known weaknesses that will allow it to be broken at short notice, has been listed for sunsetting in NIST SP800-57 part 1 by the year 2031 (as Michael StJohns also pointed out). That is within the proposed scope of this document.

- Choosing to keep RSA 2048 restricts some of the other choices made in the proposal. It effectively, for example, limits the number of keys in a standby state to a single key, as the combined DNSKEY set would otherwise exceed the desired maximum datagram size. This is further exacerbated by plans from the DNS software implementers community to restrict DNS UDP datagram sizes to around 1232 bytes [1].

- The proposal's stance on algorithm rollover appears to hinge on the assumption that support for ECDSA is not as widespread or "good" as for RSA 2048. I would argue that this is an outdated premise; freely accessible measurements [2,3,4] show that support for ECDSA P256 and P384 is on par (within margin of error) with support for RSA/SHA256. Indeed, even support for EdDSA 25519 is rapidly rising [5]. I would argue that the community is already performing the due diligence of testing support for ECDSA in resolvers and is demonstrating that ECC algorithms in general, and ECDSA in particular, are well-supported. Equally, multiple TLD operators (e.g. [6,7]) have already switched to signing with ECDSA without problems. Note that this also implicitly means that these TLDs have shown that resolvers can deal with an algorithm rollover successfully. Finally, Cloudflare, for example, has been signing domains for its customers using ECDSA for well over 4 years [8] (and there are many other examples of second-level domains signed with ECDSA).
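
As a rough illustration of the packet-size point above, the following Python sketch estimates the wire size of a response to a ". IN DNSKEY" query for different algorithms and key counts. It is a back-of-the-envelope model, not a measurement: the key mixes are hypothetical, RSA keys are assumed to use the public exponent 65537, the response is assumed to carry one empty EDNS0 OPT record, and the function and constant names are made up for this example.

```python
# Rough wire-size estimate for a root-zone DNSKEY response (". IN DNSKEY").
# Assumptions: owner and signer name are the root (1 byte each), one empty
# EDNS0 OPT record, RSA public exponent 65537. Key counts are hypothetical.

HEADER = 12                     # fixed DNS header
QUESTION = 1 + 2 + 2            # root name + QTYPE + QCLASS
OPT = 11                        # OPT pseudo-record without options
RR_FIXED = 1 + 2 + 2 + 4 + 2    # owner name, TYPE, CLASS, TTL, RDLENGTH

# algorithm -> (public key bytes in DNSKEY RDATA, signature bytes in RRSIG RDATA)
ALGORITHMS = {
    "RSASHA256-2048": (1 + 3 + 256, 256),  # exponent length, exponent, modulus
    "ECDSAP256SHA256": (64, 64),
    "ECDSAP384SHA384": (96, 96),
}


def dnskey_response_size(algorithm, keys, signatures=1):
    """Estimate the response size in bytes for `keys` DNSKEY records
    and `signatures` RRSIG records, all using the same algorithm."""
    pubkey_len, sig_len = ALGORITHMS[algorithm]
    dnskey_rr = RR_FIXED + 4 + pubkey_len    # flags, protocol, algorithm, key
    rrsig_rr = RR_FIXED + 18 + 1 + sig_len   # fixed RRSIG fields, signer name
    return HEADER + QUESTION + OPT + keys * dnskey_rr + signatures * rrsig_rr


if __name__ == "__main__":
    limit = 1232  # datagram size mentioned in the text above
    for algorithm in ALGORITHMS:
        for keys in (3, 4, 5):
            size = dnskey_response_size(algorithm, keys)
            verdict = "fits" if size <= limit else "exceeds"
            print(f"{algorithm:16} {keys} keys: {size:5d} bytes ({verdict} {limit})")
```

Under these assumptions, a set of three RSA 2048 keys stays below 1232 bytes, but a fourth key (for example, a second ZSK during a ZSK rollover next to an active and a standby KSK) already pushes the response over it, whereas five ECDSA keys remain well below the limit.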

There is another argument for performing an algorithm rollover sooner rather than later, which is the transition to post-quantum cryptography (PQC) algorithms. While it is as yet uncertain if, and if so, when there will be a need to make this transition, performing an algorithm rollover now creates experience in a controlled manner with an algorithm (ECDSA) that is well-understood and well-supported. It is as yet unclear which PQC algorithms are suited for use in DNSSEC, and transitioning may require additional protocol changes. Nevertheless, gaining experience with algorithm rollovers is an important step toward facilitating such a transition. On that note, I believe it would be prudent to mention a potential future transition of the root to PQC algorithms, but to place it firmly out of scope for this proposal.

My second point concerns the timing choices in the proposal. While I have no quarrel per se with the timing choices, I do think some of the argumentation to justify the choices is weak. Specifically, the claim that the schedule allows for "sufficient exercise of processes and procedures by staff" is too strong. Effectively, every cycle has a two-year period during which nothing significant happens related to the root KSK rollover. This is long enough to lose institutional memory (in the form of staff changes) and to lose the automatic "muscle memory" (for lack of a better term) needed to proceed through a rollover. I would argue, though, that this is not really an issue, since a rollover of the root is always special, important, and risky, given the special role of the root. I would argue that we want people to be on guard during a root rollover and while institutional memory is nice to have, it does not necessarily lead to increased vigilance.

Thirdly, I would like to comment on the emphasis given to emergency rollovers. Many of the choices are (partly) justified by stating that they make emergency rollovers easier or possible, but, in my view, insufficient consideration is given to what such an emergency rollover would look like, and to how the key that is being rolled to is protected differently from the key that is seen as compromised. Because all keys are managed in the same facilities, it seems highly unlikely to me that one key is compromised independently of the other keys managed in the same facility. Consequently, no matter what procedures are followed, there is no secure key on standby to roll to in case a key is compromised, independent of how it was compromised. One could argue that a key that has been "exposed" to the world longer is more likely to be compromised because an adversary has had more time to perform cryptanalysis on this key. Realistically, though, in the proposed scheme the effective public lifetime of a key is 5 years: two during which it is pre-published, and three during which it is active. A new key is published one year into the active part of a key's lifecycle, meaning that an adversary would have a 3-year head start on compromising the active key compared to the newly introduced key that would be the go-to key in case of an emergency rollover. If a means exists for an adversary to compromise a key within a 3-year time window, then arguably the algorithm for that key is no longer strong enough to be used in production. This means that any emergency rollover under such conditions must be to a new algorithm, and thus to a new, unknown and not pre-published key.

A second note on emergency rollovers is that while technically there might be conditions under which they could be possible, in human terms, trust in "the system" (of the root and DNSSEC on the root) will be thoroughly compromised, to the point where the question becomes whether one should not simply give up. If we are unable as a community to securely manage signing of the root, why should people place their trust in it? Taken together, I think the whole premise of an emergency rollover is potentially flawed and should be placed outside of consideration. I know this is strongly worded, but I believe we - as a community - would be better off investing our time in ensuring that an emergency rollover never ever has to take place, rather than planning for a compromise, which is a case of shutting the stable door after the horse has well and truly bolted.
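
The head-start arithmetic in the emergency-rollover discussion above can be checked with a few lines of Python. This is only a sketch of the timeline as described in this message (two years of pre-publication, three years active, successor published one year into the active period); the function names and the choice of year 0 as the publication date are illustrative.

```python
# Timeline of one key under the proposed scheme, in years relative to the
# moment the key is first published. Names are hypothetical.

def public_lifetime_years(prepublish=2, active=3):
    """Total time a key is publicly visible: pre-publication plus active use."""
    return prepublish + active


def head_start_years(prepublish=2, successor_offset=1):
    """Years an adversary has seen the active key before its successor
    (the go-to key in an emergency) is first published."""
    published = 0
    activated = published + prepublish
    successor_published = activated + successor_offset
    return successor_published - published


if __name__ == "__main__":
    print("public lifetime:", public_lifetime_years(), "years")
    print("head start on active key:", head_start_years(), "years")
```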

My fourth point concerns pre-establishing keys in resolvers through RFC 5011 and other means. I think the proposed two-year prepublication is an excellent idea and allows for much more time for the new key to be picked up by any means (and not just RFC 5011). If there is one thing we have learned from the recent rollover, it's that RFC 5011 is not the ideal rollover mechanism, nor is it the only rollover mechanism. Applications ship with built-in keys and OS vendors increasingly distribute trust anchors in updates. That is likely to become more common practice than it is now, with more applications performing their own DNS resolution. Announcing keys well ahead of time is an important means to provide time for the key to become established through all the different distribution mechanisms. I therefore fully support the practice of prepublishing, and would even argue that publishing two future keys may be in order (given the practical replacement life cycle of some applications). In the short term, publishing an extra RSA 2048-bit key would be problematic given packet sizes, so the N+2 key could be distributed via the IANA web channel only. In the longer term, of course, such a second standby key should also be part of the DNSKEY set, something which is only possible after an algorithm rollover to an algorithm with shorter signature and key lengths.

Finally, I'd like to briefly comment on communication of planned rollovers and new keys. The "send only" approach in the current proposal may not have the desired effect; I would strongly encourage explicitly planning proactive communication with OS vendors as a minimum, as key distribution through OS updates is a likely future channel.

I hope this input proves useful, and am happy to answer any questions.

-- Roland M. van Rijswijk-Deij
-- NLnet Labs


