[ksk-rollover] Thoughts on future KSK Rolls

Tim April timapril at gmail.com
Fri Mar 29 19:00:15 UTC 2019

Below is the distillation of my notes regarding the future of KSK rolls.
Without a doubt, many of the ideas below have come from conversations with
others or statements made on the list or in the meetings, but this is my
current view of things.


Rotation Period:

The key should be rotated on a regular schedule. What counts as a "regular
schedule" is up for debate, probably among parties from the ICANN
Community, ICANN org (likely the Office of the CTO - OCTO) and IETF
participants (likely interested parties from dnsop). The starting point for
the conversation would be the guidance set out in the [DNSSEC Practice
Statement for the Root Zone KSK Operator] document, section 6.5. (Sidenote:
I believe that section 6.5 should have been worded differently to make the
authors' intentions clearer, possibly saying "Each RZ KSK will be scheduled
to be rolled over through a key ceremony as required, and to happen at
least once every five years").

The rotation period of five years, in my opinion, is far too long and can
result in operators and developers becoming complacent about, or generally
unaware of, the fact that key rolls happen. My current opinion is that the
rotation period should be roughly once a year. A yearly cadence would allow
the key to be somewhat stable while making key rolls a regular event that
can be predicted and does not fall too far out of working memory.

In thinking about rotations, I started to wonder whether the timing for the
whole process should fall on a fixed day of the week and time of year, so
that it is predictable even before formal timelines are published. The
critical dates (like changes to the root zone file) should probably be
scheduled in the same way as the US holidays of Thanksgiving, Memorial Day,
Labor Day and Election Day: by specifying a day of the week and a week of a
given month (e.g. the 4th Thursday in November for US Thanksgiving).
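
To make the "Nth weekday of a month" rule concrete, here is a minimal
Python sketch. The function name nth_weekday is my own and is not taken
from any existing tooling; it only illustrates how such a date can be
derived for any year.

```python
import calendar
from datetime import date

def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n-th occurrence (1-based) of `weekday` in the given month.

    `weekday` uses the calendar module's constants (calendar.MONDAY == 0).
    """
    first_weekday, days_in_month = calendar.monthrange(year, month)
    # Days from the 1st of the month to the first occurrence of `weekday`.
    offset = (weekday - first_weekday) % 7
    day = 1 + offset + 7 * (n - 1)
    if n < 1 or day > days_in_month:
        raise ValueError(f"no occurrence {n} of that weekday in {year}-{month:02d}")
    return date(year, month, day)

# US Thanksgiving: 4th Thursday in November.
print(nth_weekday(2019, 11, calendar.THURSDAY, 4))  # 2019-11-28
```

Because the rule is a pure function of the year, anyone can work out the
dates years ahead without waiting for a published timeline.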

A proposed schedule for the process could be like the following:

--- Process Start ---


   Key Generation (Site A): First Tuesday of September

   Key Replicated to Alternate Site: As able but before the next step

   Key Published in IANA: First Tuesday in October

   Key Published in the Root Zone: First Tuesday in November

      ( allow 2 months for RFC5011 to work )

   New KSK switch to sign the root zone: Second Tuesday in January

      ( pushed out a week to allow for New Year's celebrations )

      ( allow 2 months before revocation )

   Old Key Revoked Bit Set: Second Tuesday in March

      ( allow 2 months before removal )

   Old Key removed from Root Zone: Second Tuesday in May

--- Process Complete ---
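
As a rough illustration, the schedule above can be turned into concrete
dates for a given starting year. This is only a sketch: the STEPS table and
the function names are hypothetical, and the replication step is left out
because it has no fixed date. The second half of the process falls in the
calendar year after the start.

```python
import calendar
from datetime import date

def nth_tuesday(year, month, n):
    """Return the n-th Tuesday (1-based) of the given month."""
    first_weekday, _ = calendar.monthrange(year, month)
    first_tuesday = 1 + (calendar.TUESDAY - first_weekday) % 7
    return date(year, month, first_tuesday + 7 * (n - 1))

# (step name, years after the start year, month, which Tuesday of the month)
STEPS = [
    ("Key generation (Site A)", 0, 9, 1),
    ("Key published in IANA", 0, 10, 1),
    ("Key published in the root zone", 0, 11, 1),
    ("New KSK signs the root zone", 1, 1, 2),
    ("Old key revoked bit set", 1, 3, 2),
    ("Old key removed from root zone", 1, 5, 2),
]

def roll_schedule(start_year):
    """Return (step name, date) pairs for a roll starting in start_year."""
    return [(name, nth_tuesday(start_year + dy, month, n))
            for name, dy, month, n in STEPS]

for name, day in roll_schedule(2019):
    print(f"{day}  {name}")
```

For a roll starting in 2019, this puts publication in the root zone on
2019-11-05 and the switch to the new KSK on 2020-01-14.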

I have not been very rigorous about the dates suggested above, but
something similar to the [IETF Clash List] should be considered, at least
at the start of the roll process, to try to limit the collateral damage
that might be caused by an accident while the process starts up. It would
also be wise to consider major holidays like Passover and Christmas. When
thinking about the schedule above, I took the following considerations
into account:

* avoid US major / bank holidays on key steps

* avoid Patch Tuesday during the first half of the process, until the new
  key is published

* skip Mondays (holidays tend to fall on Mondays). This also avoids
  catching people who take long weekends.

* skip Fridays / weekends: many senior engineering / ops staff have
  weekends off. Also, failures from changes made on a Friday tend to go
  unnoticed until the following week.
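
The considerations above could be checked mechanically against any
candidate date. The sketch below is an assumption-laden example: the
function names are made up, and the holiday set is supplied by the caller
rather than taken from any authoritative calendar.

```python
from datetime import date

def is_patch_tuesday(day: date) -> bool:
    """Patch Tuesday is the second Tuesday of the month: a Tuesday dated 8-14."""
    return day.weekday() == 1 and 8 <= day.day <= 14

def schedule_concerns(day, holidays=frozenset(), avoid_patch_tuesday=True):
    """List the reasons a candidate date conflicts with the considerations."""
    concerns = []
    if day in holidays:
        concerns.append("holiday")
    if avoid_patch_tuesday and is_patch_tuesday(day):
        concerns.append("Patch Tuesday")
    if day.weekday() == 0:           # Monday
        concerns.append("Monday")
    if day.weekday() >= 4:           # Friday, Saturday or Sunday
        concerns.append("Friday or weekend")
    return concerns

labor_day_2019 = date(2019, 9, 2)    # caller-supplied holiday, for illustration
print(schedule_concerns(date(2019, 9, 3), holidays={labor_day_2019}))  # []
print(schedule_concerns(labor_day_2019, holidays={labor_day_2019}))
print(schedule_concerns(date(2020, 1, 14)))  # ['Patch Tuesday']
```

Setting avoid_patch_tuesday=False for the later steps reflects that the
Patch Tuesday consideration only applies until the new key is published.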

Each of the steps listed above should also have some sort of communication
around it, ideally with some pre-notification and then a confirmation
message following the completion. As the rolls become more commonplace,
the number of notifications may start to drop, but I think that would be
community driven as we get better at rolling the KSK.

Key Management Methodology

The current steady state of the KSK has one valid KSK. To this point, this
has not been a problem, but it also means that if an emergency key roll is
required, the current tooling / support will not help with the roll. A
potential way to support an emergency key roll, should one be needed, would
be to have a backup key created and staged in the root zone. To take this
approach, I would propose a model where, outside of an emergency condition,
there would be at least two KSKs in the root zone at any time which have
been published for at least one month (the RFC 5011 hold-down timer
length). To achieve this, using the schedule proposed above, steps 1-4
would happen, creating a new key. Step 5 would promote a key that has been
in the root zone for 14 months to active, and then step 6 would act on a
key that has been provisioned for 26 months, before it is removed from the
zone two months later. For a six-month window, there would end up being
three keys in the root zone, but one would be a hot standby, ready for use
in the event of an emergency.
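
To illustrate the hold-down requirement, the sketch below lists which
published keys a validator that has been tracking the zone continuously
could already trust. The key names, dates and the function name are
hypothetical, and only the 30-day part of the RFC 5011 add hold-down time
is modelled.

```python
from datetime import date, timedelta

# RFC 5011 add hold-down time. The RFC also allows for a longer wait tied to
# the DNSKEY TTL; that clause is ignored here for simplicity.
ADD_HOLD_DOWN = timedelta(days=30)

def emergency_candidates(published, today):
    """Return names of keys published long enough ago to pass the hold-down."""
    return sorted(name for name, since in published.items()
                  if today - since >= ADD_HOLD_DOWN)

# Hypothetical publication dates for a standby key and a newly added key.
published = {"KSKB": date(2018, 11, 6), "KSKC": date(2019, 11, 5)}
print(emergency_candidates(published, date(2019, 11, 20)))  # ['KSKB']
print(emergency_candidates(published, date(2020, 1, 14)))   # ['KSKB', 'KSKC']
```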

Here is an example of the key rolling broken down:

State 1: KSKA (signing), KSKB (next key / emergency key)

State 2: KSKA (signing), KSKB (next key / emergency key), KSKC (new)

State 3: KSKA, KSKB (signing), KSKC (next key / emergency key)

State 4: KSKA (revoked), KSKB (signing), KSKC (next key / emergency key)

State 5: KSKB (signing), KSKC (next key / emergency key)
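
The state sequence above can also be written as a small simulation, which
makes it easy to check that there is always exactly one signing key and
never more than three keys in the zone. The role labels ("standby",
"retiring" and so on) and the function name are my own shorthand, not
established terminology.

```python
def roll_cycle(zone, new_key):
    """Yield the key set after each step of one roll.

    `zone` maps key name -> role and must hold one "signing" and one
    "standby" key. The input mapping is not modified.
    """
    zone = dict(zone)
    signer = next(k for k, role in zone.items() if role == "signing")
    standby = next(k for k, role in zone.items() if role == "standby")

    yield dict(zone)                  # steady state: signer plus standby
    zone[new_key] = "new"
    yield dict(zone)                  # new key published
    zone[signer] = "retiring"
    zone[standby] = "signing"
    zone[new_key] = "standby"
    yield dict(zone)                  # standby promoted, new key is standby
    zone[signer] = "revoked"
    yield dict(zone)                  # old key has the revoked bit set
    del zone[signer]
    yield dict(zone)                  # old key removed, steady state again

states = list(roll_cycle({"KSKA": "signing", "KSKB": "standby"}, "KSKC"))
for state in states:
    print(", ".join(f"{name} ({role})" for name, role in state.items()))
```

The final state has the same shape as the first, so the next yearly roll
can start from it with a new key name.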

Having three keys (using the algorithm currently in use by the root zone,
RSA/SHA-256) would likely make the DNSKEY query response quite large,
resulting in increased traffic to the root servers. Before considering this
model, I would defer to OCTO and the RSSAC for an assessment of the state
of the root servers. To support an expanded number of keys in the root, a
key algorithm roll might also be an interesting conversation area.
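
For a feel of the sizes involved, here is a back-of-the-envelope estimate
of the root DNSKEY response. It is only a sketch under stated assumptions:
every key is 2048-bit RSA with a 3-byte public exponent, there is one ZSK
and one signature over the DNSKEY set, and the response carries an EDNS0
OPT record with no options. None of these figures come from a measurement.

```python
HEADER = 12                     # DNS message header
QUESTION = 1 + 2 + 2            # root name, QTYPE, QCLASS
RR_FIXED = 1 + 2 + 2 + 4 + 2    # root owner name, TYPE, CLASS, TTL, RDLENGTH
OPT_RR = 11                     # EDNS0 OPT pseudo-RR with no options (assumed)

def rsa_dnskey_rdata(bits=2048, exponent_bytes=3):
    """DNSKEY RDATA: flags, protocol, algorithm, then the RSA public key."""
    return 2 + 1 + 1 + 1 + exponent_bytes + bits // 8

def rsa_rrsig_rdata(bits=2048, signer_name_bytes=1):
    """RRSIG RDATA: 18 fixed bytes, signer name, then the signature."""
    return 18 + signer_name_bytes + bits // 8

def dnskey_response_size(n_ksk, n_zsk=1, n_sigs=1, bits=2048):
    """Estimate the size in bytes of a signed root DNSKEY response."""
    keys = (n_ksk + n_zsk) * (RR_FIXED + rsa_dnskey_rdata(bits))
    sigs = n_sigs * (RR_FIXED + rsa_rrsig_rdata(bits))
    return HEADER + QUESTION + keys + sigs + OPT_RR

for n in (1, 2, 3):
    print(n, "KSK(s):", dnskey_response_size(n), "bytes")
```

Under these assumptions each additional 2048-bit RSA key adds 275 bytes,
taking the estimate from 1139 bytes with two KSKs to 1414 bytes with three.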

Having now talked about an algorithm roll, I also want to bring up another
possible concern when it comes to cryptographic keys: potential weaknesses
in algorithms. It may be worth considering a backup key from a different
algorithm family than the other provisioned keys, in the event that a
change of algorithm is needed expeditiously.

Considerations of KSK2010

I’ve seen the discussion that happened in Prague at IETF104 about the
scheduled destruction of KSK2010. I’m not convinced it is the best idea to
destroy the key yet, assuming it is still stored in the same way as it was
when it was the active key. To Wes’ point during the KSK BoF, it might be
interesting to find some way to chain all KSKs that are in use back to a
single KSK, such as KSK2010. But then there is the issue that code that
still relies on KSK2010 will not have any new method to bootstrap up from
KSK2010, so the point might be moot. I think it would be wise to consider
other things we might want to do with that key before it goes away, but
also to keep in mind that we might want to just treat KSK2010 as dead and
pick up on KSK2017 (or some successor) as the initial trust anchor.

IETF Clash List: http://www6.ietf.org/meeting/clash-list.html

DNSSEC Practice Statement for the Root Zone KSK Operator: