[ksk-rollover] Starting discussion on acceptable criteria for proceeding with the root KSK roll

Thu Jan 11 19:04:09 UTC 2018

On Wed, Jan 10, 2018 at 5:33 AM, Petr Špaček <petr.spacek at nic.cz> wrote:

> On 5.1.2018 23:12, David Conrad wrote:
> > On January 5, 2018 at 2:06:10 AM, S Moonesamy (sm+icann at elandsys.com
> > <mailto:sm+icann at elandsys.com>) wrote:
> >> The plan was put on hold because of the
> >> data from September 2017. At the moment it is
> >> unknown if/when there will be a KSK roll. Is not
> >> doing a KSK roll by 2020 [1] a viable option?
> >
> > Speaking personally, I’m hoping we can do the rollover long before 2020.
> > The key is for the community to provide some sort of guidance to the
> > ICANN Org about how to move forward. So far, my impression is that to
> > date, most of the input from this mailing list has been “do it now”,
> > implying we do NOT need to assess "the impact on users” (as mentioned
> > in https://www.icann.org/news/blog/update-on-the-root-ksk-rollover-project).
> > This means that the plan that will be published on 31 January for public
> > comment will say the input we have received suggests the majority of
> > contributors do not believe we need to take potential negative impact of
> > the KSK rollover into account.
>
> I think this is a misunderstanding. I haven't seen anyone saying that "we
> [do not] need to take potential negative impact of the KSK rollover into
> account", but rather "people will fix it if it really breaks".
>
> Let me state my interpretation of the discussion (in the following text,
> "contributors" reads "me"):
>
> Contributors believe that there is no way to reliably measure readiness
> for the rollover, and that tools for such measurement will not be
> available in upcoming years.
>
> ---
> While lacking reliable data, contributors believe that the KSK rollover
> process has already received sufficient publicity and that breakage will
> be dealt with swiftly, similarly to other security issues or DDoS attacks.
> For these reasons, the risk of postponing the KSK rollover indefinitely is
> deemed higher than the risk of breakage, which will be fixed using the
> usual methods.
> ---
>
> I hope it helps to explain how others might read this discussion.
>

I think there will always be breakage. In the old pre-RFC5011 and KSK
design discussions, one case was identified as non-solvable:
  -- an old OS/box comes alive.
I think we now have a second class of failures that was not "anticipated":
  -- non-persistence, i.e. the resolver cannot store state in a way that
will be used if the resolver is restarted.
  -- operators hard-code keys, i.e. disable RFC5011 (trusted-keys vs
managed-keys).

RFC5011 assumes that the timings and state of keys can be stored and will
survive a reboot/restart. Some operators seem to violate this by design
(i.e. configuration information is not writable by the resolver process),
and in some cases through a mixture of an old OS and modern technologies
like containers.
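
To make the persistence assumption concrete, here is a minimal Python
sketch of an RFC5011-style add hold-down timer. It is not how any real
resolver implements this: the AnchorTracker class, the simulate() helper,
the JSON state file and the start date are all hypothetical. It also
ignores the signature checks and the requirement that the key stay
continuously present during the hold-down. The only values taken from the
real world are the 30-day minimum add hold-down time and the KSK-2017 key
tag (20326).

```python
import json
import os
import tempfile
from datetime import datetime, timedelta, timezone

ADD_HOLD_DOWN = timedelta(days=30)  # RFC 5011 minimum add hold-down time


class AnchorTracker:
    """Hypothetical tracker of when each new key was first seen."""

    def __init__(self, state_path=None):
        self.state_path = state_path
        self.first_seen = {}  # key tag (as str) -> ISO 8601 timestamp
        # Reload earlier observations, if any survived the restart.
        if state_path and os.path.exists(state_path):
            with open(state_path) as fh:
                self.first_seen = json.load(fh)

    def observe(self, key_tag, now):
        """Record a sighting; return True once the hold-down has elapsed."""
        tag = str(key_tag)
        if tag not in self.first_seen:
            self.first_seen[tag] = now.isoformat()
            self._save()
        started = datetime.fromisoformat(self.first_seen[tag])
        return now - started >= ADD_HOLD_DOWN

    def _save(self):
        if self.state_path is None:
            return  # nowhere to write, e.g. read-only configuration
        with open(self.state_path, "w") as fh:
            json.dump(self.first_seen, fh)


def simulate(state_path, restart_every_days, total_days=90):
    """Return the first day the new key is trusted, or None if never."""
    start = datetime(2017, 7, 11, tzinfo=timezone.utc)  # arbitrary start
    tracker = AnchorTracker(state_path)
    for day in range(total_days + 1):
        if day and day % restart_every_days == 0:
            tracker = AnchorTracker(state_path)  # simulated restart
        if tracker.observe(20326, start + timedelta(days=day)):
            return day
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        persistent = os.path.join(tmp, "anchors.json")
        print("persistent state:", simulate(persistent, restart_every_days=7))
    print("no persistent state:", simulate(None, restart_every_days=7))
```

In this sketch a resolver that is restarted weekly accepts the new key on
day 30 when its state file survives, but never accepts it when the state
is lost at each restart, because the hold-down timer starts over every
time.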

Having said this, I'm going to argue that we should proceed with the roll
by picking a day and running a PR outreach campaign to try to minimize
outages.

     Olafur