[RSSAC Caucus] [SPAM] Re: Security Incident Reporting and c-root incident

Wed May 22 23:28:04 UTC 2024

Robert, everyone,

Sorry, I should have attached some references the first time.
Let me do so now.

1. c-root has been experiencing zone transfer/update and routing issues: https://lists.dns-oarc.net/pipermail/dns-operations/2024-May/022558.html
2. The new algorithm 13 DS for gov. (maintained by Cloudflare) is not appearing in data from c-root, so they decided to temporarily pause the algorithm rollover until this issue is fixed: https://lists.dns-oarc.net/pipermail/dns-operations/2024-May/022566.html
3. The same for int. (managed by ICANN): https://lists.dns-oarc.net/pipermail/dns-operations/2024-May/022573.html
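
As a rough illustration of how the staleness described in items 1-3 can be spotted, here is a minimal sketch. It assumes the root zone SOA serials have already been collected per root server letter (for example with dig), and that serials follow the YYYYMMDDNN form so plain integer comparison is enough (no RFC 1982 wrap-around handling). The function name and the sample serials are hypothetical, not real measurements.

```python
def lagging_servers(serials: dict[str, int]) -> list[str]:
    """Return, in sorted order, the servers whose serial is older than the newest one seen."""
    if not serials:
        return []
    newest = max(serials.values())
    return sorted(name for name, serial in serials.items() if serial < newest)


# Hypothetical sample data: "c" is serving an older copy of the root zone.
observed = {"a": 2024052201, "b": 2024052201, "c": 2024051800}
print(lagging_servers(observed))  # ['c']
```

This only shows that a server is behind, not by how much; judging "how stale is too stale" is a separate question.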

To be clear, my purpose in posting this question is not to suggest that c-root do or not do something, as people are doing on other mailing lists, but to consider:
- whether this ongoing issue is somehow related to the Security Incident Reporting doc, and
- whether the doc lacks any requirements or suggestions, so that we can add them before finalizing it (if there are any).

This is just an example from which we may be able to gain something for the doc.

---
alt (from iPhone)

> On May 23, 2024, at 6:20, David Conrad <david.conrad at layer9.tech> wrote:
> 
> Hi Robert,
> 
> I just recently joined the work party, so apologies if I’m missing some context.
> 
>> On May 22, 2024, at 4:14 PM, Robert Story <rstory at ant.isi.edu> wrote:
>> The work being referenced is the Security Incident Reporting work party, and
>> the document is here:
>> 
>> https://docs.google.com/document/d/1NvSw7PoLGYhXPuMEjiBgqjCtp_khTGGEh0DaHkNJdds/
>> 
>> I completely agree that the question is reasonable, and I was merely stating
>> my opinion based on my feel for the way the document has been progressing.
>> 
>>> I know you didn't mean to suggest that spending a few minutes searching for
>>> impact is sufficient as criteria for judging whether an incident has
>>> occurred, but we have metrics defined in RSSAC002 that relate directly to
>>> serving stale data; those metrics for C are surely well beyond the expected
>>> values over this event. Perhaps it's an idea to use those metrics as
>>> quantitative measures of impact?
>> 
>> The statement of work for the SIR wp explicitly states that 'the work party
>> should focus on security incidents that have a *material adverse effect* on
>> the root service.'
> 
> I guess this gets into the definition of “root service”.  While it’s arguably true that this most recent incident did not impact _resolution_ service, I gather .GOV and .INT (prudently) delayed completing their key change until it has been resolved.  If you assume that “root service” includes enabling zone maintenance, such as changing keys, it would seem that this incident did indeed have "material adverse effect” on root service.
> 
>> The working party is carefully avoiding tying any hard numbers or rules to
>> whether or not an incident qualifies as 'reportable', or trying to imagine
>> whether or not any particular scenario qualifies or not, and explicitly
>> stating that the decision is left to the RSO(s).
> 
> This seems kind of superfluous: as with all things RSOs, it is always the decision of individual RSOs to play or not as they see fit.
> 
>> Based on the information I have at the moment, my personal opinion is that
>> this incident wouldn't qualify for security incident reporting as defined in
>> the document.
> 
> If you’re talking about the RSS SIR Working Document, section 4.2 states:
> 
> "Data integrity refers to the "correctness" of the data in responses generated by the RSS.
> […]
> Examples of reportable incidents that affect Integrity:
> * Any part of the RSS serving incorrect data for the root zone”
> 
> Providing stale data would appear to me to be “serving incorrect data for the root zone."
> 
>> Other interesting questions are:
>> 
>> - what is the impact of stale data being served from some or all of the
>> instances of a single RSO? Does it depend on how old the stale data is?
> 
> Yes, it depends on how old the stale data is but I don’t think it would be a good idea to try to quantify this. Back in August 2018, the operators of “C” misconfigured a firewall in a way that blocked zone transfers. IIRC, this misconfiguration was noticed when the operators of the RU ccTLD notified IANA their DS wasn’t updated at “C” (I believe after people complained to them).  In this scenario, given caching, etc., it would probably be difficult to draw a line around “too old."
> 
>> - what would the impact have been if the rollovers had proceeded?
> 
> Potentially a repeat of the .RU issue.
> 
> Regards,
> -drc