[NCAP-Discuss] [Ext] Re: Draft final Study 1 report
Danny McPherson
danny at tcb.net
Wed Apr 29 20:27:59 UTC 2020
On 2020-04-28 20:17, Karen Scarfone wrote:
> Here are my responses to your questions. I've copied the text of your
> original email below for reference.
Thanks for the prompt response Karen, comments inline!
> 1. I stated in the report that "there does not appear to be any recent
> academic research into the causes of name collisions or name collision
> mitigation strategies." You disagreed with that and cited nine
> examples of recent work. One was a 2020 posting to the NCAP mailing
> list from Jeff Schmidt about corp.com. One was a blog posting from
> 2019 on how pen testers can take advantage of name collisions. The
> other seven examples are from 2017 or earlier. Perhaps you and I are
> interpreting "recent" differently, but you haven't provided any
> examples of academic (or industry) research into name collisions from
> the past three years, and all the examples except the pen tester blog
> posting were already reviewed for the draft report. Would it be
> clearer if I reworded my statement to say, "There does not appear to
> be any academic or industry research during the past three years into
> the causes of name collisions or name collision mitigation
> strategies"?
>
> 2. I based my assertion about finding the causes for name collisions
> on evidence from previously identified causes. Sections 3.5 and 3.6 of
> the draft report contain most of this information. Causes mentioned
> include:
> * Shortened name usage
> * Search list processing
> * User error and misconceptions
> * Client software misconfiguration
> * Browser prefetching
> * Third-party applications or plug-ins
> * Web crawlers
> * Malware
> * Web Proxy Auto-Discovery (WPAD) protocol
> * Expired registrations
> * Intentional acquisition of colliding names
As noted on the call a moment ago, if you're not doing honeypotting to
qualify risks, then you could at least look at the labels to determine
the riskiness of some strings. This is what Duane Wessels did in the
mid-2000s, what Interisle did thereafter at scale, and what Verisign,
JAS, and others did subsequently. There are techniques that can
considerably improve the identification of these risky / unicorn
strings, including this work from November 2017, which I certainly
consider recent, especially since this WG started in 2018, IIRC:
Client-side Name Collision Vulnerability in the New gTLD Era: A
Systematic Study
https://dl.acm.org/doi/pdf/10.1145/3133956.3134084
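To make the label-screening idea concrete, here's a minimal sketch. The risky-label list and the scoring rule are my own illustrative assumptions, not the actual methodology used by Interisle, JAS, or the paper above:

```python
# Hypothetical sketch: flag candidate strings whose observed query
# labels suggest service-discovery traffic. The label set and the
# scoring are illustrative assumptions only.

RISKY_LABELS = {"wpad", "isatap", "_ldap", "_kerberos", "autodiscover"}

def risk_score(qnames):
    """Fraction of observed query names containing a service-discovery label."""
    if not qnames:
        return 0.0
    hits = sum(1 for q in qnames
               if any(lbl in q.lower().split(".") for lbl in RISKY_LABELS))
    return hits / len(qnames)

# Example: query names observed at the root for a candidate string
observed = ["wpad.corp", "mail.corp", "isatap.corp", "www.corp"]
print(risk_score(observed))  # 0.5
```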
But that also doesn't mean earlier work didn't allow ICANN Org or others
to "test" strings for riskiness proactively vs. hoping to break
("interrupt") things during an initial delegation to notify potentially
impacted parties. Interisle did that at ICANN's direction, and it's
well understood, IMO.
Furthermore, most of the riskiest classes of attacks, where MitM and the
like can occur, involve service discovery protocols whose labels are
detectable at the root level (today at least -- QNAME minimization is in
fact having a _significant_ impact on visibility at the root, which is
something else that has changed now and impacts any analysis
considerably).
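A toy illustration of why QNAME minimization changes root-level visibility: a minimizing resolver (per RFC 9156) asks the root only for the next label it needs, so the full query name never reaches the root. The helper below is mine, purely for illustration:

```python
# Illustration of QNAME minimization's effect on what the root sees:
# a minimizing resolver sends only the TLD label to the root, while a
# non-minimizing resolver leaks the full query name.

def query_sent_to_root(full_qname, qname_minimization):
    labels = full_qname.rstrip(".").split(".")
    if qname_minimization:
        return labels[-1] + "."   # e.g. "corp." -- the TLD alone
    return full_qname             # full name visible at the root

print(query_sent_to_root("wpad.finance.example.corp", False))  # wpad.finance.example.corp
print(query_sent_to_root("wpad.finance.example.corp", True))   # corp.
```

With minimization on, the service-discovery label ("wpad" here) is simply invisible at the root, which is why any label-based analysis there degrades.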
> There is no evidence that there's a single root cause of most name
> collisions. The evidence is overwhelming that there are many root
> causes, and that the types of root causes in the list above are not
> ones you would identify by analyzing datasets. Datasets might give you
> a starting point, but most of the analysis work would need to be
> carried out on a case-by-case basis outside those datasets. But I
> don't see any evidence that there's a substantial number of
> unexplained name collisions happening, let alone causing problems.
I don't agree with this wholesale - especially for the class of service
discovery protocols like WPAD and the thousands of others that can be
identified specifically through the QNAME, and that are arguably the
riskiest because someone is looking for a service to rendezvous with.
I think you could apply a framework like the one above tomorrow, set
some thresholds, and unless someone wants to do outreach, consider
anything above some bar as too risky - Interisle had this mostly right,
I think, but the fact that 1000+ strings needed to be delegated to
enable business plans "complicated things", IMO.
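The threshold idea could look something like the sketch below. The bucket names and cutoff values are invented for illustration; any real thresholds would have to be set by policy, not by me:

```python
# Hypothetical threshold-based triage of candidate strings by observed
# collision-query volume. Cutoffs and labels are illustrative only.

def triage(string, daily_queries, low=1_000, high=100_000):
    if daily_queries >= high:
        return (string, "too risky: withhold pending outreach")
    if daily_queries >= low:
        return (string, "delegate with mitigation")
    return (string, "delegate")

for s, q in [("corp", 5_000_000), ("example-brand", 12_000), ("zz--test", 3)]:
    print(triage(s, q))
```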
> 3. Regarding controlled interruption, Jeff Schmidt's email
> (https://mm.icann.org/pipermail/ncap-discuss/2020-April/000282.html)
> already said much of what I was going to say. Controlled interruption
> has been highly effective at mitigating name collisions--there's
> overwhelming evidence of that--and that encompasses all current root
> causes. Unless a major new root cause is identified that controlled
> interruption can't effectively mitigate, I do not see the need to
> study mitigation strategies other than controlled interruption. I
> don't even know *how* you would study mitigation strategies unless you
> know which root cause needs to be addressed and how controlled
> interruption is insufficient. I will state this more clearly in
> Section 6 of the report.
Warren and I both commented on this on the call. I'm still of the
opinion that controlled interruption has provided some value, but it's
brute force, inherently reactive, ignores whole classes of places where
the signal will never make it to the client / user, and doesn't allow
any proactive feasibility "test" for delegation (to borrow from Neuman)
to yield useful data or other insights a priori.
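For anyone following along, the controlled interruption signal itself is mechanical: ICANN's scheme answers A queries for colliding names with the reserved loopback address 127.0.53.53 (the "53" evoking the DNS port) so affected systems fail visibly. A trivial client-side check, with the helper function being my own sketch:

```python
# Sketch: detect whether a resolution hit ICANN's controlled
# interruption beacon. 127.0.53.53 is the actual address the scheme
# uses; the helper itself is illustrative.

CONTROLLED_INTERRUPTION_IP = "127.0.53.53"

def hit_controlled_interruption(resolved_ips):
    """True if any resolved address is the controlled interruption beacon."""
    return CONTROLLED_INTERRUPTION_IP in resolved_ips

print(hit_controlled_interruption(["127.0.53.53"]))  # True
print(hit_controlled_interruption(["192.0.2.10"]))   # False
```

Of course, this only helps where the answer actually reaches the client and something is looking at it, which is exactly the gap noted above.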
Thanks,
-danny