[NCAP-Discuss] [Ext] Re: An Approach to Measuring Name Collisions Using Online Advertisement

Jeff Schmidt jschmidt at jasadvisors.com
Thu Jun 9 13:39:20 UTC 2022


Thank you, Anne, for your thoughtful response. There are a few areas where I think we’re talking past one another; I’ll try to focus on those.

> I don’t know if it’s up to the DG to evaluate the ad tool

NCAP is supposed to be a technical working group, an organ of the SSAC, and our recommendations should be technically rigorous and defensible. I don’t think we have the option to kick the can to “someone else.”

> In my opinion, the problem we are solving is that the risk assessment used in the 2012 round was not systematic.

What does “systematic” mean in this context?

In 2012, the problem was that the issue was late-breaking. However, eventually, the response was rigorous and appropriate. ICANN adopted a programmatic approach to collisions that included: (1) identification of potential Black Swan strings from the list of applied-for strings; (2) a notification and “cooling-off period” known as Controlled Interruption; and (3) an emergency response capability. This was – and remains – the correct approach.
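To make the "notification" step concrete: under ICANN's Name Collision Occurrence Management Framework, a newly delegated gTLD wildcards all names to the loopback address 127.0.53.53 during the Controlled Interruption period, so an internal system that suddenly resolves a colliding name gets an unmistakable flag rather than real traffic. A minimal illustrative sketch (not ICANN tooling) of detecting that signal:

```python
# Illustrative sketch: detect the Controlled Interruption (CI) signal in
# a set of resolved A records. During CI, new gTLDs answer every name
# with the reserved loopback address 127.0.53.53 ("53" echoing the DNS
# port) so affected operators notice and fix their configurations.

CONTROLLED_INTERRUPTION_IP = "127.0.53.53"

def is_controlled_interruption(a_records):
    """Return True if any resolved A record is the CI flag address."""
    return CONTROLLED_INTERRUPTION_IP in a_records

# Example: an internal name under a newly delegated TLD vs. a real host
print(is_controlled_interruption(["127.0.53.53"]))  # True
print(is_controlled_interruption(["192.0.2.10"]))   # False
```

A monitoring system that sees this address in answers for internal names knows immediately that a private namespace now collides with a delegated TLD.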

If you mean that the risk assessment must be “mechanical,” i.e., performed entirely from rote procedures without expert analysis, that is not possible. As this group discussed as recently as yesterday, expert discretion is essential given the highly situation-specific potential for problematic Black Swan collisions.

> Perhaps the best evidence of this is that strings which arguably carry more collision risk than .mail were delegated and .mail was not.

I don’t know what this means. Are you suggesting that there are 2012 strings that should not have been delegated? Which ones, and based on what data?

> Another problem we are trying to solve is the bad experience of some organizations as documented in Casey’s research.

If you are suggesting that the handful of alleged (*) minor technical hiccups mentioned in Casey’s report are showstoppers, then you are correct. Moreover, ICANN should never delegate another TLD (cc or generic), certainly should never roll a KSK, never allow a root server operator to renumber, and it needs to take a hard look at those pesky IDNs. Perhaps most importantly, it should kill DNSSEC as it has shown an amazing propensity to break things (including large TLDs) and cause widespread “bad experiences.” DNSSEC has broken more things in the past 2 years than TLD collisions have in the past 20.

My point is this: change is never zero risk. Particularly on the Internet, the ability to tolerate change is one of the things that makes it wonderful and allows innovation. It’s a balance. My position – and I believe the position of most in the broader audience – is that the 2012 procedures struck a good balance: significant change with extremely limited operational issues. Zero risk is an untenable position.

(*) “alleged” because these are self-reported based on the memories of a handful of people. We don’t know what really happened. It should be instructive to us that literally no one else is talking about collisions; if it were a real problem, IETF, NANOG, et al. would be discussing it. IETF – the organization with the most ability to directly help by clearly creating 1918-like DNS namespaces – refuses to even take it up. No one cares. It’s not an issue.

> And another problem we are trying to solve is we have no idea what sort of harm may have occurred to consumers.

This is the fear, uncertainty, and doubt argument. Again, if we’re really concerned about consumers, we should prohibit domain expiration drop-catching (intentional collisions) as that creates more real and well-documented harm than the TLD-level collisions we’re considering. Before someone says “that’s not in scope” it absolutely is. Read the definitions in SSAC and NCAP Study 1. The fact that this group continues to ignore those *real* harms is instructive.

> What is different now is we are called upon to develop a systematic process for risk assessment. That comes with a “MUST” in the SubPro Final Report Recommendations.

SubPro Implementation Guidance 18.5: If the risk of name collisions will be determined after applications are submitted, ICANN should provide a full refund to applicants in cases where a new gTLD is applied for but later is not approved because of risk of name collision.

SubPro, correctly, anticipated the scenario where strings may be assessed after applications are submitted. This is the only way this can be done.

The existing DNS Stability Review evaluation (AGB 2.2.1.3) should be augmented to specifically evaluate Black Swan collision potential using the techniques and metrics well known and published in the Interisle, JAS, SSAC, Verisign, and NCAP publications. Correctly, potential Black Swans will be identified by technical experts through string-by-string, case-by-case analysis after applications are submitted. This is entirely consistent with – even expected by – SubPro.

> If you want the Board to lock up and defer going forward on the next round, your approach makes sense.

Quite the opposite. Some of the worst ideas being advocated within NCAP will lead to quagmire and controversy by punting string-by-string decisions to the ICANN Board for consideration with little to no evaluation criteria. This must not be done. Instead, we should re-affirm the methodologies successfully applied in the previous round and make surgical improvements to those processes. In addition, predictability and certainty can be improved through communications and documentation. This is how we move forward, not solving problems we don’t have by inventing complex new procedures.

> Having this option be systematic would be different from the 2012 round, and the advice is codified and sits before the Board in the form of SubPro Implementation Guidance 29.5.

Agree. But I think you’re defining “systematic” as “all new and certainly not what we did in 2012.” Merely codifying what we did in 2012 completely satisfies SubPro and is “systematic.”

> I’m very interested in the observation you made that everything in the Passive Collision Assessment section Matt explained is “nothing new” (though that terminology is a bit derogatory, as is “science experiment”). I guess that means that Interisle and JAS already know how to supply that PCA analysis to the Technical Review Team and could do that for a fee pursuant to an awarded RFP. Is that correct? At least they could offer to perform that service if it doesn’t involve the ad measurement piece? (As I understand it, the purpose of PCA is to identify very high risk strings that likely should not move forward.)

Respectfully, Anne, that is *exactly* what Interisle and JAS did! A decade ago! At this point, I’m not sure how it’s possible you don’t know this.

My use of “nothing new” and “science experiment” are meant to be merely descriptive, not derogatory. These things have all been done. Exhaustively. It is stunning to me that some don’t seem to know this.

Our report, Section 5, entitled “Etiology of DNS Namespace Collisions,” explains a superset of the root causes NCAP has discussed.

Our report, page 35, summarizes thousands of pages as follows: “The classification was based on: (1) the diversity of querying source IP addresses and Autonomous Systems; (2) the diversity of labels queried; (3) applying sophisticated “randomness detection” to strings and substrings; (4) presence of linguistic terms and colloquialisms in strings and substrings; (5) temporal patterns; and (6) analysis of the Regular Expressions of the labels queried within each TLD and across all TLDs.”
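Two of the metrics in that list – label diversity and “randomness detection” – can be sketched very simply. The following is a hypothetical illustration of the kind of per-string statistics described, not the JAS/Interisle code; the function and field names are invented for the example:

```python
# Minimal sketch of query-diversity metrics for one applied-for string:
# count distinct second-level labels and querying sources, and score
# label "randomness" via Shannon entropy (random-looking labels score
# near log2 of the alphabet size; repetitive ones score near zero).
import math
from collections import Counter

def shannon_entropy(label):
    """Bits per character of the label's character distribution."""
    counts = Counter(label)
    n = len(label)
    return -sum(c / n * math.log2(c / n) for c in counts.values()) if n else 0.0

def diversity_metrics(qnames, sources):
    """Summarize query diversity for one candidate TLD string."""
    labels = {q.split(".")[0] for q in qnames}  # second-level labels
    return {
        "unique_labels": len(labels),
        "unique_sources": len(set(sources)),
        "mean_entropy": sum(map(shannon_entropy, labels)) / max(len(labels), 1),
    }

stats = diversity_metrics(
    ["wpad.mail", "mailserver.mail", "xk7q2v9z.mail"],
    ["192.0.2.1", "198.51.100.7", "192.0.2.1"],
)
print(stats["unique_labels"], stats["unique_sources"])  # 3 2
```

High source diversity plus low-entropy, human-meaningful labels is exactly the profile that warrants closer expert scrutiny.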

That is a superset of what was presented yesterday.

Our report, Appendix A, is 1000+ pages of string analysis of colliding qnames for applied-for strings. Including cool visualizations for every applied-for string.

Our report, Appendix B, is 2000+ pages of quantitative source diversity and second level qname diversity for every applied-for string. Including a binned scatterplot to visualize the relationships. For every applied-for string.

I would argue that the fitting of regular expressions to qnames was one of the more illustrative things we did, and it has *not* been discussed in NCAP. Special strings like “WPAD” stick out brightly in that analysis.
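The idea of regex-fitting over colliding qnames can be illustrated with a toy example: bucket second-level labels by which of a few candidate patterns they match, so operationally meaningful strings like “wpad” stand out from serial-numbered or random-looking noise. The pattern set below is invented for the example and is not the one used in the JAS report:

```python
# Hypothetical illustration of regex classification of colliding qname
# labels. Each label is assigned to the first pattern family it matches;
# counting the families shows which kinds of queries dominate a string.
import re
from collections import Counter

PATTERNS = [
    ("wpad",   re.compile(r"^wpad$", re.I)),           # proxy auto-discovery
    ("isatap", re.compile(r"^isatap$", re.I)),         # IPv6 transition tech
    ("serial", re.compile(r"^[a-z]+\d+$", re.I)),      # host123-style names
    ("random", re.compile(r"^[a-z0-9]{10,}$", re.I)),  # long opaque labels
]

def classify(label):
    """Return the first matching pattern family, else 'other'."""
    for name, rx in PATTERNS:
        if rx.match(label):
            return name
    return "other"

labels = ["wpad", "WPAD", "host42", "k3j9x0q2mfp1", "printer"]
counts = Counter(classify(l) for l in labels)
print(counts["wpad"])  # 2
```

A string whose colliding traffic is dominated by the “wpad” bucket carries a very different risk profile than one dominated by random chaff – which is precisely why case-by-case expert review matters.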

Interisle, starting at page 50, discusses SRV records extensively, as does JAS. We use the “Interisle Categories” they defined starting on page 9.

The metrics and analysis techniques presented yesterday are actually *less sophisticated* and *less comprehensive* than the metrics used by Interisle and JAS a decade ago. It’s worse than “more of the same” – it’s actually a step backward.

In terms of the ad network, it is innovative and has generated interesting data about DNSSEC and other topics. It has never been applied to collisions, and, as stated in my previous email, there are real questions about its applicability and appropriateness. We would be trying something new and untested, with no ability to back-test. Just the facts.

Jeff
