[NCAP-Discuss] Root Cause Analysis Reports - Final Call for Comments

Thomas, Matthew mthomas at verisign.com
Wed Aug 10 11:42:05 UTC 2022


Casey,

Thank you for your comments.  My responses are inline below:

Section 7.3.1

>> The criteria is explained more above (see 1 and 2), but in short, it is 100% of a *sample* of queries, and it is compared to previous work on the subject.

This only raises more questions about the reliability, accuracy, and bias of using sampled query data. My concerns remain even with your additional explanatory text.  There is no data-driven foundation for why 5 queries and 100% adherence are appropriate thresholds.  This is a fundamental premise that is currently unmotivated and unsupported. Framing it as an approximate lower-bound measurement is fine, but then it needs to be treated as such, and it does not support definitive statements like your conclusion.
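To make that concrete, here is some back-of-the-envelope arithmetic (mine, not the report's). If an IP is labeled "minimizing" because all n sampled queries passed, the exact (Clopper-Pearson) one-sided 95% lower confidence bound on its true minimization rate is alpha**(1/n):

    # Exact lower bound when all n sampled queries pass the QNM test
    alpha = 0.05                       # one-sided 95% confidence level
    n = 5                              # sample size used to label an IP
    print(round(alpha ** (1 / n), 3))  # 0.549

In other words, 5 out of 5 minimized queries is consistent, at 95% confidence, with a true minimization rate of only ~55%.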

>>I'm not exactly sure what you're saying.  The de Vries paper covers many facets of qname minimization, and they are very careful to distinguish which parts can be applied and compared elsewhere--and in how they apply them.  I've summarized some of those points in my introductory text of this email.  And again: 1) we used the same methodology for determining minimized queries in passive analysis as they did, which was based on the findings from their active analysis; 2) we applied the analysis of minimized queries to resolvers using metrics also from their paper; and 3) our technique is a heuristic.

>>The selection criteria ignores a more selective QNM criteria defined in the RFC such as the Qtype (e.g., A and NS) and excludes multiple implementations of QNM techniques (e.g., nonce second level labels, underscore labels, asterisk labels, etc.).

Yes – I’m saying that profiling known QNM implementations for ground truth needs to be done (again, the de Vries paper is years old). Furthermore, the de Vries findings still don't make it into your selection criteria: why not test for the qtype being NS or A?  QNM in 2018/2019 was also not a standards-track RFC.  Things have changed.  Industry implementations have changed. The RFC is now standards track (RFC 9156). Applying a 2018/2019 heuristic without profiling current implementations is ignoring the state of the art (see the sketch below for the kind of per-query criteria I mean).

Another point: this analysis did not apply its measurements to "resolvers"; it applied them to any IP that queried the RSS during DITL.  There is no reason to believe that some IP sending 5 queries to the RSS is a recursive resolver.  This is a big distinction, and it again goes to why the thresholds in #1 are not motivated.  And on the last point – sure, it is a heuristic, but without proper motivation and reasoning the results are just a measurement of an unmotivated heuristic, and they need to be treated as such.
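As a purely illustrative sketch (mine, not the report's method), a more selective per-query test could fold in the A/NS qtypes and the underscore, asterisk, and nonce-style labels mentioned above. The nonce pattern here is a crude stand-in – telling nonce labels apart from legitimate SLDs is exactly why current implementations need to be profiled for ground truth:

    import re

    QNM_QTYPES = {"A", "NS"}
    NONCE_LABEL = re.compile(r"^[a-z0-9]{8,16}$")  # hypothetical nonce shape

    def looks_minimized_at_root(qname: str, qtype: str) -> bool:
        """Does a single query seen at the root look like a QNM probe?"""
        if qtype not in QNM_QTYPES:
            return False
        labels = qname.rstrip(".").lower().split(".")
        if len(labels) == 1:            # bare TLD query
            return True
        if len(labels) == 2:            # <special label>.<TLD>
            left = labels[0]
            return left in ("_", "*") or bool(NONCE_LABEL.match(left))
        return False

    print(looks_minimized_at_root("com.", "NS"))             # True
    print(looks_minimized_at_root("_.example.", "A"))        # True
    print(looks_minimized_at_root("www.example.com.", "A"))  # False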

>>See introductory text, parts 3A and 3B in particular.

This has nothing to do with 3A or 3B. The state of the art has moved significantly since 2018/2019 and de Vries with regard to QNM.  I'd encourage you to reexamine the current DITL data from 2022 (actually, from 2019 on) to better understand QNM deployment.  I'm looking at A-root data now, and I can clearly see a sizeable percentage of QNM traffic that you are not capturing due to the items I mentioned previously. I can't stress this point enough.


>>Sorry, I'm not sure what you are getting at here.  If you are referring to the longitudinal measurement plot, it was intended to show the % of ASNs over time with at least one qname minimizing resolver, as a deployment trend.  Nothing more, nothing less.

The point is that measuring QNM at an ASN or even IP level longitudinally, within the context of NCAP, makes no sense. Having one IP in a large ASN that might be doing QNM tells us nothing material about name collisions.


>>Perhaps - but that is not within the scope of this report.  Some of that discussion is had in the "Fourteen Years..." paper.

Can you please share this paper? I can't find it.


>> This is, of course, unrelated to the qname minimization analysis.  But it is an interesting hypothesis that could be tested.

It is absolutely within scope if the Root Cause Analysis Report claims, based on measurements of this passive DNS data, that name collisions decrease. There are existing measurements of the difference between positive and negative referral query rates – I'm not sure why this would need to be tested given what we already know about resolver treatment of NXDOMAIN responses and the differences between positive and negative TTLs.
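For a rough sense of the expected gap (my own arithmetic, using the published root zone TTLs: 172800 seconds on delegation NS records, 86400 seconds of negative caching from the root SOA), a caching resolver re-asks the root about an existing TLD at most every two days but about a non-existent name every day, even before aggressive NSEC caching (RFC 8198) is factored in:

    POSITIVE_TTL = 172_800   # root delegation NS TTL (seconds)
    NEGATIVE_TTL = 86_400    # negative-caching TTL from the root SOA
    DAY = 86_400

    print(DAY / POSITIVE_TTL)  # 0.5 queries/day per cached existing TLD
    print(DAY / NEGATIVE_TTL)  # 1.0 queries/day per cached non-existent name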



>>With regard to the large public resolver, I'm sorry, but this is very vague.  Without any documentation and/or empirical analysis to go on, I have nothing to re-assess or improve my analysis.

See: https://blog.apnic.net/2022/06/02/more-mysterious-dns-root-query-traffic-from-a-large-cloud-dns-operator/  I believe this was also disclosed years ago at DNS-OARC events.



>>Please remember that 1) we are interested in *trends* over time and 2) we only need samples--not complete data--to get those trends.  The samples are taken from the resolvers identified as non-qname-minimizing in 2021.  The process and the ultimate sample sizes are well documented in the text and the table.

Trends require a consistent measurement of the underlying population over time.  You have not motivated or shown why an IP should be treated as equal or consistent for measurement over time, nor have you shown that a 13-query sample is representative.  Calling this sample representative is not supportable without additional evidence.  The same needs to be said about the measurements taken from the passive recursive data used for the Root Cause Analysis Report: there is no data or proof showing that this subset of queries to the root is representative and unbiased, given that it was collected by a biased subset of operators – those willing to deploy the data collection probes.
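One more piece of illustrative arithmetic (again mine, not the report's) on why a 13-query sample is fragile under a 100%-adherence rule: a resolver that minimizes 95% of its queries would still fail the test nearly half the time, so year-over-year "trend" movement can come from sampling noise alone:

    p_true = 0.95                       # hypothetical true minimization rate
    n = 13                              # sample size per resolver
    print(round(1 - p_true ** n, 3))    # 0.487: chance the sample fails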


Matt Thomas
Verisign

