[RSSAC Caucus] Local Perspective Work Party: "Underserved" use case text

Sun Nov 22 22:50:38 UTC 2020

> On Nov 22, 2020, at 2:31 PM, Steve Crocker <steve at shinkuro.com> wrote:
> 
> Ken,
> 
> Thanks for this.  After reading the document and your email, I have the following comments and questions.
> I believe I understand the general idea of measuring both availability and latency to each of the various root server identities (RSIs) multiple times over a 30 minute period.  I had a bit of trouble understanding the precise specifications.  I am completely comfortable with math and math notation, so my trouble is due to some missing pieces in the description.  I suspect this is easily remedied with a few more words.
Agree that it could be expressed better.  Adding formulas with more squiggly lines and Greek letters would be good.  I’d like to talk through the metric on the call tomorrow and get some consensus.  I may then ask for help with the notation.

> What is the justification for the numbers, e.g. 20 times, σ = .65, etc
The justification for σ = 0.65 is in the user narrative document, but pasted here (σ is written as p below):

Desired success rate of 0.95 for a measurement point among N=3 RSIs:
(1-p)^N = (1 - 0.95)
p = 1 - 0.05^(1/3) = 0.63 (for N=3)
round to p = 0.65
Note that the desired rate of 0.95 is a starting point for discussion.
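
For anyone who wants to try other values of N or the target rate, here is a minimal sketch of the same arithmetic. It assumes, as the formula above implicitly does, that failures at the individual RSIs are independent. The function name and the example values of N other than 3 are only illustrative and are not from the user narrative document.

```python
def per_rsi_success_probability(target: float, n_rsis: int) -> float:
    """Solve (1 - p)^N = 1 - target for p.

    target: desired probability that at least one of the N RSIs answers.
    n_rsis: number of RSIs considered (N).
    Assumes independent failures across RSIs (an assumption of this sketch).
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    if n_rsis < 1:
        raise ValueError("n_rsis must be at least 1")
    return 1.0 - (1.0 - target) ** (1.0 / n_rsis)


if __name__ == "__main__":
    # N = 3 is the case used in the text; the other values are illustrative.
    for n in (1, 2, 3, 4):
        p = per_rsi_success_probability(0.95, n)
        print(f"N={n}: p = {p:.2f}")
```

For N=3 this gives p = 0.63 before rounding up to 0.65, so changing either the 0.95 target or N shifts the threshold accordingly.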

As for the “20” values, those are somewhat out of the blue.  Just waiting for someone to ask why and start the discussion.  For availability, we need a moderate number of samples.  20 samples seems reasonable to me right now, but I am open to other thoughts.  As for averaging 20 metrics before comparing to other sites, I just wanted something that can wash away some anomalies and give a better idea of what is happening at a location.  I feel the same about the T=30 minute interval; open to ideas.

> What conclusion would you draw if the local measurements show there is extremely high availability but latency varies between very low and not so low.  For example, there might be one RSI that is topologically close but suffers intermittent availability and other RSIs that are not as close but extremely reliable?
These scenarios will hopefully average out over enough samples (do we need more than 20?).  We are not treating latency and availability as equal factors, but the two scenarios you describe above might end up having similar metrics.

> Adding one or more additional root servers is one way to improve both reliability and availability.  Installing local root service is another way to accomplish the same goals.  How do you expect the measurement process described in this document will affect discussions regarding these two approaches?
This measurement process should just provide data points for each case.  A recursive operator might use a bad result as justification to implement local root for its user base (although this measurement would not reflect the improvements of local root).  An RSO would probably want a lot of data points to inform their subjective “underserved” decision.  Bottom line is that this metric will not be a definitive answer or a direct reason for any particular action.  You still need to do your homework.

Thanks for the comments!

-Ken