[Gnso-rpm-sunrise] Statistical issues/validity of the Analysis Group survey results

Corwin, Philip pcorwin at verisign.com
Wed Dec 12 20:06:46 UTC 2018


Each WG member is free to give any particular piece of data whatever weight they think it deserves, as well as to decide for themselves whether any singular element or collective aggregation of data justifies a particular operational or policy proposal.



I just reviewed the GNSO WG Guidelines and I couldn't find any requirement that consensus recommendations be based on data at all, much less on statistically significant data. I know that Council put out a communication pertaining to WG recommendations being data-based, and that became the rationale for our funding request, which paid for the Analysis Group. I'd invite staff to share that document with us, but I doubt that it requires data to be statistically significant to be a basis for a recommendation.



There may also be instances where available data is inconclusive or contradictory. In my personal view, that may be the case with data pertaining to whether or not the trademark claims notice excessively deters non-infringing domain registrations, or adequately deters infringing registrations. The fact that available data may be inconclusive or contradictory does not bar the working group from proposing or adopting modifications, even where there are different views among members on the conclusions to be drawn from, or the weight to be accorded to, various data elements.  For example, there might be consensus on modifying the language of the trademark claims notice with the intent both to better inform unsophisticated registrants with no infringing intent and to more effectively deter those who intend to infringe for commercial gain.



Philip S. Corwin

Policy Counsel

VeriSign, Inc.

12061 Bluemont Way
Reston, VA 20190

703-948-4648/Direct

571-342-7489/Cell



"Luck is the residue of design" -- Branch Rickey



From: Gnso-rpm-sunrise [mailto:gnso-rpm-sunrise-bounces at icann.org] On Behalf Of Susan Payne
Sent: Wednesday, December 12, 2018 12:11 PM
To: Greg Shatan <gregshatanipc at gmail.com>
Cc: Gnso-rpm-trademark at icann.org; Gnso-rpm-sunrise at icann.org
Subject: [EXTERNAL] Re: [Gnso-rpm-sunrise] Statistical issues/validity of the Analysis Group survey results



Copying both mailing lists to avoid duplicative threads.



I agree with Greg.  George has raised this extensively during two F2F meetings and on follow-up phone calls.  For those who need the reminder, he has also reiterated his views below.  Many others expressed the view that some data is better than none and that, as a group, we are capable of giving it appropriate weight.  This issue has therefore had substantial airtime, and it is time for the subgroups to get on with their substantive work without spending further time on this particular issue.



Susan

Sent from my iPad


On 12 Dec 2018, at 16:58, Greg Shatan <gregshatanipc at gmail.com<mailto:gregshatanipc at gmail.com>> wrote:

All,



George has provided his opinions regarding the surveys, including his opinion as to a conclusion.  I think these were well-known beforehand. It may be useful to some in the group to have this lengthy compendium (which was cut-and-pasted to both subgroups).  While these opinions were expressed at great length and have been widely disseminated, that does not mean these opinions are widely shared in the group (or even narrowly shared...).



I disagree with these opinions.  While surveys with smaller sample sizes should not be treated identically to those with larger sample sizes, surveys with smaller sample sizes still produce results that are valid, useful and informative.  Significant weight can still be attached to these results, and it's entirely appropriate to use the survey analysis tools for these results.



Surveys and polls are conducted for wildly different reasons.  If we were attempting to predict the outcome of a contested election, these sample sizes might be deemed inadequate.  But this is a different exercise, and the sample size was reasonable for this purpose.  Yes, a larger sample size would have allowed for more fine-grained conclusions, but we can still see strong tendencies in the data.



Smaller sample sizes are also appropriate where one is seeking qualitative and experiential data, rather than solely looking for quantitative proportional relationships to a high degree of accuracy.



I hope we can get past this issue (if it is an issue) quickly so we can get on to substance.  Procedurally, I would also hope the co-chairs can figure out how to avoid two parallel discussions based on the same "original post."  This would seem to be a recipe for wasted time and inconsistent discussions.  I hope that those chairing these discussions can exercise some control over the agenda, so we can do our work.



Best regards,



Greg









On Tue, Dec 11, 2018 at 8:35 PM George Kirikos <icann at leap.com<mailto:icann at leap.com>> wrote:

   Hi folks,

   To prepare for tomorrow's call, I analyzed the multiple previous calls
   we had regarding the Analysis Group survey results (October 21,
   October 22, and November 28, 2018). Sub team members might want to
   review the transcripts too, as an in-depth review of the transcripts
   and materials demonstrates that little or no weight should be attached
   to the results, and so it really calls into question the use of the
   "survey analysis tools" for those results.

   Below are some key snippets:

   ***** October 21, RPM Working Group Session 2 *****

   https://gnso.icann.org/en/meetings/transcript-gnso-rpm-session-2-21oct18-en.pdf

   a) Bottom of page 6: I specifically ask about how they determined
   sample size, etc.

   Answer (which goes into page 7): "No, that's a really good question,
   George, and I appreciate it. I mean, I think that the - I have a
   couple of thoughts and responses to what you were raising and I
   essentially kind of, well, I won't say agree with everything but I
   think I do. You know, it's - given the
   available budget for this project, it was difficult for or I think it
   would have been impossible for us to get the number of responses that
   one would have wanted to have to kind of - relatively small margin of
   errors on each of the different estimates in the, you know, that we
   ultimately kind of were able to provide in the various tables in the
   report. And so I wouldn't say that this is a statistically
   representative sample and I wouldn't say that it comes with small
   margins of error but I would say that the trends, at least that we're
   seeing in the data, I would say are informative. I wouldn't, you know,
   hang my hat if my life depended on it in terms of relying on these
   results but I think you see some pretty clear trends, at least with
   respect to the registrants, the potential registrant and the trademark
   surveys. "

   "But George, and I think just to go back to your point, it's very well
   taken. You know, I think for the registrant, the potential registrant
   and the trademark surveys, given the number of responses that we
   received, I feel pretty confident about what those results are saying
   with respect to how people view various components of the rights
   protection mechanisms. Sorry. "

   "And but, with respect to the registries and the registrars, where we
   received a very small number of respondents, I would view that data as
   more anecdotal than anything else."

   [admits that it's not a statistically representative sample; note how
   the statement of being "confident about what the results are saying"
   is entirely inconsistent with the later answers about statistical
   confidence levels from the November 28, 2018 meeting! It also
   incorrectly talks about "trends"; see the October 22, 2018 call
   snippet below that directly addresses the "trends" issue]

   b) Page 10: survey was done in English, excluded Asian countries which
   dominate new gTLDs!

   George Kirikos: Thanks, George Kirikos for the transcript. Given the
   actual distribution, country by country, of new gTLDs, particularly
   the high concentration of registrants from China, I was wondering if
   there was any thought given to doing the survey of, you know, Asian
   and Chinese in particular registrants, because they obviously
   outnumber registrants from other countries? So was there any thought
   given to translating the survey into Chinese and getting their
   feedback? Thank you.

   Ariel Liang: This is Ariel from staff. Thanks for the question,
   George, but based on our budget limitation and discussion with
   Analysis Group and translating is really out of our capability in
   terms of the resources allocated, so we have to do the English but we
   did encourage our colleagues, especially in the GSC, to distribute the
   survey to countries outside English-speaking countries and then just
   tried to promote it as widely as possible. So mainly because of the
   budget resources, that's why we couldn't do that translation.

   c) page 13: admission that this was not a random sample

   George Kirikos: Yes, George Kirikos for the transcript. I'd actually
   like to go to the page directly before that, which is Page number 10,
   where you could actually look at the demographics of the countries.
   You can see, for example, that Canada represented 12% of the sample,
   which is the exact same percentage of the United States. We know that
   Canada has one tenth of the population of the United States, so do you
   actually believe that this is a randomly sampled, representative
   sample based on the fact that these proportions are way out of whack
   with the actual distribution, country by country? Thank you.

   Greg Rafert: Yes and I would say that it is not a random sample
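   One way to quantify how far those country proportions depart from what
   random sampling would produce is a chi-square goodness-of-fit test. The
   sketch below uses purely hypothetical counts (not the survey's actual
   figures), assuming expected shares roughly proportional to population
   (the US around ten times Canada) while the observed shares are equal,
   as with the 12%/12% split noted above:

   ```python
   # Hypothetical counts for illustration only -- not the survey's
   # actual figures.  Expected shares are proportional to assumed
   # population; observed US and Canada counts are equal.
   observed = {"US": 60, "CA": 60, "other": 380}
   population_share = {"US": 0.30, "CA": 0.03, "other": 0.67}

   n = sum(observed.values())

   # Pearson chi-square goodness-of-fit statistic: a large value means
   # this country mix is very unlikely under random sampling from the
   # assumed population shares (critical value at 5% for df=2 is 5.99).
   chi2 = sum(
       (observed[c] - n * population_share[c]) ** 2 / (n * population_share[c])
       for c in observed
   )
   print(f"chi-square = {chi2:.1f} (df = {len(observed) - 1})")
   ```

   Even with these made-up numbers, the statistic lands far above the
   critical value, which is consistent with the "not a random sample"
   admission above.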

   d) page 14: we learn that panel survey respondents were paid 75 cents per survey

   George Kirikos: Yes, George Kirikos here. Yes, that fraction seems
   very high to me and we would actually know, from the experience of the
   registrars, what the actual figures should be. I want to actually ask
   about the composition of the panel sample, I think it said somewhere
   in the report that these were people that were paid small amounts to
   participate in the survey. I know from Mechanical Turk and other
   things, other survey systems like that, you have a bunch of people
   that can take a survey for $2 or $3 and you know, make extra money in
   their spare time and self-qualify for the survey, declare that they
   are registrants, et cetera in order to make an extra few bucks. Can
   you tell us exactly how much the panel sample was paid per response?
   Thank you.

   Greg Rafert: I believe they were paid 75 cents. There's some variation
   based on the country so I think it can go as much as like a $1.25 or
   $1.50 but it's around 75 cents.

   e) page 17: obviously illogical answers contaminated the data

   Paul Keating: (Paul Keating), for the record. Just a quick question,
   because you said that people could select multiple responses here. If
   someone had responded with a response to a question that you -
   basically concludes they don't know what they're - they don't
   understand the issue, did you eliminate those from the statistics used
   if they happened to also have responded to the first two? So in other
   words, if someone responded to one of the first two, but then clearly
   responded to the last two, it seems logical that you would disregard
   that entire response. Thank you.

   Greg Rafert: Yes, that's a good point. We have not, and I think we should.

   Similarly on page 29:

   Woman 6: To the same question that was asked earlier, I think by Paul,
   if somebody answered yes to all of these, are you going to go back and
   kind of eliminate that?

   Greg Rafert: Yes, I think it's a really good suggestion change -
   suggested change.

   [unclear whether that's been done or not]

   f) page 32: members of this PDP *not* prevented from answering survey!!

   George Kirikos: George Kirikos for the transcript. Do - there's an
   echo. There's still an echo. Better now. I wanted to ask whether
   members of this PDP were prevented from doing - filling out this
   survey or is there overlap in membership of this PDP and answers to
   the survey? Thank you.

   Greg Rafert: So they were not prevented from taking the survey.

   ***** October 22, 2018: RPM Working Group Session 3 *****

   https://gnso.icann.org/en/meetings/transcript-gnso-rpm-session-3-22oct18-en.pdf

   g) page 10: incorrect claim that there are "trends", another admission
   of not a statistically representative sample

   Greg Rafert: ... I think there are some useful trends and interesting
   data points that we've identified in the survey. But I certainly would
   not say that this is a statistically representative sample of registry
   operators, for example.

   h) page 13: need to be careful due to the low number of responses

   Greg Rafert:... I want to be of course careful with kind of how we
   interpret any of the results coming out of this because very few
   individuals actually responded to it...

   i) pages 17-18: Paul Keating asks about the small sample; admissions
   about lack of statistical validity

   Paul Keating: (V) Paul. Paul Keating. I have a question. It's directed
   at - I'm trying to solicit from you a little bit of guidance. This was
   a survey. A lot of work went into it, both by members of the sub teams
   and by yourselves. But we had a pretty small survey universe and there
   would be, if someone were to look at it from a statistically valid
   standpoint, it wouldn't be considered very - it wouldn't be given lots
   of weight. Let's put it that way. So what would - how do you think we
   ought to take this information and how much importance should we be
   placing on individual responses versus the report as a whole? Thank
   you.

   Greg Rafert: I think it's a really good question. I agree. I certainly
   wouldn't ascribe any kind of statistical validity to the survey
   instrument. I would kind of view it as part of the process. I've
   worked on a couple of ICANN reviews and within the context of a given
   ICANN review, we do surveys, we do interviews, we talk to other
   participants within the ICANN community. We read kind of external past
   year materials and reports. And so the survey is kind of one component
   of that data collection exercise and thinking about the problem. And
   we kind of - it's not weighted more or less than any of those pieces.
   We kind of think about it as a holistic exercise and seeing are we
   identifying trends that we've also kind of heard about from just
   discussions with people in the industry, for example. So I would view
   it -- and I don't know if this is helpful or not -- I would view it as
   a piece of information. I think there's some interesting trends in
   some of the questions. Is that the exact state of the world? So is it
   really that 55% of all registry operators believe X? Probably not but
   it might give you an indication as to kind of where people's views and
   beliefs are headed.

   Paul Keating: This is Paul Keating following up.
   So it's more anecdotal than anything else?

   Greg Rafert: Certainly for
   the registry operator and registrar surveys I would say that it's
   anecdotal information. It can be informative but you have to kind of
   look at it through the lens of there being not a large number of
   responses. For the trademark owner potential registrant and registrant
   surveys, one thing we don't have a large enough sample to say that
   it's statistically valid. But given the higher response rates there,
   especially for the potential registrants, I say it begins to move away
   from the world of anecdotes and to be something a little bit more.

   Paul McGrady: This is Paul McGrady for the record. Just to follow-up
   on the real Paul's question, which is what is better, what we have
   here, or no information? Right, is it better to have this, anecdotal
   information perhaps for the contracted parties, something more than an
   anecdotal for the rest. Is that a - in terms of decision making, is it
   better to have this or is it better to have nothing?

   Greg Rafert: I at least think so. I think it's of course largely up
   for you all to decide but I would think that it's better than having
   no information.

   j) pages 22-23: I call out the issue of whether this report actually
   identifies "trends", and we also learn the budget for the survey

   Ariel Liang: ..And then there is a question from George Kirikos for
   Greg. Greg has a PhD so I trust that he can answer this. Would Greg
   confirm that trends refer to changes of data over time and this survey
   was a one-time survey. Thus, isn't it entire incorrect to claim that
   this survey is an indicator of a trend when there is no way to detect
   change over time from a single survey? And then he also asked another
   question. What was the actual budget for this survey? Sorry, this is
   Ariel Liang from staff. We can probably answer George's second
   question. The actual budget is $50,000.

   Greg Rafert: And to your first question, George, yes, I was being a
   little loose with my language in terms of using the word trends. So I
   definitely did not mean to imply or suggest that we're looking at
   changes in perceptions or people's views over time.

   k) page 25-26: Paul Keating also inquires about the issue of members of this
   PDP filling out the survey

   Paul Keating: Hi, Paul Keating for the record and I'm not trying to
   add controversy where none needs to be but I was just thinking about
   Paul's comment about more information is better than no information.
   And I remember that yesterday, you didn't have a restriction on at
   least the council's participation in the survey as prohibiting someone
   who is in the working group already from participating in in the
   survey. So I'm wondering how many people in any of those questions
   participated who were actually working group members. And the only
   reason is that if we're going to allocate importance in any degree to
   the survey, want to make sure that we're not double counting people,
   right, because we have our own opinions during the working group and
   working group sessions. And I don't want there to be a perceived even
   concept out there that certain people, whether it's on the domain
   registrant side or the trademark claimant side were in the survey
   pumping up the results so that effectively their position during the
   working group could be sustained to a higher degree of importance. So
   if you could verify that, if there's any way of verifying it, or just
   simply asking people -- members of the working group -- whether or not
   they participated in the survey that would answer the question and
   eliminate a potential source of conflict and suspicion. Thank you.

   Greg Rafert: So to verify, we would need them to identify to us that
   they took the survey and we can also identify the responses as long as
   they gave us some sense for - assuming they remembered how they
   answered certain questions. And maybe that latter part is not
   possible.

   l) page 31: admission that there's not a representative sample of
   trademark owners for this survey

   Ariel Liang: This is Ariel from staff. There is a question from George
   Kirikos. I'd like to point Greg to Page A2-526. Is that a
   representative sample of the entire universe of trademark holders
   given the number of firms who - and with revenues in billions of
   dollars? And he has a follow-up.

   Greg Rafert: Yes, I would say it is not a representative sample of
   trademark owners.

   ***** November 28, 2018 RPM PDP Working Group meeting *****

   https://community.icann.org/display/RARPMRIAGPWG/2018-11-28+Review+of+all+Rights+Protection+Mechanisms+%28RPMs%29+in+all+gTLDs+PDP+WG

   https://community.icann.org/download/attachments/99483940/Questions%20%26%20Comments%20-%20Final%20Report%20RPM%20Survey%20-%20AG%20comments.pdf?version=1&modificationDate=1543271647000&api=v2

   m) no statistical confidence intervals, no margins of error!

   Question (by George Kirikos, page 5):

   None of the tables include the asserted "margin of error" numbers in
   the current draft of the final report. Please provide them at a 95%
   confidence level. [one can use a standard calculator such as
   https://www.surveymonkey.com/mp/margin-oferror-calculator/ to do this;
   if it's too much work to do this for all tables, please advise what
   your "population" or "universe" number is, and calculate the margin of
   error for tables: Q1a (page 10), Q6 (page 12), Q6a (page 12), Q6a.i
   (page 13), Q8 (page 15), Q9 (page 21), Q4a (page 32), Q21 (page 38),
   Q21a.i. (page 39), Q5 (page 43) Q2 (page 47), and Q13b (page 53).

   Answer (from the Analysis Group):

    "Due to the opt-in nature of the surveys and issues that arise in
   defining the proper population for some of the respondent groups, we
   do not feel it would be proper to provide confidence intervals or
   margins of error for these results."
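   For reference, the calculation the linked calculator performs is the
   standard margin-of-error formula for a proportion. A minimal Python
   sketch, using illustrative sample sizes rather than the survey's actual
   response counts:

   ```python
   import math

   def margin_of_error(n, population=None, p=0.5, z=1.96):
       """Margin of error at 95% confidence (z = 1.96) for a proportion p
       estimated from a simple random sample of size n, with an optional
       finite-population correction when the population size is known."""
       moe = z * math.sqrt(p * (1 - p) / n)
       if population is not None and population > n:
           # Finite-population correction: shrinks the margin when the
           # sample is a sizable fraction of the population.
           moe *= math.sqrt((population - n) / (population - 1))
       return moe

   # Illustrative sample sizes only -- the actual response counts are
   # in the Analysis Group's report.
   for n in (30, 100, 400):
       print(f"n={n}: +/- {margin_of_error(n):.1%}")
   ```

   Note that this formula assumes a simple random sample, which is exactly
   the assumption the Analysis Group conceded does not hold here; that is
   presumably why they declined to report margins of error at all.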

   n) from November 28, 2018 call transcript, page 6, no anti-fraud measures!

   https://gnso.icann.org/sites/default/files/file/field-file-attach/transcript-rpm-review-28nov18-en.pdf

   George Kirikos: ...So I was wondering whether the list of questions and answers
   that we saw was complete. Or whether there are also these anti-fraud
   questions that were inter-mingled in the survey. To try to deter that
   kind of abuse. "

   Answer (transcript says "Greg Shatan", but should be "Greg Rafert"
   obviously):  "Yes, this is Greg. There were no anti-fraud questions in
   the survey since we were just trying to keep it to be as short as
   possible. I don't think we will saw much evidence of people randomly
   answering questions in the survey. Generally, the ways in which people
   answered questions were pretty consistent. And I think, you know, we
   can certainly look into this two that received cease and desist
   letters. It's possible that we were mistaken in stating that. But if
   they did we can certainly look if there are any kind of other fishy
   responses by those two individuals. Thanks."

   (Key part of the above: no anti-fraud questions! Earlier in the
   transcript, the results had already indicated that some panel
   respondents were simply providing random answers that don't correspond
   to observable real-world event proportions (i.e. lawsuits, UDRPs,
   etc.), in order to earn their 75-cent payouts as quickly as possible.)

   In conclusion, there were serious issues with this survey, and it'd be
   difficult to attach any weight at all to the results.


   Sincerely,

   George Kirikos
   416-588-0269
   http://www.leap.com/
   _______________________________________________
   Gnso-rpm-trademark mailing list
   Gnso-rpm-trademark at icann.org<mailto:Gnso-rpm-trademark at icann.org>
   https://mm.icann.org/mailman/listinfo/gnso-rpm-trademark

