[Gnso-rpm-sunrise] Statistical issues/validity of the Analysis Group survey results
icann at leap.com
Wed Dec 12 01:36:42 UTC 2018
To prepare for tomorrow's call, I analyzed the multiple previous calls
we had regarding the Analysis Group survey results (October 21,
October 22, and November 28, 2018). Sub team members might want to
review the transcripts too, as an in-depth review of the transcripts
and materials demonstrates that little or no weight should be attached
to the results, and so it really calls into question the use of the
"survey analysis tools" for those results.
Below are some key snippets:
***** October 21, RPM Working Group Session 2 *****
a) Bottom of page 6: I specifically ask about how they determined
sample size, etc.
Answer (which goes into page 7): "No, that's a really good question,
George, and I appreciate it. I mean, I think that the - I have a
couple of thoughts and responses to what you were raising and I
essentially kind of, well, I won't say agree with everything but I
think I do. You know, it's - given the
available budget for this project, it was difficult for or I think it
would have been impossible for us to get the number of responses that
one would have wanted to have to kind of - relatively small margin of
errors on each of the different estimates in the, you know, that we
ultimately kind of were able to provide in the various tables in the
report. And so I wouldn't say that this is a statistically
representative sample and I wouldn't say that it comes with small
margins of error but I would say that the trends, at least that we're
seeing in the data, I would say are informative. I wouldn't, you know,
hang my hat if my life depended on it in terms of relying on these
results but I think you see some pretty clear trends, at least with
respect to the registrants, the potential registrant and the trademark
"But George, and I think just to go back to your point, it's very well
taken. You know, I think for the registrant, the potential registrant
and the trademark surveys, given the number of responses that we
received, I feel pretty confident about what those results are saying
with respect to how people view various components of the rights
protection mechanisms. Sorry."
"And but, with respect to the registries and the registrars, where we
received a very small number of respondents, I would view that data as
more anecdotal than anything else."
[admits that it's not a statistically representative sample; note how
the statement of "confident about what those results are saying" is
entirely inconsistent with the later answers about statistical
confidence levels at the November 28, 2018 meeting! Also incorrectly
talks about "trends"; see the October 22, 2018 call snippet below that
directly addresses the "trends" issue]
b) Page 10: survey was done in English, excluded Asian countries which
dominate new gTLDs!
George Kirikos: Thanks, George Kirikos for the transcript. Given the
actual distribution, country by country, of new gTLDs, particularly
the high concentration of registrants from China, I was wondering if
there was any thought given to doing the survey of, you know, Asian
and Chinese in particular registrants, because they obviously
outnumber registrants from other countries? So was there any thought
given to translating the survey into Chinese and getting their
feedback? Thank you.
Ariel Liang: This is Ariel from staff. Thanks for the question,
George, but based on our budget limitation and discussion with
Analysis Group and translating is really out of our capability in
terms of the resources allocated, so we have to do the English but we
did encourage our colleagues, especially in the GSC, to distribute the
survey to countries outside English-speaking countries and then just
tried to promote it as widely as possible. So mainly because of the
budget resources, that's why we couldn't do that translation.
c) page 13: admission that this was not a random sample
George Kirikos: Yes, George Kirikos for the transcript. I'd actually
like to go to the page directly before that, which is Page number 10,
where you could actually look at the demographics of the countries.
You can see, for example, that Canada represented 12% of the sample,
which is the exact same percentage of the United States. We know that
Canada has one tenth of the population of the United States, so do you
actually believe that this is a randomly sampled, representative
sample based on the fact that these proportions are way out of whack
with the actual distribution, country by country? Thank you.
Greg Rafert: Yes and I would say that it is not a random sample
d) page 14: we learn panel survey respondents were paid about 75 cents
per response
George Kirikos: Yes, George Kirikos here. Yes, that fraction seems
very high to me and we would actually know, from the experience of the
registrars, what the actual figures should be. I want to actually ask
about the composition of the panel sample, I think it said somewhere
in the report that these were people that were paid small amounts to
participate in the survey. I know from Mechanical Turk and other
things, other survey systems like that, you have a bunch of people
that can take a survey for $2 or $3 and you know, make extra money in
their spare time and self-qualify for the survey, declare that they
are registrants, et cetera in order to make an extra few bucks. Can
you tell us exactly how much the panel sample was paid per response?
Greg Rafert: I believe they were paid 75 cents. There's some variation
based on the country so I think it can go as much as like a $1.25 or
$1.50 but it's around 75 cents.
e) page 17: obvious illogical answers contaminated the data
Paul Keating: (Paul Keating), for the record. Just a quick question,
because you said that people could select multiple responses here. If
someone had responded with a response to a question that you -
basically concludes they don't know what they're - they don't
understand the issue, did you eliminate those from the statistics used
if they happened to also have responded to the first two? So in other
words, if someone responded to one of the first two, but then clearly
responded to the last two, it seems logical that you would disregard
that entire response. Thank you.
Greg Rafert: Yes, that's a good point. We have not, and I think we should.
Similarly on page 29:
Woman 6: To the same question that was asked earlier, I think by Paul,
if somebody answered yes to all of these, are you going to go back and
kind of eliminate that?
Greg Rafert: Yes, I think it's a really good suggestion change -
[unclear whether that's been done or not]
f) page 32: members of this PDP *not* prevented from answering survey!!
George Kirikos: George Kirikos for the transcript. Do - there's an
echo. There's still an echo. Better now. I wanted to ask whether
members of this PDP were prevented from doing - filling out this
survey or is there overlap in membership of this PDP and answers to
the survey? Thank you.
Greg Rafert: So they were not prevented from taking the survey.
***** October 22, 2018: RPM Working Group Session 3 *****
g) page 10: incorrect claim that there are "trends", another admission
of not a statistically representative sample
Greg Rafert: ... I think there are some useful trends and interesting
data points that we've identified in the survey. But I certainly would
not say that this is a statistically representative sample of registry
operators, for example.
h) page 13: need to be careful due to the low number of responses
Greg Rafert: ... I want to be of course careful with kind of how we
interpret any of the results coming out of this because very few
individuals actually responded to it...
i) pages 17-18: Paul Keating asks about the small sample; admissions
about lack of statistical validity
Paul Keating: (V) Paul. Paul Keating. I have a question. It's directed
at - I'm trying to solicit from you a little bit of guidance. This was
a survey. A lot of work went into it, both by members of the sub teams
and by yourselves. But we had a pretty small survey universe and there
would be, if someone were to look at it from a statistically valid
standpoint, it wouldn’t be considered very - it wouldn’t be given lots
of weight. Let's put it that way. So what would - how do you think we
ought to take this information and how much importance should we be
placing on individual responses versus the report as a whole? Thank
Greg Rafert: I think it's a really good question. I agree. I certainly
wouldn't ascribe any kind of statistical validity to the survey
instrument. I would kind of view it as part of the process. I've
worked on a couple of ICANN reviews and within the context of a given
ICANN review, we do surveys, we do interviews, we talk to other
participants within the ICANN community. We read kind of external past
year materials and reports. And so the survey is kind of one component
of that data collection exercise and thinking about the problem. And
we kind of - it's not weighted more or less than any of those pieces.
We kind of think about it as a holistic exercise and seeing are we
identifying trends that we've also kind of heard about from just
discussions with people in the industry, for example. So I would view
it -- and I don’t know if this is helpful or not -- I would view it as
a piece of information. I think there's some interesting trends in
some of the questions. Is that the exact state of the world? So is it
really that 55% of all registry operators believe X? Probably not but
it might give you an indication as to kind of where people's views and
beliefs are headed.
Paul Keating: This is Paul Keating following up. So it's more
anecdotal than anything else?
Greg Rafert: Certainly for the registry operator and registrar surveys
I would say that it's
anecdotal information. It can be informative but you have to kind of
look at it through the lens of there being not a large number of
responses. For the trademark owner potential registrant and registrant
surveys, one thing we don’t have a large enough sample to say that
it's statistically valid. But given the higher response rates there,
especially for the potential registrants, I say it begins to move away
from the world of anecdotes and to be something a little bit more.
Paul McGrady: This is Paul McGrady for the record. Just to follow-up
on the real Paul's question, which is what is better, what we have
here, or no information? Right, is it better to have this, anecdotal
information perhaps for the contracted parties, something more than an
anecdotal for the rest. Is that a - in terms of decision making, is it
better to have this or is it better to have nothing?
Greg Rafert: I at least think so. I think it's of course largely up
for you all to decide but I would think that it's better than having
j) pages 22-23: I call out the issue of whether this report actually
identifies "trends", and we also learn the budget for the survey
Ariel Liang: ... And then there is a question from George Kirikos for
Greg. Greg has a PhD so I trust that he can answer this. Would Greg
confirm that trends refer to changes of data over time and this survey
was a one-time survey. Thus, isn't it entire incorrect to claim that
this survey is an indicator of a trend when there is no way to detect
change over time from a single survey? And then he also asked another
question. What was the actual budget for this survey? Sorry, this is
Ariel Liang from staff. We can probably answer George's second
question. The actual budget is $50,000.
Greg Rafert: And to your first question, George, yes, I was being a
little loose with my language in terms of using the word trends. So I
definitely did not mean to imply or suggest that we're looking at
changes in perceptions or people's views over time.
k) pages 25-26: Paul Keating also calls out the issue of members of this
PDP filling out the survey
Paul Keating: Hi, Paul Keating for the record and I'm not trying to
add controversy where none needs to be but I was just thinking about
Paul's comment about more information is better than no information.
And I remember that yesterday, you didn't have a restriction on at
least the council's participation in the survey as prohibiting someone
who is in the working group already from participating in the
survey. So I'm wondering how many people in any of those questions
participated who were actually working group members. And the only
reason is that if we're going to allocate importance in any degree to
the survey, want to make sure that we're not double counting people,
right, because we have our own opinions during the working group and
working group sessions. And I don't want there to be a perceived even
concept out there that certain people, whether it's on the domain
registrant side or the trademark claimant side were in the survey
pumping up the results so that effectively their position during the
working group could be sustained to a higher degree of importance. So
if you could verify that, if there's any way of verifying it, or just
simply asking people -- members of the working group -- whether or not
they participated in the survey that would answer the question and
eliminate a potential source of conflict and suspicion. Thank you.
Greg Rafert: So to verify, we would need them to identify to us that
they took the survey and we can also identify the responses as long as
they gave us some sense for - assuming they remembered how they
answered certain questions. And maybe that latter part is not
l) page 31: admission that there's not a representative sample of
trademark owners for this survey
Ariel Liang: This is Ariel from staff. There is a question from George
Kirikos. I'd like to point Greg to Page A2-526. Is that a
representative sample of the entire universe of trademark holders
given the number of firms who - and with revenues in billions of
dollars? And he has a follow-up.
Greg Rafert: Yes, I would say it is not a representative sample of
***** November 28, 2018 RPM PDP Working Group meeting *****
m) no statistical confidence intervals or margins of error!
Question (by George Kirikos, page 5):
None of the tables include the asserted “margin of error” numbers in
the current draft of the final report. Please provide them at a 95%
confidence level. [one can use a standard calculator such as
https://www.surveymonkey.com/mp/margin-oferror-calculator/ to do this;
if it’s too much work to do this for all tables, please advise what
your “population” or “universe” number is, and calculate the margin of
error for tables: Q1a (page 10), Q6 (page 12), Q6a (page 12), Q6a.i
(page 13), Q8 (page 15), Q9 (page 21), Q4a (page 32), Q21 (page 38),
Q21a.i. (page 39), Q5 (page 43), Q2 (page 47), and Q13b (page 53).]
Answer (from the Analysis Group):
"Due to the opt-in nature of the surveys and issues that arise in
defining the proper population for some of the respondent groups, we
do not feel it would be proper to provide confidence intervals or
margins of error for these results."
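For reference, the calculation I asked for can be sketched in a few
lines. This is a minimal sketch only, not anything Analysis Group
produced: the function name and the sample sizes in the loop are
hypothetical and are not taken from the report, and the formula assumes
a simple random sample, which is exactly the condition Analysis Group
says does not hold for these opt-in surveys. It uses the standard
normal-approximation margin of error for a proportion, with an optional
finite population correction when a "universe" number is known.

```python
import math
from typing import Optional


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96,
                    population: Optional[int] = None) -> float:
    """Margin of error for a proportion under simple random sampling.

    n          -- number of responses (hypothetical values used below)
    p          -- observed proportion; 0.5 gives the widest margin
    z          -- critical value; 1.96 corresponds to 95% confidence
    population -- size of the "universe", if known (assumption: the
                  sample was drawn at random from it)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    moe = z * math.sqrt(p * (1.0 - p) / n)
    if population is not None:
        if population < n:
            raise ValueError("population cannot be smaller than n")
        if population > 1:
            # Finite population correction shrinks the margin when the
            # sample is a large share of the universe.
            moe *= math.sqrt((population - n) / (population - 1))
    return moe


if __name__ == "__main__":
    # Hypothetical response counts, for illustration only.
    for responses in (10, 50, 500):
        print(responses, round(100 * margin_of_error(responses), 1), "%")
```

The point of the sketch is only that the margin shrinks with the square
root of the number of responses, so tables built on a handful of
responses would carry very wide margins even if the sample had been
random.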
n) from November 28, 2018 call transcript, page 6, no anti-fraud measures!
George Kirikos: ...So I was wondering whether the list of questions and answers
that we saw was complete. Or whether there are also these anti-fraud
questions that were inter-mingled in the survey. To try to deter that
kind of abuse.
Answer (transcript says "Greg Shatan", but should be "Greg Rafert"
obviously): "Yes, this is Greg. There were no anti-fraud questions in
the survey since we were just trying to keep it to be as short as
possible. I don't think we will saw much evidence of people randomly
answering questions in the survey. Generally, the ways in which people
answered questions were pretty consistent. And I think, you know, we
can certainly look into this two that received cease and desist
letters. It's possible that we were mistaken in stating that. But if
they did we can certainly look if there are any kind of other fishy
responses by those two individuals. Thanks."
(Key part of the above: no anti-fraud questions! Rebecca Tushnet had
already identified, earlier in that transcript, that folks from the
panel were simply providing random answers that don't correspond to
observable real-world event proportions (e.g. lawsuits, UDRPs, etc.),
in order to earn their 75 cent payouts as quickly as possible)
In conclusion, there were serious issues with this survey, and it'd be
difficult to attach any weight at all to the results.