[gnso-rpm-wg] A Proposal for Smarter Non-Exact Matches

Wed May 31 15:59:14 UTC 2017

Going back to Greg's proposal, it would lead to an enormous number of
expanded matches, and is in no way a "smarter non-exact matches"
system. As an exercise, I challenge Greg to produce a list of ALL the
"smarter non-exact matches" for just the top 10 TMCH terms as
documented by The Analysis Group, to see how expansionary this
so-called "smarter" set of non-exact matches is. Remember, there are
already tens of thousands of terms in the TMCH, and if Greg's proposal
were adopted, it would attract even more entries into that database.

Let's do a little bit of math:

1. missing-dot typos: triples the number of matches (by adding 2 more terms)

2. fat finger typos: there are between 3 and 8 possible "fat-finger"
characters surrounding *each* letter of the standard QWERTY keyboard.
Let's suppose the average is 5. That means 5 new matching terms for
each LETTER in the mark. For a 10 letter mark, that means 5x10 = 50
additional matching terms!

3. Character duplication: this adds an additional number of matches
equal to the length of the mark. For a 10 letter mark, that's another
10 matching terms.

4. Character swaps: this can add an additional number of matches *up
to* the length of the mark ("up to", because in some cases the
adjacent letters will be identical, e.g. 'Google' has the 'oo').
Conservatively, it'll be 60% x the length of the mark. So, for a 10
letter mark, that's at least 6 more matching terms.

5. Character removal: this can add an additional number of matches *up
to* the length of the mark ("up to", because in some cases the
adjacent letters will be identical, e.g. 'Google' has the 'oo').
Conservatively, it'll be 60% x the length of the mark. So, for a 10
letter mark, that's at least 6 more matching terms.

6. Plurals: This is poorly defined in the proposal, since adding "s"
doesn't always make sense as a plural. Sometimes it will be "es",
sometimes "ies" (delivery -- deliveries), and then there are plurals
in non-English languages where it might be something else:

http://www.dummies.com/languages/french/how-to-make-french-nouns-plural/

Also, this assumes the term is a noun. What's the "plural" of a mark
that is a verb or adjective? Anyhow, the "naive" rule of adding just
the letter 's' will add 1 more term per mark.

7. Digit Addition: this is a puzzler. What's magical about the number
'1'?? If it's as stated, it adds 1 more term per mark. Or, maybe 10
more terms per mark, if one doesn't discriminate between '1' and
2/3/4/5/6/7/8/9/0

8. CHEAP / BUY: 4 more matches per term. Unclear if Greg wants to also
consider hyphens, which would match even more.

9. Non-Latin character substitutions: these are IDN matches which
would result in a gigantic number of increased matches (e.g. multiple
languages, plus variations for each letter, etc.)

10. Latin Character Substitutions: this is poorly defined. If it's
only 'w' vs 'vv' then that's a small number of additional matches.
But, as the "SWORD" algorithm debacle demonstrated, similarity is in
the eye of the beholder.

11. Goods and Services and Industry Keywords: poorly defined, would
add presumably at least 10 or more terms per mark, multiplied by even
higher variations if hyphens are included or not.

12. Commonly abused terms: see #11.
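
To make the per-mark arithmetic for items 2 through 5 concrete, here is a
minimal Python sketch that enumerates the variants for a single mark. It
is only an illustration under stated assumptions: the keyboard model is a
simplified, letters-only QWERTY grid (so it undercounts the 3 to 8
neighbours mentioned above, which would include digits and punctuation),
and all function names are hypothetical, not taken from Greg's proposal.

```python
# Simplified, letters-only QWERTY layout (an assumption for illustration).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

def neighbors(ch):
    """Letters adjacent to ch, modelling each row as shifted half a key right."""
    for r, row in enumerate(ROWS):
        i = row.find(ch)
        if i < 0:
            continue
        cells = [(r, i - 1), (r, i + 1),      # same row
                 (r - 1, i), (r - 1, i + 1),  # row above
                 (r + 1, i - 1), (r + 1, i)]  # row below
        return {ROWS[rr][ii] for rr, ii in cells
                if 0 <= rr < len(ROWS) and 0 <= ii < len(ROWS[rr])}
    return set()

def fat_finger(mark):
    """Item 2: replace each letter with each of its keyboard neighbours."""
    return {mark[:i] + n + mark[i + 1:]
            for i, ch in enumerate(mark) for n in neighbors(ch)}

def duplications(mark):
    """Item 3: double each character in turn."""
    return {mark[:i] + mark[i] + mark[i:] for i in range(len(mark))}

def swaps(mark):
    """Item 4: transpose each adjacent pair, dropping no-op swaps."""
    out = {mark[:i] + mark[i + 1] + mark[i] + mark[i + 2:]
           for i in range(len(mark) - 1)}
    out.discard(mark)
    return out

def removals(mark):
    """Item 5: delete each character in turn."""
    return {mark[:i] + mark[i + 1:] for i in range(len(mark))}

def all_typos(mark):
    mark = mark.lower()
    return fat_finger(mark) | duplications(mark) | swaps(mark) | removals(mark)

if __name__ == "__main__":
    for name, rule in [("fat finger", fat_finger), ("duplication", duplications),
                       ("swap", swaps), ("removal", removals)]:
        print(name, len(rule("google")))
    print("total distinct", len(all_typos("google")))
```

Sets are used so that collisions (such as the 'oo' in 'Google') are
counted once, which is why the swap, removal, and duplication counts come
out below the length of the mark. Even under this conservative keyboard
model, a 6-letter mark picks up dozens of additional matching strings
before plurals, digits, keywords, or IDN substitutions are considered.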

Greg hasn't discussed whether these also apply in *combination*, e.g.
"missing dot + character swap", or "cheap/buy + digit addition", etc. If
so, that's a combinatorial explosion in the number of matches. The
number of false positives would be enormous.
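
A rough way to see the explosion is to chain two rules and compare the
count with single-rule matching. The sketch below is hypothetical: it
covers only four of the twelve categories, the rule functions are my own
naming, and the proposal does not say whether combinations are intended
at all.

```python
from itertools import permutations

def removals(s):
    """Delete each character in turn."""
    return {s[:i] + s[i + 1:] for i in range(len(s))}

def swaps(s):
    """Transpose each adjacent pair, dropping no-op swaps."""
    out = {s[:i] + s[i + 1] + s[i] + s[i + 2:] for i in range(len(s) - 1)}
    out.discard(s)
    return out

def add_digit(s):
    """Append the digit '1', as the digit-addition item is literally stated."""
    return {s + "1"}

def cheap_buy(s):
    """Prefix or suffix 'cheap' / 'buy' (4 variants, no hyphens)."""
    words = ("cheap", "buy")
    return {w + s for w in words} | {s + w for w in words}

RULES = [removals, swaps, add_digit, cheap_buy]

def single(mark):
    """Variants reachable by applying exactly one rule."""
    out = set()
    for rule in RULES:
        out |= rule(mark)
    out.discard(mark)
    return out

def pairs(mark):
    """Variants reachable by applying two different rules in sequence."""
    out = set()
    for first, second in permutations(RULES, 2):
        for variant in first(mark):
            out |= second(variant)
    out.discard(mark)
    return out

if __name__ == "__main__":
    print("one rule: ", len(single("google")))
    print("two rules:", len(pairs("google")))
```

With only four simple rules, the two-rule set is already several times
larger than the one-rule set, and it grows again with each additional
rule or each additional level of chaining.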

Furthermore, there would be multiple mark matches per domain name
(i.e. multiple marks generating a claim for a single domain name
registration attempt), especially for short marks (5 characters or
fewer), where the "density" of registrations of "nearby" coexisting and
non-infringing strings is high.
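
The overlap problem can be illustrated with a reverse index from each
generated variant back to the marks that produce it. This is a sketch
only: the three "marks" are made-up dictionary words, not actual TMCH
entries, and just three of the typo rules are applied.

```python
from collections import defaultdict

def variants(mark):
    """Single-edit variants: removals, adjacent swaps, and duplications."""
    out = set()
    for i in range(len(mark)):
        out.add(mark[:i] + mark[i + 1:])        # removal
        out.add(mark[:i] + mark[i] + mark[i:])  # duplication
        if i + 1 < len(mark):
            out.add(mark[:i] + mark[i + 1] + mark[i] + mark[i + 2:])  # swap
    out.discard(mark)
    return out

def claims_index(marks):
    """Map each variant string to the set of marks that would claim it."""
    index = defaultdict(set)
    for mark in marks:
        for v in variants(mark):
            index[v].add(mark)
    return index

if __name__ == "__main__":
    # Hypothetical short marks, chosen only to show the collision.
    index = claims_index(["cast", "cart", "coat"])
    for label, owners in sorted(index.items()):
        if len(owners) > 1:
            print(label, "->", sorted(owners))
```

Here a registrant attempting the ordinary word "cat" would trigger
notices from all three hypothetical marks, since each one reduces to
"cat" by removing a single character.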

Sincerely,

George Kirikos
416-588-0269
http://www.leap.com/

On Wed, May 31, 2017 at 9:11 AM, Phil Corwin <psc at vlaw-dc.com> wrote:
> PS—An additional question: While arguably in the realm of our upcoming
> Claims Notice review, how and to what extent would the language of the
> notice need to be altered and expanded to accurately inform the potential
> domain registrant of the reason that he/she received the Notice? Would there
> need to be additional language for each new category of non-exact matches,
> or would we be able to generate a customized claims notice based on the
> category that triggered it?
>
>
>
> This question stems from the consideration that when considering policy
> changes we should also consider implementation details and practicalities.
>
>
>
> Thanks
>
>
>
> Philip S. Corwin, Founding Principal
>
> Virtualaw LLC
>
> 1155 F Street, NW
>
> Suite 1050
>
> Washington, DC 20004
>
> 202-559-8597/Direct
>
> 202-559-8750/Fax
>
> 202-255-6172/Cell
>
>
>
> Twitter: @VlawDC
>
>
>
> "Luck is the residue of design" -- Branch Rickey
>
>
>
> From: gnso-rpm-wg-bounces at icann.org [mailto:gnso-rpm-wg-bounces at icann.org]
> On Behalf Of Phil Corwin
> Sent: Tuesday, May 30, 2017 10:46 PM
> To: Greg Shatan; gnso-rpm-wg
> Subject: Re: [gnso-rpm-wg] A Proposal for Smarter Non-Exact Matches
>
>
>
> Greg:
>
>
>
> Thank you for submitting this proposal. I think it will certainly help focus
> our ongoing discussion.
>
>
>
> Attached is a copy of the proposal annotated by initial comments and
> questions it has raised for me in my personal capacity. I hope you will be
> prepared to address them in your presentation or follow-up WG dialogue.
>
>
>
> See you on the call tomorrow.
>
>
>
> Best, Philip
>
>
>
> From: gnso-rpm-wg-bounces at icann.org [mailto:gnso-rpm-wg-bounces at icann.org]
> On Behalf Of Greg Shatan
> Sent: Monday, May 29, 2017 10:12 PM
> To: gnso-rpm-wg
> Subject: [gnso-rpm-wg] A Proposal for Smarter Non-Exact Matches
>
>
>
> All,
>
>
>
> As I've mentioned earlier, I think that a proposal to use non-exact matches
> other than "mark contained" matches ("dumb matches") makes sense to pursue.
> Various types of matches have been discussed; however, there has been no
> actual proposal for "smarter" matches that can be used by the group.
>
>
>
> The attached proposal seeks to fill that gap.  It is more in the nature of an
> addendum to the initial proposal on non-exact matches.  However, it does
> provide a more formal proposal on the types of non-exact matches to be
> considered.  The intent is to provide a sufficient framework to discuss
> these types of non-exact matches and to add these non-exact matches to the
> proposal.
>
>
>
> I hope that this is helpful to the work of the group.
>
>
>
> Greg
>
> ________________________________
>
> No virus found in this message.
> Checked by AVG - www.avg.com
> Version: 2016.0.8013 / Virus Database: 4776/14514 - Release Date: 05/29/17
>
>
> _______________________________________________
> gnso-rpm-wg mailing list
> gnso-rpm-wg at icann.org
> https://mm.icann.org/mailman/listinfo/gnso-rpm-wg