[Gnso-ppsai-pdp-wg] Public Comments

Graeme Bunton gbunton at tucows.com
Tue Jul 14 03:07:23 UTC 2015


Hi All,

I had a kind developer at Tucows 'screen scrape' all of the PPSAI public 
comments. This means they wrote a program that essentially visited 
all* of the comments submitted and captured:
     - the sender
     - the subject
     - the body of the message
     - a URL for any attachments
     - a URL for the comment itself online

To those fields I've added:
     - Has Attachment (Y/N) - this allows for easy filtering of comments 
with attachments
     - NameCheap (Y/N) - this flag is set if the message body 
contains the words "regardless of whether the request comes from a 
private individual", a phrase taken from the templated Namecheap 
comments.**
     - Word count - allows for sorting by comment length
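
As a rough illustration of how the three added columns could be derived 
from the scraped fields, here is a minimal Python sketch. It is not the 
program that was actually used. The Comment record, the derive_fields 
helper, and the choice to collapse whitespace before matching are all 
assumptions made for the example; only the column names and the quoted 
phrase come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Phrase quoted in the message as the marker for templated comments.
TEMPLATE_PHRASE = (
    "regardless of whether the request comes from a private individual"
)


@dataclass
class Comment:
    """Hypothetical record holding the five scraped fields."""
    sender: str
    subject: str
    body: str
    attachment_url: Optional[str]  # None when the comment has no attachment
    comment_url: str


def derive_fields(comment: Comment) -> dict:
    """Compute the three added columns for one scraped comment."""
    # Assumption: collapse runs of whitespace so that a line break in the
    # middle of the phrase does not prevent a match. Matching stays
    # case-sensitive, as a plain "contains" check would be.
    normalized_body = " ".join(comment.body.split())
    return {
        "Has Attachment": "Y" if comment.attachment_url else "N",
        "NameCheap": "Y" if TEMPLATE_PHRASE in normalized_body else "N",
        # Whitespace-separated tokens; a spreadsheet formula may count
        # slightly differently.
        "Word count": len(comment.body.split()),
    }
```

A substring test like this has the same limits described in the caveats: 
a comment that rewords the template gets "N", and one that appends its 
own text still gets "Y".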

Screen scraping is never exact, and it has a tough time with some 
formatting. By and large, though, it's pretty good, and I've found it 
useful so far for triaging and prioritizing comments.

Caveats:
* I know it's missing about 15 comments. I've figured out a way to 
identify which are missing and will send those along tomorrow.
** There are many Namecheap comments where the sender chose to write 
their own text, so the above phrase is not included and they don't have 
the Y flag.  Similarly, many with the flag will include extra content 
the sender chose to add.  These can be identified by applying a filter 
on the NameCheap column and then sorting by word count. The above 
phrase was chosen because it was long enough to be unlikely to show up 
in other comments.  This is obviously not a perfect way of identifying 
those comments.

With a bit of Excel expertise you should be able to filter and sort the 
submitted comments as you see fit.
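
For anyone who would rather script this than use Excel, the 
filter-then-sort step from the second caveat might look roughly like the 
Python sketch below. It assumes the spreadsheet rows have been loaded as 
dictionaries keyed by the column names "NameCheap" and "Word count"; the 
function name and the sample rows are made up for illustration and are 
not real comment data.

```python
def flagged_by_length(rows):
    """Return rows flagged as templated, longest first.

    Flagged comments that are much longer than the rest are the ones
    most likely to contain extra text added by the sender.
    """
    flagged = [row for row in rows if row["NameCheap"] == "Y"]
    return sorted(flagged, key=lambda row: row["Word count"], reverse=True)


# Illustrative rows only; the senders and counts are invented.
sample_rows = [
    {"Sender": "sender-a", "NameCheap": "Y", "Word count": 180},
    {"Sender": "sender-b", "NameCheap": "N", "Word count": 950},
    {"Sender": "sender-c", "NameCheap": "Y", "Word count": 410},
]

for row in flagged_by_length(sample_rows):
    print(row["Sender"], row["Word count"])
```

Sorting in descending order puts the flagged comments with the most 
added text at the top, which is where the manual reading effort is best 
spent.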

We have an obligation to read what's been submitted, and I hope you find 
the attached makes reading the comments easier, and that it's helpful in 
understanding what the public is telling us.

Graeme

(Also, apologies to ICANN for the punishment we gave their webservers 
while testing and scraping the comments)

-- 
_________________________
Graeme Bunton
Manager, Management Information Systems
Manager, Public Policy
Tucows Inc.
PH: 416 535 0123 ext 1634

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PPSAI Public Comments_Cleanish.xlsx
Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 1779416 bytes
Desc: not available
URL: <http://mm.icann.org/pipermail/gnso-ppsai-pdp-wg/attachments/20150713/4e82cd3a/PPSAIPublicComments_Cleanish-0001.xlsx>

