[Comments-cct-recs-27nov17] HosterStats comments on CCT draft report RFC (Parking)

John McCormac jmcc at hosterstats.com
Mon Jan 15 21:20:22 UTC 2018


Comments on Section: 1.1 Parking

The CCT review team did not understand the complexities of measuring web 
usage in TLDs. While well intentioned, it relied on data generated by a 
highly simplistic methodology that, among other mistakes, considered 
redirects to be "parking". This negatively impacted the report's 
conclusions on "parking".


1.1 Parking, Page 3: "Given the high percentage of “parked” 
registrations in new gTLDs, even relative to the high percentage of 
parking in legacy gTLDs, the Review Team sought to understand whether 
this phenomenon would affect its conclusions regarding the competitive 
impact of the New gTLD Program."


This links directly to the approach used to measure "parking" in the new 
and legacy gTLDs. The reality is that different TLDs have different 
rates of "parking" and this is often a function of the dominant market 
for the TLD and the age of the TLD. It is like, in terms of development, 
comparing a rock with an iPhone and wondering why one cannot receive 
phone calls on the rock as it has silicon in it too.

Grouping all new gTLDs as a single TLD is not a good way to measure 
the competitive impact of new gTLDs. The new gTLDs form a set of TLDs 
that were launched on different dates and target different markets. Many 
of the legacy gTLDs have been in operation for over a decade and some 
longer. Their web usage patterns are quite different to newly launched 
gTLDs. It can, as acknowledged by the review team, take up to five years 
for the web usage trends in new gTLDs to stabilise.

The review team did not establish what new gTLDs are competing with 
other specific gTLDs or ccTLDs. Even if a new gTLD registration has no 
website, it may still be used for e-mail or other services. It might 
also be a brand protection registration. Brand protection registrations 
are good indications of the popularity of a new TLD. Defining markets 
and competition, especially with generic TLDs, is difficult and it all 
seems to come back to the review team, through no fault of its own, not 
having sufficient data to analyse its hypotheses.


1.1 Parking, Page 3: "While several hypotheses as to potential impact of 
parking on competition were advanced, no conclusive evidence was 
available to support them in the near term."


The important aspect with any new gTLD, or any new TLD, is web usage. 
Usage drives awareness which drives development and increases renewals. 
How are the domain names being used? Is there active content on these 
websites? How do these TLDs link to existing TLDs in their target 
markets? These are the important questions. The review team was unable 
to even come close to answering any of these questions. It just didn't 
have the necessary data, the analysis and the metrics.

1.1, Parking, Page 3: "While the Review Team did not find definitive 
evidence of parking’s effect on competition, we found some 
differentiation between regions when it comes to parking. In particular, 
there appears to be more parked domains in Chinese language domains 
where more speculation seems to be occurring."

The review team had to rely on surveys that were not sufficiently 
comprehensive in terms of metrics and analytical depth. This was 
glaringly clear when it came to trying to understand what was happening 
in the Chinese market.

The Chinese market is a very complex one in that it has speculative 
dynamics, ordinary web development, discounting driven registration 
volume and different web usage trends compared to some of the legacy 
gTLDs. While it has elements of Pay Per Click (PPC) advertising and 
parking, it also has, as befits speculation, significant numbers of 
domain names that are for sale on domain name auction websites.

The review team's grouping of Chinese language gTLDs (XIN, WANG, TOP, 
XN--SES554G, REN) to illustrate how "parking" is higher in Chinese 
language domain names ignores the fact that Chinese registrants also 
register domain names in other new gTLDs. Some of these new gTLDs have 
specifically engaged in promotional discounting and tend to have higher 
levels of speculative registrations, from many countries, than new gTLDs 
that have higher registration fees.

Because of the poor methodology used in determining if a website or 
domain name is "parked", the review team did not have the necessary data 
to measure the effect of these speculative registrations and how they 
are being used.

Some of these registrations have websites with automatically generated 
affiliate pages and seemingly deep content. Some speculative 
registrations in the Chinese market have lottery and gambling landing 
pages/websites rather than developed content. To an unsophisticated 
methodology, such as that relied upon by the review team, these websites 
would not appear as "parked" and would give a misleading view of web 
usage in the gTLD. Statistics on the effect of speculative registrations 
and their renewal rates were posted to the mailing list for the gTLD 
Marketplace Health Index.

The review team, because of the lack of historical data, could not 
recognise the fact that many of these websites are "one year wonders". 
They do not renew at high rates and it is cheaper for the registrant to 
avail of the latest discounting promotion rather than renew a domain 
name at a full registration fee.

Speculative registrations in a newly launched gTLD tend to have a higher 
than average non-renewal rate. They are registered with the intent of 
being sold on for a profit. When they cannot be sold on for a profit, 
they are generally deleted unless the registrant wishes to renew them in 
the hope that one day they will be sold for a profit. The introduction 
of discounted registrations amplified speculative trends in newly 
launched gTLDs and since 2014, there has been a drive amongst the larger 
new gTLDs to grow the number of domains under management by discounting. 
As has been seen with the XYZ gTLD's 1 cent promotion, heavy discounting 
leads to high rates of non-renewal.

Discounting, when overused, locks the registry into a boom and bust 
cycle where it finds itself chasing the next discounting deal in order 
to keep the number of new registrations and renewals ahead of the deletions.

1.1 Parking, Page 3: "There may be some correlation between parking and 
malware distribution, but that is not as strong and indicative as the 
overall trend of lower malware distribution rates than those of legacy 
gTLDs. Nonetheless, the malware distribution rate gap between legacy and 
new gTLDs appears to be shrinking, and it behooves the community to 
further explore the correlation between parking and malware distribution."

While the recommendation for further analysis and exploration is 
welcome, the initial hypothesis seems to be based on a somewhat narrow 
focus on malware propagation and definitions.

Active websites, rather than "parked" websites, are more efficient 
vectors for malware distribution simply because they have higher levels 
of traffic than "parked" domain names. By comparison, "parked" domain 
names are generally not as effective or targeted as a means of spreading 
malware.

Compromised and infected websites are more likely to be active websites 
using vulnerable software and plug-ins rather than "parked" domain 
names. Compromised control panel software is generally a short-lived 
event. Abandoned websites tend to be a greater risk when it comes to 
malware distribution because they remain unpatched, unsecured and abused.

1.1 Parking, Page 4: "The overall results of the Review Team’s 
observations on parking are inconclusive and suggest the need for 
further research not limited to the impact of new gTLDs. Therefore, the 
Review Team recommends a more rigorous collection of data around various 
types of parking to facilitate further examination by the community of 
the impact of parking on competition, consumer trust and its proxy, DNS 
abuse."

The acknowledgement of the need for further research is welcome. It 
would have been better if the review team had reached out to the 
community for advice rather than attempting to muddle through with 
flawed conclusions based on poorly defined metrics and insufficient data.

The problem of inconclusive results has its genesis in the lack of 
experience of the review team in the complex field of web usage 
measurements. More rigorous collection and analysis would have been a 
good thing but the emphasis should be on web usage trends rather than 
simply on "parking".

ICANN should have been more proactive in terms of providing data. It 
should not have expected that people with limited expertise in various 
fields could properly specify, on their own, the kind of data needed and 
be assured that the data received was of sufficient quality and depth to 
test various hypotheses. For future review teams, ICANN needs to 
reevaluate its selection process to ensure that a team has the necessary 
combination of skillsets and expertise (domain name industry, 
technical, commercial, legal and economic) to carry out the review. 
ICANN should also supply the necessary data and, if necessary, directly 
help the teams in formulating hypotheses.

If the necessary data or help is unavailable from ICANN, future review 
teams should reach out to the community for help when an element of the 
review is beyond their collective levels of expertise.


Comment on Section 3.1 Potential Impact of “Parked” Domains on
Measures of Competition.

Competition, Section 3.1, Page 8: "Examples of behaviors that could be 
considered parking include:
...
The domain redirects to another domain in a different TLD."

This is wrong. Redirects to domain names in other TLDs are not 
"parking". This type of baseless assumption damages the credibility of 
the "parking" section in the report. Redirects are common in most TLDs 
and redirecting to domain names/websites in other TLDs is often done to 
prevent duplicate content issues with search engines and for brand 
protection purposes. There is growing use of HTTPS secured websites and 
these will often use redirects to redirect the user to the secured 
version of the website either in the same TLD or a different TLD.
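
As a rough illustration of the distinction, a usage survey could record 
where a request ends up and keep redirects in their own categories rather 
than counting them as "parking". The sketch below is only a minimal 
example under stated assumptions: the function names, the category 
labels and the marker phrases are all hypothetical and are not taken from 
the report or from any existing tool. It treats the last DNS label as the 
TLD and does not fetch anything itself.

```python
from urllib.parse import urlsplit

# Hypothetical marker phrases; a real survey would need a much larger,
# regularly maintained set.
PPC_MARKERS = ("sponsored listings", "related searches")
SALE_MARKERS = ("this domain is for sale", "buy this domain")


def tld_of(hostname: str) -> str:
    """Return the last DNS label of a hostname, lower-cased."""
    return hostname.rstrip(".").lower().rsplit(".", 1)[-1]


def classify_usage(domain: str, final_url: str, body: str) -> str:
    """Classify one crawl result into a coarse usage category.

    domain:    the domain name that was requested
    final_url: the URL reached after following any redirects
    body:      the text of the final page
    """
    requested = domain.lower()
    final_host = urlsplit(final_url).hostname or ""
    # A change of host (other than adding "www.") is recorded as a
    # redirect, not as parking. An HTTP to HTTPS upgrade on the same
    # host is not a change of host and falls through to the content checks.
    if final_host and final_host not in (requested, "www." + requested):
        if tld_of(final_host) != tld_of(requested):
            return "redirect_other_tld"
        return "redirect_same_tld"
    text = body.lower()
    if any(marker in text for marker in SALE_MARKERS):
        return "for_sale"
    if any(marker in text for marker in PPC_MARKERS):
        return "ppc_parked"
    return "other"
```

A call such as classify_usage("example.blog", "https://www.example.com/", 
"Welcome") falls into the redirect category. Whether the destination 
website is itself developed or parked would still have to be measured 
separately.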

The attempt at guessing renewal rates for parked domain names from prior 
years, using only parking rates from a month after these domain names had 
been deleted, was a feat rivalling the Medieval religious debates about how 
many angels can dance on the head of a pin. It doesn't matter that a 
Pearson Correlation was used to give some semblance of mathematical 
credibility. The simple fact is that the review team and its contractor 
did not have any data on previously parked and deleted domain names as 
these domain names were deleted prior to its survey.

The review team was trying to find a correlation between renewal rates and 
"parking" rates without having any domain name level data on renewal 
rates or data on whether the deleted domain names were "parked". 
Unsurprisingly, it couldn't find any correlation. The section is so 
logically unsound and abjectly wrong that it should not have a place in 
an ICANN report.
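
To make the data requirement concrete, the following is a minimal sketch 
of the kind of domain name level calculation that would be needed. The 
records are made up purely for illustration and the function name is 
hypothetical. It assumes that, for each domain name, the "parked" status 
was observed before the renewal date and that the renewal outcome for 
that same domain name is known.

```python
from collections import defaultdict

# Made-up records for illustration only: (domain, was_parked, renewed).
# was_parked must be observed BEFORE the renewal date for the same domain.
records = [
    ("a.example", True, False),
    ("b.example", True, False),
    ("c.example", True, True),
    ("d.example", False, True),
    ("e.example", False, True),
    ("f.example", False, False),
]


def renewal_rate_by_status(rows):
    """Return {was_parked: share of domain names renewed} from per-domain rows."""
    totals = defaultdict(int)
    renewed = defaultdict(int)
    for _domain, was_parked, was_renewed in rows:
        totals[was_parked] += 1
        renewed[was_parked] += int(was_renewed)
    return {status: renewed[status] / totals[status] for status in totals}


rates = renewal_rate_by_status(records)
print(rates)
```

Without both columns for the same domain names, neither rate can be 
computed, and a parking rate measured after the deletions says nothing 
about the domain names that were deleted.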

The deletions from the period used (July 2016 to December 2016) 
coincided with the bursting of the 2015 Chinese Bubble with millions 
more domain names than usual being deleted in the legacy gTLDs. It is 
ironic that the deletion of many of the Chinese Bubble registrations 
(many were not active or developed, or were parked on PPC or for sale on 
domain name auction sites) in this period was just the kind of effect 
that the review team wanted to find. Some of the domain names in the 
review team's contractor's survey would not go through their first 
renewal/deletion cycle until approximately December 2017 to February 2018.


Comment on Section 3.2 Geographic Differences in Parking Behavior

The focus on Chinese gTLDs to illustrate parking behaviour on a 
geographic basis has been mentioned earlier. The Chinese market also has 
PPC, holding pages, and domain names at auction like other gTLDs. The 
use of automated affiliate landing pages and websites, typically 
gambling related, is more common in Chinese dominated gTLDs than others. 
Some of these Chinese dominated gTLDs are not based in China.

The reference to the Oxford Information Labs, LACTLD, EURid and 
InterConnect Communications, Latin America and Caribbean DNS Marketplace 
Study (September 2016) did seem to obfuscate the 
difference between non-responding domain names (domain names that are 
not active in DNS) and domain names that have websites that are not 
developed or parked on PPC, holding pages or domain name sales sites. To 
its credit, the review team states that this section is based on limited 
data and that more granular data is necessary to properly study how the 
behaviour varies across regions.



Comment on Section 3.3 Relationship Between Parking and DNS Abuse

The problems created by a lack of data are also apparent with this 
section. While it relies upon academic studies (Vissers et al) to 
support its possible hypothesis of a relationship between "parking" and 
DNS abuse, the hypothesis seems to be based on a limited understanding 
of the spread of malware and the part that compromised active websites, 
rather than "parked" websites, play in its efficient and successful 
propagation. A compromised active website with thousands of users each 
day has the capability to infect thousands of users in a day. A "parked" 
website with a single visitor every six months has the potential to 
infect that single user.
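
The difference in scale can be put into numbers. The figures in this 
short sketch are assumptions chosen only to match the wording above: 
"thousands of users each day" is taken as 1,000 per day and six months 
is taken as 182 days.

```python
# Assumed figures for illustration, not measurements.
active_visitors_per_day = 1_000   # "thousands of users each day", lower bound
days = 182                        # roughly six months
parked_visitors_in_period = 1     # "a single visitor every six months"

# Visitors potentially exposed over the same six month period.
active_exposed = active_visitors_per_day * days
ratio = active_exposed / parked_visitors_in_period

print(active_exposed, ratio)
```

Under these assumptions the compromised active website exposes 182,000 
visitors in the time that the "parked" website exposes one.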

The review team does mention that it is unsure about whether Vissers et 
al's study applies more to malware links in advertising networks than 
compromised parked websites. Search engines have been actively removing 
parked websites from their indices for some years now.

While the review team's emphasis was on "parked" domain names, it did 
not mention anything concerning a connection between DNS abuse and 
heavily discounted domain names. Heavily discounted domain names reduce 
the costs of setting up websites to propagate malware or spam.

The link between cost of registration and malware propagation, and other 
forms of DNS abuse, in a gTLD has been covered more extensively in the 
SIDN study. It would be a good thing if the review team gave more 
prominence to the report as it is based on data.

The value of a compromised website to a malicious actor lies in its 
traffic and the trust in which that website is held by visitors. DNS 
abuse is not limited to the propagation of malware, spam and phishing. A 
compromised website can also be valuable to a malicious actor in terms 
of hidden links that are only visible to search engines rather than to 
human users. These compromised sites tend to be used for blackhat search 
engine optimisation and linking schemes and this kind of use can be more 
common than malware distribution and phishing because it often goes 
undetected by the owners of the websites since there are no direct 
victims other than the site owner. The recommendation to focus purely on 
the connection between parking and malware distribution by the review 
team is one that seems to be based on a limited understanding of the 
threat environment in gTLDs and it should be expanded to cover other 
forms of DNS abuse. The SIDN study was a good step in this direction.


Comment on Section 3.4 Recommendations
"Recommendation 5: Collect parking data."

The focus on "parking" is misleading. It should be rephrased to "Collect 
usage data." This would give a future review team, and ICANN, the 
ability to comprehensively analyse trends in how gTLDs are used and 
detect competition and other activities.

There is a correlation between "parking" rates and non-renewal. But each 
TLD also has a separate parking and renewal, or non-renewal, trend for 
non-speculative registrations. Discounted registrations should be 
identified as part of this collection process so that the effects of 
discounting on both usage and renewal rates can be measured. Discounting 
promotions generally result in new registration spikes on registrars and 
hosters so ICANN already has some of the raw data.

Future review teams should be provided with enough data to test their 
hypotheses rather than having to try to assemble the data themselves 
without knowing the reliability of the data. This should be an ICANN task.

John McCormac.
-- 
**********************************************************
*  e-mail: jmcc at hosterstats.com
*  web: http://www.hosterstats.com/
*  Domain Registrations Statistics
*  And Historical DNS Database.
*  Over 519 Million Domains Tracked.
*  Skype: hosterstats.com
**********************************************************


