[tz] Irish Standard Time vs Irish Summer Time
eggert at cs.ucla.edu
Sun Dec 10 21:09:02 UTC 2017
Brian Inglis wrote:
> 72/ 425 82/198 Irish+Summer+Time+IST
> 55/ 322 31/ 77 Irish+Standard+Time+IST
These two queries have too many false hits to be useful. For example, here are
the top ten hits for the first query (I used google.com from UCLA), preceded by
codes indicating what these pages say about the abbreviation IST ("std" means
Irish Standard Time, "sum" means Irish Summer Time, "---" means no opinion):
So, even though this query attempts to count web pages that call IST "Irish
Summer Time", its ten highest-ranking pages suggest that "Irish Standard Time"
is twice as popular as "Irish Summer Time". Evidently the query is too broad,
and Google's algorithms find so many pages on the general subject of Ireland and
time and summer that it's double counting pages.
We must therefore discard the results of those two queries, as they're not
really counting what we are interested in.
> no site site:ie query
> 61/ 388 19/ 61 "Irish+Summer+Time"+IST
> 55/ 312 32/111 "Irish+Standard+Time"+IST
> 53/ 399 9/ 18 "Irish+Summer+Time+IST"
> 51/ 391 20/ 66 "Irish+Standard+Time+IST"
These queries are better, but I get waaay different results from you when I
query from UCLA. I get:
(no site)  site:ie
    1         0     "Irish+Summer+Time"+IST
    4         1     "Irish+Standard+Time"+IST
    1         0     "Irish+Summer+Time+IST"
   47        20     "Irish+Standard+Time+IST"
and this is an even bigger win for "Irish Standard Time" than the earlier
results I posted. Perhaps I'm misunderstanding how you do a query? Here's how I
did it: I visited https://www.google.com, and pasted this into the search box:
(including all the quotation marks and plus signs), and pressed the "Google
Search" button. As noted above I got just one hit (it says "IST" stands for
Indian Standard Time, and so doesn't even support the contention that "IST"
means Irish Summer Time). I get the same hit if I visit https://www.google.ca.
I don't know how to explain the fact that my results differ so much from yours.
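To make the procedure easier to compare between sites, here is a minimal Python sketch that builds one search URL per query, with and without the site:ie restriction. It assumes that Google's /search?q= endpoint treats the percent-encoded text the same way as text pasted into the search box, which I have not verified. The names QUERIES and search_url are made up for illustration, and the script does not fetch or count any hits.

```python
from urllib.parse import urlencode

# The four query strings quoted in this thread, exactly as typed
# (quotation marks and plus signs included).
QUERIES = [
    '"Irish+Summer+Time"+IST',
    '"Irish+Standard+Time"+IST',
    '"Irish+Summer+Time+IST"',
    '"Irish+Standard+Time+IST"',
]


def search_url(query, site=None, host="www.google.com"):
    """Return a search URL for query, optionally restricted to one site.

    Hypothetical helper: it only builds the URL and assumes the /search?q=
    endpoint accepts the same text as the search box.
    """
    text = query if site is None else f"{query} site:{site}"
    # urlencode percent-encodes the literal '+' and '"' characters, so the
    # decoded query is identical to the pasted text.
    return f"https://{host}/search?" + urlencode({"q": text})


if __name__ == "__main__":
    for q in QUERIES:
        print(search_url(q))             # the "(no site)" column
        print(search_url(q, site="ie"))  # the "site:ie" column
```

Passing host="www.google.ca" gives the corresponding URLs for the Canadian site, so two people can at least confirm that they are issuing the same requests before comparing counts.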
I should mention that I do not look at hits that Google ordinarily suppresses as
very similar, as I've found that these expanded hit counts are not as good an
indication of popularity: many sites just republish other sites' articles, for
example. It's better to try to count independent sources.
The problem of assessing Google hit counts is not limited to Ireland. Several
years ago I observed similar issues when estimating the most popular
abbreviations in Australia. Obvious queries like 'EST "Eastern Standard Time"
site:au' did not work, as too many of the hits were polluted by other data
(e.g., Australian web pages reporting North American abbreviations). It can be
quite tricky to get reliable results.