[tz] Irish Standard Time vs Irish Summer Time

Paul Eggert eggert at cs.ucla.edu
Sun Dec 10 21:09:02 UTC 2017


Brian Inglis wrote:
>   72/ 425	 82/198		Irish+Summer+Time+IST
> ...
>   55/ 322	 31/ 77		Irish+Standard+Time+IST

These two queries have too many false hits to be useful. For example, here are 
the top ten hits for the first query (I used google.com from UCLA), preceded by 
codes indicating what these pages say about the abbreviation IST ("std" means 
Irish Standard Time, "sum" means Irish Summer Time, "---" means no opinion):

std https://en.wikipedia.org/wiki/Time_in_Ireland
--- https://www.timeanddate.com/time/zones/ist-ireland
--- https://www.timeanddate.com/worldclock/ireland/dublin
std http://timebie.com/timezone/irishindia.php
std https://www.worldtimeserver.com/time-zones/ist-3/
--- https://www.worldtimeserver.com/current_time_in_IE.aspx
--- https://www.worldtimebuddy.com/ireland-dublin-to-ist
sum https://www.horlogeparlante.com/time-zone-IST.html
sum https://www.sitesworld.com/time/ist-(irish)-to-azost.html
std http://www.ireland.com/en-us/about-ireland/once-you-are-here/time-zone/

So, even though this query attempts to count web pages that call IST "Irish 
Summer Time", its ten highest-ranking pages suggest that "Irish Standard Time" 
is twice as popular as "Irish Summer Time". Evidently the query is too broad, 
and Google's algorithms find so many pages on the general subject of Ireland and 
time and summer that it's double counting pages.
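
As a rough illustration, the tally behind that "twice as popular" figure can be reproduced with a short Python sketch. The list of codes is copied from the ten hits above in rank order; the variable names are only illustrative.

```python
from collections import Counter

# Codes for the top ten hits listed above, in rank order:
# "std" = Irish Standard Time, "sum" = Irish Summer Time, "---" = no opinion.
codes = ["std", "---", "---", "std", "std", "---", "---", "sum", "sum", "std"]

tally = Counter(codes)
for code in ("std", "sum", "---"):
    print(code, tally[code])

# Ratio of pages favoring "Standard" to pages favoring "Summer".
ratio = tally["std"] / tally["sum"]
print("std/sum ratio:", ratio)
```

Ten pages is of course a tiny sample, so this only shows that the query is not measuring what it was meant to measure.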

We must therefore discard the results of those two queries, as they're not 
really counting what we are interested in.

>  no site	site:ie		query
>  61/ 388	 19/ 61		"Irish+Summer+Time"+IST
> ...
>  55/ 312	 32/111		"Irish+Standard+Time"+IST
>  53/ 399	  9/ 18		"Irish+Summer+Time+IST"
>  51/ 391	 20/ 66		"Irish+Standard+Time+IST"

These queries are better, but I get waaay different results from you when I 
query from UCLA. I get:

  (no site)  site:ie
    1         0         "Irish+Summer+Time"+IST
    4         1         "Irish+Standard+Time"+IST
    1         0         "Irish+Summer+Time+IST"
   47        20         "Irish+Standard+Time+IST"

and this is an even bigger win for "Irish Standard Time" than the earlier 
results I posted. Perhaps I'm misunderstanding how you do a query? Here's how I 
did it: I visited https://www.google.com, and pasted this into the search box:

"Irish+Summer+Time"+IST

(including all the quotation marks and plus signs), and pressed the "Google 
Search" button. As noted above I got just one hit (it says "IST" stands for 
Indian Standard Time, and so doesn't even support the contention that "IST" 
means Irish Summer Time). I get the same hit if I visit https://www.google.ca. 
I don't know how to explain the fact that my results differ so much from yours.
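
For reference, the counts I got from UCLA can be compared pairwise with a small Python sketch. The numbers are copied from my table above; the dictionary layout and names are just one hypothetical way to arrange them, and nothing here queries Google.

```python
# Hit counts reported above from UCLA, as (no site, site:ie) per query.
hits = {
    '"Irish+Summer+Time"+IST': (1, 0),
    '"Irish+Standard+Time"+IST': (4, 1),
    '"Irish+Summer+Time+IST"': (1, 0),
    '"Irish+Standard+Time+IST"': (47, 20),
}

# Each "Summer" query paired with its "Standard" counterpart.
pairs = [
    ('"Irish+Summer+Time"+IST', '"Irish+Standard+Time"+IST'),
    ('"Irish+Summer+Time+IST"', '"Irish+Standard+Time+IST"'),
]

for summer_q, standard_q in pairs:
    for col, label in enumerate(("no site", "site:ie")):
        summer, standard = hits[summer_q][col], hits[standard_q][col]
        print(f"{label:8} {standard_q}: {standard} vs {summer_q}: {summer}")

# True if "Standard" has more hits than "Summer" in every pair and column.
standard_leads = all(
    hits[standard_q][col] > hits[summer_q][col]
    for summer_q, standard_q in pairs
    for col in (0, 1)
)
print("Standard leads everywhere:", standard_leads)
```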

I should mention that I do not look at hits that Google ordinarily suppresses as 
very similar, as I've found that these expanded hit counts are not as good an 
indication of popularity: many sites just republish other sites' articles, for 
example. It's better to try to count independent sources.

The problem of assessing Google hit counts is not limited to Ireland. Several 
years ago I observed similar issues when estimating the most popular 
abbreviations in Australia. Obvious queries like 'EST "Eastern Standard Time" 
site:au' did not work, as too many of the hits were polluted by other data 
(e.g., Australian web pages reporting North American abbreviations). It can be 
quite tricky to get reliable results.

