[tz] Irish Standard Time vs Irish Summer Time

Brian Inglis Brian.Inglis at SystematicSw.ab.ca
Mon Dec 11 06:20:28 UTC 2017


On 2017-12-10 14:09, Paul Eggert wrote:
> Brian Inglis wrote:
>>   72/ 425     82/198        Irish+Summer+Time+IST
>> ...
>>   55/ 322     31/ 77        Irish+Standard+Time+IST
> These two queries have too many false hits to be useful. For example, here are
> the top ten hits for the first query (I used google.com from UCLA), preceded by
> codes indicating what these pages say about the abbreviation IST ("std" means
> Irish Standard Time, "sum" means Irish Summer Time, "---" means no opinion):
> std https://en.wikipedia.org/wiki/Time_in_Ireland
> --- https://www.timeanddate.com/time/zones/ist-ireland
> --- https://www.timeanddate.com/worldclock/ireland/dublin
> std http://timebie.com/timezone/irishindia.php
> std https://www.worldtimeserver.com/time-zones/ist-3/
> --- https://www.worldtimeserver.com/current_time_in_IE.aspx
> --- https://www.worldtimebuddy.com/ireland-dublin-to-ist
> sum https://www.horlogeparlante.com/time-zone-IST.html
> sum https://www.sitesworld.com/time/ist-(irish)-to-azost.html
> std http://www.ireland.com/en-us/about-ireland/once-you-are-here/time-zone/
> So, even though this query attempts to count web pages that call IST "Irish
> Summer Time", its ten highest-ranking pages suggest that "Irish Standard Time"
> is twice as popular as "Irish Summer Time". Evidently the query is too broad,
> and Google's algorithms find so many pages on the general subject of Ireland and
> time and summer that it's double counting pages.
> We must therefore discard the results of those two queries, as they're not
> really counting what we are interested in.
>>  no site    site:ie        query
>>  61/ 388     19/ 61        "Irish+Summer+Time"+IST
>> ...
>>  55/ 312     32/111        "Irish+Standard+Time"+IST
>>  53/ 399      9/ 18        "Irish+Summer+Time+IST"
>>  51/ 391     20/ 66        "Irish+Standard+Time+IST"
> These queries are better, but I get waaay different results from you when I
> query from UCLA. I get:
>  (no site)  site:ie
>    1         0         "Irish+Summer+Time"+IST
>    4         1         "Irish+Standard+Time"+IST
>    1         0         "Irish+Summer+Time+IST"
>   47        20         "Irish+Standard+Time+IST"
> and this is an even bigger win for "Irish Standard Time" than the earlier
> results I posted. Perhaps I'm misunderstanding how you do a query? Here's how I
> did it: I visited https://www.google.com, and pasted this into the search box:
> "Irish+Summer+Time"+IST

I find it easier to edit the search parameters directly in the URL query
string, which gives consistent search results to scrape: just move or remove
the '"'s, or select, cut, and paste to swap Summer and Standard, so that all
other parameter settings stay unchanged.
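
To illustrate the approach of varying only the search phrase while every other
parameter stays fixed, here is a minimal Python sketch. The helper name
search_url, the base URL, and the fixed "hl" parameter are assumptions made for
illustration; they are not the exact URLs used for the counts in this thread.

```python
from urllib.parse import urlencode

# Assumed for illustration: the base URL and the fixed "hl" parameter.
BASE = "https://www.google.com/search"
FIXED_PARAMS = {"hl": "en"}

def search_url(phrase, abbrev="IST", site=None, quoted=True):
    """Build a search URL in which only the phrase, quoting, and site vary."""
    terms = f'"{phrase}"' if quoted else phrase
    query = f"{terms} {abbrev}"
    if site:
        query += f" site:{site}"
    # urlencode applies form encoding: " " becomes "+", '"' becomes "%22".
    return BASE + "?" + urlencode({**FIXED_PARAMS, "q": query})

for phrase in ("Irish Summer Time", "Irish Standard Time"):
    print(search_url(phrase))
    print(search_url(phrase, site="ie"))
```

Building the query from plain text and letting the library encode it avoids
hand-editing "+" signs, so the Summer and Standard URLs differ only in the
phrase.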

A "+" in the URL appears as " " in the search box, and vice versa: a " " in the
search box appears as "+" in the URL.
This follows from the URL encoding rules for query string parameter values,
which default to the application/x-www-form-urlencoded content type; the same
rules require a literal "+" to be percent-encoded as "%2b".
Query strings produced by pasting "+" into the search box therefore contain
"%2b" instead of "+" or " ", and that appears to be disrupting the search.
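
The encoding behaviour can be checked with Python's standard urllib.parse
module. This is only a sketch of the form-encoding rules, not of what Google
does internally; the variable names are illustrative, and Python writes the
escape in upper case ("%2B"), which is equivalent to "%2b".

```python
from urllib.parse import quote_plus, unquote_plus

typed = '"Irish Summer Time" IST'    # text typed into the search box
pasted = '"Irish+Summer+Time"+IST'   # URL form pasted into the search box

# Spaces in the typed text become "+" in the query string.
print(quote_plus(typed))

# Literal "+" characters in the pasted text become "%2B" instead.
print(quote_plus(pasted))

# Decoding the query string recovers the typed text.
print(unquote_plus(quote_plus(typed)) == typed)
```

The second query string carries encoded plus signs rather than word
separators, so the search engine sees a different query from the first.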

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada