[tz] Irish Standard Time vs Irish Summer Time
Brian Inglis
Brian.Inglis at SystematicSw.ab.ca
Mon Dec 11 06:20:28 UTC 2017
On 2017-12-10 14:09, Paul Eggert wrote:
> Brian Inglis wrote:
>> 72/ 425 82/198 Irish+Summer+Time+IST
>> ...
>> 55/ 322 31/ 77 Irish+Standard+Time+IST
> These two queries have too many false hits to be useful. For example, here are
> the top ten hits for the first query (I used google.com from UCLA), preceded by
> codes indicating what these pages say about the abbreviation IST ("std" means
> Irish Standard Time, "sum" means Irish Summer Time, "---" means no opinion):
> std https://en.wikipedia.org/wiki/Time_in_Ireland
> --- https://www.timeanddate.com/time/zones/ist-ireland
> --- https://www.timeanddate.com/worldclock/ireland/dublin
> std http://timebie.com/timezone/irishindia.php
> std https://www.worldtimeserver.com/time-zones/ist-3/
> --- https://www.worldtimeserver.com/current_time_in_IE.aspx
> --- https://www.worldtimebuddy.com/ireland-dublin-to-ist
> sum https://www.horlogeparlante.com/time-zone-IST.html
> sum https://www.sitesworld.com/time/ist-(irish)-to-azost.html
> std http://www.ireland.com/en-us/about-ireland/once-you-are-here/time-zone/
> So, even though this query attempts to count web pages that call IST "Irish
> Summer Time", its ten highest-ranking pages suggest that "Irish Standard Time"
> is twice as popular as "Irish Summer Time". Evidently the query is too broad,
> and Google's algorithms find so many pages on the general subject of Ireland and
> time and summer that it's double counting pages.
> We must therefore discard the results of those two queries, as they're not
> really counting what we are interested in.
>> no site site:ie query
>> 61/ 388 19/ 61 "Irish+Summer+Time"+IST
>> ...
>> 55/ 312 32/111 "Irish+Standard+Time"+IST
>> 53/ 399 9/ 18 "Irish+Summer+Time+IST"
>> 51/ 391 20/ 66 "Irish+Standard+Time+IST"
> These queries are better, but I get waaay different results from you when I
> query from UCLA. I get:
> (no site) site:ie
> 1 0 "Irish+Summer+Time"+IST
> 4 1 "Irish+Standard+Time"+IST
> 1 0 "Irish+Summer+Time+IST"
> 47 20 "Irish+Standard+Time+IST"
> and this is an even bigger win for "Irish Standard Time" than the earlier
> results I posted. Perhaps I'm misunderstanding how you do a query? Here's how I
> did it: I visited https://www.google.com, and pasted this into the search box:
> "Irish+Summer+Time"+IST
I find it easier to edit the search parameters directly in the URL query
string, which gives consistent search results to scrape: just remove or move
the '"'s, or swap Summer and Standard by select-cut-paste, so that all other
parameter settings stay unchanged.
A "+" in the URL appears as " " in the search box and, vice versa, a " " in the
search box appears as "+" in the URL.
This comes from application/x-www-form-urlencoded, the default content type for
encoding query string parameter values, which also requires a literal "+" to be
percent-encoded as "%2b".
So when you paste a query containing "+" into the search box, your query strings
contain "%2b" instead of "+" or " ", and that appears to be disrupting the
search.
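
To illustrate the encoding described above, here is a minimal sketch using
Python's standard urllib.parse module. The parameter name "q" and the variable
names are assumptions for illustration only; this shows form encoding in
general, not what any particular search engine does with the result.

```python
from urllib.parse import parse_qs, urlencode

# Text pasted into the search box with literal "+" characters.
pasted = '"Irish+Summer+Time"+IST'
# Text with spaces, which is what the "+" in a URL stands for.
intended = '"Irish Summer Time" IST'

# urlencode applies application/x-www-form-urlencoded rules:
# space -> "+", literal "+" -> "%2B", '"' -> "%22".
# (Hex digits are case-insensitive, so "%2b" is equivalent to "%2B".)
# "q" is a hypothetical parameter name used only for this example.
pasted_qs = urlencode({"q": pasted})
intended_qs = urlencode({"q": intended})

print(pasted_qs)    # q=%22Irish%2BSummer%2BTime%22%2BIST
print(intended_qs)  # q=%22Irish+Summer+Time%22+IST

# Decoding reverses the mapping: "+" -> space, "%2B" -> literal "+".
decoded_pasted = parse_qs(pasted_qs)["q"][0]
decoded_intended = parse_qs(intended_qs)["q"][0]
print(decoded_pasted)    # "Irish+Summer+Time"+IST
print(decoded_intended)  # "Irish Summer Time" IST
```

The two encoded strings differ, so a server receiving them sees two different
queries: one containing literal plus signs and one containing spaces.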
--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada