Re: Biggest difference between Google estimates and actual returns?
- From: peter.ludemann@xxxxxxxxx
- Date: 6 Oct 2005 08:52:15 -0700
Paul Blay wrote:
> I've come across a search term that returns a fairly impressive
> 87,400 estimated hits but only returned 429 actual pages
> including duplicates. Anybody spotted a larger discrepancy?
>
> It was going to be my first 'insta-submission' word based on the
> 87,400 - it's lucky I spotted the mismatch.
>
> (In case anybody is wondering - here's the search link
> http://www.google.co.uk/search?q=%22%E3%82%A8%E3%83%83%E3%83%81%E3%81%8F&hl=en&lr=&safe=off&start=990&sa=N&filter=0
> ) YMMV
I tried this with both Google and Yahoo ... Google estimated 4,130 but
returned only 633; Yahoo estimated 4,130 and returned 807. (Presumably
both would have returned more if the search had been specified to also
return similar pages.)
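To put numbers on the mismatch, here is a small sketch using only the counts quoted in this thread. The function name and the labels are made up for illustration, and the counts are whatever the engines reported on the day, so they may not be reproducible.

```python
def shortfall(estimated, returned):
    """Compare an engine's estimated hit count with the pages it actually returned.

    Returns (ratio, missing): how many times larger the estimate is,
    and the fraction of the estimate that never showed up.
    """
    if estimated <= 0 or returned <= 0:
        raise ValueError("counts must be positive")
    ratio = estimated / returned
    missing = 1 - returned / estimated
    return ratio, missing


# Counts as quoted in this thread (illustrative labels).
cases = {
    "Google, first query": (87_400, 429),
    "Google, second try": (4_130, 633),
    "Yahoo, second try": (4_130, 807),
}

for label, (est, ret) in cases.items():
    ratio, missing = shortfall(est, ret)
    print(f"{label}: estimate is {ratio:.1f}x the returned count, "
          f"{missing:.1%} of the estimate never shown")
```

The ratio is handy for comparing queries with very different estimates; the missing fraction is easier to read when the two counts are of similar size.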
What's going on here? Quite a few things. And I can only speak in
generalities because I happen to work for one of those companies.
You might have noticed that Yahoo and Google have stopped the "my index
is bigger than your index" boasting. One problem (besides the massive
parallelism that makes any count a bit fuzzy): do you count mirrors or
almost-copies when you're counting results? (In the example quoted
above, both Google and Yahoo decided that 3/4 or so of the query
results were duplicates.)
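For a feel of why "almost-copies" make counting slippery, here is a toy sketch of near-duplicate collapsing with word shingles and Jaccard similarity. This is a textbook technique chosen for illustration, not a description of what Google or Yahoo actually do; the function names, the shingle size, and the 0.5 threshold are all assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def count_distinct(pages, threshold=0.5, k=3):
    """Count pages, treating any page too similar to an earlier kept one as a duplicate."""
    kept = []
    for page in pages:
        s = shingles(page, k)
        if all(jaccard(s, other) < threshold for other in kept):
            kept.append(s)
    return len(kept)


pages = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog today",  # mirror with one extra word
    "search engines estimate hit counts rather than counting them",
]

print("raw count:", len(pages))
print("after collapsing near-duplicates:", count_distinct(pages))
```

Move the threshold up or down and the "number of results" changes, which is one reason two counts for the same query need not agree.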
Having said this, you can amuse yourself by querying for "the" and "a"
and seeing how many results you get.
And then there's web spam. There are people who try to tweak things to
get themselves higher on the search results page. The good guys are
called SEOs (search engine optimizers) and concentrate on making their
pages look good so that the search engines will rank them well. The bad
guys use whatever tricks they can, such as "link farms" and "directory
pages". If you do a query and find that most of your results are either
irrelevant or are pages that contain lists of words or lists of ads,
you've stumbled into some web spam. (Yes, both Google and Yahoo are
fighting these guys, but it's an arms race.)
See also http://www.theregister.co.uk/2005/08/16/google_yahoo_junk/ for
a bit more on this (the debunking starts about half-way through the
article).
So, when you're using google-hits to decide which phrase is more
common, just remember that you might have instead wandered into some
piles of 塵, not properly sorted into burnable, plastics, and other.
And you might want to also try yahoo-hits for a second opinion (it also
has advanced search for restricting by language, site, etc.).
- peter