
number of results of mountain names


ian_s...@yahoo.com

Aug 23, 2006, 9:57:12 AM
If anyone (particularly Google employees) can tell me what's going on I'd
be really grateful..

I am doing some research into the way mountains are represented on the
internet and how this compares to their actual physical attributes -
putting it simply, is there a correlation between the altitude of a hill
and the number of pages written about it?

I am using hill names, plus some other search terms such as
"mountain" or "lake district", with a Google API script
that opens a text file full of search terms and writes the number of
pages each search term generates to another text file. Then I can
compare the number of pages a certain search string produces with some
of that hill's physical attributes.
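
A minimal sketch of the kind of batch script described above, in Python. The original script and the API call it used are not shown in this post, so `fetch_count` is a hypothetical stand-in supplied by the caller; the function names and the two-column output layout are assumptions as well.

```python
import csv
from typing import Callable, Iterable


def read_terms(path: str) -> list[str]:
    """Read one search string per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def write_counts(terms: Iterable[str],
                 fetch_count: Callable[[str], int],
                 out_path: str) -> None:
    """Write one CSV row (query, estimated result count) per search string.

    fetch_count is hypothetical: it stands in for whatever call returns
    the estimated number of results for a query.
    """
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for term in terms:
            writer.writerow([term, fetch_count(term)])
```

Passing the counting function in as a parameter keeps the file handling separate from the search backend, so the same loop can be run against the API and against manually collected figures.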

Now, it states on the Google help pages that "By default, Google only
returns pages that include all of your search terms. There is no need
to include "and" between terms." So in the case of Allen Crags, the
search phrase:-

"Allen Crags" "Lake District"

the number of results should be the same as

"Allen Crags" AND "Lake District"

the term "Lake District" on its own produces 7 million pages, so
surely an OR would produce many more results - the combination of any
page that contains "lake district" and any that contains "Allen
Crags"..

However, the problem is that an AND and an OR search of the two terms
produce the same results in some cases.

For example, this is taken directly from the resulting .csv from my API script:

"Allen Crags" OR "lake district" 3510 FALSE 0.087456
"Allen Crags" "lake district" 3510 FALSE 0.135247
"Allen Crags" AND "lake district" 3510 FALSE 0.737335

also, when you perform a manual search (go to Google and type in the
query) the results are:-

"Allen Crags" OR "lake district" 1180
"Allen Crags" "lake district" 1200
"Allen Crags" AND "lake district" 1180

what is even more confusing is that with the manual search, the name of
the hill has fewer pages returned than the name of the hill plus the
term "lake district", which is there to reduce the number of
irrelevant pages, by hopefully limiting the results to the Allen Crags
that is in the Lake District, and not including any others there may
be..

"Allen Crags" OR "lake district" 1180
"Allen Crags" 609

surely, you would expect there to be less results when you add the
geographic filter to the search, not more...
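
To make that expectation concrete: if the counts were exact set sizes, an AND count could not exceed the smaller single-term count, and an OR count could not fall below the larger one. The Python sketch below checks those bounds; `check_counts` is a hypothetical helper name, and the 7 million figure is the rough value quoted earlier in this post.

```python
def check_counts(a: int, b: int, both: int, either: int) -> list[str]:
    """Return the set-arithmetic bounds that the four counts violate.

    a, b    -- counts for each term searched on its own
    both    -- count for the AND query (pages containing both terms)
    either  -- count for the OR query (pages containing at least one term)
    """
    problems = []
    if both > min(a, b):
        problems.append("AND count exceeds the smaller single-term count")
    if either < max(a, b):
        problems.append("OR count is below the larger single-term count")
    if either > a + b:
        problems.append("OR count exceeds the sum of the single-term counts")
    return problems


# Manual-search figures from this post: "Allen Crags" alone 609,
# "lake district" alone roughly 7 million, AND 1180, OR 1180.
issues = check_counts(a=609, b=7_000_000, both=1180, either=1180)
for issue in issues:
    print(issue)
```

Any violation reported here only shows that the figures do not behave like exact set sizes; it says nothing about how they are actually produced.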

So, is it the case that in some popular searches Google does not
actually perform a search, but rather pulls up a pre-prepared list of
results pages? Because it looks to me as if it's not actually performing
the search on the fly...

Any help would be greatly appreciated for a confused geographer

Manfred

Aug 23, 2006, 3:34:49 PM
ian_s...@yahoo.com wrote:
> "Allen Crags" OR "lake district" 1180
> "Allen Crags" 609
>
> surely, you would expect there to be less results when you add the
> geographic filter to the search, not more...

No. Google clearly states that these numbers are estimates. We don't know
anything about the algorithm used to calculate them, so assuming that
basic arithmetic operators apply to this set is a misconception on
your side.
Those estimates are _not_ numbers in the common sense but look-alikes,
meant to give you some idea, maybe the wrong one.

Manfred
