I have been watching the # of indexed pages in Google for 4 large
websites.
In the past couple months, the # of indexed pages has dropped from:
10 million to 2.6 Million
8 million to 1.6 Million
690K to 150K
3 million to 300K
Have others noticed the same?
Can anyone tell me if Google is doing anything different, like
cleaning up their index (beyond normal) or perhaps they have changed
the way they calculate their numbers? Thanks.
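For scale, here is a minimal sketch that turns the four before/after figures above into relative drops. The numbers are the ones quoted in the post; the names `counts` and `percent_drop` are made up for illustration.

```python
# Before/after indexed-page counts as quoted in the post above.
counts = [
    (10_000_000, 2_600_000),
    (8_000_000, 1_600_000),
    (690_000, 150_000),
    (3_000_000, 300_000),
]


def percent_drop(before: int, after: int) -> float:
    """Return the relative drop from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


# One rounded percentage per site, in the same order as `counts`.
drops = [round(percent_drop(b, a), 1) for b, a in counts]

for (before, after), drop in zip(counts, drops):
    print(f"{before:>10,} -> {after:>9,}: {drop}% fewer pages reported")
```

All four sites lost somewhere between roughly 74% and 90% of their reported pages, so the drop is of a similar order on each site.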
"Calculate" is a stretch; "estimate" is a better term, but the accuracy
of that estimate is often off by many orders of magnitude. Anything
beyond 1000 is a pure guess, and often below 1000 they get it wrong if
you drill down to the last page. I wouldn't consider the number of
pages indexed to be much of a useful metric, unless of course that
number is zero.
There are many large sites out there with a large number of
legitimate pages. eBay has almost 200M pages indexed. Is it annoying
that eBay comes up very often when you search for a particular
product? Perhaps, but it is still valid and useful content and
obviously a sound acquisition strategy for eBay.
TEK wrote:
> There are many large, sites out there with a large number of
> legitimate pages. eBay has almost 200M pages indexed. Is it annoying
> that eBay comes up very often when you search for a particular
> product, perhaps, but it is still valid and useful content and
> obviously a sound acquisition strategy for eBay.
True.
To be honest, eBay's appearance in a search result rarely annoys me.
We have around 2.5 million pages on the web of which around 700,000
are set as "no-index" because they are user forms and other boring
stuff. Maybe the fall you are seeing is Google taking out what the
site owner should already have taken out themselves? We think
we still have sections to remove ourselves, and feel that by doing so
we gain some credibility with the dreaded "ALGO" because we got to
them before it did!
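As an illustration of the "no-index" setting described above, here is a minimal sketch that checks whether a page's HTML carries a robots meta tag with a `noindex` directive. It uses only Python's standard `html.parser`. The class and function names are hypothetical, and it looks only at the meta tag, not at an `X-Robots-Tag` HTTP header or robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def is_noindexed(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


form_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(is_noindexed(form_page))
```

Running a check like this over a list of your own URLs is one way to count how many pages you have deliberately excluded, which you can then compare against the reported index figure.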
When things are stable, i.e. we're not moving things about, we find
the Google numbers in WMT to be quite accurate in displaying the
overall total and the indexed amount of pages. Oddly there are times
Google has more pages in the index than we actually have "live for
indexing" due to the fact we have subsequently "no-indexed" some we
decided are not relevant for user search and Google has not caught up
with our changes.
We have not noticed a fall in indexed numbers; actually, quite the
opposite. Google seems to have recently improved its indexing
performance, perhaps because it's now ignoring rubbish.
A week or so ago we split off a large portion (700,000+ pages) of our
main site into a sub-domain because the whole thing was becoming too
large to manage efficiently, and within a week Google has already added
in excess of 125,000 pages of the new site into the index. GWT says
it's got 156,000 so far, but obviously some servers need time to catch
up. GWT reckons it's adding between 10K-20K pages per day. This proves
the worth of good 301 redirection and a timely update of the sitemaps.
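A minimal sketch of the kind of mapping that sits behind such a 301 redirect, assuming the split-off section lived under a single path prefix on the old host. The host names and the prefix below are hypothetical placeholders, not the poster's actual site layout.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "www.example.com"        # hypothetical main site
NEW_HOST = "wildlife.example.com"   # hypothetical new sub-domain
MOVED_PREFIX = "/wildlife/"         # hypothetical section that was split off


def redirect_target(url: str) -> Optional[str]:
    """Return the new URL a moved page should 301 to, or None if it stays put."""
    parts = urlsplit(url)
    if parts.netloc != OLD_HOST or not parts.path.startswith(MOVED_PREFIX):
        return None
    # Drop the old section prefix; the sub-domain serves the section at its root.
    new_path = "/" + parts.path[len(MOVED_PREFIX):]
    return urlunsplit((parts.scheme, NEW_HOST, new_path, parts.query, parts.fragment))


print(redirect_target("http://www.example.com/wildlife/mammals/otter.html"))
print(redirect_target("http://www.example.com/forms/contact"))
```

The same function can be used to generate the URL list for the new sitemap, so the redirects and the sitemap cannot drift apart.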
By the way the sub-domain is all about the taxonomy of the world's
wildlife, you know around 600,000+ species - all different - blame the
number of pages on Darwin!
> There are many large, sites out there with a large number of
> legitimate pages. eBay has almost 200M pages indexed. Is it annoying
> that eBay comes up very often when you search for a particular
> product, perhaps, but it is still valid and useful content and
> obviously a sound acquisition strategy for eBay.
> On Nov 25, 10:22 am, Phil Payne wrote:
> > TEK wrote:
> > > I have been watching the # of indexed pages in Google for 4 large
> > > websites.
> > > In the past couple months, the # of indexed pages has dropped from:
> > > 10 million to 2.6 Million
> > > 8 million to 1.6 Million
> > > 690K to 150K
> > > 3 million to 300K
> > Once again I have to express my wonder at a business model that
> > demands 10 million indexed pages.
Funny or odd ... I feel the other way around: eBay IS really nasty,
Amazon decent in comparison.
-luzie-
(I saw eBay appear with: "Stalin - buy Stalin here" (or something like
that; it was in the 'paid results' on the right side of the SERP, of
course). I don't want to have that.)
luzie wrote:
> >>> about the taxonomy of the world's wildlife,
> >>> you know around 600,000+ species
> gulp^^ ... I thought it was a couple of millions more, don't hope
> they've been extinct since january or so ... :-/
Very likely one or two have been extinct since tea-time.
SORRY - I did not include "insects" or "microbes" as they seem to be
too worried about their PR falling to stand still long enough for
their species to be determined...
Not sure why people have one-starred you, and frankly I think that's a
shame - certainly you've asked a very reasonable question, and the one-
star responses are not very friendly.
I'll add a theory I've noticed some people putting forward currently
that may be relevant. Some folks think that Google has moved to
multiple indices as opposed to the primary|supplemental indices of
old. If they have moved to multiple indices, then that figure may be
far more difficult to estimate on the fly.
I'd also say that Google has likely increased quality thresholds for
the content they will index.
Not to mention that the link graph has changed significantly as Google
filters out certain links - this could also have macro effects on
index size.
In reality God knows - could be any combination of the above, could be
none of the above.
To be honest, it's always harder to come up with a reasonable
approximation for sites that show a larger number of URLs in the
site: query. If you see changes there, it could be any of:
- the previous approximation was incorrect, the current one is closer
to the actual number of URLs that we have indexed or would show to
users
- the previous approximation was close and the current one is worse
than before (this can happen)
- a change in our algorithms (we make a lot of changes that will
impact crawling, indexing and ranking -- for some sites perhaps more
than for others)
- an issue on your website (perhaps it's not reachable or the site is
returning server errors -- you can find out more in your Webmaster
Tools account).
At any rate, it's difficult to say much more without knowing the URLs
involved and even then it's possible that it might just be one of the
first three items which I mentioned. Feel free to post your URL, place
it in your profile or use a service like http://cli.gs/ which lets you
create shortened URLs for your site (and remove the redirects later)
and I'll take a look. In general, without knowing your site, I'd have
to say that yes, change happens and most likely this is not something
you would need to worry about (unless crawl errors in your Webmaster
Tools account tell you a different story).
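To make the last item in the list above concrete (a site that is unreachable or returning server errors), here is a minimal sketch that buckets HTTP status codes from a crawl of your own site. The `crawl_log` data and both function names are hypothetical. Webmaster Tools reports crawl errors in its own way, so this is only a rough local equivalent.

```python
from collections import Counter


def classify_status(code: int) -> str:
    """Bucket an HTTP status code by its class."""
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "unknown"


def summarize(crawl_log):
    """Count how many (url, status) pairs fall into each bucket."""
    return dict(Counter(classify_status(status) for _url, status in crawl_log))


# Hypothetical sample data: (url, status code) pairs from your own crawl.
crawl_log = [
    ("http://www.example.com/", 200),
    ("http://www.example.com/old-page", 301),
    ("http://www.example.com/missing", 404),
    ("http://www.example.com/search", 503),
    ("http://www.example.com/about", 200),
]
print(summarize(crawl_log))
```

A rising share of "server error" entries over time would point to the fourth item in the list rather than to a change in the estimate.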