A while ago I ran an experiment, which I have mentioned here periodically, in which I disallowed access to a couple of pages that I thought had the highest-quality inbound links.
At almost exactly the same time as they were removed from the index, allowing for a short delay, the number of the site's pages in Google's index dropped and the remaining pages didn't do as well in the SERPs.
When I re-allowed indexing access, the site recovered, twice: once over a period of about a month, and then again over another month after the "Supplemental Result" indicator was removed, which seems to have been implemented by pushing previous data to all datacenters. In any event, I saw the fall and rise of the site occur twice, so it is doubtful it was a fluke.
But I was surprised today to find that, although the site is doing well in the SERPs and everything else seems fine, the pages that I had removed access to are NOT in Google's index currently, nor have they been for a couple of days, which is longer than it took for the site's performance to drop during the test.
If the pages that seem to contribute so much to the site's aggregate PageRank are not in the index at all now, where is the PageRank coming from?
It might seem possible that the site suddenly received some massively powerful links from somewhere, but given the timing, that is very unlikely.
It would seem that if you decide to remove a page, its PageRank is tossed to the wind, but if Google decides not to index a given page, any PageRank it has would seem to be retained and its links used to determine the PageRank of other pages.
If you are curious, the pages no longer in the index are code example pages with basically nothing on them except (X)HTML, CSS and JavaScript code, and virtually no other textual content. By way of example of an example, ;-) here is one of the pages: http://cass-hacks.com/articles/example/js_thumbs_to_full2/
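To make the hypothesis above a bit more concrete, here is a toy sketch in Python. It is plain textbook power-iteration PageRank, not whatever Google actually runs, and the page names (ext1, example1, home and so on) are made up for illustration. It also assumes that a "removed" page behaves like a page with no outbound links, whose rank gets spread evenly over every page in the graph, which is just one way to model "tossed to the wind".

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank over {page: [outbound pages]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no known outbound links spreads
                # its rank evenly over all pages ("tossed to the wind").
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical site: external pages link to two code example pages,
# which in turn link to the rest of the site.
links_kept = {
    "ext1": ["example1"],
    "ext2": ["example1", "example2"],
    "ext3": ["example2"],
    "example1": ["home", "article"],
    "example2": ["home", "article"],
    "home": ["article", "example1", "example2"],
    "article": ["home"],
}

# Same graph, but the example pages' outbound links no longer count.
links_dropped = dict(links_kept, example1=[], example2=[])

kept = pagerank(links_kept)
dropped = pagerank(links_dropped)

for page in ("home", "article"):
    print(page, round(kept[page], 3), round(dropped[page], 3))
```

In this toy graph the rest of the site scores lower once the example pages stop passing their links along, which is at least consistent with what I saw. It says nothing about how Google really stores or filters pages.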
> It would seem that if you decide to remove a page, its PageRank is tossed to the wind but if Google decides to not index a given page, any PageRank it has would seem to be retained and its links used to determine PageRank of other pages.
Just a guess:
I assume that with "decide to remove a page" you mean the URL removal function? I am pretty certain that this is not a real URL removal from the "database", but rather just a filter applied before the search results are shown.
Additionally, I feel that the link lists for PageRank calculations are independent of the indexed pages. Pages can have PageRank when they don't even exist (even when the domain name does not exist). I bet that if a page had been crawled and its outbound links were known, those links would pass PageRank even if the page ceased to exist (as long as it was in the PageRank link lists). That would be similar to the "noindex, follow" robots meta-tag.
I wonder (never a good sign)... assuming there is a high-value page that links to your site, would it be better for your site if that page was removed (404 or URL removal tool) or if the links were removed? My guess is that a missing page would still pass value, while a page with no known outbound links wouldn't. But then again, that effect is bound to be very temporary :-P.
I'm not sure how that would change anything though :-))
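For anyone who hasn't used the "noindex, follow" combination mentioned above, here is a minimal Python sketch of how a crawler might read it. The names RobotsMetaParser and robots_policy are made up for this example, and this is only an assumption about reasonable behaviour, not a description of Google's crawler: "noindex" keeps the page out of the visible index while "follow" still lets its links be used.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_policy(html):
    """Return whether the page may be indexed and whether its links may be followed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    return {
        "index": not ({"noindex", "none"} & found),
        "follow": not ({"nofollow", "none"} & found),
    }


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(robots_policy(page))
```

With no robots meta tag at all, the sketch defaults to allowing both indexing and following, which matches the usual convention.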
Just because I do that, doesn't mean you get to. :-P
> I assume that with "decide to remove a page" you mean the URL removal function?
Yep.
> I am pretty certain that this is not a real URL removal from the "database", but rather just a filter applied before the search results are shown.
That may be, but it was when the pages I requested to be removed were actually gone from the index that the site took a nose dive into oblivion. Also, the site started to revive when I re-allowed them, but it didn't happen all at once. It seemed to take about as long as it would for the links to the previously removed pages to be re-crawled.
> Additionally, I feel that the link lists for pagerank calculations are independent of the indexed pages.
That would be my guess too.
> Pages can have pagerank when they don't even exist (even when the domain name does not exist).
A URL can have PageRank for a page that doesn't exist, but the page that doesn't exist can't have PageRank, kinda sorta because it doesn't exist. ;-)
> I bet that if a page has been crawled and outbound links were known, those links would pass pagerank even if the page ceased to exist (as long as it was in the pagerank link lists). That would be similar to the "noindex, follow" robots meta-tag.
That makes sense.
> I wonder (never a good sign).. assuming there is a high-value page that links to your site. Would it be better for your site if that page was removed (404 or URL removal tool) or if the links were removed? My guess is that a missing page would still pass value, while a page with no known outbound links wouldn't.
Since losing a high-value link is never a good thing, removing the link would obviously be bad, so the best-case scenario is that the page the link is on just goes missing (404) and that, even if Google does remove it from the visible index, it still retains it somewhere. That is a big "if", though, regarding Google keeping data on 404'd pages even after removal from the visible index.
I have yet to lose a link of any significance through either method, so I guess I can't say which would be "better", if indeed there would be any difference at all.
But, it is an interesting question! Even when Google "removes" a page from its visible index that is returning a 404, how far "removed" is it???
Just a little bit, a lot? :-()
Will a little dab do ya?
> But then again, that effect is bound to be very temporary :-P.
Is it? Again, what does Google do with pages that are 404'd after finally removing them from its visible index?
It would seem to make sense that Google might actually delete them from its database entirely, as opposed to just filtering them as it seems to for owner-removed or Google-removed pages, but are we sure?
On the other hand, imagine setting up a bunch of pages with links to your favorite sites and then 404'ing the pages. Letting those links keep counting wouldn't seem to be beneficial for search quality, so my guess is that once a 404'd page is gone from the visible index, it becomes so much bit dust in the bit bucket.
> I'm not sure how that would change anything though :-))
Well, make it your first priority to find out as soon as you start work next month and report back to us. :-()
Maybe a better test, next time you do this, would be to 301 redirect the page you wanted to remove PR from? Adam has said that 301s pass PageRank and 'associated signals', so that might be a better way to cut off the PR to 'daughter pages' if the page you are 301 redirecting is the only source of PR for those daughter pages.
Actually, the point of the test was to try to throw away PageRank and see what happened, so 301'ing the URLs to some other page would have defeated the purpose of the test.
I knew the site's performance in Google would drop; in fact, I wanted it to, although I had no idea how far it would drop until I had lost 2/3rds of the site! :-()
But having had to spend almost 2 months recollecting the PageRank, not once but twice, I'm in no real hurry to see if shooting myself in the foot hurts, again.
I think it probably will. :-()
> On Aug 24, 6:22 pm, cass-hacks wrote:
> > > Does this mean that I can sell links on pages that no longer exist and not have to worry about Google penalizing me for it?
> > No no no!!!!
> > You have to have the link on the page and get money for it first before you get rid of the page.
> > You think I would believe you put a link on a page that no longer exists and didn't exist when you said you put it there?
> > What do you take me for, an idjit??
> > Umm, don't answer that. :-P
> > I can see this should have gone in Random Chat! :-()
So if pages that were indexed but then manually removed continue to pass on the link love for the links they contained the last time they were indexed... doesn't this just open up a new toolbox for black hatters?
Say I have a highly trusted domain on a good clean host. I set up a feeder page to send some link lovin' to my new antivirus venture. Since the domain is quite popular, the page is indexed quite soon. I then remove the page so it is no longer associated with the trusted domain. Now I fire up my blog commenting script and ping about 900,000 blogs with links to my trusted site page. Would this page gain in authority? Would the trusted nature of the removed page clean up the mostly spammy links?
> Actually the point of the test was to try to throw away PageRank and see what happened so 301'ing the URLs to some other page would have defeated the purpose of the test.
Yep, I understand that. What I'm saying is that if you 301'd to some other page on a different site or a different 'branch' of your site, you'd achieve the same thing: as far as PR is concerned, you'd orphan the daughter pages in the same way as just removing the 'parent page', although it would take a bit longer. A 301 passes the PageRank and associated signals of the 301'd page to the target page. Following me?
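In case it helps, here is a rough Python sketch of what serving such a 301 could look like, using only the standard library. The paths in REDIRECTS and the names resolve and make_server are made up for illustration, and of course the code only shows the HTTP side. Whether and how a search engine moves PageRank along with the redirect is entirely up to the search engine.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical mapping from the retired 'parent page' to a page on
# another branch of the site (or another site entirely).
REDIRECTS = {
    "/articles/old-example/": "/other-branch/new-home/",
}


def resolve(path):
    """Return (status, location) for a request path."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target  # moved permanently
    return 404, None  # everything else is simply missing in this sketch


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location is not None:
            self.send_header("Location", location)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=8000):
    """Build a local test server; call serve_forever() on it to run."""
    return HTTPServer(("127.0.0.1", port), RedirectHandler)
```

To try it locally, call make_server().serve_forever() and request the old path with a client that does not follow redirects, so you can see the 301 and the Location header.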
Oh, that is a horse of a totally different color!
That I might give a try!
Not right now though. I didn't have much traffic to the site while all this was going on, and it is sort of nice seeing people visit it once again. :-()
Thanks for the clarification though, that is something I had never thought about before!