Hey all, I've been meaning to stop by the webmaster help group, and the "Popular Picks" thread drew me in. Here's the question I'll tackle: Admin Aaron asked "What are some appropriate ways to use the nofollow tag other than to protect against blog comment spam?"
My short answer is that the nofollow attribute on links is a pretty general mechanism, and you're welcome to use it how you like. Let me tell you what it does, then I'll give an example or two. I answered a nofollow question for Rand Fishkin recently. You can read the full details at http://www.seomoz.org/blog/questions-answers-with-googles-spam-guru , but I'll quote the important bit:
"The nofollow attribute is just a mechanism that gives webmasters the ability to modify PageRank flow at link-level granularity. Plenty of other mechanisms would also work (e.g. a link through a page that is robot.txt'ed out), but nofollow on individual links is simpler for some folks to use. There's no stigma to using nofollow, even on your own internal links; for Google, nofollow'ed links are dropped out of our link graph; we don't even use such links for discovery. By the way, the nofollow meta tag does that same thing, but at a page level."
So nofollow as a link attribute causes Google to drop those links out of our link graph. If you have a nofollow link from page A to page B, we won't crawl via page A's link to discover page B. Note that we may still find page B via other links around the web, though.
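To make the page A / page B description concrete, here is a minimal sketch of how a crawler might separate the links it keeps from the ones it drops. It is only an illustration using Python's standard `html.parser`; the class name, function name and sample URLs are made up for the example, and nothing here claims to be how Googlebot is actually implemented.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect hrefs from <a> tags, split by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel holds space-separated tokens; compare them case-insensitively.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def split_links(html):
    """Return (followed, nofollowed) href lists for one page of HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.followed, collector.nofollowed


# Hypothetical markup for "page A".
page_a = """
<a href="/page-b" rel="nofollow">Page B</a>
<a href="/page-c">Page C</a>
<a href="/page-d" rel="external NoFollow">Page D</a>
"""
followed, nofollowed = split_links(page_a)
print(followed)    # ['/page-c']
print(nofollowed)  # ['/page-b', '/page-d']
```

In this sketch only the `followed` list would be queued for discovery, so page B would have to be reached through some other link.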
What are some appropriate ways to use the nofollow tag? One good example is the home page of expedia.com. If you visit that page, you'll see that the "Sign in" link is nofollow'ed. That's a great use of the tag: Googlebot isn't going to know how to sign into expedia.com, so why waste that PageRank on a page that wouldn't benefit users or convert any new visitors? Likewise, the "My itineraries" link on expedia.com is nofollow'ed as well. That's another page that wouldn't really convert well or have any use except for signed in users, so the nofollow on Expedia's home page means that Google won't crawl those specific links.
Most webmasters don't need to worry about sculpting the flow of PageRank on their site, but if you want to try advanced things with nofollow to send less PageRank to copyright pages, terms of service, privacy pages, etc., that's your call.
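For anyone curious what "sculpting" would look like in numbers, here is a toy model. It assumes the textbook PageRank power iteration with a damping factor of 0.85 and treats a nofollow'ed link as simply missing from the link graph, as described in the quote above. The four-page site is invented, and the real ranking system is certainly more involved than this.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank over {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no followed links spreads its rank over every page.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page site: every page links back to the home page.
plain = pagerank({
    "home": ["products", "terms", "privacy"],
    "products": ["home"],
    "terms": ["home"],
    "privacy": ["home"],
})

# Same site, with the home page's links to terms and privacy nofollow'ed,
# modeled here by leaving those two edges out of the graph.
sculpted = pagerank({
    "home": ["products"],
    "products": ["home"],
    "terms": ["home"],
    "privacy": ["home"],
})

for page in ("home", "products", "terms", "privacy"):
    print(f"{page:9s} plain={plain[page]:.3f} sculpted={sculpted[page]:.3f}")
```

Under these assumptions the "products" page ends up with a larger share and the terms and privacy pages with a smaller one, which is the whole point of the exercise.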
I gave another example where nofollow would work well at http://www.mattcutts.com/blog/quick-comment-on-nofollow/ . Someone wrote an oompa loompa dating site as a joke, but that site started to get hit with spammy comments. If you write custom software where you're worried that people might spam the software with links to, I dunno, Ukrainian porn sites, then you can add nofollow in your software on the links that you think might be spammed. If a spammer has a choice between your software and some other software that doesn't use nofollow, your software might not get hit as often by spammers.
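If you do write your own comment or guestbook software, adding nofollow can be as simple as the sketch below. The function name is hypothetical, and the scheme check is an extra precaution I'm assuming you would want rather than anything from the post itself.

```python
from html import escape
from urllib.parse import urlsplit

# Assumption: only plain web links are allowed in user-submitted content.
ALLOWED_SCHEMES = {"http", "https"}


def render_user_link(url, text):
    """Render a user-submitted link with rel="nofollow" and escaped values."""
    try:
        scheme = urlsplit(url).scheme.lower()
    except ValueError:
        # urlsplit rejects some malformed URLs; fall back to plain text.
        scheme = ""
    if scheme not in ALLOWED_SCHEMES:
        return escape(text)
    return '<a href="{href}" rel="nofollow">{label}</a>'.format(
        href=escape(url, quote=True),
        label=escape(text),
    )


print(render_user_link("http://example.com/?a=1&b=2", "my <site>"))
# <a href="http://example.com/?a=1&amp;b=2" rel="nofollow">my &lt;site&gt;</a>
```

Every link that comes from a visitor goes through this one function, so the nofollow can't be forgotten on some template.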
> My short answer is that the nofollow attribute on links is a pretty general mechanism, and you're welcome to use it how you like.
I would LIKE to use it to automagically increase my SERPs positions. I don't think that will happen though. :-()
Seriously though, I always thought of ad-hoc standards like robots.txt, sitemap.xml, etc. the same way as IETF RFCs and W3C or ITU-T Recommendations, all with an emphasis on "recommendation".
People may have had their various, and usually different reasons for wanting a given recommendation but once it is out in the wild, people are pretty much free to use it as they wish, on both sides of the interoperability table.
One question I still have is regarding, "(e.g. a link through a page that is robot.txt'ed out)".
I am not understanding the "through a page" part.
I read in the linked to interview, "Now, robots.txt says you are not allowed to crawl a page, and Google therefore does not crawl pages that are forbidden in robots.txt. However, they can accrue PageRank, and they can be returned in our search results."
So, to achieve the same thing in a robots.txt file as in the use of nofollow, you would essentially need three pages, the first page with a link to the second, which is disallow'ed and has a link on it to a third, the page one originally wanted nofollow'ed?
Is that the meaning of "through a page"?
By the way, I was surprised to read that a page that is "disallowed" and is not crawled can show up in SERPs. Would that be possible only because of offsite inbound links to that page? If a bot can't crawl a page, how else could it know what is on the page in order to decide what search terms to rank it for?
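The three-page arrangement asked about above can be checked with Python's standard `urllib.robotparser`. This is a sketch only: the robots.txt rules, the host name and the three paths are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the intermediate "page 2" directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /redirect/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for path in ("/page1.html", "/redirect/page2.html", "/page3.html"):
    allowed = parser.can_fetch("Googlebot", "http://www.example.com" + path)
    print(path, "crawlable" if allowed else "blocked by robots.txt")
```

A crawler that honors these rules never fetches page 2, so it never sees the link from page 2 to page 3, even though page 3 itself is not blocked.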
Thinking about robots.txt'ed files accruing PageRank, that would seem to mean that if a given file or directory is disallow'ed yet linked to within the site, the links to whatever is disallow'ed should be nofollow'ed. Otherwise they will end up passing PageRank to pages that don't need it, thereby reducing the PageRank available for other links on those pages, right?
Craig
p.s. mpilatow, fortunately you are not in charge of Google's methods of determining intent. ;-)
Essentially, yes: to link to page 3 from page 1, you run the link through page 2, which is robots.txt'd out. It's not the same as nofollow, but it has the same effect of not passing PR to page 3. The bummer is that it DOES pass PR to page 2, which ends up being this black hole full of beautiful green PR (does that make it a green hole? Anyway, I digress). As you mentioned in your follow-up comment, you'd have to hit the link to page 2 with a nofollow as well to make sure you don't pass all your PR into a big hole. Theoretically, the page with the highest PageRank on your site could be a disallowed page!
Using nofollow on a link from the first page to the third page will not pass PR to page 3. Now here is where the real catch comes in, because the search engines treat this blooming attribute differently. Wikipedia has a good rundown on it: http://en.wikipedia.org/wiki/Nofollow . As you can see, Matt's statement further underlines the statements on that Wikipedia page. So from Google's point of view, the nofollow tag is a much easier and better choice, since you don't have the whole issue of PR 'leakage'. I think Google's treatment is the easiest to understand, considering the thing is called "no follow", but I can see why Yahoo doesn't use it as such, because that is not how the attribute was originally intended. Anyway, if you want something that all search engines treat the same way, I guess the option with a nofollowed link to a disallowed page, which then links on to the final target page, is the only one that works.
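To put rough numbers on the "leakage", here is a back-of-the-envelope sketch. It assumes the classic PageRank model, where a page passes a damped share of its rank split evenly across its followed links, and it assumes a nofollow'ed link is simply left out of that split. The function name and URLs are hypothetical.

```python
def link_shares(outlinks, damping=0.85):
    """Fraction of a page's own rank passed along each followed link.

    outlinks is a list of (url, is_nofollow) pairs.
    """
    followed = [url for url, is_nofollow in outlinks if not is_nofollow]
    if not followed:
        return {}
    share = damping / len(followed)
    return {url: share for url in followed}


# Page 1 links to two normal pages and to the disallowed page 2.
leaky = link_shares([
    ("/products", False),
    ("/about", False),
    ("/blocked/page2", False),
])

# Same page, with the link to the disallowed page nofollow'ed.
plugged = link_shares([
    ("/products", False),
    ("/about", False),
    ("/blocked/page2", True),
])

print(leaky)
print(plugged)
```

In the first case a third of what page 1 passes on goes to a page that is never crawled; in the second case that share is divided between the two pages that can actually use it.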
I was also surprised by the disallow'd pages still being able to show, but I guess the explanation in the article made some sense in a wacky kind of way.
"If a bot can't crawl a page, how else could it know what is on the page in order to decide what search terms to rank it for?"
The power of anchor text, I'd guess. These would have to be pretty detailed search queries, ones that include a brand name plus some specific terms that don't have many other results, and especially not other results from that site, as these would easily outdo the disallowed page. You'd get the typical disallowed-page look in the index. I did think the article was funny with regard to eBay: they used to not allow Google to crawl them, and now they make up 25% of the Google SERPs, LOL.
From a personal point of view, I don't buy the "we'd look suboptimal if we didn't return these results when someone searched for them" reasoning as to why it was done this way. I know that I was trying to do searches earlier today using our brand name + a unique phrase and our site wasn't returned. It's the same when searching "John Chow" by name. I agree that it makes Google look suboptimal, but they've known about this for a while and aren't doing anything to make sure that what the user is searching for gets returned, and these are allowed pages. I've had to switch to Yahoo regularly of late due to cases like this.
> On Oct 9, 5:25 pm, cass-hacks wrote:
> > Nice!
Yeah Craig, this issue (along with the paid links issue) has been pretty popular lately across a number of blogs and forums I peruse. It just kinda annoys me that they want you to build sites for users but then tell you to use a tag that is expressly for the Google spider. When it was first introduced, it was designed to combat blog spam, and from the looks of things it has had limited success there. Then it was for linking to sites you can't vouch for, but why would I link to a site that I can't vouch for? Now they are saying it might be a good idea to manipulate PageRank (which I always thought was against another guideline, something about not getting involved with link schemes designed to manipulate PageRank). Anyway, I understand the reason it was introduced, but I still have some problems with the way it is being used now.
> On Oct 9, 5:58 pm, cass-hacks wrote:
mpilatow - nothing personal, BUT I feel that people will bitch about anything. Matt gives us a few helpful hints for better webmastering, and you and others still don't seem to be happy. Some have legit complaints, yeah... while most just use this weak argument (build sites for humans) to lash out at Google.
Let's grow up and realize that Google is only a few years old, and they adapt just as we do. Matt offering this information on how to better use nofollow is very useful to those who build websites they want to make more crawlable.
YES, today you can still do a few things (including using nofollow) to gain a bit of an advantage in Google, but do not worry: with your continual whines, Google will close ALL the holes and then you can find something else to complain about...
If you do not have HUGE PR, pointing PageRank in the right direction is a smart idea, a gift really.
A quick example: a blog has a sitemap AND categories, which are kind of duplicates in a way. I just pinched off the PageRank to my sitemap with a nofollow because I want it to go to each individual category. Is this gaming Google? Not at all; this is making a more crawlable/stable website with limited PageRank. THIS helps "mom and pop".
> Craig, I'm biting my fingers here to stop from typing "search and thou shalt find" as I've seen you reply a few times :)
I'm guessing what is really holding you back is your seeing the difference between the two situations.
On one hand, a polite request for confirmation, which doesn't preclude the possibility of additional information or references being known.
And, on the other hand, a confrontational claim seemingly based on the assumption that absence of knowledge equals knowledge of absence which almost by definition precludes knowing of available information or references.
Maybe we can just chalk it up to a difference in communication styles although you have to know that if one starts out on the offensive, someone invariably has to "lose". ;-)
Right, it still seems "wacky" though. (I'm beginning to like that term!)
> Essentially, yes, to link to page 3 from page 1, you run the link through page 2 which is robots.txt'd out. It's not the same as nofollow though, but it has the same effect of not passing PR to page 3. The bummer is that it DOES pass PR to page 2, which ends up being this black hole full of beautiful green PR.
That's definitely wacky! :-()
> .... (does that make it a green hole, anyway I digress, but as you mentioned in your follow up comment, you'd have to actually hit the link to page two with a nofollow as well to make sure you don't pass all your PR into a big hole - theoretically the page with the highest pagerank on your site could be a disallowed page!).
What I was thinking was actually to get rid of the intermediate page and just nofollow a link to a disallow'ed target page. That would seem to work for Google at least, I think.
> Using nofollow on a link from the first page to the third page will not pass PR to page 3. Now here is where the real catch comes in because all search engines treat this blooming attribute differently. Wikipedia has a good rundown on it: http://en.wikipedia.org/wiki/Nofollow .
The Wikipedia article seems rather confusing at times.
It states, "experiments conducted by SEOs show conflicting results".
But, the SEO experiment(s) cited doesn't seem to prove much of anything as to whether or not Google follow'ed nofollow'ed links.
About the only thing the cited experiment shows is that Google didn't index the experiment page for quite some time.
I was thinking the Wiki article could be helped by a definition of "follow", but at the same time, what "follow" actually means is moot, as it is implementation/search-engine specific. The real questions concern indexing, SERPs and PageRank.
So, saying someone "follows" something or not seems to add nothing except confusion.
Either that or I am easily confused. :-()
> As you can see, Matt's statement further underlines the statements on that wikipedia page.
Part of the problem is the Wiki page confused the hell out of me as it seems to say a lot about very little and cited experiments seem to not prove what they set out to or what they are claimed to.
> So > from Google's point of view, the nofollow tag is a much easier and > better choice since you don't have the whole issue of PR 'leakage'.
The PageRank "leakage" would seem a very important point, which is definitely one I had missed before.
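To make the "leakage" point a bit more concrete, here is a rough sketch of a toy PageRank calculation. It is only an illustration under stated assumptions: the page names are made up, the damping factor of 0.85 and the rule that a page with no known out-links spreads its rank evenly are textbook conventions, and none of it is claimed to match what Google actually runs.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over a dict {page: [outlinks]}.

    Assumption: a page with no known out-links (for example a disallowed
    page the crawler cannot read) spreads its rank evenly over all pages.
    """
    pages = sorted(set(links) | {t for outs in links.values() for t in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for t in outs:
                    new[t] += share
            else:
                share = damping * rank[p] / n
                for t in pages:
                    new[t] += share
        rank = new
    return rank


# Hypothetical site: "redirect" is robots.txt'd out, so its out-links are unknown.
followed = {
    "home": ["about", "redirect"],  # plain link into the disallowed page
    "about": ["home"],
    "redirect": [],
}
# Same site, but the link to "redirect" is nofollow'ed, so (per the
# description quoted in this thread) it is dropped from the link graph.
nofollowed = {
    "home": ["about"],
    "about": ["home"],
}

leaky = pagerank(followed)
tight = pagerank(nofollowed)
print("followed link:", {p: round(r, 3) for p, r in leaky.items()})
print("nofollow'ed:  ", {p: round(r, 3) for p, r in tight.items()})
```

In this toy model the disallowed page ends up holding a real share of the rank when it is linked normally, and the crawlable pages hold more once that link is taken out of the graph.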
> I think Google's is the most easy to understand considering the thing is called "no follow", but I can see why Yahoo doesn't use it as such because that is originally not how the attribute was intended.
As far as a nofollow'ed link not passing ranking juice, it would seem that Yahoo follows the original intent more closely by not involving other issues, like indexing and showing up in SERPs, for which one should use noindex if one really cared.
I think it might be easily agreed though that "nofollow" might not have been the most appropriate name for this tag, at least without an agreed upon definition of what "follow" means.
Maybe "nolinkjuice" would have been more fitting. :-()
> Anyway, if you want to use something that all search engines do the same with I guess the option with a nofollowed link to a disallowed page which then links on to the final target page is the only one that works.
True, although all I am really concerned about is Google, for which a nofollow'ed link to a disallow'ed page, with no intermediate page, seems like it should work.
It is not that I don't care about ranking in MSN or Yahoo but instead, even when I essentially wiped out a site's PageRank in an experiment, search results and indexed page counts for Yahoo and MSN didn't even seem to notice.
> I was also surprised by the disallow'd pages still being able to show, but I guess the explanation in the article made some sense in a wacky kind of way.
True, it is wacky but then again, the whole seemingly ill named nofollow thingy is wacky so what do you expect? :-()
I think were it better named, it wouldn't be so strange but as it is, since it seems everyone has their own definition, it ends up with one having to try to thread a needle from ten feet away to keep everyone happy.
> "If a bot can't crawl a page, how else could it know what is on the page in order to decide what search terms to rank it for?"
> The power of anchor text I'd guess.
That would be my guess as well. We can already see pages that are in various SERPs purely due to linkage. But, doesn't that bring up the whole "Google bombing" thingy?
Although that's a whole other subject and I don't want to get into here, if a page can show up in SERPs due to inbound anchor text alone, how is that not the equivalent of what one does when attempting "Google bombing"? Maybe there is a distinction somewhere but I seem to be missing it if there is.
> These would have to be pretty detailed search queries I'd say, ones that include a brandname plus some specific terms that don't have many other results and especially not other results from that site as these would outdo the disallowed page easily.
It depends on your definition of "detailed". ;-)
I don't have any of my names on a given site at all, yet due to external anchor text alone, the site shows up third for one of my names, out of almost 2 million available pages!
I could agree though that a search on an actual name could be considered pretty detailed so maybe that isn't quite fair.
But, even though I know a search for an allow'ed/index'ed/follow'ed page is different than a nofollow'ed page, I wonder if it makes any difference, once the page is actually in Google's index, one way or another.
Either way, it's just plain wacky! :-()
> You'd get the typical disallowed pages look in the index.
"disallowed pages look", I've never seen that. Can you describe it or do you have an example of it?
I seem to remember someone somewhere talking about it but I don't think I've ever seen what it ends up looking like in the SERPs.
I know, I lead a sheltered life. :-()
> I did think the article was funny with regards to ebay. They used to not allow Google to crawl them and now they make up 25% of the Google serps LOL
:-()
How about "The Search is mightier than the auction." or maybe, "no search tingy, no auction tingy". :-()
> From a personal point of view I don't buy the "we'd look suboptimal if we didn't return these results when someone searched for them" reasoning as to why it was done this way.
I would tend to agree, to an extent. Speaking about nofollow by itself, if I link to a page on my site without a nofollow link and you link to it from your site, with a nofollow link, I would hope it would still have a chance to show up in SERPs so from a "prior knowledge" point of view, I can understand.
But, if only nofollow links exist, I would think it shouldn't ever show up although I don't think it possible to prove experimentally one way or another as it can actually be hard to get people to either not link at all to a given page or, have them link using the "correct" method to protect the experiment.
Of course if we change that around and I nofollow'ed a link to a page on my site and you link to it from yours but don't nofollow it, Google indexing the page and returning it in SERPs based on your "vote" by linking to it without a nofollow would seem consistent with PageRank in general.
> I know that I was trying to do searches earlier today using our brandname + a unique phrase and our site wasn't returned. It's the same when searching "John Chow" by name. I agree that it makes Google look suboptimal, but they've known about this for a while and aren't doing anything to make sure that what the user is searching for gets returned - and these are allowed pages. I've had to switch to Yahoo regularly of late due to cases like this.
Right, but at the same time, I search for any one of my names and they all lead to one of my sites.
Thinking about that, could the difference between your and my case be that Google possibly sees your site in a similar light as John Chow's?
Anyhoo, I've forgotten what we were talking about. :-()
Oh yeah, I remember now: to be safe across all search engines, one needs 3 pages (source, intermediate and target). But for Google alone, it would seem that only the source and target pages are needed, with the source page link being nofollow'ed and the target page being disallow'ed. Is that right?
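As a rough way to check the two-page setup described above (a nofollow'ed link pointing at a disallow'ed target), something like the following sketch could be used. The robots.txt content, the `/members/` path and the example.com URLs are all made up for illustration; only `urllib.robotparser` from the Python standard library is relied on, and the helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the site; the path is invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
"""


def is_disallowed(robots_txt, user_agent, url):
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def is_sealed(rel_value, robots_txt, user_agent, target_url):
    """True when the link carries rel="nofollow" AND its target is disallowed."""
    rel_tokens = (rel_value or "").lower().split()
    return "nofollow" in rel_tokens and is_disallowed(
        robots_txt, user_agent, target_url
    )


target = "http://www.example.com/members/signin.html"
print(is_sealed("nofollow", ROBOTS_TXT, "Googlebot", target))
print(is_sealed(None, ROBOTS_TXT, "Googlebot", target))
print(is_sealed("nofollow", ROBOTS_TXT, "Googlebot",
                "http://www.example.com/index.html"))
```

This only checks that both conditions are present on your own site. It says nothing about how any particular search engine treats them, or about followed links to the same target from elsewhere.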
> > Craig, I'm biting my fingers here to stop from typing "search and thou shalt find" as I've seen you reply a few times :)
> I'm guessing what is really holding you back is your seeing the difference between the two situations.
> On one hand, a polite request for confirmation, which doesn't preclude the possibility of additional information or references being known.
> And, on the other hand, a confrontational claim seemingly based on the assumption that absence of knowledge equals knowledge of absence which almost by definition precludes knowing of available information or references.
> Maybe we can just chalk it up to a difference in communication styles although you have to know that if one starts out on the offensive, someone invariably has to "lose". ;-)
> Right, it still seems "wacky" though. (I'm beginning to like that term!)
> Yeah Craig, this issue (along with the paid links issue) have been pretty popular lately across a number of blogs and forums I peruse.
True, same here. The problem is though that no one seems to agree on much of anything, which is maybe not all that odd since the search engines themselves seem to not agree as to its implementation. Not like that is anything new though. :-()
> It just kinda annoys me that they want you to build sites for users but then tell you to use a tag that is expressly for the Google spider.
I can see your point but at the same time, I look at it a little differently.
For me, it comes down to the term "guideline". In various international and industrial standards circles, "Guideline" has a technical meaning. Without citing a number of references, some of which don't fully agree as to their usage, a guideline is used to help improve interoperability of implementations.
In other words, if we both follow the same guidelines, my web site should interoperate well with your search engine. On the other hand, if either one of us chooses to implement something contrary to the guideline, the likely result is that my implementation, my web site, won't do very well trying to work with your implementation, your search engine.
So, just as we all follow W3C recommendations and actual guideline documents to try to make our web sites interoperate better with browsers, with varying levels of success, search engines, Google specifically, have their guidelines.
From a search engine point of view, it seems they couldn't care less about all the W3C tag soup and would be more than happy just dealing with pure content, but then our sites wouldn't look very good considering how browsers work. Similarly, the W3C doesn't care about search engines but at the same time, our sites have to interoperate with search engines so all we can do is follow their guidelines.
This is probably where you have the main problem though, right? I can understand why: Google says, "build sites for visitors and not for search engines", or something to that effect, which would at first appear to be contradictory.
But, with almost all technical guidelines, it is not the letter of the text that is the "rule" but instead, the intent behind it. If the letter of the text were the rule, it would be a "standard" and not a "guideline". It is this that prompted my comment about being in charge of Google methods of determining webmaster intent.
Just as Google's guidelines say that hidden text is bad, there actually are types of hidden text that are not. Similarly, just as Google says that one should not build pages for search engines but instead for visitors, that does not say that one shouldn't try to make one's pages as search engine friendly as possible at the same time.
Related, I think it is easy to notice a page built specifically for search engines and search engines only but, without looking into the source or having some tool that makes things apparent, can you tell if a given page is noindex'ed or a link is nofollow'ed or not?
True, various elements can be and are added to pages to make communicating intent to search engines more clear but that does not mean the pages are necessarily built only for search engines.
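For what it's worth, a tool that "makes things apparent" doesn't need to be much. Here is a minimal sketch using only Python's standard `html.parser`; the `RobotsHintScanner` class name and the sample page are invented for illustration, and it only reads the robots meta tag and `rel` attributes in the markup it is given.

```python
from html.parser import HTMLParser


class RobotsHintScanner(HTMLParser):
    """Collect the robots meta directives and nofollow'ed links of one page."""

    def __init__(self):
        super().__init__()
        self.meta_directives = set()
        self.nofollow_links = []
        self.followed_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # e.g. content="noindex, follow" -> {"noindex", "follow"}
            content = attrs.get("content") or ""
            self.meta_directives.update(
                token.strip().lower()
                for token in content.split(",")
                if token.strip()
            )
        elif tag == "a" and attrs.get("href"):
            rel_tokens = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel_tokens:
                self.nofollow_links.append(attrs["href"])
            else:
                self.followed_links.append(attrs["href"])


# Made-up page source for illustration.
SAMPLE = """
<html><head>
<title>Example</title>
<meta name="robots" content="noindex, follow">
</head><body>
<a href="/signin" rel="nofollow">Sign in</a>
<a href="/deals">Deals</a>
</body></html>
"""

scanner = RobotsHintScanner()
scanner.feed(SAMPLE)
print("meta robots:", sorted(scanner.meta_directives))
print("nofollow'ed links:", scanner.nofollow_links)
print("followed links:", scanner.followed_links)
```

None of this is visible to an ordinary visitor in the browser, which is rather the point being made above.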
At the same time, how does a robots.txt file fit in since it is not actually a page but is there specifically for bots?
> When it was first introduced it was designed to combat blog spam and from the looks of things it has had limited success there.
True. I don't think most spammers know enough about search engines to consider much of any link juice benefit. Their goal is and always has been to put their links in front of as many eyeballs as possible.
It seems the most effective means, far and above nofollow, is making it impossible to automate spamming.
> Then it was to link to sites you can't vouch for, but why would I link to a site that I can't vouch for.
A fair question but, there are cases where it would make sense. What if I wrote a page about link farms and wanted reference links to them?
I know that is a limited case but I could probably come up with a couple more, although not all that many.
> Now they are saying it might be a good idea to manipulate page rank (which I always thought was another guideline, something about don't get involved with link schemes designed to manipulate page rank). Anyway, I understand the reason it was introduced but I still have some problems with the way it is being used now.
Once "technology" is introduced into the wild, all control over it is essentially lost. I know that doesn't help as it makes it seem hopeless but at the same time, it doesn't need to be hopeless because one can use it to one's best advantage or, just not use it at all.
On the other hand, the intent of "don't get involved with link schemes designed..." needs to be considered as well, besides what the text actually says.
I guess one distinction could be that link farms and other schemes are intended to be used to artificially increase ranking, i.e. the links are there only for the link juice they generate while the use of nofollow is intended to support linking without the side effect of passing link juice.
One thing that I haven't seen mentioned throughout all of this though is that it seems "nofollow", or at least something like it but with a better name, has been needed for some time as there would seem to have been a "hole" without it.
"noindex" keeps a page from being indexed and "disallow" keeps a page from being "visited", but both are effective only on the owning site's side; a page owner can use either to indicate their desires concerning a given page. Without "nofollow", a linking site had no possible influence other than passing a positive vote. With "nofollow", linking sites now have tools similar to those page owners have.
In any event, for better or worse, we seem to have to deal with nofollow one way or another, whether or not we use it.
So the indication, at least in this case, is that the <title> is ignored, even though the page has one, and there is no description snippet, which also likely wouldn't be displayed ever were there one?
How did they get in Google's index? Do you link to the various feeds without a link condom?
Ya mate - they are just robotted out rather than nofollowed (typical wordpress behavior) although I may write a plugin to change that.
The title and description are not included simply because Google doesn't actually parse the page - they just see a link to it and assume that it must therefore exist, according to Matt's explanation in the Eric Enge interview.
That's the look I was talking about indeed. Our search.cfm page has this (I am actually beginning to think that by following Matt's advice and disallowing the search page, it confused googlebot so much that our site was penalized, but that aside) if you do a search for site:www.travellerspoint.com search.cfm . You'll see what I meant earlier about it being quite a specific query, since other pages on the site that include that file extension somewhere in the body rank for the same thing and it's not until you click on the 'see more results' that you see all the remaining ones still in the index (there aren't that many since I put through a request to have them removed about 6 months ago, but still a few stragglers).
> Thinking about that, could the difference between your and my case be that Google possibly sees your site in a similar light as John Chow's?
I'd be quite insulted by that if it was the case and that's not anything against John Chow because his blog does make for interesting reading (I only found it BECAUSE of his penalty though, so how is that for twisted :) ). But he does so many things that are in the grey zones of Google's terms, whereas no one has been able to point me to one area on our site yet and I know a few well known SEO's who have looked. We've got another on the case right now and in the not too far future, if none of these professionals can figure it out, we won't really have much choice but to assume Googlebot has lost the plot and therefore take it further. The way I see it, if it can happen to us it can happen to 99.99% of the cases out there and that's a worrying thought for anyone in the development field!! Anyway, I digress (again)...
> One thing that I haven't seen mentioned throughout all of this though is that it seems "nofollow", or at least something like it but with a better name, has been needed for some time as there would seem to have been a "hole" without it.
Yes, I was just thinking that on the way to work this morning. Kind of blows years of theories out of the water where people have been 'sculpting PageRank' by disallowing pages :) Ouch...
Your idea of linking to a disallowed 2nd page via nofollow is of course simpler, but ONLY in the case of the page you are ultimately trying to link to being on your site. For example, since this whole mess started, one thing we've done is link to affiliate programs through a disallowed page (which must be collecting tons of beautiful green pr!) and as you don't control the final destination page, that's the only way to do it if you want to ensure it is doing the same thing across all the search engines. For your own site, your option would be cleaner (if you aren't caring about Yahoo and MSN)... which on a side note, I would.
Only catering to one engine isn't the most viable solution in the long run and Yahoo has been making clear improvements across the board. There's lots of talk on webmaster forums about switching, just like there was prior to everyone moving to Google. With a better ad solution in place (man, they are REALLY dropping the ball there) it could seriously start picking up momentum and one benefit they have as a business is multiple income sources. One downturn in ad expenditure or Google marketshare and its share price is going to plummet, not to mention all these 'free' products will start drying up... but of course I say that as Google hits an all time high in share price! :)
> Yes, I was just thinking that on the way to work this morning. Kind of blows years of theories out of the water where people have been 'sculpting PageRank' by disallowing pages :) Ouch...
Hmm.. yes it does.. kinda (apart from the fact that by disallowing dupes you end up with a greater chance your 'real' content is represented in the serps and hence more widely linked to). Methinks I need to write another plugin. This thread is a real eye opener.
> It's a wacky world, and only getting wackier....
> On Oct 10, 7:16 am, dockarl wrote:
> > Ya mate - they are just robotted out rather than nofollowed (typical wordpress behavior) although I may write a plugin to change that.
> > The title and description are not included simply because google doesn't actually parse the page - they just see a link to it and assume that it must therefore exist, according to Matt's explanation in the Eric Enge interview.
> > Cheers,
> > doc
> > On Oct 10, 1:32 pm, cass-hacks wrote:
> > > > Hi Craig - I think this is what the 'pages that Google infers are there but can't actually check' pages look like:-
> > > So the indication, at least in this case, is that the <title> is ignored, even though the page has one, and there is no description snippet, which also likely wouldn't be displayed ever were there one?
> > > How did they get in Google's index? Do you link to the various feeds without a link condom?
Hi Matt, thanks for the helpful guide. One thing I am confused about is the example of the expedia site. All their links to customer support are nofollowed yet that page has a PR6 - presumably because other sites somewhere are linking to it. Therefore, if PR still flows from external sources/links, is there any point in implementing a nofollow on your site when you may be unaware of other sources having a followed link to it?
'is there any point of implementing a nofollow on your site when you may be unaware of other sources having a link to it that follows.'
Duh, to answer my own question: because it will save some of your own internal PR. Sorry, not thinking straight. It does, however, seem a little odd that with all of the Expedia site nofollowing its links to customer support, that URL still has a PR6. Surely there can't be that many sites linking to that page, as presumably many would link straight to their home page.
Admin Aaron, I don't take anything on the Internet personally so don't worry about offending me. I am not lashing out at Google. I think discussions can help make a better engine. Maybe I was a bit snarky in my reply but I still think there is some validity to it. I tend to agree with most of your points on the surface and I do appreciate Matt and other Googlers coming here to answer some of the more vexing questions.
IMO the problem is that Google is relying on webmasters to patch a flaw in their algorithm and their choice to provide the silly green PR bar. Because they rely so heavily on links, many webmasters are taking advantage. Instead of figuring out a way to take care of it internally, they are now asking webmasters to change their sites to fix it for them. Of course, the webmasters that follow SEO are going to do this, but those who simply build sites for their users do not know about nofollow and they are the ones who could end up getting slapped. The green PR bar has fueled the link scheme industry and nofollow is helping some crafty webmasters abuse it even more.
I still like Google and think they offer the best results and excellent feedback, but I believe they deserve some criticism for helping create the link spam beast. The fact is they have lost control of it and need to figure out a way to rein it in, and IMO nofollow is not the solution. It is creating more ways to manipulate PageRank, and while it may be acceptable for internal pages I think it can be abused as a way to manipulate search engine rankings. You make some good points and I generally don't "whine" about many things, but I do have a problem with the whole links/nofollow issue and I am going to express my opinion.
> On Oct 10, 8:45 am, silverstall wrote:
> > Hi Matt, thanks for the helpful guide. One thing I am confused about is the example of the Expedia site. All their links to customer support are nofollowed, yet that page has a PR6, presumably because other sites somewhere are linking to it. Therefore, if PR still flows from external sources/links, is there any point of implementing a nofollow on your site when you may be unaware of other sources having a link to it that follows?
> > Yes, I was just thinking that on the way to work this morning. Kind of blows years of theories out of the water where people have been 'sculpting PageRank' by disallowing pages :) Ouch...
> Hmm.. yes it does.. kinda (apart from the fact that by disallowing dupes you end up with a greater chance your 'real' content is represented in the serps and hence more widely linked to). Methinks I need to write another plugin. This thread is a real eye opener.
Well yes, but I meant the pure PR sculpting side of things. I remember back in 2002 reading about people doing it with robots.txt but I never could be bothered :) But this aspect of things is certainly an eye opener to me. I always just assumed those pages wouldn't be getting PR. Essentially what this means schematically speaking is that PR is assigned on the page the referral link is on, not once it hits the target page. I can see it taking place in my head as I type...
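To make that picture a bit more concrete, here is a toy sketch of the textbook PageRank iteration. It is only an illustration and not Google's actual computation: the page names (`home`, `about`, `search`), the 0.85 damping factor, and the choice to treat a robots.txt-disallowed page as a dead end whose outlinks are never seen are all assumptions. It compares a followed link to a disallowed `search` page with the same link nofollowed, which (going by Matt's description) would simply be dropped from the link graph.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank over a dict of page -> list of followed link targets."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # The linking page splits its rank across its followed links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dead end (assumption: a disallowed page's outlinks are
                # never seen), so its rank is spread evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Case 1: home links to the disallowed search page with a normal link.
followed_rank = pagerank({"home": ["about", "search"], "about": ["home"]})

# Case 2: the same link is nofollowed, so it is left out of the graph.
nofollowed_rank = pagerank({"home": ["about"], "about": ["home"], "search": []})

for page in ("home", "about", "search"):
    print(page, round(followed_rank[page], 3), round(nofollowed_rank[page], 3))
```

In this toy model the disallowed page still collects a share of rank from the followed link, because the share is handed out by the linking page. Nofollowing the link leaves `search` with little more than the baseline share and gives `about` more.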
> doc
> On Oct 10, 5:46 pm, Sam I Am wrote:
> > That's the look I was talking about indeed. Our search.cfm page has this (I am actually beginning to think that by following Matt's advice and disallowing the search page, it confused Googlebot so much that our site was penalized, but that aside) if you do a search for site:www.travellerspoint.com search.cfm. You'll see what I meant earlier about it being quite a specific query, since other pages on the site that include that file extension somewhere in the body rank for the same thing and it's not until you click on the 'see more results' that you see all the remaining ones still in the index (there aren't that many since I put through a request to have them removed about 6 months ago, but still a few stragglers).
> > > Thinking about that, could the difference between your and my case be that Google possibly sees your site in a similar light as John Chow's?
> > I'd be quite insulted by that if it was the case and that's not anything against John Chow because his blog does make for interesting reading (I only found it BECAUSE of his penalty though, so how is that for twisted :) ). But he does so many things that are in the grey zones of Google's terms, whereas no one has been able to point me to one area on our site yet and I know a few well known SEOs who have looked. We've got another on the case right now and in the not too far future, if none of these professionals can figure it out, we won't really have much choice but to assume Googlebot has lost the plot and therefore take it further. The way I see it, if it can happen to us it can happen to 99.99% of the cases out there and that's a worrying thought for anyone in the development field!! Anyway, I digress (again)...
> > > One thing that I haven't seen mentioned throughout all of this though is that it seems "nofollow", or at least something like it but with a better name, has been needed for some time as there would seem to have been a "hole" without it.
> So nofollow as a link attribute causes Google to drop those links out of our link graph. If you have a nofollow link from page A to page B, we won't crawl via page A's link to discover page B. Note that we may still find page B via other links around the web, though.
As far as finding a page via links around the web... If page B has a 'noindex,nofollow' metatag on it, the 'noindex' would keep this from happening, right? I mean... I guess it would be found, but would immediately be ignored once found, right?
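For what it's worth, here is a small sketch of how a crawler might read that metatag. The page markup and the helper name `robots_directives` are made up for illustration, and this is not how Googlebot is implemented. The point it shows is that the tag lives inside page B itself, so a crawler can only see 'noindex,nofollow' after it has found and fetched the page.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives found in a fetched page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page B, already fetched by the crawler.
PAGE_B = """
<html>
  <head>
    <meta name="robots" content="noindex,nofollow">
    <title>Page B</title>
  </head>
  <body><a href="/page-c">Page C</a></body>
</html>
"""

directives = robots_directives(PAGE_B)
print("keep out of index:", "noindex" in directives)
print("ignore links on this page:", "nofollow" in directives)
```

So in this sketch the page is "found" in the sense that it gets fetched, and the directives are applied afterwards.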
> Your idea of linking to a disallowed 2nd page via nofollow is of course simpler, but ONLY in the case of the page you are ultimately trying to link to being on your site.
Ah, right, I was only thinking about linking to another page on the same site.
Considering how much you and I talked about trying to disconnect your site from its contained blogs to see what happened, I don't know why I didn't think of the link to an "external" site case. My bad.
> For example, since this whole mess started, one thing we've done is link to affiliate programs through a disallowed page (which must be collecting tons of beautiful green PR!) and as you don't control the final destination page, that's the only way to do it if you want to ensure it is doing the same thing across all the search engines.
Exactly. Again, a stupid mistake on my part for not thinking of that scenario.
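A rough sketch of that setup, for anyone following along: affiliate links point at a local redirect path that robots.txt disallows, so any compliant crawler stops there regardless of how it treats nofollow. The `/go/` path, the example.com URLs, and the helper name are assumptions made up for this example. The check uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every affiliate link on the site goes through
# a local redirect under /go/, and that whole path is disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /go/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawler_may_fetch(url, user_agent="Googlebot"):
    """Return True if robots.txt lets this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


# The redirect page is off limits, ordinary content is not.
print(crawler_may_fetch("https://example.com/go/partner-offer"))
print(crawler_may_fetch("https://example.com/articles/nofollow"))
```

Because the rule is written for `User-agent: *`, the same check gives the same answer for any other crawler name passed in, which is the cross-engine consistency described above.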
> For your own site, your option would be cleaner (if you aren't caring about Yahoo and MSN)... which on a side note, I would.
True, but I never saw much of a difference whichever way I did things, whether only disallow'ing or both disallow'ing and nofollow'ing. It is not that I don't care about either engine; in the tests I ran on my own sites, I saw no measurable difference. That does NOT mean it won't make a difference for any and every site though.
> Only catering to one engine isn't the most viable solution in the long run and Yahoo has been making clear improvements across the board.
True.
> There's lots of talk on webmaster forums about switching, just like there was prior to everyone moving to Google.
Switching what, use as a search tool or trying to rank well?
> With a better ad solution in place (man, they are REALLY dropping the ball there) it could seriously start picking up momentum and one benefit they have as a business is multiple income sources.
Doesn't Yahoo still prevent one from using ad services other than their own on the same page? I know that Google just recently stopped doing that but I thought Yahoo still did, no?
> One downturn in ad expenditure or Google marketshare and its share price is going to plummet, not to mention all these 'free' products will start drying up...
That may be the case but I think it a scenario not likely to happen. What I think may happen however is a lower rate of increase in ad expenditure but I don't really see any downturn happening. Of course my crystal ball is in the shop for repairs so don't hold me to that prediction. :-()
> but of course I say that as Google hits an all time high in share price! :)
:-() As interesting as owning Google stock might be, I'll stick to my old standbys, Xerox, Coca-Cola and IBM. Considering the length of time I've had them and how much they cost me initially, they could all drop 50% and I'll still make a killing. :-()
> It's a wacky world, and only getting wackier....
I have started to implement the NF for affiliate links, but did so because I thought it just might be a good idea to do in general after finding that some aff programs include keywords in the code to attempt to make the link transfer PR.
If your site is massive you will accumulate a lot of aff links, and I just assume that can't be good.