pretty popular lately across a number of blogs and forums I peruse. It
then tell you to use a tag that is expressly for the Google spider.
from the looks of things it has had limited success there. Then it was
that I can't vouch for. Now they are saying it might be a good idea to
manipulate page rank). Anyway, I understand the reason it was
used now.
> Craig, I'm biting my fingers here to stop from typing "search and
> thou shalt find" as I've seen you reply a few times :) Matt
> explained the 'through a robots.txt'd page' a few times, including
> here:
> http://www.mattcutts.com/blog/how-to-report-paid-links/.
> There's an official statement on the webmaster central blog at
> http://googlewebmastercentral.blogspot.com/2007/06/more-ways-for-you-...
> Essentially, yes, to link to page 3 from page 1, you run the link
> through page 2 which is robots.txt'd out. It's not the same as
> nofollow though, but it has the same effect of not passing PR to page
> 3. The bummer is that it DOES pass PR to page 2, which ends up being
> this black hole full of beautiful green PR..... (does that make it a
> green hole? Anyway, I digress, but as you mentioned in your follow up
> comment, you'd have to actually hit the link to page two with a
> nofollow as well to make sure you don't pass all your PR into a big
> hole - theoretically the page with the highest pagerank on your site
> could be a disallowed page!). Using nofollow on a link from the first
> page to the third page will not pass PR to page 3. Now here is where
> the real catch comes in because all search engines treat this blooming
> attribute differently. Wikipedia has a good rundown on it:
> http://en.wikipedia.org/wiki/Nofollow. As you can see, Matt's
> statement further underlines the statements on that Wikipedia page. So
> from Google's point of view, the nofollow tag is a much easier and
> better choice since you don't have the whole issue of PR 'leakage'. I
> think Google's is the easiest to understand considering the thing is
> called "no follow", but I can see why Yahoo doesn't use it as such
> because that is originally not how the attribute was intended. Anyway,
> if you want to use something that all search engines do the same with
> I guess the option with a nofollowed link to a disallowed page which
> then links on to the final target page is the only one that works.
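To make the "black hole" point above concrete, here is a minimal sketch using the textbook PageRank power iteration. It is not Google's actual algorithm, and every page name in it ("home", "about", "redirect", "target") is hypothetical. It assumes that a disallowed page is never crawled, so its outgoing links are unknown, and that a nofollowed link is simply dropped from the link graph.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank over a dict {page: [targets]}."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's rank evenly over its known outlinks.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # No known outlinks (e.g. never crawled): spread evenly.
                for t in pages:
                    new[t] += damping * rank[p] / n
        rank = new
    return rank


# Case 1: "home" links to the disallowed "redirect" page with a normal link.
# The crawler never reads "redirect", so its link to "target" is unknown.
followed = pagerank({
    "home": ["about", "redirect"],
    "about": ["home"],
    "redirect": [],
    "target": [],
})

# Case 2: the same link carries rel="nofollow", modelled here (an assumption)
# by leaving it out of the graph entirely.
nofollowed = pagerank({
    "home": ["about"],
    "about": ["home"],
    "redirect": [],
    "target": [],
})

for name in ("home", "about", "redirect", "target"):
    print(f"{name:9s} followed={followed[name]:.3f} nofollowed={nofollowed[name]:.3f}")
```

In this toy model the disallowed "redirect" page collects rank from "home" in the first case while "target" gets none of it directly, and "about" ends up with more rank once the link to "redirect" is nofollowed. Real search engines may treat both cases differently.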
> I was also surprised by the disallow'd pages still being able to show,
> but I guess the explanation in the article made some sense in a wacky
> kind of way.
> "If a bot can't crawl a page, how else could it know what is on the
> page in order to decide what search terms to rank it for?"
> The power of anchor text I'd guess. These would have to be pretty
> detailed search queries I'd say, ones that include a brandname plus
> some specific terms that don't have many other results and especially
> not other results from that site as these would outdo the disallowed
> page easily. You'd get the typical disallowed pages look in the index.
> I did think the article was funny with regards to ebay. They used to
> not allow Google to crawl them and now they make up 25% of the Google
> serps LOL
> From a personal point of view I don't buy the "we'd look suboptimal if
> we didn't return these results when someone searched for them"
> reasoning as to why it was done this way. I know that I was trying to
> do searches earlier today using our brandname + a unique phrase and
> our site wasn't returned. It's the same when searching "John Chow" by
> name. I agree that it makes Google look suboptimal, but they've known
> about this for a while and aren't doing anything to make sure that
> what the user is searching for gets returned - and these are allowed
> pages. I've had to switch to Yahoo regularly of late due to cases like
> this.
> On Oct 9, 5:58 pm, cass-hacks wrote:
> > Thinking about robots.txt'ed files accruing PageRank, that would seem
> > to mean that if a given file or directory is disallow'ed yet linked to
> > within the site, the links to whatever is disallow'ed should be
> > nofollow'ed otherwise they will end up passing PageRank to pages that
> > don't need it thereby reducing PageRank available for other links on
> > those pages, right?
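If that reasoning holds, one way to act on it is to look for internal links that point at disallowed URLs but lack rel="nofollow". The sketch below uses only Python's standard library. The robots.txt rules, the HTML snippet, the example.com URLs and the helper name `unguarded_links` are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collect (href, is_nofollow) for every <a href=...> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href:
            # rel can hold several space-separated tokens.
            rel_tokens = (attr.get("rel") or "").lower().split()
            self.links.append((href, "nofollow" in rel_tokens))


def unguarded_links(html, page_url, robots_lines, agent="Googlebot"):
    """Return links to disallowed URLs that are missing rel="nofollow"."""
    robots = RobotFileParser()
    robots.parse(robots_lines)
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href, nofollow in collector.links:
        absolute = urljoin(page_url, href)
        if not robots.can_fetch(agent, absolute) and not nofollow:
            flagged.append(absolute)
    return flagged


# Hypothetical site: everything under /private/ is disallowed.
ROBOTS = ["User-agent: *", "Disallow: /private/"]
PAGE = """
<a href="/about.html">About</a>
<a href="/private/stats.html">Stats</a>
<a href="/private/admin.html" rel="nofollow">Admin</a>
"""

flagged = unguarded_links(PAGE, "http://www.example.com/index.html", ROBOTS)
print(flagged)
```

Only the link to /private/stats.html is reported here, since /about.html is crawlable and the /private/admin.html link already carries nofollow.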
> > Craig
> > p.s. mpilatow, fortunately you are not in charge of Google's methods
> > of determining intent. ;-)
> > On Oct 9, 5:25 pm, cass-hacks wrote:
> > > Nice!
> > > > My short answer is that the nofollow attribute on links is a pretty
> > > > general mechanism, and you're welcome to use it how you like.
> > > I would LIKE to use it to automagically increase my SERPs positions.
> > > I don't think that will happen though. :-()
> > > Seriously though, I always thought of ad-hoc standards, like
> > > > robots.txt, sitemap.xml, etc. the same as W3C Recommendations and
> > > > ITU-T Recommendations, both with an emphasis on "recommendation".
> > > People may have had their various, and usually different reasons for
> > > wanting a given recommendation but once it is out in the wild, people
> > > are pretty much free to use it as they wish, on both sides of the
> > > interoperability table.
> > > One question I still have is regarding, "(e.g. a link through a page
> > > that is robot.txt'ed out)".
> > > I am not understanding the "through a page" part.
> > > > I read in the linked-to interview, "Now, robots.txt says you are not
> > > allowed to crawl a page, and Google therefore does not crawl pages
> > > that are forbidden in robots.txt. However, they can accrue PageRank,
> > > and they can be returned in our search results."
> > > So, to achieve the same thing in a robots.txt file as in the use of
> > > nofollow, you would essentially need three pages, the first page with
> > > a link to the second, which is disallow'ed and has a link on it to a
> > > third, the page one originally wanted nofollow'ed?
> > > Is that the meaning of "through a page"?
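The three-page arrangement in that question can be sketched with Python's standard `urllib.robotparser`. The file names page1.html, page2.html and page3.html and the example.com host are placeholders, not anything from the interview. The sketch only shows which pages a robots.txt-respecting crawler may fetch, and says nothing about how PageRank is then computed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: only the middle page of the chain is blocked.
robots_txt = """\
User-agent: *
Disallow: /page2.html
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

site = "http://www.example.com"
# page1 links to page2, and page2 links to page3.
crawlable = {
    path: parser.can_fetch("Googlebot", site + path)
    for path in ("/page1.html", "/page2.html", "/page3.html")
}
print(crawlable)
```

Because page2.html cannot be fetched, a compliant crawler never sees the link from page 2 to page 3, which is the sense in which the link goes "through" a disallowed page.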
> > > By the way, I was surprised to read that a page that is "disallowed"
> > > and is not crawled can show up in SERPs. Would that be possible only
> > > because of offsite inbound links to that page? If a bot can't crawl a
> > > page, how else could it know what is on the page in order to decide
> > > what search terms to rank it for?
> > > Craig