you and others seem to still not be happy. Some have legit complaints,
yeah... while most just use this weak argument (build sites for
humans) to lash out at Google.
> Yeah Craig, this issue (along with the paid links issue) has been
> pretty popular lately across a number of blogs and forums I peruse. It
> just kinda annoys me that they want you to build sites for users but
> then tell you to use a tag that is expressly for the Google spider.
> When it was first introduced it was designed to combat blog spam and
> from the looks of things it has had limited success there. Then it was
> to link to sites you can't vouch for, but why would I link to a site
> that I can't vouch for. Now they are saying it might be a good idea to
> manipulate page rank (which I always thought was another guideline,
> something about don't get involved with link schemes designed to
> manipulate page rank). Anyway, I understand the reason it was
> introduced but I still have some problems with the way it is being
> used now.
> On Oct 9, 3:26 pm, Sam I Am wrote:
> > Craig, I'm biting my fingers here to stop from typing "search and
> > thou shalt find" as I've seen you reply a few times :) Matt
> > explained the 'through a robots.txt'd page' a few times, including
> > here: http://www.mattcutts.com/blog/how-to-report-paid-links/.
> > There's an official statement on the webmaster central blog at http://googlewebmastercentral.blogspot.com/2007/06/more-ways-for-you-...
> > Essentially, yes, to link to page 3 from page 1, you run the link
> > through page 2 which is robots.txt'd out. It's not the same as
> > nofollow though, but it has the same effect of not passing PR to page
> > 3. The bummer is that it DOES pass PR to page 2, which ends up being
> > this black hole full of beautiful green PR..... (does that make it a
> > green hole? Anyway, I digress, but as you mentioned in your follow-up
> > comment, you'd have to actually hit the link to page two with a
> > nofollow as well to make sure you don't pass all your PR into a big
> > hole - theoretically the page with the highest pagerank on your site
> > could be a disallowed page!). Using nofollow on a link from the first
> > page to the third page will not pass PR to page 3. Now here is where
> > the real catch comes in because all search engines treat this blooming
> > attribute differently. Wikipedia has a good rundown on it: http://en.wikipedia.org/wiki/Nofollow. As you can see, Matt's
> > statement further underlines the statements on that wikipedia page. So
> > from Google's point of view, the nofollow tag is a much easier and
> > better choice since you don't have the whole issue of PR 'leakage'. I
> > think Google's approach is the easiest to understand considering the thing is
> > called "no follow", but I can see why Yahoo doesn't use it as such
> > because that is originally not how the attribute was intended. Anyway,
> > if you want to use something that all search engines do the same with
> > I guess the option with a nofollowed link to a disallowed page which
> > then links on to the final target page is the only one that works.
> > I was also surprised by the disallow'd pages still being able to show,
> > but I guess the explanation in the article made some sense in a wacky
> > kind of way.
> > "If a bot can't crawl a page, how else could it know what is on the
> > page in order to decide what search terms to rank it for?"
> > The power of anchor text I'd guess. These would have to be pretty
> > detailed search queries I'd say, ones that include a brandname plus
> > some specific terms that don't have many other results and especially
> > not other results from that site as these would outdo the disallowed
> > page easily. You'd get the typical disallowed pages look in the index.
> > I did think the article was funny with regards to ebay. They used to
> > not allow Google to crawl them and now they make up 25% of the Google
> > serps LOL
> > From a personal point of view I don't buy the "we'd look suboptimal if
> > we didn't return these results when someone searched for them"
> > reasoning as to why it was done this way. I know that I was trying to
> > do searches earlier today using our brandname + a unique phrase and
> > our site wasn't returned. It's the same when searching "John Chow" by
> > name. I agree that it makes Google look suboptimal, but they've known
> > about this for a while and aren't doing anything to make sure that
> > what the user is searching for gets returned - and these are allowed
> > pages. I've had to switch to Yahoo regularly of late due to cases like
> > this.
> > On Oct 9, 5:58 pm, cass-hacks wrote:
> > > Thinking about robots.txt'ed files accruing PageRank, that would seem
> > > to mean that if a given file or directory is disallow'ed yet linked to
> > > within the site, the links to whatever is disallow'ed should be
> > > nofollow'ed otherwise they will end up passing PageRank to pages that
> > > don't need it thereby reducing PageRank available for other links on
> > > those pages, right?
> > > Craig
> > > p.s. mpilatow, fortunately you are not in charge of Google's methods
> > > of determining intent. ;-)
> > > On Oct 9, 5:25 pm, cass-hacks wrote:
> > > > Nice!
> > > > > My short answer is that the nofollow attribute on links is a pretty
> > > > > general mechanism, and you're welcome to use it how you like.
> > > > I would LIKE to use it to automagically increase my SERPs positions.
> > > > I don't think that will happen though. :-()
> > > > Seriously though, I always thought of ad-hoc standards, like
> > > > robots.txt, sitemap.xml etc the same as W3C RFCs and ITU-T
> > > > Recommendations, both with an emphasis on "recommendation".
> > > > People may have had their various, and usually different reasons for
> > > > wanting a given recommendation but once it is out in the wild, people
> > > > are pretty much free to use it as they wish, on both sides of the
> > > > interoperability table.
> > > > One question I still have is regarding, "(e.g. a link through a page
> > > > that is robot.txt'ed out)".
> > > > I am not understanding the "through a page" part.
> > > > I read in the linked to interview, "Now, robots.txt says you are not
> > > > allowed to crawl a page, and Google therefore does not crawl pages
> > > > that are forbidden in robots.txt. However, they can accrue PageRank,
> > > > and they can be returned in our search results."
> > > > So, to achieve the same thing in a robots.txt file as in the use of
> > > > nofollow, you would essentially need three pages, the first page with
> > > > a link to the second, which is disallow'ed and has a link on it to a
> > > > third, the page one originally wanted nofollow'ed?
> > > > Is that the meaning of "through a page"?
> > > > By the way, I was surprised to read that a page that is "disallowed"
> > > > and is not crawled can show up in SERPs. Would that be possible only
> > > > because of offsite inbound links to that page? If a bot can't crawl a
> > > > page, how else could it know what is on the page in order to decide
> > > > what search terms to rank it for?
> > > > Craig
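
For anyone who wants to see the "black hole" effect discussed in this thread in numbers, here is a rough Python sketch. It is a toy PageRank power iteration, not Google's actual algorithm. The page names (page1, page2, page3, about), the damping factor of 0.85, and the helpers pagerank and crawler_view are all made up for illustration. It assumes that a disallowed page still accrues rank but its outgoing links are never seen, and that a nofollowed link is simply dropped from the link graph so the remaining links share the rank. That second assumption matches how the thread describes Google's handling, but other engines may differ.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Toy power-iteration PageRank over a dict of page -> list of outlinks."""
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no visible outlinks is spread evenly.
        dangling = sum(rank[p] for p in pages if not links[p])
        new = {p: (1 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for q in links[p]:
                new[q] += damping * rank[p] / len(links[p])
        rank = new
    return rank


def crawler_view(site, disallowed, nofollowed):
    """Link graph as a crawler would see it (assumed model, see lead-in).

    - Disallowed pages are never fetched, so their outlinks are invisible.
    - Nofollowed (source, target) pairs are dropped from the graph.
    """
    view = {}
    for page, outlinks in site.items():
        if page in disallowed:
            view[page] = []
        else:
            view[page] = [t for t in outlinks if (page, t) not in nofollowed]
    return view


# Hypothetical site: page1 links to page3 "through" the disallowed page2.
site = {
    "page1": ["page2", "about"],
    "page2": ["page3"],
    "page3": [],
    "about": ["page1"],
}
disallowed = {"page2"}

# Case 1: plain link from page1 to the disallowed page2.
followed = pagerank(crawler_view(site, disallowed, nofollowed=set()))

# Case 2: the link from page1 to page2 also carries rel="nofollow".
guarded = pagerank(
    crawler_view(site, disallowed, nofollowed={("page1", "page2")})
)

for name, result in (("followed", followed), ("guarded", guarded)):
    print(name, {p: round(r, 3) for p, r in result.items()})
```

In the first case page2 collects rank that page3 never sees, which is the "hole" Sam describes. In the second case page2 drops back to the same baseline as page3 and the rank goes to the about page instead, which is Craig's point about nofollowing links to disallowed pages.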