Discussions > Crawling, indexing, and ranking > how robots.txt behaves for main domain and addon domains
From: Ems
Date: Wed, 13 Jun 2007 16:05:31 -0000
Local: Wed, Jun 13 2007 12:05 pm
Subject: how robots.txt behaves for main domain and addon domains
Hello

I have a question about robots.txt, as it's confusing me.

I have a main domain, say domainA.com. I have added an addon domain,
domainB.com, which created a directory on my main domain that is
reachable as domainB.domainA.com.

In a nutshell, domainB.domainA.com and domainB.com point to the
same content.

However, on my main domain, domainA.com, I have set up a robots.txt
with the following content:

User-agent: Googlebot
Disallow: /

I understand that domainB.domainA.com should NOT be indexed by Google
(is that correct?), but my question is: will domainA.com be indexed by
Google?

Thanks in advance for your help. I really need enlightenment on this
one.


 
Berghausen Google employee  
From: Berghausen
Date: Fri, 15 Jun 2007 22:46:42 -0000
Local: Fri, Jun 15 2007 6:46 pm
Subject: Re: how robots.txt behaves for main domain and addon domains
That's a good question, Ems.  The robots.txt protocol can get kind of
confusing when you think about it too long, and it sounds like you've
thought about this a bit.  However, in this case, it might help to
look at robots.txt from the perspective of the spider.

When a spider finds a URL, it takes the whole domain name (everything
between 'http://' and the next '/'), then sticks a '/robots.txt' on
the end of it and looks for that file.  If that file exists, then the
spider should read it to see where it is allowed to crawl.
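
As a rough illustration of that lookup, here is a minimal Python sketch. The function name robots_txt_url and the example page paths are made up for this post; only the three host names come from the question. It shows the general idea, not how Googlebot is actually implemented.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that applies to page_url.

    Hypothetical helper: keeps the scheme and host of the page URL,
    replaces the path with /robots.txt, and drops query and fragment.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Example page URLs (the paths are invented for illustration).
for url in (
    "http://domainA.com/index.html",
    "http://domainB.domainA.com/some/page.html",
    "http://domainB.com/some/page.html?x=1",
):
    print(url, "->", robots_txt_url(url))
```

Each host name ends up with its own robots.txt URL, which is why the three names in this thread are handled independently.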

In your case, Googlebot, or any other spider, should try to access
three URLs: domainA.com/robots.txt, domainB.domainA.com/robots.txt,
and domainB.com/robots.txt.  The rules in each are treated as
separate, so disallowing robots from domainA.com/ should result in
domainA.com/ being removed from search results while
domainB.domainA.com/ remains unaffected, which does not sound like
something you want.
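
To make the "separate rules per host" point concrete, here is a small sketch using Python's standard urllib.robotparser. The ROBOTS_BY_HOST table is an assumption: it pretends that domainA.com serves the two-line file from the question and that the other two names serve no rules at all. The real files on the server may differ.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents per host (hypothetical, for illustration only).
ROBOTS_BY_HOST = {
    "domaina.com": "User-agent: Googlebot\nDisallow: /\n",
    "domainb.domaina.com": "",  # assumed: no rules served under this name
    "domainb.com": "",          # assumed: no rules served under this name
}


def allowed(user_agent: str, url: str) -> bool:
    """Check url against the robots.txt of its own host only."""
    host = urlsplit(url).hostname  # hostname is returned lowercased
    parser = RobotFileParser()
    parser.parse(ROBOTS_BY_HOST.get(host, "").splitlines())
    return parser.can_fetch(user_agent, url)


for url in (
    "http://domainA.com/",
    "http://domainB.domainA.com/",
    "http://domainB.com/",
):
    print(url, "Googlebot allowed:", allowed("Googlebot", url))
```

Under these assumptions only domainA.com is blocked for Googlebot; the rule does not carry over to the subdomain or to the addon domain.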

The problem you might have with the setup you have described is this:
in order to keep domainB.domainA.com out of the results, you would
need to have domainB.domainA.com/robots.txt exclude robots, while
domainB.com/robots.txt welcomes them.  This means that you would need
to have a way to make domainB.domainA.com/ and domainB.com/ serve
different information, and judging from what you've described, you
have not set up your server to do so yet.
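
One way to picture "serving different information" for the same directory is to pick the robots.txt body from the Host header of the request. The sketch below is a hypothetical WSGI app in Python, not a description of any particular hosting setup; the names BLOCKED_HOSTS, robots_txt_for_host and app are invented here, and on shared hosting this would more likely be done in the web server configuration.

```python
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# Assumption: only this alias should be kept away from robots.
BLOCKED_HOSTS = {"domainb.domaina.com"}


def robots_txt_for_host(host_header: str) -> str:
    """Choose the robots.txt body based on the requested host name."""
    host = host_header.split(":")[0].strip().lower()  # drop any :port
    return BLOCK_ALL if host in BLOCKED_HOSTS else ALLOW_ALL


def app(environ, start_response):
    """Minimal WSGI app that only answers requests for /robots.txt."""
    if environ.get("PATH_INFO") != "/robots.txt":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]
    body = robots_txt_for_host(environ.get("HTTP_HOST", "")).encode("ascii")
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

To try it locally, wsgiref.simple_server.make_server("", 8000, app).serve_forever() from the standard library would do, sending requests with different Host headers.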

Of course, it is always possible that I have assumed too much about
your situation, so it is a good idea to use Google's robots.txt
analysis tool (see
http://www.google.com/support/webmasters/bin/topic.py?topic=8475)
to see if your robots.txt files already produce the results you
want.

If using robots.txt files doesn't solve the problem, and assuming that
you want to continue hosting all of your content on domainA.com, one
strategy you really should look into would be setting up a 301
redirect from the pages on domainB.domainA.com/ to domainB.com/ .  If
you need more advice on how to do this with your server software, your
hosting company's tech support would definitely be the best place to
start, but this group is here to help if more issues arise. :-)
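
For reference, the logic of such a redirect could look like the Python sketch below. It is only an illustration under assumptions: CANONICAL_HOST, ALIAS_HOSTS, redirect_location and app are names invented here, plain http is assumed, and a real setup would normally live in the web server configuration that your host supports.

```python
CANONICAL_HOST = "domainB.com"
# Assumption: requests arriving under this name should be redirected.
ALIAS_HOSTS = {"domainb.domaina.com"}


def redirect_location(host_header: str, path: str, query: str = ""):
    """Return the canonical URL for an alias request, or None if no redirect."""
    host = host_header.split(":")[0].strip().lower()  # drop any :port
    if host not in ALIAS_HOSTS:
        return None
    location = "http://" + CANONICAL_HOST + (path or "/")
    if query:
        location += "?" + query
    return location


def app(environ, start_response):
    """Minimal WSGI app: 301 for alias hosts, placeholder content otherwise."""
    location = redirect_location(
        environ.get("HTTP_HOST", ""),
        environ.get("PATH_INFO", "/"),
        environ.get("QUERY_STRING", ""),
    )
    if location is not None:
        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"canonical content would be served here\n"]
```

The path and query string are carried over, so each page on domainB.domainA.com/ points at the matching page on domainB.com/ rather than at the home page.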

Hope that helps!
-Bergy

From: Sebastian
Date: Fri, 15 Jun 2007 23:35:10 -0000
Local: Fri, Jun 15 2007 7:35 pm
Subject: Re: how robots.txt behaves for main domain and addon domains
Consolidate properly. 301 everything to the canonical server name:
http://www.smart-it-consulting.com/article.htm?node=166&page=129
Sebastian
