Discussions > Sitemap Protocol > Sitemap is Indexed
From: SLeach
Date: Sun, 11 May 2008 13:28:03 -0700 (PDT)
Local: Sun, May 11 2008 4:28 pm
Subject: Sitemap is Indexed
My sitemap file "itself" is indexed.  I want the sitemap to be read by
the crawlers, but I do not want the sitemap file to be indexed.  How
do I block it from being indexed without blocking it from being
crawled?

ref: the sitemap.xls file at http://www.dimfuzzies.com/

Thx, SL


From: cristina
Date: Sun, 11 May 2008 15:17:56 -0700 (PDT)
Local: Sun, May 11 2008 6:17 pm
Subject: Re: Sitemap is Indexed
I do not know whether it will work, so be careful to monitor
how your site is indexed afterwards,
but one suggestion for stopping the sitemap URL from being indexed
would be to add the noindex X-Robots-Tag to the HTTP headers
of the sitemap file; see
http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-...

For an Apache server (with mod_headers enabled) you can add an X-Robots-Tag
to the HTTP headers of a file named sitemap.xml like this, in .htaccess:

<Files "sitemap.xml">
Header set X-Robots-Tag "noindex"
</Files>

I repeat: keep checking whether your site's search results
are affected by this, and remove the noindex
directive for the sitemap from the .htaccess file
if something goes wrong.
You can check the HTTP headers of your site's URLs with
an HTTP header viewer.
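
As a rough alternative to an online header viewer, the check can be scripted. This is only a minimal sketch using the Python standard library: the function names `has_noindex` and `fetch_x_robots_tag` are made up for illustration, and the parsing is simplified (it ignores the user-agent-prefixed form such as "googlebot: noindex").

```python
from urllib.request import Request, urlopen


def has_noindex(x_robots_tag_values):
    """Return True if any X-Robots-Tag value carries a noindex directive."""
    for value in x_robots_tag_values:
        # One header value may hold several comma-separated directives,
        # e.g. "noindex, nofollow". "none" is shorthand for both.
        directives = [d.strip().lower() for d in value.split(",")]
        if "noindex" in directives or "none" in directives:
            return True
    return False


def fetch_x_robots_tag(url):
    """Send a HEAD request and return every X-Robots-Tag header value."""
    request = Request(url, method="HEAD")
    with urlopen(request) as response:
        return response.headers.get_all("X-Robots-Tag") or []


# Example use (performs a real network request, so it is left commented out):
# values = fetch_x_robots_tag("http://www.example.com/sitemap.xml")
# print(values, has_noindex(values))
```

If the header is set as in the `<Files>` block above, `fetch_x_robots_tag` should return a list containing "noindex" for the sitemap URL and an empty list for ordinary pages.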

Cristina.

From: webado
Date: Sun, 11 May 2008 18:48:04 -0700 (PDT)
Local: Sun, May 11 2008 9:48 pm
Subject: Re: Sitemap is Indexed
I'd say it's easier to just rename it.

Delete the old one from Webmaster Tools and submit the new name.

If your sitemap is specified in robots.txt, don't forget to change that
URL there too.

It probably got indexed due to a link to it being found somewhere.

There's nothing wrong with that as such,  but I can see how it's
annoying.

Your robots.txt file needs a few more things to be disallowed.
Anything concerning register, login for forum and blog, reply in the
forum, post a comment, feeds, trackbacks, and whatever else is no
business of robots or represents duplicated content, should be
disallowed.

You do not need the last line you currently have:
Allow: /

By definition, anything which is not disallowed is allowed.
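
To illustrate the point, here is a small sketch using Python's standard `urllib.robotparser`. The disallowed paths are hypothetical placeholders, not the site's real robots.txt, and the helper name `is_allowed` is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-login.php
Disallow: /feed/
"""

# The same rules with a trailing "Allow: /" as the last line.
ROBOTS_TXT_WITH_ALLOW = ROBOTS_TXT + "Allow: /\n"


def is_allowed(robots_txt, url, user_agent="*"):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in ("http://www.dimfuzzies.com/gallery/",
            "http://www.dimfuzzies.com/wp-login.php"):
    # Both rule sets give the same answer for every URL checked here.
    print(url, is_allowed(ROBOTS_TXT, url), is_allowed(ROBOTS_TXT_WITH_ALLOW, url))
```

With this parser, a URL that matches no Disallow rule is reported as fetchable whether or not the final "Allow: /" line is present.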

You also need to take care of the canonical domain issue, where www
and non-www URLs serve the same content. Pick one, make sure all your
navigation uses it, and 301 redirect all URLs from the other form to the
one you picked.

See here:
http://groups.google.com/group/only-validation/web/fix-canonical-issu...
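
The redirect itself is normally configured on the server, but the rule is simple enough to sketch. The following Python function is only an illustration of the logic: the name `canonical_redirect` is made up, www.dimfuzzies.com is assumed to be the preferred host, and ports are ignored.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_www(host):
    """Drop a leading 'www.' so both forms of a host can be compared."""
    return host[4:] if host.startswith("www.") else host


def canonical_redirect(url, preferred_host):
    """Return the URL to 301-redirect to, or None if no redirect applies."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Already canonical, or a different site altogether: nothing to do.
    if host == preferred_host or strip_www(host) != strip_www(preferred_host):
        return None
    return urlunsplit((parts.scheme, preferred_host, parts.path or "/",
                       parts.query, parts.fragment))


print(canonical_redirect("http://dimfuzzies.com/gallery/", "www.dimfuzzies.com"))
print(canonical_redirect("http://www.dimfuzzies.com/gallery/", "www.dimfuzzies.com"))
```

The first call returns the www form of the URL; the second returns None because the URL already uses the preferred host.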

You seem to have lots of broken links on your site. Use Xenu to check
it over. Xenu does not follow robots.txt, so you have to tell it
specifically which URL prefixes not to crawl.

You also have redirected URLs:

http://www.dimfuzzies.com/gallery
redirected to: http://www.dimfuzzies.com/gallery/
status code: 301 (object permanently moved)
linked from page(s):
        http://www.dimfuzzies.com/
        http://www.dimfuzzies.com/2008/04/10/new-life/
        http://www.dimfuzzies.com/category/site-information/
        http://www.dimfuzzies.com/2008/04/

http://www.dimfuzzies.com/gallery/
redirected to: http://www.dimfuzzies.com/gallery/main.php
status code: 302 (object temporarily moved)
linked from page(s):
        http://www.dimfuzzies.com/
        http://www.dimfuzzies.com/2008/04/26/celebrate-national-astronomy-day...
        http://www.dimfuzzies.com/2008/04/10/rocky-mountain-star-stare-2008/
     etc.

The first one happens because you left out the trailing slash, and the
server automatically does a 301 redirection to the URL with the slash.

Robots should not meet redirections when they crawl. Link to the final
destination at all times.

From: SLeach
Date: Fri, 16 May 2008 05:42:01 -0700 (PDT)
Local: Fri, May 16 2008 8:42 am
Subject: Re: Sitemap is Indexed
Thank you for the pointers.  I knew of a few empty links created by WP
that I do not know how to delete, but I did not know of any dead
ones.

Where did you find the redirected url "...gallery" redirected to
"...gallery/"?

Gallery is new to me and I have not had a chance to get inside and
tweak it yet.

SL

From: cristina
Date: Fri, 16 May 2008 07:24:00 -0700 (PDT)
Subject: Re: Sitemap is Indexed
You can see the redirect for the gallery URL by running the
W3C link checker
http://validator.w3.org/checklink
on your home page.

The W3C link checker gives
http://www.dimfuzzies.com/gallery
redirected 301 to http://www.dimfuzzies.com/gallery/main.php
because http://www.dimfuzzies.com/gallery is
redirected 301 to http://www.dimfuzzies.com/gallery/
and http://www.dimfuzzies.com/gallery/
is redirected 302 to http://www.dimfuzzies.com/gallery/main.php
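
The same hop-by-hop view can be reproduced with a short script. This is a minimal sketch with the Python standard library: the names `head` and `redirect_chain` are made up for illustration, and the `fetch` parameter exists only so the chain logic can be tried without network access.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head(url):
    """Send one HEAD request without following redirects.

    Returns (status, value of the Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def redirect_chain(url, fetch=head, max_hops=10):
    """Follow redirects one hop at a time; return [(url, status), ...]."""
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return chain


# Example use (performs real network requests, so it is left commented out):
# for hop_url, status in redirect_chain("http://www.dimfuzzies.com/gallery"):
#     print(status, hop_url)
```

Any chain longer than one entry means the link being checked is not the final destination and could be updated to point there directly.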

You can check the HTTP status response for
individual URLs with
http://www.asymptoticdesign.com/aux/header_etal.cgi

Cristina.
