Discussions > Google webmaster tools > Removing pages and directories
JAC  
From: JAC
Date: Mon, 24 Sep 2007 16:32:36 -0700
Local: Mon, Sep 24 2007 7:32 pm
Subject: Removing pages and directories
Firstly, I am very let down that you can no longer contact Google for
help. Even their contact page is only links to self-help pages.

I am trying to remove pages and directories from Google's index. These
pages and directories have been removed and return a 404 error, yet
Google has denied my requests... why???

http://www.gamersunderground.net/GameFilter/

Above is a link to a directory I would love to have removed, if you
navigate to it you won't find it, so why is Google "denying" me???

Thanks.


webado  
From: webado
Date: Tue, 25 Sep 2007 01:58:34 -0000
Local: Mon, Sep 24 2007 9:58 pm
Subject: Re: Removing pages and directories
Your robots.txt file is a mess.
After allowing everybody and their uncle everywhere, at the end
you have this:

User-agent: *
Disallow: /

So all robots are disallowed from the whole site.

You had better fix that anomaly first because your whole site will end
up being dropped from the index, not just the folder you want to
remove.

You could use this (get rid of all the other robots from robots.txt or
at least figure out exactly what you want to do with them):

User-agent: *
Disallow: /GameFilter/

This will disallow that folder - and then you can get it removed.



JAC  
From: JAC
Date: Mon, 24 Sep 2007 22:40:08 -0700
Local: Tues, Sep 25 2007 1:40 am
Subject: Re: Removing pages and directories
Actually, that is not an anomaly. My robots.txt file says that only
the listed bots are okay and all others are not. And the bots on the
"everybody and their uncle" list are friendly (and very common) bots...
there are search engines other than Google.

Back to my question: I have complied with Google's requirement by
having one of the three listed options, mine being a 404 error, and yet
I am still denied. I do understand I can also control this with my
robots.txt file and by using a meta tag, but I have too many URLs to list.
Any help would be appreciated.

Check out http://del.icio.us/robots.txt or http://www.nytimes.com/robots.txt
they too do not have an anomaly.

Thanks.
http://www.gamersunderground.net/robots.txt
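
A minimal sketch of the allow-list layout JAC describes (named bots get an empty Disallow, everyone else falls through to the final `User-agent: *` group), checked with Python's standard `urllib.robotparser`. The robots.txt text is a shortened, hypothetical stand-in and not the site's real file; the bot names, `www.example.com`, and the helper name `is_allowed` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, shortened robots.txt in the allow-list style described above.
# The bot names are examples only, not the site's actual list.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: Slurp
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the sample robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for bot in ("Googlebot", "Slurp", "SomeOtherBot"):
        print(bot, is_allowed(bot, "http://www.example.com/index.html"))
```

Under this parser, a bot matches its own named group first, so the trailing `Disallow: /` only applies to bots that are not listed. Other crawlers may interpret edge cases differently, so this only shows how one standard-library parser reads the file.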



cristina  
From: cristina
Date: Tue, 25 Sep 2007 08:19:42 -0700
Local: Tues, Sep 25 2007 11:19 am
Subject: Re: Removing pages and directories
Hi JAC,
I cannot find URLs from www.gamersunderground.net/GameFilter/
in search results.
Can you give the full URL of a Google cache for one of your URLs?

Cristina.


JAC  
From: JAC
Date: Tue, 25 Sep 2007 14:13:51 -0700
Local: Tues, Sep 25 2007 5:13 pm
Subject: Re: Removing pages and directories
Thanks Cristina,

I can't find a cache either. I see the links in Google's Webmaster
Tools within the Web crawl Not Found area. The problem is A: they
never disappear off that list, and B: there are thousands of them,
and every week or so more are added to the list. This is
why I want to remove the entire directory.

Thanks,
JAC



cristina  
From: cristina
Date: Tue, 25 Sep 2007 22:36:33 -0000
Local: Tues, Sep 25 2007 6:36 pm
Subject: Re: Removing pages and directories
Hi JAC,

There is no referrer in the list of not-found URLs
in Google Webmaster Tools, so there is no way
to know where Googlebot followed these URLs from.
Maybe from some out-of-date links (?)

If these URLs do not appear in search results
then the removal tool does not apply to them,
since it removes URLs from the search results.

I suggest you block these URLs in your robots.txt file,
as Webado already wrote.

If you disallow these URLs in your robots.txt file
then Googlebot will not follow them, so
maybe in time it will stop looking for them.

Cristina.



Susan Moskwa Google employee  
From: Susan Moskwa
Date: Tue, 25 Sep 2007 22:52:56 -0000
Local: Tues, Sep 25 2007 6:52 pm
Subject: Re: Removing pages and directories
Hi JAC--

It looks like there are two separate issues going on here.
First, webado is correct in saying that in order to request a removal
of http://www.gamersunderground.net/GameFilter/ you would need to
block /GameFilter/ using your robots.txt file. Check out this help topic for
details: http://google.com/support/webmasters/bin/answer.py?answer=59819
In particular:
"To remove a directory and its contents, you must ensure that the
pages you want to remove have been blocked using a robots.txt file.
Returning a 404 isn't enough, because it's possible for a directory to
return a 404 status code, but still serve out files underneath it.
Using robots.txt to block a directory ensures that all of its children
are disallowed as well."
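
The point in the quoted help text, that one robots.txt rule covers a directory and everything beneath it, can be sketched with Python's standard `urllib.robotparser`. This is an illustration under assumptions: the sub-path `sub/page.html` is invented, `is_blocked` is a hypothetical helper name, and the result shows how this one parser applies prefix matching, not a statement about Google's removal tool.

```python
from urllib.robotparser import RobotFileParser

# The single rule suggested in the thread.
RULES = """\
User-agent: *
Disallow: /GameFilter/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_blocked(url: str) -> bool:
    """Return True if the rule above forbids Googlebot from fetching url."""
    return not parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    base = "http://www.gamersunderground.net"
    # The directory itself and any (hypothetical) child path share the prefix.
    print(is_blocked(base + "/GameFilter/"))
    print(is_blocked(base + "/GameFilter/sub/page.html"))
    # Paths outside the directory are unaffected.
    print(is_blocked(base + "/index.html"))
    # Without the trailing slash the path does not start with "/GameFilter/".
    print(is_blocked(base + "/GameFilter"))
```

Because matching is by path prefix, the rule as written does not cover `/GameFilter` without the trailing slash under this parser.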

However, as Cristina points out, you don't seem to have any pages from
http://www.gamersunderground.net/GameFilter/ currently indexed. The
purpose of a URL removal request is to request that URLs get removed
from our index; and since this directory is already not in our index,
a URL removal request would have no effect.

If you're concerned about the URLs appearing in the 'web crawl'
section of your webmaster tools account, please check out the #1 entry
in our FAQ, which addresses this issue:
http://groups.google.com/group/Google_Webmaster_Help/web/faqs-for-web...


JAC  
From: JAC
Date: Tue, 25 Sep 2007 18:24:24 -0700
Local: Tues, Sep 25 2007 9:24 pm
Subject: Re: Removing pages and directories
Thank you Cristina and Susan,

Susan, as I stated to webado (with whom I disagreed only about the
incorrect statement that my robots.txt file is a "MESS"; last time
I'm stating this), I have too many to list in a robots.txt file...
that is why I am here! The directory /GameFilter is not on my server
(AT ALL), thus there are no files "underneath" it. I read the FAQ and
I do feel much better now.

FOR THOSE SEEKING AN ANSWER:

In a very odd way the Crawl index is not a list of indexed pages, but
rather a list of pages the googlebot tried to follow (likely from
other sites linking to dead pages), but failed. Without Google
providing the source of the link the list is mostly not useful, but
it could tell you which pages, though they cannot be removed,
could use a 301 redirect, which I guess is the answer for me!!!

Thanks.
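
For anyone wanting to try the 301 approach mentioned above, here is a minimal sketch using Python's standard `http.server`. It assumes the retired URLs all live under `/GameFilter/` and should be sent to one replacement page; `NEW_LOCATION`, `redirect_target`, and `RedirectHandler` are hypothetical names, and a real site would more likely configure this in its web server than in Python.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional

OLD_PREFIX = "/GameFilter/"  # retired directory from the thread
NEW_LOCATION = "/"           # hypothetical target; choose the closest live page


def redirect_target(path: str) -> Optional[str]:
    """Return the new location for a retired path, or None if none applies."""
    if path == OLD_PREFIX.rstrip("/") or path.startswith(OLD_PREFIX):
        return NEW_LOCATION
    return None


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer retired paths with a permanent redirect and the rest with 404."""

    def do_GET(self) -> None:
        target = redirect_target(self.path)
        if target is not None:
            self.send_response(301)  # 301 = moved permanently
            self.send_header("Location", target)
            self.end_headers()
        else:
            self.send_error(404)
```

To try it locally, the handler can be passed to `http.server.HTTPServer(("", 8000), RedirectHandler)` and started with `serve_forever()`; port 8000 is an arbitrary choice.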



webado  
From: webado
Date: Tue, 25 Sep 2007 18:43:20 -0700
Local: Tues, Sep 25 2007 9:43 pm
Subject: Re: Removing pages and directories
JAC, your robots.txt is still a tangled mess LOL. It might sort of
work at the moment, but only just. It's high maintenance, I
feel.

Rogue robots do not read and obey robots.txt in any case, so trying to
disallow all but the ones you listed, which you consider good robots, is
the same thing as not disallowing any.



Sam I Am  
From: Sam I Am
Date: Thu, 27 Sep 2007 02:21:22 -0700
Local: Thurs, Sep 27 2007 5:21 am
Subject: Re: Removing pages and directories

> In a very odd way the Crawl index is not a list of indexed pages, but
> rather a list of pages the googlebot tried to follow (likely from
> other sites linking to dead pages), but failed. With out Google
> providing the source of the link the list is mostly not useful, but
> could be informative to tell you which pages, which cannot be removed,
> could use a 301 redirect which I guess is the answer for me!!!

This is not really odd at all. If it were a list of indexed pages it
would probably be called "indexed pages", not "web crawl". To see
pages that link to you, go to Links > Pages that link to you - makes
sense to me :)

The "web crawl" can be very useful to identify pages that you might
have forgotten about and are still being followed so that you can take
action (301/re-implement etc.). Also note that it appears that
sometimes Googlebot follows imaginary links, so the links that you see
do not necessarily ever have to have existed on your site or be linked
to from anywhere. This is a recent thing though and most likely just a
googlebot bug.

You mention having too many to list so just to avoid confusion I'll
repeat what the others said. You do not have to add every single file
to robots.txt, just a simple
User-agent: *
Disallow: /GameFilter/

will block the entire directory and also stop those "web crawl" errors
from appearing (since then the googlebot won't try to follow those
links anymore). If those are the errors you want to stop seeing, this is
one way to go, although I'd personally 301 any pages that are being
linked to before doing this.
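
When there are "too many to list", a quick way to see whether one directory rule would cover them is to group the not-found URLs by their top-level directory. The sketch below does that in Python; the sample URLs and file names are invented for illustration, and `top_level_dir` and `count_by_directory` are hypothetical helper names.

```python
from collections import Counter
from urllib.parse import urlparse


def top_level_dir(url: str) -> str:
    """Return the first path segment as '/segment/', or '/' for root-level files."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    # A lone segment without a trailing slash is treated as a root-level file.
    if len(segments) >= 2 or (segments and path.endswith("/")):
        return f"/{segments[0]}/"
    return "/"


def count_by_directory(urls):
    """Count how many URLs fall under each top-level directory."""
    return Counter(top_level_dir(u) for u in urls)


# Hypothetical sample of "Not found" URLs; the file names are invented.
NOT_FOUND = [
    "http://www.gamersunderground.net/GameFilter/a.html",
    "http://www.gamersunderground.net/GameFilter/b/c.html",
    "http://www.gamersunderground.net/old-page.html",
]

if __name__ == "__main__":
    for directory, count in count_by_directory(NOT_FOUND).most_common():
        print(directory, count)
```

A directory that accounts for most of the list is a candidate for a single `Disallow` line; the stragglers can be handled one by one.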

It can take months and months of returning a 404 before Google stops
crawling pages without any links pointing to them that might once upon
a time have had a link. I still have Google trying to find a couple of
pages that were only linked to in that format for 24 hours last
December. In that 24 hours, it managed to crawl and keep in memory all
those links and it's been trying to find those pages ever since
(admittedly, now it's down to the last 3 or 4, so looks like it's
finally cleaning out that memory!).


