Discussions > Crawling, indexing, and ranking > Webmaster Tools shows 404 errors for old pages that don't exist - what to do ?
Snowman2468  
From: Snowman2468
Date: Sun, 31 Aug 2008 21:50:31 -0700 (PDT)
Local: Mon, Sep 1 2008 12:50 am
Subject: Webmaster Tools shows 404 errors for old pages that don't exist - what to do ?
We have some strange 404 results in webmaster tools.

Over 3,000 Web Crawl “Not found” results for URLs which were on the
site 2 years ago but have since been removed, for example:
http://www.cheaperthanhotels.co.uk/Argentina/Calafate/

Over 30,000 URLs restricted by robots.txt for URLs which are not
restricted by robots.txt at all, e.g.
http://www.cheaperthanhotels.co.uk/Ecuador/Quito/Town/Cafe-Cultura-Hotel-Quito-L57945H.htm
(this was an old hotel name)

Someone suggested redirecting all the “Not found” pages to related
existing pages on the same site to reduce as many of the errors as
possible.
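
A minimal sketch of that suggestion, assuming "related existing page" means the nearest parent directory page that is still live. The function name `nearest_existing_parent` and the `live_pages` set are hypothetical; a real site would take the live paths from its own page inventory and issue a 301 redirect to the returned path.

```python
from posixpath import dirname


def nearest_existing_parent(path: str, live_pages: set[str]) -> str:
    """Walk up a removed URL path until a live directory page is found.

    Falls back to the site root if no parent directory is live.
    """
    current = path.rstrip("/")
    while current not in ("", "/"):
        current = dirname(current)
        candidate = current.rstrip("/") + "/"
        if candidate in live_pages:
            return candidate
    return "/"


# Hypothetical inventory of pages that still exist on the site.
live_pages = {"/", "/Ecuador/Quito/", "/Argentina/"}

removed = "/Ecuador/Quito/Town/Cafe-Cultura-Hotel-Quito-L57945H.htm"
print(nearest_existing_parent(removed, live_pages))
```

Redirecting to the closest parent keeps visitors on a page about the same destination, which is usually more relevant than sending every removed URL to the home page.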

They also suggested adding “Allow:” lines to the robots.txt file,
listing the URLs we want to be crawled.

I don't know why Google has them listed as restricted if they’re
not in the robots.txt file.

What to do ?


 
CityAlbum  
From: CityAlbum
Date: Mon, 1 Sep 2008 02:52:37 -0700 (PDT)
Local: Mon, Sep 1 2008 5:52 am
Subject: Re: Webmaster Tools shows 404 errors for old pages that don't exist - what to do ?
In Webmaster Tools, go to Tools and use the URL removal option.
List all the old URLs you want to delete from Google's index,
then wait until the "deleted" message appears.

Google found these old URLs on the web, on other websites which have
not been updated, so they may appear again in Webmaster Tools.



 
JohnMu Google employee  
From: JohnMu
Date: Mon, 1 Sep 2008 06:49:00 -0700 (PDT)
Subject: Re: Webmaster Tools shows 404 errors for old pages that don't exist - what to do ?
Hi snowman2468 and welcome to the groups!

Accessing the URL you specified (
http://www.cheaperthanhotels.co.uk/Ecuador/Quito/Town/Cafe-Cultura-Ho...
) with a Googlebot user agent redirects to http://www.cheaperthanhotels.com/file-not-found.aspx
which in turn is blocked from being crawled through your robots.txt
file.

The Googlebot user agent I used is:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
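
The check described above can be sketched in Python with the standard library only. This is an illustration, not the tool used in this reply: `fetch_as_googlebot`, `is_blocked_by_robots`, and the sample robots.txt rule are all assumptions, since the site's actual robots.txt contents are not shown in this thread.

```python
import urllib.error
import urllib.request
from urllib.robotparser import RobotFileParser

GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def fetch_as_googlebot(url):
    """Follow redirects with the Googlebot user agent.

    Returns (final_url, status_code). Needs network access.
    """
    request = urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.geturl(), response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as errors but still carry the final URL.
        return err.geturl(), err.code


def is_blocked_by_robots(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text disallows crawling the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical rule; the real file may be worded differently.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /file-not-found.aspx
"""

error_page = "http://www.cheaperthanhotels.com/file-not-found.aspx"
print(is_blocked_by_robots(SAMPLE_ROBOTS_TXT, error_page))
```

To reproduce the full chain, pass the original URL to `fetch_as_googlebot` and then test the returned final URL against the site's real robots.txt. If the final URL is the blocked error page, the original URL will be reported as restricted by robots.txt even though no rule mentions it.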

As you are using IIS for hosting, it's possible that this issue is
related to an issue described (and solved) in:
http://www.kowitz.net/archive/2006/12/11/asp.net-2.0-mozilla-browser-...
http://todotnet.com/archive/0001/01/01/7472.aspx

It would most likely be easier to recognize this as a server error if
your error page was not blocked from crawling and instead returned the
result code which the server generates (500 instead of 404). That
said, it doesn't matter to the Googlebot, so as long as you can keep
track of and solve any issues that come up, you should be fine :).
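
As a language-neutral illustration of that point (written as a small Python WSGI app, not as IIS or ASP.NET configuration), the sketch below answers at the requested URL with the status code the failure actually produced, instead of redirecting to a separate error page. The page table, the `/broken` path, and the helper `status_for` are hypothetical.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical pages that exist on the site.
PAGES = {"/": b"home", "/Ecuador/Quito/": b"Quito hotels"}


def render(path):
    """Return the page body; raises KeyError for unknown paths."""
    if path == "/broken":  # hypothetical page whose handler fails
        raise RuntimeError("simulated handler failure")
    return PAGES[path]


def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    try:
        status, body = "200 OK", render(path)
    except KeyError:
        # Missing page: report 404 on the requested URL, no redirect.
        status, body = "404 Not Found", b"Page not found"
    except Exception:
        # Server-side failure: report 500 so it is not mistaken for a 404.
        status, body = "500 Internal Server Error", b"Server error"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def status_for(path):
    """Call the app in-process and return the status line it sent."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    b"".join(app(environ, start_response))
    return captured["status"]


print(status_for("/Argentina/Calafate/"))
print(status_for("/broken"))
```

With the status reported at the original URL, a crawl report can separate pages that are really gone from requests that failed on the server.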

Hope it helps!
John


 
Snowman2468  
From: Snowman2468
Date: Mon, 1 Sep 2008 20:09:37 -0700 (PDT)
Local: Mon, Sep 1 2008 11:09 pm
Subject: Re: Webmaster Tools shows 404 errors for old pages that don't exist - what to do ?
Thanks John, that looks like it may be the issue.

CityAlbum - we have tried what you suggested and WMT “denied” our
request with the following message:

The search terms you entered appear on the live third-party page.

As you may know, information in our search results is actually located
on third-party, publicly available webpages. Even if we removed this
page from our index, the content in question would still be available
on the web.

To remove this information from our search results and from the web,
you'll need to contact the webmaster
<https://www.google.com/webmasters/tools/answer.py?answer=64035> of this third-party site.
Once the webmaster makes the change, you can submit a request to
remove the cached copy or simply wait for our search results to
reflect this change the next time we crawl the page.

Does this mean that these URLs are showing in WMT as errors because
there are other pages linking to them?



 