Discussions > Sitemap Protocol > URL timeout: robots.txt timeout
From: Tim Abracadabra
Date: Sun, 19 Oct 2008 02:42:03 -0700 (PDT)
Local: Sun, Oct 19 2008 5:42 am
Subject: Re: URL timeout: robots.txt timeout
Hi coldrick,

Excellent feedback!
Very informative.

I'm sorry you had to go through all that, but
I am sure this will help many.

Thanks so much,
Abracadabra

On Oct 19, 3:35 am, coldrick wrote:


 
From: JohnMu (Google employee)
Date: Sun, 19 Oct 2008 03:19:18 -0700 (PDT)
Local: Sun, Oct 19 2008 6:19 am
Subject: Re: URL timeout: robots.txt timeout
Hi coldrick

It looks like things are working again, thanks to your perseverance!

One thing you might want to keep in mind with regards to your
whitelist is that the Googlebot IP addresses may change over time
(though as far as I know, it's not something that happens frequently).
Your best bet is to regularly check the blocked IP list and check for
valid Googlebot addresses using the reverse DNS lookup as described in
http://www.google.com/support/webmasters/bin/answer.py?answer=80553
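
The reverse DNS check John describes can be sketched in Python roughly as follows. This is a minimal illustration, not Google's own tooling: the function name `is_verified_googlebot` is hypothetical, and the accepted domain suffixes (`googlebot.com`, `google.com`) are taken as an assumption from the linked help article. The two lookups are passed in as parameters so the logic can be tried without network access.

```python
import socket


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a Google crawler hostname
    and that hostname resolves forward to the same IP."""
    # Default to real DNS lookups; both are injectable for offline testing.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
    except OSError:
        return False  # no PTR record, or the lookup failed
    # Step 1: the PTR name must sit under googlebot.com or google.com (assumed suffixes).
    if not host.lower().rstrip(".").endswith((".googlebot.com", ".google.com")):
        return False
    # Step 2: the forward lookup of that name must include the original IP.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_verified_googlebot("66.249.67.59")` with the defaults performs live DNS queries, so the result depends on the current DNS records.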

Thanks for posting the details, it's good to have a post like yours
which we can point other users to!

John


 
From: webado
Date: Sun, 19 Oct 2008 10:53:24 -0700 (PDT)
Local: Sun, Oct 19 2008 1:53 pm
Subject: Re: URL timeout: robots.txt timeout
Goodness me, I can't believe One-Star_Luke is still at it one-starring
JohnMu's post!

May Luke's computer  be overrun by trojans and his homepage hijacked
to smithereens ;)


 
From: silentptnr
Date: Mon, 20 Oct 2008 08:02:45 -0700 (PDT)
Local: Mon, Oct 20 2008 11:02 am
Subject: Re: URL timeout: robots.txt timeout
I am now having the same problem with http://www.zp1.com. I've tested
the robots.txt and the sitemap file and don't know what could be
causing my error?  I even checked with the host and, of course, they
said they are not blocking googlebot or any crawler.  Any ideas?

 
From: coldrick
Date: Mon, 20 Oct 2008 16:08:35 -0700 (PDT)
Local: Mon, Oct 20 2008 7:08 pm
Subject: Re: URL timeout: robots.txt timeout
Yes. Read the post 4 back from this one.

Rgds

On Oct 21, 1:02 am, silentptnr wrote:


 
From: silentptnr
Date: Tue, 21 Oct 2008 10:27:18 -0700 (PDT)
Local: Tues, Oct 21 2008 1:27 pm
Subject: Re: URL timeout: robots.txt timeout
I read the post and checked with the host.  I also checked my log and
don't even see where googlebot attempted.  Any ideas?  The url is
http://www.zp1.com.  Thanks

On Oct 20, 4:08 pm, coldrick wrote:


 
From: coldrick
Date: Tue, 21 Oct 2008 12:35:56 -0700 (PDT)
Local: Tues, Oct 21 2008 3:35 pm
Subject: Re: URL timeout: robots.txt timeout
What firewall is the site using?

On Oct 22, 3:27 am, silentptnr wrote:


 
From: webado
Date: Tue, 21 Oct 2008 14:12:31 -0700 (PDT)
Local: Tues, Oct 21 2008 5:12 pm
Subject: Re: URL timeout: robots.txt timeout

On 21 oct, 13:27, silentptnr wrote:

> I read the post and checked with the host.  I also checked my log and
> don't even see where googlebot attempted.

That strongly suggests the server's firewall is blocking Googlebot.

> Any ideas?  The url is http://www.zp1.com.  Thanks


 
From: coldrick
Date: Tue, 21 Oct 2008 16:11:29 -0700 (PDT)
Local: Tues, Oct 21 2008 7:11 pm
Subject: Re: URL timeout: robots.txt timeout
I agree.

The server logs won't show an attempted connection once the firewall
starts to block it.
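
For anyone who wants to check their own web server log for Googlebot requests, here is a rough Python sketch. It assumes the Apache/nginx "combined" log format, where the user agent is the last double-quoted field; the function name `googlebot_hits` and the two sample lines are hypothetical. An empty result is consistent with a firewall block, though on its own it does not identify the cause.

```python
import re

# Assumes the "combined" log format: the user agent is the last
# double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def googlebot_hits(log_lines):
    """Return the log lines whose user-agent field mentions Googlebot."""
    hits = []
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if match and "googlebot" in match.group(1).lower():
            hits.append(line)
    return hits


# Hypothetical sample lines, for illustration only.
sample = [
    '66.249.67.59 - - [02/Nov/2008:10:00:00 -0800] "GET /robots.txt HTTP/1.1" '
    '200 64 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [02/Nov/2008:10:00:05 -0800] "GET / HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Windows; U)"',
]
print(len(googlebot_hits(sample)))
```

To run it against a real log, pass an open file object instead of `sample`; the path to that file depends on the server setup.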

The firewall logs will show a connection attempt on the day that the
Googlebot was blocked and the Googlebot IP addresses will be listed in
the firewall's deny list.

If you have access to the firewall yourself, check the denied IP
lists.

If you don't have access yourself, go back to the Internet Host and
ask them to check the Firewall's denied IP lists for any of the Google
IP addresses listed earlier.

By the way webado. I see you belong to an internet hosting company. I
guess my point one about the online games doesn't apply to all IT
people. You are doing a great job on this forum. I got two of your
suggestions fixed on our sites. Not sure how to go about the Subnet
settings but I'll muddle through it I guess.

Just a note on the IP addresses I listed earlier - the last two don't
seem to be Google. Is there anyone who can remove them from the list?

On Oct 22, 7:12 am, webado wrote:


 
From: silentptnr
Date: Tue, 21 Oct 2008 19:28:24 -0700 (PDT)
Local: Tues, Oct 21 2008 10:28 pm
Subject: Re: URL timeout: robots.txt timeout
I have another site which I consult for and it is having a problem.
This site has always been an authority site for over ten years and is
a very seasoned site.  The site is by far the best resource for
information in its category.  And for some reason in the past week
the site has dropped from a number one ranking to between page 5 and
7???  Nothing has changed on the site so I'm wondering if Google has
penalized the site for something.  John if you could give your insight
I would appreciate it.  The url of the site is http://www.auditions.com.
Perhaps google has changed its algorithm or something.  I can't
understand why the search position would drop so quickly and
dramatically.  Now when I search for the term "auditions" the top site
listed is a shoe site????

On Oct 21, 4:11 pm, coldrick wrote:


 
From: silentptnr
Date: Fri, 24 Oct 2008 18:22:38 -0700 (PDT)
Local: Fri, Oct 24 2008 9:22 pm
Subject: URL timeout: robots.txt timeout
Not sure why I'm getting this message, maybe someone could give me a
tip?

URL timeout: robots.txt timeout
We encountered an error while trying to access your Sitemap. Please
ensure your Sitemap follows our guidelines and can be accessed at the
location you provided and then resubmit.

My url is http://www.zp1.com


 
From: Tim Abracadabra
Date: Sat, 25 Oct 2008 00:35:40 -0700 (PDT)
Local: Sat, Oct 25 2008 3:35 am
Subject: Re: URL timeout: robots.txt timeout
You are seeing this error because, as you
mentioned before, you don't see Google in your
logs and, as coldrick and webado concur, something
is blocking Google from reaching your web site.

It could be a network firewall or a software firewall
in the server that manages the IP tables.

I did a check with the Googlebot user agent and can
access your site and robots.txt  fine so I would
imagine if you are still having an issue with this
then the block is likely based on IP address.
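
A check like this one can be sketched in Python with the standard library. This is only an illustration of the idea, not the tool actually used for the check: the helper names `build_robots_request` and `fetch_status` are hypothetical, and the user-agent string is Googlebot's commonly published one.

```python
import urllib.request

# Googlebot's commonly published user-agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_robots_request(site):
    """Build a request for <site>/robots.txt that presents the Googlebot user agent."""
    url = site.rstrip("/") + "/robots.txt"
    return urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})


def fetch_status(request, timeout=10):
    """Send the request and return the HTTP status code (needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


req = build_robots_request("http://www.zp1.com")
print(req.full_url)
```

A successful `fetch_status(req)` from your own machine only shows that the user agent is not being filtered. The request still comes from your IP address, so it cannot rule out an IP-based block.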

For more information on host provider blocking
Googlebot see this article and feel free to
share it with your provider.
http://www.aitechsolutions.net/google-block-network.html

And as coldrick mentioned, the block might be
in the IP deny lists in the firewall. Have the provider
check it out and keep at them.

BTW - Your robots.txt currently reads:

Sitemap: http://zp1.com/sitemap.xml

User-agent: *
Disallow: /cgi-bin

The Sitemap record is independent of the User-agent records, so its
position will not cause the timeout, but it is conventional to put it
on the last line with a blank line between it and the statements above.
Like this:

User-agent: *
Disallow: /cgi-bin

Sitemap: http://zp1.com/sitemap.xml
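
To see how a parser treats the two orderings shown above, here is a minimal sketch using Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The helper name `summarize` and the test path `/cgi-bin/test` are hypothetical. Python's parser is only one implementation, so this says nothing definitive about how Googlebot itself reads the file.

```python
from urllib.robotparser import RobotFileParser

sitemap_first = """\
Sitemap: http://zp1.com/sitemap.xml

User-agent: *
Disallow: /cgi-bin
"""

sitemap_last = """\
User-agent: *
Disallow: /cgi-bin

Sitemap: http://zp1.com/sitemap.xml
"""


def summarize(robots_txt):
    """Parse robots.txt text and return (sitemaps, cgi-bin allowed?, home page allowed?)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return (
        parser.site_maps(),  # list of Sitemap URLs, or None if there are none
        parser.can_fetch("Googlebot", "http://zp1.com/cgi-bin/test"),
        parser.can_fetch("Googlebot", "http://zp1.com/"),
    )


print(summarize(sitemap_first))
print(summarize(sitemap_last))
```

With this parser both orderings give the same summary: the Sitemap URL is found, `/cgi-bin` is disallowed, and the home page is allowed.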

Hope that helps,
Abracadabra

On Oct 24, 9:22 pm, silentptnr wrote:


 
From: silentptnr
Date: Sat, 25 Oct 2008 08:36:28 -0700 (PDT)
Local: Sat, Oct 25 2008 11:36 am
Subject: Re: URL timeout: robots.txt timeout
Thanks Tim

On Oct 25, 12:35 am, Tim Abracadabra wrote:


 
From: silentptnr
Date: Sat, 25 Oct 2008 09:27:20 -0700 (PDT)
Local: Sat, Oct 25 2008 12:27 pm
Subject: Re: URL timeout: robots.txt timeout
Do you think there could be anything with my htaccess file?

On Oct 25, 12:35 am, Tim Abracadabra wrote:


 
From: coldrick
Date: Sun, 26 Oct 2008 03:57:54 -0700 (PDT)
Local: Sun, Oct 26 2008 6:57 am
Subject: Re: URL timeout: robots.txt timeout
Unlikely. As already suggested, it is most likely your server
firewall. Do you have a reason to doubt this?

On Oct 26, 2:27 am, silentptnr wrote:


 
From: coldrick
Date: Sun, 2 Nov 2008 16:17:29 -0800 (PST)
Local: Sun, Nov 2 2008 7:17 pm
Subject: Re: URL timeout: robots.txt timeout
For everyone's info I've found out a bit more about the CIDR IP range
blocking and allowing formats.

The current range we have set in our firewall is as follows

216.239.32.0/19 # Googlebot
64.233.160.0/19 # Googlebot
72.14.192.0/18 # Googlebot
209.85.128.0/17 # Googlebot
66.102.0.0/20 # Googlebot
74.125.0.0/16 # Googlebot
66.249.64.0/19 # Googlebot

and I've confirmed these using a whois search. However as JohnMu
mentioned earlier, Google don't publish these addresses because they
may change over time. I only offer them as advice and suggest you do
your own whois search prior to implementing them. The list may not be
complete either.
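
As a quick way to test whether a given address falls inside the ranges listed above, here is a small Python sketch using the standard `ipaddress` module. The name `in_listed_ranges` is hypothetical, and the ranges are simply copied from this post, so they carry the same caveats about being incomplete or out of date.

```python
import ipaddress

# Ranges copied from the list above; they may be incomplete or out of date.
GOOGLE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "216.239.32.0/19",
        "64.233.160.0/19",
        "72.14.192.0/18",
        "209.85.128.0/17",
        "66.102.0.0/20",
        "74.125.0.0/16",
        "66.249.64.0/19",
    )
]


def in_listed_ranges(ip):
    """Return True if `ip` falls inside any of the ranges listed above."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in GOOGLE_RANGES)


print(in_listed_ranges("66.249.67.59"))
```

The address 66.249.67.59 is used as the example because it appears in the firewall alert quoted in this thread.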

What I've found out regarding this problem and getting it fixed at the
server is as follows...

This applies to the firewall program CSF (ConfigServer Security and
Firewall), but most likely applies to other firewall programs in some
manner.

1. You need to check through the list of FIREWALL DENY IP'S and remove
any of the Googlebot IP addresses that appear in the list. These are
what are stopping Googlebot accessing your site.

2. You need to add the range of Googlebot IP addresses to both the
FIREWALL ALLOW IP'S and the LFD IGNORE IP'S in the firewall setup.
This will make sure Googlebot doesn't get blocked again.
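
Step 1 can be partly automated. The Python sketch below scans deny-list lines for entries that overlap the ranges listed earlier in this post. It assumes, without confirmation from the CSF documentation, that each entry is a plain IP or CIDR followed by an optional `#` comment; lines in any other format are skipped. The function names and the sample lines are hypothetical.

```python
import ipaddress

# Ranges copied from this post; they may be incomplete or out of date.
GOOGLE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "216.239.32.0/19", "64.233.160.0/19", "72.14.192.0/18",
        "209.85.128.0/17", "66.102.0.0/20", "74.125.0.0/16",
        "66.249.64.0/19",
    )
]


def parse_entry(line):
    """Return the IPv4 network named on a deny-list line, or None."""
    token = line.split("#", 1)[0].strip()  # drop any trailing comment
    if not token:
        return None
    try:
        net = ipaddress.ip_network(token, strict=False)
    except ValueError:
        return None  # not a plain IP/CIDR entry; skip it
    return net if net.version == 4 else None


def google_entries(deny_lines):
    """Return the deny-list lines that overlap any listed Google range."""
    hits = []
    for line in deny_lines:
        net = parse_entry(line)
        if net is not None and any(net.overlaps(g) for g in GOOGLE_RANGES):
            hits.append(line.strip())
    return hits


# Hypothetical deny-list lines, for illustration only.
sample = [
    "66.249.67.59 # hypothetical lfd entry (mod_security)",
    "203.0.113.7 # hypothetical unrelated entry",
    "# a comment-only line",
]
print(google_entries(sample))
```

The script only reports candidates. Removing them from the deny list and adding the ranges to the allow and ignore lists still has to be done in the firewall itself.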

If Googlebot appears to get blocked again at a future date and the
above ALLOWS and IGNORES are still in place, you need to go back and
check the log of denied IP's to find any that appear to be Googlebot,
like the one I got this morning which prompted me to dig a little
deeper and update the IP list above...

IP:       66.249.67.59 (US/United States/crawl-66-249-67-59.googlebot.com)
Failures: 5 (mod_security)
Interval: 300 seconds
Blocked:  Yes

If you then do a WHOIS search on the above IP, the result will
confirm that it is indeed Google, and just under halfway down the
output you will see the following CIDR range. Follow steps 1 and 2
above for the IP address listed.

CIDR:       66.249.64.0/19

You can do a WHOIS search here http://whois.domaintools.com/

and there is a good tutorial on CIDR here
http://help.yahoo.com/l/us/yahoo/smallbusiness/store/risk/risk-22.html
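
For a concrete feel of what the /19 in the WHOIS output covers, this short Python sketch expands the range with the standard `ipaddress` module. It is only an illustration of CIDR arithmetic and makes no claim about which addresses Google actually uses.

```python
import ipaddress

# The CIDR range reported by the WHOIS search above.
net = ipaddress.ip_network("66.249.64.0/19")

first, last = net.network_address, net.broadcast_address
print(first, last, net.num_addresses)  # 66.249.64.0 66.249.95.255 8192

# The blocked address from the firewall alert falls inside the range.
print(ipaddress.ip_address("66.249.67.59") in net)  # True
```

A /19 fixes the first 19 bits of the address, leaving 13 bits free, which is why the range spans 8192 addresses.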

If you are on shared hosting and don't have access to your server
firewall configuration, you need to talk to your Internet Hosting
Company and try to convince them to take the above action in order to
allow Googlebot back onto your site.

I hope this helps someone.

On Oct 19, 5:35 pm, coldrick wrote:


 
From: webado
Date: Sun, 2 Nov 2008 17:49:24 -0800 (PST)
Local: Sun, Nov 2 2008 8:49 pm
Subject: Re: URL timeout: robots.txt timeout
Great information Coldrick.
Bookmarked this thread.
Thank you.

On Nov 2, 7:17 pm, coldrick wrote:


 
From: 247drugmart
Date: Wed, 5 Nov 2008 00:34:05 -0800 (PST)
Local: Wed, Nov 5 2008 3:34 am
Subject: Re: URL timeout: robots.txt timeout
Hi

I have run into this error: URL timeout: robots.txt timeout

Can anyone help me solve this error?

Thanks

Sitename : www.247drugmart.com

On Oct 19, 12:35 pm, coldrick wrote:


 
From: webado
Date: Wed, 5 Nov 2008 05:22:14 -0800 (PST)
Local: Wed, Nov 5 2008 8:22 am
Subject: Re: URL timeout: robots.txt timeout
If you read this thread from the beginning, it explains exactly what is
likely to be happening and what you have to do to solve it.
Have you read it?

On Nov 5, 3:34 am, 247drugmart wrote:


 