504 Gateway Time-out


Davood Hadiannejad

Apr 12, 2023, 11:04:58 AM
to Common Crawl
Hey guys,

I used to search for a domain per index like this:

and it was working perfectly, but now I'm getting the following error:
b'<html>\r\n<head><title>504 Gateway Time-out</title></head>\r\n<body>\r\n<center><h1>504 Gateway Time-out</h1></center>\r\n<hr><center>nginx/1.18.0 (Ubuntu)</center>\r\n</body>\r\n</html>\r\n'

If I print the cc_url, it looks like:

And now if you copy and paste these URLs into the browser, you'll get:
504 Gateway Time-out


Does anyone know if anything changed recently?

Best 
Davood 



Sebastian Nagel

Apr 12, 2023, 4:25:58 PM
to common...@googlegroups.com
Hi Davood,

As of now, I can only recommend being patient and waiting for
a response, or sending your request again if it fails. Please also
reduce the request rate to at most 1 request per second (at the HTTP level).

The index server is currently heavily loaded, responding to several
million requests per day, with more than a few requests failing due to
queue overflows. I'll have a look in the next few days at whether I can make
the index server more responsive. Thanks for your patience!
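
For readers applying the advice above, here is a minimal sketch of a client that spaces requests at least one second apart and retries when the server answers with a 5xx error such as 504. It uses only the Python standard library. The class name `PoliteFetcher`, the helper `http_get`, the retry count, and the doubling backoff are all assumptions made for illustration; they are not part of any Common Crawl tooling, and the original query code from this thread is not shown here.

```python
import time
import urllib.error
import urllib.request

MIN_INTERVAL = 1.0  # seconds between requests (the 1 request/second advice)


def http_get(url, timeout=60):
    """Fetch a URL with the standard library and return the raw body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


class PoliteFetcher:
    """Hypothetical helper: throttle requests and retry failed ones."""

    def __init__(self, get=http_get, min_interval=MIN_INTERVAL,
                 max_attempts=5, sleep=time.sleep, clock=time.monotonic):
        self.get = get
        self.min_interval = min_interval
        self.max_attempts = max_attempts  # assumed value, tune as needed
        self.sleep = sleep
        self.clock = clock
        self._last = None  # time of the most recent request

    def _throttle(self):
        # Wait until at least min_interval has passed since the last request.
        if self._last is not None:
            wait = self.min_interval - (self.clock() - self._last)
            if wait > 0:
                self.sleep(wait)
        self._last = self.clock()

    def fetch(self, url):
        for attempt in range(1, self.max_attempts + 1):
            self._throttle()
            try:
                return self.get(url)
            except urllib.error.HTTPError as err:
                # Retry only server-side errors (e.g. 504); re-raise the rest.
                if err.code < 500 or attempt == self.max_attempts:
                    raise
            except urllib.error.URLError:
                # Network-level failure: retry unless attempts are used up.
                if attempt == self.max_attempts:
                    raise
            # Back off before the next attempt: 2 s, 4 s, 8 s, ...
            self.sleep(self.min_interval * 2 ** attempt)
```

Usage would look like `body = PoliteFetcher().fetch(cc_url)`, where `cc_url` is whatever query URL your own script builds. Sharing one `PoliteFetcher` instance across all requests is what keeps the overall rate at or below one request per second.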

Best,
Sebastian


Davood Hadiannejad

Apr 13, 2023, 3:26:19 AM
to common...@googlegroups.com
Hi Sebastian,
Thanks for your response. OK, I will reduce my request rate.
Thanks and regards,
Davood
