Getting around 503 errors


ivanb

Apr 19, 2012, 10:29:51 AM
to scrapy-users
I'm scraping a site for product information, and I get a 503 error
after about 6 or 7 links have been scraped from it.

I know there are ways to get around this. If you have any advice or
experience with it, or could post some useful links to get me started,
I would appreciate it.

Thanks

Rivka Shenhav

Apr 19, 2012, 1:05:36 PM
to scrapy-users
What you want to do is emulate a human user's access to the site. To that end, you may want to slow down the rate of requests to the site by using settings such as:

CONCURRENT_ITEMS
CONCURRENT_REQUESTS_PER_DOMAIN
CONCURRENT_REQUESTS

Set them to low numbers, and also increase the download delay (DOWNLOAD_DELAY).
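
For illustration, here is roughly what that could look like in a project's settings.py. This is only a sketch: the numeric values are assumptions to tune per site, and RANDOMIZE_DOWNLOAD_DELAY is an extra Scrapy setting not mentioned above that varies the wait around DOWNLOAD_DELAY.

```python
# settings.py (sketch; all numeric values are illustrative assumptions)

# Keep at most one request in flight, overall and per domain.
CONCURRENT_REQUESTS = 1
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Caps how many items are processed in parallel in the item pipeline.
# It does not throttle requests itself, so it matters less for 503s.
CONCURRENT_ITEMS = 10

# Wait this many seconds between consecutive requests to the same site.
DOWNLOAD_DELAY = 5

# Vary each wait between 0.5x and 1.5x DOWNLOAD_DELAY so the request
# timing looks less mechanical.
RANDOMIZE_DOWNLOAD_DELAY = True
```

With these values the spider would fetch one page at a time, waiting somewhere between 2.5 and 7.5 seconds between requests.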

Also, after each relatively short burst of such requests, you may want to pause for 10-20 minutes.
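
The burst-then-pause idea can be sketched in plain Python as below. BurstPacer and its parameters are hypothetical names made up for this example, not part of Scrapy, and the burst size of 6 is just a guess based on where the 503s started. The sleep function is injectable so the logic can be tried out without actually waiting.

```python
import random
import time


class BurstPacer:
    """Hypothetical helper: pause for a random interval after every N requests."""

    def __init__(self, burst_size=6, min_pause=600, max_pause=1200,
                 sleep=time.sleep):
        self.burst_size = burst_size  # requests allowed before a pause
        self.min_pause = min_pause    # shortest pause in seconds (10 minutes)
        self.max_pause = max_pause    # longest pause in seconds (20 minutes)
        self._sleep = sleep           # injectable so the pause can be faked
        self._count = 0

    def tick(self):
        """Call once per request; returns the pause taken in seconds (0.0 if none)."""
        self._count += 1
        if self._count < self.burst_size:
            return 0.0
        # Burst finished: reset the counter and pause for a random duration.
        self._count = 0
        pause = random.uniform(self.min_pause, self.max_pause)
        self._sleep(pause)
        return pause
```

Note that a blocking time.sleep inside a running Scrapy spider stalls the whole crawler, since Scrapy runs on a single Twisted event loop. For a one-request-at-a-time crawl that may be acceptable, but treat this as a starting point rather than a finished solution.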

Rivka


Pablo Hoffman

Apr 19, 2012, 4:47:07 PM
to scrapy...@googlegroups.com, ivanb