The website used in the sample code is broken


赵祎

Sep 9, 2016, 4:33:29 AM
to scrapy-users
The website https://blog.scrapinghub.com, which the sample code on the homepage scrapes, is broken, so the sample code does not work.

Also, when I follow the sample under "Walk-through of an example spider", I get this message:

    Ignoring response <403 http://stackoverflow.com/questions?sort=votes>: HTTP status code is not handled or not allowed

I haven't found a solution. Any help is appreciated, thanks a lot.

Valdir Stumm Junior

Sep 9, 2016, 9:20:26 AM
to scrapy...@googlegroups.com
What's happening when you run your spider for blog.scrapinghub.com?


About the StackOverflow one, it looks like they are denying access due to the spider's User Agent. I tried using a browser's User Agent just to check and it worked:

    $ scrapy runspider spider.py -s USER_AGENT='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
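
If the override should persist rather than be passed on every run, the same setting can be written to a file. This is only a sketch, assuming the spider lives in a Scrapy project with a `settings.py`. The user agent string is copied from the command above and is just one example of a browser string:

```python
# settings.py (sketch): project-wide Scrapy settings.
# USER_AGENT is the same setting that "-s USER_AGENT=..." overrides
# on the command line. The value below is the browser string from the
# command above; any current browser user agent could be substituted.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/52.0.2743.116 Safari/537.36"
)
```

For a standalone spider file run with `scrapy runspider`, there is no `settings.py`; in that case the same key can go in the spider's `custom_settings` class attribute instead.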






--
Scrapinghub

Valdir Stumm Junior 
Software Engineer, Scrapinghub 


We turn web content into structured data. Lead maintainers of Scrapy.
