Are there legal or protocol issues with scraping?


Nilanjan Bhattacharya

Aug 27, 2014, 11:48:50 PM
to scrap...@googlegroups.com
Is it OK to scrape commercial sites like ESPN? I realize this isn't something ScraperWiki can necessarily answer.

If anyone has asked sites like this for permission, did you get a positive response?

Thad Guidry

Aug 28, 2014, 12:10:49 AM
to scrap...@googlegroups.com
The large sites' response is always: "We shall sue you, and take your children, and theirs as well."

The small sites typically say, "What do you want? We might give you the database in exchange for some coding help, or perhaps a small $50-100 fee." Sometimes the answer is, "Yeah, just don't hit our servers hard and take your time. Besides, half our data is copyrighted in some parts of the world and not in others, so don't ask, don't tell us."
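One way to honor "don't hit our servers hard and take your time" is to enforce a minimum gap between requests. The following is only a rough sketch, not anything the sites above asked for specifically: the class name `PoliteThrottle` and the two-second interval are assumptions chosen for illustration, and the right interval is whatever the site owner is comfortable with.

```python
import time


class PoliteThrottle:
    """Enforce a minimum delay between consecutive requests.

    Hypothetical helper for illustration; the clock and sleep functions
    are injectable so the behavior can be checked without real waiting.
    """

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the minimum interval has passed; return seconds slept."""
        waited = 0.0
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
                now = self._clock()
        self._last = now
        return waited


if __name__ == "__main__":
    throttle = PoliteThrottle(min_interval_s=2.0)  # assumed interval
    for page in ["page1", "page2", "page3"]:  # placeholder page names
        throttle.wait()
        print("would fetch", page)
```

Call `wait()` immediately before each request; the first call returns at once and later calls sleep only for whatever part of the interval has not already elapsed.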


Thad Guidry

Aug 28, 2014, 12:13:06 AM
to scrap...@googlegroups.com
I should have also stated: I always ask, and I have been quite lucky in getting positive responses, usually involving some barter arrangement with them. They scratch my back, I scratch theirs.

"Just ask them" is the ethical thing to do, and my experience is that it usually works out quite well!

Francis Irving

Aug 28, 2014, 6:03:08 AM
to scrap...@googlegroups.com

ScraperWiki's policy is here:

https://classic.scraperwiki.com/docs/python/faq/#scraping_legality

We are a bit broader than Thad, in that we think that if robots.txt gives permission (for example, because it is absent entirely), that is a license to scrape. We also think most government data is fair game because of the public interest.
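The robots.txt check described above can be sketched with Python's standard `urllib.robotparser` module. This is a minimal illustration, not ScraperWiki's own tooling: the helper name `allowed_by_robots`, the user agent string `example-scraper`, and the `example.com` URLs are all assumptions. Note that it only reports what robots.txt says; whether that amounts to legal permission is a separate question.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, user_agent="example-scraper"):
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper: it takes the robots.txt contents as a string so the
    check can be run without touching the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    print(allowed_by_robots(rules, "https://example.com/scores"))
    print(allowed_by_robots(rules, "https://example.com/private/data"))
    # An empty rule set places no restrictions on any path.
    print(allowed_by_robots("", "https://example.com/anything"))
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` fetches the file instead of parsing a string.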

Otherwise, asking is a good idea!

More in this blog post:

https://blog.scraperwiki.com/2012/04/is-scraping-legal/

There are whole industries, notably price comparison amongst ecommerce providers, which are rife with scraping that breaks the above rules.

Francis

Nilanjan Bhattacharya

Aug 28, 2014, 8:00:27 PM
to scrap...@googlegroups.com
Thanks. Most of the sites I see have scary warnings about scraping or API usage. I am still on the fence about sites like ESPN.

As an aside, I think the equation changes completely if you use the word "curation" instead of "scraping". For example, I just created a list on list.ly (http://list.ly/list/QTo-indian-startup-punters). I created it by pasting a URL, and all the information is pulled from that URL. Of course, list.ly also has a notice saying you need to make sure you have permission, etc. However, I am not sure that notice is taken seriously.