Twitter Harvesting Blocked?

Stever

Jun 19, 2013, 2:55:37 AM
to pattern-f...@googlegroups.com
Hello everyone,

I've been harvesting tweets for research using pattern.web, and a couple of days ago it stopped working: Twitter no longer responds to harvest requests. All other modules (Google, Newsfeed, etc.) work fine. I wonder if my IP has been blacklisted somehow (pinging too much?). If so, is there a way to unblock it?

Steve

Stever

Jun 19, 2013, 3:03:42 AM
to pattern-f...@googlegroups.com

By the way, the code:

from pattern.web import Twitter
print Twitter().trends(cached=False)

produces this error:

Traceback (most recent call last):
  File "C:directory path\Twitter.py", line 5, in <module>
    print Twitter().trends(cached=False)
  File "C:\Python27\lib\site-packages\pattern\web\__init__.py", line 1234, in trends
    data = url.download(**kwargs)
  File "C:\Python27\lib\site-packages\pattern\web\__init__.py", line 394, in download
    data = self.open(timeout, proxy, user_agent, referrer, authentication).read()
  File "C:\Python27\lib\site-packages\pattern\web\__init__.py", line 363, in open
    raise HTTPError
HTTPError
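
A bare HTTPError like the one above says little about whether the failure is temporary. As a minimal sketch (not part of Pattern), the call can be wrapped in a small retry helper with exponential backoff. The name call_with_retries and its parameters are hypothetical; the only Pattern name assumed is HTTPError, which the traceback shows being raised from pattern.web.

```python
import time


def call_with_retries(fn, exceptions=(Exception,), attempts=3, delay=1.0,
                      sleep=time.sleep):
    """Call fn() and return its result, retrying on the given exceptions.

    Waits delay, 2*delay, 4*delay, ... seconds between attempts and
    re-raises the last exception once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(delay * 2 ** attempt)


# Hypothetical usage with Pattern (not executed here):
#   from pattern.web import Twitter, HTTPError
#   trends = call_with_retries(lambda: Twitter().trends(cached=False),
#                              exceptions=(HTTPError,))
```

Retrying only helps with transient failures. It will not fix a request sent to an endpoint that no longer exists.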

Stever

Jun 19, 2013, 3:14:57 AM
to pattern-f...@googlegroups.com
Also, I wonder if this (the new version of the Twitter API) has anything to do with it:

https://dev.twitter.com/blog/api-v1-is-retired

Michele Orrù

Jun 19, 2013, 3:18:27 AM
to pattern-f...@googlegroups.com
nope,

>>> from pattern import web
>>> web.TWITTER
'http://api.twitter.com/1.1/'

--
ù

Stever

Jun 19, 2013, 9:45:02 AM
to pattern-f...@googlegroups.com
I got:

>>> web.TWITTER
'http://search.twitter.com/'
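
The two values of web.TWITTER quoted in this thread differ in whether the URL contains the /1.1/ path segment. A minimal sketch of that comparison, assuming the endpoint string is all that needs checking (the helper name targets_api_v1_1 is hypothetical and not part of Pattern):

```python
def targets_api_v1_1(endpoint):
    """Return True if the endpoint URL contains the /1.1/ path segment."""
    return "/1.1/" in endpoint


# The two endpoint strings reported in this thread:
newer = "http://api.twitter.com/1.1/"
older = "http://search.twitter.com/"

# Hypothetical usage with Pattern (not executed here):
#   from pattern import web
#   targets_api_v1_1(web.TWITTER)
```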

Tom De Smedt

Jun 19, 2013, 3:58:44 PM
to pattern-f...@googlegroups.com
Twitter updated their API a few days ago, so Pattern had to be updated too. You can download the latest revision from GitHub: http://www.github.com/clips/pattern

There's a "ZIP" download button on the page. Twitter works again in the latest revision. Ideally, you should now pass a tweet id to the "start" parameter in Twitter().search().

In short:

from pattern.web import Twitter
twitter = Twitter()
last_id = None
for i in range(10):
    for tweet in twitter.search("cats", start=last_id, count=100, cached=False):
        print tweet.text
        last_id = tweet.id

... which gives you up to 1,000 tweets about cats.
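
The same paging idea can be sketched as a reusable function that also drops repeated tweets and stops early once a page brings nothing new. This is an illustration under assumptions, not Pattern's own code: harvest is a hypothetical helper, and search stands for any callable that accepts start and count keyword arguments and returns objects with .id and .text attributes, as Twitter().search() does in the loop above.

```python
def harvest(search, query, pages=10, count=100):
    """Collect up to pages * count unique results by paging on the last id.

    `search` is any callable with the signature
    search(query, start=..., count=...) returning objects that have
    an .id attribute.
    """
    seen = set()
    results = []
    last_id = None
    for _ in range(pages):
        batch = list(search(query, start=last_id, count=count))
        added = 0
        for item in batch:
            if item.id not in seen:  # skip ids already collected
                seen.add(item.id)
                results.append(item)
                added += 1
        if added == 0:
            break  # empty page or only repeats: nothing more to fetch
        last_id = batch[-1].id
    return results


# Hypothetical usage with Pattern (not executed here):
#   from pattern.web import Twitter
#   twitter = Twitter()
#   tweets = harvest(lambda q, **kw: twitter.search(q, cached=False, **kw),
#                    "cats")
```

Deduplicating by id is a precaution in case the tweet passed as start is returned again on the next page.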

Also check the example in:
pattern/examples/01-web/04-twitter.py

Best,
Tom

--
 
---
You received this message because you are subscribed to the Google Groups "Pattern" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pattern-for-pyt...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

modciv

Jan 28, 2014, 11:12:41 PM
to pattern-f...@googlegroups.com
Tom: I've been getting a pattern.web.HTTP403Forbidden error w/ this lately. Has something changed again?

Tom De Smedt

Feb 10, 2014, 3:50:40 PM
to pattern-f...@googlegroups.com
You should be fine if you use the latest GitHub revision (http://www.github.com/clips/pattern). Haven't had time yet to update the official download on http://www.clips.ua.ac.be/pattern