tweepy stops with errno 104 (connection reset by peer) after one week of running


D T

Nov 4, 2011, 11:09:46 AM
to twe...@googlegroups.com
Hello,

I am using tweepy as a tweet crawler, gathering tweets through the streaming API. However, about 1 to 1.5 weeks after it starts gathering, the script exits with errno 104 (connection reset by peer). I have programmed the crawler to e-mail me error reports as follows:

1) on_error, on_timeout functions:

def on_error(self,status_code):
        timestr = strftime("%Y-%m-%d %H:%M:%S", localtime())
        mail("TwCrawler error","Error code: %i at %s" % (int(status_code),timestr))
        return True

def on_timeout(self):
        timestr = strftime("%Y-%m-%d %H:%M:%S", localtime())
        mail("TwCrawler timeout","Timeout at %s" % (timestr))
        return True
2) catching tweepy errors in main:

except TweepError as e:
        timestr = strftime("%Y-%m-%d %H:%M:%S", localtime())
        # %s rather than %e: e.reason is a string, and %e expects a float
        mail("TwCrawler Tweepy error","TweepError %s " % (e.reason))

However, it didn't send me anything when the script failed with this error. Is there any way to "detect" this errno in code?

Thanks

Pascal Jürgens

Nov 5, 2011, 12:19:40 PM
to twe...@googlegroups.com
Hi David,

Twitter frequently drops HTTPS connections to the streaming API, so catching such events and reconnecting is definitely the right thing to do (make sure to back off incrementally if your first reconnection attempt doesn't work).

The "connection reset by peer" exception is not a TweepError but a lower-level type. If I recall correctly, it's a socket exception. So if you want to catch it, I'd recommend either a blanket "except Exception" (not the best way) or, more specifically, an exceptions.IOError one.
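As a rough sketch of the point above: in Python 3, IOError and socket.error are aliases of OSError, and a reset can be recognised by comparing the exception's errno attribute with errno.ECONNRESET (104 on Linux). The helper name is_connection_reset and the simulated_read stand-in are illustrative assumptions, not part of tweepy.

```python
import errno


def is_connection_reset(exc):
    """Return True if exc is an OS-level 'connection reset by peer' error."""
    # OSError covers IOError and socket.error in Python 3.
    return isinstance(exc, OSError) and exc.errno == errno.ECONNRESET


def simulated_read():
    # Hypothetical stand-in for the blocking stream read; raises the same
    # kind of error the socket layer reports when the peer resets.
    raise ConnectionResetError(errno.ECONNRESET, "Connection reset by peer")


try:
    simulated_read()
except OSError as exc:
    if is_connection_reset(exc):
        print("connection reset detected (errno %d)" % exc.errno)
    else:
        raise
```

Checking errno keeps other I/O failures (for example a missing file) from being mistaken for a dropped stream.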

Hope that helps,
regards,
Pascal

Berry van der Linden

Nov 4, 2011, 1:19:22 PM
to twe...@googlegroups.com

Depending on which urllib you use, something like this should work:

http://stackoverflow.com/questions/5000138/need-help-with-python-exception-handling


D T

Nov 6, 2011, 4:31:48 AM
to twe...@googlegroups.com
Thanks for your replies.

I still have a couple of questions about how/where to do this (I'm a Python and tweepy beginner):

1) Tweepy, in streaming.py, uses urllib (urlencode) and httplib (HTTPSConnection). It seems that Tweepy connects to the stream with httplib and uses urllib only for encoding. Since it doesn't use urllib's "open" function, I wonder whether catching "urllib.error.URLError" would work at all. Shouldn't I catch httplib.HTTPException instead?

2) Should I modify the Tweepy source code (streaming.py), or just my crawler (Python script)?
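For what it's worth, the relationship between these exception types can be checked directly. This is a small sketch assuming Python 3 module names (httplib became http.client, and URLError lives in urllib.error); the reset object is constructed by hand purely for illustration.

```python
import errno
import http.client   # named httplib in Python 2
import urllib.error

# Hand-built example of the error the socket layer raises on a reset.
reset = ConnectionResetError(errno.ECONNRESET, "Connection reset by peer")

# A reset is an OS-level error, so `except OSError` (IOError) sees it.
print(isinstance(reset, OSError))
# It is not an HTTP protocol error, so `except HTTPException` would miss it.
print(isinstance(reset, http.client.HTTPException))
# Nor is it a URLError; that type is only raised by urllib's own openers.
print(isinstance(reset, urllib.error.URLError))
```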

Thanks in advance... 

Pascal Jürgens

Nov 7, 2011, 7:51:44 AM
to twe...@googlegroups.com
Hi David,

As I said, what you receive is basically a socket exception, which is a special case of IOError.

If you want to catch that, you should do something like:


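auth = tweepy.OAuthHandler(…  # set up your oauth here
try:
    stream = tweepy.Stream(auth=auth, listener=SomeListener())  # create the stream
    # Assumed: the stream is started with sample() (or filter(...));
    # this is the blocking call during which the reset is raised.
    stream.sample()
except IOError as ex:
    print('I just caught the exception: %s' % ex)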

Once this works, wrap it in a while True loop with an increasing backoff,
so that there is some pause between reconnection attempts.
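The retry idea can be sketched roughly as follows. This is not tweepy code: run_with_backoff, its parameters, and the connect callable (which would open the stream and block while reading it) are hypothetical names chosen for illustration, and the loop is bounded by max_attempts rather than running forever.

```python
import time


def run_with_backoff(connect, max_attempts=5, base_delay=1.0, max_delay=60.0,
                     sleep=time.sleep):
    """Call connect() until it returns normally, pausing after each IOError.

    Returns the number of attempts used; re-raises the last IOError once
    max_attempts is exhausted.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return attempt
        except IOError as ex:  # IOError is an alias of OSError in Python 3
            if attempt == max_attempts:
                raise
            print("attempt %d failed (%s); retrying in %.0f s"
                  % (attempt, ex, delay))
            sleep(delay)
            delay = min(delay * 2, max_delay)  # double the pause, up to a cap
```

For an endless crawler, the for loop could be swapped for while True, with the delay reset to base_delay after a connection that stayed up for a while.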

Hope that helps,
regards,
Pascal

D T

Nov 9, 2011, 7:36:20 AM
to twe...@googlegroups.com
Thanks Pascal!

I am currently trying your approach. I will check whether catching the exception there works and post back the results.
 

D T

Nov 11, 2011, 5:25:40 AM
to twe...@googlegroups.com
Pascal, thank you very much! It worked!
