Firehose vs. Rate Limiting

KCL

Sep 10, 2015, 1:12:16 PM
to Phirehose Users
I'm hoping someone can clarify something for me regarding rate limiting and Phirehose.

I'm trying to use Ghetto Collector and have gotten it working.  I'm running an Amazon virtual machine with Ubuntu and spawning a new process for each item I want to track.

When I have 1, it works flawlessly.

When I have 2, it works flawlessly.

When I have 3, I start getting (but not always) the Rate Limited (RL) message "Enhance Your Calm".

When I have 4, most of the time I get the RL message.

When I have 5, I continuously get the RL message.

Even when I have 5, the two most active streams seem to stick around, but the other three drop out with the "PhirehoseConnectLimitExceeded" error (after going over 20 retries).  What is it "trying" to do that gets rate limited?

I thought that once the stream was open, there were no more real "requests".  It just caught whatever was being thrown at it.


I'm really trying to play by the rules but I think I have a basic misunderstanding of how Phirehose and/or the Firehose stream work.


Any help will be greatly appreciated.  Phenomenal help will be rewarded with cookies.


Scott.


Darren Cook

Sep 10, 2015, 1:17:50 PM
to phireho...@googlegroups.com
> I'm trying to use Ghetto Collector and have gotten it working. I'm
> running an Amazon virtual machine with Ubuntu and spawn a new process for
> each item I want to track.
>
> When I have 1, it works flawlessly.
>
> When I have 2, it works flawlessly.

Are you using a different Twitter API user for each process? (If not,
I'm surprised it works with 2 connections, but maybe they have some
leeway??)

Alternatively, you could collect all keywords that you want to follow,
with one process?

Darren

Kallen Scott

Sep 10, 2015, 1:22:41 PM
to phireho...@googlegroups.com
Darren,

Thanks for jumping in so quickly. No. I don’t believe I am. I have one registered app with Twitter that I use the consumer key/client secret from — is that what you mean?


Scott.

Darren Cook

Sep 10, 2015, 1:51:14 PM
to phireho...@googlegroups.com
> Thanks for jumping in so quickly. No. I don’t believe I am. I have
> one registered app with Twitter that I use the consumer key/client
> secret from — is that what you mean?

Yes. AFAIK, Twitter only expects you to make one connection at a time
with your credentials.

Darren

Adam Green

Sep 10, 2015, 1:52:58 PM
to phireho...@googlegroups.com
To be clear, are you:

A. Starting one process that connects to the streaming API and leaving it running, then trying to start a second process with the same app keys.

Or

B. Starting a process, stopping it, and then starting another?

If A, that will not work. As Darren said, you should fail on the second process. You are only allowed to make one API connection at a time for each set of keys. Each app only gets a single set of keys.

The solution is to search for all the keywords in a single process, then parse out the results separately based on the keywords found in the tweets returned.

That is not a Phirehose limit, it is inherent in the Twitter API.
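
A rough sketch of that "one stream, split afterwards" idea, written in Python for brevity rather than Phirehose's PHP. The function names are made up for illustration, and matching on the tweet text alone is a simplifying assumption (the real stream can also match on other fields, such as URLs):

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its word-like tokens as a set.

    Hashtags and mentions are stored both with and without their prefix, so a
    plain keyword such as "php" also matches "#php".
    """
    tokens = set()
    for token in re.findall(r"[#@]?\w+", text.lower()):
        tokens.add(token)
        tokens.add(token.lstrip("#@"))
    return tokens


def matching_phrases(tweet_text, track_phrases):
    """Return the tracked phrases whose terms all appear in the tweet text."""
    tokens = tokenize(tweet_text)
    return [
        phrase
        for phrase in track_phrases
        if all(term in tokens for term in phrase.lower().split())
    ]


def dispatch(tweet_texts, track_phrases):
    """Group tweet texts by the tracked phrase(s) they match."""
    buckets = defaultdict(list)
    for text in tweet_texts:
        for phrase in matching_phrases(text, track_phrases):
            buckets[phrase].append(text)
    return dict(buckets)
```

A tweet that matches several tracked phrases lands in several buckets, which is usually what you want when each phrase used to have its own process.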


Kallen Scott

Sep 10, 2015, 1:54:37 PM
to phireho...@googlegroups.com
Ah. OK. A bit of a monkeywrench.

So, next question — how many things am I allowed to -track- using the single phirehose? And I mean “OR”s, not “AND”s….

Scott.

Darren Cook

Sep 10, 2015, 3:37:04 PM
to phireho...@googlegroups.com
> So, next question — how many things am I allowed to -track- using
> the single phirehose? And I mean “OR”s, not “AND”s….

https://dev.twitter.com/streaming/overview/request-parameters#track

The official limit doesn't seem to be documented, but
https://dev.twitter.com/streaming/overview/connecting says you get a 413
error if you go over this unknown limit.

I know from experience that 20 keywords works fine. So, keep adding
until you hit the limit, back off 25% as a buffer, then go and create
another twitter account, another set of API credentials, and repeat :-)
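
The "keep adding, then back off 25%" procedure might look roughly like this in Python. Here `accepts` is a hypothetical callable you would write yourself: it should return True when a connection tracking that many keywords is accepted and False when it is rejected (for example with a 413). The step size, ceiling and pause are arbitrary assumptions, not values from the Twitter docs:

```python
import time


def find_safe_track_count(accepts, start=20, step=20, buffer=0.25,
                          ceiling=1000, pause=60):
    """Grow the keyword count until `accepts` refuses, then back off.

    `accepts(n)` is a hypothetical callable that returns True when a
    connection tracking n keywords is accepted and False when it is
    rejected (for example with HTTP 413).
    """
    if not accepts(start):
        raise ValueError("even the starting keyword count was rejected")
    count = start
    while count + step <= ceiling:
        time.sleep(pause)  # space out probes; each one is a new connection
        if not accepts(count + step):
            break
        count += step
    # Keep a safety margin below the largest count that was accepted.
    return int(count * (1 - buffer))
```

Since every probe is a fresh connection attempt, the pause matters: probing too quickly could itself trigger the rate limiting discussed earlier in this thread.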

Darren

Adam Green

Sep 10, 2015, 3:39:51 PM
to phireho...@googlegroups.com
It's traditionally been 400 keywords and 5,000 user_ids. The fact that the limits are now nowhere to be found in the docs means they may want to adjust them as load and business model demand.
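
If a keyword list outgrows one connection, a minimal Python sketch of splitting it into per-connection batches could look like this. The 400 default is only the traditional figure mentioned above, not a documented limit, and both function names are invented for illustration:

```python
def batch_keywords(keywords, per_connection=400):
    """Deduplicate keywords and split them into per-connection batches."""
    cleaned = (k.strip().lower() for k in keywords)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(k for k in cleaned if k))
    return [
        unique[i:i + per_connection]
        for i in range(0, len(unique), per_connection)
    ]


def track_parameter(batch):
    """Join one batch into the comma-separated form the track parameter takes."""
    return ",".join(batch)
```

Each batch would then need its own set of API credentials, since only one streaming connection is allowed per set of keys.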



Kallen Scott

Sep 10, 2015, 5:01:34 PM
to phireho...@googlegroups.com
That’s very helpful, thank you both.  Man, I sure read SOMETHING wrong somewhere.


Scott.
