Is it possible to use flowControl options on Firebase Functions Pub/Sub?


Phil Hodey

May 4, 2019, 5:03:38 AM
to Firebase Google Group
Hi,

In native Cloud Pub/Sub you are able to define flowControl parameters on the subscriber...



Here you can set maxMessages to limit the number of messages being pushed downstream.
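For reference, a minimal sketch of what those settings look like in the @google-cloud/pubsub Node.js client (the subscription name and handler are hypothetical, and the SDK wiring is shown commented out since it needs a real project and credentials):

```typescript
// Flow-control settings as accepted by the @google-cloud/pubsub Node.js
// client for pull subscriptions (names match that client's API).
const subscriberOptions = {
  flowControl: {
    maxMessages: 10,             // at most 10 unacked messages in flight
    maxBytes: 10 * 1024 * 1024,  // or at most 10 MiB of outstanding data
  },
};

// Wiring sketch (requires the @google-cloud/pubsub package and credentials):
// const { PubSub } = require('@google-cloud/pubsub');
// const subscription = new PubSub().subscription('my-sub', subscriberOptions);
// subscription.on('message', (msg) => { /* process it */ msg.ack(); });
```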

In my use case I have a Firebase function that indexes Firebase data into an Elasticsearch cluster. When there is a bulk update in Firebase, it causes the Elasticsearch indexer to max out. The simplest solution would be to reduce the flow of messages coming from Pub/Sub using flowControl.maxMessages, but it does not look like we have any control over the configuration of the Pub/Sub queue.

Is there anything I can do in the pubsub.onPublish definition to control throughput?




Kato Richardson

May 6, 2019, 5:22:30 PM
to Firebase Google Group
Hi Phil,

I don't see any indication that Functions will work with flowControl.maxMessages, given that it's a client-side feature, and knowing that Functions is essentially a stateless microservice (it wouldn't store state across all instances to know how many messages have been received in total).

I'll ping a few Functions gurus to make sure I'm not leading you down the wrong path here. But it seems like you might be better served by a server instance that can retain state, such as a GCE/GAE instance. It could then receive the Firebase writes and batch them for insertion into ES (which should be much more efficient), and also use flowControl.maxMessages as needed.
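To illustrate the batching idea (a sketch only; the batch size and the document values are made up), grouping incoming writes into fixed-size chunks for bulk insertion might look like:

```typescript
// Group incoming documents into fixed-size batches so they can be sent
// to Elasticsearch as bulk requests instead of one request per document.
function toBatches<T>(docs: T[], batchSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < docs.length; i += batchSize) {
    batches.push(docs.slice(i, i + batchSize));
  }
  return batches;
}

// Example: 5 documents in batches of 2 -> [[1,2],[3,4],[5]]
const batches = toBatches([1, 2, 3, 4, 5], 2);
```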

☼, Kato

--
You received this message because you are subscribed to the Google Groups "Firebase Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to firebase-tal...@googlegroups.com.
To post to this group, send email to fireba...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/firebase-talk/9015fc3f-aea6-42e7-92dc-edc2e2e6f5c6%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--

Kato Richardson | Developer Programs Eng | kato...@google.com | 775-235-8398


Phil Hodey

May 7, 2019, 11:48:45 AM
to Firebase Google Group

Hi Kato,

Thanks for that. Having dug a little deeper, I can see that Cloud Functions uses push delivery, not pull delivery. Flow control is a feature of the pull subscription type, so it clearly will not be supported.

I have seen this though...


It does look like there is a degree of flow control built into the push subscription type, based on the processing rate of the downstream service.

I notice in the core Cloud Functions documentation there is a scaling option to prevent more than a specific number of instances from being launched...


I could therefore control flow by limiting the number of instances that can be launched. For this to work I need to be able to set max-instances from the Firebase Cloud Functions configuration.

Finally...


It looks like the push subscription type expects a status-code response from the subscriber, which I presume the Firebase function handles directly, as there are no return types specified in the pubsub onPublish documentation.

So for the max-instances config to be of any use, we'd also need some clarity on how functions handle returning status codes.
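For what it's worth, background functions signal completion through the returned promise rather than an explicit status code: a fulfilled promise marks the invocation successful, and a rejection or throw marks it failed (retried only if retries are enabled on the function). A sketch with a hypothetical handler and message shape:

```typescript
// Hypothetical message shape; in a real firebase-functions deployment this
// would be the body of the onPublish callback.
type PubsubMessage = { json: { id?: string } };

async function handleMessage(message: PubsubMessage): Promise<string> {
  const doc = message.json;
  if (!doc.id) {
    // Throwing rejects the returned promise: the invocation is marked
    // failed, and is retried only if retries are enabled on the function.
    throw new Error('malformed message');
  }
  // ... send doc to Elasticsearch here ...
  return doc.id; // fulfilled promise signals success
}
```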

Any thoughts on this?






Kato Richardson

May 8, 2019, 4:57:54 PM
to Firebase Google Group
None : ) You've exhausted the depth of my knowledge here.  Let me see if I can find someone with more Functions chops to help out...


Phil Hodey

May 10, 2019, 6:21:39 AM
to Firebase Google Group
Hey Kato,

Quick update for you: using Stackdriver I could see that at peak flows the function that processed incoming Pub/Sub events was scaling up to 200+ instances. I've limited the max number of instances to 50, and it has reduced the flow sufficiently to prevent downstream buffers from overflowing. Watching the Pub/Sub queue in Stackdriver, you can now see the queue building and draining in a nice orderly fashion.

As a request, it would be great to expose the max-instances config via the Firebase function spec, instead of having to go into the console to change it.
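For later readers: newer firebase-functions releases did eventually expose this in code via runWith. A sketch (the topic name and handler are hypothetical, and the SDK wiring is commented out since it requires the firebase-functions package):

```typescript
// Runtime options object of the shape accepted by firebase-functions'
// runWith() (maxInstances support landed in later releases of the SDK).
const runtimeOpts = { maxInstances: 50 };

// Sketch (requires the firebase-functions package):
// import * as functions from 'firebase-functions';
// export const indexer = functions
//   .runWith(runtimeOpts)
//   .pubsub.topic('index-requests')   // hypothetical topic name
//   .onPublish(async (message) => { /* index into Elasticsearch */ });
```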


Kato Richardson

May 10, 2019, 12:06:19 PM
to Firebase Google Group
Great stuff and thanks for this! Passing on to the experts to evaluate.


David DeRemer

Dec 28, 2019, 10:59:30 AM
to Firebase Google Group
Hi Phil, thanks for posting this. I was just researching how to control the impact on our RTDB utilization by setting max instances on a Pub/Sub-triggered function. Basically, we have an integration with a third party, and they can send us as much data in as many messages as they'd like. This occasionally creates spikes in the Pub/Sub-triggered function that impact our DB utilization.

So what I wanted to confirm (through your experience, and perhaps any further input Kato received from the "experts") is that setting a "max instances" limit on a Pub/Sub-triggered function will in fact restrict the flow, and, most importantly, that the message queue will simply build up until the functions can handle it. We definitely do NOT want to lose any messages.

We appreciate the help and the insights you've already provided in this thread!

Phil Hodey

Dec 28, 2019, 1:38:40 PM
to Firebase Google Group
Hi David, I can confirm that this is working well as a workaround. We have a number of overnight processes that generate a lot of Pub/Sub messages, and by limiting the number of instances that can run we are able to control the message flow. It is not ideal, but it does work and is much simpler than the alternative, so I'd certainly recommend giving it a go. I found it useful to set up Stackdriver monitoring to view the number of instances and the Pub/Sub queue build-up. We have managed to fine-tune the number of instances to optimise the throughput accordingly.

Preston Holmes

Dec 28, 2019, 2:13:23 PM
to fireba...@googlegroups.com
When you are using max-instances, it is important to understand that it is for concurrency control, not rate-limiting.

If you set max instances to 4, and the function takes very little time/CPU, it can still result in very high throughput.

Often for databases this is exactly what you want. But for true rate limiting you might use task queues, or integrate an external global rate limiter (for example via Redis), which could let you use custom rate-limiting keys, such as per-user.
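To sketch the distinction: a token bucket is the classic true rate limiter. This in-memory version is illustrative only; in a distributed setting like the one described above, the bucket state would live somewhere shared, such as Redis.

```typescript
// Minimal in-memory token bucket: allows bursts up to `capacity`
// operations, then throttles to `refillPerSecond` operations per second.
class TokenBucket {
  private tokens: number;

  constructor(private capacity: number, private refillPerSecond: number) {
    this.tokens = capacity; // start full
  }

  // Call with elapsed wall-clock time; adds tokens, capped at capacity.
  refill(elapsedSeconds: number): void {
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond,
    );
  }

  // Returns true if the operation may proceed now, false if rate-limited.
  tryConsume(): boolean {
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket(2, 1); // burst of 2, refills 1 token/second
```

Unlike max-instances, this bounds operations per unit time regardless of how fast each invocation completes.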

I cover a few of these patterns, with links to some concrete tutorial references, in a blog post here.

-Preston



--
Preston Holmes | Cloud Platform Solutions | pt...@google.com | (805) 399-2661


Jerry Sumpton

Dec 28, 2019, 5:30:28 PM
to Firebase Google Group
My solution is to use Cloud Run and an API based on Express.

I find it easier to develop with. I have routes, controllers, and CRUD for all functions; I develop against the API and migrate selected functions to Cloud Functions as it makes sense.

I also use Elasticsearch, and it is easier to control flooding from a long-running API.

Phil Hodey

Dec 28, 2019, 5:30:29 PM
to Firebase Google Group
Hi Preston, yes, I understand that; thanks for highlighting it. For my needs it works, but it is not the perfect solution. Adding external rate limiters is overly complex for what we want to achieve.

David DeRemer

Dec 28, 2019, 8:30:18 PM
to Firebase Google Group
Thanks everyone for the insight. Phil, we are definitely going to give this a shot and I really appreciate your help!

Preston, it's a good clarification for future readers that if your functions execute very quickly, you can still get very high throughput even with max instances set. In our case we are trying to limit the impact on the IO utilization of a Firebase RTDB instance, so I think this solution will work nicely.

Overall, for bursty background tasks that are async, idempotent, and not time sensitive to complete, controlling invocation concurrency via the number of max instances is a really interesting way to control the load.

As an aside... @Kato, if you see this, (or others) I was wondering what effect "max instances" would have on RTDB or Firestore triggered functions. If a handler function has a max number of instances, then my assumption would be that the triggers would just queue up and slowly drain just like with PubSub. Is that correct?

David DeRemer

Jan 9, 2020, 3:08:45 PM
to Firebase Google Group
In another thread I got some insight from the Cloud Function team about how background triggers (e.g., database, storage, etc.) are handled when they are being invoked faster than the max instances can handle them. Linking here as I think it is relevant to this thread...
