Task rate limits?


Emlyn

Jun 14, 2017, 1:58:18 AM
to google-a...@googlegroups.com
I'm working a lot with code that enqueues huge numbers of tasks (hundreds of thousands) in short periods of time (over a couple of minutes). Some tasks enqueue other tasks, in a fan-out.

I'm 99% sure that I'm hitting some kind of hidden rate limit from time to time. The behaviour I'm seeing is that my queue suddenly stops processing any tasks for a few minutes, sometimes as long as ten minutes. Then it picks up and continues as if nothing happened.

Is there some kind of rate limit under the covers that I'm hitting?  

--
Emlyn

https://medium.com/the-infinite-machine - A publication about Google App Engine
sutllang.com - My language sUTL
https://plus.google.com/u/0/100281903174934656260 - Google+

Attila-Mihaly Balazs

Jun 14, 2017, 7:55:01 AM
to Google App Engine
There seems to be some kind of "safety throttling" with the queues. Whenever I hit it, there is a yellow warning sign next to the queue in the cloud console: https://console.cloud.google.com/appengine/taskqueues and the tooltip says something about the queue being throttled to avoid impacting other customers.

So perhaps you could check your cloud console and see if there are any warnings whenever the queue seems to be throttled?

Attila

Emlyn

Jun 15, 2017, 1:31:54 AM
to google-a...@googlegroups.com
I don't see any yellow tooltips. 


A bit more info: I'm not sure, but I think it applies to certain tasks, not the entire queue. So maybe under load some rare tasks get into a temporary error/stuck state?

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengine+unsubscribe@googlegroups.com.
To post to this group, send email to google-appengine@googlegroups.com.
Visit this group at https://groups.google.com/group/google-appengine.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/a0cff223-8a7d-4de3-bb4e-28db0d0ad17a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jordan (Cloud Platform Support)

Jun 15, 2017, 4:51:45 PM
to Google App Engine
The limits on Task Queue calls can be found on the Quota page; the 'Queue execution rate' limit is the important one here.

Too many tasks executing on a single queue will inevitably cause underlying contention. This slows task execution (via exponential-backoff retries) while tasks wait for resources to handle them. It is recommended to shard your tasks across multiple queues instead to get around this limit. You can also tweak the 'max_concurrent_requests' setting in your queue.yaml to prevent any single queue from hitting this limit.

Additionally, it is always good to check your logs for errors during the gap, as you could be hitting other quota limits (which would require time to refill).
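
For anyone wanting to try the sharding suggestion above, here is a minimal sketch in Python. It is only an illustration: the queue names ("fanout-0", "fanout-1", ...), the helper names, and the `enqueue` callback are all hypothetical, and each queue would still have to be declared in queue.yaml. On App Engine standard the callback would typically wrap something like `taskqueue.add(queue_name=..., url=..., params=...)`; it is injected here so the sketch runs anywhere.

```python
import hashlib


def pick_queue(task_key, num_shards, prefix="fanout"):
    """Map a task key to one of num_shards queue names, stably.

    The "fanout-N" naming scheme is an assumption for illustration.
    """
    digest = hashlib.sha256(task_key.encode("utf-8")).hexdigest()
    return "%s-%d" % (prefix, int(digest, 16) % num_shards)


def enqueue_sharded(task_keys, num_shards, enqueue):
    """Spread tasks over several queues instead of a single one.

    `enqueue(queue_name, task_key)` is a hypothetical callback that
    performs the real enqueue call. Returns how many tasks went to
    each queue.
    """
    counts = {}
    for key in task_keys:
        queue_name = pick_queue(key, num_shards)
        enqueue(queue_name, key)
        counts[queue_name] = counts.get(queue_name, 0) + 1
    return counts


if __name__ == "__main__":
    sent = []
    keys = ["task-%d" % i for i in range(1000)]
    per_queue = enqueue_sharded(keys, 4, lambda q, k: sent.append((q, k)))
    print(per_queue)
```

Hashing the key (rather than picking a queue at random) keeps a given task on the same queue across retries of the enqueueing code, which can make debugging easier; a random choice would spread load just as well.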

Emlyn

Jun 15, 2017, 8:58:32 PM
to google-a...@googlegroups.com
Thanks for that feedback Jordan, that's potentially really useful.

So maybe you are proposing that something like this is happening:

- I enqueue a massive amount of tasks very quickly (and those tasks enqueue more tasks etc).
- Inevitably, there is less capacity available (instances) than necessary. So some tasks get *individually* delayed (they fail to be handled, and go into exponential backoff).
- AppEngine keeps spinning up instances until the contention goes away
- But, some tasks may have failed to be scheduled enough times in the interim that they now appear "stuck" for a little while. What's actually going on is that they've been "backed off", and have a timestamp in the future when they will try to run.

So, it'll look like some tasks are getting stuck.

Is that right?

Further, and crucially to me, is this invisible via the console? i.e., would these tasks be sitting there and not running, with no visible evidence of why? I'm asking because this doesn't seem to affect the task's ETA, for instance (it might say something like "8 mins ago").
 




Jordan (Cloud Platform Support)

Jun 16, 2017, 11:54:54 AM
to google-a...@googlegroups.com
You are exactly correct in your interpretation.  

To clarify 'ETA': this is an option that can be specified when you add a task to a queue. If an ETA is not provided, it is set to 'now'. The ETA designates the absolute earliest time a task should run, forcing workers to wait until at least the ETA before leasing the task. It is not a dynamic, up-to-date estimate, but an option configured by the requester. This is why you would see "8 mins ago" if the task still hasn't been leased by a worker 8 minutes past its ETA.
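
To make the "8 mins ago" display concrete, here is a small Python sketch of how a fixed ETA reads once the clock has moved past it. The function name and the exact wording of the output are assumptions for illustration, not the console's actual formatting code; the point is only that the ETA stays put while time moves on.

```python
from datetime import datetime, timedelta


def describe_eta(eta, now):
    """Render a fixed ETA relative to the current time.

    The ETA is set once at enqueue time, so a task that is still
    waiting shows an ETA that drifts further into the past.
    """
    delta = now - eta
    minutes = int(abs(delta.total_seconds()) // 60)
    if delta >= timedelta(0):
        return "%d mins ago" % minutes
    return "in %d mins" % minutes


if __name__ == "__main__":
    enqueued_eta = datetime(2017, 6, 16, 11, 46)
    print(describe_eta(enqueued_eta, datetime(2017, 6, 16, 11, 54)))
```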

If you are looking for more alerting or monitoring for task queues, you can describe the information you would like to see in a Feature Request.

 

Attila-Mihaly Balazs

Jun 27, 2017, 1:03:51 AM
to Google App Engine
Just a quick note (and I'm sure that you're already aware of this, but for the benefit of others reading this): when "some tasks ... have failed [to be scheduled]" the task queue uses an exponential backoff algorithm, so the failed tasks will be delayed. See the configuration options in queue.yaml (https://cloud.google.com/appengine/docs/standard/python/config/queueref) - especially the "retry_parameters" and their default values.
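
To get a feel for how long a backed-off task can sit idle, here is a rough Python model of the retry schedule. It is a simplified sketch, not the scheduler's actual code: it assumes the interval starts at `min_backoff_seconds`, doubles up to `max_doublings` times, then grows linearly by `2**max_doublings * min_backoff_seconds`, and is capped at `max_backoff_seconds`. The parameter values in the usage example are illustrative; check the queue.yaml reference linked above for the real defaults.

```python
def backoff_schedule(retries, min_backoff, max_backoff, max_doublings):
    """Return the modelled delay (in seconds) before each retry.

    Simplified model: double the interval max_doublings times, then
    increase it by a constant step, never exceeding max_backoff.
    """
    delays = []
    interval = min_backoff
    doublings = 0
    for _ in range(retries):
        delays.append(min(interval, max_backoff))
        if doublings < max_doublings:
            interval *= 2
            doublings += 1
        else:
            # After the last doubling the interval grows linearly.
            interval += min_backoff * (2 ** max_doublings)
    return delays


if __name__ == "__main__":
    schedule = backoff_schedule(7, min_backoff=10, max_backoff=200,
                                max_doublings=3)
    print(schedule)       # [10, 20, 40, 80, 160, 200, 200]
    print(sum(schedule))  # total seconds spent waiting across 7 retries
```

Under this model a task that fails a handful of times in a row during the initial burst can end up waiting several minutes before its next attempt, which would look like the "stuck" tasks described earlier in the thread.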

If you browse your queue in the cloud console, you can see how many (if any) times a task has been retried. You can also check this from within the task by looking at the "X-AppEngine-TaskRetryCount" header.
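
As a sketch of the header check, the Python snippet below reads the retry count from a request's headers. It takes a plain dict so it can run outside App Engine; the function name and the warning threshold are hypothetical, and a real handler would pass in its framework's headers object (which is usually case-insensitive, unlike a plain dict).

```python
import logging

RETRY_HEADER = "X-AppEngine-TaskRetryCount"


def task_retry_count(headers):
    """Return the task's retry count, or None if it is absent or malformed.

    The header is only expected on requests dispatched by the task queue.
    """
    raw = headers.get(RETRY_HEADER)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        return None


def log_if_backed_off(headers, threshold=3):
    """Log a warning when a task has been retried at least `threshold` times.

    The threshold of 3 is an arbitrary value chosen for illustration.
    """
    count = task_retry_count(headers)
    if count is not None and count >= threshold:
        logging.warning("task retried %d times; it is likely backed off", count)
        return True
    return False


if __name__ == "__main__":
    print(task_retry_count({"X-AppEngine-TaskRetryCount": "5"}))
    print(log_if_backed_off({"X-AppEngine-TaskRetryCount": "5"}))
```

Logging the count from inside the task gives a record in the request logs of which tasks were retried, which helps when the console view alone does not explain a gap.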

Cheers,
Attila