How to set up task queues to prevent bursts?


Pol

Oct 31, 2011, 12:13:39 AM
to Google App Engine
Hi,

Through a task queue, we manage calls from GAE to an external system,
which is fully-scalable but takes minutes to do so. The task queue
rate is therefore set to 200/s.

So what happens is that sometimes we get bursts of activity, and the
task queue sends 200 requests at once: the vast majority fail in the
external system, as it can't handle this sudden increase in load. But
the external system doesn't start scaling up, because no more requests
are coming and it scales based on CPU load. Then suddenly the task
queue retries with another burst of requests, again the majority
fails, and so on.

So how can we configure a task queue to have a very high rate *but*
prevent it from ramping up to that rate too fast? I noticed the bucket
size parameter, but I'm not sure how to use it properly.

Thanks,

- Pol

Nicholas Verne

Oct 31, 2011, 12:22:27 AM
to google-a...@googlegroups.com
If your external system is called synchronously by your tasks, you could try setting the queue's max_concurrent_requests parameter in queue.yaml/xml.

This is documented for Java, but the same general rules apply for Python or Go applications.
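For example, something along these lines (queue name and cap value are illustrative):

```yaml
# queue.yaml - illustrative fragment; the cap value is just an example
queue:
- name: external-calls
  rate: 200/s
  max_concurrent_requests: 50  # at most 50 tasks in flight at once
```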


Nick Verne







Pol

Oct 31, 2011, 1:07:13 AM
to Google App Engine
Sorry, I realize I didn't explain everything: since the external
system can handle an arbitrary load (as long as that load doesn't grow
too fast) and requests get processed in 1 to 2 seconds, we already set
the processing rate to the maximum (200/s), as well as max concurrent
requests (200).

The problem is not limiting the number of requests being handled, but
preventing bursts, i.e. scaling up smoothly.

On Oct 30, 9:22 pm, Nicholas Verne <nve...@google.com> wrote:
> If your external system is called synchronously by your tasks, you could
> try setting the queue's max_concurrent_requests parameter in queue.yaml/xml
>
> This is documented for Java, but the same general rules apply for Python or
> Go applications. http://code.google.com/appengine/docs/java/config/queue.html#Defining...

Jeff Schnitzer

Oct 31, 2011, 1:13:38 AM
to google-a...@googlegroups.com
Interesting problem. While the sudden load may be undesirable, it seems the real problem is that the task queue backoff is too aggressive - if it kept trying, it would eventually spin up enough hardware at the external system.

You can configure the retry schedule explicitly - maybe try setting it so that tasks are retried more often? You'll still get the initial errors, but at least the queues will clear eventually.
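Something like this in queue.yaml might do it (queue name and values are just examples):

```yaml
# queue.yaml - illustrative fragment: keep retries frequent instead of
# letting the backoff keep doubling (values are made up)
queue:
- name: external-calls
  rate: 200/s
  retry_parameters:
    min_backoff_seconds: 2
    max_backoff_seconds: 10  # never wait more than ~10s between retries
    max_doublings: 2
```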

Jeff

Pol

Oct 31, 2011, 1:21:14 AM
to Google App Engine
Good point, forcing the tasks to retry every few seconds instead of at
an increasing backoff might do it. Keep pounding the system to force
it to scale - but it's not very elegant :)

My original idea was to play with the "bucket size" parameter for the
queue, as the docs seem to imply it controls bursts, but it's not
clear at all. I reduced it to 1, but I don't understand what that
means in practice when max rate and max concurrency are both 200.
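My rough understanding of the docs (which may well be wrong): the queue behaves like a token bucket that refills at `rate` tokens per second up to `bucket_size`, and each task dispatch spends one token, so a small bucket would space dispatches out even at a high rate. A toy simulation of that reading:

```python
def dispatch_times(num_tasks, rate, bucket_size):
    """Toy token-bucket model (my reading of the docs, not the actual
    scheduler): tokens refill at `rate` per second up to `bucket_size`,
    and dispatching one task costs one token."""
    tokens = float(bucket_size)
    t = 0.0
    times = []
    for _ in range(num_tasks):
        if tokens < 1.0:
            t += (1.0 - tokens) / rate  # wait until a full token accrues
            tokens = 1.0
        times.append(round(t, 6))
        tokens -= 1.0
    return times

# With bucket_size=200 all 200 tasks dispatch at t=0 (the burst we see);
# with bucket_size=1 they're spaced 1/200 s apart despite the 200/s rate.
```

If that model is right, bucket_size=1 would be exactly the smoothing we want, at the cost of never allowing any burst at all.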

Dale

Oct 31, 2011, 4:11:02 PM
to Google App Engine
What about using pull queues? Your external system could poll the
queues for work, and scale based on the number of items in the queue.
That would give you the most control. If you didn't want to poll, you
could set up a push notification to your external system, telling it
that there is new work available.
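The scaling decision on the external side could then be driven by the backlog instead of CPU load. A sketch (the function and throughput numbers are made up; `queue_depth` would come from leasing or queue-stats calls):

```python
import math

def workers_needed(queue_depth, tasks_per_worker_per_min, max_workers):
    """Size the worker pool from the backlog, so capacity ramps with the
    queue instead of lagging behind CPU load. All numbers illustrative."""
    if queue_depth <= 0:
        return 0
    wanted = int(math.ceil(queue_depth / float(tasks_per_worker_per_min)))
    return min(wanted, max_workers)
```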