Task Queue on backend runs much slower than it's configured to

160 views
Skip to first unread message

pdknsk

unread,
Aug 15, 2011, 7:22:55 PM8/15/11
to Google App Engine
I've got a dynamic backend just for sending mail (billing enabled).
Mails are enqueued elsewhere, not on this backend, practically
instantly. The queue is configured like this.

- name: mail
rate: 360/m
bucket_size: 40
retry_parameters:
task_age_limit: 1d
min_backoff_seconds: 20

And the dashboard figures it out like this.

mail 360/m 0.75/s 40.0

Slower than configured at 0.75/s enforced rate, but still acceptable,
because mails are send much slower.

2011-08-15 16:04:28.329 /mail/ 200 94ms 0cpu_ms 0kb instance=0
2011-08-15 16:04:08.254 /mail/ 200 26ms 23cpu_ms 0kb instance=0
2011-08-15 16:03:48.250 /mail/ 200 30ms 23cpu_ms 0kb instance=0
2011-08-15 16:03:28.258 /mail/ 200 44ms 0cpu_ms 0kb instance=0
2011-08-15 16:02:30.802 /mail/ 200 317ms 0cpu_ms 0kb instance=0
2011-08-15 16:02:30.770 /_ah/start 404 89ms 70cpu_ms 0kb instance=0

The previous run worked better, but you'll still notice the huge delay
between the first and second mail.

2011-08-15 13:02:27.016 /mail/ 200 53ms 0cpu_ms 0kb instance=0
2011-08-15 13:02:26.676 /mail/ 200 45ms 0cpu_ms 0kb instance=0
2011-08-15 13:02:26.545 /mail/ 200 248ms 23cpu_ms 0kb instance=0
2011-08-15 13:02:25.833 /mail/ 200 217ms 0cpu_ms 0kb instance=0
2011-08-15 13:01:50.969 /mail/ 200 347ms 0cpu_ms 0kb instance=0
2011-08-15 13:01:50.931 /_ah/start 404 200ms 70cpu_ms 0kb instance=0

What could be the reason? There are no errors on the backend in the
past 7 days.

pdknsk

unread,
Aug 15, 2011, 8:03:52 PM8/15/11
to Google App Engine
And the next run.

2011-08-15 17:02:28.509 /mail/ 200 32ms 23cpu_ms 0kb instance=0
2011-08-15 17:02:25.841 /mail/ 200 28ms 0cpu_ms 0kb instance=0
2011-08-15 17:02:23.184 /mail/ 200 38ms 0cpu_ms 0kb instance=0
2011-08-15 17:02:20.505 /mail/ 200 124ms 0cpu_ms 0kb instance=0
2011-08-15 17:01:46.807 /mail/ 200 405ms 0cpu_ms 0kb instance=0
2011-08-15 17:01:46.778 /_ah/start 404 83ms 70cpu_ms 0kb instance=0

pdknsk

unread,
Aug 15, 2011, 8:12:04 PM8/15/11
to Google App Engine
Sorry, I've figured out the delay between the first and second mail.
Both get enqueued from different functions, which can run about 30
seconds apart. Haven't considered that. The relatively low rate from
the latest run and the very low rate from the first run is an actual
problem though.

pdknsk

unread,
Aug 15, 2011, 9:07:02 PM8/15/11
to Google App Engine
Just for good measure, the next run. I'll leave it at that. Not as
slow, but clearly much slower than it could and should.

2011-08-15 18:02:18.611 /mail/ 200 141ms 0cpu_ms 0kb instance=0
2011-08-15 18:02:15.834 /mail/ 200 24ms 0cpu_ms 0kb instance=0
2011-08-15 18:02:13.164 /mail/ 200 26ms 0cpu_ms 0kb instance=0
2011-08-15 18:02:10.790 /mail/ 200 320ms 0cpu_ms 0kb instance=0
2011-08-15 18:02:08.009 /mail/ 200 217ms 23cpu_ms 0kb instance=0
2011-08-15 18:02:07.980 /_ah/start 404 76ms 46cpu_ms 0kb instance=0

Nick Johnson

unread,
Aug 16, 2011, 2:22:08 AM8/16/11
to google-a...@googlegroups.com
Your task queue can only send tasks to your backend as fast as the backend can execute them, and unlike a task queue, it can't dynamically spin up more instances to handle the load. Setting your rate to 360/m will only work if your backend can handle 6 requests a second (eg, latency less than 166ms).

Why aren't you just using a task queue for this?

-Nick Johnson


--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.




--
Nick Johnson, Developer Programs Engineer, App Engine


pdknsk

unread,
Aug 17, 2011, 12:08:15 PM8/17/11
to Google App Engine
It's currently mostly for organisational purposes, to have separate
logs and to notice spikes in the dashboard graph better. The backend
can probably handle 360/m because as you notice from the logs it takes
much less than 100ms to send the mail out. The problem though is that
it sends out much much slower by setting a low enforced rate even
though there were no errors on the backend in the past 7 days.

And in a related matter: I've uploaded new code yesterday and there
was a syntax error which I didn't notice. The backend got constant
start requests at a very fast rate.

2011-08-17 08:46:32.180 /_ah/start 500 70ms 70cpu_ms 0kb instance=0
2011-08-17 08:46:32.076 /_ah/start 500 71ms 70cpu_ms 0kb instance=0
2011-08-17 08:46:31.966 /_ah/start 500 66ms 46cpu_ms 0kb instance=0
2011-08-17 08:46:31.864 /_ah/start 500 69ms 46cpu_ms 0kb instance=0
2011-08-17 08:46:31.766 /_ah/start 500 77ms 46cpu_ms 0kb instance=0

Maybe this should be throttled?

And once the error was fixed and the pending mails were send out, the
backend set a very low rate of about one mail every 30 seconds. This
seems like a mistake to me because it wasn't the task (or class) that
failed, which justifies to progressively lower the task rate as it's
currently done, but rather the backend itself.

pdknsk

unread,
Aug 17, 2011, 12:32:11 PM8/17/11
to Google App Engine
Actually I just checked the dashboard and noticed that the backend got
52712 failed start requests in about 20 hours.

pdknsk

unread,
Aug 18, 2011, 4:24:02 AM8/18/11
to Google App Engine
It seems like the enforced rate situation has improved.

2011-08-17 23:02:10.404 /mail/ 200 21ms 0cpu_ms 0kb instance=0
2011-08-17 23:02:10.270 /mail/ 200 221ms 0cpu_ms 0kb instance=0
2011-08-17 23:02:09.739 /mail/ 200 24ms 0cpu_ms 0kb instance=0
2011-08-17 23:02:09.407 /mail/ 200 26ms 0cpu_ms 0kb instance=0
2011-08-17 23:02:09.141 /mail/ 200 439ms 0cpu_ms 0kb instance=0
2011-08-17 23:02:09.117 /_ah/start 404 303ms 93cpu_ms 0kb instance=0

2011-08-17 22:02:27.251 /mail/ 200 26ms 0cpu_ms 0kb instance=0
2011-08-17 22:02:26.919 /mail/ 200 28ms 0cpu_ms 0kb instance=0
2011-08-17 22:02:26.638 /mail/ 200 80ms 0cpu_ms 0kb instance=0
2011-08-17 22:02:26.258 /mail/ 200 26ms 0cpu_ms 0kb instance=0
2011-08-17 22:02:26.108 /mail/ 200 227ms 23cpu_ms 0kb instance=0
2011-08-17 22:02:26.081 /_ah/start 404 91ms 46cpu_ms 0kb instance=0

Could it be that factors outside of the app, like high server load,
can influence task rate on backends?

pdknsk

unread,
Aug 22, 2011, 5:27:07 PM8/22/11
to Google App Engine
Any further insight from Googlers on this? It mostly settles at
several mails per second now, which is good enough albeit less than
configured, but there are occasionally still huge delays without any
apparent reason.

2011-08-22 01:03:57.052 /mail/ 200 92ms 0cpu_ms 0kb instance=0
2011-08-22 01:03:36.984 /mail/ 200 29ms 23cpu_ms 0kb instance=0
2011-08-22 01:03:07.315 /mail/ 200 38ms 0cpu_ms 0kb instance=0
2011-08-22 01:02:07.805 /mail/ 200 539ms 0cpu_ms 0kb instance=0
2011-08-22 01:02:07.028 /mail/ 200 95ms 23cpu_ms 0kb instance=0

I've filed a bug report on the other problem.

http://code.google.com/p/googleappengine/issues/detail?id=5669

pdknsk

unread,
Aug 23, 2011, 4:08:05 AM8/23/11
to Google App Engine
It can't be a coincidence that it sends one mail every 20 seconds,
which happens to match min_backoff_seconds? It's not the first time I
noticed this either.

2011-08-23 01:03:14.844 /mail/ 200 28ms 0cpu_ms 0kb instance=0
2011-08-23 01:02:54.841 /mail/ 200 31ms 23cpu_ms 0kb instance=0
2011-08-23 01:02:34.832 /mail/ 200 28ms 23cpu_ms 0kb instance=0
2011-08-23 01:02:14.826 /mail/ 200 28ms 0cpu_ms 0kb instance=0
2011-08-23 01:02:03.220 /mail/ 200 377ms 0cpu_ms 0kb instance=0
2011-08-23 01:02:03.188 /_ah/start 404 97ms 46cpu_ms 0kb instance=0

I checked enforced rate and it's at 1.5/s in the dashboard so it
should send much faster.

Robert Kluin

unread,
Aug 23, 2011, 10:21:55 AM8/23/11
to google-a...@googlegroups.com
The problem is that your backend is not executing requests fast enough, right?

Have you filed a production issue ticket?

Robert

pdknsk

unread,
Aug 26, 2011, 11:50:57 AM8/26/11
to Google App Engine
I haven't, but maybe I should.

I reduced the task rate to 120/m to put less pressure on the task
queue (or however this might work) - to at least reliably get this
lower rate, but it thanked me by setting the enforced rate to 0.5/
s! :) So that didn't work.

I've been wondering: maybe the backend doesn't like frequent
deployments? I haven't made any code changes to this backend, but my
deployment script updates frontend and backends automatically.

Robert Kluin

unread,
Aug 28, 2011, 2:17:46 AM8/28/11
to google-a...@googlegroups.com
If you file a production issue I'd like to know the issue number. I'd really like to know the cause and resolution of this issue.

Robert
> --
> You received this message because you are subscribed to the Google Groups "Google App Engine" group.
> To post to this group, send email to google-a...@googlegroups.com.
> To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
>
>

--
------
Robert Kluin
Ezox Systems, LLC




pdknsk

unread,
Aug 31, 2011, 6:10:38 AM8/31/11
to Google App Engine
I haven't been able to figure out what the reason is. It seems to have
improved slightly to about 3/s but it still had 20 seconds delays
occasionally for no apparent reason. I've since moved mail sending to
a named version (rather than backend), and it works great. It can
easily send 6/s, with one instance.

Bogdan Nourescu

unread,
Jul 24, 2012, 4:07:31 PM7/24/12
to google-a...@googlegroups.com, pdk...@googlemail.com
I've designed a mail sender using java and front-end instances that can send ~70-80 mails /second  (i sometimes over-hit that 4900 mails/minute that is enforce by Google)(Using 5 front-end instances and thread-safe activated)
If you are not send mails single thread i suggest you read about another but I've encountered (http://code.google.com/p/googleappengine/issues/detail?id=5769 - appears when you try to send 2 mails in the exact same millisecond or something like that)
I've seen that 20 seconds delay in some of my tasks execution, but only in queues that have a low number of tasks in them (1-5).

Jeff Schnitzer

unread,
Jul 24, 2012, 4:52:03 PM7/24/12
to google-a...@googlegroups.com
This sounds related to a thread I started a couple weeks ago:

https://groups.google.com/d/topic/google-appengine/rAmZi6a8ZaI/discussion

If I understand my results correctly, every request to a backend has
an extra 100+ms added to it in transit (before your code starts
executing). It could actually be considerably worse; doing a urlfetch
from a frontend to a backend (no-op) typically takes hundreds of
milliseconds. Combined with doing actual work per request, this could
explain low throughput numbers.

It would make sense that the task queue is overloading the backend and
therefore failing a lot of requests. These failures will cause the
task queue to back off, producing the longer delays you see.

This really isn't a good application for a backend. If you want to
split logs out, put your processing on a separate frontend version.
Alternatively, you could convert your backend push-queue to a
pull-queue... but it really sounds like a job for a properly scaled
and load-balanced frontend. If you are worried about spinning up too
many instances, throttle the queue with max-concurrent-requests.

Jeff
Reply all
Reply to author
Forward
0 new messages