Max number of concurrent requests in manual scaling


Jungho Ahn

Aug 3, 2016, 10:52:32 PM
to Google App Engine
Hello,

I'm load testing my App Engine app, but I can't get more than about 8 QPS.
Each request has high latency because it makes external calls, and CPU utilization is very low.
Is there a limit on the maximum number of active requests? The app uses manual scaling.

Thanks,


Adam (Cloud Platform Support)

Aug 5, 2016, 4:30:47 PM
to Google App Engine
What runtime are you using? Are you using the Standard or Flexible environment? Could you post some details from your app.yaml / appengine-web.xml?

Jungho Ahn

Aug 5, 2016, 4:41:21 PM
to google-a...@googlegroups.com
We're running on Flexible. Here is my app.yaml:

service: ranking
runtime: python
runtime_config:
    python_version: 2
vm: true
entrypoint: gunicorn -b :$PORT main:app

resources:
  cpu: 32
  memory_gb: 120

manual_scaling:
  instances: 3

We're testing various gunicorn configurations, such as:

gunicorn -b :$PORT --threads=64 main:app

So far we've reached about 28 QPS at 70% CPU utilization, and we're trying to push higher.
I'm just wondering whether there is a hard limit on in-flight requests.


--
Jungho Ahn

Adam (Cloud Platform Support)

Aug 9, 2016, 4:51:23 PM
to Google App Engine
By default, gunicorn runs a single worker that handles one request at a time. At a minimum, you should specify the number of workers, e.g.:

entrypoint: gunicorn -b :$PORT -w 8 main:app
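
As a rough way to see why the worker count matters, the sketch below applies Little's law (throughput is roughly concurrency divided by latency). The function name and the 0.375 s latency are hypothetical illustration values, not measurements from this thread; only the 3 instances come from the app.yaml posted earlier.

```python
def max_qps(instances, workers_per_instance, avg_latency_s):
    """Upper bound on throughput when each worker serves one request at a time."""
    if avg_latency_s <= 0:
        raise ValueError("avg_latency_s must be positive")
    # Each worker holds at most one in-flight request.
    in_flight = instances * workers_per_instance
    # Little's law: throughput = concurrency / latency.
    return in_flight / avg_latency_s


# Hypothetical average latency of 0.375 s per request (an assumption).
print(max_qps(3, 1, 0.375))  # one worker per instance   -> 8.0
print(max_qps(3, 8, 0.375))  # eight workers per instance -> 64.0
```

If requests spend most of their time waiting on external services, the ceiling is set by how many requests can be in flight at once, not by CPU.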

Note that gunicorn's '--threads' option only affects the gthread worker class, and without '-w' you still have a single worker process. See the gunicorn docs for more information about tuning the number of workers and setting the worker class. They recommend 2-4 workers per core.
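
A minimal sketch of a gunicorn config file that sizes the worker pool from the core count, assuming it is saved as gunicorn.conf.py (a hypothetical filename) next to main.py. The multiplier of 2 is the low end of the 2-4 per core range mentioned above, and the 8080 fallback port is an assumption for local runs; the right values depend on your workload.

```python
# gunicorn.conf.py (hypothetical filename)
import multiprocessing
import os

# Assumption: 2 workers per core, the low end of the 2-4 range.
WORKERS_PER_CORE = 2

# Listen on the port given in $PORT; fall back to 8080 (an assumption) if unset.
bind = "0.0.0.0:" + os.environ.get("PORT", "8080")

# Number of worker processes gunicorn should start.
workers = multiprocessing.cpu_count() * WORKERS_PER_CORE
```

The entrypoint would then load it with the -c option:

entrypoint: gunicorn -c gunicorn.conf.py main:app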
