2nd Gen GAE Python 3.7 memory usage


Adam Lugowski

Sep 14, 2018, 6:38:26 PM
to Google App Engine
I have an app that works well in the python37 runtime.

However I keep running into memory usage problems. When I profiled my app running locally it uses about 115 MB and usage remains flat when processing requests. On the F1 instance I can understand that individual requests could bump over 128 MB. However, going up instance tiers doesn't help. Even F4 runs out of memory at nearly the same rate as F1, even though local profiling shows memory usage should never be anywhere near as high as AppEngine reports in its log message.

I have two questions:
 - How do I profile my AppEngine app so I can find out what is eating memory? Local profiling apparently has little relation to what happens on AppEngine.
 - Why does using a larger instance not help?

Adam Lugowski

Sep 14, 2018, 6:47:41 PM
to Google App Engine
The only theory I have is informed by these log messages on startup:

A  [2018-09-14 19:28:38 +0000] [1] [INFO] Starting gunicorn 19.9.0
A  [2018-09-14 19:28:38 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
A  [2018-09-14 19:28:38 +0000] [1] [INFO] Using worker: threads
A  [2018-09-14 19:28:38 +0000] [8] [INFO] Booting worker with pid: 8
A  [2018-09-14 19:28:38 +0000] [9] [INFO] Booting worker with pid: 9
A  [2018-09-14 19:28:38 +0000] [10] [INFO] Booting worker with pid: 10
A  [2018-09-14 19:28:38 +0000] [11] [INFO] Booting worker with pid: 11
A  [2018-09-14 19:28:38 +0000] [12] [INFO] Booting worker with pid: 12
A  [2018-09-14 19:28:38 +0000] [13] [INFO] Booting worker with pid: 13
A  [2018-09-14 19:28:39 +0000] [14] [INFO] Booting worker with pid: 14
A  [2018-09-14 19:28:39 +0000] [15] [INFO] Booting worker with pid: 15



It looks like multiple workers are started, so maybe the fixed memory overhead is just duplicated by each worker.
This is also supported by the fact that some of my requests load matplotlib. Locally this makes the first request slow and subsequent ones fast. On GAE F4 the first few are slow, then it stays fast, as if multiple processes have to be warmed up.
If this is true, then going to a larger instance is poor advice; doubling memory size also doubles the CPU count, which doubles the fixed overhead, and we're back at square one.
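
A rough back-of-the-envelope version of this theory, as a sketch only. It assumes each worker holds a private copy of the roughly 115 MB measured locally and that nothing is shared between worker processes; both are assumptions, and the function name is hypothetical.

```python
def estimated_total_mb(workers: int, per_worker_mb: float) -> float:
    """Worst-case estimate: every worker holds a private copy of the baseline."""
    return workers * per_worker_mb


# 115 MB is the locally profiled figure; 8 is the number of workers
# booted in the startup log above.
print(estimated_total_mb(8, 115))
```

If the estimate is anywhere near right, the total would exceed the quota of any instance class whose worker count scales with its memory.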

rah...@google.com

Sep 16, 2018, 9:59:38 AM
to Google App Engine
Hello Adam,
  Could you paste your app.yaml? Are you explicitly configuring the number of workers?
Thanks,
~Rahul.

Adam Lugowski

Sep 17, 2018, 1:30:06 PM
to Google App Engine
Hi Rahul,

I'm not manually configuring workers. Happy to learn of how to do that. My app.yaml is simple:

runtime: python37
service: xyz

inbound_services:
- warmup

instance_class: F4

handlers:
....

Phillip Pearson

Sep 17, 2018, 4:25:55 PM
to Google App Engine
Hi Adam,

Here are a few ideas for diagnosis:

1. Can you try printing or logging repr(os.environ) and checking for the GAE_MEMORY_MB environment variable?  I just tried deploying a test app with instance_class: F4 and service: xyz, and print(repr(os.environ)) gives me something like this:

  environ({'GAE_MEMORY_MB': '512', 'GAE_INSTANCE': ...etc etc...})

The python37 runtime *should* start one worker if GAE_MEMORY_MB <= 128, two if <= 256, four if <= 512, otherwise eight.

If you see 'GAE_MEMORY_MB': '512' in there but anything other than four workers starting up, I'm extremely interested in figuring out why :)  Likewise if you see anything other than 512 for an F4 instance.

2. Check that you're looking at the logs for the correct service -- make sure you see "GAE Application, xyz" and not "GAE Application, Default Service" or just "GAE Application" (which combines logs from all services) in the logs viewer.

3. Once you've done those two, if nothing sticks out, you can manually configure the worker count by adding a line like this to app.yaml:

  entrypoint: gunicorn main:app --workers 2 -c /config/gunicorn.py

This will start up two workers.  You can also configure the number of threads per worker:

  entrypoint: gunicorn main:app --threads 8 --workers 2 -c /config/gunicorn.py

(The default if you just use 'gunicorn main:app -c /config/gunicorn.py' is four workers with four threads each.)
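
The worker-count rule from step 1 can be sketched as a small function, which may help when comparing the logged GAE_MEMORY_MB against the number of "Booting worker" lines. This is only an illustration of the thresholds quoted above, not the runtime's actual code; the function name and the local fallback of 128 are assumptions.

```python
import os


def expected_workers(memory_mb: int) -> int:
    """Worker count per the rule above: 1 (<=128), 2 (<=256), 4 (<=512), else 8."""
    if memory_mb <= 128:
        return 1
    if memory_mb <= 256:
        return 2
    if memory_mb <= 512:
        return 4
    return 8


# GAE_MEMORY_MB is set by the runtime; the fallback is an assumption
# so the snippet also runs outside App Engine.
memory_mb = int(os.environ.get("GAE_MEMORY_MB", "128"))
print(f"GAE_MEMORY_MB={memory_mb}, expected workers={expected_workers(memory_mb)}")
```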

Cheers,
Phil

Adam Lugowski

Sep 20, 2018, 5:12:15 PM
to Google App Engine
Thanks for the suggestions, Phillip.

Yes, the memory is as expected. This was never in doubt: the out-of-memory message states both the usage and the quota, and the quota matches GAE_MEMORY_MB. The problem is why the usage is so high, not that the quota isn't changing.

Thank you for the command to use. Google's docs didn't make it clear that this is how you control parallelism. After adding gunicorn to requirements.txt, the following now helps:

entrypoint: gunicorn main:app --threads 4 --workers 1 -c /config/gunicorn.py

This solves only half the problem. It no longer crashes an F4, but it still crashes an F2, with 270 MB usage. Locally, usage is still only 115 MB.

Phillip Pearson

Sep 20, 2018, 7:44:46 PM
to google-a...@googlegroups.com
Hi Adam,

How are you measuring memory usage locally?  If you're looking at the resident size (RSS in ps, RES in top), that's only a subset of the actual memory used by your application, so you should expect the memory usage reported in the App Engine console to be higher than what you see locally -- it includes kernel memory specific to your app, mapped file data, and some other bits and pieces.
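
For reference, a minimal sketch of what a process-level tool sees on Linux, read from /proc/self/status. It shows the resident size and, on kernels that expose them, its anonymous and file-backed parts. It cannot reproduce the figure in the App Engine console, and whether these fields are readable inside the App Engine sandbox is an assumption; the function name is hypothetical.

```python
from pathlib import Path


def parse_kb_fields(status_text: str) -> dict:
    """Return every 'Name: <number> kB' field from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        parts = value.split()
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[key] = int(parts[0])
    return fields


status_path = Path("/proc/self/status")  # Linux only
if status_path.exists():
    fields = parse_kb_fields(status_path.read_text())
    # VmRSS is the resident size; RssAnon/RssFile/RssShmem split it by type
    # on kernels that report them (missing fields print as None).
    for name in ("VmRSS", "VmHWM", "RssAnon", "RssFile", "RssShmem"):
        print(name, fields.get(name), "kB")
```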

As for the entrypoint line, we haven't advertised this usage too widely because we hope the defaults will work for most people -- and just in case we come up with a better way to do it :)

Cheers,
Phil


Adam Lugowski

Sep 27, 2018, 1:33:56 PM
to Google App Engine
I'm using a Python profiler that appears to report resident size.

I understand your point that there is more to memory usage than just the app. But I hope you can understand my frustration: AppEngine provides no tooling or metrics to help track down the root of a problem that is not reproducible anywhere else. Maybe Googlers have some internal metrics and/or profiling tools, because clearly the runtime is monitoring memory usage.

Even something as simple as a plot of the app's memory usage over time would be useful. I can make changes to the app, monitor requests, and see which request is causing a spike.
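
One way to approximate this from inside the app is to log the process's peak resident size around each request. The sketch below is a hypothetical WSGI wrapper using only the standard library; the class name, logger name, and `demo_app` stand-in are assumptions, and the number it logs is resident size only, not the figure App Engine enforces.

```python
import logging
import resource

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("memory")


def peak_rss() -> int:
    """Peak resident size of this process (kilobytes on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


class MemoryLoggingMiddleware:
    """Hypothetical WSGI wrapper: logs how much the peak grew during each request."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        before = peak_rss()
        # A streaming response body is produced after this call returns,
        # so memory used while streaming is not captured here.
        response = self.app(environ, start_response)
        after = peak_rss()
        log.info("%s %s peak_rss=%d growth=%d",
                 environ.get("REQUEST_METHOD"), environ.get("PATH_INFO"),
                 after, after - before)
        return response


def demo_app(environ, start_response):
    """Stand-in for the real WSGI app (for example, main:app)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


app = MemoryLoggingMiddleware(demo_app)
```

Because the peak only ever grows, a request with a non-zero growth value is one that pushed the process to a new high. With several threads per worker, concurrent requests share one process, so the attribution is approximate.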

As of now, it looks like I'm just supposed to guess if the issue lies in the app, the kernel, memory mappings, or other bits and pieces.

rah...@google.com

Sep 27, 2018, 1:44:35 PM
to Google App Engine
Hello Adam,
  We have a Stackdriver-based profiler in development (soon to be in alpha) that may help. I have forwarded your thread to that team as input as well.

Until then, could you leverage the memory usage dashboard in the cloud console from App Engine -> Dashboard -> [Select Memory Usage from the dropdown] to help?
Thanks,
~Rahul