Python 2.7: Instance memory limitations with concurrent requests

Andrin von Rechenberg

Feb 20, 2012, 5:15:13 AM

to google-a...@googlegroups.com

Hey there,

I'm planning to migrate to Python 2.7. I'm currently running on Python 2.5. There is one kind of request (let's call it people-search) that uses A LOT of RAM. From time to time instances get killed because they exceed the allowed memory allocation. These requests do a lot of RPCs. One people-search uses about 100 MB of RAM.

In Python 2.7 I would guess that one instance will handle many people-searches at the same time, because they mostly wait for RPCs. That means an instance will run out of memory much more often, because every pending request will use 100 MB. Handle 2 at the same time and an F1 instance dies. I'm afraid that I will end up in a crash loop pretty soon.

Am I right in my assumption?

Is there a way to limit the number of people-searches that one instance handles at the same time? Like threadsafe: false for a certain handler?

The only solution I see at the moment is to deploy 2 versions of my app: the main version with threadsafe enabled and a special version with threadsafe disabled. The main app would then not handle people-searches directly, but do a urlfetch to the threadsafe-disabled version for people-searches. This way one instance would only ever do one people-search at a time, because threadsafe is disabled.

But it's soooo ugly from a maintenance point of view. Really ugly.

Any better suggestions?

Cheers,
-Andrin

Johan Euphrosine

Feb 20, 2012, 5:20:57 AM

to google-a...@googlegroups.com

Hi Andrin,

Have you considered using a dynamic B2 backend?

If the request originates from a form, you can even address the backend directly by name with:

backendname.appid.appspot.com

Hope that helps.

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

--
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations

Brandon Wirtz

Feb 20, 2012, 5:23:27 AM

to google-a...@googlegroups.com

The memory limit is soft: you can exceed it for several seconds. I have seen one instance consume 300+ MB of memory in bursts, on an F1.

But you should probably use a queue system with a backend instance, or a deferred task; then you could pace those large-memory requests to avoid consuming lots of RAM at once.
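
The pacing idea can be sketched outside App Engine with the standard library: requests go onto a queue and a single worker drains it, so at most one large-memory job is in flight at a time. This only illustrates the pattern and is not App Engine's task queue or deferred API; `people_search`, `run_paced`, and the job payloads are hypothetical names.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7


def people_search(name):
    # Hypothetical stand-in for the memory-hungry request.
    return "results for %s" % name


def run_paced(jobs):
    """Run jobs one at a time on a single worker thread; return results in order."""
    pending = queue.Queue()
    results = []

    def worker():
        while True:
            job = pending.get()
            if job is None:       # sentinel: no more work
                break
            results.append(people_search(job))

    t = threading.Thread(target=worker)
    t.start()
    for job in jobs:
        pending.put(job)
    pending.put(None)             # tell the worker to stop
    t.join()
    return results
```

Because only the worker calls `people_search`, peak memory is bounded by one search no matter how many jobs are waiting in the queue.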

Brandon Wirtz
BlackWaterOps: President / Lead Mercenary

Andrin von Rechenberg

Feb 20, 2012, 5:25:20 AM

to google-a...@googlegroups.com

I guess that's the same solution as just deploying two different versions: a threadsafe one and a non-threadsafe one. Or did I misunderstand you?

My main concern is having two versions running.

@Brandon: Do you know if an instance is killed while handling requests or afterwards? And what happens if there are concurrent requests?

Cheers,

-Andrin

Johan Euphrosine

Feb 20, 2012, 5:33:20 AM

to google-a...@googlegroups.com

On Mon, Feb 20, 2012 at 11:25 AM, Andrin von Rechenberg <and...@miumeet.com> wrote:

> I guess that's the same solution as just deploying two different
> versions. A threadsafe one and a non threadsafe one. Or did
> I misunderstand you?

appcfg.py provides commands to help you manage your backends' deployment and configuration, with `backends.yaml` and `appcfg backends update`.

That makes the solution more convenient than using multiple versions on frontend instances.

Robert Kluin

Feb 21, 2012, 2:37:42 AM

to google-a...@googlegroups.com

Hey Andrin,
If you hit a hard limit it will be killed immediately, no exception,
no nothing, just dead. If you hit a soft limit it will be killed
after the request finishes. You'll have no way within the app to know
that either case happened.

I view this as a very serious (security) problem within tasks, for
some more info:
http://code.google.com/p/googleappengine/issues/detail?id=6616

Also, please star that. ;)

Robert

Jeff Schnitzer

Feb 21, 2012, 9:36:11 AM

to google-a...@googlegroups.com

On Mon, Feb 20, 2012 at 5:33 AM, Johan Euphrosine <pro...@google.com> wrote:

> appcfg.py provides commands to help you manage your backends' deployment and configuration, with `backends.yaml` and `appcfg backends update`.
>
> That makes the solution more convenient than using multiple versions on frontend instances.

This adds the significant downside of no auto-scaling. I don't see how that's more convenient :-)

Andrin: In javaland I would simply synchronize the people-search function so that at most one thread can execute that routine in a single instance at once. It means all people-search requests in that instance will queue up in serial, which could cause undesirable waits if the function takes significant time, but if the requests are spread out among enough instances it probably won't be an issue.

I don't know what the python equivalent of 'synchronized' is.

Jeff

Johan Euphrosine

Feb 21, 2012, 9:57:21 AM

to google-a...@googlegroups.com

On Tue, Feb 21, 2012 at 3:36 PM, Jeff Schnitzer <je...@infohazard.org> wrote:

> This adds the significant downside of no auto-scaling. I don't see how that's more convenient :-)

Very true, I guess I just wanted to point out that the tooling makes it more convenient to deploy multiple backends than different frontend versions (feature request anyone?).

> I don't know what the python equivalent of 'synchronized' is.

@synchronized?

http://wiki.python.org/moin/PythonDecoratorLibrary#Synchronization
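
A minimal sketch of such a decorator, using only `threading.Lock` from the standard library. This is a simplified variant written for illustration, not the recipe from the linked page, and `people_search` is a hypothetical placeholder for the real handler logic.

```python
import functools
import threading


def synchronized(func):
    """Allow only one thread at a time to run func (one lock per function)."""
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:                # other threads block here until the lock is free
            return func(*args, **kwargs)

    return wrapper


@synchronized
def people_search(query):
    # Hypothetical placeholder for the real, memory-heavy search.
    return "results for %s" % query
```

The lock lives in the process, so it serializes people-searches within one instance only; different instances still run their searches independently.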

Robert Kluin

Feb 21, 2012, 11:10:16 AM

to google-a...@googlegroups.com

To be clear, the scheduler can still dispatch multiple requests for
that url to the same instance. Only one request will execute at a
time though. One of the biggest issues is that the Python runtime
leaks memory like mad, so in this case you may wind up with 1)
increased latency and 2) still blowing mem limits fast.
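
The latency effect can be illustrated with a small standard-library sketch: a semaphore caps how many searches run at once, but callers that arrive in the meantime still wait inside the process, and that wait is added to their response time. `MAX_SEARCHES`, `limited_search`, `simulate`, and the sleep standing in for real work are all assumptions made for the illustration.

```python
import threading
import time

MAX_SEARCHES = 1                      # assumed per-instance cap
_slots = threading.BoundedSemaphore(MAX_SEARCHES)


def limited_search(work_seconds):
    """Run one simulated search; return how long the caller waited for a slot."""
    arrived = time.time()
    with _slots:                      # blocks while all slots are taken
        waited = time.time() - arrived
        time.sleep(work_seconds)      # stand-in for the real work
    return waited


def simulate(n_requests, work_seconds):
    """Fire n_requests at once and collect each request's wait time, sorted."""
    waits = []
    guard = threading.Lock()

    def request():
        w = limited_search(work_seconds)
        with guard:
            waits.append(w)

    threads = [threading.Thread(target=request) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(waits)
```

With a cap of 1, the last of several simultaneous requests waits roughly the combined run time of all the requests ahead of it.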

Robert

alex

Feb 21, 2012, 11:33:36 AM

to google-a...@googlegroups.com

> One of the biggest issues is that the Python runtime leaks memory like mad

Could you elaborate on this, or give some references that support your claim (provided the app code is written according to best practices, etc.)?

Mike Wesner

Feb 21, 2012, 3:22:00 PM

to Google App Engine

I would love to learn more from Google on Python memory handling. As
Robert mentioned, we have observed that memory is not released/garbage
collected on Python 2.5 instances. It seems to just hold on to it.
This works because the instances don't live forever and eventually get
shut down and new ones take their place.

-Mike

alex

Feb 21, 2012, 4:13:15 PM

to google-a...@googlegroups.com

The whole point of this topic was the Python 2.7 runtime, multithreading and concurrent requests. Your specific case, on Python 2.5, doesn't necessarily mean there are memory leaks in the runtime itself. I'd start by profiling the code that handles the most frequently accessed URLs.

Mike Wesner

Feb 21, 2012, 5:50:37 PM

to Google App Engine

I know that, but I thought it might still be relevant.

Brian Quinlan

Feb 21, 2012, 6:42:13 PM

to google-a...@googlegroups.com

Hi Mike,

On Wed, Feb 22, 2012 at 7:22 AM, Mike Wesner <mike....@webfilings.com> wrote:
> I would love to learn more from google on python memory handling. As
> Robert mentioned, we have observed that memory is not released/garbage
> collected on python 2.5 instances. It seems to just hold on to it.
> This works because the instances don't live forever and eventually get
> shutdown and new ones take their place.

We haven't changed Python's built-in memory management in either of
our Python runtimes.

Note that, even excluding cyclic garbage collection, Python and the C
allocator do not always release memory eagerly - they maintain their
own memory pools for performance reasons.

So, for example, you can't assume that an instance consuming 100MB of
memory only has 28MB remaining to handle requests.
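
A rough way to observe this locally on Linux is to read the process's resident set size before allocating, while holding, and after freeing a large number of small objects. This is a sketch for experimentation on a development machine, not something known to work inside the App Engine sandbox; `rss_bytes` and `measure` are hypothetical helper names, and the `/proc/self/statm` file is Linux-specific.

```python
import gc
import os


def rss_bytes():
    """Current resident set size in bytes, or None where /proc is unavailable."""
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])   # second field: resident pages
    except (IOError, OSError, IndexError, ValueError):
        return None
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def measure(n_objects=200000):
    """Return RSS before allocating, while holding, and after freeing many small objects."""
    before = rss_bytes()
    data = [str(i) * 10 for i in range(n_objects)]
    held = rss_bytes()
    del data
    gc.collect()
    after = rss_bytes()
    return before, held, after
```

Whether `after` drops back to `before` depends on the allocator and the platform. Memory that is still counted against the process may sit in pools that later requests can reuse, so the reported figure alone does not say how much room is left.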

Cheers,
Brian

Robert Kluin

Feb 22, 2012, 3:08:05 AM

to google-a...@googlegroups.com

Hey Alex,
I should probably have phrased this better: memory is not always
handled well. For example, the ext.db code keeps many copies of the
data in various forms. This can cause rapid and unexpected memory
blowups, and the result is something that appears similar to a memory
leak. As Brian noted, this is partially caused by Python's handling
of memory.

However, there are a number of scenarios where you get real memory
leaks. For example there have recently been several posts / issues
from people having issues with the blobstore leaking memory. Some of
these are quite detailed and the repro code is very simple. If I'm
not mistaken, in the past, we've observed this happening with heavy
datastore use as well, though I don't have simple repro cases for
those.

Robert

Andrin von Rechenberg

Feb 24, 2012, 4:38:53 AM

to google-a...@googlegroups.com

It turns out that the requests with high ram requirements are so well distributed

that I'm not running out of memory. So my initial concerns (the first mail in this