Instance maximum number of requests


nickmilon

Oct 27, 2010, 4:50:56 PM
to Google App Engine
I was doing some load tests on app engine today when I noticed a new
Info message in the logs: "After handling this request, the process
that handled this request reached the maximum number of requests that
may be handled in a single process' lifetime, and exited normally."

So what is that supposed to mean?
Up to now we knew that application instances are automatically
terminated after some inactivity timeout. If I understand this
message correctly, we now know that a process can also be terminated
after handling a certain number of requests. How many exactly? Is this
a new magic number? Let's hope we will get some answers from the always
helpful App Engine team.

Ikai Lan (Google)

Oct 27, 2010, 7:50:02 PM
to google-a...@googlegroups.com
Yes. Application instances are meant to be relatively short-lived. Once each instance has served a certain number of requests, we will gracefully terminate it and spin up a new instance to take its place. The number of requests that triggers this limit is subject to change for performance tuning reasons, but it should be in the ballpark of tens of thousands of requests.

--
Ikai Lan 
Developer Programs Engineer, Google App Engine






supercobra

Oct 27, 2010, 9:13:27 PM
to google-a...@googlegroups.com
Just curious, do you guys retire instances after around 10k requests
to avoid memory leaks?
--
super...@gmail.com

David Parks

Oct 27, 2010, 11:06:28 PM
to google-a...@googlegroups.com

Does App Engine do this in a way that is safe for a heavily loaded application? That is, does it load the new instance while the old instance is still serving requests and then simply redirect requests to the new instance?

Something like this should really be documented.

Ikai Lan (Google)

Oct 28, 2010, 2:33:50 PM
to google-a...@googlegroups.com
Yes, this is done gracefully. I agree that we need more documentation about our serving infrastructure. However, I'd like to describe general principles rather than specifics, such as designing applications to be stateless and assuming that instances can be loaded or unloaded at any time, since implementation details are likely to change. Let me revisit our current docs and see if there's a way we can improve them.
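
As an illustration of the "design statelessly" principle, here is a minimal Python sketch. It is not an official App Engine pattern. It only shows the idea of treating per-instance memory as a disposable cache that can be rebuilt after an instance is recycled. The names `load_from_datastore`, `get_setting`, and `simulate_instance_restart` are hypothetical and do not come from this thread or from any App Engine API.

```python
# Sketch: per-instance memory is a cache, never the source of truth.

_instance_cache = {}  # lost whenever the platform recycles this instance


def load_from_datastore(key):
    """Hypothetical durable lookup; replace with a real datastore call."""
    return "value-for-" + key


def get_setting(key, loader=load_from_datastore):
    """Return a value, rebuilding the local cache entry if it is missing."""
    if key not in _instance_cache:
        _instance_cache[key] = loader(key)
    return _instance_cache[key]


def simulate_instance_restart():
    """Mimic what a freshly started instance sees: empty process memory."""
    _instance_cache.clear()
```

Because every value can be reloaded from the durable store, a restart costs one extra lookup per key instead of losing data.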





Iliya Novikov

Jul 15, 2016, 1:38:06 PM
to Google App Engine, ikai.l...@google.com
Hope this thread is still alive.

I periodically find the same message in the logs. Unfortunately, it does not seem that an instance is restarted gracefully after all. Each time I see the message, it is followed by another message: "This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application." And as it says, it really does take longer, on the order of a few seconds. This generates huge latency spikes.
Can I do something about this? Something that will start a new instance in advance?

Thank you,
Iliya.

Adam (Cloud Platform Support)

Jul 17, 2016, 5:57:27 PM
to Google App Engine, ikai.l...@google.com
The message you see is the standard message logged when a new instance is started up. It's telling you that the first request to this instance may take longer since the instance needs to finish starting up before it can start serving.

Rather than trying to revive a 6-year-old discussion, I'd recommend starting a new thread. The discussion that took place here was about basic and manual scaling instances restarting after serving a maximum number of requests; your question, however, is about automatic scaling behavior that did not exist at that time.

To answer your question, you can configure new instances to start up in advance using warmup requests. You can also tweak parameters for automatic scaling such as idle instances, concurrent requests and pending latency to reduce latency.

Iliya Novikov

Jul 18, 2016, 5:07:25 AM
to Google App Engine, ikai.l...@google.com
Sure. Here is a new thread I started. Thank you.

Fredrik Bertin Fjeld

Oct 12, 2017, 11:12:19 AM
to Google App Engine
As of today, the limit is 50K requests for a single F4 instance. I validated this by continuously watching the Cloud Console.

The instance will shut down regardless of perfect latency, no errors, and low memory use, and the following message will be shown:
"After handling this request, the process that handled this request reached the maximum number of requests that may be handled in a single process' lifetime, and exited normally."

This is not to be confused with the opposite message, which marks a loading request that starts a new instance:
"This request caused a new process to be started...."

The maximum number of requests an F4 instance can process in its lifetime is not in the official docs; it would be great for the community if it were documented.
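
For anyone who wants to observe the limit from inside the app instead of watching the console, a rough Python sketch is a WSGI middleware that counts the requests handled by the current process and logs the running total. `RequestCounter` and `log_every` are hypothetical names, not part of any App Engine API, and the sketch makes no assumption about what the actual limit is.

```python
import logging
import threading


class RequestCounter:
    """WSGI middleware that counts requests handled by this process."""

    def __init__(self, app, log_every=1000):
        self.app = app
        self.log_every = log_every
        self.count = 0
        self._lock = threading.Lock()  # instances may serve requests concurrently

    def __call__(self, environ, start_response):
        with self._lock:
            self.count += 1
            current = self.count
        if current % self.log_every == 0:
            logging.info("this process has handled %d requests", current)
        return self.app(environ, start_response)
```

Wrapping the app as `application = RequestCounter(application)` and comparing the last logged total with the "exited normally" message gives an estimate of the limit for that instance class.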

Cheers,
Fredrik

