Lots of user-facing loading requests


Francois Masurel

unread,
Jan 17, 2013, 11:56:42 AM1/17/13
to google-a...@googlegroups.com
For my low-traffic website I found that 50 new instances were started during the last 2 hours and 22 minutes (taken from the logs, searching for "new process").

Seems like quite a lot to me, as I already have a resident instance.

But what annoys me the most is that, among those 50 loading requests, 20 were user-facing requests (40%) and 5 were cron requests (10%).

Is this normal behavior? Thanx for your help.

François

Takashi Matsuo

unread,
Jan 17, 2013, 3:15:51 PM1/17/13
to google-a...@googlegroups.com

Hi Francois,

If the cron handler usually takes a long time, or uses a high amount of resources (CPU, memory), you may want to isolate those cron requests on another version or on backends. That will help avoid a situation where such cron requests occupy your precious frontend instances and user-facing requests get routed to new instances.
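For example, cron.xml lets a job target a specific backend or version so it never competes with frontend instances. A minimal sketch; the URL, schedule, and backend name here are illustrative, not from the original post:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
  <cron>
    <url>/tasks/heavy-report</url>
    <schedule>every 30 minutes</schedule>
    <!-- Route this job to a dedicated backend (or version) so it
         doesn't occupy user-facing frontend instances.
         "cron-backend" is a hypothetical backend name. -->
    <target>cron-backend</target>
  </cron>
</cronentries>
```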


--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/ZyzbO64SF3IJ.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.



--
Takashi Matsuo | Developers Advocate | tma...@google.com

Francois MASUREL

unread,
Jan 17, 2013, 3:24:21 PM1/17/13
to google-a...@googlegroups.com
Hi Takashi,

Thanx for helping.

In fact my only cron job is there to ping my app, and it usually takes less than 20 ms.

But I will certainly use a backend or another version if needed.

I have a few tasks running from time to time, but they don't seem to start new instances.

François

Tom Phillips

unread,
Jan 17, 2013, 5:11:56 PM1/17/13
to Google App Engine
Make sure on your instances graph that the blue "Total" line is not going higher than the green "Billed" line often (ideally only under unforeseen bursts).

Total (blue) > Billed (green): you aren't charged for the delta between Total and Billed, but users get lower QoS (assuming your startup time is high), since they receive most of the loading requests.
Total (blue) == Billed (green): your /_ah/warmup handler gets all the loading requests; optimal QoS.

This is just what I've observed (leaving max idles and min/max latency at automatic). Unless you have magically low startup latency, boost min idles until Total == Billed consistently.

/Tom
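In early 2013 these knobs were sliders in the Admin Console's Application Settings; in later, modules-enabled SDKs the same scheduler settings can be pinned in appengine-web.xml instead. A sketch under that assumption, with placeholder app id and values:

```xml
<appengine-web-app xmlns="http://appspot.com/xml/ns/1.0">
  <application>your-app-id</application>
  <version>1</version>
  <threadsafe>true</threadsafe>
  <!-- Equivalent of the Admin Console sliders Tom describes:
       raise min-idle-instances until Total == Billed, and leave
       the other three settings at automatic. -->
  <automatic-scaling>
    <min-idle-instances>2</min-idle-instances>
    <max-idle-instances>automatic</max-idle-instances>
    <min-pending-latency>automatic</min-pending-latency>
    <max-pending-latency>automatic</max-pending-latency>
  </automatic-scaling>
</appengine-web-app>
```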

Francois Masurel

unread,
Jan 17, 2013, 5:21:16 PM1/17/13
to google-a...@googlegroups.com
Hi Tom,

Thanx for your suggestion, that's just the setting I'm testing at the moment.

It seems to significantly reduce user-facing loading requests, but it will increase my bill quite a bit (x4) :-(

Maybe it will be worth it.

I'll tell you how it goes.

François

Francois Masurel

unread,
Jan 18, 2013, 8:25:17 AM1/18/13
to google-a...@googlegroups.com
I have been testing full automatic mode for the last few hours (all application settings set to automatic).

I am still getting quite a few user-facing loading requests (~20 per hour, 5% of all requests), and there are no more warmup requests in the logs.

One dynamic instance has been alive since the beginning of the test and is getting most of the traffic.

Are these numbers normal?

François

App Id: vncts1

Francois Masurel

unread,
Jan 18, 2013, 8:45:30 AM1/18/13
to google-a...@googlegroups.com
Things are getting worse: 20 loading requests in the last 23 minutes (about 20% of all requests).

Could I have an explanation?

What am I doing wrong? I'm really lost :-(

App Id: vncts1

Tom Phillips

unread,
Jan 18, 2013, 11:11:47 AM1/18/13
to Google App Engine
Hi Francois,

Don't set min idles to automatic; set it to a high enough number that, over time, Total instances (blue) == Billed instances (green). Leave max idles and the two latency settings at automatic.

If you do have min idles configured, can you post a screencap of your instances graph?

/Tom

Francois Masurel

unread,
Jan 18, 2013, 12:07:22 PM1/18/13
to google-a...@googlegroups.com
Here is my dashboard instance graph.

The first part is with 1 min idle instance, the second part with min idle instances set to auto.

I have just set min idle instances to 2 and will see how it goes.

François

Tom Phillips

unread,
Jan 18, 2013, 12:22:15 PM1/18/13
to Google App Engine
Yep, you need more min idles. At some min idle setting you'll see that the blue line no longer goes above green (minus unforeseen bursts, which, if need be, you can prepare for with an even higher min idle setting).

Once blue doesn't go above green, you'll see warmup requests going to /_ah/warmup and not to your users.

Basically, you don't get any delta between blue and green for free. You pay for it by having your users suffer loading requests. That's one reason why very-low-startup-latency apps shine: they can take advantage of any delta. Higher-startup-latency apps (pretty much any real Java app) are OK at steady state, but only if you are willing to pay for a high enough min idle setting that you are billed for all running instances.

/Tom


Carl Schroeder

unread,
Jan 18, 2013, 1:10:25 PM1/18/13
to google-a...@googlegroups.com
There needs to be a checkbox on the dashboard that says "never send user requests to cold instances". Then low-traffic apps (or versions) with low-latency handlers would no longer be at the mercy of a load-balancing algorithm making WAGs in an information vacuum. After an app (or version) has enough traffic for the instance scheduler to make educated guesses, this box can be unchecked.

I think that would fix everyone's issues with cold start latency on GAE Java.

Carl Schroeder

unread,
Jan 18, 2013, 1:21:04 PM1/18/13
to google-a...@googlegroups.com
2 resident instances. The app was quiet for 10 minutes, no requests being served. 1 request was sent. Neither resident instance took it. A cold instance was started without using /_ah/warmup. That user-facing request languished for 30 seconds while GAE Java spun up. There seems to be no way to prevent these sorts of catastrophes from happening.

App ID = lemurspot
Instance ID = 00c61b117cb19763aeea814cbc78e80de2f437

Tom Phillips

unread,
Jan 18, 2013, 1:29:02 PM1/18/13
to Google App Engine
Hi Carl,

What does your instances graph look like? Can you provide a screencap?

What I'm wondering is whether, for extremely low-traffic apps, there is no way to keep Total instances from sometimes exceeding Billed instances (and hence giving users loading requests).

/Tom

Carl Schroeder

unread,
Jan 18, 2013, 1:56:43 PM1/18/13
to google-a...@googlegroups.com
Attached is my instance graph for the past 24 hours. My app gets the sort of traffic that one instance could satisfy effortlessly (but I have no way to configure that). I am running 2 resident instances (which should be monumental overkill) and am still getting cold starts.

FYI, this app can run on 1 micro AWS instance.
pathological.png

Tom Phillips

unread,
Jan 18, 2013, 2:33:08 PM1/18/13
to Google App Engine
You are still getting Total (blue) > Billed (green), so you will see user-facing loads rather than /_ah/warmup.

Just to clarify, you have the following settings:

Min Idle: 2 (for most of the graph it couldn't have been that, because Billed == 1, but I'm assuming you changed it to 2 recently)
Max Idle: Automatic
Min/Max Latency: Automatic

If so, try changing Min Idle to 4, or higher, to see if you can get the Total (blue) line to stop jumping over the Billed (green) line, i.e. pay for all instances that are running. The more latency in the app, including startup latency, the higher the min idle setting you'll need to achieve this for a given traffic pattern.

It's TBD whether you can achieve a steady Total == Billed, but if you can, you should stop seeing user-facing loads (albeit with increased operational cost for your production app).

/T

Carl Schroeder

unread,
Jan 18, 2013, 3:29:29 PM1/18/13
to google-a...@googlegroups.com
I tried 3 min idle with F4 instances a while back because we were going to demo the app. It made no difference other than to my pocketbook. Seemingly random user-facing requests are sent to cold instance starts.

The problem is not with resident instances that are being used. The problem is with resident instances that are actually idle AND unused. The cold-starting dynamic instances are serving while the residents have empty request queues.

Max idle is currently set to 2 (recommended by App Engine support).
Max latency is 14 s, min latency is currently 2 s. But both have at one point been every setting under the sun while attempting to mitigate this.

At this point, I don't believe there is any combination of settings that can make GAE Java act sanely under a low-traffic profile. It is excruciating standing over a user's shoulder watching the app spin for 30 seconds when I know there is no good reason for it. Excruciating enough to port away from GAE Java as fast as I can.

Francois Masurel

unread,
Jan 18, 2013, 4:45:38 PM1/18/13
to google-a...@googlegroups.com
OK, with 2 idle instances I had 20 loading requests over the last 2h15.

Among them, only 3 were warmup requests; the others were user-facing requests.

Here is the instances chart:

One of the resident instances was almost never used: only one request served over a 4h38 period (cf. below).

Should I try with 3 min idle instances? I'll give it a try.

Does this just mean that GAE Java is simply not compatible with low-traffic web sites?

Could we have an official statement from Google about this?

Thanx in advance.

François

Francois Masurel

unread,
Jan 18, 2013, 4:56:43 PM1/18/13
to google-a...@googlegroups.com
In fact, you can't set min idle instances to 3, only to 4, 8, 9, 10, etc. OK, let's go for 4.

Over the previous 2h15 test period, about 700 requests were served; 2.4% of them were user-facing loading requests.

Carl Schroeder

unread,
Jan 18, 2013, 5:13:52 PM1/18/13
to google-a...@googlegroups.com
The more idle instances I reserve, the worse the problem gets. I have to have at least 1 resident instance in order for warmups to be issued. With more than 1 resident, every other page load (which can consist of multiple REST calls to GAE) hangs on a cold start.

The part that really irritates me is that Google had this working fine 3 months ago. They made an undocumented and unannounced change to the instance scheduler which has ruined the QoS measurements of my app ever since.

GAE Java is supposedly not experimental. Go looks like it would offer improved performance, but it is labeled experimental. Given the train wreck GAE Java has morphed into, I am loath to hitch my wagon to something that could derail my development efforts (yet again) after a few months.

Francois Masurel

unread,
Jan 18, 2013, 6:47:12 PM1/18/13
to google-a...@googlegroups.com
OK, for the past 1h40 my app has been running with 4 resident instances and ~3 dynamic instances.

I have found only 9 warmup requests and no user-facing loading requests.

Great! But among the 4 resident instances, 3 served only 1 request each and the last one only 75 (cf. below).

Minimal price for such a configuration: 0.05 (reserved hour) * 7 (instances) * 24 (hours) * 30 (days) = $252 / month.

Isn't that a bit expensive for such a low-traffic website? In my personal case I can't afford it.

Google, can you confirm that we need such a configuration to avoid user-facing loading requests for GAE Java apps?

Thanx for your answer.

PS:

From my point of view, there is definitely something broken in the scheduler; such a waste of resources and energy is not ecological at all.

If GAE is only suited to very-high-traffic Java apps, it should be clearly stated somewhere in the docs.

Could we imagine using a different scheduler for low-traffic apps?

Marcel Manz

unread,
Jan 19, 2013, 12:45:15 AM1/19/13
to google-a...@googlegroups.com
From my perspective, Google should have built the way the scheduler uses resident instances the other way around. It would be much more logical to understand and would hopefully work better.

Currently the scheduler always sends requests to dynamic instances, and only when it sees that traffic is increasing and more instances are needed does it temporarily direct the traffic to a warmed-up resident instance. Which, according to many user reports, isn't always working as it should, and the request hits a new dynamic instance which cold starts.

Wouldn't it be much easier if requests ALWAYS hit the (user-chosen, reserved number of) resident instances first, and additional dynamic instances were only started as traffic increases? A request would always hit a ready resident instance, or be kept briefly in the queue until it can be served by one. Meanwhile the scheduler can start as many dynamic instances as it thinks are required, and only direct traffic to them once they are warmed up.

/Marcel

Cesium

unread,
Jan 19, 2013, 11:26:22 AM1/19/13
to google-a...@googlegroups.com
Here we go again!

My low traffic app has been running great for the past couple of weeks.

Now I am seeing new instances being created every few minutes.

Again, we have a change in the scheduler behavior for low traffic apps.

Last time this happened I threw a fit.

This time I am saddened and depressed.

David

Michael Migdol

unread,
Jan 19, 2013, 1:38:02 PM1/19/13
to google-a...@googlegroups.com
I led a team of around 8 developers that ran a cloud-based VXML developer environment. We had about 20K active developers enrolled at our peak, and a 24-hour response time to all of our developer forum posts. It is completely unfathomable to me how a company the size of Google could let urgent pleas for assistance like this go completely unanswered. It makes me sad that a product that could be so beneficial to so many small-app developers has been left to rot...



Francois MASUREL

unread,
Jan 19, 2013, 2:30:54 PM1/19/13
to google-a...@googlegroups.com
You're totally right; we are still waiting for an official Google statement about the scheduler behavior: is GAE the right choice for low-traffic apps?

Francois Masurel

unread,
Jan 20, 2013, 8:43:22 AM1/20/13
to google-a...@googlegroups.com
I went back to 1 min idle instance and everything else to automatic, so now I have 2 instances billed most of the time.

For the last 27 minutes, we had 20 user-facing loading requests, almost one per minute.

I can't afford to go back to 4 idle instances and 7 billed instances.

What should I do? :-(((

Carl Schroeder

unread,
Jan 20, 2013, 11:42:02 AM1/20/13
to google-a...@googlegroups.com
At the moment, I have a config that is almost bearable:
I am running 1 idle, 1 max resident. I also have a script peppering the site with requests once every 10 seconds to keep a dynamic instance alive.

I will be rolling out client-side retries on the idempotent REST calls.
I am also moving non-idempotent REST calls to Go, so instance start-up should happen in 75 ms.
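Such a "pepper" script can be as simple as a timed fetch loop. A minimal sketch in modern Python (the original would have been Python 2 era); the URL is hypothetical, and the fetch/sleep hooks are injectable so the loop can be exercised without a network:

```python
import time
import urllib.request


def ping(url):
    """Fetch the URL once; return the HTTP status, or None on error."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except OSError:
        return None


def pepper(url, interval_s=10, iterations=None, fetch=ping, sleep=time.sleep):
    """Hit `url` every `interval_s` seconds to keep an instance warm.

    iterations=None loops forever; `fetch` and `sleep` are injectable
    so the loop can be tested without real requests or delays.
    """
    count = 0
    while iterations is None or count < iterations:
        fetch(url)
        count += 1
        if iterations is None or count < iterations:
            sleep(interval_s)
    return count
```

Pointing it at a cheap dynamic URL that incurs no API costs (as Carl describes) keeps the billing impact limited to instance hours.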

Cesium

unread,
Jan 20, 2013, 11:45:32 AM1/20/13
to google-a...@googlegroups.com
I am also writing a "pepper script" today.

husayt

unread,
Jan 22, 2013, 10:38:39 AM1/22/13
to google-a...@googlegroups.com
This is a major issue for us too. Can somebody from Google give an official reply to this? Or can they at least acknowledge this as a problem?

Huseyn Gulilyev

Cesium

unread,
Jan 22, 2013, 11:28:48 AM1/22/13
to google-a...@googlegroups.com
For the record:

Now my app is running smoothly on a single instance. No new instances are being created. Latency is low.

My depression is gone.

Now, I am truly gobsmacked.

David

Michael Hermus

unread,
Jan 22, 2013, 12:08:11 PM1/22/13
to google-a...@googlegroups.com
While Google's lack of official acknowledgement on these threads has been very frustrating, I take heart in the fact that the related tracking issue has been accepted by the team: http://code.google.com/p/googleappengine/issues/detail?id=7865

I just wish it were marked as High priority instead of Medium; I would encourage everyone on this thread to star it if you haven't already.

Francois Masurel

unread,
Jan 22, 2013, 5:09:54 PM1/22/13
to google-a...@googlegroups.com
Great! But how did you get there?

François

Cesium

unread,
Jan 22, 2013, 5:16:51 PM1/22/13
to google-a...@googlegroups.com
Step 1: Sprinkle sour Skittles on the floor. (Sour, NOT regular Skittles.)
Step 2: Wait for Google to release the unicorns for feeding.
Step 3: When unicorns eat sour skittles, they release low latency mojo.
Step 4: How the heck should I know? I just sit on my rear, day after day and watch the scheduler's schizophrenic behavior! I did nothing!

David

Francois Masurel

unread,
Jan 22, 2013, 5:24:09 PM1/22/13
to google-a...@googlegroups.com
Sadly, nothing changed for me: 20 loading requests over the last 30 minutes, and instances are killed almost immediately.

Cesium

unread,
Jan 22, 2013, 5:43:05 PM1/22/13
to google-a...@googlegroups.com
Please accept my apology for being flippant.

I am going to try an experiment inspired by Jeff's idea.

Jeff wrote:
<cronentries> 
    <cron> 
        <url>/some-non-static-url</url> 
        <schedule>every 1 minutes</schedule> 
    </cron> 
</cronentries> 
This will keep one instance warm. 
Jeff  

But instead of a fixed 1-minute interval, it will be adaptive, like a feedback control servo.
Also, the sign of the feedback will be just the opposite of what you would expect:
if the call to my app takes a long time, I will call it more often to keep the instance alive.

Of course, this simple design could lead to much suffering and cost, so there will be limits placed
on the call rate, feedback gain, etc.
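That inverted-feedback idea could be sketched as a pure interval-update function. This is a hedged illustration of the servo logic only; the 1-second slow-call threshold, the gain, and the clamps are made-up values, not anything from the thread:

```python
def next_interval(latency_ms, current_s, min_s=10, max_s=300, gain=2.0):
    """Adaptive ping interval with inverted feedback.

    A slow response suggests the instance went cold, so ping MORE
    often (shrink the interval); a fast response means the instance
    is warm, so relax the interval back up. The result is clamped to
    [min_s, max_s] to bound cost, as Cesium suggests.
    """
    if latency_ms > 1000:       # slow: probably hit a cold start
        nxt = current_s / gain
    else:                       # fast: instance is warm, back off
        nxt = current_s * 1.2
    return max(min_s, min(max_s, nxt))
```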

Just a thought,

Francois MASUREL

unread,
Jan 22, 2013, 5:48:56 PM1/22/13
to google-a...@googlegroups.com
In fact I succeeded in keeping one instance alive using the Pingdom service.

But GAE still keeps starting new instances very frequently.

It looks like the pending latency setting doesn't work at all (set to 15 s min for my app).

François



Carl Schroeder

unread,
Jan 22, 2013, 9:09:05 PM1/22/13
to google-a...@googlegroups.com, mas...@mably.com
The key to being a low-traffic app on App Engine is not to be a low-traffic app on App Engine.
I use a script to pepper some dynamic URLs that incur no API costs.
I endeavor to keep at least one dynamic instance alive, and sadly I have to keep 1 resident instance as well.
It keeps the site limping along while I port stuff to Go. ;)

Jeff Schnitzer

unread,
Jan 23, 2013, 2:12:13 AM1/23/13
to Google App Engine
I'm not convinced that high-traffic apps fare any better, except perhaps that user-facing loading requests are a smaller percentage of the total load. I've heard people with high-traffic apps on this list say "it's fine", but I've not heard any quantitative metrics like the % of requests that go to cold starts.

In the absence of specific numbers to the contrary, I'm inclined to believe that high-traffic apps still dump an unacceptable number of requests on cold instances.

Jeff

Francois Masurel

unread,
Jan 23, 2013, 3:57:00 AM1/23/13
to google-a...@googlegroups.com
You might be right about the unicorns :-)

I didn't change a thing besides a simple redeploy for an HTML fix, but since then (1 hour ago) I haven't had any loading requests.

Performance might depend on the time of day: it's morning here in Europe and everybody is asleep in the US.

We'll see how it goes over the next hours.

François



Francois Masurel

unread,
Jan 23, 2013, 4:34:49 PM1/23/13
to google-a...@googlegroups.com
Only one new instance started over the last hour.

Yesterday, at the same time, it was 2 loading requests every 3 minutes.

Google definitely fixed something on their side.

Thanx guys. But please, don't touch anything now :-)

François

Mike Brown

unread,
Jan 31, 2013, 2:02:08 PM1/31/13
to google-a...@googlegroups.com

I am experiencing this exact behavior, where new instances are spun up while existing instances sit idle.

As a test, I spun up 5 instances (set min to 5 and max to Automatic). Even then, with just one user, new instances were started while the 5 sat idle.

I will try the workarounds listed above; however, what are my options beyond that?
If I rewrite the backend from Java to Python (or Go), will this solve the problem?
It seems I may be forced to port the whole thing to AWS and manage scalability myself.

I can't even demo my app effectively, and now I have to look into unnatural workarounds or even a massive migration (in terms of effort) just because App Engine won't use existing idle servers.

Michael Hermus

unread,
Jan 31, 2013, 2:30:32 PM1/31/13
to google-a...@googlegroups.com
I have seriously considered moving the user-facing functionality to a default Python version (dynamic web pages and Ajax calls), but the effort would be pretty massive, and I am clinging to the hope that they will fix this issue soon. Although it's not in the 1.7.5 release :(

Igor Kharin

unread,
Jan 31, 2013, 3:57:03 PM1/31/13
to google-a...@googlegroups.com
Hello Mike,

On Fri, Feb 1, 2013 at 2:02 AM, Mike Brown <virtua...@gmail.com> wrote:
> As as test I spun up 5 instances (set min to 5 and max to Automatic). Then
> even then with just one user, new instances were started while the 5 sat idle.

It's perfectly normal (and documented). Resident instances are "fallback" instances that App Engine uses in case of a traffic spike. You could set min pending latency to a high number to tell the scheduler "I really don't want you to start new instances". That would raise the probability of requests being served by resident instances, but not by much, since min latency has little to no effect on them.

> If I rewrite the backend from Java to Python (or Go) will this solve the
> problem?

No, since it is expected behaviour.

Kristopher Giesing

unread,
Jan 31, 2013, 4:28:56 PM1/31/13
to google-a...@googlegroups.com
I'm not sure Google has ever admitted there is a problem to be fixed.

Kristopher Giesing

unread,
Jan 31, 2013, 4:29:55 PM1/31/13
to google-a...@googlegroups.com
The behavior is expected, but rewriting in Python or Go may allow cold instances to spin up faster, which does alleviate the resulting user impact.

Kristopher Giesing

unread,
Jan 31, 2013, 4:36:06 PM1/31/13
to google-a...@googlegroups.com
I don't think Google fixed anything. I think the behavior is just dependent on what else is going on in the data center.

Here is an image of my instance count over the last month. Notice how, about 2.5 weeks ago, the blue line started to be way higher than the green line? That's when GAE got into its poor-behavior state, from what I can tell. Nothing about my app changed at all in this time, including average user activity.

The only reason I can live with this behavior is that my cold start time is fast enough for the relatively lax latency requirements of my app. If my requirements were different, I would be screaming bloody murder. As it is, I don't recommend GAE for small projects under most conditions.

chart.png

 

Kristopher Giesing

unread,
Jan 31, 2013, 4:39:03 PM1/31/13
to google-a...@googlegroups.com
Image didn't come through, here is the url:

Francois Masurel

unread,
Jan 31, 2013, 4:54:26 PM1/31/13
to google-a...@googlegroups.com
GAE is starting new instances like crazy again, making my app pretty unusable.

The status page shows an "anomaly" state for Java apps:

Hoping this will get fixed soon.

Mike Brown

unread,
Jan 31, 2013, 5:04:59 PM1/31/13
to google-a...@googlegroups.com

I'm sorry, but this is not what the documentation says, and it is certainly not expected.
It says:

- https://developers.google.com/appengine/docs/adminconsole/instances
"You can specify a minimum number of idle instances. Setting an appropriate number of idle instances for your application based on request volume allows your application to serve every request with little latency, unless you are experiencing abnormally high request volume."

The key word here being every.

The way the scheduler is currently working does not, in any situation, allow me to handle every request with little latency.
As I mentioned, I started up 5 residents and still got 2-3 dynamics while the residents sat idle.

It is completely counterintuitive that I would be paying an hourly rate for a resident instance only for it to sit idle while dynamic instances are spun up to handle requests. What makes complete sense is for dynamics to be started only if the load is beyond what the residents can handle.

What would make for a real scheduler would be for it to predictively start dynamics before they are needed, and then not load-balance to them until they are completely spun up and ready. Having calls block on instances starting up is just lame.

Also, I read somewhere (not sure if it was official documentation) that an instance should be able to handle up to 10 concurrent requests. I am not getting anywhere near that.
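One detail worth checking on that last point: a GAE Java instance only serves concurrent requests at all if appengine-web.xml declares the app thread-safe; otherwise each instance handles one request at a time, so bursts spawn extra instances even at low traffic. A minimal fragment (app id and version are placeholders):

```xml
<appengine-web-app xmlns="http://appspot.com/xml/ns/1.0">
  <application>your-app-id</application>
  <version>1</version>
  <!-- Without this flag the runtime serves a single request per
       instance at a time, which makes the scheduler far more eager
       to start new instances. Requires thread-safe handler code. -->
  <threadsafe>true</threadsafe>
</appengine-web-app>
```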

Francois Masurel

unread,
Jan 31, 2013, 5:16:17 PM1/31/13
to google-a...@googlegroups.com
App Engine is down again! My app is not responding.

I told you guys: don't touch a thing when it works :-)




Igor Kharin

unread,
Jan 31, 2013, 6:14:49 PM1/31/13
to google-a...@googlegroups.com
OK, you won. This has been discussed back and forth countless times ever since the new performance settings were introduced. Every. Month. I've tried. The team did too. I guess this is just how it works, since you can't keep explaining the same things to everyone.
 
A high minimum allows you to prime the application for rapid spikes in request load. App Engine keeps that number of instances in reserve at all times, so an instance is always available to serve an incoming request, but you pay for those instances. This functionality replaces the deprecated "Always On" feature, which ensured that a fixed number of instances were always available for your application. Once you've set the minimum number of idle instances, you can see these instances marked as "Resident" in the Instances tab of the Admin Console. 
  
Note: If you set a minimum number of idle instances, the pending latency slider will have less effect on your application's performance. Because App Engine keeps idle instances in reserve, it is unlikely that requests will enter the pending queue except in exceptionally high load spikes. You will need to test your application and expected traffic volume to determine the ideal number of instances to keep in reserve. 
... 
The second slider sets the maximum period of time (at most 15 seconds) that the scheduler will wait before resolving to create a new instance for the request.
  • A high maximum means users may wait longer for their requests to be served (if there are pending requests and no idle instances to serve them), but your application will cost less to run.
Your arguments are absolutely valid, though. Back then we all thought resident instances were just a more flexible version of what "Always On" was, but they are not (and that's why engineers were forced to explain it here).

Yes, an instance might be able to serve up to 10 concurrent requests, but it's much more complicated than that. Johan Euphrosine explained it in all the details:



Mike Brown

unread,
Jan 31, 2013, 9:49:35 PM1/31/13
to google-a...@googlegroups.com
Thank you, Igor. I certainly get your frustration :)

So let me make sure I have this right. The purpose of resident instances is to serve as an always-running pool of instances which is leveraged only at times of load spikes. The reason this is valuable is that it takes App Engine some time to spin up dynamic instances, and while these are spinning up, the resident instances are leveraged to handle any "overflow" requests (requests pending while dynamics are spun up). Once the load flattens out and the dynamics have caught up, the traffic is once again routed away from the resident instances.

I think a name change, and maybe a UI change in the panel, would help a lot to alleviate confusion. Currently resident instances are termed "Idle Instances", which made me think these were the number of instances running when my app is idle and therefore available to handle any incoming traffic. This makes perfect sense with the notion of Always On, but not with "overflow".

Renaming Idle Instances to something like Overflow Instances (or maybe Balance Instances, Spike Instances, Reserve Instances, <something else>?) would, I think, be better.

Then perhaps bring back the notion of Always On as well, to help those of us with minimal traffic.

So from a UI perspective, change the naming to use Minimum and Overflow (or alike):

Minimum would mean the minimum number running and ready to serve any request. Once load exceeds the capacity of the Minimum instances, dynamics are spun up.
Overflow would mean the number of instances held in reserve to manage latency during spikes. The steeper your spikes, the more Overflow instances you should have.

So in the end you have your base set of instances for your base load, and then Dynamics and Overflow for auto-scaling.

gafal

unread,
Feb 1, 2013, 3:06:59 AM2/1/13
to google-a...@googlegroups.com
Hi Igor,

I understand your frustration here. We are also frustrated.
It's still unacceptable to serve user requests with cold instances (in Java) and have such high latencies.
We see idle dynamic instances that do not serve traffic while new instances are being spun up. I don't want to repeat describing what the issues are; you know them.
Why does it take more than half a year to solve this issue? Issue 7865: User-facing requests should never be locked to cold instance starts.
It's critical for low-traffic apps.

I do not want to be rude, but it might be time to stop sticking to your position and listen to your customers' dissatisfaction.
We've been complaining about this for years now.
Maybe it's time to rethink what the scheduler should be, from scratch...

Hoping to be heard,

Gael

Kristopher Giesing

unread,
Feb 1, 2013, 10:48:51 AM2/1/13
to google-a...@googlegroups.com
I've never been able to validate the algorithm there with my own app. As mentioned in the comments to that answer, I've definitely seen new instances spawn when (to the best of my ability to tell) none of the listed criteria were true.

Perhaps it would be a good idea if, in the "This request spawned a new instance" message, there were an additional note indicating the number of other instances that were examined before spawning the new one, along with more detailed information on why those other instances weren't deemed suitable. This could get verbose, so perhaps an admin setting could toggle it on/off, so that it could be turned on temporarily for investigative purposes and turned back off later.

- Kris

