The first is that with 1.5.2 we changed how resident instances work:
if a dynamic instance exists, the scheduler prefers an idle dynamic
instance to an idle resident instance. Further, if a resident instance
is busy, we use that as a cue to warm up a new dynamic instance. The
point of these changes is to turn the resident instances into a
reserve for the app. They are generally idle, but if the app gets more
traffic than it can handle with non-reserved instances, it will use
some of the reserved instances (which will in turn spawn a new dynamic
instance).
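To make the selection order concrete, here is a rough sketch of the preference described above, in Python. All names here (pick_instance, the instance dicts) are illustrative stand-ins, not App Engine internals: an idle dynamic instance wins over an idle resident one, and dipping into the resident reserve signals that a new dynamic instance should be warmed up.

```python
def pick_instance(instances):
    """Return (chosen_instance, should_warm_up_dynamic).

    Sketch of the 1.5.2 behavior: prefer idle dynamic instances,
    fall back to idle resident ones, and treat use of the reserve
    as a cue to warm up a new dynamic instance.
    """
    idle_dynamic = [i for i in instances if i["idle"] and i["dynamic"]]
    if idle_dynamic:
        return idle_dynamic[0], False
    idle_resident = [i for i in instances if i["idle"] and not i["dynamic"]]
    if idle_resident:
        # Dipping into the reserve: ask for a new dynamic instance.
        return idle_resident[0], True
    return None, True  # everything busy: warm up a new instance

instances = [
    {"id": "res-1", "dynamic": False, "idle": True},
    {"id": "dyn-1", "dynamic": True, "idle": True},
]
chosen, warm_up = pick_instance(instances)
# The idle dynamic instance ("dyn-1") is chosen over the idle resident one.
```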
Generally, Always On is going away with the new billing plan and is
being replaced by Min Idle Instances, which is how the reserved
instances have been changed to behave in 1.5.2. We're continuing to
evaluate all aspects here: how well these reserve instances are
working, what we should be doing, what we should change about the
scheduler and the billing plan, and so on.
In terms of this specific example, the slow request was caused by
general bigtable slowness during that time interval. This can be seen
somewhat here: http://code.google.com/status/appengine/detail/datastore/2011/07/27#ae-trust-detail-datastore-get-latency
This can also be investigated somewhat using our logs viewer. For
example, we can see all loading requests for an app with:
https://appengine.google.com/logs?app_id=wordpong&severity_level_override=1&severity_level=3&tz=US%2FPacific&filter=loading_request%3D1&filter_type=regex&date_type=now&date=2011-07-27&time=15%3A46%3A45&limit=20&view=Search
Note how the only loading requests this app has received have been for
/_ah/warmup.
Also, we can see all requests sent to a specific instance. Here's the
one with the log line you listed above:
https://appengine.google.com/logs?app_id=wordpong&severity_level_override=1&severity_level=3&tz=US%2FPacific&filter=instance%3D00c61b117c39de5ca2d60c64270fc703fdce1355&filter_type=regex&date_type=now&date=2011-07-27&time=15%3A56%3A40&limit=20&view=Search
Note how the first request the instance served was /_ah/warmup,
followed by a pause of 4 seconds, followed by the /game/Game.wp
request which ran for 9 seconds.
There are a couple of things that can be done now to get different
behaviors. One is to set Max Idle Instances to three, which will kill
off the dynamic instances for your app, and leave the app with just
the resident instances. The other is to use Backends, which will give
you direct control over how many instances run for your app and their
properties: http://code.google.com/appengine/docs/java/backends/overview.html
Hopefully that helps. There is also a lengthy discussion going on at:
http://groups.google.com/group/google-appengine/browse_thread/thread/baf439a6e073f6da
> --
> You received this message because you are subscribed to the Google Groups "Google App Engine" group.
> To post to this group, send email to google-a...@googlegroups.com.
> To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
That is, to repeat: the invariant the scheduler is trying to maintain
here is that your app has at least 3 idle instances. And if an
instance is getting traffic, then it isn't idle. The value of an idle
instance is that it can process requests right away if needed, without
having to first warm up or serve a loading request.
It sounds like what you'd really prefer is something like
Min-Instances, but that's not presently an available option.
If you set Max-Idle-Instances, then you will never be charged for
any instances which are idle in excess of the configured
value. In this example, if Max-Idle-Instances=3, and the
scheduler is running 6 instances, and three of them are always
idle, then the resulting charge would be for 3 instances.
This whole thread is an excellent example of why I'm not at all enthusiastic about the new pricing.
Mike is 100% correct; this behavior is amazingly unintuitive and confusing. If we will be paying around $60 per month per instance, why would we ever want one to sit idle? Shouldn't App Engine serve the incoming request with an idle instance and spin up a new instance to be in reserve, if needed?
Also, if I understand correctly, if I set min idle instances to, say, 5, it will nearly always have 5 instances sitting there doing nothing. If one does get used, it will spin up another immediately. Is the point of min idle instances to handle bursts of traffic (though it isn't clear it does that either, since it may well send requests to new instances)? Otherwise I can hardly see the point in ever having more than 1 idle instance.
Robert
To that last question: that's precisely what it's doing. It sends
requests to available instances, and if after having done that there
are fewer than Min-Idle-Instances idle instances remaining, it will
spin up new ones to meet that invariant.
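The dispatch-then-replenish loop can be sketched in a few lines of Python. This is purely illustrative (the function and parameter names are hypothetical), but it captures the invariant: serve from the idle pool first, then spin up however many instances are needed to restore the idle count.

```python
def dispatch(idle, busy, min_idle):
    """Serve one request, then restore the Min-Idle-Instances invariant.

    Returns (idle, busy, spun_up): the new idle and busy counts, and
    how many fresh instances the scheduler had to start.
    """
    if idle > 0:
        # Route the request to an idle instance, which makes it busy.
        idle -= 1
        busy += 1
    # Replenish: spin up enough new instances that min_idle remain idle.
    spun_up = max(0, min_idle - idle)
    idle += spun_up
    return idle, busy, spun_up

# With min_idle=3 and 3 idle instances, serving one request leaves
# 2 idle, so the scheduler starts one more to get back to 3.
idle, busy, spun_up = dispatch(idle=3, busy=0, min_idle=3)
```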
Also, if you set Max-Idle-Instances, then you are not charged for
these extra instances. By setting Max-Idle-Instances to, say, 3, you
are never charged for more than three idle instances. This works
whether or not you have configured Min-Idle-Instances.
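As a sketch of that billing rule (the function name is mine, not an App Engine API): the number of idle instances you pay for is capped at the configured Max-Idle-Instances, no matter how many idle instances the scheduler actually keeps running.

```python
def billable_idle(idle_running, max_idle):
    """Idle instances you are charged for: capped at Max-Idle-Instances."""
    return min(idle_running, max_idle)

# Scheduler keeps 6 instances idle, Max-Idle-Instances=3:
# you are billed for only 3 of the idle instances.
print(billable_idle(6, 3))
```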
> Also, if I understand correctly, if I set min idle instances to, say, 5, it
> will nearly always have 5 instances sitting there doing nothing. If one does
> get used, it will spin up another immediately. Is the point of min idle
> instances to handle bursts of traffic (though it isn't clear it does that
> either, since it may well send requests to new instances)? Otherwise I can
> hardly see the point in ever having more than 1 idle instance.
It depends on the app and the developer. In the most extreme case, an
app with very high request latency, that is not concurrent, and for
which loading request latency is even worse, can only handle
additional load by having idle instances ready to take the additional
requests. Some apps share properties with this extreme case, and have
asked for more than 3 idle instances accordingly.
Another example is that of a load test. By using this feature, you get
as many instances turned up as you'd like, without having to wait for
the scheduler to turn them up for you. For apps that run at hundreds
or thousands of instances, having more control over this can be
useful.
For your warmup requests, I think you should try to do as much
initialization as you can within the 30s time limit. Warmup
requests are a special kind of loading request, in that they are
not end-user facing:
http://code.google.com/appengine/docs/python/config/appconfig.html#Inbound_Services
By having your warmup requests do more work, you should avoid the
warm-but-not-hot instances, which appear to be the cause of the
higher end-user latencies in your case. I should point out, in
case it is not obvious, that while an instance is being warmed up,
we do not send it any other traffic. We wait until /_ah/warmup
completes before we send it any user requests.
@Tim, Min-Idle-Instances is not yet tunable (other than 0 or 3
with Always-On), but will be soon. We're working on it.
@All, thanks for filing the production issues, and providing the
helpful data and feedback. We are continuing to debate the matter
internally and are continuing to look for the best solution here.