New Billing: Absolutely make sure you set Max Idle Instances to a fixed value

Emlyn

Oct 12, 2011, 9:10:38 AM

to google-a...@googlegroups.com

I've been testing this hypothesis:

Hypothesis: Ignoring the 15-minute cost for spinning up new instances,
the price we pay is the moment-by-moment minimum of (total instances)
and (active instances + Max Idle Instances). If Max Idle Instances is
set to Automatic, then we pay for the moment-by-moment total
instances.

It holds. That is, if you leave the default Max Idle Instances setting
(Automatic), you'll be billed for every bit of instance time the
scheduler chooses to run. If you set it to a fixed number, your cost
is capped according to the actual work you are doing (much closer to
CPU time), which in most cases is a lot lower.

Here are a couple of posts in detail, with graphs, billing numbers, pictures!

The Spiny Norman Test
http://appenginedevelopment.blogspot.com/2011/10/spiny-norman-test.html

Go Spiny Norman, Go
http://appenginedevelopment.blogspot.com/2011/10/go-spiny-norman-go.html

--
Emlyn

http://my.syyn.cc - Synchronise Google+, Facebook, WordPress and Google
Buzz posts,
comments and all.
http://point7.wordpress.com - My blog
Find me on Facebook and Buzz

vlad

Oct 12, 2011, 8:45:07 PM

to google-a...@googlegroups.com

Thanks! That is good information. Still, why does it take an IQ of > 200 to understand GAE's new billing?

Johan Euphrosine

Oct 13, 2011, 3:52:22 AM

to google-a...@googlegroups.com

Hi Emlyn,

Thanks for sharing those articles. It is very nice that you were able
to back up the billing formula with hard facts.

As discussed in the groups during the pricing model change, the
billing formula under the new model will be:
billable_instances_rate = min(active_instances_rate + max_idle_instances, total_instances_rate)

where in the dashboard:
- active_instances_rate is the yellow line
- total_instances_rate is the blue line
- max_idle_instances is the upper bound of the "Idle Instances" performance setting

If you set max_idle_instances to automatic, it's equivalent to setting
it to a very large number, so the formula essentially becomes:
billable_instances_rate = total_instances_rate
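
A minimal Python sketch of this formula, for illustration only. The
function names are made up here, the sample readings are hypothetical,
and using None to stand for the Automatic setting is an assumption of
the sketch rather than anything in the App Engine API.

```python
def billable_instances(active, total, max_idle=None):
    """Billable instance rate at one moment.

    max_idle=None stands in for "Automatic" (an assumption of this
    sketch), which behaves like a very large max_idle_instances.
    """
    if max_idle is None:
        return total
    return min(active + max_idle, total)


def billable_instance_hours(samples, max_idle=None, hours_per_sample=1.0):
    """Sum the per-moment billable rate over (active, total) samples."""
    return sum(billable_instances(active, total, max_idle) * hours_per_sample
               for active, total in samples)


# Hypothetical dashboard readings: (active_instances_rate, total_instances_rate)
samples = [(0.5, 7), (1.2, 7), (3.0, 8), (0.8, 6)]

print("Automatic:", billable_instance_hours(samples))
print("Fixed at 1:", billable_instance_hours(samples, max_idle=1))
```

With these made-up readings the Automatic case bills every running
instance, while the fixed setting bills only the active rate plus one.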

See the following threads where Jon McAlister commented about the
billing formula:
https://groups.google.com/d/msg/google-appengine/zuRXAphGnPk/UiTgTIIesL0J
https://groups.google.com/d/msg/google-appengine/W-17IhgwrLI/05Wti7I39EUJ
https://groups.google.com/d/msg/google-appengine/T-dJtXmOO8U/npM69XZAJFcJ

--
Johan Euphrosine (proppy)
Developer Programs Engineer
Google Developer Relations

Emlyn

Oct 13, 2011, 4:21:01 AM

to google-a...@googlegroups.com

np Johan.

I was confused earlier on by posts such as

http://blorn.com/post/10013293300/the-unofficial-google-app-engine-price-change-faq

which focuses on multithreading to get pricing down, which I think is
just wrong. There are lots of good reasons to write multithreaded
code, but AppEngine pricing isn't really one of them.

Jeff Schnitzer

Oct 13, 2011, 5:16:45 AM

to google-a...@googlegroups.com

I'm afraid you are still confused. You have ignored the entire point
of the "max idle instances" slider in the first place. GAE keeps idle
instances around so that sudden bursts of traffic don't cause users to
sit around waiting while your django/spring app spends 5+ seconds
loading.

Will turning max-idle-instances down to 1 reduce your bill? Of
course. But only at the expense of user experience. The second user
to arrive is going to wait for five seconds. Maybe you'll get lucky
and Google will actually leave your idle instances running for longer
than you've decided you're willing to pay for. Or maybe they won't
and your users will wonder why your site is broken and go to your
competitor's site instead. If you have a lot of idle instances
running, it's because you're lucky that memory pressure hasn't evicted
your spares, not because you're paying for service.

My point in the "unofficial faq" is that this slider does not have a
single "correct" setting, and if there was, it would certainly not be
1. You're basically paying for RAM. If you have a single-threaded
server, your 128 MB frontend will run one concurrent request at a
time. If you have an efficient multi-threaded server, your 128 MB
frontend can serve an order of magnitude more requests - possibly more
depending on how I/O bound your process is.

I'm not saying you shouldn't set your max-idle-instances to 1 right
now. For the time being, it *might* give you great performance at a
great price. I'm just saying this isn't any kind of a real solution -
you won't be able to complain when your latency goes to hell.

Put it this way: If your multi-threaded system has an avg concurrency
of 10, setting max-idle-instance to 1 is the equivalent of setting it
to 10 on a single-threaded system. If you care about your user
experience, you want that number high.
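
The equivalence above can be sketched in a few lines of Python. This is
only an illustration: the function names are hypothetical, and it
assumes each idle instance can absorb exactly its average concurrency
in simultaneous requests, which real apps will only approximate.

```python
import math


def spare_request_capacity(max_idle_instances, concurrency_per_instance):
    """Simultaneous requests the idle pool could absorb without a cold start."""
    return max_idle_instances * concurrency_per_instance


def idle_instances_needed(burst_size, concurrency_per_instance):
    """Idle instances needed to absorb a burst of simultaneous requests."""
    return math.ceil(burst_size / concurrency_per_instance)


# One idle instance at concurrency 10 matches ten single-threaded ones.
print(spare_request_capacity(1, 10) == spare_request_capacity(10, 1))

# A hypothetical burst of 25 simultaneous requests.
print(idle_instances_needed(25, 1), idle_instances_needed(25, 10))
```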

Jeff

Emlyn

Oct 13, 2011, 6:32:53 AM

to google-a...@googlegroups.com

On 13 October 2011 19:46, Jeff Schnitzer <je...@infohazard.org> wrote:
> I'm afraid you are still confused.

Possibly not.

> You have ignored the entire point
> of the "max idle instances" slider in the first place.

I might be accused of ignoring its intended function and focusing on
its practical function.

> GAE keeps idle
> instances around so that sudden bursts of traffic don't cause users to
> sit around waiting while your django/spring app spends 5+ seconds
> loading.

Sure, so that's going to matter if traffic is very bursty, and you
need to service it immediately. True of some apps, not true of all by
any means.

For instance, a lot of my work is with systems whose major work is in
the background in tasks. Latency is a non-issue there.

Also, that's only loosely related to the Max Idle Instances setting.

> Will turning max-idle-instances down to 1 reduce your bill? Of
> course. But only at the expense of user experience.

Only true if I have no idle instances ready in reality.

> The second user
> to arrive is going to wait for five seconds. Maybe you'll get lucky
> and Google will actually leave your idle instances running for longer
> than you've decided you're willing to pay for. Or maybe they won't
> and your users will wonder why your site is broken and go to your
> competitor's site instead. If you have a lot of idle instances
> running, it's because you're lucky that memory pressure hasn't evicted
> your spares, not because you're paying for service.

True. I've tuned based on testing, not based on theory. The current
scheduler seems to kick off instances like crazy, no matter how you
set Max Idle Instances. So having enough idle instances to service
requests isn't currently a problem.

Note that I'm not actually recommending setting Max Idle Instances to
1, but setting it to a fixed number. How high? It depends on the size
of traffic bursts you absolutely must service immediately. How big is
that number? Up to your situation. I bet it's smaller than infinity
(which is what Automatic means).

> My point in the "unofficial faq" is that this slider does not have a
> single "correct" setting, and if there was, it would certainly not be
> 1. You're basically paying for RAM. If you have a single-threaded
> server, your 128 MB frontend will run one concurrent request at a
> time. If you have an efficient multi-threaded server, your 128 MB
> frontend can serve an order of magnitude more requests - possibly more
> depending on how I/O bound your process is.

I haven't seen anyone's numbers from the java side (and I haven't
tried python 2.7 yet); is there actual data floating around on how
many requests you can expect to serve concurrently per instance, using
multi-threading?

I guess what's crucial here is, if you use multi-threaded request
processing code, do you see a serious decrease in the number of
instances the scheduler wants to kick off?

- If Max Idle Instances is set to Automatic, a serious decrease will
mean a serious billing decrease, because you are paying for all of
those instances
- If Max Idle Instances is set to a fixed number which lands Active
Instances + Max Idle Instances lower than Total Instances, then you
are paying for that setting + your activity, ie: no difference even if
your code is multi-threaded.
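
A rough Python sketch of the two cases above, using made-up numbers
rather than measurements. It assumes that multi-threading lowers the
total instance count but leaves the active rate unchanged; if the
active rate also drops, the fixed-setting bill would drop with it.

```python
def billable(active, total, max_idle=None):
    # None stands in for "Automatic": every running instance is billed.
    return total if max_idle is None else min(active + max_idle, total)


# Hypothetical steady-state figures, not measurements.
active = 2.0
total_single_threaded = 12.0
total_multi_threaded = 4.0  # assumes the scheduler starts fewer instances

for label, max_idle in (("Automatic", None), ("fixed at 1", 1)):
    single = billable(active, total_single_threaded, max_idle)
    multi = billable(active, total_multi_threaded, max_idle)
    print(f"{label}: single-threaded {single}, multi-threaded {multi}")
```

Under these assumptions the Automatic bill follows the total instance
count, while the fixed setting gives the same bill in both cases.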

Am I right that you are proposing that multi-threaded request handling
will seriously lower your total instance count (in reality, not just
in theory), and that you will therefore pay much less, with Max Idle
Instances set to Automatic, than a single threaded app with a low Max
Idle Instances setting would pay under the same load?

> I'm not saying you shouldn't set your max-idle-instances to 1 right
> now. For the time being, it *might* give you great performance at a
> great price. I'm just saying this isn't any kind of a real solution -
> you won't be able to complain when your latency goes to hell.

Maximizing future complaint possibilities is not the same as actually
lowering your bill.

Right now, it's absolutely a great solution to set that number low.
Everyone should do it. Everyone should also monitor their total
instance count and keep an eye out for a change in scheduler behaviour
that will render this advice incorrect, of course. But you should do
that anyway, because Google could change their scheduler's behaviour
without warning. If Max Idle Instances is set to Automatic, you'd pay
real money for that.

> Put it this way: If your multi-threaded system has an avg concurrency
> of 10, setting max-idle-instance to 1 is the equivalent of setting it
> to 10 on a single-threaded system. If you care about your user
> experience, you want that number high.
>
> Jeff

Sure, if you can get average concurrency of 10, then that's excellent.
But that's a performance consideration, not a price consideration.
Note that we are in a context where people are not complaining about
poor performance, they're complaining about increased costs. Also,
testing would indicate that moving Max Idle Instances around does not,
in practice, at the moment, change the actual performance of actual
apps significantly, only the guaranteed performance. People's needs
vary, but I'd hazard a guess that actual price is higher on most dev's
radars (particularly smaller devs) than performance guarantees,
especially while actual performance is good.

For those of us who are price sensitive (and we are talking about
price), lowering max idle instances (with the current scheduler
behaviour) is a no-brainer. Meanwhile, for Python users, multi-threading
is only just becoming an option, and may still not be ready
for production use (I don't know, I haven't tried it yet). Also,
modifying existing apps to work correctly in a multi-threaded
environment is non-trivial (read: tricky and risky) development work.
It's something we should all do eventually, but do the benefits as of
right now actually warrant the costs and the risks? I'm not convinced
that they do.

So I'll stand by this on pricing: Theoretically, multi-threaded apps
are superior on AppEngine and you should go that way. But in practice,
as of right now, and especially for Python, they are not, particularly
given that a good pragmatic solution to the issue of instance pricing
is available at the touch of a slider. This pragmatic situation may
(likely will) change, but right now it is absolutely the case.

Note that I'm not saying that multi-threaded code is a bad idea. It's
a good idea for all kinds of other reasons. But currently, pricing is
not one of those reasons.

Reading back over this, if I were on the AppEngine team I would be
thinking "he's saying we should aggressively clean up idle instances
above Max if we want people to move to Python 2.7 and write more
efficient multi-threaded code". I may have shot myself in the foot ;-)

Jeff Schnitzer

Oct 13, 2011, 7:12:04 AM

to google-a...@googlegroups.com

Let me summarize:

You ran an experiment and discovered that GAE left idle instances
running, above and beyond your max-idle-instances setting, for your
application at the time that you ran the test. Ok.

Unfortunately there's little reason to believe that this behavior
(lots of free instances) will continue for other applications at other
times; it may just be that you ran at an off-peak period. There's
little reason to believe this behavior will continue as Google
optimizes the scheduler, and a lot of reason to believe that it won't
(free instances are bad for the suddenly-relevant bottom line).

Should you set the max-idle-instances to a fixed number instead of
"auto"? Depends on your app. If you know how to balance your
traffic, startup latency, and avg request latency, then sure. If
you're more concerned about your bill than your user experience, then
sure.

Does this mean you don't need to move to a multi-threaded system? No,
because whatever number of idle instances you think you need, you can
divide that by N if you go multithreaded. N is a hard number to
predict because it depends on how much of your app is i/o bound.
"Normal" Java appservers serving typical webapps achieve concurrency
in the hundreds with ease. Ikai once mentioned that the initial
concurrency of Java was hardcoded at 10, but that has probably changed
by now. I don't know what it will be like with Python.... but really,
it's hard to imagine anything else you can possibly do that will have
an order of magnitude effect on your bill.

Jeff

Emlyn

Oct 13, 2011, 8:06:48 AM

to google-a...@googlegroups.com

On 13 October 2011 21:42, Jeff Schnitzer <je...@infohazard.org> wrote:
> Let me summarize:
>
> You ran an experiment and discovered that GAE left idle instances
> running, above and beyond your max-idle-instances setting, for your
> application at the time that you ran the test. Ok.

Well, that test was in fact designed to force the scheduler's hand. I
made the burstiest load that I could (I guess I could have changed the
default queue settings to make it even worse). I was testing a
conjecture about how the billing works, not the normal performance,
and it wouldn't be right to draw conclusions about normal scheduler
behaviour based on that.

> Unfortunately there's little reason to believe that this behavior
> (lots of free instances) will continue for other applications at other
> times; it may just be that you ran at an off-peak period. There's
> little reason to believe this behavior will continue as Google
> optimizes the scheduler, and a lot of reason to believe that it won't
> (free instances are bad for the suddenly-relevant bottom line).

(Well, I ran the tests for 20 hours at a time, so not off-peak, I think.)

In my main real app, turning Max Idle Instances down to 1 more or less
halves the idle instances (they're still up at 6 or 7 or so while the
active count is below 1).

But I don't care too much about that, because it's mostly background
processing. A little latency here and there means nothing in that
context.

>
> Should you set the max-idle-instances to a fixed number instead of
> "auto"? Depends on your app. If you know how to balance your
> traffic, startup latency, and avg request latency, then sure. If
> you're more concerned about your bill than your user experience, then
> sure.
>
> Does this mean you don't need to move to a multi-threaded system? No,
> because whatever number of idle instances you think you need, you can
> divide that by N if you go multithreaded. N is a hard number to
> predict because it depends on how much of your app is i/o bound.
> "Normal" Java appservers serving typical webapps achieve concurrency
> in the hundreds with ease. Ikai once mentioned that the initial
> concurrency of Java was hardcoded at 10, but that has probably changed
> by now. I don't know what it will be like with Python.... but really,
> it's hard to imagine anything else you can possibly do that will have
> an order of magnitude effect on your bill.

It can't have an order of magnitude effect on your bill unless you are
running with Max Idle Instances on Auto or have set it far too high.

>
> Jeff

Changing Max Idle Instances to 1 for http://my.syyn.cc dropped my
projected new billing charges by a factor of 6, with no perceivable
performance change. See
http://point7.wordpress.com/2011/09/03/the-amazing-story-of-appengine-and-the-two-orders-of-magnitude/
and
http://point7.wordpress.com/2011/09/07/appengine-tuning-an-instance-of-success/

For me, to get another order of magnitude would be nice, but is no
longer an emergency.

And that's something important, because a lot of people rage-quit
AppEngine when the new billing appeared, in the usually false belief
that they were hosed under the new system. In fact, while I was in the
process of introducing AppEngine in my commercial work for our new
development, I got some push back specifically around this issue,
because people had read that you must have multi-threaded code or else
the billing would be sky high (probably reading your article in fact).
And the perception was that this would make the whole thing too hard,
and why don't we look at some other platform?

So I'm *not* saying there's no point to a multi-threaded code base.
I'm saying that it is not the only way to fix instance pricing woes,
or even the best way (especially from a developer effort POV). And the
idea that multi-threading is the best and only way is turning small
time developers off the platform.

But I'll tell you what; the effect of multi-threading on appengine
python app instance pricing warrants a good hard look, with numbers
and tables and graphs and whatnot. Reading your stuff in detail has
convinced me to put it on my shortlist, particularly with an eye to
developing some simple techniques for web apps, for people who would
otherwise be intimidated by heavy concurrency work. So I apologise for
raising your hackles, and thank you for the interaction, and the
motivation to take a look at multi-threaded Python.

Alexis

Oct 14, 2011, 4:30:18 AM

to Google App Engine

I have not read the previous posts in detail, but would like to share
some figures on this subject.
I agree that setting max idle to automatic is very expensive. In my
case I adjusted the scheduler to get an acceptable balance between
latency and price.

To do so, I downloaded logs from the application and studied them.
The "pending_ms" tag is the important one here.

Instances graph:
http://dl.dropbox.com/u/497622/instances.png

I changed the max idle instances from automatic to 20.
Before change:
----------
Out of 214903 studied requests we have:
(from 2011-10-13 09:23:07 to 2011-10-13 09:46:31)
Average ms: 303
Pending requests: 2.4 %
Global Average pending_ms: 45
Pending Average pending_ms: 1896
Loading requests: 0.0
----------

1h after the change:

----------
Out of 270115 studied requests we have:
(from 2011-10-13 11:25:56 to 2011-10-13 11:55:46)
Average ms: 345
Pending requests: 17.3 %
Global Average pending_ms: 90
Pending Average pending_ms: 522
Loading requests: 0.3
----------

So reducing the max idle instances settings had the following impact:
- general latency increase of only 50ms
- but 17% of the requests now wait 500ms before being served

This appears to be acceptable for our application, and the price
reduction is really worth it (from 240 instances down to 60!).
Having an even lower max idle instance setting (10? 5?) would further
increase latency while the price reduction would not be significant
(as we have many instances running).
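
For anyone wanting to produce a similar summary, here is a rough Python
sketch. It is not the script used for the figures above. It assumes
each request line in the downloaded log carries space-separated
key=value fields such as ms=..., pending_ms=... and loading_request=1,
and that a "pending" request is one with pending_ms greater than zero;
check both assumptions against your own log file before trusting the
output. The function name and the result keys are made up here.

```python
import re

# Field patterns for the assumed key=value log format.
MS = re.compile(r"\bms=(\d+)")
PENDING = re.compile(r"\bpending_ms=(\d+)")
LOADING = re.compile(r"\bloading_request=1\b")


def summarize(lines):
    """Summarize latency and pending time over an iterable of log lines."""
    total = pending_count = loading_count = 0
    ms_sum = pending_sum = 0
    for line in lines:
        ms = MS.search(line)
        if not ms:
            continue  # not a request line in the assumed format
        total += 1
        ms_sum += int(ms.group(1))
        pending = PENDING.search(line)
        wait = int(pending.group(1)) if pending else 0
        if wait > 0:
            pending_count += 1
            pending_sum += wait
        if LOADING.search(line):
            loading_count += 1
    if total == 0:
        return None
    return {
        "requests": total,
        "average_ms": ms_sum / total,
        "pending_pct": 100.0 * pending_count / total,
        "global_avg_pending_ms": pending_sum / total,
        "pending_avg_pending_ms":
            pending_sum / pending_count if pending_count else 0.0,
        "loading_pct": 100.0 * loading_count / total,
    }


# Made-up lines in the assumed format, for demonstration only.
sample = [
    "ms=300 cpu_ms=20 pending_ms=0",
    "ms=500 cpu_ms=25 pending_ms=400 loading_request=1",
    "ms=200 cpu_ms=10",
    "ms=400 cpu_ms=15 pending_ms=600",
]
print(summarize(sample))
```

To run it on a real file, pass the open file object, for example
summarize(open("myAppLogs.txt")). If the definitions assumed here match
those behind the figures above, the global average pending time should
equal the pending fraction times the pending average, and the reported
numbers fit that: 0.024 x 1896 is about 45.5 and 0.173 x 522 is about
90.3.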

Will

Oct 14, 2011, 11:47:33 AM

to google-a...@googlegroups.com

Alexis,

Where can I find and download the said logs?

Thanks,

Will

Alexis

Oct 14, 2011, 12:13:57 PM

to Google App Engine

Hi Will,

I used an appcfg.py command, like this:

appcfg.py --num_days=1 --append --include_all request_logs <path to your app root directory> myAppLogs.txt

Will

Oct 14, 2011, 7:34:34 PM

to google-a...@googlegroups.com

Alexis,

Works like a charm. Wrote a program using the data to get a good
overall picture. Thanks!

Will

Tim

Oct 16, 2011, 9:17:38 AM

to google-a...@googlegroups.com

On Thursday, October 13, 2011 1:06:48 PM UTC+1, Emlyn wrote:

> But I'll tell you what; the effect of multi-threading on appengine
> python app instance pricing warrants a good hard look, with numbers
> and tables and graphs and whatnot. Reading your stuff in detail has
> convinced me to put it on my shortlist, particularly with an eye to
> developing some simple techniques for web apps, for people who would
> otherwise be intimidated by heavy concurrency work. So I apologise for
> raising your hackles, and thank you for the interaction, and the
> motivation to take a look at multi-threaded Python.

If you guys keep on backing up opinions with test cases and data, agreeing to the validity of each other's points, and then reaching conclusions in a consensual and respectful fashion, you'll give the internet a bad name.

C'mon.. won't somebody call someone else a Nazi ?? Think of Godwin's children...

:)

[i.e. investigative work and associated discourse is appreciated]

--

Tim
