Where to start optimizing cost?


Andrin von Rechenberg

unread,
Dec 31, 2011, 10:42:25 AM12/31/11
to google-a...@googlegroups.com
Hey there

I'm an absolute GAE-Lover! The only thing that is bothering me is cost.
We spend about $10'000 a month in GAE cost. That's just too much for
the traffic we serve. We are starting to look around for alternatives, but
I'd really love to stay with GAE and just optimize the system.

One of GAE's biggest flaws IMHO is that it is really hard to see where
the costs are coming from. (Yes, we have appstats in place, but our
system is just too big to manually sample hundreds of requests.)

So here is a feature request that might help us optimize and stay with GAE:
Show cost per URI in the dashboard. That would be incredibly helpful.

Does anyone know of an existing, elegant way to figure out where the
costs come from? (I'm looking more for a way to measure, rather than
software solutions like "use memcache".) I could do something with
prodeagle.com and estimate the cost per request myself, but I'd rather
use an already existing solution...

Cheers,
-Andrin

Kaan Soral

unread,
Dec 31, 2011, 2:34:43 PM12/31/11
to google-a...@googlegroups.com
Wow, very impressive. Out of curiosity, what kind of app do you have?

It's also impressive that you are at the $10k level, with hundreds of requests to sample, and have no idea about the source of the costs. Don't get me wrong, it's very interesting.

(I am not giving any advice, since that's not what you are looking for, but daily consumption figures should give a general idea.)

Brandon Wirtz

unread,
Dec 31, 2011, 4:40:02 PM12/31/11
to google-a...@googlegroups.com

I do consulting on this. You are likely in the price range where you could benefit from it.

You should also check the archive for some of the optimizations I have talked about publicly. But if you want to hit me up off-list, I can usually give you an estimate of how much reduction can be made just by looking at your Google Analytics and a few hours of your app's logs.

Most clients see a reduction of about 30% from an 8-hour consult. During that consult we'd discuss code logic, and your team would make the code changes. If my team makes the code changes, it will cost a bit more. We have in rare instances seen up to 80% reductions in hosting costs, but 30% is very typical, and even companies that have done a lot of optimization are often surprised to see 25%. If I can't find enough to offset my consulting price in 90 days, I won't charge you for the consult.

-Brandon

 

Brandon Wirtz
BlackWaterOps: President / Lead Mercenary


Work: 510-992-6548
Toll Free: 866-400-4536

IM: dra...@gmail.com (Google Talk)
Skype: drakegreene

BlackWater Ops

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.


Yohan Launay

unread,
Dec 31, 2011, 11:59:34 PM12/31/11
to Google App Engine
$10,000 a month in GAE, that's a lot... and then $10,000 in consulting
fees with Brandon, haha.

anyway, back to business:
1) Are you using Python or Java?
2) How many apps are generating that much traffic?
3) Check the admin panel, and maybe even post a screenshot here (or a
link to it). It will already give you good clues about where you spend
the most money (bandwidth, datastore writes, reads, instances, URLs
that have higher traffic), which should help identify where to start.
4) Are you using third-party frameworks like JDO, Objectify, etc.?

Another problem arises if every URL is already optimized, cached, etc., but
you just have a lot of hits that each generate small, reasonable costs.
Then you must start thinking about proxying and relaying the content
from external systems, and also batch + delayed processing of your data
(i.e. instead of writing to the datastore on each request, you push the
data to a pull queue and write it in bulk).
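As a rough sketch of that batch + delayed write idea (the queue interface here is a stand-in; on App Engine you would add a taskqueue PULL task instead of the plain add() call, and the FakeQueue is for local illustration only):

```python
import json

def record_write(queue, data):
    """Buffer a write as a queue payload instead of doing a
    datastore put on every request. `queue` is any object with an
    add(payload) method; on App Engine you would wrap the payload
    in a taskqueue PULL task instead."""
    queue.add(json.dumps(data))

class FakeQueue(object):
    """Local stand-in for a pull queue, for illustration only."""
    def __init__(self):
        self.payloads = []
    def add(self, payload):
        self.payloads.append(payload)
```

A cron job or backend can then lease the buffered payloads and persist them with a single batch put.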

I have no doubt you already know this and are probably doing it
already. Also, I don't like appstats: it's very hard to track things
properly, and I believe it adds more load to the requests.

GAE used to return the cost of each request in an HTTP response
header; maybe it is still there.

Another thing that's easy to forget is that memcache can hurt you, since
you have zero control over data expiry (you can put something in memcache
and 2 ms later your data can be gone, so you end up hitting your datastore
on every request). Don't hesitate to implement your own in-memory caching
solution (a simple Java HashMap could do).
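A minimal sketch of such a hand-rolled in-process cache, in Python rather than Java (the TTL and the injectable clock are my additions for illustration; the cache only lives as long as the instance, so treat it as a best-effort layer in front of memcache/datastore):

```python
import time

class InstanceCache(object):
    """Per-instance in-memory cache with a TTL, as a fallback for
    memcache's unpredictable eviction."""
    def __init__(self, ttl_seconds=60, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}   # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() > expires_at:
            # Entry has aged out; drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)
```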


Keep us posted.

Cheers


jon

unread,
Jan 1, 2012, 7:02:45 PM1/1/12
to Google App Engine
Yohan, your suggestion regarding the use of a pull queue for batch writes
is interesting. In the context of keeping counts, do you think this
technique works better than sharded counters? I'm using the latter, but
it has what I think are non-trivial drawbacks:
1. The code has too many moving parts: the sharded counter
implementation, plus a job that regularly calculates the aggregate.
2. One count value requires writing and reading multiple pieces of
data: the shards and the aggregated value. More reads/writes = a more
expensive bill.

What are the pros and cons of using pull queue + batch writes instead
of sharded counters?
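For readers unfamiliar with the pattern, the moving parts in point 1 can be modeled in a few lines (a plain in-memory sketch; on App Engine each shard would be a datastore entity, which is where the extra reads and writes in point 2 come from):

```python
import random

class ShardedCounter(object):
    """In-memory model of the sharded-counter pattern: N shards absorb
    concurrent increments, and reading means summing every shard."""
    def __init__(self, num_shards=20):
        self.shards = [0] * num_shards

    def increment(self, delta=1):
        # Pick a random shard to spread write contention.
        index = random.randrange(len(self.shards))
        self.shards[index] += delta

    def total(self):
        # One logical read costs num_shards entity reads on App Engine,
        # which is why an aggregation job usually caches this value.
        return sum(self.shards)
```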

Yohan Launay

unread,
Jan 2, 2012, 6:47:07 PM1/2/12
to Google App Engine
Hi Jon,

I don't like sharded counters with the datastore and have always preferred
pull queue or memcache counters for the job. The only issue with memcache
counters is that you don't control when the cached key will be flushed,
so you have to check the counters very often; the frequency depends
on how much data you are ready to lose. For example, if you can bear to
lose the last minute of data, just check + reset the counters in memcache
every minute. For memcache counters I check every 1-5 s: i.e. read the
counter value and decrement it by the value you just read. If you use the
same key frequently enough, it shouldn't disappear too often.
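The read-then-decrement-by-what-you-read trick can be sketched like this (the cache object stands in for google.appengine.api.memcache; decrementing by the value you read, rather than resetting to zero, means increments that land between the read and the decrement are not lost; the FakeCache is for local illustration only):

```python
def flush_counter(cache, key):
    """Read a memcache-style counter and decrement it by the value
    read, leaving any concurrent increments behind."""
    value = cache.get(key) or 0
    if value:
        cache.decr(key, delta=value)
    return value

class FakeCache(object):
    """Minimal local stand-in for a memcache client."""
    def __init__(self):
        self.data = {}
    def get(self, key):
        return self.data.get(key)
    def incr(self, key, delta=1):
        self.data[key] = self.data.get(key, 0) + delta
    def decr(self, key, delta=1):
        self.data[key] = max(0, self.data.get(key, 0) - delta)
```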

As for pull queues, I've been using them all along and so far they work
great. It's like sharded counters without the datastore writes, which
is great for temporary data (the counters). Just make sure your pull
queue is large enough. Also, you may want to use multiple queues if
your write frequency is very high, or switch to memcache counters if
your write frequency is super high (much faster than the queue API, but
with a risk of losing data).
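The consuming side of the pull-queue counter can be sketched as follows (lease_tasks and batch_put are stand-ins for taskqueue.Queue.lease_tasks() and a bulk datastore put; on App Engine you would also delete the leased tasks after a successful write):

```python
import json
from collections import Counter

def drain_counters(lease_tasks, batch_put, lease_seconds=60, max_tasks=1000):
    """Lease a batch of pull-queue payloads, aggregate the counter
    increments they carry, and write the totals in one batch."""
    payloads = lease_tasks(lease_seconds, max_tasks)
    totals = Counter()
    for payload in payloads:
        increments = json.loads(payload)       # e.g. {"page_views": 3}
        for name, delta in increments.items():
            totals[name] += delta
    if totals:
        batch_put(dict(totals))                # one write instead of many
    return dict(totals)
```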

Cayden Meyer

unread,
Jan 2, 2012, 10:09:52 PM1/2/12
to Google App Engine
Hi Andrin,
The admin console can provide a great deal of information on where
your costs are coming from.
I know that you specified that you wanted ways to monitor rather than
solutions to reduce costs; however, here are four fairly easy-to-measure
changes:
- If your application is not CPU bound, you may wish to migrate to
Python 2.7. This can lower the number of instances required to serve
the same amount of traffic. The change can be seen by comparing the
number of active instances needed to serve traffic y on Python 2.7 vs.
on Python 2.5. Note: Python 2.7 is currently experimental, and your
latency may increase when moving to it.
- Using memcache can reduce datastore operations.
- Use edge caching where possible; this can reduce the number of
instances required to serve traffic.
- If you can roughly predict the amount of traffic you will receive,
discounted instance hours are a good way to reduce costs.
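The edge-caching point largely comes down to setting the right response headers; a minimal sketch (the helper name is mine, and `headers` can be any dict-like response headers object, e.g. self.response.headers in a webapp handler):

```python
def make_edge_cacheable(headers, max_age_seconds=3600):
    """Mark a response as publicly cacheable so the edge cache and
    intermediate proxies can serve it without touching an instance.
    Only safe for responses that are identical for all users."""
    headers['Cache-Control'] = 'public, max-age=%d' % max_age_seconds
```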
Hope this helps you optimize your application. There are more ways to
optimize, but these are just a few simple ones which can make quite a
difference.

Cayden Meyer
Product Manager, Google App Engine


Brandon Wirtz

unread,
Jan 2, 2012, 10:54:35 PM1/2/12
to google-a...@googlegroups.com
Cayden,

I'm a big fan of Python 2.7, but I wouldn't dream of telling someone running
an app this large to move to it right now. I've seen what happens when
the scheduler gets things wrong, and that "increased latency" results in
"massive timeouts".

Python 2.7 is awesome if all of your requests are under 7 seconds. But our
experience has been that if more than 1 in 50 of your requests takes more
than 7 seconds, Python 2.7 will buckle under load.

I blame the scheduler, and that might not be the actual problem, but I know
that when you start having long requests, things start timing out. The
scheduler appears willing to stack things poorly, and under load these
timeouts cascade.

-Brandon


Andrin von Rechenberg

unread,
Jan 3, 2012, 8:02:51 AM1/3/12
to google-a...@googlegroups.com
Hey there

First of all: It's great to have such an active community.

(@Kaan) My app is called MiuMeet. It's one of the leading location based Social/Dating Networks on mobile.

(@all) The dashboard gives me a nice idea of where the costs come from. I'd like to analyze datastore reads/writes more closely. All static content is cached forever externally (I'm using URL cache busting).

(@Yohan) It runs on Python, and it's a single app that generates this amount of traffic. I don't use third-party frameworks. Thanks for the pointer about the HTTP cost header. The problem is that since I do heavy caching, this header will only help me when the cache is cold. I can get an idea from this header, but I would have to record a couple of thousand of them. Is there no middleware for this, like appstats? :)

(@jon) Why use sharded counters (maybe I've misunderstood something)? I have built a pretty cool counter system for App Engine that was featured on the Google App Engine blog:

(@Cayden) I use discounted hours heavily. I would also love to try Python 2.7, but as Brandon points out, there are reasons to be hesitant.

(@Brandon) Your offer sounds interesting. I think you'd need a little more than 8h due to the size of the system - but maybe I underestimate the power of your mermaid costume :) I have built quite a few systems during my time as a Google employee that are much bigger than MiuMeet, and I have a lot of ideas for how to optimize MiuMeet. My main problem is that I need to figure out quickly how much cost I can save with which optimization, and therefore I'd like to better measure where the money is spent. Once I see where it is spent, I will probably have an idea of how to optimize it. My problem is not the engineering challenge but the time to implement all the optimizations, so I'd like to start with the low-hanging fruit. But I will definitely think about your offer.

(@all) As I mentioned above: given that there is an HTTP response header that estimates the cost of a request, shouldn't it be quite straightforward to build a middleware like appstats that lists cost per request path? (I haven't looked at all into building middlewares.)
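To sketch what such a middleware could look like (a plain WSGI wrapper; `estimate_cost` is a hypothetical callable, since the per-request estimate would have to come from something like appstats' recorded RPC data rather than the billing header, which is added outside the app):

```python
import collections

class PathCostMiddleware(object):
    """Wrap a WSGI app and aggregate an estimated cost per request
    path. `estimate_cost` is a caller-supplied callable taking the
    WSGI environ and returning a cost figure for that request."""
    def __init__(self, app, estimate_cost):
        self.app = app
        self.estimate_cost = estimate_cost
        self.cost_by_path = collections.defaultdict(float)

    def __call__(self, environ, start_response):
        path = environ.get('PATH_INFO', '/')
        result = self.app(environ, start_response)
        # Accumulate the estimate after the wrapped app has run.
        self.cost_by_path[path] += self.estimate_cost(environ)
        return result
```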

Cheers and thanks to everyone for the replies
-Andrin

blackpawn

unread,
Jan 4, 2012, 3:49:08 PM1/4/12
to Google App Engine
My biggest cost savings so far have unfortunately come from moving
chunks over to Amazon S3 and EC2. I love App Engine and want to keep
as much as I can on it, but the pricing changes have been a real
bummer. The easiest wins were moving all static content to S3, so I
don't have to pay for GAE instances to serve it, and eliminating use
of Channels in favor of socket.io on EC2.


Andrin von Rechenberg

unread,
Jan 5, 2012, 10:18:45 AM1/5/12
to google-a...@googlegroups.com
Actually, appstats gives me pretty much what I need, now that I've looked at it more carefully.
If you put the billing history side by side with the appstats RPC stats (which you can get per path),
you can see exactly which paths your costs come from (except for CPU time).

Cheers,
-Andrin

jon

unread,
Jan 8, 2012, 2:13:01 AM1/8/12
to Google App Engine
Andrin, what conclusion did you draw from that side-by-side view?

For us it's the datastore that costs the most. I spent days rewriting our
fanout implementation from a datastore-oriented one to a
memcache-oriented alternative. Ironically, our initial implementation was
based on Brett Slatkin's talk.

Andrin von Rechenberg

unread,
Jan 9, 2012, 6:06:58 AM1/9/12
to google-a...@googlegroups.com
The biggest conclusion I drew was that mapreduce was the main
cost in datastore reads. I had several different mapreduces
running over the same entities. I have now combined them all
into one mapreduce, which reduced costs quite a bit.
Additionally, I created an Environment class that acts as an
instance cache and only "puts" entities back to the datastore
once, to avoid multiple puts during a single request (once you
have a huge system, it's sometimes hard to track which parts of
the system might already do put requests)...
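A minimal sketch of what such an Environment class might look like (the name Environment is Andrin's, but the API here is guessed; `batch_put` stands in for e.g. db.put on App Engine):

```python
class Environment(object):
    """Request-scoped entity cache that defers datastore writes.
    Entities touched during a request are held in memory, and a
    single flush() at the end issues one batch put instead of
    scattered puts from different code paths."""
    def __init__(self, batch_put):
        self._batch_put = batch_put
        self._pending = {}   # key -> latest entity version

    def put(self, key, entity):
        # Overwrites earlier versions, so an entity touched by several
        # code paths is still written only once.
        self._pending[key] = entity

    def get(self, key):
        return self._pending.get(key)

    def flush(self):
        if self._pending:
            self._batch_put(list(self._pending.values()))
        self._pending = {}
```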

Cheers,
-Andrin

objectuser

unread,
Jan 9, 2012, 2:47:47 PM1/9/12
to google-a...@googlegroups.com
Seems like we keep seeing this. I think it may be due to a variety of reasons that are either application- or developer-specific, but I also think that developers are slowly learning which apps work best on a particular platform, and part of that is realizing that GAE is not really a good fit for some kinds of apps... maybe the majority of apps, for all I know.

I architected my app so that the persistence layer is isolated from the rest of the app, but I'm sure there's some conceptual leakage into the app, if for no other reason than that I structured my data to be optimized for GAE. It makes me wonder how hard it would be to port to a traditional infrastructure. Something I may be forced to find out.