Why Google AppEngine sucks


walterc

Sep 24, 2009, 10:25:55 AM
to Google App Engine
is there any response to this: http://3.14.by/en/read/why-google-appengine-sucks

we are seriously considering using gae over ec2 but if there is any
truth to the issues raised in the above link, we might need to
reconsider. of course we will do some preliminary stress testing of
our own before we decide but it does raise concerns.

Brandon N. Wirtz

Sep 24, 2009, 2:04:33 PM
to google-a...@googlegroups.com
It's written by a PHP/LAMP guy struggling with the language, not the
service.

GAE vs. EC2 is really about what you are doing with the platforms. There are
even cases where you may want to run some portions of an application on
GAE and others on EC2.

For instance, I'm using GAE to deliver all of the static portions of my
websites because the cost per bit is cheaper and the multiple points of
presence give better response times to my visitors.

But many of those sites use EC2 for making API calls to up to 40 sources at a
time, because GAE has a 10-second limit and some of those APIs take 11
seconds to come back individually, let alone all at once.

It's like comparing my '66 Mustang to my '08 Mini Cooper. In the quarter
mile the Stang is faster, but at Laguna Seca the Mini is almost twice as
fast. And for cruising to the frozen yogurt stand the Stang gets more head
turns, but cross country the Mini gets better mileage.

Each tool has its right purpose. Neither Google nor Amazon are idiots.
They approached the market differently, but possibly equally. You need to
look at your application, not the results of a guy with a strange .by domain
name.

-Brandon Wirtz
BlackWaterOps.com

Joshua Smith

Sep 24, 2009, 2:13:09 PM
to google-a...@googlegroups.com
We moved our corporate web site ( http://www.kaon.com ) to GAE
recently. It's mostly static content, with some simple dynamic stuff
for events, press, etc. Performance is way up over our old hosted PHP
environment, and the read-only stuff has continued to work through all
the maintenance outages.

Certainly you need to write datastore-accessing code being wary of
exceptions. If you do that from the outset, and think through the
right thing to do in exceptional cases, it's really not a big deal.
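
As a rough illustration of what "being wary of exceptions" can look like, here is a minimal retry sketch. `DatastoreTimeout`, `with_retries` and `make_flaky_put` are hypothetical stand-ins so the snippet runs anywhere; a real app would catch whatever exceptions its SDK version actually raises and decide what the right fallback is.

```python
import time


class DatastoreTimeout(Exception):
    """Hypothetical stand-in for a transient datastore error."""


def with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(); on a transient error wait and retry.

    The error is re-raised once the last attempt has failed.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except DatastoreTimeout:
            if attempt == attempts - 1:
                raise  # out of retries: let the caller show an error page
            sleep(base_delay * (2 ** attempt))  # exponential backoff


def make_flaky_put(failures):
    """Build a fake put() that fails `failures` times, then succeeds (demo only)."""
    state = {"calls": 0}

    def flaky_put():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise DatastoreTimeout("simulated timeout")
        return "stored after %d calls" % state["calls"]

    return flaky_put
```

Calling `with_retries(make_flaky_put(2))` succeeds on the third call; with more failures than attempts the exception reaches the caller, which is where the "right thing to do" has to be decided.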

The rest of his "microbenchmark"-based complaints are kind of silly.

-Joshua

Sri

Sep 24, 2009, 9:20:47 PM
to Google App Engine
Hi,

There is "some" truth to it. One thing the article does highlight
is that you have to write things the App Engine way. For people coming
from an RDBMS background that's a hard thing to swallow. Take the counter
example, for instance: the GAE team recommends a sharded approach to
it. Yep, that sounds great, but at the prototyping phase you would
certainly be hard-pressed to do this kind of optimisation, or even
think about it, when your first priority is to throw things and see
what sticks.
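
For readers who haven't seen it, the sharded approach can be sketched roughly like this. A plain dict stands in for the datastore so the snippet runs locally; `ShardedCounter`, the shard count and the key format are illustrative names, not the GAE team's code.

```python
import random


class ShardedCounter:
    """Spread increments over several shard records to reduce write contention."""

    def __init__(self, name, num_shards=20, store=None):
        self.name = name
        self.num_shards = num_shards
        # A dict stands in for the datastore: one entry per shard record.
        self.store = store if store is not None else {}

    def _shard_key(self, index):
        return "%s-shard-%d" % (self.name, index)

    def increment(self, delta=1):
        # Pick a random shard so concurrent writers rarely hit the same record.
        key = self._shard_key(random.randrange(self.num_shards))
        self.store[key] = self.store.get(key, 0) + delta

    def value(self):
        # Reading means summing every shard; that is the price of cheap writes.
        return sum(
            self.store.get(self._shard_key(i), 0) for i in range(self.num_shards)
        )
```

The trade-off is visible even in the toy version: writes touch one small record each, while reads have to visit every shard.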

My personal gripe is the ludicrous amount of time it takes to
clear a database. Again, I cannot see why a single "clear
production datastore" wasn't available. It took me about 2 days to
delete a million records (I had reached the free limits by then).
Using 10 threads, that came down to about 16 hours. Not sure if it was
"me" or just that the threads were thrashing the system, but even if
the time came down to 3 hours, that's an unnecessarily large amount of
time just to clear a gigabyte of data (indexes and all). Yes, I agree
that a well-engineered system should not have to delete data at all
and migration/model evolution should be simple and planned for. But
again, when prototyping, what are you trying to focus on? By the way,
the 2 million records were actually test data, not exactly production
data either.

The problem comes down to this: what is your alternative in the price
range?

cheers
Sri

Sri

Sep 24, 2009, 9:21:03 PM
to Google App Engine

Tim Hoffman

Sep 24, 2009, 9:27:58 PM
to Google App Engine
There are some grains of truth; however, he fails to detail the effort
you need to put in to build a moderately reliable system, let alone one
that can withstand multiple failures.

That alone is worth the price of admission.

Try building a Linux HA cluster with load balancing, failover and
a replicated database.

It's not easy and requires a lot of expertise.

Even with all the tools in EC2 it's not a simple task.

I personally don't have a problem with the non-SQL nature of the
environment; I have only occasionally used SQL for things over the
last 10 years.

T

Gabriel Robertson

Sep 24, 2009, 10:59:17 PM
to google-a...@googlegroups.com
On Thu, Sep 24, 2009 at 7:27 PM, Tim Hoffman <zute...@gmail.com> wrote:
>
> There are some grains of truth; however, he fails to detail the effort
> you need to put in to build a moderately reliable system, let alone one
> that can withstand multiple failures.
>
> That alone is worth the price of admission.
>
> Try building a Linux HA cluster with load balancing, failover and
> a replicated database.
>
> It's not easy and requires a lot of expertise.
>
> Even with all the tools in EC2 it's not a simple task.
>
> I personally don't have a problem with the non-SQL nature of the
> environment; I have only occasionally used SQL for things over the
> last 10 years.

I found the datastore pretty obvious to use, but then again Mnesia is
like it. Ooo, an Erlang back-end would be rather awesome, and relatively
easy for Google to implement; I would use that in a
heartbeat. It would perform *very* well (better than Python,
probably not as fast as Java), and the programming style is
*perfectly* designed for this: lots of little actors with no
interacting data except through the datastore. :)

Robin B

Sep 25, 2009, 1:49:45 AM
to Google App Engine
It's interesting that you mention Erlang: I was working on building an
Erlang-based app cluster around the time when AppEngine was
announced/released. You can achieve a much higher handlers/CPU or
handlers/memory density using Erlang because each handler is a green thread
with cheap context switching, each handler/process costs as little as
200 bytes of system memory, and system libraries can be loaded once into
memory and shared between all apps because Erlang is a functional
programming language. The thing that made me trade Erlang for
AppEngine was having access to BigTable.

AppEngine has numerous features (simple deployment, load balancing,
dynamic scalability), but the main benefit is access to a scalable
database; BigTable provides seamless multi-master database writes. If
a developer has never considered the challenges of scaling database
write throughput, then they would not realize how much time AppEngine
will save them in designing and hosting a truly scalable web
application.

Robin

On Sep 24, 9:59 pm, OvermindDL1 <overmind...@gmail.com> wrote:

Gabriel Robertson

Sep 25, 2009, 1:56:17 AM
to google-a...@googlegroups.com
On Thu, Sep 24, 2009 at 11:49 PM, Robin B <rob...@gmail.com> wrote:
>
> That's interesting you mention Erlang: I was working on building an
> Erlang based App Cluster around the time when AppEngine was announced/
> released.  You can achieve a much higher handlers/cpu or handlers/
> memory density using Erlang because each handler is a green thread
> with cheap context switching, each handler/process costs as little as
> 200 bytes of system memory, and system libraries can be loaded once into
> memory and shared between all apps because Erlang is a functional
> programming language.  The thing that made me trade Erlang for
> AppEngine was having access to BigTable.
>
> AppEngine has numerous features (simple deployment, load balancing,
> dynamic scalability), but the main benefit is access to a scalable
> database; BigTable provides seamless multi-master database writes.  If
> a developer has never considered the challenges of scaling database
> write throughput, then they would not realize how much time AppEngine
> will save them in designing and hosting a truly scalable web
> application.

It would not be hard to add BigTable to Erlang, though; Erlang is not
that hard to bind to, after all. The Mnesia database built into Erlang
is fully distributed and fault tolerant, and tables can be made
to exist on disk or in memory only for speed, among other things.
It is actually quite powerful, just a bit slower than normal SQL
servers, of course, due to its distributed nature. Erlang also has
load balancing and dynamic scalability, and deployment with the
Erlang web server Yaws is quite simple. It is fully ready to handle
just about everything you could ever throw at it; you just need a few
computers to load it on first. :)

I prefer Python as a programming language (although I would still use
Erlang if AppEngine ever supports it; it is just an awesome language).
I am using AppEngine because other people requested that I do, and I
have been learning it. :)

Walter Chang

Sep 25, 2009, 2:08:11 AM
to google-a...@googlegroups.com
thanks a lot for all the comments.  i think one comment sums it up beautifully: "Neither Google nor Amazon are idiots", so both have their good and bad points.  i think we will go with gae for the prototype but keep the datastore bits separated just in case we need to switch to ec2 in the future.
--
.......__o
.......\<,
....( )/ ( )...

Jason Smith

Sep 25, 2009, 8:17:04 AM
to Google App Engine
About the link from the OP:

I can confirm better performance than that. I manage a web API on App
Engine that reaches 80-90 hits per second every day during peak load.
About 2/3 of the queries are reads, and 1/3 are write/read combos
(6-10 CPU seconds' worth per request).

On Sep 25, 1:08 pm, Walter Chang <weih...@gmail.com> wrote:
> thanks a lot for all the comments.  i think one comment sums it up
> beautifully: "Neither Google nor Amazon are idiots" so both have their good
> and bad points.  i think we will go with gae for the prototype but keep the
> datastore bits separated just in case we need to switch to ec2 in the
> future.
>
>
>
> On Fri, Sep 25, 2009 at 1:56 PM, OvermindDL1 <overmind...@gmail.com> wrote:

GregF

Sep 25, 2009, 8:22:38 AM
to Google App Engine
Having administered a small (4 machine) cluster for a minor web app, I
appreciate what it takes to do the job properly. EC2 takes hardware
out of the equation, but you still need to know your OS, middleware
and database like the back of your hand, and you need to continuously
manage it. Scaling databases is another issue, and is not for the
faint-hearted.

And then along comes Google offering a managed environment, and a
proven scalable datastore - at virtually no cost. There is undeniably
a learning curve to effectively use the datastore, and there are
system limitations that can be frustrating when you first encounter
them. In some cases these will preclude using Appengine exclusively,
but as another poster mentioned, you can fairly easily build hybrid
systems with other servers. I'd strongly recommend hiring someone
familiar with Appengine to advise you about where you might strike
problems, and how you might work around them.

We decided a year ago to build a new business on Appengine, despite
the pre-release tag. I'm now sure we made the right decision - we can
concentrate on the application and the business - what we do best -
and leave the system, database and network admin to Google. The app
has proven very popular, and we are in the process of launching in
North America. If it hits Oprah, I'll be guzzling champagne - instead
of trying to figure out how to replicate databases across EC2
instances.

Rob

Sep 25, 2009, 3:51:46 PM
to Google App Engine
Well, unless it's a maintenance Tuesday when you appear on Oprah; then
you'll wish you were on EC2 and could roll out a couple of extra app
servers :-)

I think you express a balanced viewpoint: GAE has some severe
limitations that make it unsuitable for a large variety of
applications, and EC2 is harder than you think (though not as hard as
some are making it out to be).

The only reason I would say GAE sucks is because of the very low rate
of progress on some key issues and that they are doing fun stuff like
adding Java support instead of getting Python to release quality.

Rob.

Mike

Sep 25, 2009, 8:35:34 PM
to Google App Engine
I am not struggling. I really like the idea of GAE. I would be happy
to have reliable automatically-clustered web applications.
The only problem is that it's not as reliable as I would like. Any
request might fail at any point, and that fades the initial shine of
GAE.

Brandon N. Wirtz

Sep 25, 2009, 10:25:35 PM
to google-a...@googlegroups.com
Did you write the article?

I'm a PHP/LAMP guy. Java isn't bad; Python is a complete mystery to me, but
I'm writing in it... though errors because I haven't indented a line
correctly drive me insane.

I'm not having reliability problems. That's what error handling is for.

-----Original Message-----
From: google-a...@googlegroups.com
[mailto:google-a...@googlegroups.com] On Behalf Of Mike
Sent: Friday, September 25, 2009 5:36 PM
To: Google App Engine

GregF

Sep 25, 2009, 11:57:19 PM
to Google App Engine
On Sep 26, 7:51 am, Rob <robert.osbo...@gmail.com> wrote:
> Well,  unless it's a maintenance Tuesday when you appear on Oprah then
> you'll wish you were on EC2 and could roll out a couple of extra app
> servers :-)

Touche! An apt point. The current frequent planned maintenance is a
nuisance, but we warn customers ahead of time and present a nice
message to users explaining what is going on. One odd effect is that
we've had great feedback from our warning messages - customers
perceive it as proving how hands-on and organised we are.

However, the reasons for the maintenance (see
http://googleappengine.blogspot.com/2009/09/migration-to-better-datastore.html)
are good - I'd rather they happened than Appengine become staler and
staler. And they've still got the pre-release tag to wave at us. :(
But I'm expecting maintenance to get more and more infrequent.

My only other issue with Appengine is support - a user group really
doesn't cut it. So far I haven't had a critical issue to resolve, but
if I did, waiting for a Googler to notice my message on the group
would be extremely frustrating. An email address or a form to submit
issues, or an IRC channel would give more confidence - it's the
difference between polling and interrupt-based comms. To give credit
where it's due, Google have got much better at explaining what is
going on both with planned and unplanned outages.

Brandon N. Wirtz

Sep 26, 2009, 12:22:20 AM
to google-a...@googlegroups.com
> An email address or a form to submit issues, or an IRC channel would give
> more confidence

App Engine has WAY better support than any other Google product. If you post
to this list, Nick, Jason, and Jeff are good about responding.

AdSense and Webmasters have awful support. And AdWords used to give me an
account rep I could call, but they seemingly upped what you have to spend to
get that level of support.

I should organize a "take GAE Support to lunch" day. You guys like shawarma?


Jeroen

Sep 26, 2009, 8:37:17 AM
to Google App Engine
Biggest problem: the datastore is dead slow and uses insane amounts of
CPU. I found 2 ways around it, backwards ones IMHO, but if it works,
it works.
Maybe my use case is unique, as it involves frequent updates to the
stored data (10k records).

1st solution:
Only write to the datastore on every 2nd update of the data, and keep
the intermediate data in memcache. E.g.: 1) store in datastore & put in
cache, 2) fetch from cache, update cache (if not in cache, update
datastore), 3) store in datastore and update cache, 4) fetch from cache,
update cache (if not in cache, update datastore), 5) datastore, 6)
cache, 7) ... etc.
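
One way to read the alternating scheme above, as a hedged sketch: two dicts stand in for the datastore and memcache, and `AlternatingWriter` and its `dirty` flag are made-up names for illustration, not the poster's actual code.

```python
class AlternatingWriter:
    """Persist every other update; keep the in-between value in the cache only."""

    def __init__(self, datastore=None, cache=None):
        # Plain dicts stand in for the datastore and memcache.
        self.datastore = {} if datastore is None else datastore
        self.cache = {} if cache is None else cache
        self.datastore_writes = 0

    def update(self, key, value):
        entry = self.cache.get(key)  # None if never cached or evicted
        if entry is None or entry["dirty"]:
            # First update, cache miss, or a cache-only update is pending:
            # write through to the datastore now.
            self.datastore[key] = value
            self.datastore_writes += 1
            self.cache[key] = {"value": value, "dirty": False}
        else:
            # The previous update reached the datastore, so this one is
            # held in the cache only.
            self.cache[key] = {"value": value, "dirty": True}

    def read(self, key):
        entry = self.cache.get(key)
        if entry is not None:
            return entry["value"]
        return self.datastore.get(key)
```

The sketch also makes the risk concrete: if the cache entry is evicted while it is marked dirty, the most recent value is gone and readers fall back to the older datastore copy.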

2nd solution:
Store the non-indexed data (about 10 fields) in one big blob that you
serialize when storing data and deserialize when reading.
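
A minimal sketch of the blob idea using the standard `pickle` module. The field names in `INDEXED_FIELDS` and the `pack_record`/`unpack_record` helpers are assumptions made up for the example; only the fields you query on stay as separate columns.

```python
import pickle

# Hypothetical fields that queries filter or sort on; everything else is blobbed.
INDEXED_FIELDS = ("owner", "updated")


def pack_record(record):
    """Split a record into indexed columns plus one opaque blob for the rest."""
    row = {name: record[name] for name in INDEXED_FIELDS}
    rest = {k: v for k, v in record.items() if k not in INDEXED_FIELDS}
    row["blob"] = pickle.dumps(rest, protocol=pickle.HIGHEST_PROTOCOL)
    return row


def unpack_record(row):
    """Rebuild the full record from the indexed columns and the blob."""
    record = {name: row[name] for name in INDEXED_FIELDS}
    record.update(pickle.loads(row["blob"]))
    return record
```

Only unpickle blobs your own app wrote, since `pickle.loads` will execute whatever a crafted payload tells it to.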

Both work fairly well (combining both methods reduced CPU usage by
over 50%), but both are crippled by App Engine.

The 1st method needs a more reliable memcache; at least its limits need
to be clear. There have been moments when it was only able to hold 8k
items (10 MB of data in total), and moments when it would hold 20k
(adding up to about 30 MB). When it only holds 8k, data gets lost. Of
course the nature of a cache is that it can lose data, but it would be
nice if it behaved in a predictable way.

The 2nd method needs a well-performing serialization mechanism. For
Python the obvious choice is pickle (which I'm using), but in all its
wisdom Google decided not to include cPickle, so performance is
terrible. YAML yielded even worse results, as the C extension needed
to speed things up isn't available. Another option might be protocol
buffers; well, those don't work on App Engine (the google package in
which the Python code resides is locked or something).

All this gives me the feeling that I'm forced to pay CPU costs that
shouldn't be there:
- I didn't ask for a dead-slow BigTable datastore (it really is that
damned datastore that's still eating half my CPU usage)
- I try to optimize, but the tools for it are crippled

I fully understand the success story related to serving static content.
But for dynamic content, in future projects I'll happily not try
using App Engine anymore.

Walter Chang

Sep 26, 2009, 12:33:49 PM
to google-a...@googlegroups.com
the 2nd method makes sense to me.  the 1st method looks to me like playing russian roulette with my data; it is twice as fast when everything is working, but what if memcached crashes?  memcached is not fault tolerant and was not designed to be; after all, it's just a cache.  i think there are applications that can tolerate hiccups like that, but most of the applications i have worked on can't.

Joe Bowman

Sep 26, 2009, 1:52:33 PM
to Google App Engine
Here are my thoughts on the matter, as posted a few weeks ago:

http://joerussbowman.tumblr.com/post/182818817/why-im-dropping-google-appengine-for-my-primary

Basically, it depends on whether or not appengine is the right tool
for the job. If you do a lot of reading/writing to the backend
datastore, such as in the case of HTTP sessions, then I'd probably not
recommend it. I believe a higher-profile app will see fewer errors due
to it being kept hot most of the time; however, for low-use apps that
will see more cold boots on requests, you really have to be careful.

I will say, since whatever fix happened for the imports issue, I've
seen a little more reliability from my apps, but, it's not been long
enough (or used enough) for me to really stand behind that feeling. My
app, django using appengine-patch and gaeutilities session, was seeing
enough problems that even I stopped using it, and have since decided
to move off of Appengine for that specific project.

GregF

Sep 27, 2009, 8:31:18 PM
to Google App Engine
I'm not seeing the same issues as you - I have more objects but you
didn't specify how frequently you update, so maybe that's where the
difference is.

Dumb (of me if you have, of you if you haven't!) question - have you
added "indexed=False" to your model properties? This can make an
enormous difference.

K Miller

Sep 26, 2009, 12:25:19 AM
to google-a...@googlegroups.com
Google's 'App-Engine' SUCKS

Because it is Google's feeble attempt
 - of trying to be like Microsoft - here.
 
Just because you employ the same
'Lame-Brain' (just out of college) MORONs
 - to write the CODE here.
 
It will not bring the CALIBRE of this site
 - up to Microsoft's standards !!!
 
Just A Thought - HERE.
 
You NIM-RODs
 
To quote a line (here)
 -  from the Great Prophet "Bob Dylan"
 
"Where have all the great Coders gone
 - a long time, in passing !!!"
 
"Where have all the great Coders gone
 - from the great time - Ah Go."
 
 
;- )
 
--
Milka 1010 As 'spoken' - by me !!!

Jeroen

Sep 28, 2009, 10:49:29 AM
to Google App Engine
I did have indexed=False at strategic places ;) However, 5 properties
were ListProperties (updates were hourly).
It's mostly annoying that something I have very little control over
turned out to be the most CPU-intensive part of my application.

herbie

Sep 28, 2009, 5:46:37 PM
to Google App Engine
GAE doesn't suck. But...

I've really enjoyed building my first GAE project. It does almost
everything I want it to do, and so far it seems responsive and
reliable enough. But it is expensive in terms of api_cpu, which is
billable. The vast majority of my quota is used making API calls, in
particular writing to the datastore. Reading/writing to the GAE
datastore is too expensive (esp. in relation to other billable
services).

I have to trust Google to make these api calls as efficient as
possible, as I can't change them.

What concerns me most is that recently I experienced a sudden and
dramatic increase in api_cpu usage with no changes to my code; see
http://groups.google.com/group/google-appengine/browse_thread/thread/bbcf268e9df1d43f#
The api_cpu values have remained much higher than previously. I
still have no idea why. I haven't released my app yet, but if I had,
my bill would have doubled overnight, with no changes at my end!


Could Google work on optimising reading & writing to the datastore and/
or reduce the billing on api_cpu?

Peter Liu

Sep 28, 2009, 6:18:47 PM
to Google App Engine
After looking at the reported api_cpu usage numbers for so long, I am
convinced that the usage is estimated with a formula or model. It
varies over time, maybe depending on server condition in that time
period.

For example:
1 put -> 66 ms
batch put of 10 of the same object -> 666 ms
batch put of 100 of the same object -> 6666 ms

Those are real numbers reported. I see plenty of logs that have a
similar pattern.
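
A quick way to check how close those figures are to "a batch put is just N single puts" is to divide each reported total by the batch size. This only restates the three numbers above; the helper name `per_entity_ms` is made up for the example.

```python
# (entities in the batch, reported api_cpu in ms), taken from the figures above
reported = [(1, 66), (10, 666), (100, 6666)]


def per_entity_ms(samples):
    """Return the api_cpu cost per entity for each (count, total_ms) sample."""
    return [total_ms / count for count, total_ms in samples]


costs = per_entity_ms(reported)
spread = max(costs) - min(costs)  # how far the per-entity cost varies
print(costs, spread)
```

A per-entity cost that barely moves across two orders of magnitude of batch size is what you would expect from a formula, not from measured CPU time.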

From my observation, the usage in ms depends on these variables:
1. The API call itself; a batch put costs just a multiple of the CPU of a
single put
2. The number of fields that need to be updated (for new objects that is
most fields, which is my guess as to why new objects use far more CPU than
a subsequent update)
3. Whether the fields are indexed (this can be major for a wide schema,
since by default all indexable fields are indexed)

Very, very surprisingly, it doesn't depend much on the object
size. Say you have kinds with a blob of 1k, 10k, or 100k: each of
those uses the same api_cpu. In fact, a blob of 100k might take less
CPU than a string field that's indexed.

This is one of the oddities that I hope Google will clarify.
Developers will optimize based on those reported numbers (after all,
that's all we can observe). The current way api_cpu is
calculated will bias developers towards practices that
minimize that number, simply because it's the bottleneck (again, it's all
that we can see on the quota dashboard).

If Google doesn't want to release the formula, at least tell us whether
it will be changed in the future. I'd hate to see people's optimization
effort go to waste because the formula changed.

To answer your question of why the usage increased without a code
change: my guess is that the CPU usage formula changed. I doubt the
datastore suddenly became less efficient and started using 3x the CPU.

On Sep 28, 2:46 pm, herbie <4whi...@o2.co.uk> wrote:
> GAE doesn't suck.   But...
>
> I've really enjoyed building my first GAE project. It does almost
> everything I want it to do and so far it seems responsive and
> reliable enough.  But it is expensive in terms of api_cpu  - which is
> billable.   The vast majority of my quota is used making api calls in
> particular writing to the datastore.  Reading/writing to the GAE
> datastore is too expensive (esp. in relation to other billable
> services).
>
> I have to trust Google to make these api calls as efficient as
> possible, as I can't change them.
>
> What concerns me most is that recently I experienced a sudden and
> dramatic increase in api_cpu usage with no changes to my code.  (see http://groups.google.com/group/google-appengine/browse_thread/thread/...

herbie

Sep 29, 2009, 5:03:48 AM
to Google App Engine

A very good point re: Google changing the CPU usage formula.
Unfortunately we have to pay when Google changes the formula. It would
be great if they could tell us if this is true and why.

Your points on what affects api_cpu usage the most seem spot on.

I've stripped back my indexed fields to the bare minimum and stopped
using run_in_transaction() (a big CPU hit) to try and reduce the
api_cpu usage. But the reported values are still much higher now than
previously, when all that stuff was still in!

Roy Smith

Sep 28, 2009, 10:32:32 PM
to google-a...@googlegroups.com
> Touche! An apt point. The current frequent planned maintenance is a
> nuisance, but we warn customers ahead of time and present a nice
> message to users explaining what is going on.

Since I've never seen my app during maintenance, can you elaborate on the "nice message" bit?  Are all GAE apps disabled, or are they allowed to run with reduced functionality? If the latter, is there any systemic information available to the app to tell it that it's running in maintenance mode?

best
Roy


 

GregF

Sep 30, 2009, 1:45:46 AM
to Google App Engine
On Sep 29, 3:32 pm, Roy Smith <roy.smith....@googlemail.com> wrote:
> Since I've never seen my app during maintenance, can you elaborate on the
> "nice message" bit?  Are all GAE apps disabled, or are they allowed to run
> with reduced functionality? If the latter, is there any systemic information
> available to the app to tell it that it's running in maintenance mode?

Usually Google only stops write access when in maintenance. There is
an undocumented API which allows your app to detect if a particular
capability is not available, or will become unavailable in a given
timespan. I detailed my use of it here:

http://appengine-cookbook.appspot.com/recipe/gracefully-handling-system-maintenance
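
The general shape of such a check, as a hedged sketch rather than the recipe itself: the capability test is passed in as a callable so the snippet runs anywhere. `handle_save`, its arguments and the message text are hypothetical. On App Engine the callable would wrap the capabilities API (I believe that is `CapabilitySet(...).is_enabled()` in `google.appengine.api.capabilities`, but treat that as an assumption and check the recipe).

```python
def handle_save(form_data, datastore_writable, save):
    """Save if datastore writes are available, otherwise return a friendly notice.

    datastore_writable: zero-argument callable returning True or False
                        (on App Engine this would wrap the capability check).
    save:               callable that actually persists form_data.
    """
    if not datastore_writable():
        # Writes are disabled (e.g. planned maintenance): tell the user
        # instead of letting the request fail with an exception.
        return 503, "Scheduled maintenance: saving is paused, please try again shortly."
    save(form_data)
    return 200, "Saved."
```

Because reads usually keep working during maintenance, only the write path needs this guard; read-only pages can be served as normal.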

Roy Smith

Sep 30, 2009, 2:34:01 AM
to google-a...@googlegroups.com
Perfect, and a really great site btw. Highly commended