Why Google App Engine is broken and what Google must do to fix it.

Aral Balkan

Oct 3, 2008, 2:52:32 PM
to Google App Engine
I just wrote up a blog post summarizing the biggest issues I have with
App Engine:
http://aralbalkan.com/1504

If you are developing real-world/commercial apps with App Engine,
please add your thoughts to the discussion.

Thanks,
Aral

Bill

Oct 3, 2008, 3:54:45 PM
to Google App Engine
Aral, your blog entry clearly outlines some major issues, so thanks
for posting it.

Here are my thoughts on the issues:

1) 1 MB limit
It's a problem if Python variables also have this limit. I was not
aware of that. I have been using external resources (Scribd and
Amazon S3) to store larger items like documents. This is similar to
the way Amazon wants you to use their SimpleDB but at least App Engine
datastore has more headroom.

2) The 1000 limit offset
I typically have created_at and updated_at time fields for each
entity. Couldn't you page through all your items by sorting on the
created_at property, fetching the first n+1 records, then fetching
additional records using >= the timestamp for the n+1th record? Of
course, you'd have to put in some code to check for duplicates or use
a secondary property as part of the sort.

3) Short-term high CPU quotas
Yeah, I'm not sure how these are computed or what might trip it.
Hopefully this gets resolved and/or clarified.

4) Quotas in general.
I generally agree but think we should cut Google some slack. This is
a preview mode. You're a pioneer and you're taking some arrows with
your app :) When pay-as-you-go gets initiated, hopefully soon, the
major quotas go away with your credit card. As a counter-point, you
can read Dave's commentary on how he's scaled BuddyPoke
(http://groups.google.com/group/google-appengine/msg/2449df5e53b5f026).

5) Long-running processes.
I agree and the App Engine developers know this is a need. I believe
they are working on it.

6) 25% ready for primetime
Depends on your app and need for processing across a large number of
records.
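
The paging idea in point 2 (sort on created_at, fetch n+1 records, resume from the n+1th record) might look roughly like the sketch below. It is plain Python over an in-memory list, not datastore code: `fetch_page`, `iterate_all` and the dict layout are made up for illustration, and the tuple comparison stands in for the ">=" filter plus a secondary sort property.

```python
from operator import itemgetter


def fetch_page(entities, page_size, cursor=None):
    """Return (page, next_cursor), ordered by (created_at, key).

    `entities` is a hypothetical stand-in for a datastore kind: a list of
    dicts with a 'created_at' value and a unique 'key'. `cursor` is the
    (created_at, key) pair of the first record of the wanted page.
    """
    ordered = sorted(entities, key=itemgetter("created_at", "key"))
    if cursor is not None:
        # Stand-in for the ">=" filter; the unique key is the secondary
        # property that keeps equal timestamps from producing duplicates.
        ordered = [e for e in ordered
                   if (e["created_at"], e["key"]) >= cursor]
    window = ordered[:page_size + 1]  # fetch n+1 records
    page = window[:page_size]
    if len(window) > page_size:
        extra = window[page_size]  # the n+1th record starts the next page
        next_cursor = (extra["created_at"], extra["key"])
    else:
        next_cursor = None
    return page, next_cursor


def iterate_all(entities, page_size):
    """Yield every entity by walking the pages one after another."""
    cursor = None
    while True:
        page, cursor = fetch_page(entities, page_size, cursor)
        for entity in page:
            yield entity
        if cursor is None:
            break
```

With a page size of 1000 this walks past the 1000-offset limit without ever asking for an offset, at the cost of carrying the cursor between requests.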

I absolutely, completely, emphatically agree with you that shoring up
the deficiencies should be prioritized over adding new languages.
Hopefully the development team agrees and isn't swayed by the relative
number of stars in the App Engine issues. (Can I remove my star from
the Ruby request?)
-Bill

davew

Oct 3, 2008, 4:27:16 PM
to Google App Engine
Hi Aral,

Your "25% ready for primetime" certainly depends on what criteria you
are basing your decisions on. I personally disagree, but my needs are
possibly very different. My number one requirement is scalability. For
that I give it two thumbs up, and for everything else I'm willing to
adapt to achieve that one goal.

If I wanted to put a 2MB file on App Engine and couldn't, I certainly
wouldn't knock App Engine 25% for it. (Although I think I can. Simply
by splitting the file and concatenating the blobs as I write the
response.)
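
A minimal sketch of that split-and-concatenate idea, in plain Python rather than datastore code. `CHUNK_SIZE`, `split_into_chunks` and `write_response` are hypothetical names, and the chunk size is only assumed to sit safely under the 1 MB limit; each chunk would be stored as its own blob.

```python
CHUNK_SIZE = 1000 * 1000  # assumption: comfortably below the 1 MB limit


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut a byte string into pieces of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def write_response(chunks, out):
    """Write the chunks back to back, so the client sees one file."""
    for chunk in chunks:
        out.write(chunk)
```

Here `out` is any file-like object with a `write` method, such as the response stream of a request handler.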

To me it also sounds like you're knocking App Engine on some points
without profiling live code with real world usage.

You say: "In my tests, I've found that the high CPU limit can be
randomly triggered even in calls that return within a second. To tell
you the truth, I don't actually know what causes these."

If you don't know what causes the problems, then find out? Is it in
your code? Is it a datastore put timeout? Profile it?

Also, make sure you run tests that are accurate representations of
real world usage patterns and curves. Typically the guard mechanisms
kick in when devs run tests that jump from 0 calls to 1000 calls per
second, without any kind of ramp up. That's not a good representation
of real world usage.

In terms of the offsets, etc. Personally it's not a big deal for me.
Once I had a better understanding of the architecture it became more
obvious why certain limitations are in place. Those limitations are
the reason why this platform can scale so beautifully.

-Dave

Josh Heitzman

Oct 4, 2008, 2:06:28 AM
to Google App Engine
From near the bottom of
http://codecrafter.wordpress.com/2008/09/16/light-web-strategy-game-data-model-smacks-into-gae-limitations/
:

After some thought I decided my requirements for an enhanced GAE data
model layer were:

1. Transactions spanning entity groups
2. [redacted - more of a nice to have]
3. Schema version management

And from earlier tonight
http://codecrafter.wordpress.com/2008/10/03/google-app-engine-scalability-that-doesnt-just-work/

* [redacted - not really broken if other issues are fixed]
* [redacted - not really broken if other issues are fixed]
* [redacted - not really broken if other issues are fixed]
* GAE cannot tell you how many total entities there are of a
specific type and it can’t count more than 1000 entities.
* GAE limits the entities that can be processed in a single
transaction to those in the same entity group and only one request
processing instance can write to an entire entity group at a time.
* [redacted - not really broken if other issues are fixed]
* [redacted - not really broken if other issues are fixed]
* GAE’s version of Python does not include marshal or cPickle,
only the slow pickle

Of these issues, the two I'm finding most annoying are that
transactions can't span entity groups (combined with needing to keep
entity groups small to avoid put contention so the app will actually
scale) and the lack of cPickle (or something more performant than
pickle) so I could efficiently serialize collections of small data
structures associated with another entity for which I have need to
query.

Josh Heitzman

On Oct 3, 11:52 am, Aral Balkan <aralbal...@gmail.com> wrote:

David Symonds

Oct 4, 2008, 2:25:41 AM
to google-a...@googlegroups.com
On Sat, Oct 4, 2008 at 4:06 PM, Josh Heitzman <JoshHe...@hotmail.com> wrote:

> * GAE cannot tell you how many total entities there are of a
> specific type and it can't count more than 1000 entities.
> * GAE limits the entities that can be processed in a single
> transaction to those in the same entity group and only one request
> processing instance can write to an entire entity group at a time.

I fail to see how either of these could be implemented in a truly
scalable and distributed system such as App Engine. The data is
potentially spread over thousands of machines, so any kind of global
co-ordination (needed for both entity counts and transactions) would
require a single machine to be "in charge", and hence the scaling
would be limited to that machine's capacity.

If you care about counting the number of a particular kind or all
kinds, simply increment a sharded counter each time you add an entity.
That would scale very well, and give you quick access to totals.
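
For readers who have not met the term, a sharded counter can be sketched as below. This is an in-memory illustration only: the `ShardedCounter` class and its shard count are made up here, and in the datastore each shard would be a separate entity so that concurrent increments rarely contend on the same one.

```python
import random


class ShardedCounter:
    """Spread increments over several shards; the total is their sum."""

    def __init__(self, num_shards=20):
        # Each slot stands in for one counter-shard entity.
        self.shards = [0] * num_shards

    def increment(self):
        # Pick a shard at random so writers seldom hit the same one.
        index = random.randrange(len(self.shards))
        self.shards[index] += 1

    def total(self):
        # Reading means fetching every shard and adding them up.
        return sum(self.shards)
```

The trade-off is that writes get cheaper as shards are added, while a read has to touch every shard.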

Dave.

Josh Heitzman

Oct 4, 2008, 4:00:06 AM
to Google App Engine
If sharded entity counters can be implemented at the application level
then they can be implemented at the data store level as well. Sharded
counters implemented at the application level in Python code have
scalability limits (see referenced blog post for details) that are
lower than they would be if they were implemented directly by the data
store (i.e. this is a common enough problem to be pushed down into the
platform).

As far as transactions go, the global coordination is already there at
the entity group level (if it wasn't, then there wouldn't be any
transactions at all). Sure it will be more costly at the entity level
than at the group level, but it will still be less costly than
layering the equivalent on top of the data store API in the Python
application code. Allowing transactions spanning entity groups
doesn't mean that entity groups and non-spanning transactions need be
eliminated. Leave it to the application developer to choose the
appropriate transaction type as needed.

Josh

On Oct 3, 11:25 pm, "David Symonds" <dsymo...@gmail.com> wrote:

Sylvain

Oct 4, 2008, 4:30:25 AM
to Google App Engine
Aral,

Great article. I think it summarizes GAE and its limitations very well.

I think you can add other issues:
- GQL limitations, particularly these:
* Inequality Filters Are Allowed On One Property Only
* Properties In Inequality Filters Must Be Sorted Before Other Sort
Orders
* missing keywords like 'distinct', 'average', ... so you have to
spend time recreating these basic features, and during this time you
don't focus on your app.
All these features recreated in Python (sort(), ...) will consume CPU
(-> CPU quota)

- Datastore Timeout/Exception:
Each day, I get about 10 timeouts on a simple put() and it seems that
these timeouts will continue after the release.

Regards

@davew: "If you don't know what causes the problems, then find out?
Is it in your code? Is it a datastore put timeout? Profile it?"
I've spent a lot of time trying to understand it too. And I realized
that you can raise this Quota Warning in 2 lines:
myModel.all(), myModel.fetch(100). So here, there is nothing to
optimize or to profile, particularly when it is random.

Aral Balkan

Oct 4, 2008, 5:40:00 AM
to Google App Engine
Hi David,

The point of my post is that the "scalable and distributed system"
should be just one mode of operation of Google App Engine as it
ignores the other 50% use case which is admin and long-running
processes.

Those do _not_ have to scale.

But they are essential.

Aral

> I fail to see how either of these could be implemented in a truly
> scalable and distributed system such as App Engine. The data is
> potentially spread over thousands of machines, so any kind of global
> co-ordination (needed for both entity counts and transactions) would
> require a single machine to be "in charge", and hence the scaling
> would be limited to that machine's capacity.
<snip>

Aral Balkan

Oct 4, 2008, 5:42:26 AM
to Google App Engine
@davew:

As Sylvain points out, below, the simplest things can cause this. I've
had straight-to-template calls cause high CPU errors and, to tell you
the truth, even if I could meaningfully profile Django templates, I
don't think there's much I could do about it. Especially, as Sylvain
states, it appears to be random.

Aral

David Symonds

Oct 4, 2008, 5:53:38 AM
to google-a...@googlegroups.com
On Sat, Oct 4, 2008 at 6:00 PM, Josh Heitzman <JoshHe...@hotmail.com> wrote:

> As far as transactions go, the global coordination is already there at
> the entity group level (if it wasn't, then there wouldn't be any
> transactions at all). Sure it will be more costly at the entity level
> than at the group level, but it will still be less costly than
> layering the equivalent on top of the data store API in the Python
> application code. Allowing transactions spanning entity groups
> doesn't mean that entity groups and non-spanning transactions need be
> eliminated. Leave it to the application developer to choose the
> appropriate transaction type as needed.

You're right, application-wide transactions don't have to subsume
entity-group transactions; you could, after all, put every entity in
your application in a single entity group. They are, however, still
inherently unscalable. Entity groups mean that global coordination is
*not* required, only that there is a "home" for each entity group. As
long as the entity groups are small, this responsibility can be split
across machines quite easily.


Dave.

Josh Heitzman

Oct 4, 2008, 1:56:10 PM
to Google App Engine
If the entity groups have a (single) home and there is not any global
coordination for transactions, that means that there is a single
machine in charge of that entity group. Allowing cross entity group
transactions does not require that a single machine be in charge of
all entities; it just means that the request processing process will
need to coordinate with the machines in charge of each entity or that
the machines in charge of each entity would need to coordinate
with each other as peers. That said, I'm doubtful that entity groups
have a single home as they would need to be mirrored around to
actually scale, so it seems likely to me there is already coordination
between the nodes mirroring the same entity group.

On Oct 4, 2:53 am, "David Symonds" <dsymo...@gmail.com> wrote:

Josh Heitzman

Oct 4, 2008, 2:00:34 PM
to Google App Engine
This is a good point. This isn't something I care about for the web
app I'm writing at the moment, but there are other apps I've thought
about doing that I definitely would not be willing to upload
non-obfuscated source code for, and I was figuring I'd just
have to host those elsewhere.

On Oct 4, 2:48 am, Davide Rognoni <davide.rogn...@gmail.com> wrote:
> The business is the main issue.
>
> Search the word "obfuscate" http://evolvingtrends.wordpress.com/2008/07/23/google-app-engine-thre...

Nash

Oct 4, 2008, 6:20:24 PM
to Google App Engine
Great article Aral,

My strong suggestion at this stage to anyone considering GAE for
production, business use: DO NOT USE GAE.

GAE has significant flaws; these are basic flaws and the time spent
writing a work-around to these problems is far too great for very
short internet times.

Let me add some basic very necessary items (I am going to do a blog
entry as well on these issues)

1. No Bulk Delete/Update: If you ever create a one-to-many or
many-to-many relation, you will inevitably come to a point when you
have to remove an object. In doing so, if that object was being
referred to by 400 objects, you have to read, change and write those
objects back. GAE does not allow you to change that many objects. A
user might leave your service, you would want to remove all their data
or mark it as unavailable. A large tweet comes in, 20,000 followers
need to be updated. A group gets deleted, all the contacts referring
to that group need to be updated.

2. Random Datastore timeouts: The most annoying issue is that a single
object read/writeback can sometimes be <1000mc or can be more than
5000mc. This random behavior cripples the app and makes it impossible
to optimize

3. The Google Quota Bad: if your app exceeds its short-term quota
the punishment is very excessive: a 24 hour ban. This is the worst
possible action you can take on a dot com: You take it offline.

4. No sorting: When using lists, inequalities etc you can't sort on
multiple properties. You just can't.

5. Limited Datastore functionality and very poor workarounds: Want to
use OR? Sorry, you can't. Want to simulate OR in memory? Sorry, your
process will be killed either because of high quota or long response
time. Even if you get your app to do it in memory, it is a ticking
timebomb, it will explode when more users come in. Very unscalable in
that regard. Want to use two inequalities? Sorry.

6. Magic Exploding Indexes: Yes, Lists are a great concept but they
can cut you in half. You cannot mix multiple lists in a single WHERE
clause, or suffer explosion. You cannot create too many (20< if that
sounds like too many) indexes, you will have them explode. Lists are a
double-edged sword which cut you a lot more than help you.

7. In 2008, GAE keeps on making you reinvent the wheel: As a
web application/startup, the most important thing is feature velocity.
How fast can you deliver features? With GAE, some very common
functionality has to be reinvented over and over. To the point where
it consumes so much time that the cost-time benefits are completely
lost.

8. No HTTPS. Toy apps aside (apologies to Wordle and BuddyPoke), if
Google wants serious applications it NEEDS to add HTTPS support. In
this day and age of trust building, from colored address bars to peace
of mind, you cannot leave this feature out.

9. Dev Server is broken. The local test server doesn't work on half
our development systems. It's broken. Its results do not reflect the
behavior of GAE itself. It won't do simple things like load static
files.

10. No support: Of course, this is a preview; if you get in a mess and
need a Googler's attention, it's up to their discretion and leisure
time whether they respond to you. Nothing is binding at this time;
you're not paying them anything yet.

11. GAE Admin is NO replacement for the Django Admin. But that's how
it is portrayed in GAE's documentation. You think that Google's
Admin is a replacement for Django's admin. Boy, are you wrong. The GAE
admin is a very limited app; on the dev server it will keep throwing
errors at you. The app looks more like a backend for programmers,
whereas the whole point of the Django admin was to allow "admin USERS"
to access it and make changes to the data. In its current form, it
can't be used by non-tech-savvy users.

12. Very slow GAE upgrades: The GAE team is very slow at introducing
changes to App Engine. For something that's targeted for release by the
end of the year, this is not at all the pace required.

13. No roadmap shared: We'd all shut up on the features if Google said
"we're working on it, it'll be out"; Google won't even say it's
working on it or that there is work being done.
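
For reference, the in-memory OR workaround from point 5 usually amounts to running one query per branch and merging the results by key. The sketch below is plain Python over a list of dicts; `query` and `query_or` are hypothetical helpers, not datastore calls. It also shows why the cost grows with the data: every branch is fetched in full before merging.

```python
def query(entities, prop, value):
    """Stand-in for a single equality-filtered datastore query."""
    return [e for e in entities if e.get(prop) == value]


def query_or(entities, conditions):
    """Emulate `a = x OR b = y` with one query per condition.

    `conditions` is a list of (property, value) pairs. Results are merged
    in query order and de-duplicated on the entity's 'key'.
    """
    seen = set()
    merged = []
    for prop, value in conditions:
        for entity in query(entities, prop, value):
            if entity["key"] not in seen:
                seen.add(entity["key"])
                merged.append(entity)
    return merged
```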

I'm a big fan of Django and when Google announced a scalable web
application framework with Django, I was thrilled! But I have been
very disappointed. This "preview" is not up to the level of Google's
"previews". Google has shown us that it has high quality standards so
much so that we trust it with our private email, docs etc whereas all
these services are in Beta.

My software shop had a team of 6 GAE developers, but until GAE can get
its act together, we're pulling away from it. The time and money
wasted on getting simple things to work is atrocious and the light at
the end of the tunnel is just way too far away.

WS
-Nash

Jorge Vargas

Oct 4, 2008, 11:37:43 PM
to google-a...@googlegroups.com
You clearly don't get most of the stuff you are talking about.

ALL the quotas are there for you to be efficient; that is their whole
purpose. They are not hard coded to make your life bad, they are in
there to make the system better.
I'll point you to Guido's latest talk on the issues you suggest
http://www.youtube.com/watch?v=CmyFcChTc4M&feature

The "admin" complaint seems like a Django junkie missing it, rather
than a real complaint; if you want an admin, go write it.

The only place where I agree with you is the lack of long running
processes, although the way you put it seems to be more like
longer-than-now, not really long running.

Aral Balkan

Oct 5, 2008, 6:47:28 AM
to Google App Engine
Hi Jorge,

> You clearly don't get most of the stuff you are talking about.
>
> ALL the quotas are there for you to be efficient; that is their whole
> purpose. They are not hard coded to make your life bad, they are in
> there to make the system better.

I feel that you have missed the point of my post. Rest assured that I
get the stuff I'm talking about.

Please re-read the post and you will see that it is not an attempt to
bash Google App Engine. Quite the contrary, it is a simple summary of
the issues encountered by one developer while developing a real-world
application. The reason for the post is to highlight these issues so
that they can be fixed and thus, ultimately, to improve Google App
Engine.

As you can see from the replies here, and on the post, I am not the
only one who has encountered these issues either.

We provide constructive criticism for the things that we care about so
that they can evolve. I invite you to add value to the conversation
and help make Google App Engine better.

Aral

Aral Balkan

Oct 5, 2008, 6:53:18 AM
to Google App Engine
Hi Nash,

Thank you for the comprehensive list and for going into further depth
on some of the points that I simply glossed over (I'm looking forward
to seeing your blog post - please let us know when it's up).

Number 8 on your list, "No HTTPS", is actually a core issue that I
forgot to mention in my post. It's been a while since I encountered
the issue, researched the alternatives, and found that PayPal is
currently the only way to implement any sort of ecommerce on Google
App Engine where receiving purchase callbacks is essential
(ironically, Google Checkout requires SSL for its callback API).

Aral

Jonathan Feinberg

Oct 5, 2008, 8:49:22 AM
to Google App Engine


On Oct 4, 6:20 pm, Nash <nasrul...@gmail.com> wrote:
> Great article Aral,
>
> My strong suggestion at this stage to anyone considering GAE for a
> production, business use DO NOT USE GAE.

I just replied to this same post in a different thread, not realizing
it had been posted to this one as well.

http://groups.google.com/group/google-appengine/msg/ced273b3bc1a0fa0

Bill

Oct 5, 2008, 1:45:54 PM
to Google App Engine
I just revisited this thread and noticed someone (probably one
individual) gave "1" ratings to every person responding in the
thread. Is there a way within Google groups to remove these kinds of
individuals? A HackerNews thread discussed destructive people, but
this is my first obvious exposure to how it can corrupt the ratings
system for a community. Aral does not deserve to have a low rating
with the information he's provided so I've counterbalanced with a
"5". Something to remember when implementing a rating system for your
app.

Alex Epshteyn

Oct 6, 2008, 1:23:59 AM
to Google App Engine
Datastore timeouts, which some users on this thread mentioned, are the
most disturbing issue for me at this point. I'm getting timeouts on
3-5% of all write requests.

This is especially bad if you're using GAE as a back-end data layer
that your front-end server talks to through a REST API. Each timeout
ties up a thread on your front-end server for 5 seconds, which is kind
of scary, since timeouts on GAE seem to happen in batches - I often
get 5 timeouts in a row one minute and then 5 more in a row 10-20
minutes later, etc., and this continues 24/7.


On Oct 4, 6:20 pm, Nash <nasrul...@gmail.com> wrote:

Jorge Vargas

Oct 6, 2008, 8:13:54 AM
to google-a...@googlegroups.com
Did you watch the video I posted? They are aware of everything you
propose, and half of the issues are non-fixable, especially because
they are impositions made so the system won't crumble with bad code;
the other half are being worked on and, as Guido suggested, will
probably be out before the end of the year.


bradj...@gmail.com

Oct 6, 2008, 11:20:26 AM
to Google App Engine
Jorge,

One thing you have to remember is that it is not what Guido or the
engineers want. If Google App Engine is to succeed it is what the customers
want. If it is designed as you have stated it will never recoup what
Google has spent so far let alone down the road. Google App Engine has
so many many limitations. Regardless if the limitations are by design
or not it is virtually unusable by 99% of all developers. Can Google
make a business off the remaining 1%?

I think Google should pay much more attention to what the posters here
are asking for and less on what they have "designed" App Engine for.
Failure to listen to your customers is Always a Mistake! Defend it if
you like. App Engine is going to be a huge money pit for Google.
Engineers are terrible Businessmen and even worse Marketers; Guido said
as much in his talk.

In the end Google is a publicly owned corporation and not a University
and ultimately Google must answer to its customers and stockholders.
The goal is to increase shareholder value through offering products
and services the broad market demands and not basic research. App
Engine in its current form seems more like someone's thesis project
than a marketable computing platform.

On Oct 6, 5:13 am, "Jorge Vargas" <jorge.var...@gmail.com> wrote:

Jonathan Feinberg

Oct 6, 2008, 12:10:17 PM
to Google App Engine
On Oct 6, 11:20 am, "bradjyo...@gmail.com" <bradjyo...@gmail.com>
wrote:
> Regardless if the limitations are by design
> or not it is virtually unusable by 99% of all developers.

From which hat are you pulling that number? Source?

Wooble

Oct 6, 2008, 1:35:17 PM
to Google App Engine


On Oct 6, 11:20 am, "bradjyo...@gmail.com" <bradjyo...@gmail.com>
wrote:
> In the end Google is a publicly owned corporation and not a University
> and ultimately Google must answer to its customers and stockholders.
> The goal is to increase shareholder value through offering products
> and services the broad market demands and not basic research.

The goal is whatever the people actually running the company want it
to be. If the stockholders don't like it they'll try to replace
them.

I'm sure Porsche could increase shareholder value by introducing a
$15,000 subcompact, too.

Ross Ridge

Oct 6, 2008, 10:54:25 PM
to Google App Engine
bradj...@gmail.com wrote:
> One thing you have to remember is that it is not what Guido or the
> engineers want. If Google App Engine is to succeed it is what the customers
> want. If it is designed as you have stated it will never recoup what
> Google has spent so far let alone down the road. Google App Engine has
> so many many limitations. Regardless if the limitations are by design
> or not it is virtually unusable by 99% of all developers. Can Google
> make a business off the remaining 1%?

The question of whether Google can turn Google App Engine into a
profitable business doesn't depend on what percentage of developers
find it useful, but on whether Google can exploit a competitive
advantage. Google could've started up a traditional web hosting service
using popular SQL databases and other technologies and created
something that would have had a much broader appeal. Anyone could.
That's the problem. Google might be able to grab market share, but
without anything to distinguish themselves from their competitors, at
best they would only get a marginal return on their investment.

We can only speculate on what Google's business plan for GAE is, but it
seems pretty obvious to me that leveraging Google's own internal
technologies is at the heart of it. A number of limitations and
problems with GAE stem from technologies like Big Table, Google
Frontend and Google Apps. Another part of their plan appears to be
keeping support costs low, so you're not given much rope to hang
yourself (or others). If, in the long term, Google can't make a
business following this plan, if it doesn't give them enough of a
competitive advantage, then there's probably no way they can make the
kind of profits from a hosting service that Google's investors expect.

(While it's not terribly relevant to this discussion, I suspect Google
has some other goals for GAE that don't deal directly with its
viability as a business. One is to educate programmers in the Google
way of doing things. I'm sure Google has been frustrated with tons of
amazing job applicants with advanced degrees, 10+ years of WWW
experience, and the inability to work with anything but PHP and SQL.
Another is that they want to make it even easier for people to create
WWW sites, the sort of small little sites from which, through
AdWords/AdSense, Google has made billions.)

Ultimately, what matters is what you want and what Google is willing
to give you. It doesn't matter what 99% of developers want. There are a
number of problems and limitations with GAE that will be fixed. You can
look at the issue database to get an idea of what these are. However,
there are no timelines, so don't plan on anything being fixed tomorrow
or even a year from now. Many limitations will always be there. You're
never going to get all the functionality of an SQL database, nor will
GAE be suitable for computationally intensive tasks.

Look at GAE, and see if it offers what you want as it is now. If
it's close but not quite there maybe play around with it, maybe go so
far as making a proof of concept of something. On the other hand, if
GAE is far away from what you want, then walk away. GAE isn't for
you, and probably won't ever be. Maybe check back in a year or so,
but now you should be looking for another hosting solution.

Ross Ridge

Jorge Vargas

Oct 7, 2008, 8:47:14 AM
to google-a...@googlegroups.com
On Mon, Oct 6, 2008 at 9:20 AM, bradj...@gmail.com
<bradj...@gmail.com> wrote:
>
> Jorge,
>
> One thing you have to remember is that it is not what Guido or the
> engineers want. If Google App Engine is to succeed it is what the customers
> want. If it is designed as you have stated it will never recoup what
> Google has spent so far let alone down the road. Google App Engine has
> so many many limitations. Regardless if the limitations are by design
> or not it is virtually unusable by 99% of all developers. Can Google
> make a business off the remaining 1%?
>
Leaving aside your exaggeration on the numbers, you got it wrong. To
paraphrase the talk, BigTable, and by extension GAE's datastore, is
able to expand to infinite storage; the only possible limitation is
the number of machines it is configured to use and that can always be
expanded.
Which means these are not design limitations, they are resource
optimizations. The restrictions are in place so you are encouraged to
use resources efficiently. And they will be set higher in cases where
needed. An example is the "top GAE applications"; they had their quotas
raised.

Let's look at it from a performance perspective.

1- 1 MB data structure - which unit of data (leaving files outside)
should be bigger than 1 MB? IMO that's a badly designed datastore.
2- 1000 query limit, which user is going to want 1000 results?
3- Short CPU, it is common knowledge that a user will go away from a
page after 3 seconds of loading. So in order to eliminate this
bottleneck you use caching; after all, if it's intensive to compute
it's worth caching.
4- Quotas in general, not even Google has enough machines for us to waste.
5- Admin, a Django junkie complaining about the lack of UI
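
To make point 3 concrete, here is a minimal sketch of caching an expensive result for a fixed time. It is plain Python with a dictionary, not any App Engine API; the `TTLCache` class, its `get_or_compute` method and the injectable clock are assumptions made up for illustration.

```python
import time


class TTLCache:
    """Keep computed values for ttl_seconds, then recompute on demand."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be exercised in tests
        self._store = {}    # key -> (expiry_time, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the expensive work
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value
```

A request handler would wrap the slow part, for example `cache.get_or_compute("front_page", render_front_page)`, where `render_front_page` is whatever function does the expensive work.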

The only concern where I agree is file upload; we do need a facility
to upload videos or PDFs or images or whatever we want, but that is
being worked out, same with SSL.

Funny that the OP didn't mention SSL as that IS a showstopper for a
LOT of applications
http://code.google.com/p/googleappengine/issues/detail?id=15

Greg

Oct 7, 2008, 7:16:15 PM
to Google App Engine
Davew said it way back at the top - appengine's killer feature is
scalability. That is what sets it apart from the other cloud systems
out there, and it is also the root cause of most complaints (except
the quotas, which will disappear when you get to pay for the service).

For the application I'm working on, I'm happy to trade off lack of a
relational database for the future gain of scalability. My guess is
that most of you haven't had the nightmare of an application that
suddenly became popular, and you had to become an expert at database
replication, load balancing and multi-system maintenance overnight.
It's a very stressful situation.

So my advice is that if you don't need scalability, get a normal
hosting account or EC2. Then you can have PHP, Ruby, MySQL, cron jobs,
anything you want - problem solved. Oh, yes, you are going to have to
shell out a few bucks a month.

But if you do need scalability, then appengine is a godsend. The
limitations are there to make it safe and scalable, not because Google
wants to annoy you. You spend a little more time now working around
the limitations, and save endless time later managing systems and
capacity.

And lastly, I believe that many of the complaints come from people
just wanting a free hosting service, and not finding what they are
used to. It would be a crying shame if Google listened to these people
and turned appengine into a vanilla PHP/MySQL hosting service.
Appengine is so much more...

On Oct 7, 3:54 pm, Ross Ridge <rri...@csclub.uwaterloo.ca> wrote:

Josh Heitzman

Oct 7, 2008, 11:32:27 PM
to Google App Engine
Assuming you can actually work around all of the limitations
simultaneously, considering the more things you work around the closer
you come to the CPU time quota.

It also is not completely accurate to state that the limitations are
there to make apps safe and scalable. For example, transactions being
limited to a single entity group makes it very complicated (i.e. not
safe) to write code that needs to reliably update entities from
different groups (it isn't always possible to structure entity groups
such that everything that needs to be updated for a request can all
be in one entity group).
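
A minimal sketch of that hazard, in plain Python rather than the App
Engine API. The Group class and the helper names are hypothetical, made
up for illustration; the only assumption carried over from the datastore
is that a transaction is atomic within one group and nothing spans two.

```python
# Plain-Python illustration; no App Engine APIs are used.
# Group, run_in_group_transaction and transfer are hypothetical names.
import copy


class Group:
    """Stands in for one entity group: a dict of entity name -> value."""

    def __init__(self, **entities):
        self.entities = dict(entities)


def run_in_group_transaction(group, fn):
    """Apply fn to a copy of one group's data; commit only if fn succeeds."""
    working = copy.deepcopy(group.entities)
    fn(working)                 # may raise, in which case nothing is committed
    group.entities = working    # all-or-nothing, but only for this one group


def transfer(src, dst, amount, fail_between=False):
    """Move `amount` between two groups using two separate transactions."""
    def debit(data):
        data["balance"] -= amount

    def credit(data):
        data["balance"] += amount

    run_in_group_transaction(src, debit)
    if fail_between:
        # Simulates a timeout or crash after the first commit.
        raise RuntimeError("request died between the two transactions")
    run_in_group_transaction(dst, credit)


alice = Group(balance=100)
bob = Group(balance=0)
try:
    transfer(alice, bob, 30, fail_between=True)
except RuntimeError:
    pass

# The debit committed but the credit never ran, so the two balances
# no longer add up to the original 100.
total = alice.entities["balance"] + bob.entities["balance"]
print(total)
```

Each group stays internally consistent, yet the pair does not, which is
why code that spans groups needs its own retry or repair logic.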

Also, given the roundness of the quotas, I find it very unlikely that
the quota numbers were chosen based on an in-depth analysis or a
broad sampling of data, rather than being chosen fairly arbitrarily
(possibly based to some extent on what works for Google's own apps).

Ian Lewis

Oct 8, 2008, 1:24:24 AM10/8/08
to google-a...@googlegroups.com
The documentation says that an entity group could be as large as a single user's data. It seems to me that, to avoid the problem you are describing, it would be a good idea to structure your entity groups that way. I have a feeling that the number of cases where you would need to update more than one user's data in a single transaction would be limited.

2008/10/8 Josh Heitzman <JoshHe...@hotmail.com>

Josh Heitzman

Oct 8, 2008, 1:44:13 AM10/8/08
to Google App Engine
Not all applications' data can be modeled such that each piece of
data belongs to one and only one user.

On Oct 7, 10:24 pm, "Ian Lewis" <ianmle...@gmail.com> wrote:
> The documentation says that an entity group could be as large as a single
> user's data. It seems to me that, to avoid the problem you are describing,
> it would be a good idea to structure your entity groups that way. I have a
> feeling that the number of cases where you would need to update more than
> one user's data in a single transaction would be limited.
>
> 2008/10/8 Josh Heitzman <JoshHeitz...@hotmail.com>

Filip

Oct 9, 2008, 8:09:33 AM10/9/08
to Google App Engine

> Let's look at it from a performance perspective.
>
> 1- 1MB datastructure - which unit of data (leaving files outside)
> should be bigger than 1mb? IMO that's a badly designed datastore.
> 2- 1000 query limit, which user is going to want 1000 results?
> 3- Short CPU, it is common knowledge that a user will go away from a
> page after 3 seconds of loading. so in order to eliminate this
> bottleneck you use caching, after all if it's intensive to compute
> it's worth caching.
> 4- Quotas in general, not even google has enough machines for us to waste.
> 5- Admin, a django junkie complaining about the lack of UI

Let's stop the slogans, and get down to a real discussion here.
Discussions are useful. This post misrepresents the original article
needlessly and this does not add to the discussion at all.

Use-cases for 1 MB were given, and there are plenty more. It is not
sufficient that you can use S3, because the over 1 MB file may well be
autogenerated (like a downloadable PDF or Excel or XML file).

The query limit was given specifically in combination with the lack of
expressive power to select the records. Nobody wants to return 10.000
records to the browser, but you have to be able to get the 50 records
you do want to return. True, some applications know upfront what the
exact key will be, but some applications need more dynamic querying.
Also, the limit is hurtful because currently MapReduce can only be
implemented as a series of successive calls.
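
To make "a series of successive calls" concrete, here is a minimal
sketch in plain Python. It does not use the App Engine API: fetch only
imitates a query that is capped at a fixed number of results, and
FETCH_CAP, RECORDS, fetch and iter_all are all hypothetical names. The
assumption is that records can be sorted by a unique key, so each call
resumes after the last key seen instead of using an offset.

```python
# Plain-Python sketch; fetch imitates a query capped at `limit` results.
# FETCH_CAP, RECORDS, fetch and iter_all are hypothetical names.
FETCH_CAP = 1000

# Pretend datastore: 2,500 records already sorted by a unique key.
RECORDS = [{"key": i, "value": i * i} for i in range(2500)]


def fetch(after_key=None, limit=FETCH_CAP):
    """Return up to `limit` records with key > after_key, in key order."""
    limit = min(limit, FETCH_CAP)   # the cap cannot be exceeded
    matching = [r for r in RECORDS if after_key is None or r["key"] > after_key]
    return matching[:limit]


def iter_all(batch_size=FETCH_CAP):
    """Yield every record by issuing successive capped fetches."""
    last_key = None
    while True:
        batch = fetch(after_key=last_key, limit=batch_size)
        if not batch:
            return
        for record in batch:
            yield record
        last_key = batch[-1]["key"]   # resume point for the next call


# Walks all 2,500 records even though no single fetch returns over 1,000.
seen = sum(1 for _ in iter_all())
```

In a real app each batch would probably have to run in its own request
because of the per-request time limit, which is exactly why this gets
awkward for anything MapReduce-like.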

Likewise, the CPU was not discussed in the context of rendering a
standard user page but in the context of background processing and/or
report building.

Quotas were discussed with a lot of understanding, but also a lot of
nuance. How do you actually respond to the remarks made? Let's get
down to real applications.

Admin again was not about being able to use Django, but about how to
do data transformation on your database. Do you think Google would be
able to rebuild its index, or do any other part of its magic, without
MapReduce? Admin is about the ability to prepare things before the
user needs them, so that yes you can respond in subsecond turnarounds.

> The only concern where I agree is file upload, we do need a facility
> to upload videos or pdf or images or whatever we want, but that is
> being worked out, same with SSL.

Look, Google has declined to provide forward looking data. Yes, I was
at Google IO and they ARE good guys and I believe them when they
express their best intentions to meet specific targets (back then, at
the beginning of the year). But not providing an official calendar
means they are not putting their foot down, essentially that they
don't know themselves. Google is asking us to use what is there, and
only what is there (not that I wouldn't like a calendar, and that
would reframe this discussion completely - but it simply isn't there).
And there is little tangible evidence that warrants the faith that they
will in fact have a big bang improvement within what has now become a
really very short time frame. So my guess is that we'll see
substantial delays relative to what was said back at Google IO.

And I like Google App Engine very much. Really. And yes, I do build
real applications on top of it. And I do believe that many of their
limitations stem from the need to scale.

I feel that still leaves plenty of room for discussion, and especially
for this blog post that makes a lot of very good arguments. Plus the
author clearly uses and understands GAE, uses Python, and isn't
complaining about the designers' philosophy.

The good news is that by the end of *2009*, the world might be really
interesting with GAE, and with competing platforms driving each other's
features.

Filip

Jorge Vargas

Oct 9, 2008, 3:36:19 PM10/9/08
to google-a...@googlegroups.com
On Thu, Oct 9, 2008 at 6:09 AM, Filip <filip.v...@gmail.com> wrote:
>
>
>> Let's look at it from a performance perspective.
>>
>> 1- 1MB datastructure - which unit of data (leaving files outside)
>> should be bigger than 1mb? IMO that's a badly designed datastore.
>> 2- 1000 query limit, which user is going to want 1000 results?
>> 3- Short CPU, it is common knowledge that a user will go away from a
>> page after 3 seconds of loading. so in order to eliminate this
>> bottleneck you use caching, after all if it's intensive to compute
>> it's worth caching.
>> 4- Quotas in general, not even google has enough machines for us to waste.
>> 5- Admin, a django junkie complaining about the lack of UI
>
> Let's stop the slogans, and get down to a real discussion here.
> Discussions are useful. This post misrepresents the original article
> needlessly and this does not add to the discussion at all.
>
> Use-cases for 1 MB were given, and there are plenty more. It is not
> sufficient that you can use S3, because the over 1 MB file may well be
> autogenerated (like a downloadable PDF or Excel or XML file).
>
OK, that seems like a valid use case. I'll give you that point: when
you are generating a file on the fly, you may get stuck with the 1MB
limit. But keep in mind the "big file serving" Guido talks about still
applies to this use case.

> The query limit was given specifically in combination with the lack of
> expressive power to select the records. Nobody wants to return 10.000
> records to the browser, but you have to be able to get the 50 records
> you do want to return. True, some applications know upfront what the
> exact key will be, but some applications need more dynamic querying.
> Also, the limit is hurtful because currently MapReduce can only be
> implemented as a series of successive calls.
>

Will you post an example where you need 1000 results to then narrow
them down to 50? This seems to me like a "joins" design, which is
something you shouldn't be doing in the datastore. It has been
discussed several times that you shouldn't use the datastore as a
relational database. You may disagree if you are strong on SQL, but
denormalization is the first step of scalability.
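
As a rough illustration of what denormalization means here, a minimal
sketch in plain Python. Dicts stand in for entity kinds, and the data
and function names (authors, make_post and so on) are hypothetical; none
of this is App Engine code.

```python
# Plain-Python sketch with hypothetical data; dicts stand in for entity kinds.
authors = {1: {"name": "Ada"}, 2: {"name": "Linus"}}

# Normalized: each post stores only author_id, so listing posts with
# author names needs an extra lookup per post (a join done by hand).
posts_normalized = [
    {"title": "Hello", "author_id": 1},
    {"title": "World", "author_id": 2},
]


def list_normalized():
    return [(p["title"], authors[p["author_id"]]["name"])
            for p in posts_normalized]


# Denormalized: the author's name is copied onto the post at write time,
# so a single read returns everything the page needs.
def make_post(title, author_id):
    return {
        "title": title,
        "author_id": author_id,
        "author_name": authors[author_id]["name"],
    }


posts_denormalized = [make_post("Hello", 1), make_post("World", 2)]


def list_denormalized():
    return [(p["title"], p["author_name"]) for p in posts_denormalized]
```

Both listings return the same rows; the denormalized one just pays at
write time instead. The cost is that a renamed author means rewriting
every copy of the name.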

> Likewise, the CPU was not discussed in the context of rendering a
> standard user page but in the context of background processing and/or
> report building.
>

Those are facilities that aren't out on the engine, so you can't
criticize an interface that was built to serve pages for being a bad
background processing tool. It would be like complaining about how
bad that Italian restaurant is at making hamburgers.

> Quotas were discussed with a lot of understanding, but also a lot of
> nuance. How do you actually respond to the remarks made? Let's get
> down to real applications.
>

I'm a minimalist at heart, and I strongly believe that any app that
goes over its quota does so because it isn't efficient enough, with
the sole exception of the max page views, in which case Google has
raised and will raise that bar for the applications that demand it.
Again from the talk, the top 5 applications running on GAE have had
their quotas raised because they really need it, not because some
script kiddie is loading half his datastore into memory on every
request.

> Admin again was not about being able to use Django, but about how to
> do data transformation on your database. Do you think Google would be
> able to rebuild its index, or do any other part of its magic, without
> MapReduce? Admin is about the ability to prepare things before the
> user needs them, so that yes you can respond in subsecond turnarounds.
>

What? The Admin (in the Django sense) is a way to fix errors in the
data and sometimes to input data by advanced users. You shouldn't be
manipulating a db structure from a GUI; that is why we have migration
scripts and so many projects dedicated to them.

>> The only concern where I agree is file upload, we do need a facility
>> to upload videos or pdf or images or whatever we want, but that is
>> being worked out, same with SSL.
>
> Look, Google has declined to provide forward looking data. Yes, I was
> at Google IO and they ARE good guys and I believe them when they
> express their best intentions to meet specific targets (back then, at
> the beginning of the year). But not providing an official calendar
> means they are not putting their foot down, essentially that they
> don't know themselves. Google is asking us to use what is there, and
> only what is there (not that I wouldn't like a calendar, and that
> would reframe this discussion completely - but it simply isn't there).
> And there is little tangible evidence that warrants the faith that they
> will in fact have a big bang improvement within what has now become a
> really very short time frame. So my guess is that we'll see
> substantial delays relative to what was said back at Google IO.

Well, that's a social issue, and I wasn't there, so I don't have the
first-hand experience. But do keep in mind the same thing could be
said about any product. I'm not the person to answer this, though;
they are.

>
> And I like Google App Engine very much. Really. And yes, I do build
> real applications on top of it. And I do believe that many of their
> limitations stem from the need to scale.
>
> I feel that still leaves plenty of room for discussion, and especially
> for this blog post that makes a lot of very good arguments. Plus the
> author clearly uses and understands GAE, uses Python, and isn't
> complaining about the designers' philosophy.

Isn't this what we are doing? My point is that most of the complaints
in this group come from people who want GAE to be X, Y and Z without
any criteria, especially in the limits discussion. Most people say
"you are Google, you are huge, give me your resources for free"
instead of sitting down with their code and fixing it.

> The good news is that by the end of *2009*, the world might be really
> interesting with GAE, and with competing platforms driving each other's
> features.
>

Maybe, my crystal ball is cloudy today.

> Filip
