Bad news for GAE/Java from Google I/O

Jeff Schnitzer

May 15, 2013, 7:52:51 PM

to Google App Engine

I attended the "Autoscaling Java" session at Google I/O. In summary, the advice is:

* Don't use dependency injection.

* Don't use AOP.

* Hardcode configuration values as much as possible.

In other words, go back to Java circa 2002. There was no discussion of changing routing so that user requests don't see cold starts. I asked about this in person - apparently they're still "talking about it" and nothing has been done about it.

I am sad.

Jeff

Ibrahim Arief

May 16, 2013, 8:06:56 AM

to google-a...@googlegroups.com, je...@infohazard.org

Hi Jeff,

Was any reasoning shared behind that advice? I assume it is because those approaches increase the total latency of processing each incoming request?

I was experimenting with Guice and its AOP method interception to replace my increasingly complex ContainerRequestFilter, and I love how much simpler things are after I made the changes. It is disheartening to hear that those approaches turn out to be discouraged in App Engine apps by Google themselves.

Is there any recording of the session, by the way? Will it be uploaded at some point in the (hopefully near) future?

Cheers,
Ibrahim

de Witte

May 16, 2013, 9:22:37 AM

to google-a...@googlegroups.com, je...@infohazard.org

Aren't you a bit overrating with your subject title?

Dependency injection a la guice and spring, are frameworks which you want to avoid as much as possible.

On Thursday, May 16, 2013 1:52:51 AM UTC+2, Jeff Schnitzer wrote:

Jeff Schnitzer

May 16, 2013, 12:14:20 PM

to de Witte, Google App Engine

No, I'm not "overrating" with my subject. Your ill-educated opinion about DI is not shared by the countless Java developers who find these tools essential for building large, complicated, testable applications.

The apparent reasoning behind those points is that Google still thinks "startup time is your problem", and routes user requests to cold starts. This works in Go but it will never work in Java. The last time I made a Hello, World app with JPA it took 4-5 seconds to start up in production. Somewhere around 5 seconds is where the user thinks your app is broken and hits the reload button. If Google Search pages took 5 seconds to load for a significant percentage of users, heads would roll.

If your app actually _does something_ it's going to take more than 5s to load. Maybe you can make it 10s instead of 30s by adopting 2000s-era programming practices, but it doesn't matter because the user has already considered your app broken.

And don't get me started on the frequent "sick periods" where startup time goes up by 3X...

Jeff

jeffrey_t_b

May 16, 2013, 7:21:22 PM

to google-a...@googlegroups.com, je...@infohazard.org

Jeff, I believe that you had asked on this list, a while ago: In what circumstance is it _ever_ good for user requests to see cold starts?

Did you ever get an answer to that? That is the part that puzzles me still. Is it just too hard? Maybe they don't have a _scalable_ algorithm for directing requests to already-existing instances?

Anyway, with all of the focus on the Compute Engine side, I wonder if improvements to App Engine are going to be deprioritized.

Kristopher Giesing

May 16, 2013, 7:43:37 PM

to google-a...@googlegroups.com, je...@infohazard.org

What is AOP in this context?

On Wednesday, May 15, 2013 4:52:51 PM UTC-7, Jeff Schnitzer wrote:

Nick

May 16, 2013, 8:08:45 PM

to google-a...@googlegroups.com, je...@infohazard.org

I don't really understand the philosophy currently in play here.

Obviously when load goes up, new instances need to be spun up. That makes sense.

The idea that it's better for the last invoker to wait 5-10 seconds, rather than accept a smaller added latency of, say, 0.5 seconds by being sent to a 'fully' utilised instance, just doesn't make sense to me. This behaviour is also more punishing when you have a low number of instances, which tends to happen if you have frequent releases.

It would be good to get some insight into why this approach is favoured.

On a wider note, Jeff (or anyone else attending that panel), what in particular did they mean by DI in this context? Spring style annotation scanning, Guice itself, IOC in general (passing objects through constructors/setters) or just not using purely static access patterns?

Building large, maintainable apps without the use of IoC/DI bends my mind. It's painful enough using the static access patterns laid out by all the App Engine APIs themselves; the idea that I'd tightly couple all my own code and all the library code I consume as well seems bonkers.
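To make the distinction in the question above concrete, here is a minimal sketch of IoC in the narrowest sense: passing a dependency through a constructor, with no framework, no annotation scanning and no reflection at startup. All names (TicketStore, BoxOffice) are hypothetical and chosen only for illustration; nothing here is taken from the session or from any App Engine API.

```java
/** Hypothetical abstraction the service depends on, so a test can substitute a fake. */
interface TicketStore {
    int remaining(String eventId);
}

/** Constructor injection by hand: the caller decides which TicketStore is used. */
final class BoxOffice {
    private final TicketStore store;

    BoxOffice(TicketStore store) {
        this.store = store;
    }

    /** True when the requested quantity is positive and still available. */
    boolean canSell(String eventId, int quantity) {
        return quantity > 0 && store.remaining(eventId) >= quantity;
    }
}

final class HandWiringDemo {
    public static void main(String[] args) {
        // The "wiring" is one explicit line at startup.
        TicketStore fake = eventId -> 10; // stand-in for a real datastore lookup
        BoxOffice boxOffice = new BoxOffice(fake);
        System.out.println(boxOffice.canSell("concert", 4));  // true
        System.out.println(boxOffice.canSell("concert", 11)); // false
    }
}
```

This keeps the code testable without a container, but the wiring has to be written and maintained by hand, which is the cost being debated in this thread.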

Jeff Schnitzer

May 16, 2013, 9:40:19 PM

to Google App Engine

I haven't given up on GAE yet, although I came pretty close to it after walking out of that I/O session. I've actually spent almost all my free time over the last few weeks trying to get the next Objectify release out the door.

By DI the speaker meant Guice, Spring, CDI, etc. By AOP he meant the interceptors that Guice, Spring, etc can create for you. I've never been a fan of Spring, but my app would be unmaintainable without some sort of DI system (I use Guice). And I tend to use a fair number of interceptors, especially around JAX-RS methods. If someone had told me at the beginning "if you want to use GAE, you must hand-wire your services and copy-paste shared logic all over your code", I would have gone elsewhere.

GAE is a productive environment, but all that productivity will disappear if I have to write spaghetti code to perform in production.

The number of classes and jax-rs endpoints in my app is continuing to expand, and all the classloading is pushing up my startup time. I'm never going to have a large number of instances; as a ticketing system, people tend to come to my website when they want to buy something. High value per customer, but very bursty traffic.

The speaker had one suggestion that looks valuable - use Dagger instead of Guice - which might help, although it doesn't look like Dagger supports interceptors, so that may be out.

I've been doing a lot of consulting on other people's Heroku/Appfog/EC2/EB/etc projects (sadly, GAE consulting doesn't pay well). Honestly I think no-autoscaling is better than GAE/Java's autoscaling. Elastic Beanstalk has the best solution - it's easy to understand, easy to configure, and doesn't produce dead user requests. Also, I forgot how much I miss being able to execute ad-hoc queries or queries that update data... and hosted MongoDB is getting cheaper.

This is not to say that these other systems are better than GAE; they all have pretty serious flaws. But it's hard to get over dead user requests.

Jeff

aanaa...@gmail.com

May 16, 2013, 11:59:57 PM

to google-a...@googlegroups.com

I second this view. Unless of course one is heavily tied to and 'dependent' on a DI framework.

Jeff Schnitzer

May 17, 2013, 1:21:45 AM

to Jon Sawyer, Google App Engine

This problem has occurred in the past with GAE/Python apps of sufficient complexity, but yeah, Java has always assumed a lot of work at startup to load, link, and JIT your app. GAE assumes that your application will always start in a time that is reasonable for a user request. This is an invalid assumption for pretty much any substantial Java app.

"Selecting the right programming language" is a lot more about the complexity of your app, the makeup of your team, and the availability of libraries than it is about picking the language most optimized for GAE. Or at least it should be.

Jeff

On Thu, May 16, 2013 at 7:58 PM, Jon Sawyer <j...@jsawyer.net> wrote:

How much of this issue is endemic to Java's design, and therefore would require significant engineering on Google's part to fix? They seem to have determined the ROI isn't high enough for them to invest here.

Is it just fundamental to the way that Java must unpack jars, and JIT the bytecode, and ... ???

Isn't Python's by-line "batteries included", indicating that the standard runtime contains the kitchen sink? Is the Python environment richer out-of-the-box than Java's, so you don't need to doctor it with Guice, Objectify, Guava, etc.?

This is partly an academic question, since I suspect the only people who really know work at Google and aren't talking. But it's partly a learning exercise; I like the idea of leveraging a PaaS environment like GAE. I've been productive making use of Guice and Objectify (Thanks, Jeff!), but in the future it seems that selecting the right programming environment and language may be a key success factor in leveraging a PaaS application delivery environment if other languages start out with significant disadvantages in that environment.

Marcel Overdijk

May 17, 2013, 2:20:53 AM

to google-a...@googlegroups.com, je...@infohazard.org

This is an interesting question indeed.

I don't believe startup times for Java will become better on GAE; it's also very typical in Java land for startup to take more than 30s for medium to large apps, depending on the frameworks chosen.

This is no problem when you spin up the required number of instances up front, as on a VPS or a Cloud Foundry PaaS.

One of the great things about GAE is the programming API and the autoscaling part, but this autoscaling (and thus spinning up and killing instances) bites back unfortunately.

Couldn't Google prohibit cold requests being served to users? In many cases it would be better to accept some latency while a request is routed to a warm instance.

I think this would solve many problems. What do you think?

Too bad Google is not participating in these discussions.

But with great technologies and frameworks being delivered by Google (like AngularJS), Google cannot be taken seriously when they say just go back 10 years and use servlets, static factories, hard-coded configs etc. This is a real development nightmare.

Jon Sawyer

May 17, 2013, 1:24:05 AM

to Jeff Schnitzer, Google App Engine

Your last sentence says it all - "Or at least it should be". The reality is that we work with flawed tools, and that needs to be taken into account with everything else.

Not that Objectify has any flaws, you understand. :-)

Jon

--

Jon Sawyer

j...@jsawyer.net

Rafael

May 17, 2013, 5:14:45 AM

to google-a...@googlegroups.com, je...@infohazard.org

Hello de,

Any chance you're a python dev?

This is a phenomenal statement to make: "Dependency injection a la guice and spring, are frameworks which you want to avoid as much as possible."

In any case, I believe that coding Java like in the Stone Age is not an excuse for the lack of support and features on the App Engine platform. Developers won't buy these excuses.

If you're doing a hello world project you may not care about writing ugly and painful code. I believe that Java devs are spoiled with at least bare minimum Spring MVC.

If real Java dev isn't ever going to be supported, it should at least be clearly stated before people even try the platform and bet their business and lives on it.

Have fun :)

Rafa

Vinny P

May 17, 2013, 10:35:31 AM

to google-a...@googlegroups.com, de Witte, je...@infohazard.org

+1 to Jeff and this thread.

I was at the same I/O session, and I received the same impression of Java/GAE as Jeff. The recommendation against DI/AOP is a big issue, considering how many Java frameworks and abstractions depend upon those.

-----------------

-Vinny P

Technology & Media Advisor

Chicago, IL

My Go side project: http://invalidmail.com/

On Thursday, May 16, 2013 11:14:20 AM UTC-5, Jeff Schnitzer wrote:

Tom

May 17, 2013, 2:46:15 PM

to google-a...@googlegroups.com, de Witte, je...@infohazard.org

The recommendation to use Dagger seems to have some merit and it would seem to make some sense that Google would push towards Dagger, given its suitability for Android. I use GAE/J as a backend to Android (and based on io13 sessions, Google is rather pushing this) so things that work well on both GAE/J and Android are attractive.

On a general note, at the end of the keynote someone asked a question suggesting that Google's dependency on a language they don't control was a big problem. I hope that is not the reason behind the lack of progress on GAE/J. I realize that is a common mentality but it is a sad state of affairs and it seemed to me to go against the thrust of Page's speech.

Tom

Ryan Chazen

May 17, 2013, 4:52:22 PM

to google-a...@googlegroups.com

Very unfortunate, but it's been clear that GAE/J has been getting the short end of the stick for a while. Given the heavy costs for doing anything 'heavy' on GAE/J, and the prevalence of new hosting providers that can give you a full 8 core VM for similar prices to GAE's incredibly underpowered (and limited) instances, it seems like the right thing to do at this point is to bite the bullet and manage a few servers yourself. Often a pair of 8 core servers can do the same workload as an enormous number of GAE instances without any worry of strange App Engine related bugs and limitations.

If GAE/J actually provided the 'just works' solution that it should, this would still be in GAE's favor, but with the countless hours you need to spend trying to optimize an app to boot quickly, you could easily spend half that time setting up an automated load balancer across some cheap hosting providers.

This news that they're not even fixing these issues is the last straw here, and I'm going to begin transitioning off GAE/J.

On Thursday, May 16, 2013 1:52:51 AM UTC+2, Jeff Schnitzer wrote:

Joakim

May 17, 2013, 5:22:49 PM

to google-a...@googlegroups.com, je...@infohazard.org

It is disheartening to see so little progress being made in this area, not to mention the lack of communication as to Google's stance. I love GAE but couldn't possibly recommend anyone to run Java here until the instance load times or user-facing cold starts are fixed.
I want to say "Just snapshot/save-state the JVM!", but if it was that easy Google would obviously have solved this a long time ago. (I hope)

Also, doesn't bytecode weaving solve most of the performance degradation caused by AOP?

On Thursday, May 16, 2013 1:52:51 AM UTC+2, Jeff Schnitzer wrote:

Jeff Schnitzer

May 18, 2013, 11:02:55 AM

to Moshe Shaham, Google App Engine

DI, AOP are trivially scalable - the solution is simply "don't route requests to instances until they are ready". Every other PaaS that I am aware of follows this approach. Elastic Beanstalk supports autoscaling too.

Furthermore, this is not strictly a DI/AOP issue. Any request >5s might as well never complete; the user has already deemed your app broken and hit the reload button. Last time I tried, Hello, World with JPA takes ~4s to start. If you have a few entity classes and JAX-RS resources, you will easily hit 10s - even without Guice or Spring. The difference between 10s and 30s is immaterial from the user's perspective.

As someone who has done it both ways, the only thing more monstrous than Spring is trying to maintain a large, complicated application without DI or AOP. This is sufficiently conventional wisdom that the JavaEE group saw fit to add CDI to the official specification.

Jeff

On Sat, May 18, 2013 at 7:32 AM, Moshe Shaham <ish...@gmail.com> wrote:

I wonder, why is it hard to accept the fact that DI and AOP are not easily scalable, the same way you accept the fact that relational databases are not?
For a long time, if you wanted relational databases, that was fine, but not on GAE.
You want the magic of Spring or Guice? Great, but GAE will not provide you with shortcuts...

GAE is still useful without those (monstrous, at least spring, IMO) frameworks.

having said that, I completely agree that cold starts should not happen.

Marcel Overdijk

May 20, 2013, 6:43:53 AM

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

I think the "don't route requests to instances until they are ready" is indeed the only solution there is.

Statements like don't use DI, AOP etc. are non-arguments.

I just experimented with the AppEngine Endpoints and a simple endpoint returning a hardcoded list needed 12s cold startup time...

I assume Google would not advise against using Google Endpoints as well??

Frustrating part is also that Google is not joining this discussion...

Rafael

May 20, 2013, 1:45:38 PM

to google-appengine, Moshe Shaham, je...@infohazard.org

It seems that GAE doesn't route requests to resident instances and instead waits for a new instance to come up. In theory it shouldn't work this way; is it just my impression?

The worst comes when you use backends, where they don't even have the resident instances configuration.

Marcel Manz

May 20, 2013, 2:29:46 PM

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

Google really never should have used the term 'resident' instances.

It leads to a lot of confusion, making many developers believe that these would be the primary instances requests get routed to. But that's not how GAE works. These are instances that should have been named something like 'scale-buffer' instances or similar, and they only take requests while GAE spins up more dynamic instances.

This has been endlessly discussed and the only way forward is for GAE to announce in one of their next updates that requests no longer get routed to cold instances. Period, end of discussion and we'll all share our love for GAE much more ;-)

Nobody here really understands what is taking them so long to make this simple scheduler change, where the scheduler looks up a flag: instance up - good, route there. Instance not up - don't route there. Instances getting busy - start up more instances and let the scheduler wait to address them until they are up and running, accepting additional queue time in case all instances are fully busy serving requests (which is much better than waiting for a full cold start).
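The flag-based routing described above can be sketched in a few lines of Java. This is only an illustration of the idea as stated in the post, not a description of how the App Engine scheduler works; the Instance and WarmOnlyScheduler types, the capacity model and the pending queue are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Queue;

/** Hypothetical instance model for illustration; not an App Engine type. */
final class Instance {
    final String name;
    final int capacity; // maximum concurrent requests
    boolean warm;       // the "instance up" flag
    int inFlight;       // requests currently being served

    Instance(String name, int capacity, boolean warm) {
        this.name = name;
        this.capacity = capacity;
        this.warm = warm;
    }

    boolean hasRoom() {
        return warm && inFlight < capacity;
    }
}

/** Assumed scheduler sketch: requests only ever land on warm instances. */
final class WarmOnlyScheduler {
    private final List<Instance> instances = new ArrayList<>();
    private final Queue<String> pending = new ArrayDeque<>();
    private int startsRequested;

    void add(Instance instance) {
        instances.add(instance);
    }

    /** Returns a warm instance with spare capacity, or queues the request and asks for more capacity. */
    Optional<Instance> route(String requestId) {
        for (Instance instance : instances) {
            if (instance.hasRoom()) {
                instance.inFlight++;
                return Optional.of(instance);
            }
        }
        pending.add(requestId);
        startsRequested++; // a real scheduler would start another instance here
        return Optional.empty();
    }

    /** Called once an instance has finished loading; queued requests drain onto it. */
    void markWarm(Instance instance) {
        instance.warm = true;
        while (!pending.isEmpty() && instance.hasRoom()) {
            pending.remove();
            instance.inFlight++;
        }
    }

    int pendingCount() { return pending.size(); }
    int startsRequested() { return startsRequested; }
}

final class WarmOnlySchedulerDemo {
    public static void main(String[] args) {
        WarmOnlyScheduler scheduler = new WarmOnlyScheduler();
        Instance warm = new Instance("warm-1", 1, true);
        Instance cold = new Instance("cold-1", 1, false);
        scheduler.add(warm);
        scheduler.add(cold);

        System.out.println(scheduler.route("r1").isPresent()); // true: served by warm-1
        System.out.println(scheduler.route("r2").isPresent()); // false: queued, cold-1 is skipped
        scheduler.markWarm(cold);
        System.out.println(scheduler.pendingCount());          // 0: r2 drained onto cold-1
    }
}
```

The trade-off is the one named in the post: a queued request waits for spare warm capacity instead of paying for a full cold start itself.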

Ryan Chazen

May 21, 2013, 3:12:52 AM

to google-a...@googlegroups.com

Hi Matt,

Thanks for some official word from Google - it was starting to turn into an echo chamber on Google's own product forum. That said, the solution still isn't satisfactory. Even a single 5 second wait at the wrong time is very, very bad. If I'm paying with a credit card, and my request gets routed to a new instance that is still starting up, I'm probably going to cancel the request and not complete the transaction. If I've just been sent a link to a new webapp to check out, and it's taking 5 secs to even load, I'm certainly not going to invest in that webapp - I'll tell them to find some new hosting and get back to me when they have a stable platform. Google knows all this very well - you guys spend billions on ensuring fast response times.

The problem is very straightforward - GAE is currently priced as a premium service, but is giving very much budget-level performance and support. You need to either cut prices drastically, or you need to come to the table and provide guarantees - no requests to cold start instances, etc.

Ryan

On Tuesday, May 21, 2013 2:19:27 AM UTC+2, Matt Stephenson wrote:

First of all, thanks for attending the Google I/O talk, we got great feedback, both positive and constructive. It was a great I/O and very nice to hang out with many of the developers at the event.

This talk is based on my previous experience where I have worked with and promoted the use of Spring, Guice, AspectJ, and many other great java technologies. I in fact mentioned my favorite configuration for restful services during my talk, which includes Guice.

Many things were taken out of context in this thread from the talk I gave, in particular the claim that you shouldn't use DI. I think DI works great, and I was merely giving people tips on getting fast startup for a Java app. I do still have my Rod Johnson bobblehead from when I attended The Spring Experience in 2007 on my desk. It's a strong reminder of where we started with Spring 1.0 and where we're at with new DI like Dagger.

On the orthogonal issue of cold start requests, we're certainly doing things to make sure that requests go to warm instances whenever possible. Whatever we do to improve this behaviour will be much simpler if your application starts up quickly.

Shaving 2 seconds of boot time off your application has a significant positive impact on billing and reliability when booting the large number of instances needed to scale your application. There are actions that you can take on the application level to improve this, and internally we are also working hard on improving scheduling and boot time.

If your application is seeing tons of cold starts, let us know, and we are happy to look in depth as to what is going on in our infrastructure. The vast majority of Java apps on App Engine boot very fast, and we are willing to spend time now to improve the few apps experiencing high variation of time during startup.

Recently we added dynamic loading requests to the administrative dashboard graphs. Hopefully this can help with identification of when changes to the application impact performance.

Would it be easier if we offered the minimum idle instances tuning parameter via an API? That way you could actually do your own prediction and adjust the min idle instances programmatically. That actually sounds very reasonable to do, but I'm just throwing ideas out now. We’re very open to any ideas that would make this simpler.

In my Google I/O talk my goal was to show that if you do use certain facilities in common java frameworks, you need to optimize a few specific points around them.

Matt (always reachable via irc : mattstep on irc.freenode.net in #appengine)

Marcel Overdijk

May 21, 2013, 3:32:30 AM

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

Hi Matt,

Thanks for joining this discussion (I think my tweet did wonders :-)

Anyway, back on topic: I agree with other people that requests should never go to cold instance startups.

I also wonder if shaving off 2 secs will have that much impact. Does it really matter going from 20s to e.g. 18s? Or from 10s to 8s?

For me the discussion goes further than only DI and/or AOP.

Personally I favor Spring, not only for DI but also for its strong (web) binding capabilities (and in particular i18n data binding). It's just a complete stack in which I can do traditional MVC and add REST as well very easily. It makes developing webapps productive compared to e.g. using plain servlets.

Of course I tried alternatives like the Jersey framework, but even those were not satisfactory in terms of cold startup times.

A plain Jersey app with 1 simple REST resource (without classpath scanning or even using JPA) takes 17s to start up.

And last weekend I experimented with a simple Google Endpoints app based on http://www.planetjones.co.uk/blog/19-05-2013/google-app-engine-endpoints-with-java-maven-part1.html but the cold startup time for this is also 12s.

Could you elaborate on your "The vast majority of Java apps on App Engine boot very fast" statement? What is fast? 3s, 5s, ..?

And on what technology/frameworks is this vast majority of apps based? It could help us get a broader picture.

Autoscaling is one of the unique features I like about GAE (along with a strong API including memcache, search, etc.). So setting minimum idle instances goes against this principle.

But this might be interesting for some apps.

At the moment I'm using GAE successfully for some clients, but those are small internal 1 instance apps. And I can say I'm happy with it.

But I probably would not use it at the moment for a public large enterprise webapp with unpredictable page visits.

Regards,

Marcel

Marcel Manz

May 21, 2013, 3:54:12 AM

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

Hi Matt,

Thanks for your feedback.

I really would like to understand the reason for Google being unable to introduce a switch labeled "route to warmed up instances only".

What is so complicated with that?

Personally I don't want API controls for min-idle/resident/scaling instances, as they still don't guarantee that requests never get routed to cold starts. As a matter of fact, that's exactly what has been observed by most: resident instances sit idle doing nothing, while new dynamic instances get hit with cold starts. Further, we're using PaaS precisely so that we don't have to take care of such scaling / cold start issues. If I have to take care of it, then this would be the wrong platform for me.

I really don't care if the instances take one, two or more seconds to boot. It's unacceptable to *ever* have requests getting directed to cold starts.

If AdWords ran on this system, Google would be losing billions of dollars if viewers were presented with ads 2 seconds, 1 second or even a few hundred milliseconds too late.

We don't want that and all we're asking for is consistent performance in response times. That requires routing to warmed up instances only.

Can you please spend that 1 million it might take your engineers to adapt the scheduler to the way it should be, as it will pay off greatly in the long run. You really have to do this if you want to become an Amazon killer and win the crowd for your platform.

Marcel

Ian Marshall

May 21, 2013, 5:11:39 AM

to Google App Engine

If (even an experimental) switch labelled

"Route requests to warmed-up instances only (except if none is
already running)"

were introduced, I think that you would find a huge take-up, which would
indicate the strong desire/need of GAE/J developers for such a
feature!

Drew Spencer

May 21, 2013, 5:35:14 AM

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

"I really would like to unterstand the reason for Google being unable to introduce a switch labeled "route to warmed up instances only".

What is so complicated with that?"

+1 from me

Anyone else get the feeling there's something else going on here? I'm the last person to want to accuse Google of skullduggery, but it's odd that SUCH a simple thing hasn't been put in place yet, isn't it? That and we're being told it can't be, when we know they make things beyond our wildest dreams every single year.

You can make Google Glass but you can't spin me up a new JVM a couple seconds before it's needed? Give us a break.

Ian Marshall

May 21, 2013, 5:35:10 AM

to Google App Engine

I am sure that most, if not all, contributors to this thread (except
perhaps Google's own Matt Stephenson!) have already starred GAE Issue
7865 (User-facing requests should never be locked to cold instance
starts) at:

https://code.google.com/p/googleappengine/issues/detail?id=7865&colspec=ID%20Type%20Component%20Status%20Stars%20Summary%20Language%20Priority%20Owner%20Log

It was raised, acknowledged and accepted last summer and has currently
got 155 stars.

If there is a "Beg" panel for this issue, I shall visit it and click
the "Beg" icon.

Drew Spencer

May 21, 2013, 5:41:26 AM

to google-a...@googlegroups.com

Unfortunately, it isn't that high in terms of stars: https://code.google.com/p/googleappengine/issues/list

timh

May 21, 2013, 6:42:31 AM

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

"Would it be easier if we offered the minimum idle instances tuning parameter via an API? That way you could actually do your own prediction and adjust the min idle instances programmatically. That actually sounds very reasonable to do, but I'm just throwing ideas out now. We’re very open to any ideas that would make this simpler."

I personally think this would be a great idea. I have built a number of apps over the years, and unlike big social sites etc., they just never get much traffic overnight. It would make sense for applications with this sort of traffic to be able to drop the number of idle instances down during quiet periods and ramp them back up for the busy times. (Oh, Python, but of course any scheduling would affect everybody ;-)
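
A minimal sketch of the kind of time-of-day schedule this would allow, assuming the proposed API existed. Nothing in this thread describes such an API, so `IdleInstanceSchedule` and its methods are hypothetical, and the call that would actually apply the value to App Engine is deliberately left out.

```java
import java.util.TreeMap;

// Hypothetical helper: maps an hour of day (0-23) to a desired number of
// min idle instances. Nothing here calls a real App Engine API.
final class IdleInstanceSchedule {
    // Key: hour at which a setting starts to apply. Value: min idle instances.
    private final TreeMap<Integer, Integer> fromHour = new TreeMap<>();

    IdleInstanceSchedule(int defaultMinIdle) {
        requireNonNegative(defaultMinIdle);
        fromHour.put(0, defaultMinIdle); // applies from midnight until overridden
    }

    // From startHour until the next later entry, keep minIdle instances warm.
    IdleInstanceSchedule from(int startHour, int minIdle) {
        requireHour(startHour);
        requireNonNegative(minIdle);
        fromHour.put(startHour, minIdle);
        return this;
    }

    // Looks up the setting in force at the given hour of day.
    int minIdleAt(int hourOfDay) {
        requireHour(hourOfDay);
        // Key 0 always exists, so floorEntry never returns null here.
        return fromHour.floorEntry(hourOfDay).getValue();
    }

    private static void requireHour(int hour) {
        if (hour < 0 || hour > 23) {
            throw new IllegalArgumentException("hour must be 0-23: " + hour);
        }
    }

    private static void requireNonNegative(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("must not be negative: " + n);
        }
    }
}
```

A cron job could evaluate something like `new IdleInstanceSchedule(0).from(8, 3).from(20, 1).minIdleAt(hour)` once an hour and pass the result to whatever setter Google ends up exposing.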

Vinny P

unread,

May 21, 2013, 2:42:23 PM5/21/13

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

Hello Matt,

First of all, I appreciated your Google I/O talk. It's always interesting to see what Google's thinking about my favorite language :-).

Secondly, I feel that we're caught in a Dilbert comic strip, in particular, this one here: http://imgur.com/0AHeS4Q. That is, on one hand the official company line is "J/GAE is great! So is DI/AOP!" and then on the other hand the unspoken "gentleman's understanding" is that if developers use DI/AOP, we get unacceptably long startup times. Startup times that turn off and rebuff users. Other people have said this before, but I will reiterate: startup times that far exceed 2-3 seconds are not acceptable in the long term. It doesn't matter if you reduce startup times of an app from 30 seconds to 10 seconds because the user has already browsed away after the first 5 seconds.

In complete fairness to Google, I'm not sure this problem is fixable without a whole lot of work and money spent. I want to say the answer is just taking a snapshot of the JVM + app (as someone already mentioned in this thread) and then redeploying the snapshot, but it's probably much more complicated than that. Java is a heavyweight in terms of initial startup - we're always going to have higher startup times compared to Go, PHP, etc.

Feature Request: Can you guys put up an automatic "Loading.../Redirecting..." page when requests are routed to cold instance starts? The major problem here really isn't the loading time, it's that there's no user-facing acknowledgement that something is going on. When a user clicks a link and gets directed to a cold-start instance they only see a blank page - they don't know if the page is loading, if their browser crashed, if the site broke down, etc. Have GAE offer an option where if an instance is cold started, and the loading is taking more than x seconds, the user is shown a "LOADING..." graphic, then is redirected to the page they requested when the instance is finished loading. This isn't new technology for Google - Gmail does the same thing if email loading is slow.
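
A rough sketch of the behaviour being requested, written as plain Java rather than anything GAE offers today. `LoadingPageFallback`, its page markup, and the 2-second refresh are all assumptions made for illustration; the real response is modelled as a `Future` that completes once the instance has finished loading.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical front-end helper: wait up to x milliseconds for the real
// response, and fall back to a self-refreshing "Loading..." page otherwise.
final class LoadingPageFallback {
    // The meta refresh makes the browser re-request the same URL after 2 seconds.
    static final String LOADING_PAGE =
        "<html><head><meta http-equiv=\"refresh\" content=\"2\"></head>"
        + "<body>LOADING...</body></html>";

    static String respond(Future<String> realResponse, long maxWaitMillis) {
        try {
            return realResponse.get(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException instanceStillStarting) {
            return LOADING_PAGE; // the instance keeps loading in the background
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        }
    }
}
```

This only makes sense for browsers. An API client would need the real response or an error status instead, which this sketch does not handle.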

Again, I enjoyed your talk, and thanks for the opportunity to discuss. If you're ever in Chicago hit me up and I'll buy you a beer ;-)

-Vinny

-----------------

-Vinny P

Technology & Media Advisor

Chicago, IL

My Go side project: http://invalidmail.com/

On Monday, May 20, 2013 7:19:27 PM UTC-5, Matt Stephenson wrote:

First of all, thanks for attending the Google I/O talk, we got great feedback, both positive and constructive. It was a great I/O and very nice to hang out with many of the developers at the event.

This talk is based on my previous experience where I have worked with and promoted the use of Spring, Guice, AspectJ, and many other great java technologies. I in fact mentioned my favorite configuration for restful services during my talk, which includes Guice.

Many things were taken out of context in this thread from the talk I gave. In particular the fact that you shouldn't use DI. I think DI works great, and I was merely giving people tips on getting fast startup for a java app. I do still have my Rod Johnson bobblehead from when I attended The Spring Experience in 2007 on my desk. It's a strong reminder of where we started with Spring 1.0 and where we're at with new DI like Dagger.

On the orthogonal issue of cold start requests, we're certainly doing things to make sure that requests go to warm instances whenever possible. Whatever we do to improve this behaviour will be much simpler if your application starts up quickly.

Shaving 2 seconds of boot time off your application has a significant positive impact on billing and reliability when booting a large number of instances needed to scale your application. There are actions that you can take on the application level to improve this, and internally we are also working hard on improving scheduling and boot time.

If your application is seeing tons of cold starts, let us know, and we are happy to look in depth as to what is going on in our infrastructure. The vast majority of Java apps on App Engine boot very fast, and we are willing to spend time now to improve the few apps experiencing high variation of time during startup.

Recently we added dynamic loading requests to the administrative dashboard graphs. Hopefully this can help with identification of when changes to the application impact performance.

Would it be easier if we offered the minimum idle instances tuning parameter via an API? That way you could actually do your own prediction and adjust the min idle instances programmatically. That actually sounds very reasonable to do, but I'm just throwing ideas out now. We’re very open to any ideas that would make this simpler.

In my Google I/O talk my goal was to show that if you do use certain facilities in common java frameworks, you need to optimize a few specific points around them.

Matt (always reachable via irc : mattstep on irc.freenode.net in #appengine)

On Monday, May 20, 2013 11:29:46 AM UTC-7, Marcel Manz wrote:

Marcel Manz

unread,

May 21, 2013, 3:03:05 PM5/21/13

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

Hi Vinny

Feature Request: Can you guys put up an automatic "Loading.../Redirecting..." page when requests are routed to cold instance starts? The major problem here really isn't the loading time, it's that there's no user-facing acknowledgement that something is going on. When a user clicks a link and gets directed to a cold-start instance they only see a blank page - they don't know if the page is loading, if their browser crashed, if the site broke down, etc. Have GAE offer an option where if an instance is cold started, and the loading is taking more than x seconds, the user is shown a "LOADING..." graphic, then is redirected to the page they requested when the instance is finished loading. This isn't new technology for Google - Gmail does the same thing if email loading is slow.

Fair input, but unfortunately totally unacceptable for our business case. See, not everyone is in machine <> human communications; many of our apps serve other machines in M2M (machine-to-machine) communication, and I certainly don't want to output any 'Come back later' page to a remote API expecting a different input.

This said, I'm currently forced to use resident backends in order to overcome the cold-start loading requests, but this solution won't scale automatically with the external load. Hence all the discussion about using standard frontend instances that can handle and scale with load, but require Google to do their homework and never serve requests from cold start instances.

Once this problem is solved, a big burden will be lifted from the platform, since it can then be used more flexibly than it can now. All other solutions that still involve cold instance starts are crap, sorry to say.

Marcel

Vinny P

unread,

May 21, 2013, 5:22:26 PM5/21/13

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

On Tuesday, May 21, 2013 2:03:05 PM UTC-5, Marcel Manz wrote:

...many of our apps serve other machines in M2M (machine 2 machine) communication and I certainly don't want to output any 'Come back later' page to a remote API expecting a different input.

Fair enough, I was just throwing that idea out as a temporary band-aid over the current issue.

On Tuesday, May 21, 2013 2:03:05 PM UTC-5, Marcel Manz wrote:

Hence all the discussion about using standard frontend instances that can handle and scale with load

Believe me, I'm in your camp. I'd love to see this fixed.

But a fix isn't going to be out anytime soon, and until it is, I (and my users) would appreciate a bit more user-friendliness!

Drew Spencer

unread,

May 22, 2013, 5:28:32 AM5/22/13

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

Good points from both of you gents.

I agree with Vinny that a loading message would be good, but obviously in cases such as Marcel's this is not a solution.

Please mind my ignorance, but why can't Google just keep track of how many requests are coming in and spin up a new instance whenever usage of the current one gets to 80% capacity, or something like that?
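
A minimal sketch of that threshold rule, assuming each instance can serve a fixed number of concurrent requests. `PrewarmPolicy` and its parameters are made up for illustration and say nothing about how GAE's scheduler actually works.

```java
// Hypothetical policy: keep enough instances running that utilization stays
// strictly below a threshold (e.g. 80%) of total capacity.
final class PrewarmPolicy {
    private final int thresholdPercent;    // e.g. 80
    private final int capacityPerInstance; // concurrent requests per instance

    PrewarmPolicy(int thresholdPercent, int capacityPerInstance) {
        if (thresholdPercent < 1 || thresholdPercent > 100 || capacityPerInstance < 1) {
            throw new IllegalArgumentException("invalid policy parameters");
        }
        this.thresholdPercent = thresholdPercent;
        this.capacityPerInstance = capacityPerInstance;
    }

    // How many additional instances to start right now (0 if none are needed).
    int instancesToStart(int inFlightRequests, int runningInstances) {
        if (inFlightRequests < 0 || runningInstances < 0) {
            throw new IllegalArgumentException("counts must not be negative");
        }
        // Smallest n such that inFlight * 100 < thresholdPercent * capacity * n.
        long perInstanceBudget = (long) thresholdPercent * capacityPerInstance;
        long needed = (inFlightRequests * 100L) / perInstanceBudget + 1;
        return (int) Math.max(0, needed - runningInstances);
    }
}
```

With `new PrewarmPolicy(80, 10)`, one instance handling 7 requests needs no help, while 8 requests (exactly 80%) triggers one new instance. The rule only helps when load ramps up more slowly than an instance takes to start.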

Are we mainly talking about (what I would call) extreme scenarios where traffic goes up exponentially in an instant, like when tickets go on sale or some such event?

Carl Schroeder

unread,

May 23, 2013, 12:00:07 AM5/23/13

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

The Java cold starts issue was severe enough that we rewrote our entire app in Go and Python (for the API bits that are not yet supported in Go).

GAE/Java worked well when we started with it, then the instance cold start issue appeared and that was a deal breaker.

Let me tell you how awesome it was to demo the application to VCs and have the REST requests take 20 seconds each because GAE thinks that cycling instances on and off is the right thing to do. It nearly ruined us.

Now that we are on Go, REST calls to cold instances take 100ms to complete. I don't recommend that everyone port to Go. We had no choice.

The GAE/Java dashboard is missing one check-box: "Never send a request to an instance that has not returned from _ah/warmup."
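
What that check-box would mean for routing can be sketched in a few lines. This is an illustration only: `WarmOnlyRouter` and its `Instance` type are hypothetical, and the `warmedUp` flag stands in for "has returned from _ah/warmup".

```java
import java.util.List;
import java.util.Optional;

// Hypothetical router: only instances that have finished warming up are
// eligible, and among those the shortest queue wins.
final class WarmOnlyRouter {
    static final class Instance {
        final String id;
        final boolean warmedUp;   // true once the warmup request has returned
        final int queuedRequests; // requests currently waiting on this instance

        Instance(String id, boolean warmedUp, int queuedRequests) {
            this.id = id;
            this.warmedUp = warmedUp;
            this.queuedRequests = queuedRequests;
        }
    }

    // Returns the warm instance with the shortest queue, or empty if every
    // instance is still cold (the caller must then wait, not route to a cold one).
    static Optional<Instance> pick(List<Instance> instances) {
        Instance best = null;
        for (Instance candidate : instances) {
            if (!candidate.warmedUp) {
                continue; // never send user traffic to a cold instance
            }
            if (best == null || candidate.queuedRequests < best.queuedRequests) {
                best = candidate;
            }
        }
        return Optional.ofNullable(best);
    }
}
```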

Rafael

unread,

May 23, 2013, 5:02:56 AM5/23/13

to google-appengine, Moshe Shaham, je...@infohazard.org

The same here,

Coding java-web without spring-mvc just feels like December 1973.

Does anyone have an actual solution? Maybe a more lightweight one that I don't know of?

thanks

rafa

Ryan Chazen

unread,

May 23, 2013, 2:07:18 PM5/23/13

to google-a...@googlegroups.com, Jeff Schnitzer

And it's possible to not use app engine at all, and then not have to worry about how fast your app starts. As a bonus, it's far cheaper as well! "Possible" is never a useful word to throw around here - it's possible to rewrite your java webapp in Go, it's possible to convince people they don't even need a webapp and just use pen and paper.

The point is, GAE/J pointing user requests to servers that have not finished loading is broken, and Google needs to fix it if they want to compete with Heroku/Azure/etc who all seem to have this issue sorted out. This isn't some kind of technical impossibility here - every other hosting provider of this type manages to solve this issue. And GAE is the most expensive of the lot in terms of compute resources. GAE should be offering the best solution, not a "jump through hoops and pray it works until the next GAE release where everything crawls at half speed for a day" solution.

On Wed, May 22, 2013 at 11:53 AM, Bediako George <bediako...@lucidtechnics.com> wrote:

So our Airlift application framework starts up pretty much immediately on App Engine for Java.

It runs on the Java runtime, but developers write request handlers in JavaScript via Rhino. Even though it is production ready (we do have several customers using it already) it is not ready for public consumption (we are looking to release it at the end of the summer).

I am not mentioning this as an alternative to what you are doing. Instead, I want to make the point that it is possible to create a framework in Java that plays well with App Engine.

https://github.com/LucidTechnics/Airlift

Bediako


Tom

unread,

May 23, 2013, 2:55:11 PM5/23/13

to google-a...@googlegroups.com

I'm not that experienced with GAE, but I'm wondering whether it really is that simple. If an instance of my app takes 10s to start up, and my app can, at peak, receive X requests in a 10s period, then wouldn't I require close to X idle instances running all the time to satisfy your criteria? And wouldn't that be prohibitively expensive for most apps?

If my understanding is correct, then I think a simple on/off option (which has been suggested) would not be a good solution for most developers. I certainly want them to figure out how to reduce response latencies and instance startup times, but I would need to be in control of the balance between response time and cost.

As to your point about Heroku/Azure, do you know how they resolved the problem? Did they manage to significantly reduce startup times or have they come up with some other scheme?

One final point - I'm baffled at Google's decision to support Go on GAE. When I looked at Go and wondered about design decisions that seemed odd, the answer I always got was that it was designed to reduce build times for enormous systems. How does that primary design criterion make it suitable for GAE? If they had instead added a runtime for something like LLVM then people like myself (and the vast number of other devs who know C/C++/OC) could take advantage of the near-instantaneous startup times that you are talking about, with our existing skills and libraries.

Tom

Nick

unread,

May 23, 2013, 7:33:50 PM5/23/13

to google-a...@googlegroups.com, Moshe Shaham, je...@infohazard.org

Shameless plug: http://3wks.github.io/thundr/

thundr is our lightweight web-mvc - we built it specifically for use on appengine.

No classpath scanning
Direct control of configuration (in testable code)
DI
Basic interceptor pattern for controllers
Data binding of web requests into controllers
Out of the box works with jsp, json, exception handling
Modular, so it can be easily extended (for example, we have modules for handlebars, google-prediction, bigquery, async http connectivity, google analytics, cloud storage, mailgun, webpurify and it's super easy to publish your own)

We're currently working on changes to make it more restful friendly - right now it's perfect for web apps that mix in services, but could be a little stronger for pure restful apps. A new version will be coming out soon.

Alas, it can't fix your cold start issues.

View our sample app here: https://github.com/3wks/thundr-sample

Rafael

unread,

May 24, 2013, 4:44:37 AM5/24/13

to google-appengine, Moshe Shaham, je...@infohazard.org

Cold startup issues shouldn't be an excuse for reinventing the wheel.

Does anyone know if Spring will ever be able to do its initialization work at compile time? Something like Android resources? At least the heavy mapping stuff?!

thanks

Jon Sawyer

unread,

May 24, 2013, 11:08:06 AM5/24/13

to Google App Engine, Moshe Shaham, je...@infohazard.org

Regarding Spring wiring itself ahead of time:

https://code.google.com/p/reflections/

and especially:

https://code.google.com/p/reflections/wiki/ReflectionsSpring

Note that I have not tried this at all. Spring is still initializing itself at startup, but the scanning is done at compile time, as I understand it.
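
The general idea of moving the scan to build time can be sketched without any library: a build step writes the names of the interesting classes to a text file, and startup just reads that file instead of walking the classpath. `PrecomputedClassIndex` below is a hypothetical illustration of that idea and is not the Reflections API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical startup helper: loads classes named in a pre-generated index
// (one fully qualified class name per line) instead of scanning the classpath.
final class PrecomputedClassIndex {
    static List<Class<?>> load(Reader index) {
        List<Class<?>> classes = new ArrayList<>();
        try (BufferedReader lines = new BufferedReader(index)) {
            String line;
            while ((line = lines.readLine()) != null) {
                String name = line.trim();
                if (name.isEmpty()) {
                    continue; // tolerate blank lines in the index file
                }
                classes.add(Class.forName(name));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            // The index was generated from a different build than the one running.
            throw new IllegalStateException("Stale class index: " + e.getMessage(), e);
        }
        return classes;
    }
}
```

The classes still have to be loaded and wired at startup, so this removes only the scanning cost, which matches the caveat above.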


--

Jon Sawyer

j...@jsawyer.net

Kristopher Giesing

unread,

May 24, 2013, 12:02:56 PM5/24/13

to google-a...@googlegroups.com

I don't think so. You just put the X requests into the queues for the N instances already running. This gives you additional latency for the X requests, of course - it's not free - but if the latency added is less than the instance startup time, then it's a better solution than what we have. In my own case, the queue length for existing instances is never greater than 1 or 2, and the average latency per request is about 200ms, while instance startup is 10s. The math in my case is pretty simple: it's hardly ever really worth spinning up a 2nd instance, but GAE does it pretty frequently.
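
The comparison can be written down as a small calculation. This is a simplified model under stated assumptions: every request takes the average latency, requests on one instance are served one after another, and a cold instance serves the request immediately after startup. The class and method names are made up for illustration.

```java
// Back-of-the-envelope model: is it faster to wait in an existing queue or to
// wait for a brand-new instance to start?
final class QueueOrColdStart {
    // Wait behind the requests already queued, then get served.
    static long queuedLatencyMillis(int requestsAhead, long avgLatencyMillis) {
        return (requestsAhead + 1L) * avgLatencyMillis;
    }

    // Wait for the instance to boot, then get served.
    static long coldStartLatencyMillis(long startupMillis, long avgLatencyMillis) {
        return startupMillis + avgLatencyMillis;
    }

    // True when joining the existing queue is at least as fast as a cold start.
    static boolean shouldQueue(int requestsAhead, long avgLatencyMillis, long startupMillis) {
        return queuedLatencyMillis(requestsAhead, avgLatencyMillis)
            <= coldStartLatencyMillis(startupMillis, avgLatencyMillis);
    }
}
```

With the numbers above (2 requests ahead, 200ms each, 10s startup) the queued request finishes in about 600ms versus about 10,200ms for the cold start. Under this model the queue stays the better choice until roughly 50 requests are waiting.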

Someone should craft a Google Code Jam problem based on this issue.

- Kris

Kristopher Giesing

unread,

May 24, 2013, 12:05:54 PM5/24/13

to google-a...@googlegroups.com

Oh, also: Google uses GAE internally. I've always assumed that the primary motivation for establishing Go support came from inside Google. (Not necessarily a bad thing - Google is on the cutting edge of web tech in a lot of ways, so if Google internally sees value in it, there's probably a good reason.)

- Kris

On Thursday, May 23, 2013 11:55:11 AM UTC-7, Tom wrote:

Bediako George

unread,

May 30, 2013, 5:51:42 AM5/30/13

to google-a...@googlegroups.com, Jeff Schnitzer

Hmmm ... I think what I was trying to say is that if you bring pre-conceived notions of how to build a Java application on GAE you will get burnt. Once you understand how GAE for Java works you will choose not to use frameworks that were not designed to run in a Java GAE environment, OR as you said you will not use App Engine at all.

I was also posting this for the benefit of people that would read this thread and just think "Java is slow on GAE" and "Java has start up issues on GAE". That is simply not the case. Hence the use of the word "possible".

Perhaps Sun is ultimately to blame for all of this ... "write once, run anywhere" is one of the best marketing slogans, but it is also technically incorrect for more than one reason. This whole discussion seems to point to one of them.

Bediako

On Thursday, May 23, 2013 2:07:18 PM UTC-4, Ryan Chazen wrote:

Jeff Schnitzer

unread,

May 30, 2013, 10:46:23 AM5/30/13

to Bediako George, Google App Engine

By that definition there are no frameworks which work on GAE.

Persistence? You're stuck with the low-level API. Everything else (including Objectify) requires some amount of classloading and introspection at startup. You just blew the 5s limit.

With all due respect to the Thundr developers - I'm sure it's a lovely framework, but it requires defining all your endpoints in a JSON file. I have hundreds of endpoints; if I wanted the "big chunk of XML" approach I'd use http://mav.sf.net/ which I wrote back in 2001. Also, the API requires returning things like JspView, which complicates testing. JAX-RS (which requires classloading and introspection) allows us to make clean, testable business methods... it's one of the few standards that came out of the JCP that didn't suck in the first iteration.

GAE can't expect us to roll back java development to 2001. Applications have gotten a lot more complicated since then. And even if we did, how much are we *really* going to improve startup time? Remember, cutting 30s down to 10s is irrelevant - the user still abandoned our site at 5s.

Jeff

Bediako George

unread,

May 31, 2013, 9:07:25 AM5/31/13

to google-a...@googlegroups.com, Bediako George, je...@infohazard.org

On Thursday, May 30, 2013 10:46:23 AM UTC-4, Jeff Schnitzer wrote:

By that definition there are no frameworks which work on GAE.

I would not say there are none. But if your point is that there are plenty that work just fine on the traditional JVM platform but they suck on GAE, well I agree.

Persistence? You're stuck with the low-level API. Everything else (including Objectify) requires some amount of classloading and introspection at startup. You just blew the 5s limit.

Are you suggesting that the low-level API is not very good? Works for me just fine ... could be that my applications are simple. Personally, I do not use Objectify or JDO. One of the reasons, and there are more than one, is precisely the point you mentioned.

With all due respect to the Thundr developers - I'm sure it's a lovely framework, but it requires defining all your endpoints in a JSON file. I have hundreds of endpoints; if I wanted the "big chunk of XML" approach I'd use http://mav.sf.net/ which I wrote back in 2001. Also, the API requires returning things like JspView, which complicates testing. JAX-RS (which requires classloading and introspection) allows us to make clean, testable business methods... it's one of the few standards that came out of the JCP that didn't suck in the first iteration.

GAE can't expect us to roll back java development to 2001. Applications have gotten a lot more complicated since then. And even if we did, how much are we *really* going to improve startup time? Remember, cutting 30s down to 10s is irrelevant - the user still abandoned our site at 5s.

So I would say again, if you let go of some of the things that got Java to where it is today, and embrace Google App Engine for what it is, 1) I don't think you will experience 5 second start up times, and 2) given the scalable cloud architecture, you should not feel like you are rolled back to 2001. But if you are hell bent on using the Java frameworks that are optimized for other platforms and environments on App Engine, then that seems to me that you are ice skating up hill. You would be better off not using Google App Engine at all.

Jeff Schnitzer

unread,

May 31, 2013, 10:54:33 AM5/31/13

to Bediako George, Google App Engine

If your application is "just fine" using the low-level API then yes, your applications are indeed simple. Mine are not.

Latest metric:

* 409 REST endpoints

* 69 persistent domain classes

* ~25k lines of Java (half that again in coffeescript)

...and growing every week as customers demand more features. This is still small beans as enterprise apps go. Abstractions like ORM let us scale complexity. There's a reason why we don't write large applications in assembler or C anymore.

Jeff

Cesium

unread,

May 31, 2013, 10:57:04 AM5/31/13

to google-a...@googlegroups.com, Bediako George, je...@infohazard.org

I suspect that's the first and last time we'll be hearing from Matt on this topic.

David

Rafael

unread,

May 31, 2013, 3:23:46 PM5/31/13

to google-appengine, Bediako George

Jeff is right.

In my opinion there's no point starting a project and losing time on something that's not going to grow. Unfortunately, coding in Java as if it were COBOL isn't scalable. Not only for code maintenance, but also for team morale.

I hope they fix this soon.


Ales Justin

unread,

May 31, 2013, 3:40:52 PM5/31/13

to google-a...@googlegroups.com

Hearing all these things about JEE on GAE being slow, etc.,

I was "saying" this two years ago at JavaOne:

* http://parleys.com/play/5148922a0364bc17fc56c653/chapter8/about

But apart from these "workarounds",

we've actually produced a lot more, an alternative to direct GAE:

* http://www.jboss.org/capedwarf

Where for all "Doubting Thomas's" we have this:

* https://github.com/GoogleCloudPlatform/appengine-tck

(in case you missed it at I/O ;-)

-Ales

Nick

unread,

May 31, 2013, 10:18:43 PM5/31/13

to google-a...@googlegroups.com

While I don't necessarily disagree with your points, Jeff, I feel that since you pulled out a couple of specific ones publicly here, it's worth addressing them here, but I'll be quick so I don't distract from the conversation at hand.

Regarding configuring routes in one location (in this case JSON, but Java is also possible, and more testable). In the context of a web application, having all the endpoint mappings in one location can really help with discoverability and maintenance. With restful services, there is a much more logical place to home things (usually around the entity), so having the endpoint definitions scattered around code is less problematic. If you have an application developed by many people, which is iterating quickly it can be very useful to see and update all endpoints in one spot. We chose this pattern because our experience on large apps before has shown that little used endpoints (for admin functions, rarely used user features, etc) are much more vulnerable to bitrot.

Regarding returning concrete view types, like JspView, rather than domain models:
I think you can split the needs of a framework right down the middle - data views and template views. Data views usually expose some sort of API model (which you back with a Java model), and consumers need to understand the available APIs. Most people do this restfully these days: GET, PUT, POST, DELETE.
Then there are template views. Rendering the template (usually to HTML, but also emails, messages etc.) folds together a couple of concepts.
1) screen flow - any POST (or PUT or DELETE) will have a paired GET request simply to show a page with a form and probably some JavaScript. Usually it goes further with input validation, redirect on POST etc.
2) supplementary/reference data - every drop down needs possible options, usernames are displayed, carousel widgets, navigation elements etc.

Specific view types allow the explicit definition of the required data. For example, a jsp requires a view name (the jsp file), and model data. Once it's concretely defined, it becomes easy to test.
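
To make the testing argument concrete, here is a small sketch. `SketchJspView` and `SketchAccountController` are hypothetical stand-ins written for this example; they are not thundr's actual classes, and the view name and model keys are invented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical concrete view type: just a view name plus the model data the
// template needs. Not thundr's real JspView.
final class SketchJspView {
    private final String viewName;
    private final Map<String, Object> model;

    SketchJspView(String viewName, Map<String, Object> model) {
        this.viewName = viewName;
        this.model = Collections.unmodifiableMap(new HashMap<>(model));
    }

    String viewName() {
        return viewName;
    }

    Map<String, Object> model() {
        return model;
    }
}

// Hypothetical controller: returns the view instead of writing to a response,
// so a test can inspect exactly which template and data were chosen.
final class SketchAccountController {
    SketchJspView edit(String username) {
        Map<String, Object> model = new HashMap<>();
        model.put("username", username);
        model.put("countries", Arrays.asList("AU", "US")); // reference data for a drop-down
        return new SketchJspView("account/edit.jsp", model);
    }
}
```

A unit test then needs no servlet container: call `edit("drew")` and assert on `viewName()` and `model()`.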

There are other advantages. For example, a unified output engine means you can reuse views, such as JSPs for rendering emails or MMS, rather than using a different templating language.

Tl;dr
Testability, maintainability and discoverability trump saving yourself typing 'new JspView(' or 'new Js<ctrl><space><enter>'

Alexandre Cassimiro Andreani

unread,

Jan 13, 2014, 7:15:34 AM1/13/14

to google-a...@googlegroups.com

News?
