Startup time exceeded...on F4?!


David Hardwick

Jul 12, 2012, 12:26:40 PM
to google-a...@googlegroups.com
Hello,

I realize there's been a lot of discussion about exceeded startup times on this forum recently, but I needed to post the experience we had this morning to keep attention on this important issue.

We uploaded a point release of our app to a "not-live" version this morning and, of course, planned to click around on that instance to make sure it was all kosher before making that version "live."  The warm-up requests for the "not-live" version were exceeding the deadline limit of 60s... and we are on F4s!!

However, the LIVE version of the app crashed too: 500 server errors, instance counts dropping to zero, all sorts of wacky stuff in the control panel.  All of that happened to our LIVE version when all we did was upload another "non-live" version and hit it with a single request... did I mention we were on F4s?  ;-)  Does any instance exceeding the 60s limit take down all instances, including the live ones?

We did a few things as quickly as possible since our live application was down, so clearly we didn't have time for the scientific approach of changing one thing at a time and waiting to see if that fixed it.

We...
1. Switched from F4s to F2s (I figured this would at least get us onto some new servers/instances)
2. Increased max idle instances from 1 to 2 (with F4s running, I'm fine with having just 1 idle instance and not at all happy about paying for 2 idle instances, so maybe we'll just increase this prior to deployments and dial it back down after the deployment succeeds, until we know more)
3. Made the recently uploaded version live (hey, why not, the production app was down for 10 minutes, so how much more harm could we do?)

We use GWT and Guice, and we jar everything (I have been paying attention to these startup-time discussions for quite some time now).  We are also considering switching our Guice libraries to a non-AOP version, as we saw suggested in another blog, since we just need the injection.
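(For context, the "non-AOP version" is the no_aop build of Guice, which drops bytecode interception. Plain injection code is unchanged under it; a minimal sketch, with PanelService and DatastorePanelService as hypothetical names:)

    import com.google.inject.AbstractModule;
    import com.google.inject.Guice;
    import com.google.inject.Injector;

    // Hypothetical service pair illustrating plain linked bindings,
    // which need no AOP support from Guice.
    interface PanelService { String greet(); }

    class DatastorePanelService implements PanelService {
        public String greet() { return "hello from the datastore-backed service"; }
    }

    class AppModule extends AbstractModule {
        @Override protected void configure() {
            // Simple bindings like this avoid method interception entirely,
            // so the lighter no_aop jar is sufficient.
            bind(PanelService.class).to(DatastorePanelService.class);
        }
    }

    class Bootstrap {
        public static void main(String[] args) {
            Injector injector = Guice.createInjector(new AppModule());
            System.out.println(injector.getInstance(PanelService.class).greet());
        }
    }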

Any insight? I'm all ears!  app_id=s~myflashpanel

Regards,
  -Hardwick

--

We make Google Apps even better.

David Hardwick
CTO
david.h...@bettercloud.com
 
Signature by Flashpanel


David Hardwick

Jul 12, 2012, 3:46:06 PM
to Google App Engine
Some additional observations and questions...

After reading this Stack Overflow article [Link 1], which mentioned an issue with having your Max Idle count below 6, we started looking at the warmup requests in our staging environment, because that app-id has Idle Instances set to Auto-Auto, while production had specific values.

But... where did all the "/_ah/warmup" requests go? When doing a label search of the staging environment logs for "path:/_ah/warmup", we couldn't find any warmup requests (yes, we have warmup requests turned on). We would just see the first cold-start request take around 15 seconds to load on an F1 and 10 seconds on an F2.

I even shut down every instance and hit the staging server again to see if I could find a warmup request in the logs... nope. Honestly, I would rather have a user wait 10 seconds for the first request to that server than risk the warmup requests failing again.

Where did all the "/_ah/warmup" requests go? More importantly, why would we see such different times for warmup requests compared to cold starts? Shouldn't they be nearly identical?!

Rock on,
-Hardwick

[Link 1] - http://stackoverflow.com/questions/9422698/ah-warmup-producing-harddeadlineexceedederror



Tom Phillips

Jul 12, 2012, 4:39:48 PM
to Google App Engine
Interesting... I checked, and I too have 100% of my loading requests on user-facing URLs instead of /_ah/warmup.

Warmup requests are enabled and Automatic-Automatic for both instance
sliders.

I used to see at least a decent percentage of loading requests on /_ah/warmup, but haven't looked in quite a while.

/Tom


Michael Hermus

Jul 13, 2012, 8:52:17 AM
to google-a...@googlegroups.com
Same for me... I just checked: no calls to warmup, lots of loading requests.

**shakes fist at App Engine**

Takashi Matsuo

Jul 13, 2012, 9:28:07 AM
to google-a...@googlegroups.com

Warmup requests are now fired only if you set min idle instances to a specific value (not Automatic).
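(For Java apps, warmup also has to be enabled in appengine-web.xml via <inbound-services><service>warmup</service></inbound-services>, and you can map a servlet to /_ah/warmup to force expensive initialization early. A minimal sketch; WarmupServlet and the nested AppBootstrap are illustrative names, not anything the SDK requires:)

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Illustrative servlet mapped to /_ah/warmup in web.xml; any costly
    // one-time setup done here happens before user traffic arrives.
    public class WarmupServlet extends HttpServlet {

        // Stand-in for the app's real initialization work.
        static final class AppBootstrap {
            static void preload() { /* e.g. build the injector, warm caches */ }
        }

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            AppBootstrap.preload();
            resp.setStatus(HttpServletResponse.SC_OK);
        }
    }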

-- Takashi



Mauricio Aristizabal

unread,
Jul 13, 2012, 5:56:47 PM7/13/12
to google-a...@googlegroups.com
Takashi, this is a piece of info that should be clearly spelled out in the docs, IMO.  I spent many frustrating hours a while back trying to figure out why my warmup requests were not working, until I pieced it together from different posts such as this one and got 1 resident instance configured.  Hopefully new users won't have to go through this pain, which needlessly makes GAE look bad.

-Mauricio


Wilson MacGyver

Jul 13, 2012, 6:18:40 PM
to google-a...@googlegroups.com
Any reason behind this change? It seems kind of strange. I too was puzzled by the behavior till you posted this. I had my min set to Automatic also.
Omnem crede diem tibi diluxisse supremum.

Takashi Matsuo

Jul 13, 2012, 7:07:55 PM
to google-a...@googlegroups.com

Hi Mauricio,

Yes, I understand that, and we've been working on it. Sorry that it's taking so long.

Hi Wilson,

It's coming with the pricing change. Since we started charging by the number of instances, we needed to provide more control over how we spin up new instances.
In other words, if we had kept the old behavior, many developers would have complained that "App Engine spins up unnecessary instances."

We believe that it makes more sense when warmup requests are used only for resident idle instances.

-- Takashi

Tom Phillips

Jul 13, 2012, 11:41:46 PM
to Google App Engine
Hi Takashi,

Now that I've added a resident idle instance, the logs seem to confirm
that only the loading of the resident instance is via /_ah/warmup.
It's difficult to tell that for sure, but all my current dynamic
instances were loaded on a user URL, and the current resident
instance was loaded via /_ah/warmup.

Resident idle instances are loaded only infrequently, and serve little
traffic, so of what use are warmup requests if only they get them? The
vast majority of loading requests are now on user URLs. That's a 20+
second wait (Java) by a user on almost every loading request.

Back in the Always-on days, I would see virtually all loading requests
use /_ah/warmup, as long as traffic was relatively stable. So an
instance loading rarely affected an actual user. What do we configure
now to get this same behavior?

/Tom

Takashi Matsuo

Jul 14, 2012, 9:49:46 AM
to google-a...@googlegroups.com


On Jul 14, 2012 12:42 PM, "Tom Phillips" <tphil...@gmail.com> wrote:
>
> Hi Takashi,
>
> Now that I've added a resident idle instance, the logs seem to confirm
> that only the loading of the resident instance is via /_ah/warmup.
> It's difficult to tell that for sure, but all my current dynamic
> instances were loaded on a user URL, and  the current resident
> instance was loaded via /_ah/warmup.

Yes, it is expected behavior.

>
> Resident idle instances are loaded only infrequently, and serve little
> traffic, so of what use are warmup requests if only they get them? The
> vast majority of loading requests are now on user URLs. That's a 20+
> second wait (Java) by a user on almost every loading request.

Those resident instances are used when there is no available dynamic instance. If there's no available resident instance either, your users will see loading requests.

In general, if you see too many user-facing loading requests, there is still room to raise your number of resident instances in order to reduce them.

>
> Back in the Always-on days, I would see virtually all loading requests
> use /_ah/warmup, as long as traffic was relatively stable. So an
> instance loading rarely affected an actual user. What do we configure
> now to get this same behavior?

You probably can't expect the very same behavior.

How many resident instances are you configured? Are you configured the 'Max Idle instances' to 'Automatic'? How spiky is your access pattern? Are you using the concurrent requests?

-- Takashi

Takashi Matsuo

Jul 14, 2012, 9:53:20 AM
to google-a...@googlegroups.com


On Jul 14, 2012 10:49 PM, "Takashi Matsuo" <tma...@google.com> wrote:
> How many resident instances are you configured? Are you configured the 'Max Idle instances' to 'Automatic'? How spiky is your access pattern? Are you using the concurrent requests?

Sorry, my sentences above are not correct. I meant:

How many resident instances do you have now? Did you set 'Max Idle instances' to 'Automatic'? How spiky is your access pattern? Are you using concurrent requests?

Tom Phillips

Jul 14, 2012, 11:30:42 AM
to Google App Engine
Hi Takashi,

Most of my traffic is from other systems, with predictable ramp-ups/downs at peak periods during weekdays (app is cliniconexapp-hrd). Not sure if it would be considered spiky.

Right now I have 1 resident and 4 dynamic. That's more instances than usual, though - usually only a couple, and App Engine just seems to be keeping a few extra around right now for whatever reason.

Java, multi-threaded. Max idle instances is automatic.

The resident instance handles almost no requests (only 3 so far). If a
dynamic instance needs to be started, couldn't the scheduler send at
least some requests to the idle instance while calling /_ah/warmup on
the spawning dynamic one?

The main problem with loading requests for me is an incoming SMS handler for Twilio. Twilio callbacks have a hard timeout of 15 seconds, so if one hits a loading request, the user will see a fallback error SMS response. Not ideal, and I'd like to minimize the chances of it.

It sounds like I could pay to reserve even more excess capacity to do
this, but I'd think that my 1 resident instance could at least be used
a bit (say 30% load temporarily) to allow some predictive loading of
dynamic instances via /_ah/warmup.

/Tom




Michael Hermus

Jul 15, 2012, 9:40:14 AM
to google-a...@googlegroups.com
I have to agree with this; it seems completely backwards to me. Wouldn't resident instance warmups be extremely infrequent, since they are... well, resident! Under load, I would want user-facing requests to hit the resident instance UNTIL a new dynamic instance has been spun up AND hit with a warmup request.




Simon Knott

Jul 15, 2012, 11:18:42 AM
to google-a...@googlegroups.com
Completely agree; this seems to defeat the entire purpose of warm-up requests.

Takashi Matsuo

Jul 15, 2012, 9:04:10 PM
to google-a...@googlegroups.com

I would say the best bet is to set an appropriate number of min idle instances.

Tom,

> The resident instance handles almost no requests (only 3 so far). If a
> dynamic instance needs to be started, couldn't the scheduler send at
> least some requests to the idle instance while calling /_ah/warmup on
> the spawning dynamic one?

No, at least for now, because the scheduler cannot tell beforehand how long a warmup request will take, or even whether it will succeed; those subsequent requests would see significant, unpredictable pending time, and probably an elevated error rate in some cases.

Michael,

> Under load, I would want user-facing requests to hit the resident instance UNTIL a new dynamic instance has been spun up AND hit with a warmup request.

If you set enough min idle instances to absorb your traffic, it should be OK.

In my understanding it might technically be possible for us to provide prediction-based warmup requests, but that would cause complaints from many people who want to save money. We now offer configurable min idle instances, so please consider using this feature.

If you have any use case or experience which cannot be covered by setting min idle instances, please let us know.

-- Takashi



Richard Watson

Jul 16, 2012, 3:00:09 AM
to google-a...@googlegroups.com
On Monday, July 16, 2012 3:04:10 AM UTC+2, Takashi Matsuo (Google) wrote:

> I would say the best bet is to set an appropriate number of min idle instances.

But Tom seems to think that "1" is an appropriate number for his app. Why offer that option if it's automatically wrong?

>> The resident instance handles almost no requests (only 3 so far). If a
>> dynamic instance needs to be started, couldn't the scheduler send at
>> least some requests to the idle instance while calling /_ah/warmup on
>> the spawning dynamic one?
>
> No, at least for now, because the scheduler cannot tell beforehand how long a warmup request will take, or even whether it will succeed; those subsequent requests would see significant, unpredictable pending time, and probably an elevated error rate in some cases.

I don't get this yet.  A request is coming in.  App Engine knows there is an idle instance whose exact purpose is "absorbing traffic", and a request would obviously count as "traffic".  If we don't know how long the warmup request will take, surely that applies to the user request as well?  So which is worse: fulfilling the user request with an instance we know will work, while handling the vagaries of the warmup out-of-band, or shooting the customer request directly at said vague warmup process, leaving a pristine idle instance to handle other traffic that we aren't guaranteed will come?

Maybe the definition of "idle" shouldn't be as binary as it is.  An instance that is doing 10% of its potential is pretty close to idle. I'd rather only warm up a new instance when the first one hits, e.g., 70%, rather than always paying for a just-in-case instance and thereby guaranteeing that I'm paying too much. Having said that, I have no experience with hugely spiky traffic, so maybe others know why this idea sucks. But for users who measure their instance count at less than 10, surely this would work better?

Also, surely each time a warmup is performed, App Engine could just add the startup time to its knowledge: "This app starts within X seconds 90% of the time."

--
Richard 

Jeff Schnitzer

Jul 16, 2012, 4:18:20 AM
to google-a...@googlegroups.com
I vaguely expect something like this:

* All incoming requests go into a pending queue.
* Requests in the pending queue are handed off to warmed-up instances only.
* New instances are started up based on the (adjustable) depth of the pending queue.
* If there aren't enough instances to serve the load, the pending queue backs up until more instances come online.

Isn't this fairly close to the way App Engine works?  What puzzles me is why requests would ever be removed from the pending queue and sent to a cold instance.  Even in Pythonland, 5-10s startup times are common.  It seems like the request is almost certainly better off waiting in the queue.
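(A toy sketch of that policy; PendingQueueModel and its names are invented for illustration, not App Engine internals:)

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Toy model of the dispatch policy sketched above: requests wait in a
    // single pending queue and are only ever handed to warmed-up instances;
    // queue depth, not an individual request, triggers new instance startup.
    final class PendingQueueModel {
        private final BlockingQueue<Runnable> pending = new LinkedBlockingQueue<>();
        private final int spinUpThreshold;

        PendingQueueModel(int spinUpThreshold) {
            this.spinUpThreshold = spinUpThreshold;
        }

        void enqueue(Runnable request) {
            pending.add(request);
            if (pending.size() > spinUpThreshold) {
                startColdInstance(); // begins warmup; serves nothing yet
            }
        }

        // Called by an instance only after its warmup has finished.
        Runnable takeWorkForWarmInstance() throws InterruptedException {
            return pending.take();
        }

        private void startColdInstance() {
            // Placeholder: in this model a cold instance never receives a
            // user request; it joins the pool once /_ah/warmup completes.
        }
    }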

Jeff


Michael Hermus

Jul 16, 2012, 7:58:37 AM
to google-a...@googlegroups.com
Takashi: Like Richard, I really don't understand; perhaps we are missing something obvious, and if so, I apologize in advance. The scenario that Jeff laid out is exactly what I would expect, but from what I can gather, that is NOT how App Engine currently works.

For my particular application right now it is not a huge problem, but there are many scenarios where it might become one. More importantly, it just doesn't make sense to me, and that is a big warning sign. What else might change in a way that makes no sense?




Takashi Matsuo

Jul 16, 2012, 4:15:14 PM
to google-a...@googlegroups.com

Richard,

> But Tom seems to think that "1" is an appropriate number for his app. Why offer that option if it's automatically wrong?

If his purpose is to reduce the number of user-facing loading requests and he still sees many of them, the current setting is not enough.

Jeff,

> I vaguely expect something like this:
>  * All incoming requests go into a pending queue.
>  * Requests in the pending queue are handed off to warmed-up instances only.
>  * New instances are started up based on the (adjustable) depth of the pending queue.
>  * If there aren't enough instances to serve the load, the pending queue backs up until more instances come online.
> Isn't this fairly close to the way App Engine works?  What puzzles me
> is why requests would ever be removed from the pending queue and sent
> to a cold instance.  Even in Pythonland, 5-10s startup times are
> common.  It seems like the request is almost certainly better off waiting
> in the queue.

Probably reading the following would help in understanding the scheduler:

A request comes in. If there's an available dynamic instance, it is handled by that dynamic instance. Otherwise, if there's an available resident instance, it is handled by that resident instance. Otherwise it goes into the pending queue, from which it can be sent to any instance that becomes available at any time (fortunate for it). Finally, according to the pending latency settings, it will be sent to a new cold instance.
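(In condensed form, the order described above; Scheduler, Instance, and Request are invented stand-ins, not actual scheduler code:)

    import java.util.Queue;

    interface Instance { void handle(Request r); }
    interface Request { long pendingMillis(); }

    interface Scheduler {
        Instance availableDynamicInstance();   // null if none is free
        Instance availableResidentInstance();  // null if none is free
        Queue<Request> pendingQueue();
        long maxPendingLatencyMillis();
        Instance startColdInstance();          // pays the loading-request cost
    }

    final class RoutingModel {
        // Mirrors the order above: dynamic, then resident, then the pending
        // queue, then (per the pending latency settings) a cold start.
        static Instance route(Request r, Scheduler s) {
            Instance dyn = s.availableDynamicInstance();
            if (dyn != null) return dyn;
            Instance res = s.availableResidentInstance();
            if (res != null) return res;
            s.pendingQueue().add(r);
            if (r.pendingMillis() > s.maxPendingLatencyMillis()) {
                return s.startColdInstance();
            }
            return null; // still waiting in the pending queue
        }
    }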

So, if you prefer the pending queue to a cold instance, you can set a high minimum pending latency; however, that might not be what you really want, because it will hurt the performance of subsequent requests.

Generally speaking, just looking at statistics for a spiky event, you might feel that our scheduler could do better; the difficult part, however, is that the requests in those statistics were not issued at a flat rate. In other words, the scheduler starts a new dynamic instance because it is really needed at that moment.

Well again, in order to reduce the number of user-facing loading requests, the most effective thing is to set a sufficient number of min idle instances. The second thing to consider would be, if you have longer backend tasks, putting those tasks into another version, in order to avoid blocking other frontend requests. If you use the python2.7 runtime with concurrent requests enabled, you'd probably do better to isolate CPU-bound operations from the user-facing frontend version, in order to avoid slowing it down.

It would probably be great if we offered an API for dynamically configuring the performance settings, especially in terms of cost efficiency. I think it's worth filing a feature request.
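(No such API exists; purely as a sketch of what that feature request might ask for, with every name hypothetical:)

    // Entirely hypothetical interface sketching the feature request above;
    // nothing like this exists in the App Engine SDK.
    interface PerformanceSettingsApi {
        void setMinIdleInstances(int count);       // e.g. raise just before a deploy
        void setMaxIdleInstances(int count);
        void setMinPendingLatencyMillis(long ms);  // steer requests toward the queue...
        void setMaxPendingLatencyMillis(long ms);  // ...or toward new instances
    }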

--
Takashi Matsuo

Jeff Schnitzer

Jul 16, 2012, 6:10:55 PM
to google-a...@googlegroups.com
Hi Takashi. I've read the performance settings documentation a dozen times, and yet the scheduler behavior still seems flawed to me.

Once a request is taken from the pending queue and sent to an instance
(cold or otherwise), it's dedicated to execution on that instance. In
the queue, it can still be routed to any instance that becomes
available. Why would we *ever* want to send a request to a cold
instance, which has an unknown and unpredictable response time? If I
were that request, I'd want to sit in the queue until a known-good
instance becomes available. Depending on the queue fill rate I might
still end up waiting for an instance to come online... but there's
also a good chance I'll get handled by an existing instance,
especially if traffic is bursty.

"the scheduler starts a new dynamic instance because it is really
needed at that moment." -- this is not an accurate characterization,
because new instances don't provide immediate value. They only
provide value 5+ (sometimes 50+) seconds after they start. In the
mean time, they have captured and locked up user-facing requests which
might have been processed by running instances much faster.

The min latency setting is actually working against us here. What I
really want is a high (possibly infinite) minimum latency for moving
items from pending queue to a cold instance, but a low minimum latency
for warming up new instances. I don't want requests waiting in the
pending queue, but it does me no good to have them sent to cold
instances. I'd rather they wait in the queue until fresh instances
come online.

Jeff

Mauricio Aristizabal

Jul 16, 2012, 6:19:43 PM
to google-a...@googlegroups.com
My Spring app takes about 40s to start up.  So I guess even with warmup requests configured my users are sometimes getting 40s load times.  That is not good.

I hope this thread does surface a solution for dealing with this problem.

Rerngvit Yanggratoke

Jul 17, 2012, 2:51:58 AM
to google-a...@googlegroups.com
That would be very useful. If you file a feature request, I will star it as well.
--
Best Regards,
Rerngvit Yanggratoke 

Takashi Matsuo

Jul 17, 2012, 8:21:03 AM
to google-a...@googlegroups.com

Hi Jeff,

On Tue, Jul 17, 2012 at 7:10 AM, Jeff Schnitzer <je...@infohazard.org> wrote:
> Hi Takashi. I've read the performance settings documentation a dozen
> times, and yet the scheduler behavior still seems flawed to me.

I would rather not use the word 'flawed' here, but there is probably still room for improvement. First of all, is there any reason why you cannot use the min idle instances setting? Is it just because of the cost?

If so, could 'introducing a somewhat lower price for resident instances' be a workable feature request?

I have a vague feeling that what you're trying to accomplish here is to save money while acquiring good performance. If so, that is one of the most difficult things to implement. In my opinion, however, it's worth trying, so let's continue the discussion.

> Once a request is taken from the pending queue and sent to an instance
> (cold or otherwise), it's dedicated to execution on that instance. In
> the queue, it can still be routed to any instance that becomes
> available. Why would we *ever* want to send a request to a cold
> instance, which has an unknown and unpredictable response time? If I
> were that request, I'd want to sit in the queue until a known-good
> instance becomes available. Depending on the queue fill rate I might
> still end up waiting for an instance to come online... but there's
> also a good chance I'll get handled by an existing instance,
> especially if traffic is bursty.

"the scheduler starts a new dynamic instance because it is really
needed at that moment."  -- this is not an accurate characterization,
because new instances don't provide immediate value.  They only
provide value 5+ (sometimes 50+) seconds after they start.  In the
mean time, they have captured and locked up user-facing requests which
might have been processed by running instances much faster.

If you have an app with an average 50+s loading time, I totally understand that you strongly want to avoid sending requests to cold instances. On the other hand, there are also many well-behaved apps with <5 sec loading/warming times. Please understand that for those apps it is still acceptable if we send requests to cold instances, so it's likely we cannot prioritize the feature below over other things. However...


> The min latency setting is actually working against us here. What I
> really want is a high (possibly infinite) minimum latency for moving
> items from the pending queue to a cold instance, but a low minimum latency
> for warming up new instances. I don't want requests waiting in the
> pending queue, but it does me no good to have them sent to cold
> instances. I'd rather they wait in the queue until fresh instances
> come online.

To me it looks like a great idea. Can you file a feature request for this, so that we can get a rough idea of how many people want it and start an internal discussion?

Thanks as always,

-- Takashi




Jeff Schnitzer

Jul 17, 2012, 9:43:56 PM
to google-a...@googlegroups.com
On Tue, Jul 17, 2012 at 5:21 AM, Takashi Matsuo <tma...@google.com> wrote:
>
> On Tue, Jul 17, 2012 at 7:10 AM, Jeff Schnitzer <je...@infohazard.org> wrote:
>>
>> Hi Takashi. I've read the performance settings documentation a dozen
>> times, and yet the scheduler behavior still seems flawed to me.
>
> I would rather not use the word 'flawed' here, but there is probably still
> room for improvement. First of all, is there any reason why you cannot use
> the min idle instances setting? Is it just because of the cost?

My goal is to have every request serviced with the minimum latency possible.

Leaving aside implementation complexity, there doesn't seem to be any
circumstance when it is efficient to remove a request from the pending
queue and lock it into a cold start. There are really two cases:

1) The request is part of a sudden burst. The request will be
processed by an active instance before any new instances come online.
It therefore should stay in the queue.

2) The request is part of new, sustained traffic.  Whether the request waits in the pending queue for new instances to warm up, or waits at a specific cold instance, it is still going to wait. At least if it stays in the pending queue there's a chance it will get routed to the first available instance... which overall is likely to be better than any particular (mis-)behaving start.

Imagine you're at the supermarket checkout. The ideal situation is to
have everyone waiting in one line and then route them off to cashiers
as they become available. If the pending queue gets too long, you
open more cashiers (possibly multiple at once) until the line shrinks.
If every cashier has a separate queue, it's really hard to optimize
the # of cashiers since you have some with a line 5 deep and some
sitting idle.

I'm fully prepared to believe there are implementation complexities that make a single pending queue difficult, but what I'm hearing is that Google deliberately *wants* to send requests to cold starts... which seems "flawed" to me.  Am I missing something?

> If so, could 'introducing a somewhat lower price for resident instances' be a
> workable feature request?
>
> I have a vague feeling that what you're trying to accomplish here is to
> save money while acquiring good performance. If so, that is one of the most
> difficult things to implement. In my opinion, however, it's worth trying,
> so let's continue the discussion.

Let's forget price for a moment and just try to work towards the goal
of having an efficient system. Presumably, a more efficient system
will be more cost-effective than a less efficient system that has lots
of idle instances sitting around hogging RAM. Good for Google, good
for us.

> If you have an app with an average 50+s loading time, I totally understand that
> you strongly want to avoid sending requests to cold instances. On the other
> hand, there are also many well-behaved apps with <5 sec loading/warming
> times. Please understand that for those apps it is still acceptable if we
> send requests to cold instances, so it's likely we cannot prioritize the
> feature below over other things. However...

Someone at Google must have a chart with "Time" on the X axis and "% of Startup Requests" on the Y axis - basically a chart of what percentage of startup requests in the Real World are satisfied at various time boundaries.  I have a pretty good idea, I think, of what this chart looks like.  I'm also fairly certain that the Python chart looks nothing like the Java chart.

For one thing, the Java chart _starts_ at 5s. The bare minimum Hello,
World that creates a PersistenceManagerFactory with one class (the
Employee in the docs) takes 5-6s to start up. And this is when GAE is
healthy; that time can easily double on a bad day.

So if you optimize GAE for apps with a <5s startup time, you're
optimizing for apps that don't exist - at least on the JVM. I'd be
*very* surprised if the average real-world Java instance startup time
was less than 20s. You just don't build apps that way in Java. Given
a sophisticated application, I'm not even sure it's possible unless
the only datatypes you allow yourself are ArrayList and HashMap.

>> The min latency setting is actually working against us here. What I
>> really want is a high (possibly infinite) minimum latency for moving
>> items from pending queue to a cold instance, but a low minimum latency
>> for warming up new instances. I don't want requests waiting in the
>> pending queue, but it does me no good to have them sent to cold
>> instances. I'd rather they wait in the queue until fresh instances
>> come online.
>
> To me it looks like a great idea. Can you file a feature request for this,
> so that we can get a rough idea of how many people want it and start an
> internal discussion?

http://code.google.com/p/googleappengine/issues/detail?id=7865

I generalized it to "User-facing requests should never be locked to
cold instance starts".

Thanks,
Jeff

Drake

Jul 18, 2012, 12:55:36 AM
to google-a...@googlegroups.com
Jeff,

Check the archives; there are several checkout-lane analogies that I have posted.

I agree that the queue is suboptimal, but it is more suboptimal the smaller you are. When you get to 50 instances, it is amazing how well the load balancing works. On the climb up to peak, new instances spin up on requests rather than causing cascading failures or dramatic spin-ups. And on the way down, instances wind down and reach end-of-life gracefully.

Using your grocery store analogy, imagine that you are optimizing for a guarantee that you will be checked out within 30 seconds of entering the queue. The ideal scenario is that when you get to a spot where you know you are 15 seconds from being checked out, and it takes 15 seconds to "open a new lane," you send users to go stand in line while the register opens.

Your goal is to never have to pay out on that guarantee, not to serve the highest percentage in the least time. When this is your ideal QoS, the current load balancing does really well. It does better when it has 10 registers and can open 2 at a time than when it has 1 register and needs to decide whether to double capacity.

-Brandon


Richard Watson

Jul 18, 2012, 4:17:00 AM
to google-a...@googlegroups.com
OK, but what part of Jeff's suggestion would impact performance negatively at your scale?  I assume the GAE team would ensure they don't harm your case while trying to improve the few-instances case.

Jeff Schnitzer

Jul 18, 2012, 7:43:32 AM
to google-a...@googlegroups.com
I don't buy this. Even if the checkout queue is 15s deep and it takes 15s to bring a cashier online, there's still no value in sending Bob to wait by the register. For one thing, it doesn't actually reduce Bob's wait time by a material amount (he can move to the register in a few milliseconds). For another, we're only *speculating* that the register will be open in 15s - sometimes it's 30s or 45s, because the cashier is hungover and keeps dropping the keys.

Sure, it's a lot easier to balance a 50-instance problem, because a load that size will have a lot more inertia and be less spiky. But I still want to know what percentage of your user-facing requests are hitting cold starts, and how many idle instances you need to keep that from happening.

I have an e-commerce site. It doesn't get a lot of traffic but each
request is very important - people are digging out their credit cards
and buying things. It's *never* ok for a 20s pause to interrupt this
process; people don't like giving money to sites that seem broken.

Jeff

Drake

Jul 18, 2012, 2:27:45 PM
to google-a...@googlegroups.com
Use warm up requests.

Also, optimize your code to load faster. I have an app that is HUGE; it loads in about 8s on an F4 (it doesn't fit in an F2). Check that you really need all your libraries. Check that any large blocks of data you have in code files are in the database/memcache. If you need to, use a cron job to put them back in memcache every 10 minutes.

If you have multiple people using the same code base, make your app multi-tenant so that you have fewer spin-ups and more instances to serve requests.

If you are storing init variables in the datastore, make sure you also put them in memcache so you can get them faster. Marshal or pickle init variables (see the sketch after this list).

Use inline imports for functions that are rarely called.

Strip unused classes/functions from libraries and frameworks.
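(The list above is Python-flavored - marshal/pickle - but the memcache point translates directly to Java. A minimal sketch using the GAE MemcacheService; loadInitVarsFromDatastore is a hypothetical stand-in for the slow load:)

    import com.google.appengine.api.memcache.Expiration;
    import com.google.appengine.api.memcache.MemcacheService;
    import com.google.appengine.api.memcache.MemcacheServiceFactory;
    import java.util.HashMap;

    // Cache expensive init data so startup reads memcache instead of
    // recomputing or re-querying the datastore every time.
    final class InitVars {
        private static final String KEY = "init-vars";

        static HashMap<String, String> get() {
            MemcacheService cache = MemcacheServiceFactory.getMemcacheService();
            @SuppressWarnings("unchecked")
            HashMap<String, String> vars = (HashMap<String, String>) cache.get(KEY);
            if (vars == null) {
                vars = loadInitVarsFromDatastore(); // hypothetical slow path
                // A cron job hitting this path every 10 minutes keeps the
                // entry warm, as suggested above.
                cache.put(KEY, vars, Expiration.byDeltaSeconds(600));
            }
            return vars;
        }

        private static HashMap<String, String> loadInitVarsFromDatastore() {
            return new HashMap<String, String>(); // stand-in for the real load
        }
    }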




Michael Hermus

Jul 18, 2012, 3:38:35 PM
to google-a...@googlegroups.com
We ARE using warmup requests; that's the whole point of this thread. I don't believe that you (or anyone) have sufficiently explained how sending user requests to cold instances is ever better than warming them up first. Said request can ALWAYS be pulled right off the pending queue and sent to the instance as soon as it is ready.


Jeff Schnitzer

Jul 18, 2012, 4:05:03 PM
to google-a...@googlegroups.com
On Wed, Jul 18, 2012 at 11:27 AM, Drake <dra...@digerat.com> wrote:
> Use warm up requests.

...except that, thanks to a recent change, warmup requests don't prevent users from seeing cold starts anymore. In a burst of traffic, GAE will happily route requests to cold instances even if an active instance would have become available 1s later.

> Also, optimize your code to load faster. I have an app that is HUGE; it
> loads in about 8s on an F4 (it doesn't fit in an F2). Check that you really
> need all your libraries. Check that any large blocks of data you have
> in code files are in the database/memcache. If you need to, use a cron job to put them
> back in memcache every 10 minutes.

Believe me, I've gone down this route.

8s is still unacceptable. That's well beyond "user clicks reload because they think the site is broken". And you aren't accounting for sick periods when latency goes 2-3X that; how do your users feel about a 24s wait?

I have optimized my codebase about as much as I consider reasonable. I don't eager-load data, and I've turned off any kind of classpath scanning. I've deliberately used GAE-friendly libraries - hell, I even wrote one myself. When GAE is behaving, my startup time is 20-30s, and it could go down to 10-15s if I wanted to quadruple my bill with F4s (I don't). The next step is more or less tantamount to "give up on Java", or at least on programming in something that looks like modern Java:

* In order to persist POJOs, the classes must be introspected at
startup. The alternative is to use only the low-level api; that does
not scale to a complicated project.

* Nearly every modern Java webapp framework uses annotations to
define a sitemap. That means loading and introspecting all those
annotated classes before serving the first request. The alternative
is to find some early-2000s-era relic based on XML files like Struts1
or WebWork (or this blast from the past: http://mav.sf.net). I'm not
even sure it would help; the first thing most of these do is read the
XML and load the related classes.

* I could probably squeeze a few more seconds out of startup if I
dropped Guice and AOP. This would involve rewriting my app more or
less from scratch. Aside from my fixed investment, I refuse to go
back to programming without tools like these. It's the difference
between:

@Transact(TxnType.REQUIRED)
void doSomeWork() {
    ...do work
}

and littering my code with crap like this:

Transaction txn = datastore.beginTransaction();
try {
    ...do work
    txn.commit();
} finally {
    if (txn.isActive()) {
        txn.rollback();
    }
}

Actually, these are not comparable, because the AOP-based example is
smart about inheriting a pre-existing transaction context, or starting
one if necessary. A truly equivalent second example would include a
ton more boilerplate.
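For the curious, a rough sketch of what that propagation-aware boilerplate looks like written out by hand against the low-level API. This is not Jeff's actual code, just an illustration of what the AOP interceptor hides:

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Transaction;

void doSomeWork() {
    DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
    // Join the caller's transaction if one is already open (what the
    // @Transact interceptor does for you), otherwise start our own.
    Transaction existing = datastore.getCurrentTransaction(null);
    boolean ownTxn = (existing == null);
    Transaction txn = ownTxn ? datastore.beginTransaction() : existing;
    try {
        // ...do work...
        if (ownTxn) {
            txn.commit();
        }
    } finally {
        // Only roll back a transaction we started ourselves.
        if (ownTxn && txn.isActive()) {
            txn.rollback();
        }
    }
}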

Without techniques like AOP and DI, Java programming becomes miserable
busywork. We picked GAE as a platform because without having to do
the work of an ops staff, we can write code and develop features much
faster. If I am stuck working at half-speed with crappy tools, then
GAE has a negative value proposition - we'd be faster even with the
ops load, and paying a fraction of GAE's price to boot.

So yeah, I don't expect to see my startup times get much better than
what they are today... and if I double my codebase next year, it could
get worse.

Realistically, I do not expect Google will ever be able to squeeze
Java application starts into reasonable timeframes (i.e.,
http://www.useit.com/alertbox/response-times.html). It's just not
realistic given the nature of classloading and the security
precautions that must be built around it. That kind of JVM
engineering is rocket science, and in the last three years it hasn't
materialized (although, to Google's credit, incremental progress has
been made).

This is why I'm thinking about lateral solutions. Honestly I don't
really care how long it takes my app to start as long as 1) users
never see cold starts and 2) I can expand capacity to meet increased
load. From the outside, it really feels like all the pieces have been
implemented, they just are misconfigured right now.

Jeff

hyperflame

unread,
Jul 18, 2012, 5:07:04 PM7/18/12
to Google App Engine
On Jul 18, 2:38 pm, Michael Hermus <michael.her...@gmail.com> wrote:
> I don't believe that you (or anyone) has sufficiently explained how sending user requests to cold instances is ever better than warming them up first. Said request can ALWAYS be pulled right off the pending queue and sent to the instance as soon as it is ready.

Let me take a shot at this. Brandon Wirtz touched on it before, when
he said "I agree that the Queue is sub optimal, but it is more sub
optimal the smaller you are.  When you get to 50 instances it is
amazing how well the load balancing works. "

Suppose you have a massive store (we'll call it MegaWalmart).
MegaWalmart has 100 staffed checkout lanes. Suppose all of these lanes
share a single queue of people, with a supervisor who sends
customers to open checkout lanes (roughly analogous to your preferred
way of handling GAE request queuing). That one line would be huge,
would block traffic around the store, etc. It's far better, from
MegaWalmart's POV, to have multiple checkout lines, one per lane.

Now suppose another checkout lane opens up (remember, now we
have one line per lane). That's an additional 1% of capacity. If the
checkout clerk is drunk/hungover/whatever, that additional lane will
take extra time to open, annoying the customers lined up in that lane.
From MegaWalmart's POV, who cares? Less than 1% of your customers were
inconvenienced. 99% of people still had a decent time checking out.

Let's apply this to the scheduler. Suppose there was one single queue
of requests. At Google-scale, that queue of requests could easily
exceed millions of entries, possibly billions. And God help you if the
machine hosting the queue gets a hiccup, or an outright failure. Don't
you agree that, at least at Google-scale, requests should immediately
be shunted to instance-level queues? Even if a single instance takes
forever, or fails, we don't have to care: such a failure would only
affect 0.000001% of users.

This leads me to my final point: my understanding, from reading the
documentation and blog/news posts about GAE, is that the core of
GAE is ripped pretty much directly from production Google services.
The problem with this is that the scheduler is intended to work at very
high scale, not at low scale. And frankly, this makes sense when you
consider a lot of the finer points of the GAE ecosystem.

So, to fix this: the GAE team needs to take a good second look at the
scheduler code and rewrite it with two different sets of rules: one for
apps at fewer than 50 instances, and one for apps above that. Additionally,
perhaps the team should look at making the scheduler smarter; it could
measure the startup time of instances and, in the future, not send
requests to cold instances until that startup time has elapsed.

Personal thoughts: I have admined a corporate GAE app that has
exceeded 100 instances, and I use GAE for personal apps that use, at
max, 3-4 instances. When you use GAE at these two extremes, you really
get an understanding of how GAE scales. A personal
anecdote: for my low-end apps, I occasionally notice that GAE starts
up a new idle instance. I'm not charged for it, and it doesn't do any
work, but it is counted in the "current instances" counter. My guess
is that, during non-peak times, the GAE scheduler will load into
memory additional instances of low-end apps, to try to be ready for
quick scaling. So I believe the GAE team tries to handle low-end
apps, but it does need more work.

TLDR: the scheduler needs more work, and MegaWalmart works the same way
as Google's scheduler.

Michael Hermus

unread,
Jul 18, 2012, 6:11:27 PM7/18/12
to google-a...@googlegroups.com
I certainly appreciate the attempted explanation. Correct me if I am wrong, but what you said amounts to this:

Google has to route requests to cold-start instances because otherwise, for large-scale apps, the pending queue might get way too big, not to mention it could blow up sometime.

If that is true, I suppose I would be quite surprised, for the following reasons:

a) Google's entire infrastructure is designed for EVERYTHING to scale massively and still work well.
b) By waiting for instances to warm up first, I don't think you would really increase the maximum depth of the pending queue by a whole lot. In fact, the larger your app (i.e. the more instances you have active), the less impact it would have relative to the alternative.
c) I don't think the pending queue is 'hosted' on a single machine; I am pretty sure it relies on a resilient queue infrastructure designed to tolerate failures and scale well.

rerngvit yanggratoke

unread,
Jul 18, 2012, 6:22:34 PM7/18/12
to google-a...@googlegroups.com
Really looking forward to seeing a reply from Googler(s) regarding this.

hyperflame

unread,
Jul 18, 2012, 6:55:15 PM7/18/12
to Google App Engine


On Jul 18, 5:11 pm, Michael Hermus <michael.her...@gmail.com> wrote:
> If that is true, I suppose I would be quite surprised, for the following
> reasons:
>
> a) Google's entire infrastructure is designed for EVERYTHING to scale
> massively and still work well.

"Massively" being the key word there. Every Google service is massive.
Even abandoned or low use Google services (i.e. Wave, Buzz) are going
to use the equivalent of a minimum 10,000 F4 instances. The GAE
scheduler does inefficient scheduling at less than 50 instances. It
works wonderfully once you hit scale at 100+ machines(correct me if
I'm wrong, but most of the people seeing problems here are probably
less than 50 instances).

> b) By waiting for instances to warm up first, I don't think you would
> really increase the maximum depth of the pending queue by a whole lot.

I would have to disagree with you on this. My experience with big
websites tells me that requests can very quickly pile up if you're not
handling them expeditiously.

> c) I don't think the pending queue is 'hosted' on a single machine; I am
> pretty sure it relies on a resilient queue infrastructure designed to
> tolerate failures and scale well.

My analogy, like all analogies, breaks down if you apply it literally.
Even if you (hypothetically) built a datacenter with 100,000 machines
solely dedicated to hosting a single request queue, that datacenter
can still go down (earthquake, power, hurricane, etc). Far better to
simply dump requests into instance level queues and be done with it.

Just to be clear, I am agreeing with you in that the GAE scheduler
needs work; it is currently optimizing for high-scale apps, and not
apps that are using double-digit or lower instances.

Jeff Schnitzer

unread,
Jul 18, 2012, 8:01:03 PM7/18/12
to google-a...@googlegroups.com
On Wed, Jul 18, 2012 at 3:55 PM, hyperflame <hyper...@gmail.com> wrote:
>
> "Massively" being the key word there. Every Google service is massive.
> Even abandoned or low use Google services (i.e. Wave, Buzz) are going
> to use the equivalent of a minimum 10,000 F4 instances. The GAE
> scheduler does inefficient scheduling at less than 50 instances. It
> works wonderfully once you hit scale at 100+ machines (correct me if
> I'm wrong, but most of the people seeing problems here are probably
> at less than 50 instances).

This is an interesting discussion. How many apps do you think are on
GAE that run 100+ instances? 1000+? While these are certainly going
to be the "important" apps for Google, I suspect they are fairly few
in number. With a couple exceptions, the owners are very quiet on
this list.

I wonder also how many of them are on multithreaded runtimes, and of
those, how many of them are on Java runtimes.

>> b) By waiting for instances to warm up first, I don't think you would
>> really increase the maximum depth of the pending queue by a whole lot.
>
> I would have to disagree with you on this. My experience with big
> websites tells me that requests can very quickly pile up if you're not
> handling them expeditiously.

I think we're looking at this wrong. The question is not "should
there be more than a single queue"; I agree that we
don't want one load balancer falling over and taking down a
thousand pending requests.

The important question is "should requests go to a queue that is
locked to a single (unstarted) instance", which is a different
question. It's certainly possible to have multiple queues with the
ability to route requests to active instances. At Google scale there
must certainly already be multiple queues; obviously one computer
cannot route the zillions of requests per second that www.google.com
gets.

Continuing the MegaWalmart example, you don't need to have one big
line routed around the store; you can have a number of lines each of
which route among banks of cashiers. Presumably GAE has some sort of
equivalent construct.

"Less than 1% of your customers were inconveniced. 99% of people still
had a decent time checking out." is not very comforting. If 1% of
Google searches took 8+ seconds to respond, the Googleplex alarms
would deafen people as far away as Oakland.

> Just to be clear, I am agreeing with you in that the GAE scheduler
> needs work; it is currently optimizing for high-scale apps, and not
> apps that are using double-digit or lower instances.

I fear that GAE is optimized for single-threaded Python 2.5 apps which
take 2s to spin up, an environment that is becoming less relevant as
time goes on. Given an app that takes 20+ seconds to start, would a
high-traffic app work any better than a low-traffic app?

Jeff

hyperflame

unread,
Jul 18, 2012, 9:49:29 PM7/18/12
to Google App Engine


On Jul 18, 7:01 pm, Jeff Schnitzer <j...@infohazard.org> wrote:
> This is an interesting discussion.  How many apps do you think are on
> GAE that run 100+ instances?  1000+?  While these are certainly going
> to be the "important" apps for Google, I suspect they are fairly few
> in number.  With a couple exceptions, the owners are very quiet on
> this list.

As I noted before, I have admined a 100+ instance GAE app, and know a
few companies that have or currently do run at that scale. To be
completely honest, many of those (that I know personally) have
migrated to AWS or similar cloud firms. 8 cents per F1 instance hour
is a budget breaker; AWS is much cheaper and (this is far more
important to managers) has a track record of periodically lowering
prices.

Also, you might be surprised at who reads this list. I have seen
firsthand several executives and corporate strategists read this list,
or get forwarded excerpts by their engineers. They might be "very
quiet" but they are watching (cue the Jaws theme music...)

> "Less than 1% of your customers were inconveniced. 99% of people still
> had a decent time checking out." is not very comforting.  If 1% of
> Google searches took 8+ seconds to respond, the Googleplex alarms
> would deafen people as far away as Oakland.

Just as a purely practical issue, I routinely experience long load
times for certain Google services. Search is always fast, but Gmail
can take some time to load on my iPad: when I go to gmail.com, the URL
box will bounce between several different URLs before loading, at
least 6-8 seconds. Is there an "alarm" form that I can fill out? :-)

> I fear that GAE is optimized for single-threaded Python 2.5 apps which
> take 2s to spin up, an environment that is becoming less relevant as
> time goes on.  Given an app that takes 20+ seconds to start, would a
> high-traffic app work any better than a low-traffic app?

My experience is in Java, so I have no reference point for single-threaded
Python 2.5 apps. With that said, the main problem is the scheduler.

Essentially, the GAE people would have to rewrite the scheduler to reflect
the following rules:
1. For apps that use 50+ instances, use the current rules for
scheduling.
2. For apps that use fewer than 50 instances, change to the following
rules:
2A: Record the startup times for the past 20 instance startups.
2B: Instances that are started up do not receive requests until the
average of those startup times + 20% has elapsed (see the sketch below).
2C: In non-peak hours, idle instances are free or heavily discounted.
2D: During times when GAE is experiencing high latency, the scheduler
is more aggressive about spooling up more instances. Idle instances
spooled up during this time are cheaper (a 30% discount seems fair...).
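To make 2A/2B concrete, a toy sketch of the gating logic being proposed. None of this is a real GAE API - the scheduler is Google-internal - it just shows how little state the heuristic would need:

import java.util.ArrayDeque;
import java.util.Deque;

// Track the last 20 startup durations; a new instance is not routable
// until (average startup time * 1.2) has elapsed since it was spawned.
class ColdStartGate {
    private final Deque<Long> startupMillis = new ArrayDeque<Long>();

    synchronized void recordStartup(long millis) {
        if (startupMillis.size() == 20) startupMillis.removeFirst();
        startupMillis.addLast(millis);
    }

    synchronized boolean isRoutable(long spawnedAtMillis, long nowMillis) {
        if (startupMillis.isEmpty()) return true;  // no history: route normally
        long sum = 0;
        for (long m : startupMillis) sum += m;
        double threshold = (sum / (double) startupMillis.size()) * 1.2;
        return (nowMillis - spawnedAtMillis) >= threshold;
    }
}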

Drake

unread,
Jul 19, 2012, 1:41:10 AM7/19/12
to google-a...@googlegroups.com
I can make searches that are slow. You just have to pick things no one has
searched before.

That said, the 1% of bad checkouts would only apply if you only serve 1
request from the slow server. If each instance serves 100k requests in its
lifetime, then you are talking about 1/100k being negatively impacted. Even
my lowliest of instances serves 3000 in 2 hours before dying.

I said 15s spinup because I think that was your number. I shoot for
spin-ups that are under 4s.

If my spinup is more than 4s, I create a new app, or a dynamic backend, to
handle the requests with a specialized version of the microapp that is faster
to load.




Richard Watson

unread,
Jul 19, 2012, 3:57:28 AM7/19/12
to google-a...@googlegroups.com
On Thursday, July 19, 2012 12:55:15 AM UTC+2, hyperflame wrote:

Even if you (hypothetically) built a datacenter with 100,000 machines
solely dedicated to hosting a single request queue, that datacenter
can still go down (earthquake, power, hurricane, etc). Far better to
simply dump requests into instance level queues and be done with it.

That's fine, but all we're asking for is "don't dump those requests onto an instance that isn't able to serve the request immediately". If there are three instances running and one being started up, don't consider the latter in the population of instances to send requests to. Dump the request into the queue of an instance that is already running. Everything else remains constant.

I stand to be corrected, but I doubt that Google searches are dumped onto cold instances. They and Amazon conducted the research on how delay time affects user behaviour. If Brandon's unique searches make Google slow, that's a separate issue from instances waking up.  And I doubt Google slows down too much on long-tail queries. I just searched on "Just to be clear, I am agreeing with you in that the GAE scheduler" and it took: 1 result (0.21 seconds)

Richard Watson

unread,
Jul 19, 2012, 4:02:42 AM7/19/12
to google-a...@googlegroups.com
On Thursday, July 19, 2012 9:57:28 AM UTC+2, Richard Watson wrote:

That's fine, but all we're asking for is "don't dump those requests onto an instance that isn't able to serve the request immediately". If there are three instances running and one being started up, don't consider the latter in the population of instances to send requests to. Dump the request into the queue of an instance that is already running. Everything else remains constant.

And maybe I'm fibbing. If a non-instance-specific queue exists, it'd support Jeff's late-instance-binding idea, and that would be first prize. I'd happily take the risk of that queue dying, considering we're already taking similar risks with the scheduler. I'd rather have that tiny risk than the guaranteed 1% user-perceived failure we have now.

hyperflame

unread,
Jul 19, 2012, 6:10:17 PM7/19/12
to google-a...@googlegroups.com
On Thursday, July 19, 2012 2:57:28 AM UTC-5, Richard Watson wrote:

That's fine, but all we're asking for is "don't dump those requests onto an instance that isn't able to serve the request immediately". If there are three instances running and one being started up, don't consider the latter in the population of instances to send requests to. Dump the request into the queue of an instance that is already running. Everything else remains constant.

Let's back up a minute here. To the GAE scheduler, as it is right now, there is no concept of "starting up." Either there is an instance, or there is no instance. A "cold" instance is just as good as an instance that has been active for hours. 

If you read Mr. Matsuo of the GAE team's posting, he wrote, "If you have an app with average +50s loading time, I totally understand that you strongly want to avoid sending requests to cold instances. On the other hand, there are also many well-behaved apps with <5 secs loading/warming time. Please understand that for those apps, it is still acceptable if we send requests to cold instances, so it's likely we can not prioritize the feature below over other things, however..."

I read that as saying (and correct me if I'm wrong) that the GAE scheduler expects cold instances to be ready in less than 5 seconds (notice the "well-behaved" remark). If that is so, there is no reason to categorize instances as being in a "starting up" phase. An instance is an instance is an instance, regardless of whether or not it was just cold-started.

Let's go back to the MegaWalmart example. MegaWalmart is guaranteeing that if it opens a new checkout lane, that checkout lane will be open quickly, that the clerk will not be drunk, the scanner will be working, etc. Is that guarantee reasonable? That's the point of this thread.
 

On Thursday, July 19, 2012 2:57:28 AM UTC-5, Richard Watson wrote: 
I stand to be corrected, but I doubt that Google searches are dumped onto cold instances.

The GAE scheduler expects hosted apps to cold start in less than 5 seconds. If the GAE scheduler expects that of us, I don't see why Google wouldn't hold itself to that same commitment.

Takashi Matsuo

unread,
Jul 19, 2012, 6:37:28 PM7/19/12
to google-a...@googlegroups.com
To clarify, there is no hard limit on the cold startup time except for the hard deadline (60 secs for online, 600 secs for offline). The 5 secs just came from Jeff's e-mail, I think. That said, reducing the loading time is one of the most important best practices with App Engine.

At the same time, we understand that some people want to use popular Java libraries which take longer to load. We're trying hard to mitigate the pain by speeding up the loading time; it's underway, but there is no ETA yet. So for the time being, I would suggest setting a sufficient number of min idle instances to reduce the number of user-facing loading requests, especially if you think your app is important and you strongly want to eliminate the user-facing loading requests.

Additionally, we started an internal discussion about reviving warmup requests for dynamic instances. If you want this feature, please star the following issue:

-- Takashi




--
Takashi Matsuo

Jeff Schnitzer

unread,
Jul 19, 2012, 6:56:38 PM7/19/12
to google-a...@googlegroups.com
On Thu, Jul 19, 2012 at 3:37 PM, Takashi Matsuo <tma...@google.com> wrote:
>
> To clarify, there is no any hard limit on the cold startup time except for
> the hard deadline(60secs for online, 600secs for offline). The 5secs just
> came from Jeff's e-mail I think. That said, it is one of the most important
> best practice with App Engine to reduce the loading time.

Just to be clear, the 5s number came from your comment "...there are
also many well-behaved apps with <5 secs loading/warming time".

My reply is that if you create a "Hello, World" Java app that does
nothing but create a PersistenceManagerFactory for a single entity, it
takes 5-6 seconds to start up even when GAE is perfectly healthy. In
practice, it is impossible to create a "well-behaved" Java app on GAE.

Jeff

Drake

unread,
Jul 20, 2012, 12:20:56 AM7/20/12
to google-a...@googlegroups.com
Dump PMF; use the low-level API:
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/datastore/package-summary

Either drink the Kool-Aid, or quit complaining :-)

5s will become 1.5.
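For anyone weighing that trade-off, a minimal load/save pair against the low-level datastore API, with no factory to initialize at startup. The "Greeting" kind and property names are made up for illustration:

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;

public class GreetingDao {
    private final DatastoreService datastore =
        DatastoreServiceFactory.getDatastoreService();

    // No PersistenceManagerFactory to initialize at startup: the first
    // call goes straight to the datastore.
    public String loadMessage(String name) throws EntityNotFoundException {
        Key key = KeyFactory.createKey("Greeting", name);
        Entity entity = datastore.get(key);
        return (String) entity.getProperty("message");
    }

    public void saveMessage(String name, String message) {
        Entity entity = new Entity("Greeting", name);
        entity.setProperty("message", message);
        datastore.put(entity);
    }
}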




hyperflame

unread,
Jul 20, 2012, 2:15:18 AM7/20/12
to Google App Engine

I did some thinking about this problem, and it occurred to me that we
might be going about this completely the wrong way. Could we reverse-
bloom instance startups?

Here's the basic idea. If apps need to load in/process a bunch of
stuff at startup, why don't we give them the processing power to do
that, but only at the time of startup? Suppose we start up an
instance. For the first 30 seconds of that instance, we give it F16
ranking (that is, 4 times the power of a current F4 instance). For the
next 30 seconds, the instance gets F12 ranking (3 times F4 power). The
next 30 seconds, the instance gets F8 ranking (2 times F4 power).
After that, the instance gets lowered to F4 and remains there for the
rest of its lifespan.

For the first 90 seconds of the instance's lifespan, it gets
turbocharged, and should be able to handle all startup processes and a
few initial requests. The scheduler doesn't need to change; it can
continue feeding requests into cold instances, because even cold
instances will tear through their startup and handle the requests.

Even if this turboboosted time was billed at the standard rate, it
wouldn't cost that much; F4 costs 32 cents per instance hour, so F16
would (theoretically) cost 4 times as much: $1.28 per instance hour.
We only need 90 seconds at that rate (90/3600 x $1.28), so about 3.2
cents. I think paying 3 cents extra per instance startup is very
reasonable, especially if it prevents user-facing cold instance
startups.

It's entirely doable on a technical level too. Amazon AWS has a
similar system, where Micro servers can temporarily boost to 2 compute
units and then get lowered to their allotted 1 ECU. Linux has the ability
built in: see nice/renice, http://en.wikipedia.org/wiki/Nice_(Unix)

I'd be interested in hearing if this system is possible within the GAE
hosting system.

Hugo Visser

unread,
Jul 20, 2012, 2:30:44 AM7/20/12
to google-a...@googlegroups.com
I did, and used Objectify (Jeff is kinda familiar with that, I think :)). A plain Hello World that does nothing but register the entities will get you around the 4-5 sec mark, and that is hardly a real application.

I have a highly optimized app as a backend for my Android app, and even that takes around the 6-8 second mark to start up, and it varies too.
Add in some third-party frameworks, like Jersey for JAX-RS and Guice to keep your code sane, and you're at 12 secs.

So Kool-Aid or not, Java startup overhead will always be more than it is for Python, and I think the 5 seconds mentioned is far from typical if you are running a non-trivial app.

Simon Knott

unread,
Jul 20, 2012, 4:16:28 AM7/20/12
to google-a...@googlegroups.com
That's just not true - I have an app which uses no third-party libraries at all, uses no persistence, and in fact uses no GAE services. It simply has one servlet which processes request headers and returns a response. My average start-up time for this app is 3 seconds, when it's running well. The simple fact is that as soon as you use any third-party libraries, or do anything vaguely useful, you're out of that 5-second "magic" window.

Takashi Matsuo

unread,
Jul 20, 2012, 6:05:49 AM7/20/12
to google-a...@googlegroups.com


Sorry, but please forget about the 5 secs 'magic' window. There is no hard/soft deadline/threshold like that in the current App Engine system.

It was just one example of a well-behaved app. Let me rephrase what I meant to say.

With App Engine, it is always good practice to keep the loading request fast. The current scheduler works much better with fast-loading apps than with slow-loading apps. So I always suggest that developers speed up the loading request.

On the other hand, we understand that Java apps tend to be slow, at a level where it is not acceptable to you if there are many user-facing loading requests. For the time being, please use the min idle instances setting to reduce the loading requests for slow-loading apps.

-- Takashi


André Pankraz

unread,
Jul 20, 2012, 6:41:43 AM7/20/12
to google-a...@googlegroups.com
If you follow the group longer you should know - Brandon lives in GAE unicorn land and all you ever need are proper edge cache settings. ;)

Michael Hermus

unread,
Jul 20, 2012, 8:36:15 AM7/20/12
to google-a...@googlegroups.com
Perfect, thanks; that is all I am asking for!

Additionally, we started an internal discussion about reviving warmup requests for dynamic instances. If you want this feature, please star the following issue:


hyperflame

unread,
Jul 20, 2012, 1:34:09 PM7/20/12
to Google App Engine
Discussing theoretical startup times is great, but I'd like to see
some real-world numbers. Does anyone with high startup times (say,
30+ seconds) want to share the results of a code profiler/Appstats?

Jeff Schnitzer

unread,
Jul 20, 2012, 1:59:55 PM7/20/12
to google-a...@googlegroups.com
What code profiler works on GAE server-side? It's not very helpful to
profile in the dev environment since the problem does not manifest
there.

Appstats is not helpful; my startup is long even with zero service layer calls.

Jeff

Drake

unread,
Jul 20, 2012, 7:14:27 PM7/20/12
to google-a...@googlegroups.com

 

>If you follow the group longer you should know - Brandon lives in GAE unicorn land and all you ever need are proper edge cache settings. ;)


No, I just don't write code using unnecessary frameworks, and I do a ton of testing and architecture planning.

Most everyone else never uses defer or threads or task queues. They don't manage steady-state tasks, or profile their imports.

It is easy to blame the platform, but if you write crap code it will run like crap.

I don't think GAE is perfect. But most of the people who complain about its failings don't have the coding skills to be an intern at my shop.

I blame Google for not documenting everything well. I blame Google for not telling you "don't use frameworks, they suck".

But the number of people who load Spring, then gripe that it is slow while only using 3 things out of it, makes me want to punch them in the head.

The people who don't know how to build APIs so that apps are task-specific piss me off. Build modular. Dump frameworks. Defer often. Be your own scheduler by shaping internal ops. Cache incoming. Cache reads, cache writes. Manage threads. Use warmups. This is not rocket science.

Drake

unread,
Jul 20, 2012, 7:16:30 PM7/20/12
to google-a...@googlegroups.com
All your pages should have the ability to take an optional ?debug=true.

Then, if that is set in the URL, the first line in every one of your
functions (and before your imports) should be "Timer start" and the last
thing should be "Log Timer".

Kyle Finley

unread,
Jul 21, 2012, 12:39:43 AM7/21/12
to google-a...@googlegroups.com
Hi Brandon,
 

The people who don't know how to build APIs so that apps are task-specific piss me off. Build modular. Dump frameworks. Defer often. Be your own scheduler by shaping internal ops. Cache incoming. Cache reads, cache writes. Manage threads. Use warmups. This is not rocket science.

OT, but how are you caching writes in a fault-tolerant manner? Task queue? I've been trying to develop a strategy for that. Also, what do you mean by "cache incoming"?

Thanks,

-Kyle
 

Brandon Wirtz

unread,
Jul 21, 2012, 12:52:55 AM7/21/12
to google-a...@googlegroups.com
Cache incoming requests = the thing that I am always accused of. Use edge cache to make sure you don't need to serve people who are asking for the same thing.

Fault-tolerant writes are about determining how "race" your race conditions are, and being smart about your writes.

A common thing I see: people want to increment something, so they fetch the count (which is slow), then +1 and write (which is really slow).

I use defer for a lot of writes when I know I won't have a read for a bit of time. Google Bot just did something? Well, frak Google Bot, it's not a real user; delay that write until we have time for it. Defer. You have to look at your write types and determine the priority, so you can decide how important it is to be fast, whether you 100% need the write to complete, or whether you are just doing something hoping people will use it later.

Write to a write cache with marshal/pickle; if you need the data broken out as well, do a follow-up, read the serialized data, and write the individual bits when you have more time.

Do reads in a similar way: whenever possible, use serialized data.
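A rough Java sketch of the defer-the-write idea using the task queue API; the /tasks/apply-write worker URL and parameter names are invented for illustration:

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class WriteDeferrer {
    // Instead of writing to the datastore inline (slow, and pointless for
    // a crawler hit), enqueue the write and let a task worker apply it
    // later. The task queue retries on failure, which is what makes the
    // deferred write fault-tolerant.
    public void deferWrite(String entityKind, String keyName, String payload) {
        Queue queue = QueueFactory.getDefaultQueue();
        queue.add(TaskOptions.Builder
            .withUrl("/tasks/apply-write")   // hypothetical worker servlet
            .param("kind", entityKind)
            .param("key", keyName)
            .param("payload", payload)
            .method(TaskOptions.Method.POST));
    }
}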

Kyle Finley

unread,
Jul 21, 2012, 1:30:43 AM7/21/12
to google-a...@googlegroups.com
Oh, OK. Thank you Brandon.

André Pankraz

unread,
Jul 21, 2012, 4:12:09 AM7/21/12
to google-a...@googlegroups.com
I'll just answer the "1.5 seconds startup" claim: yep, with Java not even an empty hello world will manage that for us mortals.
Unicorn land.

Jeff Schnitzer

unread,
Jul 21, 2012, 4:18:49 AM7/21/12
to google-a...@googlegroups.com
Brandon, your comments are irrelevant and not constructive. The
Python runtime has a completely different startup profile from the
Java runtime. The long startup delays in Javaland occur prior to any
service calls; no amount of caching, deferring, queueing, or
serialization is going to help.

Jeff

Jeff Schnitzer

unread,
Jul 21, 2012, 4:23:13 AM7/21/12
to google-a...@googlegroups.com
On Sat, Jul 21, 2012 at 1:12 AM, André Pankraz <andrep...@gmail.com> wrote:
> I just answer,1.5 seconds startup : yep, with java not even an empty hello world will manage this for us mortals.
> Unicorn land.

* How do you persist data? (low-level, jdo, objectify, etc)
* How many entity kinds/classes do you have?
* How do you manage URL routing? (ie, jsp files? web.xml? JAX-RS)
* How many URL endpoints do you have?
* How many total java classes in your application? What is total size?
* How many jars in your WEB-INF/lib? What is total size?

Jeff

Drake

unread,
Jul 21, 2012, 1:19:52 PM7/21/12
to google-a...@googlegroups.com
Ah yes, and we come back to "Brandon must only run one kind of app, 'cause I
mean, who runs more than one app? He can't know Java AND Python."

I'm apparently not the only person living in unicorn land. Dave Chandler
also lives there:
http://turbomanage.wordpress.com/2010/03/26/appengine-cold-starts-considered/
So does Meetu Maltiar:
http://thoughts.inphina.com/2010/10/09/managing-cold-start-on-google-app-engine/

Just because you are too lazy to optimize your code and want to blame the
platform, don't take your anger out on me.

Looking at the archives, you have been singing the same song for 2 years.
Not once have I seen you say "Yeah, my imports are now being lazy-loaded"
or "I serialized my configuration initialization variables" and "It saved
me 1s but I need to find 5 more - where should I look?"

All I see from you and Andre is wha-wha, wha-wha, wha-wha-wha.

If Python is magic, move to it. I run about 70% py, 30% java because, unlike
you, I profiled my apps and used whichever was better. And in the cases
where Go is better, I'm learning and building code in Go now too.

So grow up. Quit whining, read the docs on the internet, and when you can post
"I did all those things and my app still doesn't load fast enough" - and when
you tell me you modularized your code so that you can leverage backends for
steady-state and large-code-base, latency-tolerant apps - come back to me.
I'll help you with a code review just to get you to stop complaining.

It's ok; not every programmer can learn to think in the Google code
construction methodology, or think in "cloud". But there are clearly highly
successful projects that have gotten around your issues. So stop saying
those people live in a magical land not available to everyone else. Google
doesn't look at the email associated with the app and give me a performance
bonus. (I know - I checked.)











Francois Masurel

unread,
Jul 21, 2012, 3:20:46 PM7/21/12
to google-a...@googlegroups.com
What annoys me the most is that it's nowhere explicitly stated in the docs that only heavily optimized low-level applications are "allowed" to run on AppEngine.

Google should clearly specify that most well known Java frameworks shouldn't be used on AppEngine.

In my case I have tried to optimize my code as much as possible (no Spring, only low-level datastore access, etc.), but I still need to load a bunch of libraries, as my application is heavily connected to many external web services of all kinds (SOAP, REST, JSON, XML, etc.).

I can't imagine rewriting all these vendor-specific libraries, even if they include some duplicate code. And for some of them, the source code is not available.

In the end my application takes between 10 and 15s to load.

I really hope things will improve over time.

So Jeff, you are not alone, I guess we are quite a few AppEngine users annoyed by these user facing loading requests.

In fact, the problem has existed since the beginning. We have seen some improvements over time, but things seem to have gone backward lately. I just hope it's temporary.

Francois

André Pankraz

unread,
Jul 21, 2012, 3:27:29 PM7/21/12
to google-a...@googlegroups.com
Brandon, your 1.5 seconds - I call this nonsense in the Java world, that's all. Create an empty Java project with 1 servlet, disable the JDO/DataNucleus stuff, no framework, nothing:
2012-07-21 12:14:33.275 / 200 3525ms 1kb Mozilla/5.0 (Windows NT 6.0; rv:14.0) Gecko/20100101 Firefox/14.0.1

Maybe there was some time when the link you mentioned could reach 2.5 seconds - but it's not typical.
Do I have problems with a 5-second startup time? No - that's considered very fast in the Java world. I don't use any frameworks; I learned that early.
Just saying... 1.5 seconds. And your regular posts that all is fine and people are just doing it wrong are at least as annoying as mine.

Drake

unread,
Jul 21, 2012, 3:41:02 PM7/21/12
to google-a...@googlegroups.com

The intent of that line was that you would save 3.5s by dropping the factory for the GAE API.

But one of my main apps averages a 2.2s spinup time, and I am pretty sure I can get hello world under 1.5s.

Are you lazy loading? Are your lazy loads lazy loading? Have you threaded your lazy loads?

Class 1 has dependencies ABC; Class 2 has dependencies DEF.

Warmup does basically nothing but defer Class1 and Class2.

On a cold start, only load the called class; on the first call to the instance, defer loading the other classes for warmup.
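A sketch of the lazy-load pattern being described, using the standard initialization-on-demand holder idiom; Class1 stands in for whatever heavy dependency an app has:

// Class1 stands in for a heavy dependency with an expensive constructor.
class Class1 { /* ...expensive init, many transitive classes... */ }

public class Class1Holder {
    // The JVM defers initializing the nested class (and everything
    // Class1 drags in) until get() is first called, so a cold start
    // only pays for what the incoming request actually touches.
    private static class Lazy {
        static final Class1 INSTANCE = new Class1();
    }

    public static Class1 get() {
        return Lazy.INSTANCE;
    }
}

A warmup request can then touch Class1Holder.get() (and its siblings) so real traffic never pays the cost.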

 

 


André Pankraz

unread,
Jul 21, 2012, 3:59:33 PM7/21/12
to google-a...@googlegroups.com
Hi,

I do nothing. I have 1 servlet, no additional libs. The servlet reads a local resource and writes it to the output stream, that's all.
Hello World.
Even the empty container needs to do some class loading - that's the Java world. And currently people have startup timing problems.

1.5 seconds is possible if all your stuff is static and in edge cache; nothing really starts up there.

Best regards,
André

Drake

unread,
Jul 21, 2012, 4:07:54 PM7/21/12
to google-a...@googlegroups.com
Read a local resource? You mean you do FileReader?

Don't do that. That's bad. Put that in memcache and read it from memcache. Keep it in memcache at all times.

Never use FileReader. Ever. Unless it is to get an app booted for the very, very first time. It should be in the datastore, and in memcache.
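A sketch of that read-through pattern: memcache first, datastore on a miss, then repopulate the cache. Kind, key, and property names are made up for illustration:

import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;

public class ResourceLoader {
    public String loadResource(String name) throws EntityNotFoundException {
        MemcacheService cache = MemcacheServiceFactory.getMemcacheService();
        String cached = (String) cache.get("resource:" + name);
        if (cached != null) {
            return cached;  // hot path: no file or datastore I/O at all
        }
        // Miss: fall back to the datastore (populated once at deploy time),
        // then put the value back so the next read is fast again.
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        Entity entity = ds.get(KeyFactory.createKey("Resource", name));
        String value = (String) entity.getProperty("content");
        cache.put("resource:" + name, value);
        return value;
    }
}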

 

 


Drake

unread,
Jul 21, 2012, 4:09:11 PM7/21/12
to google-a...@googlegroups.com

Shit, even putting it in your statics and fetching by URL is faster than FileReader.

 

 


André Pankraz

unread,
Jul 21, 2012, 4:19:21 PM7/21/12
to google-a...@googlegroups.com
I use getResourceAsStream and a stream copy - but you miss the point... it's a Hello World, done in 5 minutes as a test for your post, stripping away all excuses.
The calls after startup are answered in <80 ms.
That means I'm still well above 3 seconds for initialization of an empty project. I could simply write "Hello World" into the stream.

I could use static... yes... but that's not the point. Some people have dynamic, user-dependent data on each page.
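For reference, roughly the whole app under test - a servlet that stream-copies a bundled resource (the file name is invented):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Read a bundled resource and stream it to the response; no caching,
// no frameworks, no GAE services.
public class HelloServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        InputStream in = getClass().getResourceAsStream("/hello.html");
        OutputStream out = resp.getOutputStream();
        byte[] buf = new byte[4096];
        for (int n; (n = in.read(buf)) != -1; ) {
            out.write(buf, 0, n);  // plain stream copy
        }
        in.close();
    }
}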



On Saturday, July 21, 2012 at 10:09:11 PM UTC+2, Brandon Wirtz wrote:

Shit, even putting it in your statics and fetching by URL is faster than FileReader.

 

 

Drake

unread,
Jul 21, 2012, 4:39:51 PM7/21/12
to google-a...@googlegroups.com

No, the point is that a major limitation of App Engine is that file IO is SLOW, so you never do "open file".

You want "dynamic"? Pull it from memory or the datastore.

NEVER EVER LOAD DATA FROM A FILE, OTHER THAN TO POPULATE THE DATASTORE FOR THE FIRST TIME, IF YOU CARE ABOUT WARMUP.

Yes, I have one exception where I do that, with a 40 MB pickled file: it is faster than doing 40 reads from the datastore and doing a join, because of the memory usage. But that is a weird case.

But yes, you are doing it wrong. And if you would profile your apps, you would know you are doing it wrong.

 


hyperflame

unread,
Jul 21, 2012, 4:47:00 PM7/21/12
to google-a...@googlegroups.com
I just looked over the logs of a corporate GAE application. It's a very simple "heartbeat" application; essentially, our production applications have to send a message to it periodically. The message sent is about 500 characters; the application does some checksumming and sends back a JSON string about 100 characters long. The only external libraries are jars of twitter4j and the org.json parser. It's really not much more complex than a Hello World app.

Looking over the logs, the fastest start time from cold instance to first request served is 1695 ms (although on average it needs 2.2 to 2.5 seconds to cold-start) on an F1 instance. Once the instance is warm, subsequent requests take, on average, ~150 ms to process and send back a reply. I could probably get that lower if I upgraded to F4.

André Pankraz

unread,
Jul 21, 2012, 4:49:12 PM7/21/12
to google-a...@googlegroups.com
Maybe it's my language barrier... I don't know. What's hard to understand in:
new project -> HelloWorld servlet -> 3 seconds startup time -> "1.5 s" isn't true and cannot be true.
Even if I take 500 ms away for the request because I'm so terribad, it's still >3 s startup time, for HelloWorld. Even if it's 2.5 s in a good timeslice, it's far off 1.5, and it has to load 1 (!) servlet without dependencies (besides the Java servlet stuff).
I'm done here - you win the internet.

Drake

unread,
Jul 21, 2012, 5:00:53 PM7/21/12
to google-a...@googlegroups.com

When I said 1.5, I meant PMF was adding 3.5s you don't need. PMF takes 5s; the API takes under 1s. I think 2.5s is probably the lowest you can get and do something truly useful, but it seems that I can do a LOT in 2.5s (assuming the F4 this thread was about when we started - well, actually assuming F2; F4s seem to need 3s no matter what you do).

But also, you don't have to do anything useful on warmup, so you shouldn't serve a cold start very often.

I have mentioned before that I work to keep everything under 4s. If my app gets to 8s, I start slicing it up.

I am also 100% confident that if all you do is say "I am up", your warmup will take under 1.5s for 95% of starts. And warmup really should just say "I am up" and defer lazy loads of everything you need to be ready to do real work.
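A sketch of that style of warmup handler - acknowledge /_ah/warmup almost immediately and push the heavy initialization onto a task; the /tasks/prime worker is hypothetical:

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

// Mapped to /_ah/warmup in web.xml (with warmup requests enabled).
// Returns as fast as possible so the instance is marked ready, and
// punts the expensive lazy loads to a task worker.
public class WarmupServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        QueueFactory.getDefaultQueue().add(
            TaskOptions.Builder.withUrl("/tasks/prime"));  // hypothetical primer
        resp.getWriter().print("I am up");
    }
}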

 

 


Hugo Visser

unread,
Jul 21, 2012, 5:39:53 PM7/21/12
to google-a...@googlegroups.com
Yeah, but where this whole discussion started was that when there's no available instance, you can't defer anything: a cold instance will apparently get the request, and due to the change, dynamic instances aren't warmed up like they used to be. You're bound to need to do some initialization at that point. And while you could build every single component your app needs yourself, to make it super-duper optimized for App Engine, that just doesn't justify the amount of work it takes to run an app on App Engine.

IMHO the warmup requests serve their purpose; they warm up the app so that a slightly longer startup time of 10 secs or so isn't that bad. The issue is just that there are now situations where user-facing requests end up warming up an instance, which sounds like a bad thing, especially if the app is receiving spiky loads.


hyperflame

unread,
Jul 21, 2012, 5:40:16 PM7/21/12
to Google App Engine
Are you using a free or paid app to do your testing? My understanding
is that paid apps are treated preferentially by the scheduler.

Tomas

unread,
Jul 21, 2012, 5:50:13 PM7/21/12
to google-a...@googlegroups.com
Hi guys,

I actually didn't want to reply to this thread originally, even though I was the one who opened the small thread 3 months ago about very slow startups on GAE with Spring + Velocity + Objectify, but after reading the email telling me I should "write modular apps and don't use frameworks" I have to write up something.

I'm a 10+ year Java developer and a big fan of Spring, as it saves hundreds of hours of time. I originally developed www.librarist.com 2 years ago as a pet project on App Engine, just a prototype using simple JSPs + servlets, and it was working fine (startups 3-5 secs).

But then I wanted to add more features - a cart, OAuth login, templating, proper caching - and I decided to go with Spring, as this is the reason we have frameworks. Plugging in Velocity is just a matter of 1 line of code in XML (plus some tweaks if you need them), the cache is again a couple of lines in XML and you automatically get in-memory + memcache with ehcache, and authentication is 2 lines of code. And I could continue.

I use Spring at work for other projects which run on a full Java stack (just Tomcat/Jetty), and when I run the "new" librarist-equivalent code on a Micro instance on AWS (which is like 600MB RAM and 1.8GHz - I might be wrong here a little bit), the startup is 2-3 secs and I can serve all my traffic with it with load at max 0.5.

When I run similar code on GAE, I get 45+ sec startups (when I'm lucky; sometimes it just goes over 1 min and gets killed), and then GAE keeps spinning up new instances like crazy, but very often uses them for three requests before they die and have to be started again. Worth mentioning: I do not use the full Spring stack - basically it's just Spring core + Spring MVC, plus Objectify + Velocity - and I really tried all the possible optimizations (packing all the stuff into one big jar, using annotations + Java config, using XML, disabling scanning of annotations), but these changes just saved me like 4-5 secs out of 45.

When I tried just Spring core + Spring MVC without any other library (and here we're talking about ~2MB of jars), with one JSP, my best startup time was 15 secs.

Okay, that's cool. I can understand that GAE is different and basically should be used just for small projects where you write everything by yourself and don't use any normal stuff like dependency injection / proxying and similar. But my biggest question for the GAE guys would be this - why the hell does the startup take so long? I simply don't get what the actual issue is here:

- is it loading the jars from the network (virtual drive), with GAE loading each class as a separate file?
- is it the scanning of the classes (annotations etc)?
- is it Java reflection?
- is it the creation of proxies (which Spring does for almost everything)?
- is it anything else I can't see?

I really REALLY liked the idea of App Engine, since I hate messing with server configuration and it's great to focus on the software alone. But honestly speaking, I have given up on GAE for serious projects - it's a good platform for doing a quick prototype to show to a client (easy deploys, no server maintenance, you can have a simple app talking to a lot of APIs up and running in half a day) - but then move the stuff to AWS.

I made this decision 3 months ago, and from the App Engine Google Groups I can see it was the right thing to do, as there are more and more emails coming in about broken live apps, issues with deploys, issues with startups, and stuff like that (the SSL cert price is just a joke, for example).

I really appreciate the work done on GAE by Google, but if I could give one piece of advice: please stop adding new features with every release, invest one month of your team's time, and do a proper maintenance release which fixes all that shit around Java apps (this really can't be so hard), or you will lose the war and all the Java devs will run away to AWS/Heroku/Cloud Foundry/their own servers.

Cheers.

Drake

unread,
Jul 21, 2012, 6:17:27 PM7/21/12
to google-a...@googlegroups.com

If it makes you feel better, I have this fight once a month: AWS, which is easier to build for, or GAE, which doesn't require any sysadmin and is infinitely elastic.

Someone will put something on AWS, show me how awesome it is, and how cheap. I will throw the load simulator at it, and watch it burn.

They say, "But you didn't give it more instances."

I will say: fine - every fourth 8 minutes, when CNN runs a commercial for the site, you need to handle 5000 users, and the rest of the time you need to handle 8.

Which is cheaper?

They always lose.

How do we manage keeping the version the same on all instances in AWS? How long is the downtime on deploying a new version? (On GAE you can deploy, warm up enough instances, and change the "default" version without even losing sessions.)

Every service has pluses and minuses; if they didn't, then someone would be a monopoly.

Edge cache is amazing; for most apps it will negate the AWS "savings". Rapid elasticity is amazing; it will negate even the Rackspace savings.

Not taking off-the-shelf code... that will cost you a developer, sorry. I would also say that most of the time you should be rewriting that code anyway to be exactly what you need, but I get that sometimes what you want is to use Facebook authorization, and there is no reason to strip stuff out. Except that you have to remember that everyone who is not you writes sucky code. Generic code. Meant to run everywhere. Except GAE is not everywhere: it is in unicorn land, where you traded the nice friendly donkeys who will put up with whatever shit you feed them and still pull your wagon for a flying unicorn that shoots rainbows out of its ass but requires not just a sugar cube every now and then, but raw organic sugar fed to it by beautiful naked virgins. In exchange you can go farther, faster, and you don't have to clean up poo off the floor. These are the trades you make for living in unicorn land.

 

 


Prashant Hegde

unread,
Jul 21, 2012, 10:21:20 PM7/21/12
to google-a...@googlegroups.com, google-a...@googlegroups.com
Here is a counterexample. We are a small app with peak traffic of 1 request per second. We use Java. No frameworks. We use JDO and Guice. Startup time: 20 seconds. Average request latency: under 500 ms. We have been on App Engine close to 2 years. Right now we are able to serve our users with one instance without exposing them to loading requests (almost zero). We use min = auto, max = 2, and pending latency = 7.5 s. We arrived at this configuration after some experimentation, and it gives the best performance and cost so far... At least we are fine now, unless Google decides to change something.

We do have our share of problems with GAE (memcache errors, sudden increases in latency for brief periods...), but that's negligible relative to the advantages of the platform and the cost.

Learnings 

Avoid warmup/loading requests at all costs. Our goal was to never let GAE start extra instances, as that can lead to instability. We cannot reduce our startup time, and GAE cannot get rid of all the issues associated with startup. Min = 1, max = 1 gave us the worst performance, as it creates a constraint that spins up a lot of instances, especially if you have bursts of traffic. Min = auto allows us to use the pending latency config, and it gives better control over the scheduler's behaviour.

Disclaimer

This probably worked for us because we have a steady request pattern during the day, and we don't normally see zero traffic for more than a few minutes. If you are an app with different behaviour (long idle times), then min = auto may expose users to loading requests; in that case, min = 1, max = 2 or more should work better. Again, our $0.02... your mileage may vary.


Hope this helps.


Prashant




Sent from my iPad

hyperflame

unread,
Jul 21, 2012, 11:27:46 PM7/21/12
to Google App Engine
On Jul 21, 5:17 pm, "Drake" <drak...@digerat.com> wrote:
> unicorn land, where you traded the nice friendly donkeys who will put up
> with whatever shit you feed them and still pull your wagon for a flying
> unicorn that shoots rainbows out of its ass but requires not just
> a sugar cube every now and then, but raw organic sugar
> fed to it by beautiful naked virgins. In exchange you can go farther,
> faster, and you don't have to clean up poo off the floor. These are the
> trades you make for living in unicorn land.

I like this visual. If someone puts it on a coffee cup or t-shirt as a
GAE promotion, I'd buy it. Seriously, someone do this.

On Jul 21, 9:21 pm, Prashant Hegde <prashant.he...@gmail.com> wrote:
> Here is a counterexample. We are a small app with peak traffic of 1 request per second. We use Java. No frameworks. We use JDO and Guice. Startup time: 20 seconds. Average request latency: under 500 ms. We have been on App Engine close to 2 years. Right now we are able to serve our users with one instance without exposing them to loading requests (almost zero).

This isn't a counterexample, for the reason that you mentioned in the
first sentence: you can serve everything off one instance. The
original poster needs multiple instances, and to be able to scale as
load changes. If you're not loading new instances, then startup time
is pretty much irrelevant; you can start an instance once, then your
users will never see a cold instance. This entire discussion is about
cold instances.

hyperflame

unread,
Jul 22, 2012, 12:24:37 AM7/22/12
to Google App Engine


On Jul 21, 4:50 pm, Tomas <tomas.ada...@gmail.com> wrote:
> I use Spring at work for other projects which run on a full Java stack (but
> that's just Tomcat/Jetty), and when I run the "new" librarist-equivalent code
> on a Micro instance on AWS (which is like 600 MB RAM and 1.8 GHz - I might be
> wrong here a little bit), the startup is 2-3 secs and I can serve all my
> traffic with it with load at max 0.5.
>
> When I run the similar code on GAE, I get 45+ secs startup...
>
> Okay, that's cool, I can understand that GAE is different and basically it
> could be used just for small projects where you write everything by
> yourself and don't use any normal stuff like Dependency Injection /
> Proxying and similar. But my biggest question for the GAE guys would be
> this - why the hell does the startup take so long? I simply don't get what
> the actual issue is here:
>
> - is it loading of the jars from the network (virtual drive), since GAE
> loads each class as a separate file
> - is it the scanning of the classes (annotations etc.)
> - is it Java reflection
> - is it the creation of proxies

One (small) reason why GAE is slower is just that instances are
smaller than comparable AWS instances. An AWS Micro is 613 MB of RAM and
can temporarily use 2 ECU of CPU (looking at the Amazon docs, they claim
that 1 ECU is equal to a 1 GHz 2007 Opteron). A GAE F1 is 128 MB of
RAM and 600 MHz.

I bet that the major reason is just network I/O: for the GAE servers
to find an available server, transfer a copy of the application and
libraries to that server, and start up the servlet runner. It would
explain why even simple apps have long startup times: it's not that
the frameworks and libraries are taking a long time, it's just the
overhead of moving all those libraries into place initially.

This also suggests a simple solution to the problem: just integrate a
copy of commonly used libraries (spring, guice, twitter4j, etc) into
the servlet runner.

Jeff Schnitzer

unread,
Jul 22, 2012, 12:27:15 AM7/22/12
to google-a...@googlegroups.com
Brandon, you talk a lot of shit for a guy who only discovered the task
queue 6 months ago.

Everyone who thinks they have the secret to starting up a Java app in
<5s, answer these questions and prove that you're running more than a
toy:

* How do you persist data? (low-level, jdo, objectify, etc)
* How many entity kinds/classes do you have?
* How do you manage URL routing? (ie, jsp files? web.xml? JAX-RS)
* How many URL endpoints do you have?
* How many total java classes in your application? What is total size?
* How many jars in your WEB-INF/lib? What is total size?

Having done a fair bit of experimentation around this issue, I know
that you're talking out of your asses. Clever suggestions like
"rewrite app in Python" aren't nearly as clever as you think they are.
Suggestions that you can optimize startup time by some combination of
caching, task queues, or threading merely indicate a total lack of
understanding of the problem in the first place.

To the people paying attention to this thread who are actually trying
to build real-world apps on App Engine: Brandon does not speak for the
GAE team. Neither do I, but I occasionally talk to people who do. I
know that Google really does want Java apps to run well on GAE. They
know about this issue and care about it - maybe even enough to do
something about it. It *is* constructive to "whine" politely about
startup time, because there are many issues competing for the team's
time, and our feedback helps them prioritize their efforts.

Jeff

Jeff Schnitzer

unread,
Jul 22, 2012, 12:40:48 AM7/22/12
to google-a...@googlegroups.com
On Sat, Jul 21, 2012 at 9:24 PM, hyperflame <hyper...@gmail.com> wrote:
>
> I bet that the major reason is just network I/O, for the GAE servers
> to find an available server, transfer a copy of the application
> +libraries to that server, and start up the servlet runner. It would
> explain why even simple apps have long startup times: it's not that
> the frameworks and libraries are taking a long time, it's just the
> overhead of moving all those libraries into place initially.

The problem specifically seems to be that classloading is slow. A
significant part of that is explained by slow network I/O, but not
all; for example, getting AOP proxies built and loaded seems to take
significantly longer than one would expect by profiling locally. I
speculate that the extra security measures of GAE's sandbox are also
taking a toll.
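One cheap way to test that hypothesis on a deployed instance is to time the classloading yourself from a ServletContextListener and compare against a local run. A sketch - com.google.inject.Guice is just an example of a proxy-heavy class worth force-loading; swap in whatever class you suspect is slow:

import java.util.logging.Logger;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

public class StartupTimer implements ServletContextListener {
    private static final Logger LOG =
        Logger.getLogger(StartupTimer.class.getName());

    @Override
    public void contextInitialized(ServletContextEvent event) {
        long start = System.currentTimeMillis();
        try {
            // Force the class (and everything it drags in) to load now.
            Class.forName("com.google.inject.Guice");
        } catch (ClassNotFoundException e) {
            LOG.warning("Not on the classpath: " + e);
        }
        LOG.info("Classload took "
            + (System.currentTimeMillis() - start) + " ms");
    }

    @Override
    public void contextDestroyed(ServletContextEvent event) {}
}

Register it with a <listener> element in web.xml and compare the logged number on GAE against the same code running locally.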

Jeff

Prashant Hegde

unread,
Jul 22, 2012, 12:42:05 AM7/22/12
to google-a...@googlegroups.com
> This isn't a counterexample, for the reason that you mentioned in the
> first sentence: you can serve everything off one instance. The
> original poster needs multiple instances, and to be able to scale as
> load changes. If you're not loading new instances, then startup time
> is pretty much irrelevant; you can start an instance once, then your
> users will never see a cold instance. This entire discussion is about
> cold instances.

To clarify, we need more than one instance during peak hours, but our overall billing is limited to 1 instance because night-time traffic is literally zero. We have seen more than 2 instances during peak hours without user-facing loading requests. Our analysis was that when pending latency comes into the picture, the behavior is more predictable. While the pending queue is filling up, a new instance is started without a user-facing request, as we have specified 7.5 seconds as the pending latency. By the time the previous request is served, the pending request goes to the hot instance. A lot depends on your traffic pattern and the time taken to handle each request, though.

At least for us, the overall goal was: how do we achieve < 500 ms latency for our users, with auto scaling, without repeated loading requests, and without an escalation in costs? We seem to have achieved that, hence this post, and as a counterexample to our friend who decided to go off App Engine to AWS.


Prashant



Drake

unread,
Jul 22, 2012, 2:32:49 AM7/22/12
to google-a...@googlegroups.com
I didn't need task queues/defer because I was self-managing queues, and in
fact I still do that more than I use the Task Queue, because it lets me do
more process shaping. So now short-term tasks are deferred, and long-term
tasks go into a self-managed bucket that fires based on current load, as
determined by the number of active instances.

> * How do you persist data? (low-level, jdo, objectify, etc)
Actually I use a multi-pronged approach based on the work being done. Python
has NDB, CachePy, and a number of things that Java seems to be missing good
analogs for, but using the low-level API gets things up and running and does
the majority of the heavy lifting. When we don't want to have to manage a
bunch of code to handle stuff that is more complex, or to do things that are
less time-sensitive like deep queries for single users, we use JPA. (I think
JDO is mostly dead.)

> * How many entity kinds/classes do you have?
A shit ton. For the analytics app we built we found out that you are limited
to 65535 and that Dynamic is not as dynamic as you might think.

> * How do you manage URL routing? (ie, jsp files? web.xml? JAX-RS)
How quaint. Did you miss that I said one type of task per app/backend?
web.xml at the individual app, but path routing is handled by the
caching/load-balancing that is done using the Python version of CDNInabox for
most of our apps, and Squid on another service for some of the other apps. No
"app" is ever exposed directly to a user request; they are all proxied. (PS:
this also allows us to do much of our own QoS manipulation, because we can
send a request that is taking too long to an F8 that has multiple apps
installed on it and can be anything it needs at a moment's notice.)

> * How many URL endpoints do you have?
Per app? Or in total? In many cases we have endpoints per domain in
multi-tenant apps, and we are running 10k-plus domains.

> * How many total java classes in your application? What is total size?
> * How many jars in your WEB-INF/lib? What is total size?
Per app? Maybe 20. In an application deployment, a couple thousand. Last time
I looked, our code base was nearly 800 MB, with 80% of apps at less than
10 MB.


Jeff, my "toys" are bigger than your "projects". My count of unique pages
indexed in Google is over 1 billion. I likely spend more in Datastore file
costs than you spend hosting your entire client base.




Mauricio Aristizabal

unread,
Jul 22, 2012, 3:28:08 AM7/22/12
to google-a...@googlegroups.com
I really don't care who has the shiniest toys in this playground; the bottom line is that I consider every minute spent getting my app to load faster a complete waste of my time. Adding business logic, minimizing user request times and costs - that's what I care about, not jumping through hoops to get around some design flaw that Google could easily fix if they chose to.

If you have a zillion apps and a huge budget and/or team, and it is cost-effective to do a crapload of optimizations, good for you! But I suspect most of us just want to get our apps to work reasonably efficiently, in as little time, and with as many of the tools we already know and love as possible, so we can move on to our next project, client, idea, etc.

Drake

unread,
Jul 22, 2012, 3:34:49 AM7/22/12
to google-a...@googlegroups.com

And this would be why we don't use contractors.  Optimizations aren't for "Google"; they are good practice. When you run on 600 MHz with 128 MB of RAM, you see the optimizations. When you have 3.2 GHz x 8 and 64 GB of RAM, you can hardly profile an app. That doesn't mean you can't make it twice as efficient.

 

The tools you know and love suck. You just don't know it because you are spoiled by big computers.  I love my '50 Merc and my '66 Stang, but my Mini Cooper has a lower cost of operation and a faster 0-60 and top speed than either of the old Fords. Because it is optimized.

 


Jeff Schnitzer

unread,
Jul 22, 2012, 4:17:36 AM7/22/12
to google-a...@googlegroups.com
On Sat, Jul 21, 2012 at 11:32 PM, Drake <dra...@digerat.com> wrote:
>> * How do you persist data? (low-level, jdo, objectify, etc)
> Actually I use a multi-pronged approach based on the work being done. Python
> has NDB, CachePy, and a number of things that Java seems to be missing good
> analogs for, but using the low-level API gets things up and running and does
> the majority of the heavy lifting. When we don't want to have to manage a
> bunch of code to handle stuff that is more complex, or to do things that are
> less time-sensitive like deep queries for single users, we use JPA. (I think
> JDO is mostly dead.)

This thread is about Java instance startup time. Enough about Python
already. It's irrelevant.

You've completely - and lamely - dodged ALL of my questions. You
insinuate that you have sophisticated Java applications that start up
quickly. I'm calling bullshit. Give me ACTUAL numbers for one ACTUAL
Java application.

>> * How many entity kinds/classes do you have?
> A shit ton. For the analytics app we built we found out that you are limited
> to 65535 and that Dynamic is not as dynamic as you might think.

Generated kinds or classes don't count. I mean this literally: how
many user-defined entity types does your Java app load? You're going
to tell me you have Java apps that load 64k JPA classes? Not a
chance.

See, this is getting to the ACTUAL problem with the Java environment -
classloading is slow. Entity classes must be loaded and introspected
before datastore operations can start. Near as I can tell, the only
way around this particular problem is to stay with the low-level api -
which, like assembly programming, severely limits how sophisticated an
application you can reasonably build.

>> * How do you manage URL routing? (ie, jsp files? web.xml? JAX-RS)
> How quaint. Did you miss that I said one type of task per app/backend?

I will be generous and assume for the moment that this makes sense for
your particular application. At best you are arguing that you have a
wacky application. You won't find too many people building business
apps that way, especially when you have elaborate transactional logic.
There are sometimes good reasons to compose an app out of more
fundamental services, but this should be driven by application design
- not a dumb limitation of how many classes you can pack into a JAR
before instance startup times get out of control.

> web.xml at the individual app, but path routing is handled by the

If I went the web.xml route, it would map 326 servlets. Not really
fun. There's a reason people started inventing "web frameworks" in
the late 90s... because that kind of programming sucks when your app
is even mildly complicated.

>> * How many URL endpoints do you have?
> Per app? Or In total? In many cases we have endpoints per domain in
> multi-tenant apps, and we are running 10k plus domains.

I was really asking about your "unicorn" Java app that loads in 2
seconds. But I think you misunderstand what an 'endpoint' is. It's a
particular code path; for example, http://example.com/thing/123,
http://example.com/thing/456, and http://other.com/thing/789 are all
the same endpoint because they all execute the same code.

It's relevant because it is related to how many classes get loaded at
startup, which has a strong effect on the startup latency.

>> * How many total java classes in your application? What is total size?
>> * How many jars in your WEB-INF/lib? What is total size?
> Per app? Maybe 20. In an application deployment couple thousand. Last time
> I looked our code base was nearly 800 megs, with 80% of apps at less than 10
> megs

I want those numbers for your "unicorn" sophisticated Java app that
you claim starts in 2s.

> Jeff, my "toys" are bigger than your "projects". My unique indexed pages
> in Google is over 1 Billion. I likely spend more in Datastore file costs
> than you spend hosting your entire client base.

Yawn. The Royal Wedding site served more traffic and probably cost
more than either of us will make in revenue all year, but it's 99.9%
static content. Your application is highly atypical of business
applications, and yet you myopically believe that we should all build
our apps out of zillions of tiny components that use the low-level
persistence API and communicate through urlfetches and task queues.
That's absurd, and when you eventually discover the "transaction"
feature of the datastore you will understand why.

Jeff
P.S. One consequence of my particular career history is that I'm not
impressed by people trying to prove their dick is bigger.

Drake

unread,
Jul 22, 2012, 4:34:57 AM7/22/12
to google-a...@googlegroups.com
> I will be generous and assume for the moment that this makes sense for
> your particular application. At best you are arguing that you have a wacky
> application. You won't find too many people building business apps that way,
> especially when you have elaborate transactional logic.
> There are sometimes good reasons to compose an app out of more
> fundamental services, but this should be driven by application design
> - not a dumb limitation of how many classes you can pack into a JAR before
> instance startup times get out of control.
>
> > web.xml at the individual app, but path routing is handled by the
>
> If I went the web.xml route, it would map 326 servlets. Not really fun.
> There's a reason people started inventing "web frameworks" in the late
> 90s... because that kind of programming sucks when your app is even mildly
> complicated.
>

And this is where you prove your stupidity.

This is how every large "app" on the planet works.
A load balancer serves requests to the right "app", and each app is optimized
for what it is supposed to do. Data is shared between apps using APIs,
Memcache, and the Datastore.

I know what an "endpoint" is. Apparently you don't. Each of what you call
an endpoint should be a micro app - a single-purpose app that handles one
type of request. This lets you build my unicorns: fast little apps that do
one thing, and do it fast. Also, because they do only one thing, they get to
maximize their instance memory.

Java gets REALLY slow and has more warm-ups when you have 200 MB of code
and 128 MB of RAM (as in, it doesn't work; and if by some miracle it does
make it past warm-up, you typically hit the soft memory limit and it dies).

You clearly are building monolithic code in a world of micro-instances. STOP
IT. You will only have pain doing so. Build single-purpose instances that do
what they need to do. I'm sorry your app is too small for there to always be
one instance doing each task. That's not my fault. Get more popular or
something. In the meantime, use lazy loads so that you can fake having
instances that do one and only one thing, by loading only enough stuff to do
one thing until you need to do that other thing.

I'm clearly not going to convince you to do anything, because you don't want
to. You want somebody else to fix your problem so you won't even look at
solutions.
At which point I have to point out you are always going to be sad and
miserable.

PS: Survivor finale. And comments aren't static. Nor is "post to Facebook".
Nor are OAuth calls.






Drake

unread,
Jul 22, 2012, 4:44:25 AM7/22/12
to google-a...@googlegroups.com
200 1625ms 98kb Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4)
AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6
70.162.197.131 - - [22/Jul/2012:01:38:53 -0700] "GET /www.xyhd.tv HTTP/1.1"
200 98752 - "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4)
AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6"
"second.cdninabox-java.appspot.com" ms=1626 cpu_ms=844 api_cpu_ms=0
cpm_usd=0.042899 loading_request=1
instance=00c61b117c382cd6d2d719025a767570ad95c0
I 2012-07-22 01:38:53.885
This request caused a new process to be started for your application, and
thus caused your application code to be loaded for the first time. This
request may thus take longer and use more CPU than a typical request for
your application.

I read that as 1,626 ms.

What I did in that time on an F1:
Initialized the app from nothing.
Read the configuration out of the Datastore (not even from Memcache).
Did a URLFetch to read the contents of XYHD.tv (like you doing your read
from a file).
Regexed it to change all the paths from relative to absolute.
Deferred a write to the datastore so I wouldn't have to do that for the next
request that comes along in the next 30 seconds.
Output the data.

I still have 370 ms to do something else. And that is on an F1.

I think that counts as Hello world.
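For anyone wanting to copy the "deferred a write" trick: a minimal sketch using the task queue's DeferredTask API. The CachedPage kind and its properties are hypothetical stand-ins for whatever you cache; the task queue and low-level datastore calls are the real APIs.

import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Text;
import com.google.appengine.api.taskqueue.DeferredTask;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

public class CachedPageWriter implements DeferredTask {
    private final String url;
    private final String html;

    public CachedPageWriter(String url, String html) {
        this.url = url;
        this.html = html;
    }

    @Override
    public void run() {
        // Executes later on a task queue request, off the user's critical path.
        Entity page = new Entity("CachedPage", url);     // hypothetical kind
        page.setUnindexedProperty("html", new Text(html));
        page.setProperty("fetched", new java.util.Date());
        DatastoreServiceFactory.getDatastoreService().put(page);
    }

    public static void enqueue(String url, String html) {
        // Returns as soon as the task is queued; the put happens later.
        QueueFactory.getDefaultQueue().add(
            TaskOptions.Builder.withPayload(new CachedPageWriter(url, html)));
    }
}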


Jeff Schnitzer

unread,
Jul 22, 2012, 4:45:53 AM7/22/12
to google-a...@googlegroups.com
On Sun, Jul 22, 2012 at 1:34 AM, Drake <dra...@digerat.com> wrote:
>
> I know what an "endpoint" is. Apparently you don't. Each of what you call
> an endpoint should be a micro app - a single-purpose app that handles one
> type of request.

All I have to say is "wow". I'm really glad you're just a troll here
and not actually responsible for anything I depend on.

Jeff

Aleksei Rovenski

unread,
Jul 22, 2012, 5:05:13 AM7/22/12
to Google App Engine
I can understand that GAE is more optimized for Python than for Java.
Maybe it is a highly specialized tool for really tiny apps that use no
frameworks. But I don't get one thing: how does Google plan to compete for
Java apps by selling a platform that forces you not to use frameworks? I
simply refuse to go back to the 90s (in the software development
sense; no problem otherwise :)). I simply do not want to drop
dependency injection and the other patterns I have been using for many
years already. If I must, then I think it just isn't worth it.
Seriously, Java is about frameworks. Saying not to use frameworks in
Java is the same as saying to use Python instead. No thanks.

On 22 Jul, 11:45, Jeff Schnitzer <j...@infohazard.org> wrote:

Drake

unread,
Jul 22, 2012, 5:12:20 AM7/22/12
to google-a...@googlegroups.com
> All I have to say is "wow". I'm really glad you're just a troll here and not
> actually responsible for anything I depend on.

Let's see... You spend your life complaining about how the platform sucks. I
release tutorials on how to make it suck less. I'm the troll?

Anyone who thinks the low-level API is analogous to assembly probably
doesn't want to depend on me, because the laughter when they asked a question
might be deafening.

Read EVERY optimization guide on Java for GAE, and that is the first thing
they tell you to do. If that is too hard, then this is not the platform for
you. I have vouchers for Azure; hit me up off-list and I'll send them along.
You will be happier in .NET, where you never have to do anything low-level.
J# is supported for 3 more years... so you should have plenty of time to
learn the basic principles of cloud development in that time.


All animosity aside: I'm sure you are a good Java dev, but this isn't Java.
The Python I write isn't Python either. I mean, it is, but not how you would
write it if you were on a "real" server. You have to get over that. You are
going to suck at GAE until you do. NO FRAMEWORKS. NONE. Deal with it. Every
email, you beat me over the head with why you need a PMF. Well, frak the PMF;
it is holding you back. It is making you stupid. You don't need it. If you
need it for one really hard thing once in a great while, that is what a lazy
load is for.
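A sketch of the lazy load in question, using the initialization-on-demand holder idiom: the PersistenceManagerFactory - the expensive, class-inspecting object - isn't built until the first code path that actually needs JDO touches it. "transactions-optional" is the stock GAE JDO configuration name; everything else here is a generic holder, nothing specific to anyone's app.

import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManager;
import javax.jdo.PersistenceManagerFactory;

public final class LazyPMF {
    private LazyPMF() {}

    private static class Holder {
        // Not built until Holder is first referenced - i.e. on the first
        // request that really needs JDO, not at instance startup.
        static final PersistenceManagerFactory PMF =
            JDOHelper.getPersistenceManagerFactory("transactions-optional");
    }

    public static PersistenceManager get() {
        return Holder.PMF.getPersistenceManager();
    }
}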

Yes, classes suck; we went over that. Which is why you build micro apps that
are purpose-built and use only a limited set of classes.

Yes, I have an F8 that can be any one of my micro apps; it has 63k entities.
It lazy-loads, so it doesn't break. It is used for tests and for emergency
off-loading of traffic.
You don't have that kind of traffic, so you don't need that - I get that.

But you need to get over yourself, look at profiling your app, and determine
where you can cut the fat and how you can break a single monolithic app into
4 smaller apps. I don't know what that looks like; maybe it is customer
authentication and session handling, credit card processing and order
insertion, and product lookup and shipping calculation.

In so doing you can limit your classes, make better use of your instance
memory, and likely keep from ever hitting warm-ups, because each app will run
more efficiently than the monolith.

Yeah, that is a lot of work, and sure, it sucks that this guy who called you
stupid told you to do it, but it is the right thing to do.




Drake

unread,
Jul 22, 2012, 5:18:01 AM7/22/12
to google-a...@googlegroups.com
I don't use frameworks in Python either. Ask my dev team. They all but riot
when they find a library they want to use and I say: figure out what it does
and make your own; we don't have room for that thing.

Java ain't Ruby on Rails (oh, please don't let it be). I get that the appeal
of Java is that it is older than dirt and there is a framework for everything.
You can use one framework, if it is a small one :-).

You are trading rapid elasticity for ease of development. You have to remember
that when you are small, servers are cheap and humans are expensive. When you
get to Google scale, that reverses. GAE is optimized for near-Google scale
(not all of Google, but a small Google product). If you are 1/1000th of that,
the dev resources for optimization will kick your ass. Just like the guys who
say they won't do it - those are the guys who don't spend $50k a month on
hosting. When the amount you spend is double the cost of a dev, it is time to
look at ways to cut that hosting bill in half using a dev, because you will
break even and give higher QoS.





Jeff Schnitzer

unread,
Jul 22, 2012, 5:52:22 AM7/22/12
to google-a...@googlegroups.com
On Sun, Jul 22, 2012 at 2:05 AM, Aleksei Rovenski
<aleksei....@gmail.com> wrote:
> I can understand that GAE is more optimized for Python than for Java.
> Maybe it is a highly specialized tool for really tiny apps that use no
> frameworks. But I don't get one thing: how does Google plan to compete for
> Java apps by selling a platform that forces you not to use frameworks? I
> simply refuse to go back to the 90s (in the software development
> sense; no problem otherwise :)). I simply do not want to drop
> dependency injection and the other patterns I have been using for many
> years already. If I must, then I think it just isn't worth it.
> Seriously, Java is about frameworks. Saying not to use frameworks in
> Java is the same as saying to use Python instead. No thanks.

Don't listen to Brandon. By and large you can use frameworks, and you
*should* use frameworks, because "framework" is just a word for code
you would otherwise have to write yourself. Google doesn't want to
send you back to late-90s programming styles (even if Brandon
does), and nobody sane will seriously suggest you break your app into
hundreds of tiny deployments.

It is a known issue with the GAE Java environment that instance starts
for nontrivial applications take significant time -- enough to
negatively impact the user experience if a user request gets routed to
a cold start. Some frameworks make this problem worse than others,
but ultimately it is a linear relationship with the amount of code in
your app.

Maybe Google will be able to radically improve instance startup time
in the future, but probably not. There have been incremental
improvements but it's a difficult problem because the JVM loads code
very differently from Python or Go.

The key is to prevent users from seeing cold starts in the first
place. Takashi has posted one workaround: Keep "min idle instances"
high enough that there is always enough capacity. This works but it's
not ideal. It costs more and there is always the chance that a sudden
burst of traffic will overload the idle instances and route a request
to a cold start. You can decrease the chance of this by adding even
more min idle instances but that costs even more.
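The plumbing that makes those idle instances pay off is the warmup request. A sketch, assuming the stock Java runtime: enable warmup requests in appengine-web.xml, then map a servlet to /_ah/warmup that forces your expensive initialization to run before user traffic arrives. WarmupServlet here is a hypothetical class name.

<!-- appengine-web.xml -->
<warmup-requests-enabled>true</warmup-requests-enabled>

<!-- web.xml: route /_ah/warmup to a servlet that forces heavyweight init -->
<servlet>
  <servlet-name>warmup</servlet-name>
  <servlet-class>com.example.WarmupServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>warmup</servlet-name>
  <url-pattern>/_ah/warmup</url-pattern>
</servlet-mapping>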

My hope is that the scheduler can be made smart enough to always keep
requests in the pending queue until new instances are warmed up. I
believe that GAE used to do this, but something recently changed so
that it no longer does. If requests never see cold starts, it really
doesn't matter how long your app takes to start - you can optimize
capacity for time-to-service-each-request latency, instead of the %
chance that a user request will get sent to an instance for 20+
seconds.

So - don't panic. There is a solution that works right now, assuming
you have reasonably smooth traffic patterns: Pay a bit more than you
would otherwise for more idle instances. But also star this issue,
which IMHO is a better long-term solution:

http://code.google.com/p/googleappengine/issues/detail?id=7865

Jeff

Aleksei Rovenski

unread,
Jul 22, 2012, 5:58:06 AM7/22/12
to Google App Engine
OK, I get it - welcome to the world of elastic apps.
But still, there are a few simple things Google could do to ease the
pain...
If you are a small startup with small traffic (just a few F1s), a
couple of devs, and money to run for a few months, what you want to do
is probably keep making features like hell (and consuming memory)...
I don't really care about long startups (simply because they aren't doing
anything important and my client side handles timeouts gracefully), as long
as Google won't start killing my instances. I'm worried because I read a
lot here about "bad days" when startup times are 3-4 times normal,
and in that case that could happen...
There are tons of projects like us, and some of us can go big, so why
not make a small effort and make life a little easier for
"conventional" Java apps?
For example, would it be totally impossible to soften the 60s limit
for loading requests and double it? Some of the comments
above (by Jeff and others) regarding the pending queue do not look too
hard to implement either.
It would make many project owners happier (even if they won't go as
big - but maybe they don't need to), so Google wins also.

Aleksei Rovenski

unread,
Jul 22, 2012, 6:12:01 AM7/22/12
to Google App Engine
Starred, and the one posted by Takashi also, just in case...
Regarding min idle instances, I must admit there is something strange
going on in the Instances tab of my Dashboard.
I have settings like this: idle instances min=auto, max=1, pending
latency min and max=15s. I have some working instances that I get
charged for, and then a few more that are created by Google anyway, just
waiting out there for hours and even days, and I do not seem to pay for
them. If I kill them, they come back anyway. This looks nice
indeed, but could someone explain to me what is behind that? I'm worried
because it may be a drug that Google will suddenly stop providing
me :)
Also, although I'm using the F1 type, Google is playing soft on me here as
well. For example, even now one instance has already reached 143 MB and
the other one 129 MB, but both are still alive and kicking...

On 22 Jul, 12:52, Jeff Schnitzer <j...@infohazard.org> wrote:
> On Sun, Jul 22, 2012 at 2:05 AM, Aleksei Rovenski
>

Jeff Schnitzer

unread,
Jul 22, 2012, 6:31:29 AM7/22/12
to google-a...@googlegroups.com
On Sun, Jul 22, 2012 at 2:12 AM, Drake <dra...@digerat.com> wrote:
>
> Let's see... You spend your life complaining how the platform sucks. I
> release tutorials on how to make it suck less. I'm the troll?

This conversation was constructive and mostly positive until you
chimed in. It was a reasonable discussion of real problems that many
people are having with scheduler behavior, including concrete
suggestions for how to improve it.

For the record, I have never, ever said that GAE sucks. Your advice,
on the other hand...

> Anyone who thinks the low-level API is analogous to assembly probably
> doesn't want to depend on me, because the laughter when they asked a question
> might be deafening.
>
> Read EVERY optimization guide on Java for GAE, and that is the first thing
> they tell you to do.

You have this wrong. It's true that the JDO/JPA module gets a bad
rap, but every optimization guide I've read recommends abandoning it
in favor of lighter-weight layers like Objectify or Twig. It's true
that these tools will give you a better startup time because they
don't do classpath scanning. But it turns out that avoiding classpath
scanning only gets you so far - the up-front cost of classloading and
inspecting persistence classes becomes significant when you reach a
critical mass of entity classes.
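For illustration, this is what "no classpath scanning" looks like in practice with Objectify's explicit registration - a sketch only; Customer and Order are hypothetical entity classes, and the listener would be declared with a <listener> element in web.xml so registration happens once per instance, during warmup rather than on a user request.

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

import com.googlecode.objectify.ObjectifyService;

public class OfyBootstrap implements ServletContextListener {
    @Override
    public void contextInitialized(ServletContextEvent event) {
        // Explicit, one-line-per-entity registration - no classpath scan.
        // The reflection cost is still paid, but only here, only once.
        ObjectifyService.register(Customer.class); // hypothetical entity
        ObjectifyService.register(Order.class);    // hypothetical entity
    }

    @Override
    public void contextDestroyed(ServletContextEvent event) {}
}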

As for writing complex business applications using strictly the
low-level API... good luck. I can only explain your attitude about
this as naïveté. And your absurd proposal that all 326 of my URL
endpoints should be separate applications... Just. Wow.

Jeff

Drake

unread,
Jul 22, 2012, 1:15:36 PM7/22/12
to google-a...@googlegroups.com
> And your
> absurd proposal that all 326 of my URL endpoints should be separate
> applications... Just. Wow.

If you had actually read what I posted, I said that you should group by task
type and by class, so that you have many smaller apps that are optimized
for like tasks and minimize classes. Which is just good design, because
then you can run fewer instances, likely move to F1s or F2s, and better
utilize in-instance memory, hitting the soft memory limits less.

And no, it was not a constructive conversation; it was a gripe fest.

If you want it to be constructive, post your code and I'll whack 30% off
your start time, just to get you to stop griping about how it is not possible
to speed up start times because Google sucks.




hyperflame

unread,
Jul 22, 2012, 1:28:22 PM7/22/12
to Google App Engine
On Jul 22, 5:12 am, Aleksei Rovenski <aleksei.roven...@gmail.com>
wrote:
> Regarding min idle instances, I must admit there is something strange
> going on in the Instances tab of my Dashboard.
> I have settings like this: idle instances min=auto, max=1, pending
> latency min and max=15s. I have some working instances that I get
> charged for, and then a few more that are created by Google anyway, just
> waiting out there for hours and even days, and I do not seem to pay for
> them. If I kill them, they come back anyway. This looks nice
> indeed, but could someone explain to me what is behind that? I'm worried
> because it may be a drug that Google will suddenly stop providing
> me :)

I get that as well. My theory is that at non-peak times, the Google
scheduler opens up extra instances to be ready for spikes.

Drake

unread,
Jul 22, 2012, 1:38:37 PM7/22/12
to google-a...@googlegroups.com
I have a similar theory.

"We have 9 million instances - let's look busy." :-)

I think they assign instances to somebody at all times, and if that
somebody suddenly needs one, it raises the QoS. It doesn't cost anything to
idle instances that weren't doing anything, and you make the customer happy
if you avoid a warm-up.




Richard Watson

unread,
Jul 23, 2012, 11:19:41 AM7/23/12
to google-a...@googlegroups.com
Personally, I don't care much for the who's-the-better-guru argument; it doesn't get us any closer to a solution.

App Engine proves Joel Spolsky's "all abstractions are leaky" statement. An abstraction is nice when it works, but you'd better know what's going on under the hood when it doesn't. Most of us are willing to put in extra effort up front to make our apps GAE-friendly, but everything is about balancing available resources given our specific app and situation. If we weren't willing, we'd be using something else. Some have had more opportunity than others to refine their approach, and some apps suit the platform better than others. But we'll do better if we respect the opinions of the other participants a touch more, or at least try to keep comments about the ball, not the man.