Feature Request I Think....


Brandon Wirtz

unread,
Nov 23, 2011, 5:27:07 PM11/23/11
to google-a...@googlegroups.com

I am considering asking for the ability to specify Min/Max Latency in the app.yaml per handler.

I am really wishing I could have fast instances and slow instances. I have a lot of requests that take 66ms to fill, and a lot that take 3000ms.

Concurrency seems to help some, but really I should probably be breaking these into two apps: one for the slow stuff, where it wouldn’t matter if it took 5s instead of 3s, and one for the stuff where 66 or 133ms wouldn’t make any difference.

 

But the ones that take 66ms seem really slow when they take 3066ms, and I don’t really need more instances, just more ability to arrange them.

 

I had thought that by combining apps I’d get to the point where this would be less of an issue, but since the scheduler doesn’t know how long requests are going to take, it seems most inefficient at dealing with requests of multiple “sizes”.

 

But all my troubles would go away if “slow requests” and “fast requests” went into two different buckets.

 

I’m willing to implement this however GAE thinks is best, so I’m posting this as “How do you solve this? And how should I?”
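Concretely, I’m imagining something like this hypothetical app.yaml extension (the per-handler pending-latency keys below do not exist today; App Engine only exposes pending latency as an app-wide scheduler setting):

```yaml
# Hypothetical sketch only -- per-handler scheduler settings are NOT a
# real app.yaml feature; min/max pending latency are app-wide today.
handlers:
- url: /search.*                # the ~3000ms requests
  script: search.app
  max_pending_latency: 1000ms   # hypothetical per-handler key
- url: /.*                      # the ~66ms requests
  script: main.app
  max_pending_latency: 133ms    # hypothetical per-handler key
```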

 

Chris Ramsdale

unread,
Nov 23, 2011, 6:59:26 PM11/23/11
to google-a...@googlegroups.com
Hey Brandon,

A timely post indeed. Amongst the team, we've been discussing this exact problem. Splitting your app out into smaller apps has a handful of benefits, one of which is the ability to fine tune the performance settings for each app. I'd love to hear more about your design. For example:

- Assuming that it's required, how are you going to handle sharing data between the two apps?
- Are there other app-specific performance characteristics that you expect to change?
- Are there configuration settings that you would expect to be applied to both apps? (E.g., a centralized list of admins or cross-app budget)
- Any thoughts on how you would configure this with custom domains?

-- Chris

Product Manager, Google App Engine

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

Brandon Wirtz

unread,
Nov 23, 2011, 7:57:35 PM11/23/11
to google-a...@googlegroups.com

Chris,

 

For my app, sharing data would not be an issue. That is one of the reasons I’m considering (and testing) using more than one app, so that I can have different Scheduler/Instance/Latency settings in each.

 

In my case most of the difference is per domain, but I do know, for instance, that everything with a ?s=Keyword is a search and will therefore take longer than just returning a page with some data.

 

The reason I was looking at “per handler” was that I was hoping something could be done with just instance balancing. I would have 2 instances reserved for “Priority 0” requests and 1 for “P1” requests.

 

If P0 is busy and P1 is not, P1 serves P0 requests.

If P0 is busy and P1 is busy, more P0 instances spin up.

If P1 is busy, more P1 instances spin up.

 

But the “busy” state for each would be configurable.

 

If P0 requests have a 133ms pending latency set, and P1 has a 1000ms pending latency, then the waterfall would look something like this:

 

P0 Request

Instance 1 has 166ms until free -> go next

Instance 2 has 266ms until free -> go next

Instance 3 is a P1 handler with 106ms until free -> serve request

P0 Request

Instance 1 has 166ms until free -> go next

Instance 2 has 266ms until free -> go next

Instance 3 is a P1 handler with 1206ms until free -> create new P0 instance

Instance 4 is P0 -> serve request

P1 Request

Instance 1 is not P1, has 166ms until free -> go next

Instance 2 is not P1, has 266ms until free -> go next

Instance 3 is a P1 handler with 106ms until free -> serve request

P1 Request

Instance 1 is not P1, has 166ms until free -> go next

Instance 2 is not P1, has 266ms until free -> go next

Instance 3 is a P1 handler with 2106ms until free -> create new P1 instance

Instance 4 is P1 -> serve request

 

 

Ideally it would be nice to specify by handler, or by assigned domain. I don’t know if there is reason to have more than 2 tiers of service. For small apps I would think 2 would do fine. I can envision large apps looking at 3, but likely you have “very fast”, “kind of slow”, and “should be done on the back end”.
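To make the rule concrete, here is a rough sketch of that waterfall as plain Python (nothing App Engine specific; the tiers, pending-latency thresholds, and timings are hypothetical, just mirroring the examples above):

```python
# Sketch of the two-tier dispatch rule described above. Each instance is
# a (tier, ms_until_free) pair; the thresholds are the hypothetical
# per-tier max pending latencies (P0 = 133ms, P1 = 1000ms).

PENDING_LATENCY = {"P0": 133, "P1": 1000}

def dispatch(request_tier, instances):
    """Return the index of the instance that serves the request, or
    "new" if a new instance of request_tier must spin up. P0 requests
    may fall through to an idle-enough P1 instance; P1 requests only
    run on P1 instances."""
    limit = PENDING_LATENCY[request_tier]
    for i, (tier, busy_ms) in enumerate(instances):
        if request_tier == "P1" and tier != "P1":
            continue  # P1 work never lands on a P0 instance
        if busy_ms <= limit:
            return i
    return "new"

# The four waterfall examples from the post:
print(dispatch("P0", [("P0", 166), ("P0", 266), ("P1", 106)]))   # instance 3 serves
print(dispatch("P0", [("P0", 166), ("P0", 266), ("P1", 1206)]))  # new P0 instance
print(dispatch("P1", [("P0", 166), ("P0", 266), ("P1", 106)]))   # instance 3 serves
print(dispatch("P1", [("P0", 166), ("P0", 266), ("P1", 2106)]))  # new P1 instance
```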

 

Where things get “interesting” is in resource balancing. My really slow requests cause extra congestion because they have a really high cache-miss rate: they are computing/gathering new data that I don’t already have, so they tend to be one-off requests that don’t benefit much from memcache. I’m a lazy coder, so I always check just in case, because when you do get lucky and the data is there, I save 2000ms, and that adds up. But in terms of users served/pages served, that hit cost the very fast pages a lot of their ability to be cached. SO… putting my slow requests on a separate app gives me HUGE wins, because I get a better cache hit ratio and more total memcache.

 

<Side bar> It doesn’t appear that memcache scales with my number of instances; rather, it is a per-app limit, whereas local instance memory goes up with my number of instances. As a result, instance-memory cache hits scale up when you have a spike in traffic to 100s of pages, and memcache hits go down, because the number of actively requested items goes up while memcache stays the same size. (This also comes into play as you start to multi-tenant lots of websites.)

 

<back on track>

<back on track> So for me the “non-sharing” of data across apps means I get more resources, which is a good thing. That wouldn’t work for a lot of other people who don’t have such beautifully segmented data that is happy in silos. Datastore not being available across apps is kind of sad. I mean, I like that if I have my Postgres or MySQL or Oracle server, I can put as many apps on the edge talking to it as I want; Datastore in GAE doesn’t really allow that. (Though I did do some code that basically talks to Datastore via “curl”.)

 

I meandered; these are my thoughts “raw” as I think things through.

For me:

More apps = harder to manage, but better and more customizable performance; no ability to have different priorities on the same domain.

Apps with instance priority = fewer resources, easier management, ability to prioritize within a single domain.

Brandon Wirtz

unread,
Nov 23, 2011, 8:12:57 PM11/23/11
to google-a...@googlegroups.com

It will be a while before I’m certain, but it looks like separating long and short requests saves me a LOT on instance $$$. I’m guessing that “volume” and “average request time” don’t get calculated well when the types of requests change, so when site X gets crawled by Googlebot the scheduler tunes for one setting, and when Y gets crawled it gets tuned another way, and when both get crawled at the same time all hell breaks loose.

 

Latency by IP range would be SOOOO awesome :-) Those Googlebots don’t need the same QoS as my paying customers :-)

Brandon Wirtz

unread,
Nov 24, 2011, 4:39:54 AM11/24/11
to google-a...@googlegroups.com

I probably need more time for testing but…

 

It would appear that putting slow tasks on one app and fast tasks on the other DRASTICALLY reduces the number of required instances.

I suspect that requests that take less than 200ms don’t support concurrency, and requests that take more than 1500ms support a lot.

So when they co-exist on the same instances, the fast requests don’t stack well and the long requests create large delays. Basically, what was requiring 8-10 instances now takes 4. Some of that may just be that my RAM usage lets me hit instance memory more often in the segmented apps, because I have less total data passing through instances.

 

This makes a strong case for being able to pin specific handlers to specific instances, and for having different scheduler settings for various handlers.

 

 

Brandon Wirtz
BlackWaterOps: President / Lead Mercenary


Work: 510-992-6548
Toll Free: 866-400-4536

IM: dra...@gmail.com (Google Talk)
Skype: drakegreene

BlackWater Ops


stevep

unread,
Nov 24, 2011, 12:06:26 PM11/24/11
to Google App Engine
There are two schedulers already at work, I believe: the on-line handler and Task Queue. TQ is, I believe, optimized for longer running tasks. Google has never, to my knowledge, explained how these two schedulers interact in managing load inside the current instance setup. I long, long ago posited that by changing TQ moderately there would be opportunities for substantial system optimizations. That is a complete guess, however, knowing nothing about how the system works. Anyway, I believe TQ is the ignored, red-headed step-child within GAE. Perhaps it was initiated by a brilliant engineer who left, and no one wants to visit his/her code. Alternatively, it could have been a quick hack that everyone wants to deprecate and dispose of. Ultimately, GAE today is clearly a version one. At the very least it would make sense for the OL scheduler to analyze calls individually, but I don't think it does that, based on a previous thread Brandon started about one very high latency function causing a higher-than-necessary instance count. Hopefully the initial financials for GAE are promising, and these types of issues will get the great minds at G. applied. Seeing the
many new names from Google in the forums, I'm assuming that is the
case.


Andrei Volgin

unread,
Nov 24, 2011, 10:23:26 PM11/24/11
to Google App Engine
Allowing several apps to talk to the same Datastore may be a good idea in its own right, but it would be the wrong way to look at this issue. Most apps combine slow and fast requests in a way that makes them hard to separate. If I can specify that requests A, B and C go to "instance type 1", and request D goes to "instance type 2", then I can expect significant performance improvements by being able to fine-tune each type of front-end instance. I would love to be able to set different rules for each type, as I know when my users expect an immediate response and when a few seconds' delay is not a problem.

We already take advantage of the two tiers of instances: frontend and
backend. This idea takes it one step further by creating custom
frontend instance types. While you are at it, maybe you can get rid of
these artificial monikers of "frontend" and "backend" instances, and
have a single implementation that lets us define our own instance
types based on 3-4 criteria.

Joshua Smith

unread,
Nov 25, 2011, 9:01:28 AM11/25/11
to google-a...@googlegroups.com

On Nov 24, 2011, at 12:06 PM, stevep wrote:

Seeing the
many new names from Google in the forums, I'm assuming that is the
case.

I noticed this, too. Can anyone from Google comment (just between us girls): is GAE getting some traction inside the Googleplex now that you're out of preview? Do you get mentioned in high-level meetings? Are you getting some more budget to work with? Are new insanely smart people looking to get into your group? Is Brandon's mermaid costume discussed at every water cooler?

-Joshua

James X Nelson

unread,
Nov 25, 2011, 11:05:50 AM11/25/11
to google-a...@googlegroups.com
Hey Chris,

A couple ideas about how to help us help you help us get better load balancing.

If it were possible to define different classes of instances to target, perhaps we could use a request header?

Either an X-AE-Expected-Latency or an X-AE-Instance-Class:
the former would let an auto-balancer know whether it's a small or large request, so it can send small requests to fresher instances;
the latter would require the client to specify a particular instance class to target.

This is similar to the way we target non-default versions of the app in task queues.


In fact, it would be an ENORMOUS help to my load balancing if I could send some requests to default frontend, and others to different versions.

Alas, SSL certificates do not allow me to request from msg.appid.appspot.com without an ugly, unacceptable "please click this and confirm the security exception".

Allowing us to target different versions with a request header through the main url would bypass this limitation, and allow us easy "large and small request" distribution.
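As a client-side sketch, the tagging half of the proposal could look like this (both header names are hypothetical, per the suggestion above; nothing here is a real App Engine API):

```python
# Sketch of the proposed request tagging. X-AE-Expected-Latency and
# X-AE-Instance-Class are HYPOTHETICAL headers from this thread, not a
# real App Engine feature. A client would attach one of them to each
# RPC so a balancer could route it to an appropriately tuned instance.

def routing_headers(expected_latency_ms, instance_class=None):
    """Build the proposed routing headers: an explicit instance class
    if the caller knows one, otherwise an expected-latency hint the
    balancer could bucket into fast/slow instance pools."""
    if instance_class is not None:
        return {"X-AE-Instance-Class": instance_class}
    return {"X-AE-Expected-Latency": str(expected_latency_ms)}

print(routing_headers(2500))                       # latency hint
print(routing_headers(0, instance_class="slow"))   # explicit class
```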




Anyway, if you guys decide to implement this,
out of gratitude, I will personally write the GWT generator patch to apply @Instance(target="messaging") or @Latency(expected=2500) annotations to RPC methods and have the generator apply the headers for each method. {As opposed to having RpcAsync methods return RequestBuilder instead of void, and then manually applying the headers and hitting .call().}

Perhaps a whole RPC interface could target a specific instance as well, with per-method overrides....



In regard to having multiple apps share data, I would think a reserved id namespace {something not currently allowed} would be the easiest.

Link the apps in the admin console, and when any of those apps interact with the Entity.SHARED_NAMESPACE ns, it will be a common big table namespace.

...Shouldn't be too hard.

Treat the shared DS like its own app, with its own appid, and use the SHARED_NAMESPACE to route to a different app-id.

Most of the client library would never need to know it's a different appid. Work a little magic in KeyFactory and DatastoreServiceImpl, and boom: instant shared data.


...Would also make a great tool for migrating to HR.  Just encode your namespace as a parent key, copy over, demux in HR, win! ;-}

Jamie Nelson

unread,
Nov 25, 2011, 10:26:22 AM11/25/11
to Google App Engine
How about a header we can append to have a request routed to a
particular instance-class?

For those of us using gwt, appending an @Instance(target="a1") or
@Latency(expected=2500) annotation to rpc methods could append the
appropriate header to route requests based on their expected latency.

This would be faster for all of us, and easier on your servers.

If this feature is released, I will personally write the generator patch to implement the annotation {as opposed to having RpcAsync methods return RequestBuilder to manually set each request}.

Chris Ramsdale

unread,
Nov 29, 2011, 2:36:27 PM11/29/11
to google-a...@googlegroups.com
Great feedback. Here's the model that I've been envisioning:

- I have foo.com and foo.com/api
- foo.com serves up my UI and needs to be super-fast
- foo.com/api serves up non-realtime API requests
- each routes to a separate App Engine app
- foo.com has a max pending latency of 200ms, and several idle instances
- foo.com/api has a max pending latency of 10s
- Each app is part of a larger "system" that is configurable within the Admin Console
- Being part of a larger "system" sets up the correct ACLs between apps and services (e.g. each app is able to talk to the same Datastore)

A couple of notes:
- There needs to be a simple way of routing requests. Routing foo.com to the "system" and configuring paths that map to apps (e.g. /api routes to api.foo.appspot.com under the covers) is one suggestion
- Configurable Memcache that can be shared by each app would be nice, still iterating on this one
- Expose a few more Backend properties to Frontends, and one could imagine Backends and Frontends merging under this model
- Is there "system" billing and per-app billing?
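The routing piece might be pictured like this (a minimal sketch; the app IDs, prefixes, and the routing layer itself are all illustrative -- no such "system" front door exists today):

```python
# Sketch of the "system" routing table described above: a front door
# maps path prefixes on foo.com to separate App Engine apps under the
# covers. All app IDs and prefixes here are illustrative.

ROUTES = [
    ("/api", "api.foo.appspot.com"),  # high max-pending-latency app
    ("/",    "www.foo.appspot.com"),  # low-latency UI app
]

def route(path):
    """Return the backing app for a request path.
    Longer, more specific prefixes are listed first."""
    for prefix, app in ROUTES:
        if path.startswith(prefix):
            return app
    return None

print(route("/api/search"))  # the slow-API app
print(route("/index.html"))  # the fast-UI app
```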

-- Chris 

James X Nelson

unread,
Dec 19, 2011, 10:27:07 PM12/19/11
to google-a...@googlegroups.com
I definitely, absolutely love the idea of having App Engine internally route foo.com/api to api.foo.appspot.com. An under-the-covers targeting of different apps/versions is the only way I can think of to get SSL for the whole app without having to deal with how browsers handle the *.appspot.com certificate. Plus, being able to configure different instance configs with high or low latency would take advantage of smart clients that actually know which requests _should_ take a certain amount of time. {Before pre-warming requests, I had every GWT RPC log its average run time in a format I compiled in, to set a "you should be done by now" timer at 2x expected latency, so I could kill the old request and fire off a retry without waiting 10 seconds for the request to die.}

Anyway, the extra perks of shared memcache would be nice, but I would be more than absolutely happy to use it with nothing more than a single shared datastore namespace to serve as inter-app-swap space.  Being able to remote api data over _would_ work, but paying for http + instance hours when all we need is entity get() -> entity put() makes it far less than desirable.


I think, for standard users, they would have to pay flat billing for all app versions as if they were the same app.  Premium accounts already pay $500/month and can just hook up multiple apps however they please, and just pay for usage.

Please keep us updated, I'm very excited about the potential to wire up multiple apps, especially if I can send requests to sub-apps / sub-versions under a single ssl-friendly subdomain. =}

Chris Ramsdale

unread,
Dec 20, 2011, 7:51:46 PM12/20/11
to google-a...@googlegroups.com
On Mon, Dec 19, 2011 at 7:27 PM, James X Nelson <jamie....@promevo.com> wrote:
I definitely, absolutely love the idea of having App Engine internally route foo.com/api to api.foo.appspot.com. An under-the-covers targeting of different apps/versions is the only way I can think of to get SSL for the whole app without having to deal with how browsers handle the *.appspot.com certificate. Plus, being able to configure different instance configs with high or low latency would take advantage of smart clients that actually know which requests _should_ take a certain amount of time. {Before pre-warming requests, I had every GWT RPC log its average run time in a format I compiled in, to set a "you should be done by now" timer at 2x expected latency, so I could kill the old request and fire off a retry without waiting 10 seconds for the request to die.}

Great, we're pretty excited about the concepts as well.
 

Anyway, the extra perks of shared memcache would be nice, but I would be more than absolutely happy to use it with nothing more than a single shared datastore namespace to serve as inter-app-swap space.  Being able to remote api data over _would_ work, but paying for http + instance hours when all we need is entity get() -> entity put() makes it far less than desirable.

Re: Memcache -- rather than shared Memcache, I more frequently hear the request for configurable Memcache instances (i.e. please reserve 2GB of Memcache for app X). Does this resonate with what other folks are experiencing?

Re: remote api data -- integrating at the data layer has the added advantage of speed and cleanliness of code. 
 


I think, for standard users, they would have to pay flat billing for all app versions as if they were the same app.  Premium accounts already pay $500/month and can just hook up multiple apps however they please, and just pay for usage.

Still being fleshed out, but rolled-up pricing seems to be a natural fit.
 

Please keep us updated, I'm very excited about the potential to wire up multiple apps, especially if I can send requests to sub-apps / sub-versions under a single ssl-friendly subdomain. =}

Sure thing, and thanks for the feedback. Keep it coming!

-- Chris
 
 


stevep

unread,
Dec 21, 2011, 3:10:08 PM12/21/11
to Google App Engine
Still voting for task queue optimization. Hopefully we get TQ access in the /api app. For example, if I get a huge burst of traffic, for certain types of recs I would much rather build a queue backlog than spin up new 10s-latency instances. TQFTW. stevep.

Chris Ramsdale

unread,
Dec 30, 2011, 3:57:35 PM12/30/11
to google-a...@googlegroups.com
For the use case you provided, why not split the low latency and high latency requests into two different apps (each with specific performance tuning)?

-- Chris

Brandon Wirtz

unread,
Dec 30, 2011, 4:26:16 PM12/30/11
to google-a...@googlegroups.com

You’d either have to put a proxy in front or serve off two URLs; not an ideal use case.

 
