Computationally complex done fast (was: What are the Pros)


Drake

Aug 6, 2012, 5:41:30 PM8/6/12
to google-a...@googlegroups.com
So here are more generalizations for App Engine:

Cache Everything:
Your CPU is slow; when profiling, weigh the CPU cost against the memory cost
and speed for things you calculate often. It is surprising how much faster
even simple denormalized data can be than the complex operation.
GetByID("Sales-and-Shipping-AZ-1599-1lb") takes less than 3 ms out of
instance memory cache served from a dedicated backend, 10 ms from memcache,
and 120 ms from the datastore. You can't do the four lookups and the
calculation that fast, so don't. Remember, all your "complex business rules"
are cacheable. Anything worth doing is worth caching :-)

API Everything:
Concurrency is your friend, and concurrency comes from using the APIs. Any
time your frontend is doing API work in request 1, request 2 can do CPU work.
To get maximum concurrency you want to make requests to the APIs, or to
backends that function like APIs. In the above example you would have a
Ship and Tax Calculator "API" that would calculate and store data which could
be fetched by ID. That tier would only calculate if the answer wasn't
already stored somewhere. During the calculation, simpler requests are not
blocked or held in a queue.

The above methodology also lets you reduce your code footprint and
leverage "specialized memory". If an instance can use all 128 MB of RAM for
saved calculations, you can serve those calculations very, very quickly, and
you don't have to manage memory or worry about soft-limit instance
termination. If you dump the memory every so often to something that can be
init'ed on startup, you can create a setup where you have always-on,
infinitely scalable, non-expiring memory.

Cascade Defers:
Building defers that themselves do defers can sound scary, but it allows
you to throttle tasks better in a well-threaded environment. When you have
tasks of different sizes and use the maximum concurrency, you can have big
tasks spawn small tasks and see fewer errors in your tasks. This is also
better for leveling out your peaks and valleys, so your app doesn't get
"sluggy" when there is a spin-up: with natural traffic, your "sluggy" tasks
won't spawn as many "fast" tasks, so you won't use as many instance hours.
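A minimal sketch of cascading defers, under stated assumptions: a plain list stands in for the task queue and a loop stands in for the task-queue service, where on App Engine you would call `deferred.defer(...)` instead of appending. The big task only splits work and enqueues; the small tasks do the work and fail (and retry) independently:

```python
queue = []    # stand-in for the App Engine task queue
results = []

def big_task(items, chunk_size=3):
    """Large task: split the work and defer small tasks; do no work itself."""
    for i in range(0, len(items), chunk_size):
        queue.append((small_task, items[i:i + chunk_size]))

def small_task(chunk):
    """Small unit of work; an error here only retries this chunk."""
    results.extend(x * x for x in chunk)

def run_queue():
    """Crude worker loop standing in for the task-queue service."""
    while queue:
        func, arg = queue.pop(0)
        func(arg)
```

Because the big task finishes quickly, a traffic spike enqueues work instead of pinning instances, which is the peak-leveling effect described above.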

Use Specialized DataTypes:
The biggest thing you give up with frameworks is the ability to do the really
awesome GQL stuff with special data types.
Did you know you can do queries based on the RGB values of your images? I can
find "images like this" using just App Engine APIs (sorry, that bit of code
is secret).
But let's say tomorrow you wanted to build an eHarmony clone:

SELECT * FROM Person WHERE age >= 18 AND age <= 35 AND "white" IN
potential_races AND distance(location, geopoint(-33.857, 151.215)) < 15

The above means you don't have much to write. You can even use calculated
values, which can offload a LOT of CPU cycles.
When you use frameworks, however, a lot happens in the instance rather than
in the API, AND you pay for more queries in terms of both datastore calls and
CPU time.
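The "calculated values" idea means denormalizing a derived property at write time so the query filters on it directly. A hedged sketch, with a list of dicts standing in for the datastore and an age-from-birth-year property as the illustrative calculated value:

```python
people = []  # stand-in for a Person kind in the datastore

def put_person(name, birth_year, current_year=2012):
    person = {
        "name": name,
        "birth_year": birth_year,
        # Calculated at write time -- queries never spend CPU deriving it.
        "age": current_year - birth_year,
    }
    people.append(person)
    return person

def query_age_range(lo, hi):
    # Equivalent of: SELECT * FROM Person WHERE age >= lo AND age <= hi
    return [p for p in people if lo <= p["age"] <= hi]
```

The trade-off is the usual one for denormalization: the stored value must be refreshed when its inputs change (here, once a year), in exchange for cheap reads.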

To me the above list is the reason to do GAE.

Jeff Schnitzer

Aug 7, 2012, 9:38:29 PM8/7/12
to google-a...@googlegroups.com
On Mon, Aug 6, 2012 at 2:41 PM, Drake <dra...@digerat.com> wrote:
> So here are more generalizations for APP Engine:
>
> Cache Everything:
> Your CPU is slow, when profiling weigh the CPU to the memory cost and speed
> for things which you calculate often. It is surprising that even simple
> stuff denormalized can be so much faster than the complex operation.
> GetByID("Sales-and-Shipping-AZ-1599-1lb") takes less than 3 MS out of
> Instance memory cache served from a dedicated backend, 10ms from memcache
> 120ms from data store. You can't do the 4 lookups and the calculation that
> fast. So Don't. Remember all your "Complex business rules" are cacheable.
> Anything worth doing is worth caching :-)

These numbers are nothing like my past experimental results, so I ran
them again. Here's appstats for calls from an F1 to a B1 that does
nothing but return the string "noop":

https://img.skitch.com/20120808-nn749683wqdg5516fy732k71iw.jpg

Clicking on those links shows that 99% of the time is spent waiting on
urlfetch to the backend.

* The _minimum_ fetch time to a backend is 12ms.

* Maybe 10% of calls are >250ms, and a very significant portion exceed 500ms.

* I also tried this with a B8. No material difference
(https://img.skitch.com/20120808-gerejtgfd958i691hsiw897wjw.jpg)

My experience with memcache is that most requests come back in 2ms or
less, but there are rare outliers in the 100ms or 200ms range. Much,
much better behaved.

BTW this was with zero contention (ab -c 1 -n 50).

Conclusion: Backends suck.

Jeff

Michael Hermus

Aug 8, 2012, 8:18:12 AM8/8/12
to google-a...@googlegroups.com, je...@infohazard.org
I am pretty sure that the GQL you quoted cannot actually be executed, but I haven't used it in a long while, so perhaps they added some fancy features (or maybe it is more 'secret code'). I also very rarely find datastore gets taking over 100ms (even queries are generally faster than that).

Obviously, caching is good. Obviously, concurrency is good (although the thread limitations on GAE put a practical cap on concurrency).

I agree that most people probably don't take enough advantage of the Task Queue to defer and decompose work. It generally results in apps that are extremely well suited to the performance characteristics of App Engine.

Drake

Aug 8, 2012, 1:39:01 PM8/8/12
to google-a...@googlegroups.com

https://developers.google.com/appengine/docs/python/datastore/queries


shows examples of inequality filters (<, >), which can be stacked, and "IN list" filters. You can also do calculated/weighted values where you say +4 if red and -2 if blue.


Get-by-ID should be around 150 ms if you use the ID; 100 ms is the minimum, and typically it trends up to 350 ms. GAE has to be having a bad day for Get-by-ID/key to take more than 500 ms.


We are seeing an 8 ms average on fetches from backend in-instance memory. And for large static data sets we have code that is literally as simple as:

this = 'that'

that = 'this'

Shipping400lbsto90201 = '345.97'


Which is very close to a no-op.


Our spin-up time for a 100 MB static data set is 4 s. The best of 100,000 requests was 3 ms and the worst was 114 ms (again, with an average of 8 ms).

Michael Hermus

Aug 8, 2012, 2:41:26 PM8/8/12
to google-a...@googlegroups.com
I am not sure where you are getting your data; get-by-ID averages less than 50 ms consistently, with only very rare spikes toward 100 ms. In the last week, for example, latency has not gone over 100 ms at all.

Regarding the query, I still don't know what you mean. Inequality filters on multiple properties are still not allowed, as far as I can tell. The "distance" function you use in the query would be absolutely kick-ass if it existed, but as far as I know, something like that is currently only available in the Search API.

Drake

Aug 8, 2012, 4:11:29 PM8/8/12
to google-a...@googlegroups.com
I'll try and dig up the distance one... It was a client doing it, not my
code. I don't do much with geo that way; we hex the entire US into 5-mile
chunks and put you in a list.

But the client was doing distance.

(If you are anywhere in hex 2345, that is your "Tier 1" hex; then we know
that 2345 borders hexes 91234, 92345, 93456, etc., and we use geographic
features to remove neighbors that you can't navigate to directly. For
example, you can't cross Lake Erie by bridge, and if you are doing a 15-mile
radius and are 9 miles south of the San Mateo Bridge, a guy 5 miles as the
crow flies across the SF Bay is not in a neighboring hex.)
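The hex scheme above reduces to a precomputed neighbor list per hex with non-navigable neighbors removed. A minimal sketch, assuming axial hex coordinates as the hex IDs (the original uses opaque numeric IDs) and a `blocked` set standing in for geographic features like a lake with no bridge:

```python
# The six neighbor offsets of a hex cell in axial (q, r) coordinates.
NEIGHBOR_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def neighbors(hex_id, blocked=frozenset()):
    """Return the adjacent hexes you can actually navigate to directly."""
    q, r = hex_id
    return [(q + dq, r + dr)
            for dq, dr in NEIGHBOR_OFFSETS
            if (q + dq, r + dr) not in blocked]
```

In practice these lists would be computed offline and stored per hex, so a radius query becomes a handful of cheap list lookups instead of a distance calculation per candidate.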

You can do < and > for sure, but it takes two "AND"s.



Drake

Aug 8, 2012, 4:11:58 PM8/8/12
to google-a...@googlegroups.com
And yes, it is poorly documented.



