Using RAM instead of datastore - any limits?

ThePiachu

Nov 20, 2011, 6:58:42 PM
to google-a...@googlegroups.com
My application relies on accessing a lot of simple stored data and displaying it. I'm considering keeping all of the data in the application's RAM to avoid running into datastore access quotas, but I'm not sure whether there are limits on how much data can be stored this way. Is there a limit on how much data one can store in, say, a vector in RAM?

JH

Nov 20, 2011, 8:17:26 PM
to Google App Engine
You get 128 MB of RAM for front-end instances. Also, in my
experience so far, Python 2.7 uses quite a bit more RAM just to run
Hello World.

Brandon Wirtz

Nov 20, 2011, 11:26:36 PM
to google-a...@googlegroups.com

You get an amount of RAM close to, but not always equal to, 128 MB per instance.

 

Python 2.7 uses more memory for Hello World and less for most operations. Both runtimes use the same amount for storing things like data caches.

 

You can use local instance memory in addition to the datastore, not instead of it. My apps waterfall from edge cache to instance memory to memcache to datastore.
Use all the RAM you can; it is free. Don’t count on it being there, don’t overuse it, and stick to the APIs and libraries for accessing it, or the world will end violently.
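
A minimal sketch of the waterfall lookup described above, not the poster's actual code. The names `TieredCache`, `DictCache`, and `datastore_get` are hypothetical; `DictCache` stands in for memcache and `datastore_get` for a datastore read, so the sketch runs without the App Engine SDK. The edge-cache tier sits in front of the application at the HTTP layer and is left out.

```python
class DictCache:
    """Hypothetical stand-in for a shared cache such as memcache."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class TieredCache:
    """Check instance memory first, then the shared cache, then the datastore."""

    def __init__(self, shared_cache, datastore_get):
        self.local = {}                      # per-instance memory, gone when the instance dies
        self.shared = shared_cache           # shared between instances
        self.datastore_get = datastore_get   # source of truth

    def get(self, key):
        # Tier 1: instance memory.
        if key in self.local:
            return self.local[key]
        # Tier 2: shared cache.
        value = self.shared.get(key)
        if value is None:
            # Tier 3: datastore; populate the shared cache on the way back.
            value = self.datastore_get(key)
            if value is None:
                return None
            self.shared.set(key, value)
        # Remember the value locally for the next request on this instance.
        self.local[key] = value
        return value
```

Because the local tier is filled only as a side effect of a lookup, losing it costs nothing but a slower next request, which fits the advice not to count on the RAM being there.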

No, I won’t share code for doing this; it is our biggest selling point.

 

--

You received this message because you are subscribed to the Google Groups "Google App Engine" group.

To post to this group, send email to google-a...@googlegroups.com.

To unsubscribe from this group, send email to google-appengi...@googlegroups.com.

For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

 


Joshua Smith

Nov 21, 2011, 8:46:09 AM
to google-a...@googlegroups.com
When building in-memory caches, it's typical to use a weak reference system (like a WeakHashMap in Java) so you don't have to rely on heuristics for how much memory you should use. Googling around, I found a thing called WeakValueDictionary in Python. Does anyone here have experience using one of these in GAE?

Nick Johnson

Nov 24, 2011, 12:47:59 AM
to google-a...@googlegroups.com
Weak references may not work as you expect in Python. Python uses both reference counting and garbage collection; if the reference count of an object goes to 0, it will be freed immediately, instead of waiting for garbage collection. As a result, your cache may well be empty most or all of the time.
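
A small sketch of the behaviour described above, assuming CPython (where reference counting frees an object as soon as its last strong reference goes away). The `Page` class and `weak_cache_demo` function are made up for illustration.

```python
import weakref


class Page(object):
    """Plain class; instances of built-ins such as str or int cannot be weakly referenced."""

    def __init__(self, body):
        self.body = body


def weak_cache_demo():
    cache = weakref.WeakValueDictionary()
    page = Page("<html>cached page</html>")
    cache["/index"] = page

    # While a strong reference ("page") exists, the entry is visible.
    present_while_referenced = "/index" in cache

    # Drop the only strong reference. Under CPython the object is freed
    # immediately and the dictionary entry disappears with it.
    del page
    present_after_release = "/index" in cache

    return present_while_referenced, present_after_release
```

In a request handler the only strong reference usually lives in a local variable, so the entry would typically be gone by the time the next request looks for it.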

-Nick Johnson
Nick Johnson, Developer Programs Engineer, App Engine


Gerald Tan

Nov 24, 2011, 3:58:28 AM
to google-a...@googlegroups.com
Don't forget that entities cached in instance memory will become stale if they are updated from another instance, and there is no way of knowing that this has happened without querying the datastore.
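
One common way to put a bound on that staleness, though not something proposed in this thread, is to expire local entries after a fixed time-to-live. A minimal sketch, with `TTLCache` and its `clock` parameter as hypothetical names:

```python
import time


class TTLCache:
    """Instance-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}      # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            # Too old to trust; force the caller to go back to the source.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)
```

This does not detect updates made by other instances; it only limits how long a stale copy can be served.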

Joshua Smith

Nov 24, 2011, 8:11:52 AM
to google-a...@googlegroups.com
Interesting. So is the only way to do an in-memory cache to use my own "bytes used" counter, and a heuristic of how much I can afford to store?

(If so, that's kinda lame.)
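
A rough sketch of what such a "bytes used" counter could look like, assuming the cached values are byte strings and that least-recently-used eviction is acceptable. `ByteBoundedCache` and `max_bytes` are hypothetical names, and `len(value)` undercounts the real footprint because it ignores per-object overhead.

```python
from collections import OrderedDict


class ByteBoundedCache:
    """LRU cache that evicts entries once the stored payload exceeds max_bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()   # least recently used first

    def get(self, key):
        if key not in self._items:
            return None
        # Re-insert so the key becomes the most recently used.
        value = self._items.pop(key)
        self._items[key] = value
        return value

    def set(self, key, value):
        # Replacing an existing key: release its bytes first.
        if key in self._items:
            self.used_bytes -= len(self._items.pop(key))
        if len(value) > self.max_bytes:
            return   # larger than the whole budget, so never cached
        self._items[key] = value
        self.used_bytes += len(value)
        # Evict the oldest entries until the payload fits the budget again.
        while self.used_bytes > self.max_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)
```

The budget passed as `max_bytes` is still a heuristic and would need to sit well below the instance's memory limit.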

Joshua Smith

Nov 24, 2011, 8:12:30 AM
to google-a...@googlegroups.com
The case I'm thinking of is a proxy server, so that isn't really an issue here.


Brandon Wirtz

Nov 24, 2011, 3:02:37 PM
to google-a...@googlegroups.com

I use local memory as part of my caching solution, which is a reverse caching proxy. You can’t keep a very big site in local memory because you only get about 60 MB effective, and memory isn’t shared between instances.

Joshua Smith

Nov 25, 2011, 8:43:03 AM
to google-a...@googlegroups.com
Wow - only 60? I thought Python's interpreter was pretty efficient. Where's the other half of the memory going?

Brandon Wirtz

Nov 25, 2011, 12:07:22 PM
to google-a...@googlegroups.com

You have to remember that there is garbage collection, indexing, pointers, all the variables you loaded, and all the imports.
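
Part of that overhead can be seen with the standard library alone. The sketch below compares the raw payload of a list of short ASCII strings with the memory CPython reports for the objects holding them; `payload_vs_footprint` is a hypothetical helper, and the exact footprint varies by Python version and platform.

```python
import sys


def payload_vs_footprint(strings):
    """Return (characters stored, bytes reported for the list and its strings)."""
    payload = sum(len(s) for s in strings)
    # getsizeof counts each object's header and the list's pointer array,
    # which is where the "pointers" part of the overhead shows up.
    footprint = sys.getsizeof(strings) + sum(sys.getsizeof(s) for s in strings)
    return payload, footprint


# 1000 distinct short strings, roughly 13 characters each.
pages = ["x" * 10 + str(i) for i in range(1000)]
payload, footprint = payload_vs_footprint(pages)
print(payload, footprint)
```

For many small values the reported footprint is several times the payload, before counting imported modules or garbage that has not been collected yet.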

Joshua Smith

Nov 25, 2011, 2:01:26 PM
to google-a...@googlegroups.com
Good start. Now the other 67 megs?

I'm no Python expert, but if it really takes 60+ megs to just load the executing environment and code (which, seriously, is probably only a few K!), then, well, I'm speechless...

Brandon Wirtz

Nov 25, 2011, 2:19:17 PM
to google-a...@googlegroups.com

The 128 MB is a soft limit, and you start hitting it at about 110 MB, depending on the requests per second. If your RPS gets very high, garbage collection doesn’t keep up and you use more memory. Unless you are really careful about how you use your variables, you will often end up with more than one copy of things in memory. 128 MB is nothing. With a single thread it isn’t so bad, because things flush with each request, but with multiple threads everything is in memory at once.

 

Quit complaining. If you write good code, 128 MB will do; just don’t try to put your entire datastore in RAM. That’s not what it is there for.

Joshua Smith

Nov 25, 2011, 2:56:23 PM
to google-a...@googlegroups.com
As I wrote earlier in the thread, the use case I'm thinking through is similar to yours: basically a caching proxy server. It's really not a matter of "writing good code." There is hardly any code to write.

I'm not complaining, exactly. I'm more just puzzled. I can see how GC could get behind, but in that case, any reasonable VM would just call time out for a bit and get things back in line.

As you know, the bigger the available ram, the lower the bandwidth (inbound), and the lower the memcache & db ops. I'm surprised you, of all people, aren't bitching about the effective available ram being so much lower than the stated number.

I recall that you implemented your app in Java. Was the situation better over there?

-Joshua

Brandon Wirtz

Nov 25, 2011, 4:14:11 PM
to google-a...@googlegroups.com

I was much happier on Python. RAM is nice, but it’s per instance. If you have 10 instances running you may have 600 MB of RAM in total, but you really only have 60 MB per instance. I am a “proxy”, as you say, and RAM cache hits are a very small number compared to memcache hits.

 

If you are caching a site that has 1000 pages at 60 KB per page, you are not going to keep all of that in RAM on GAE. You are not going to keep all of that in memcache either. We have done some very clever things to get more stuff into cache and to serve it faster. Optimization has really been about balancing speed, cost, and resource usage.
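
The arithmetic behind that example, as a quick sketch. It takes the roughly 60 MB effective figure quoted earlier in the thread as the budget and uses 1 MB = 1024 KB; `cache_budget_fraction` is a hypothetical helper.

```python
def cache_budget_fraction(pages, kb_per_page, budget_mb):
    """Return (site size in MB, fraction of the memory budget it would consume)."""
    site_mb = pages * kb_per_page / 1024.0
    return site_mb, site_mb / budget_mb


# 1000 pages at 60 KB each against about 60 MB of effective instance memory.
site_mb, fraction = cache_budget_fraction(1000, 60, 60)
print(round(site_mb, 1), round(fraction, 2))
```

The raw page data alone comes to about 58.6 MB, roughly 98% of the effective budget, so nothing is left for object overhead or the application itself.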
