Richard, I would advise going with the JSON property. In our project
we use JSON intensively and update it in task queues & backends.
Actually we have a rule: every page should make just 3-5 DB requests.
In the future we would consider moving from JSON to ProtoBuf but not for
Also we've moved some rarely changed dictionaries (like geo locations
- e.g. all cities in the world) into the Python code. That pushed us
to use F2 instances due to higher memory demand but resulted in lower
latency and almost the same cost. It's cheaper to upload a new version
of the app when needed.
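
As a rough illustration of the JSON approach for Richard's case (a few
thousand small, rarely changing records), here is a minimal sketch using
only the standard json module. The record fields and the pack() and
unpack_sorted() helpers are made-up names for illustration, not anything
from our code base. In a real app the list would presumably live in a
single entity (for example in an NDB JsonProperty, which as far as I know
handles the JSON encoding itself), so one get replaces a large query.

```python
import json

# Hypothetical records standing in for the small, rarely changing entities.
records = [
    {"id": 1, "name": "alpha", "score": 30},
    {"id": 2, "name": "beta", "score": 10},
    {"id": 3, "name": "gamma", "score": 20},
]


def pack(rows):
    """Serialize all rows into one compact JSON string."""
    return json.dumps(rows, separators=(",", ":"))


def unpack_sorted(blob, sort_key, exclude_ids=()):
    """Decode the blob, drop excluded ids, and sort the rest in memory.

    The exclusion plays the role of a NOT IN filter and the sort replaces
    a datastore index, both done in plain Python.
    """
    excluded = set(exclude_ids)
    rows = [r for r in json.loads(blob) if r["id"] not in excluded]
    return sorted(rows, key=lambda r: r[sort_key])


blob = pack(records)
top = unpack_sorted(blob, "score", exclude_ids=[2])
```

Sorting and filtering a few thousand small dicts this way is cheap
compared with fetching them one entity at a time, but the exact numbers
depend on your data, so measure it with AppStats before committing.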
Dev lead at http://www.myclasses.org/ project
On Apr 24, 6:07 pm, Richard Arrano <rickarr...@gmail.com> wrote:
> Thank you for the quick and very informative reply. I wasn't even
> aware this was possible with NDB. How would those x.yref.get() calls
> show up in AppStats? Or would they at all if it's just pulling it from
> Thank you Kaan as well, I will actually experiment with the
> PickleProperty and see what's faster. I like that solution because the
> X kind is not one I expect to be heavily cached so I don't mind
> actually caching the pickled instance as I expect them to be evicted
> within a relatively short amount of time.
> I also wanted to ask: I saw someone did a speed test with NDB and I
> noticed he was pulling 500 entities of 40K and in the worst-case 0%
> cache hit scenario, it took something like 8-10 seconds. I was
> actually planning to have a piece of my application regularly query
> and cache ~2500 entities (of 2500) and sort on it to avoid a huge
> amount of indices (and a NOT IN filter that would really slow things
> down). Is this feasible or would you expect his results to scale, i.e.
> 500 entities with 0% cache hits * 5 ~= 40-50s in my usage scenario? Or
> was there something unique to his situation with his indices and large
> amount of data? In mine each entity has about 10 properties with zero
> indices. If this is the case I'll probably copy the entities into a
> JsonProperty that occasionally gets updated and simply query/cache
> that since I don't expect the 2500 entities to change very often.
> On Apr 24, 12:59 pm, Guido van Rossum <gu...@google.com> wrote:
> > On Monday, April 23, 2012 10:21:26 PM UTC-7, Richard Arrano wrote:
> > > I'm switching from db to ndb and I have a question regarding caching:
> > > In the old db, I would have a class X that contains a reference to a
> > > class Y. The Y type would be accessed most frequently and rarely
> > > change. So when I would query an X and retrieve the Y type it points
> > > to, I would store X in the memcache with the actual instance Y rather
> > > than the key. If X is invalidated in the memcache, then so is the Y
> > > instance but otherwise I would skip the step of querying Y upon re-
> > > retrieving X from the memcache. Is there any way to do this in ndb? Or
> > > must I re-query each Y type even if it is from memcache or context?
> > If you leave the caching to NDB, you probably needn't worry about this
> > much. It's going to be an extra API call to retrieve Y (e.g. y =
> > x.yref.get()) but that will generally be a memcache roundtrip. If you are
> > retrieving a lot of Xes in one query, there's a neat NDB idiom to prefetch
> > all the corresponding Ys in one roundtrip:
> >     xs = MyModel.query(...).fetch()
> >     _ = ndb.get_multi([x.yref for x in xs])
> > This effectively throws away the ys, but populates them in the context
> > cache. After this, for any x in xs, the call x.yref.get() will use the
> > context cache, which is a Python dict in memory. (Its lifetime is one
> > incoming HTTP request.)
> > You can even postpone waiting for the ys, using an async call:
> >     xs = MyModel.query(...).fetch()
> >     _ = ndb.get_multi_async([x.yref for x in xs])
> > Now the first time you reference some x.yref.get() it will block for the
> > get_multi_async() call to complete, and after that all subsequent
> > x.yref.get() calls will be satisfied from memory (no server roundtrip at
> > all).