The most efficient way to enumerate variations of properties in a subset

Tom Fishman

Aug 13, 2011, 4:29:51 PM
to google-a...@googlegroups.com
Say there are the following entities:

{ID:1,  width: 5, height: 11, ... },
{ID:2,  width: 5, height: 12, ... },
{ID:3,  width: 5, height: 12, ... },
{ID:4,  width: 6, height: 13, ... },
{ID:5,  width: 5, height: 12, ... },
{ID:6,  width: 5, height: 13, ... },
{ID:7,  width: 5, height: 12, ... },
...

What's the most efficient way to return the set of distinct heights (duplicate values merged) for all entities with width==5? (The answer should be 11, 12, 13.)

We can build a query to enumerate all entities where width==5 and then build the set in code (Java/Python). But this is not scalable: thousands of entities might share the same value.
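For reference, the "enumerate, then build the set in code" approach can be sketched as follows. This is plain Python over an in-memory list that stands in for the query results; the name distinct_heights is made up for illustration and is not a datastore API.

```python
# Stand-in for the entities in the example above (extra properties omitted).
entities = [
    {"ID": 1, "width": 5, "height": 11},
    {"ID": 2, "width": 5, "height": 12},
    {"ID": 3, "width": 5, "height": 12},
    {"ID": 4, "width": 6, "height": 13},
    {"ID": 5, "width": 5, "height": 12},
    {"ID": 6, "width": 5, "height": 13},
    {"ID": 7, "width": 5, "height": 12},
]


def distinct_heights(entities, width):
    """Visit every entity with the given width and merge equal heights."""
    return sorted({e["height"] for e in entities if e["width"] == width})


print(distinct_heights(entities, 5))  # [11, 12, 13]
```

The cost is proportional to the number of matching entities, not to the number of distinct heights, which is the scalability concern.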

I wish we could query the indexes directly...

Thanks!
- Tom

Carter Maslan

Aug 13, 2011, 10:29:46 PM
to google-a...@googlegroups.com
http://code.google.com/p/appengine-mapreduce/ is a good way, if by "efficient" you mean performed efficiently on App Engine.
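To illustrate the shape of such a job, here is a minimal plain-Python sketch of the map and reduce steps. It does not use the appengine-mapreduce API; the function names and the in-memory driver are assumptions made for illustration only.

```python
from collections import defaultdict

# Stand-in for the stored entities.
entities = [
    {"ID": 1, "width": 5, "height": 11},
    {"ID": 2, "width": 5, "height": 12},
    {"ID": 4, "width": 6, "height": 13},
    {"ID": 6, "width": 5, "height": 13},
    {"ID": 7, "width": 5, "height": 12},
]


def map_entity(entity):
    """Map step: emit one (width, height) pair per entity."""
    yield entity["width"], entity["height"]


def reduce_heights(width, heights):
    """Reduce step: merge duplicate heights for one width."""
    return width, sorted(set(heights))


def run_job(entities):
    """Tiny in-memory driver: group mapped pairs by key, then reduce."""
    groups = defaultdict(list)
    for entity in entities:
        for width, height in map_entity(entity):
            groups[width].append(height)
    return dict(reduce_heights(w, hs) for w, hs in groups.items())


result = run_job(entities)
print(result[5])  # [11, 12, 13]
```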


--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/RJShf4x-MgcJ.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

Robert Kluin

Aug 14, 2011, 1:17:49 AM
to google-a...@googlegroups.com
If you need the queries to run 'interactively,' you might want to
implement 'custom indexes' to keep track of all heights for width 5.
I do this in some apps; it makes 'queries' super fast at the expense
of increased write cost. In this case you could create a model whose
key_name is the width value and which has a property to store the
heights. (The list of heights need not be indexed.)
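A rough sketch of that idea, using an in-memory dict as a stand-in for the datastore. The class HeightIndexStore and its methods are hypothetical names, not App Engine APIs; in a real app the row would be a model instance fetched by key_name.

```python
class HeightIndexStore:
    """In-memory stand-in for a kind whose key_name is the width value."""

    def __init__(self):
        # key_name (the width as a string) -> list of known heights
        self._rows = {}

    def add(self, width, height):
        """Record a height for a width; return True if the row changed."""
        heights = self._rows.setdefault(str(width), [])
        if height in heights:
            return False  # nothing new, so no extra write would be needed
        heights.append(height)
        return True

    def heights_for(self, width):
        """A single lookup by key_name answers the whole question."""
        return sorted(self._rows.get(str(width), []))


store = HeightIndexStore()
for width, height in [(5, 11), (5, 12), (5, 12), (6, 13), (5, 13)]:
    store.add(width, height)

print(store.heights_for(5))  # [11, 12, 13]
```

Reads become one get by key, while every entity write may also have to update the index row, which is the increased write cost mentioned above.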


Robert

Tom Fishman

Aug 14, 2011, 3:22:03 AM
to google-a...@googlegroups.com
Carter's method is good for slow updates, and I will definitely use it.

I do need an "interactive solution" in the meantime. Robert, your method touches two entity groups, so the results might be inconsistent when one of the transactions fails.

I really hope we can query/count the index without touching the entities. I'm sure it is technically doable.

So, I don't think I have a robust/efficient solution now.

Tom Fishman

Aug 14, 2011, 3:44:43 AM
to google-a...@googlegroups.com
My understanding of App Engine indexes is that, for our case, we could build an index as

{ width (ascend), height(ascend), key }

so we can run GQL queries such as "width=5 AND height>0" or "width=5 AND height<15". So all the information we need is already in the index, ready to be retrieved efficiently.
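To make the point concrete, here is a conceptual sketch of how distinct heights could be read from such an index by skipping over duplicates. It models the index as a sorted Python list of (width, height, key) tuples; it is not a datastore API, and the function name is made up for illustration.

```python
import bisect
import math

# Index rows sorted as (width, height, key), mirroring the index above.
index = sorted([
    (5, 11, 1), (5, 12, 2), (5, 12, 3), (6, 13, 4),
    (5, 12, 5), (5, 13, 6), (5, 12, 7),
])


def distinct_heights_from_index(index, width):
    """Read one row per distinct height, then jump past its duplicates."""
    heights = []
    # (width,) sorts before every (width, height, key) row for that width.
    pos = bisect.bisect_left(index, (width,))
    while pos < len(index) and index[pos][0] == width:
        height = index[pos][1]
        heights.append(height)
        # Skip all rows that share this (width, height) pair.
        pos = bisect.bisect_right(index, (width, height, math.inf))
    return heights


print(distinct_heights_from_index(index, 5))  # [11, 12, 13]
```

The number of index probes grows with the number of distinct heights rather than with the number of entities.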

Robert Kluin

Aug 15, 2011, 12:36:55 AM
to google-a...@googlegroups.com
Hey Tom,
Yup, the solution I suggested will touch multiple entity groups.
That means the best we can do is probably eventual consistency, but it
can still be really fast.

Check out transactional tasks. If you've got low update rates or you
don't expect a great deal of contention, you could simply push an
'update' task to update your indexes.
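The flow might look roughly like this. The sketch uses a deque as a stand-in for the task queue and plain dicts for the entities and the index; none of the names are App Engine APIs, and on App Engine the enqueue would happen inside the same transaction as the entity write.

```python
from collections import deque

task_queue = deque()   # stand-in for the task queue
entities = {}          # ID -> entity (the primary data)
height_index = {}      # width -> set of heights (the custom index)


def save_entity(entity):
    """Write the entity and enqueue the index update together."""
    entities[entity["ID"]] = entity
    task_queue.append((entity["width"], entity["height"]))


def run_pending_tasks():
    """Worker: apply queued updates; the index lags until this runs."""
    while task_queue:
        width, height = task_queue.popleft()
        height_index.setdefault(width, set()).add(height)


save_entity({"ID": 1, "width": 5, "height": 11})
save_entity({"ID": 2, "width": 5, "height": 12})
stale = dict(height_index)      # still empty: the tasks have not run yet
run_pending_tasks()
print(sorted(height_index[5]))  # [11, 12]
```

Between the write and the task running, a reader sees the old index, which is the eventual consistency described above.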

If you expect high write rates, or a lot of contention, you should
look into batching the updates. There are many batching techniques,
depending on your needs. If you'd like something that will be quite
accurate, check out 'slagg'. I've included an example that is quite
similar to what you're wanting (see examples/basic_group_indexor.py).
Sorry for the lack of detailed documentation; I'm working on that,
along with some large code improvements, but have been swamped with
other projects recently.

https://bitbucket.org/thebobert/slagg
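As a generic illustration of batching (this is not slagg's API, which is not shown in this thread), pending updates can be merged so that each index row is written at most once per batch. All names below are hypothetical.

```python
from collections import defaultdict


def batch_updates(pending):
    """Collapse many (width, height) updates into one set per width."""
    batched = defaultdict(set)
    for width, height in pending:
        batched[width].add(height)
    return batched


def apply_batches(index, batched):
    """Apply one merged write per width; return the number of writes."""
    writes = 0
    for width, heights in batched.items():
        current = index.setdefault(width, set())
        if not heights <= current:  # skip rows that would not change
            current |= heights
            writes += 1
    return writes


pending = [(5, 11), (5, 12), (5, 12), (6, 13), (5, 12), (5, 13), (5, 12)]
index = {}
writes = apply_batches(index, batch_updates(pending))
print(writes)  # 2
```

Seven pending updates collapse into two index writes here, one per width.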


Also, I agree with you about the indexes. I'd love to be able to
only fetch the "index key" rows too.

Robert
