Splitting the provided query requires that too many subqueries are merged in memory.


Flavio Freitas

May 17, 2013, 9:19:51 AM
to objectify...@googlegroups.com
Hello, people

I'm using the following example from the guide website:

List<Car> cars = ofy().load().type(Car.class).filter("year in", yearList).list();

but I'm getting an error:

Splitting the provided query requires that too many subqueries are merged in memory.

I searched for it and read that the list (yearList) may contain at most 30 elements, but in my case it can have many more than that. I'm actually using a different query, but the idea is the same: I'm querying for all the activities from my friends.

Some people also suggested doing the merge in memory, but that won't be very efficient.

Any idea how I can solve this problem?

Thanks in advance ;)

Jeff Schnitzer

May 17, 2013, 10:46:42 AM
to objectify...@googlegroups.com
This is a fundamental limitation of appengine. By the way, IN queries are merged in memory - it's just done for you. It's the same work if you do it yourself, you just have to write more code.
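For illustration, doing the merge yourself could look roughly like the sketch below. It is a minimal sketch, not Objectify's or the datastore's actual implementation: the class `BatchedInQuery`, the helpers `partition` and `queryInBatches`, and the `runInQuery` and `keyOf` callbacks are all hypothetical names, and the limit of 30 is the figure quoted earlier in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: splits a long value list into small IN queries and
// merges the results in memory.
class BatchedInQuery {
    // Maximum number of values per IN filter, as quoted in this thread.
    static final int MAX_IN_VALUES = 30;

    /** Splits values into consecutive sublists of at most batchSize elements. */
    static <V> List<List<V>> partition(List<V> values, int batchSize) {
        List<List<V>> batches = new ArrayList<>();
        for (int i = 0; i < values.size(); i += batchSize) {
            batches.add(values.subList(i, Math.min(i + batchSize, values.size())));
        }
        return batches;
    }

    /**
     * Runs one query per batch and merges the results, dropping duplicates.
     * runInQuery stands in for the real datastore call (an assumption);
     * keyOf extracts whatever identifies an entity uniquely.
     */
    static <V, T, K> List<T> queryInBatches(List<V> values,
                                            Function<List<V>, List<T>> runInQuery,
                                            Function<T, K> keyOf) {
        Map<K, T> merged = new LinkedHashMap<>();
        for (List<V> batch : partition(values, MAX_IN_VALUES)) {
            for (T entity : runInQuery.apply(batch)) {
                // Keep the first copy of each entity seen across batches.
                merged.putIfAbsent(keyOf.apply(entity), entity);
            }
        }
        return new ArrayList<>(merged.values());
    }
}
```

With Objectify, `runInQuery` would presumably wrap the query from the original question, for example `batch -> ofy().load().type(Car.class).filter("year in", batch).list()`. Every batch is still a separate round trip and all results end up in memory, so this removes the error but not the cost.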

Generally the solution to problems like this is to pre-index your search terms more coarsely. For example, maybe you would rather search by decade and manually exclude some years?  Index a 'decade' property. Or five-year interval.
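A rough sketch of the decade idea, under stated assumptions: the class `DecadeIndex` and its methods are hypothetical, and the stored, indexed `decade` property is something you would have to add to the entity yourself. The code only shows the arithmetic and the manual exclusion step, not the datastore query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for a coarser "decade" index on top of a "year" field.
class DecadeIndex {
    /** Value to store in the indexed 'decade' property, e.g. 1987 -> 1980. */
    static int decadeOf(int year) {
        return Math.floorDiv(year, 10) * 10;
    }

    /** Collapses a long list of years into the (much shorter) set of decades to query. */
    static Set<Integer> decadesFor(Iterable<Integer> years) {
        Set<Integer> decades = new TreeSet<>();
        for (int year : years) {
            decades.add(decadeOf(year));
        }
        return decades;
    }

    /** Manual exclusion: keeps only the fetched years that were actually requested. */
    static List<Integer> keepWantedYears(List<Integer> fetchedYears, Set<Integer> wantedYears) {
        List<Integer> kept = new ArrayList<>();
        for (Integer year : fetchedYears) {
            if (wantedYears.contains(year)) {
                kept.add(year);
            }
        }
        return kept;
    }
}
```

The query would then filter on the decade values instead of the years, and the unwanted years are dropped after loading. The trade-off is that some entities are fetched only to be discarded.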

Jeff

--
You received this message because you are subscribed to the Google Groups "objectify-appengine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to objectify-appen...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Flavio Freitas

May 17, 2013, 12:19:33 PM
to objectify...@googlegroups.com
Thank you for the reply, Jeff.

Can you help with this? What I'm trying to do is get all the activities of my friends. So, instead of years, I pass a list of my friends' IDs, which is normally much bigger than 30. What can I do to solve this? Have you faced something like this?

Nicholas Okunew

May 17, 2013, 10:27:51 PM
to objectify...@googlegroups.com
I think the suggestion is to partition the data somehow, for example 'all the activities from my friends in the last day' or 'the most recent 200 activities across all my friends', then do some sort of pagination.
Ultimately, you're going to run into memory issues with any solution, so some form of pagination or partitioning will be required. If you're planning on any scale on a particular type of data, then 'get everything' is not really an option.

It may be useful to set aside the technology limitation and think of it from a user's point of view: how could they meaningfully understand or read all of that on one screen? If it's too much info to consume, then there's probably no point fetching it.
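As one possible reading of "the most recent 200 activities across all my friends", the sketch below merges per-friend result lists that are already sorted newest first and stops at a fixed limit. It is only an in-memory illustration: `RecentActivityMerge`, `Activity`, and `mostRecent` are hypothetical names, and how each per-friend list is fetched is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: bounded merge of per-friend activity lists.
class RecentActivityMerge {
    /** Minimal stand-in for an activity entity. */
    static final class Activity {
        final String friendId;
        final long timestamp;

        Activity(String friendId, long timestamp) {
            this.friendId = friendId;
            this.timestamp = timestamp;
        }
    }

    /**
     * Returns at most 'limit' activities, newest first. Each inner list must
     * already be sorted newest first (e.g. by an ordered query per friend).
     */
    static List<Activity> mostRecent(List<List<Activity>> perFriendNewestFirst, int limit) {
        // Each cursor is {listIndex, position}; the heap orders cursors by the
        // timestamp they point at, largest (newest) first.
        PriorityQueue<int[]> heads = new PriorityQueue<>(
                Comparator.comparingLong(
                        (int[] c) -> perFriendNewestFirst.get(c[0]).get(c[1]).timestamp)
                        .reversed());
        for (int i = 0; i < perFriendNewestFirst.size(); i++) {
            if (!perFriendNewestFirst.get(i).isEmpty()) {
                heads.add(new int[] {i, 0});
            }
        }
        List<Activity> page = new ArrayList<>();
        while (!heads.isEmpty() && page.size() < limit) {
            int[] cursor = heads.poll();
            List<Activity> list = perFriendNewestFirst.get(cursor[0]);
            page.add(list.get(cursor[1]));
            // Advance this friend's cursor if the list has more items.
            if (cursor[1] + 1 < list.size()) {
                heads.add(new int[] {cursor[0], cursor[1] + 1});
            }
        }
        return page;
    }
}
```

Because the merge stops at `limit`, memory use is bounded by the page size plus one cursor per friend, rather than by the total number of activities.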

Jeff Schnitzer

May 17, 2013, 10:39:14 PM
to objectify...@googlegroups.com
An alternative is to pre-index the data. It sounds like you are trying to build a wall via scatter-gather (at least, that's how I am interpreting "activities" - maybe you mean something different). That doesn't scale, especially not on GAE. Check out the comments about Tumblr's dashboard:


Tumblr's solution is to make an index of your inbox (or wall/dashboard/whatever). When someone posts something and you are following them, you get indexed into the post. It gets tricky when you have zillions of followers; watch Brett Slatkin's talk on the Relation Index Entity pattern:


Jeff

David Fuelling

May 19, 2013, 2:20:31 PM
to objectify...@googlegroups.com, je...@infohazard.org
Here's a link to how Jaiku tackled the activity-feed problem via pre-indexing. The code is Python, but you should be able to figure out how to accomplish the same thing in Java:

Casey

Feb 5, 2014, 12:20:58 PM
to objectify...@googlegroups.com, je...@infohazard.org
I found the Tumblr post to be quite informative, specifically where they talk about creating an inbox of activity for each user:
  • An inbox is the opposite of scatter-gather. A user’s dashboard, which is made up of posts from followed users and actions taken by other users, is logically stored together in time order.
  • Solves the scatter-gather problem because it’s an inbox. You just ask what is in the inbox, so it’s less expensive than going to each user a user follows. This will scale for a very long time.
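To make the inbox idea concrete, here is a minimal in-memory sketch of fan-out on write. It is not Tumblr's implementation and uses no datastore: `InboxFanOut`, `follow`, `post`, and `readInbox` are hypothetical names, and in a real App Engine app the maps would be replaced by persisted entities.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of per-user inboxes (fan-out on write).
class InboxFanOut {
    private final Map<String, Set<String>> followersOf = new HashMap<>();
    private final Map<String, Deque<String>> inboxOf = new HashMap<>();

    /** Records that 'follower' wants to see activities from 'author'. */
    void follow(String follower, String author) {
        followersOf.computeIfAbsent(author, k -> new HashSet<>()).add(follower);
    }

    /** Write path: copies the activity into every follower's inbox, newest first. */
    void post(String author, String activity) {
        Set<String> followers =
                followersOf.getOrDefault(author, Collections.<String>emptySet());
        for (String follower : followers) {
            inboxOf.computeIfAbsent(follower, k -> new ArrayDeque<>()).addFirst(activity);
        }
    }

    /** Read path: a single lookup of one inbox, with no per-friend queries. */
    List<String> readInbox(String user, int limit) {
        List<String> page = new ArrayList<>();
        Deque<String> inbox = inboxOf.getOrDefault(user, new ArrayDeque<String>());
        for (String activity : inbox) {
            if (page.size() == limit) {
                break;
            }
            page.add(activity);
        }
        return page;
    }
}
```

The cost moves from read time to write time: reading a dashboard is one lookup regardless of how many users are followed, while a post by someone with many followers triggers many inbox writes, which is the hard case mentioned earlier in the thread.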


On Friday, May 17, 2013 8:39:14 PM UTC-6, Jeff Schnitzer wrote: