Query returning about 2000 results?

36 views
Skip to first unread message

K. Anderson

unread,
Nov 21, 2011, 2:58:09 PM11/21/11
to Google App Engine
I run an application which manages large grid schematics. A schematic
has about 2000 grid elements (transformers, power lines, voltage
regulators, etc). There are about 40 different types of grid
elements, which are mostly set in stone. Each grid element has about
40 attribute (voltage rating, install date, etc). The attributes are
not necessarily fixed (attributes could be added semi-frequently).

Here is are some example rows for a single schematic (schematicId=1)
which contains a Transformer, a PowerLine, and a Regulator

ElementId, SchematicId, ElementType, ElementAttributes

Example rows:
1, 1, Transformer, <voltage>5</voltage><install_date>5/1/2011</
install_date>
2, 1, PowerLine, <voltage>5</voltage><install_date>5/1/2011</
install_date>
3, 1, Regulator, <rating>5KW</rating><install_date>5/1/2011</
install_date>


When the user wants to a load a schematic, I simply query this table
for all entries with the associated schematicId (this returns about
2000 rows). The problem is that this process takes about 2 minutes.
(even though the data is only about 10MB).

Suggestions for improvement?

Jeff Schnitzer

unread,
Nov 21, 2011, 8:50:15 PM11/21/11
to google-a...@googlegroups.com
2 minutes to pull 2000 rows is crazy slow.  That doesn't sound right.  I would expect a few seconds.

One thing you should consider is putting all the elements under a common 'schematic' parent and fetching them with an ancestor query.  This should be faster than walking a regular index.

Jeff


--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.




--
I am the 20%

GordonHo

unread,
Nov 21, 2011, 9:31:50 PM11/21/11
to google-a...@googlegroups.com
i am having sort of the same problem.

i need to catch about 4000 entries for a specific frontend user - i could split up the query, but it is really a slow thing unless i am using memcache (which i haven't implemented for real).

my experience is about 50 seconds for all 4000 entries. this does not involve any ancestors, or filtering. only ordering by an index.

cheers
gordon

Jeff Schnitzer

unread,
Nov 21, 2011, 11:27:47 PM11/21/11
to google-a...@googlegroups.com
The way I understand this (maybe a googler can correct me if I'm wrong):

A query on an index walks the index bigtable getting keys, then makes separate fetches to the main data bigtable to get the entity data.  Each entity could live in a different tablet so there might be a lot of RPCs and a lot of different servers involved.

A kindless ancestor query (and filtering on __key__ in general) doesn't use the separate index table - it walks the main data bigtable itself and the full data records are right there in sequence.  This is analogous to a clustered index in an RDBMS, and should be really fast.

So if you have a big chunk of related data that you want to store in component parts (as opposed to embedded in a single entity) but you need to pull up at once, give them a single common parent and use a kindless ancestor query to suck it all in.  And then post performance metrics here because I haven't tried this and I'm really curious to know if it makes a big difference :-)

Jeff

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/nH5CMYqe4PIJ.

To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.

Brian Quinlan

unread,
Nov 21, 2011, 11:30:17 PM11/21/11
to google-a...@googlegroups.com
Hi Kyle,

Could you post your query code?

Cheers,
Brian

> --
> You received this message because you are subscribed to the Google Groups "Google App Engine" group.

Nick Johnson

unread,
Nov 21, 2011, 11:49:55 PM11/21/11
to google-a...@googlegroups.com
Hi Kyle,

It sounds like you generally fetch a schematic all at once. Rather than storing them as separate datastore objects, why not serialize them all as a single "schematic" object, which you fetch and store as a single unit?

-Nick Johnson

On Tue, Nov 22, 2011 at 6:58 AM, Kyle Anderson <kyleand...@gmail.com> wrote:
--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.




--
Nick Johnson, Developer Programs Engineer, App Engine


Jeff Schnitzer

unread,
Nov 22, 2011, 6:51:05 AM11/22/11
to google-a...@googlegroups.com
On Tue, Nov 22, 2011 at 12:49 AM, Nick Johnson <nickj...@google.com> wrote:
Hi Kyle,

It sounds like you generally fetch a schematic all at once. Rather than storing them as separate datastore objects, why not serialize them all as a single "schematic" object, which you fetch and store as a single unit?


Presumably at 10MB of data that's not an option.  But maybe he could condense it to fewer entities. 

Jeff
Reply all
Reply to author
Forward
0 new messages