datastore advice?


Charles

Oct 26, 2010, 4:13:23 PM
to google-a...@googlegroups.com
Hi all,

I'm wondering, since the datastore is hierarchical, does the number of children an entity has affect the performance of queries on the parents themselves? For example, if I have a set of parents, say...

Jane
Margaret
Graham
Arthur

...and I have a set of children associated with those parents...

Jane
  -Sam
  -Robert
Margaret
  -Lisa
Graham
Arthur
  -Rowen
  -Jerry

...will the number of children for each parent affect the performance of querying the parents themselves? For instance, if I wanted to select all of the parents (SELECT * FROM parents), that would be easy with the data above. But, since the datastore is hierarchical, does performance suffer if, say, the parents have many thousands or even millions of children? Say, like...

Jane
  -Sam
  -Robert
  ...1 million more
Margaret
...

If so, I'm just wondering if it would make more sense to make the children root entities too, so as not to affect the performance of querying on the parents.  Anyways, hope I've explained my question well enough.

Thanks in advance!


Charles

djidjadji

Oct 26, 2010, 6:26:09 PM
to google-a...@googlegroups.com
Parent and child objects are stored in the same Bigtable node. This is
done for transactions: a transaction works on a single entity group.
If you perform a query, you use an index to find the objects needed. At
this point there is no performance penalty for parent or child objects
that match the query. The query results in a number of keys of objects
that need to be retrieved. The best query response time is achieved when
the objects to fetch are stored in as many Bigtable nodes as possible
(parallel fetch).

Why do you need so many child objects?
Can you implement it by adding a reference property for the parent to
the child object, and thus remove the need to store the whole entity
group in one Bigtable node?

Parent-child objects are needed
1) if you need transactions on them
or
2) if you want to extract the parent key given a child key.
   Perform a keys_only query on the child objects and, from these keys,
   get the set of parent keys of the objects that you need to fetch in full.
   Brett Slatkin uses this technique in a number of Google I/O talks;
   the child object has a ListProperty that is used in the keys_only
   query.

Most other applications can be implemented without the explicit parent object.
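
Point 2 above can be illustrated with a minimal sketch. It is a toy model, not the App Engine API: it assumes a key is simply a tuple of (kind, name) pairs with the root first, so a child's key contains its parent's key as a prefix. The helper names parent_of and parents_of_matches are hypothetical.

```python
# Toy model only: a key is a tuple of (kind, name) pairs, root first.
# This mimics the idea of a key path; it is not the App Engine API.

def parent_of(key):
    """Return the parent key, or None for a root entity."""
    return key[:-1] or None

def parents_of_matches(child_keys):
    """Collect the distinct parent keys of a keys-only result, in order."""
    seen = []
    for key in child_keys:
        parent = parent_of(key)
        if parent is not None and parent not in seen:
            seen.append(parent)
    return seen

jane = (("Parent", "Jane"),)
arthur = (("Parent", "Arthur"),)

# Pretend these child keys came back from a keys-only query.
matching_children = [
    jane + (("Child", "Sam"),),
    jane + (("Child", "Robert"),),
    arthur + (("Child", "Rowen"),),
]

# Only the parents are then fetched in full.
parent_keys = parents_of_matches(matching_children)
```

The parent keys are derived from the child keys alone, so no child entity has to be loaded to find out which parents to fetch.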

Ikai Lan (Google)

Oct 27, 2010, 2:27:05 PM
to google-a...@googlegroups.com
Generally speaking: no. Entity groups make data locality more likely, but this should not affect index traversal or batch reads.

A best practice, however, is to keep entity groups as small as possible. There aren't many compelling reasons to not use root entities if you don't need transactions, as working with root entities is generally simpler.

--
Ikai Lan 
Developer Programs Engineer, Google App Engine
Message has been deleted

Robert Kluin

Oct 28, 2010, 1:51:46 AM
to google-a...@googlegroups.com
There are no "tables" on App Engine, only Kinds, which are arbitrary
collections of properties. In fact, all applications' entities are
stored in a single Bigtable "table."

An entity group is a logical group of entities that are
physically stored together; this allows that group of entities to be
updated in a transaction.

If you make entity groups with lots and lots of entities in them, your
data cannot be sharded across Bigtable tablets. That means that if you
try to write 5 entities in that group, the writes are done
serially. Fetches will also be less efficient for the same reason.

You can read all about these topics:
http://code.google.com/appengine/articles/datastore/overview.html
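
To make the serial-write point concrete, here is a rough sketch. It is a toy model and not the App Engine API: it assumes a key is a tuple of (kind, name) pairs with the root first, that writes inside one entity group run one after another, and that different entity groups can be written in parallel. The function name write_rounds is made up for the illustration.

```python
from collections import Counter

# Toy model, not the App Engine API: a key is a tuple of (kind, name)
# pairs, root first, so the first pair identifies the entity group.

def write_rounds(keys):
    """Rounds needed if writes within one entity group run one after
    another while different entity groups are written in parallel."""
    per_group = Counter(key[0] for key in keys)
    return max(per_group.values()) if per_group else 0

jane = ("Parent", "Jane")

# Five children that all share Jane as their root: one entity group.
grouped = [(jane, ("Child", "c%d" % i)) for i in range(5)]

# The same five entities stored as roots: five entity groups.
roots = [(("Child", "c%d" % i),) for i in range(5)]

grouped_rounds = write_rounds(grouped)  # 5
roots_rounds = write_rounds(roots)      # 1
```

Under these assumptions the five writes to one group take five rounds, while five root entities can be written in a single round.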


Robert


On Wed, Oct 27, 2010 at 21:40, sodso <sodhiso...@gmail.com> wrote:
> Hi Ikai Lan,
>
> I have seen this in so many discussions here in the forums:
>
> "A best practice, however, is to keep entity groups as small as
> possible"
>
> As far as I understand it, one Model class should not have too many
> records, i.e. one table should not have many records.
>
> Isn't this strange? When we talk of GAE, we talk of scalability, but
> if the best practice says to limit the records to only a few in a
> Model class, what use is the scalability? If we store millions of
> records (e.g. customer records) in one db.Model table, are you
> saying it would really impact the performance?
>
> Please correct me if I am wrong. I am unclear whether my app can
> survive on GAE if it's working with millions of records in one
> db.Model table.
>
> Thanks,
> Sodso

sodso

Oct 29, 2010, 12:27:15 AM
to Google App Engine
My apologies for the wording; I meant the same thing.

But my question still remains: if we have to store 1 million customer
records, do we create 100 different "Kinds" ~= Entity Groups ~=
db.Model and distribute those 1 million records among those entity
groups?

Then what about scalability?
Does this mean more records = more entity groups to be created?
Isn't this an awkward way of doing simple things?

Thanks.


djidjadji

Oct 29, 2010, 5:37:45 AM
to google-a...@googlegroups.com
No. You just create 1 million "Entity Groups" (one root entity each),
resulting in 1 million instances of customer records.
Reread the answer from Robert.
You can view the Bigtable DATASTORE (notice the absence of "database",
a word usually associated with tables) as one big hashtable. You have a
key that designates an object instance; the datastore retrieves the
object and uses the model definition to convert the stored data
(a protocol buffer) to an instance of the Model class.
As part of the datastore you also have index lists that sort keys based
on some property of the object instances.
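
As a rough sketch of that mental model (a toy, in-memory stand-in and not how the datastore is actually implemented): one dict plays the "big hashtable", and one sorted list plays an index on a single property. The names put and keys_with_age_at_least are invented for the illustration, and each key is assumed to be written only once.

```python
import bisect

entities = {}   # key -> stored properties (the "big hashtable")
age_index = []  # sorted (age, key) pairs (one "index list")

def put(key, props):
    """Store the entity and add its key to the age index.
    Assumes each key is written once; re-puts are not handled."""
    entities[key] = props
    bisect.insort(age_index, (props["age"], key))

def keys_with_age_at_least(min_age):
    """Scan the index only; no entity is touched at this stage."""
    start = bisect.bisect_left(age_index, (min_age,))
    return [key for _, key in age_index[start:]]

put("bobb", {"username": "bobb", "age": 77})
put("ann", {"username": "ann", "age": 31})
put("cy", {"username": "cy", "age": 45})

# A query is an index scan that yields keys...
matching_keys = keys_with_age_at_least(40)
# ...followed by plain lookups by key.
matching_users = [entities[key] for key in matching_keys]
```

The query cost here depends on the index entries that match, not on how many other entities live in the dict, which is the point being made about one Kind holding millions of records.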


Eli Jones

Oct 30, 2010, 1:34:27 PM
to google-a...@googlegroups.com
Sodso,

The phrase "Entity Group" refers to the specific case where entities are related through a parent (one is the parent of the other). The related entities can come from different db.Model classes or from the same one.

If you just create a regular db.Model like this:

from google.appengine.ext import db

class myUsers(db.Model):
    username = db.StringProperty(required=True)
    age = db.IntegerProperty(required=True)

And then you create a new user like so:

newUser = myUsers(username='bobb', age=77)

That newly created entity has an "Entity Group" size of 1. (You can create billions of myUsers entities with no problem; the Entity Group size will be 1 for each entity in the myUsers db.Model.)

BUT, if you have another db.Model defined like this:

class coolPeopleClub(db.Model):
    clubname = db.StringProperty(required=True)
    coolnessfactor = db.FloatProperty(required=True)

and then you do something like this:

newClub = coolPeopleClub(clubname="surly bikers", coolnessfactor=1422.39)
newClub.put()  # save first: a parent needs a complete key
newUser = myUsers(username='bobbb', age=77, parent=newClub)

That new user has an "Entity Group" size of 2.

You can have parents of parents of parents of parents, etc., and make larger and larger "Entity Group" sizes for an entity.

But, as others have mentioned, you don't want to give an entity a parent unless you need transactions.

Anyway, do not confuse the official phrase "Entity Group" with the group of entities that belong to the same db.Model class.
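
To put numbers on the examples above, here is a small sketch that counts entity group sizes. It is a toy model rather than the App Engine API: a key is assumed to be a tuple of (kind, name) pairs with the root first, and entity_group_sizes is a hypothetical helper.

```python
from collections import Counter

# Toy model, not the App Engine API: a key is a tuple of (kind, name)
# pairs, root first. Every entity belongs to the group of its root.

def entity_group_sizes(keys):
    """Count how many entities share each root."""
    return Counter(key[0] for key in keys)

# A user with no parent is its own root.
bobb = (("myUsers", "bobb"),)

# A club, and a user created with that club as its parent.
club = (("coolPeopleClub", "surly bikers"),)
bobbb = club + (("myUsers", "bobbb"),)

sizes = entity_group_sizes([bobb, club, bobbb])
bobb_group = sizes[("myUsers", "bobb")]                 # 1
club_group = sizes[("coolPeopleClub", "surly bikers")]  # 2
```

Both users are of the same Kind, yet they sit in different entity groups, because the group is decided by the root of the key and not by the db.Model class.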

sodso

Oct 30, 2010, 11:39:50 PM
to Google App Engine
Thanks everyone, I got the point :)

Cheers !
