AppEngine entity modeling - minimizing entity groups + achieving atomic cascading

Harishankar Nagarajan

Dec 3, 2010, 5:53:07 AM
to Google App Engine
Hello,
I am learning App Engine, have started developing a new app, and want
to clarify something.

I understand that
a. To update/delete several entities atomically, we need to do it in
a transaction, and hence they should all fall under the same entity
group
b. Having big entity groups is not scalable, as it causes contention.
(Q1: Correct?)

So here is an entity model of an online examination system for the sake
of discussion:

Entities:
Subject
Exam
Page
Question
Answer

As you can see, from the top down each entity has a one-to-many
relationship with the one immediately below it, i.e. 1 Subject can have
many Exams, 1 Exam -> many Pages, 1 Page can have many Questions...

I would like to establish a cascading update/delete relationship among
these entities. (The JPA DataNucleus App Engine implementation supports
this under the hood by putting all entities under the same entity group
(Q2: Correct?), though App Engine doesn't natively support this
constraint.) So naturally all of them would go under the same entity
group, so that
a. I can delete a Page (if my user does) in a transaction and be sure
that all of its Questions and Answers are deleted too
b. or I can delete a Subject altogether in a transaction and clear
everything underneath it
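
A minimal sketch of why the cascading model ends up in one entity group, assuming keys are modelled as ancestor paths (tuples of kind/id pairs) and that an entity group is identified by the root of that path. The kinds come from the model above; the ids and the `entity_group` helper are hypothetical, and this is plain Python, not the datastore API:

```python
def entity_group(key_path):
    """Return the root (kind, id) pair of an ancestor path."""
    return key_path[0]


# Hypothetical keys: each child key extends its parent's path.
subject = (("Subject", "maths"),)
exam = subject + (("Exam", "midterm"),)
page = exam + (("Page", 1),)
question = page + (("Question", 1),)
answer = question + (("Answer", 1),)

# A second, unrelated subject starts its own path and so its own group.
other_subject = (("Subject", "physics"),)

keys = [subject, exam, page, question, answer]
groups = {entity_group(k) for k in keys}

print(groups)                       # a single group for the whole hierarchy
print(entity_group(other_subject))  # a different group
```

Every write under one Subject then goes through that single group, which is where the contention concern comes from.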

So when I extend this to my real app, I see that all (or at least
most) of my entities are interrelated and would fit into the same entity
group so that I can transact on them together, which makes my model
inefficient.

Q3: Please advise on how to rethink this design (and on the best
practice) and still achieve what I need. Ask me for more details if
needed. It would be great if you could point me to relevant examples.

p.s. One solution I could think of is having each entity in a separate
entity group, with a separate persistent field in each entity (say Exam)
named 'IS_DELETED' defaulting to FALSE (value 0). Once a user deletes
an Exam, I will set the field to 1 (TRUE) so that I don't load it
anymore. I would write a cron job that clears all related entities in
the backend, each in its own transaction, retrying upon failure if
needed. But I am sure this is not elegant, and I am not sure whether it
will work out..
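
A rough sketch of the flag-plus-cron idea, using a plain in-memory dict in place of the datastore. The `Store` class, the function names, and the `exam` back-reference property are all assumptions made for illustration; a real version would issue one small datastore write or transaction per step:

```python
class Store:
    """Toy key-value store standing in for separate entity groups."""

    def __init__(self):
        self.entities = {}  # key -> dict of properties

    def put(self, key, **props):
        self.entities[key] = dict(props)

    def delete(self, key):
        # Deleting a missing key is a no-op, so retries are safe.
        self.entities.pop(key, None)


def soft_delete_exam(store, exam_key):
    """User-facing delete: one small write to a single entity."""
    store.entities[exam_key]["is_deleted"] = True


def visible_exams(store):
    """Load only exams that are not flagged as deleted."""
    return sorted(k for k, e in store.entities.items()
                  if e["kind"] == "Exam" and not e["is_deleted"])


def cleanup_job(store):
    """Cron-style pass: remove flagged exams and their children.

    Each delete is its own small step. If the job fails midway,
    the flag is still set, so the next run picks up the rest.
    """
    flagged = [k for k, e in store.entities.items()
               if e["kind"] == "Exam" and e["is_deleted"]]
    for exam_key in flagged:
        children = [k for k, e in store.entities.items()
                    if e.get("exam") == exam_key]
        for child in children:
            store.delete(child)
        # The exam itself goes last, so a failed run stays discoverable.
        store.delete(exam_key)


store = Store()
store.put("exam1", kind="Exam", is_deleted=False)
store.put("exam2", kind="Exam", is_deleted=False)
store.put("page1", kind="Page", exam="exam1")
store.put("page2", kind="Page", exam="exam2")

soft_delete_exam(store, "exam1")
print(visible_exams(store))    # exam1 is hidden immediately
cleanup_job(store)
print(sorted(store.entities))  # exam1 and page1 are gone
```

Running `cleanup_job` a second time changes nothing, which is what makes retrying after a failure harmless.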

Thanks all for your responses,
Hari

Yasuo Higa

Dec 3, 2010, 6:14:21 AM
to google-a...@googlegroups.com
Hi Hari,

> I understood that
> a. To achieve atomicity of update/delete of several entities we need
> to do it in a transaction and hence all should fall under same entity
> group
> b. Having big entity groups is not scalable as it causes contention.
> (Q1: Correct?)

Correct, but slim3, which is a Java framework, supports global
transactions between multiple entity groups.
http://sites.google.com/site/slim3appengine/#gtx

You may worry about the overhead of global transactions. Don't worry.
It is not very expensive.
The demonstration is as follows:
http://slim3demo.appspot.com/gtx/

Yasuo Higa

> --
> You received this message because you are subscribed to the Google Groups "Google App Engine" group.
> To post to this group, send email to google-a...@googlegroups.com.
> To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
>
>

Eli Jones

Dec 3, 2010, 10:49:11 AM
to google-a...@googlegroups.com
Hari,

It seems you are already thinking along the "correct" lines with your final suggestion.

There is no requirement that something that is "deleted" must be removed from a model immediately.

For example, when you delete an entity from the datastore, it isn't deleted right away.  It is marked as "deleted", and occasionally the datastore tablets are compacted and all entities marked "deleted" get removed.
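
To illustrate the mark-then-compact idea in general terms, here is a toy append-only store. It is only a sketch of the technique, not the datastore's actual implementation; `LogStore`, `TOMBSTONE`, and the sample questions are made up for the example:

```python
TOMBSTONE = object()  # marker meaning "this key was deleted"


class LogStore:
    """Toy append-only store: writes and deletes only add records."""

    def __init__(self):
        self.log = []  # list of (key, value) records, oldest first

    def put(self, key, value):
        self.log.append((key, value))

    def delete(self, key):
        # Nothing is removed here; a marker is appended instead.
        self.log.append((key, TOMBSTONE))

    def get(self, key):
        # The newest record for a key wins.
        for k, v in reversed(self.log):
            if k == key:
                return None if v is TOMBSTONE else v
        return None

    def compact(self):
        """Rewrite the log, keeping only the latest live value per key."""
        latest = {}
        for k, v in self.log:
            latest[k] = v
        self.log = [(k, v) for k, v in latest.items() if v is not TOMBSTONE]


store = LogStore()
store.put("q1", "What is 2 + 2?")
store.put("q2", "Define entropy.")
store.delete("q1")

print(store.get("q1"))  # None: readers already see q1 as deleted
print(len(store.log))   # 3: the old record is still physically there
store.compact()
print(len(store.log))   # 1: only q2 survives compaction
```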

What seems to be inelegant is really used all over the place in computer science.  When something gets deleted.. either a "delete flag" is turned on.. or there is just a pointer to that thing that gets set to Null or something or other.

See here for Nick Johnson's description of Log Structured storage:
http://blog.notdot.net/2009/12/Damn-Cool-Algorithms-Log-structured-st...

For more wonky underneaths of distributed filesystems, see Matt Dillon's description of Hammer ("Data is not (never!) immediately overwritten so no UNDO is needed for file data."):



Also, there is an added benefit of not immediately deleting an entity.. what if someone is on a roll, and they're deleting questions left and right... and then they realize that they deleted five questions that shouldn't have been deleted?  If you've been furiously ensuring all deletes with transactions, there is nothing they can do.  If you are simply marking items as deleted, you can simply provide them with an un-delete option.

So.. I may start to sound like a broken record (since I feel like I say this in every other post)... but do not use transactions and entity groups unless it is absolutely necessary (you have gone mad and are creating a banking subsystem on App Engine, for example).

Most of the time, people just get hung up thinking that a delete or some other event should happen immediately at the moment it was conceived (I blame Twitter and texting and chat for this).. and if it doesn't, there is something wrong with the design.

So, long story short, consider doing something like the "IS_DELETED" flag.. (or, if more than one Exam can share the same Question, just have Exams point to Pages which point to Questions.. and IS_DELETED is only marked if an entity is no longer pointed to by anything.. and your nightly delete process verifies that IS_DELETED is correct by checking whether an entity belongs to something else before deleting it [that might be a little much]).
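
One possible shape for that nightly check, sketched over an in-memory dict. The `children` lists, the sample keys, and both function names are assumptions for illustration, and the sketch treats only non-deleted parents as valid owners:

```python
def referenced_keys(entities):
    """Collect every key that some live (non-deleted) entity points to."""
    refs = set()
    for entity in entities.values():
        if not entity["is_deleted"]:
            refs.update(entity["children"])
    return refs


def nightly_purge(entities):
    """Purge flagged entities unless something live still points to them.

    Returns the keys whose flag was wrong and has been reset.
    """
    corrected = []
    # Phase 1: reset wrong flags. Repeat until stable, because restoring
    # a parent makes its children referenced again.
    changed = True
    while changed:
        changed = False
        live_refs = referenced_keys(entities)
        for key, entity in entities.items():
            if entity["is_deleted"] and key in live_refs:
                entity["is_deleted"] = False
                corrected.append(key)
                changed = True
    # Phase 2: whatever is still flagged is unreferenced and can go.
    for key in [k for k, e in entities.items() if e["is_deleted"]]:
        del entities[key]
    return corrected


entities = {
    "examA": {"is_deleted": True, "children": ["page1"]},
    "examB": {"is_deleted": False, "children": ["page2"]},
    "page1": {"is_deleted": True, "children": ["q1", "q2"]},
    "page2": {"is_deleted": False, "children": ["q2"]},
    "q1": {"is_deleted": True, "children": []},
    # q2 was flagged along with examA but is still used by examB.
    "q2": {"is_deleted": True, "children": []},
}

corrected = nightly_purge(entities)
print(corrected)         # q2 had its flag reset
print(sorted(entities))  # examA, page1 and q1 were purged
```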

Harishankar Nagarajan

Dec 3, 2010, 12:05:21 PM
to Google App Engine
Hi Yasuo,

Thanks for your suggestion, but for now I would like to go with a
standardized solution. I will definitely keep slim3 in mind.

Hari


Harishankar Nagarajan

Dec 3, 2010, 12:08:14 PM
to Google App Engine
Hi Jones,
Thanks for your detailed explanation. I got it, and it really helped.
I am happy that I am thinking along the right lines.

All, any other suggestions?

Thanks again,
Hari

On Dec 3, 8:49 pm, Eli Jones <eli.jo...@gmail.com> wrote:
> See here for Nick Johnson's description of Log Structured storage:
>
> http://blog.notdot.net/2009/12/Damn-Cool-Algorithms-Log-structured-st...