DB design best practices


n8gray

Jun 9, 2009, 1:47:03 PM
to Google App Engine
Hi everybody,

I'm about to start working on an AppEngine backend for an iPhone game
I'm developing. It's a simple board game, with 2-4 players who each
take turns making plays. Originally I had planned to set up a LAMP
server for my project, but AE has changed my plans (for the better!).
I've never written database code before but I've read up on the basics
of database design, and I came to the conclusion that I would need a
DB schema with tables something like this:

Players: username, email, userid, ...
Games: gameid, time_started, current_player, is_finished, ...
Players2Games: userid, gameid, score, ...
Turns: userid, gameid, timestamp, turn_number, play, turn_score, ...

It seems clear, however, that Datastore is not a traditional database
and perhaps my schema needs to be revisited. Is it still necessary or
advisable to use a table like Players2Games in order to represent many-
to-many relationships? What should my roots and parent/child
relationships be?

Typical queries will be (unsurprising) things like:
get all games for player x
get all players for game x
get the scores of all players in game x
get any turns in game x that have occurred since time t
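
For reference, the relational version of the schema and queries above can be sketched with Python's built-in sqlite3 module. This only illustrates the original LAMP-style design, not Datastore code; the column types and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Players (userid INTEGER PRIMARY KEY, username TEXT, email TEXT);
CREATE TABLE Games (gameid INTEGER PRIMARY KEY, time_started REAL,
                    current_player INTEGER, is_finished INTEGER);
CREATE TABLE Players2Games (userid INTEGER, gameid INTEGER, score INTEGER);
CREATE TABLE Turns (userid INTEGER, gameid INTEGER, timestamp REAL,
                    turn_number INTEGER, play TEXT, turn_score INTEGER);
""")

# Hypothetical sample data: two players in game 1, one of them also in game 2.
conn.executemany("INSERT INTO Players VALUES (?, ?, ?)",
                 [(1, "alice", "a@example.com"), (2, "bob", "b@example.com")])
conn.executemany("INSERT INTO Games VALUES (?, ?, ?, ?)",
                 [(1, 100.0, 1, 0), (2, 200.0, 1, 0)])
conn.executemany("INSERT INTO Players2Games VALUES (?, ?, ?)",
                 [(1, 1, 12), (2, 1, 7), (1, 2, 0)])
conn.executemany("INSERT INTO Turns VALUES (?, ?, ?, ?, ?, ?)",
                 [(1, 1, 110.0, 1, "A1", 12), (2, 1, 120.0, 2, "B2", 7)])


def games_for_player(userid):
    """Get all games for player x."""
    rows = conn.execute(
        "SELECT gameid FROM Players2Games WHERE userid = ? ORDER BY gameid",
        (userid,))
    return [row[0] for row in rows]


def scores_for_game(gameid):
    """Get the players of game x together with their scores."""
    rows = conn.execute(
        "SELECT p.username, pg.score FROM Players2Games pg "
        "JOIN Players p ON p.userid = pg.userid "
        "WHERE pg.gameid = ? ORDER BY p.username",
        (gameid,))
    return rows.fetchall()


def turns_since(gameid, t):
    """Get any turns in game x that have occurred since time t."""
    rows = conn.execute(
        "SELECT turn_number, play FROM Turns "
        "WHERE gameid = ? AND timestamp > ? ORDER BY turn_number",
        (gameid, t))
    return rows.fetchall()
```

The join in scores_for_game is the kind of query that has no direct Datastore equivalent, which is why the schema needs revisiting.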

Any advice, or pointers to articles/posts/documentation are
appreciated!

Thanks,
-n8

Tony Rowles

Jun 10, 2009, 8:21:06 PM
to Google App Engine
It's a really good question and perhaps someone with more db
experience than me will write some kind of tutorial about planning
your db structure for GAE...

The biggest question I wish I'd asked myself before designing my app
is: "what sorts of models will I want to update together in
transactions?" Because in Datastore you can only run a transaction on
entities in the same entity group (meaning they all share a common
root ancestor). So if you need certain operations to maintain
consistency, for instance "add new Turn entity, update Game's
current_player", you will need to plan that ahead of time (perhaps by
making each Turn a child of the Game it takes place in).
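
A toy, in-memory model of that constraint may make it concrete. Nothing here is the App Engine API: keys are plain tuples, the first (kind, id) pair of a key stands for its entity group, and `commit` is a hypothetical helper that refuses to write across groups.

```python
# Toy store: key path tuple -> dict of properties. Not the Datastore API.
store = {
    ("Game", 1): {"current_player": "alice"},
}


def group_of(key):
    """The first (kind, id) pair of a key path identifies its entity group."""
    return key[:2]


def commit(writes):
    """Apply all writes together, but only if they share one entity group.

    Validation happens before anything is written, so a rejected
    commit leaves the store untouched.
    """
    groups = {group_of(key) for key in writes}
    if len(groups) > 1:
        raise ValueError("writes span several entity groups: %r" % sorted(groups))
    for key, props in writes.items():
        store.setdefault(key, {}).update(props)


# Turn 1 is a child of Game 1, so both writes belong to the same group
# and can be applied as one unit.
commit({
    ("Game", 1, "Turn", 1): {"player": "alice", "play": "A1"},
    ("Game", 1): {"current_player": "bob"},
})
```

Had the Turn been stored as a root entity with key ("Turn", 1), the same call would be rejected, which is the planning problem described above.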

Other than that, just remember that writing to Datastore is very
expensive (as is fetching large numbers of entities from a query) -
you will want to figure out ways to retrieve entities by key as much
as possible.

Good luck,
Tony

Nick Johnson (Google)

Jun 11, 2009, 7:19:18 AM
to google-a...@googlegroups.com
Hi n8gray,

Excellent question!

Given the amount of information you're likely to store against Players2Games (GamePlayers, perhaps?), and the number of games any one player may have, and the likely access patterns, I would suggest sticking with a separate model for it.

The main lesson for using the datastore instead of a relational database is simply to denormalize and precalculate. In this case, that likely means storing running totals (number of turns, score, etc) against the Players2Games entity, instead of calculating them when needed as you might in a relational database.
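
A minimal sketch of the difference, in plain Python rather than Datastore code. The class and function names are hypothetical; the point is only that the totals are updated when a turn is written, so reading a score never requires scanning the turns.

```python
from dataclasses import dataclass, field


@dataclass
class GamePlayer:
    # Precalculated running totals, updated on every turn.
    score: int = 0
    turn_count: int = 0


@dataclass
class Game:
    players: dict = field(default_factory=dict)  # userid -> GamePlayer
    turns: list = field(default_factory=list)    # (userid, turn_score) pairs


def record_turn(game, userid, turn_score):
    """Store the turn and update the denormalized totals in the same step."""
    game.turns.append((userid, turn_score))
    totals = game.players.setdefault(userid, GamePlayer())
    totals.score += turn_score
    totals.turn_count += 1


def score_by_scan(game, userid):
    """The relational habit: recompute the score from every turn on each read."""
    return sum(score for uid, score in game.turns if uid == userid)


game = Game()
for userid, turn_score in [("alice", 12), ("bob", 7), ("alice", 5)]:
    record_turn(game, userid, turn_score)
```

Both approaches give the same answer; the denormalized one pays a little extra on each write so that reads are a single lookup.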

Tony's point about entity groups is an excellent one. Based on the sort of updates you're likely to want to do, and the access patterns in an app like this, I would suggest making the Players2Games entities child entities of the related Games entity, and making the Turns entities likewise child entities of their Games entity. This way, each game has its own entity group, so you can make atomic (transactional) updates across the whole game with ease.
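
The suggested layout can be pictured with key paths written as tuples. This is a plain-Python illustration, not the Datastore API: the ids, property values, and helper names are made up, and `children` only imitates what a query scoped to one game's entity group would return.

```python
# Toy key paths: (kind, id, kind, id, ...). The first pair is the root entity.
entities = {
    ("Games", 1): {"current_player": "bob"},
    ("Games", 1, "Players2Games", "alice"): {"score": 12},
    ("Games", 1, "Players2Games", "bob"): {"score": 7},
    ("Games", 1, "Turns", 1): {"timestamp": 110.0, "userid": "alice"},
    ("Games", 1, "Turns", 2): {"timestamp": 120.0, "userid": "bob"},
    ("Games", 2, "Turns", 1): {"timestamp": 130.0, "userid": "alice"},
}


def children(parent_key, kind):
    """Direct children of parent_key that have the given kind."""
    n = len(parent_key)
    return {
        key: props
        for key, props in entities.items()
        if len(key) == n + 2 and key[:n] == parent_key and key[n] == kind
    }


def scores_for_game(game_key):
    """Scores of all players in one game, keyed by player id."""
    return {key[-1]: props["score"]
            for key, props in children(game_key, "Players2Games").items()}


def turns_since(game_key, t):
    """Turn numbers in one game with a timestamp later than t."""
    return sorted(key[-1]
                  for key, props in children(game_key, "Turns").items()
                  if props["timestamp"] > t)
```

Everything under ("Games", 1) forms one group, so the per-player scores and the turns of a game can be read and updated together without touching any other game.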

-Nick Johnson

n8gray

Jun 11, 2009, 6:20:46 PM
to Google App Engine
On Jun 10, 5:21 pm, Tony <fatd...@gmail.com> wrote:
> The biggest question I wish I'd asked myself before designing my app
> is: "what sorts of models will I want to update together in
> transactions?"  Because in Datastore you can only run a transaction on
> entities in the same entity group (meaning they all share a common
> root ancestor).  So if you need certain operations to maintain
> consistency, for instance "add new Turn entity, update Game's
> current_player", you will need to plan that ahead of time (perhaps by
> making each Turn a child of the Game it takes place in).

Right, I came to the same conclusion. The game's state is the only
"super-entity" that needs to be updated atomically, so I decided to
group it together.

> Other than that, just remember that writing to Datastore is very
> expensive (as is fetching large numbers of entities from a query) -
> you will want to figure out ways to retrieve entities by key as much
> as possible.

I'll keep that in mind. It would be nice to have some "rule of thumb"
advice on when and how to use memcache.

Thanks,
-n8

Wooble

Jun 11, 2009, 6:52:39 PM
to Google App Engine


On Jun 11, 6:20 pm, n8gray <n8g...@gmail.com> wrote:
> I'll keep that in mind.  It would be nice to have some "rule of thumb"
> advice on when and how to use memcache.

If you're likely to need to fetch it more than once, cache it. It's
almost always better to cache too much than to cache too little.
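
A minimal sketch of that rule as a read-through cache. The two dictionaries are stand-ins for the datastore and for memcache, and the function names are hypothetical; a real app would call the memcache API instead.

```python
datastore = {"game-1": {"current_player": "alice"}}  # stand-in for the real store
cache = {}                                           # stand-in for memcache
stats = {"datastore_reads": 0}


def get_game(key):
    """Return the cached copy if there is one, else fetch it and cache it."""
    if key in cache:
        return cache[key]
    stats["datastore_reads"] += 1
    value = dict(datastore[key])  # copy, so the cache never aliases the store
    cache[key] = value
    return value


def put_game(key, value):
    """Write to the store first, then drop the stale cached copy."""
    datastore[key] = dict(value)
    cache.pop(key, None)
```

Repeated reads of the same game then cost one datastore fetch until the next write. The part that needs care is remembering to invalidate on every write path.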

n8gray

Jun 11, 2009, 7:07:35 PM
to Google App Engine
Hi Nick,

On Jun 11, 4:19 am, "Nick Johnson (Google)" <nick.john...@google.com>
wrote:
> Hi n8gray,
>
> Excellent question!
>
> Given the amount of information you're likely to store against Players2Games
> (GamePlayers, perhaps?), and the number of games any one player may have,
> and the likely access patterns, I would suggest sticking with a separate
> model for it.

Interesting -- I came to the opposite conclusion but maybe my
reasoning isn't sound. Here's what I've got right now (untested, so
it may contain syntax errors and such):

from google.appengine.ext import db

class GameState(VersionedModel):
    # Parent should be a GameMeta
    scores = db.ListProperty(int)
    # ... other gritty details of the current game state

class GameTurn(VersionedModel):
    # Parent should be a GameMeta
    creationDate = db.DateTimeProperty(auto_now_add=True)
    player = db.ReferenceProperty(User)
    turn_score = db.IntegerProperty()
    # ... more gritty details about this turn

This is the stuff you only care about if you're currently playing the
game. The scores may be better placed in GameMeta -- I haven't
decided yet.

class GameMeta(VersionedModel):
    name = db.StringProperty(required=True)
    password = db.StringProperty(indexed=False)
    creationDate = db.DateTimeProperty(auto_now_add=True)
    isActive = db.BooleanProperty(default=True)
    # In case of tie, there can be more than one winner
    winners = db.ListProperty(db.Key, default=None)
    playerCount = db.IntegerProperty(required=True)
    currentPlayer = db.ReferenceProperty(User, required=True)
    currentPlayerNumber = db.IntegerProperty(default=0)
    gameState = db.ReferenceProperty(GameState)
    players = db.ListProperty(db.Key)

GameMeta holds all the metadata of the game. I moved the players list
in here (despite watching Brett Slatkin's I/O talk on list properties)
because I reasoned that a) the serialization/deserialization overhead
for 4 elements wouldn't be too bad, and b) you're going to want the
player list every time you retrieve the game anyway. If you think
this is unwise, however, I'm interested to hear why.
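
To illustrate what the list-based design implies for lookups, here is a plain-Python sketch. The dictionaries stand in for stored entities and the names are hypothetical. It relies on the fact that an equality filter on a Datastore list property matches when any element equals the value, which is what `games_for_player` imitates, and it keeps a parallel scores list in the same record purely for brevity.

```python
# Hypothetical in-memory stand-ins for stored games. scores[i] belongs to
# players[i], so the two lists must always be kept in the same order.
games = {
    "g1": {"players": ["alice", "bob", "carol"], "scores": [12, 7, 3]},
    "g2": {"players": ["bob", "dave"], "scores": [0, 5]},
}


def games_for_player(player):
    """Get all games for player x: match if the player is any list element."""
    return sorted(game_id for game_id, game in games.items()
                  if player in game["players"])


def players_for_game(game_id):
    """Get all players for game x: already loaded with the game itself."""
    return list(games[game_id]["players"])


def score_of(game_id, player):
    """Look up a per-player value through the shared list position."""
    game = games[game_id]
    return game["scores"][game["players"].index(player)]
```

Each additional per-player value adds another list that has to stay aligned with players, which is the cost to weigh against a separate per-player entity.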


> The main lesson for using the datastore instead of a relational database is
> simply to denormalize and precalculate. In this case, that likely means
> storing running totals (number of turns, score, etc) against the
> Players2Games entity, instead of calculating them when needed as you might
> in a relational database.

Yeah, I was planning to do a fair bit of denormalization.

> Tony's point about entity groups is an excellent one. Based on the sort of
> updates you're likely to want to do, and the access patterns in an app like
> this, I would suggest making the Players2Games entities child entities of
> the related Games entity, and making the Turns entities likewise child
> entities of their Games entity. This way, each game has its own entity
> group, so you can make atomic (transactional) updates across the whole game
> with ease.

At least I got that part right!

Thanks,
-n8

Nick Johnson (Google)

Jun 12, 2009, 12:22:40 PM
to google-a...@googlegroups.com
Hi n8gray,

The system you outline seems reasonable too. I'm not sure there's much value in splitting out GameState and GameMeta, but otherwise they seem reasonable. My main concern behind not having a separate GamePlayer entity would be a profusion of ListProperties for per-player data, but if you think that's manageable, go right ahead! :)

-Nick Johnson