Hi All
I'm looking for some input on db.Model design for the following
scenario:
1) A user can send a message which will go to anywhere between 1 and 5000
recipients, with 200-2000 recipients being by far the most
common.
2) Each recipient is expected to receive between 1 and 50 messages a
day.
3) When a recipient has read a message it needs to be flagged as read.
For the distribution of messages to recipients I took inspiration from
this video:
http://www.youtube.com/watch?v=AgaL6NGpkB8
About 16 minutes into the video the speaker suggests a model like this:
    from google.appengine.ext import db

    class Message(db.Model):
        sender = db.StringProperty()
        body = db.TextProperty()

    class MessageIndex(db.Model):
        recipient = db.StringListProperty()
with Message and MessageIndex being in the same entity group. In short,
the supposed benefit of this design is that I can do a keys-only
query on MessageIndex for a particular user. From the MessageIndex keys
returned for the recipient I can extract the actual Message entity
keys (the parent of each MessageIndex key) and fetch those directly by key.
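
To make the key step concrete, here is a rough sketch in plain Python. The tuples are only a stand-in for datastore keys and parent_key is a made-up helper, not the real API; in the datastore I would expect to call parent() on each MessageIndex key and then fetch the resulting keys in one batch with db.get().

```python
# Stand-in for datastore keys: a key is a path of (kind, id) pairs,
# and a child key starts with its parent's path.
def parent_key(key):
    """Return the parent path of a key, or None for a root key."""
    return key[:-1] or None


# Pretend these came back from a keys-only query on MessageIndex.
index_keys = [
    (("Message", 101), ("MessageIndex", 1)),
    (("Message", 102), ("MessageIndex", 1)),
]

# Each Message key is simply the parent of its MessageIndex key,
# so no MessageIndex entity ever has to be loaded or deserialized.
message_keys = [parent_key(k) for k in index_keys]
```

The point is that the (potentially 5000 entry) recipient list never has to be read just to find my messages.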
That's all well and good...but then I get to 3)...recipients needing
to flag messages as read. For that I'm contemplating something like
this:
    class MessageReadIndex(db.Model):
        recipient = db.StringProperty()
        month = db.IntegerProperty()
        messagesRead = db.StringListProperty(indexed=False)
When a recipient asks for a list of messages it will be sorted by
date, newest messages first, and paged (think gmail).
In the same page request I can query the MessageReadIndex for the user
and month(s) in question. From here I can loop through each message in
memory and check to see if it has already been read.
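
The in-memory check I have in mind would look roughly like this. It is plain Python with no datastore calls; flag_read and the message IDs are made up for illustration, and messages_read stands for the messagesRead list loaded from the MessageReadIndex entity.

```python
def flag_read(page_message_ids, messages_read):
    """Pair each message ID on the page with a read/unread flag."""
    read = set(messages_read)  # a set makes each membership test O(1)
    return [(mid, mid in read) for mid in page_message_ids]


# One page of message IDs, newest first, plus the IDs already read.
page = ["m3", "m2", "m1"]
already_read = ["m1", "m3"]

flags = flag_read(page, already_read)
# flags == [("m3", True), ("m2", False), ("m1", True)]
```

Converting the list to a set once per request keeps the loop cheap even when the month's list has grown long.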
When the recipient clicks a message to read it I can also retrieve the
MessageReadIndex entity and append the Message Id to the messagesRead
property and put() the entity.
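
A minimal sketch of the append step, again in plain Python. mark_read is a hypothetical helper; the idea is to skip the put() entirely when the message was already flagged, so re-opening a message costs no write.

```python
def mark_read(messages_read, message_id):
    """Append message_id unless it is already present.

    Returns True when the list changed, i.e. when the entity
    actually needs to be written back with put().
    """
    if message_id in messages_read:
        return False
    messages_read.append(message_id)
    return True


messages_read = ["m1"]
first_open = mark_read(messages_read, "m2")   # list changes, put() needed
second_open = mark_read(messages_read, "m2")  # no change, skip the put()
```

In the real handler I would presumably wrap the get, append and put() in a transaction (db.run_in_transaction) so that two clicks arriving at the same time cannot overwrite each other's update.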
This last bit is what has me a bit worried. It will be quite a few
writes from every recipient every day...again think gmail ;-) Not
indexing the messagesRead property should help minimize the number of
index entries that need updating...but still. Am I being overly paranoid
and prematurely optimizing at an unreasonable level? Does anybody have
any better ideas for how to handle this?
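
For what it's worth, here is my back-of-envelope estimate of the worst case per recipient. The 50 messages a day comes from point 2) above; the 31-day month and the 40 bytes per stored message ID are my own assumptions.

```python
MAX_MESSAGES_PER_DAY = 50  # from point 2) above
DAYS_PER_MONTH = 31        # assumption: worst-case month
ASSUMED_ID_BYTES = 40      # assumption: rough size of one stored message ID

# At most one put() per message read, so per recipient per day:
max_puts_per_day = MAX_MESSAGES_PER_DAY

# Upper bound on the messagesRead list in one month's entity:
max_ids_per_month = MAX_MESSAGES_PER_DAY * DAYS_PER_MONTH  # 1550
max_list_bytes = max_ids_per_month * ASSUMED_ID_BYTES      # 62000
```

So the list itself should stay well under the 1 MB entity size limit as far as I can tell; my concern is more that every put() rewrites the whole entity, and that this happens up to 50 times a day for every recipient.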
Thanks in advance for your CPU time!
/Chris