appengine datastore model for twitter like showing posts from users followed

60 views
Skip to first unread message

Suresh Jeevanandam

unread,
Jun 15, 2017, 9:18:38 AM6/15/17
to Google App Engine

I am working on a web system where a feature is similar to Twitter's concept of following a list of users and seeing their posts as a list.

The simple model I came up with requires join operation which is not available in datastore.

class Post(Model):
   author = reference to user id
   content = text content

class Following(Model):
   author = reference to user id
   followed_by = reference to user id

The frequent operation is to display a list of posts (sorted in time) from users followed by the current user.

With the above model, it can only be done in two steps:

authors = Following.author when Following.followed_by == current_user
posts = Posts with Posts.author in authors

Is there any way to achieve this more efficiently?

Jordan (Cloud Platform Support)

unread,
Jun 15, 2017, 3:30:15 PM6/15/17
to Google App Engine
You are correct, performing the two separate queries is the correct way of retrieving the Post entities made by the authors who are followed by current_user. 

A JOIN operation is actually the same thing, just in SQL syntax. The first step would be to JOIN all 'Post' entities with 'Following' entities where 'author' is equal. This would then produce a new table containing combined rows of the two kinds on 'author'. Once you have the JOINed result, you would then perform another query for all 'followed_by' equal to 'current_user', and you would get your Posts.

So you can see how your current implementation is the most optimized. If anything, you could do a projection query on the first Following query to only receive the list of 'author's IDs instead of the entire Following objects. This would save you a bit of networking time and costs.  

Alex Martelli

unread,
Jun 15, 2017, 3:57:15 PM6/15/17
to google-a...@googlegroups.com
I would recommend denormalizing your data model -- a common optimization in non-relational DBs (like the datastore) and frequently useful in relational DBs as well. Just have user entities, with the user id as their key's name, containing a list of the authors the user follows. This will make adding/removing a follower relationship minutely slower, but surely that's a far rarer operation than displaying appropriate lists of posts.

For more on denormalization, start e.g. from https://en.wikipedia.org/wiki/Denormalization .


Alex

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengine+unsubscribe@googlegroups.com.
To post to this group, send email to google-appengine@googlegroups.com.
Visit this group at https://groups.google.com/group/google-appengine.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/9480ad51-6530-4fa6-a23d-88d215d525a1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Suresh Jeevanandam

unread,
Jun 16, 2017, 4:37:45 AM6/16/17
to Google App Engine
Jordan,
Ok. I was thinking doing join on SQL would be significantly less costly - but after reading your insights I think that doing the join ourselves will not be significantly slow. Thanks for this insight.

-
Suresh

Suresh Jeevanandam

unread,
Jun 16, 2017, 4:44:36 AM6/16/17
to Google App Engine
Thanks, Alex. I will do this. I will also read about caches. Thinking of something like the list of users could be cached for the current-user. When the rare event of user follows/unfollows someone, it will invalidate the cache if it is present for the current user.

-
Suresh

On Friday, 16 June 2017 01:27:15 UTC+5:30, Alex Martelli wrote:
I would recommend denormalizing your data model -- a common optimization in non-relational DBs (like the datastore) and frequently useful in relational DBs as well. Just have user entities, with the user id as their key's name, containing a list of the authors the user follows. This will make adding/removing a follower relationship minutely slower, but surely that's a far rarer operation than displaying appropriate lists of posts.

For more on denormalization, start e.g. from https://en.wikipedia.org/wiki/Denormalization .


Alex
On Thu, Jun 15, 2017 at 4:19 AM, Suresh Jeevanandam <jm.s...@gmail.com> wrote:

I am working on a web system where a feature is similar to Twitter's concept of following a list of users and seeing their posts as a list.

The simple model I came up with requires join operation which is not available in datastore.

class Post(Model):
   author = reference to user id
   content = text content

class Following(Model):
   author = reference to user id
   followed_by = reference to user id

The frequent operation is to display a list of posts (sorted in time) from users followed by the current user.

With the above model, it can only be done in two steps:

authors = Following.author when Following.followed_by == current_user
posts = Posts with Posts.author in authors

Is there any way to achieve this more efficiently?

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengi...@googlegroups.com.
To post to this group, send email to google-a...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages