Best practice for faves/likes counting?


Hong Minhee

Dec 5, 2012, 6:06:03 PM
to sqlal...@googlegroups.com
Hi,

I am making a typical web application using SQLAlchemy, and it contains “users”, “works” and “likes”:

  • users ( id, login, … )
  • works ( id, title, … )
  • likes ( user_id, work_id )

I want to print how many users liked each work. The simplest (but naive) approach is to query the count each time, e.g.:

len(work.like_set)
work.like_query.count()

However, this is inefficient, so I want to maintain a like_count field (or something similar) on works.  The problem is that I would have to update the field manually every time a new like is inserted or an existing like is deleted.  I thought it could be updated automatically, without inconsistency, if I caught SQLAlchemy events, but I’m not sure whether that’s the correct way to do it.
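For what it’s worth, here is a minimal sketch of the event-listener idea, written against the modern SQLAlchemy 1.4+ API.  The model and column names (including Work.like_count itself) are just assumptions for illustration, and note one caveat: mapper events only fire on ORM flushes, so bulk inserts/deletes would bypass the counter.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, event
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    login = Column(String)

class Work(Base):
    __tablename__ = 'works'
    id = Column(Integer, primary_key=True)
    title = Column(String)
    # Denormalized counter, kept in sync by the listeners below.
    like_count = Column(Integer, nullable=False, default=0)

class Like(Base):
    __tablename__ = 'likes'
    user_id = Column(Integer, ForeignKey('users.id'), primary_key=True)
    work_id = Column(Integer, ForeignKey('works.id'), primary_key=True)

def _bump(connection, work_id, delta):
    # Issue the UPDATE on the same connection the flush is using,
    # so the counter change joins the surrounding transaction.
    works = Work.__table__
    connection.execute(
        works.update()
        .where(works.c.id == work_id)
        .values(like_count=works.c.like_count + delta)
    )

@event.listens_for(Like, 'after_insert')
def _like_inserted(mapper, connection, target):
    _bump(connection, target.work_id, +1)

@event.listens_for(Like, 'after_delete')
def _like_deleted(mapper, connection, target):
    _bump(connection, target.work_id, -1)
```

Because the counter update runs on the flush’s own connection, it commits or rolls back together with the like row itself, which is what keeps the two consistent.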

How do you guys solve this kind of situation with SQLAlchemy?

Thanks,
Hong Minhee

Eric Ongerth

Dec 5, 2012, 7:17:14 PM
to sqlal...@googlegroups.com
But work.like_query.count() will be efficient if you have the right indexes in the database, no?

I think if you want to denormalize that count all the way and still stay very efficient, it might be best to do it right on the DB server with a trigger and a stored procedure, avoiding extra Python function calls.  What database do you use?
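For PostgreSQL, the trigger route could look something like this.  A sketch only, untested, and it assumes the table names from the first post plus a works.like_count column:

```sql
-- Keeps works.like_count in step with rows in likes.
CREATE FUNCTION update_like_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE works SET like_count = like_count + 1 WHERE id = NEW.work_id;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE works SET like_count = like_count - 1 WHERE id = OLD.work_id;
    END IF;
    RETURN NULL;  -- result ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER likes_count_trigger
AFTER INSERT OR DELETE ON likes
FOR EACH ROW EXECUTE PROCEDURE update_like_count();
```

The nice property is that the counter stays correct no matter which client touches the likes table, not just code going through SQLAlchemy.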

Hong Minhee

Dec 5, 2012, 8:16:01 PM
to sqlal...@googlegroups.com
But for work in works: work.like_query.count() causes an inefficient N+1 query pattern even if we have the right indices for it.  Of course I could query like session.query(Work, count()).join(Work.like_set).group_by(Work), but it’s somewhat complicated for me to read and write (is it only me?).  I want to avoid such uses in the ORM…
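For the record, one way to spell the single-round-trip version is below (modern select() style; the model names are assumed, and an outer join is used so works with zero likes still appear):

```python
from sqlalchemy import Column, ForeignKey, Integer, String, func, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Work(Base):
    __tablename__ = 'works'
    id = Column(Integer, primary_key=True)
    title = Column(String)

class Like(Base):
    __tablename__ = 'likes'
    user_id = Column(Integer, primary_key=True)
    work_id = Column(Integer, ForeignKey('works.id'), primary_key=True)

# One query for all works: each result row is (Work, like count).
likes_per_work = (
    select(Work, func.count(Like.user_id).label('n_likes'))
    .outerjoin(Like, Like.work_id == Work.id)
    .group_by(Work.id)
)
```

Grouping by the primary key is enough on PostgreSQL (9.1+), which infers the functional dependency on the other works columns.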

Anyway I use PostgreSQL.

Audrius Kažukauskas

Dec 6, 2012, 6:31:25 AM
to sqlal...@googlegroups.com
On Wed, 2012-12-05 at 17:16:01 -0800, Hong Minhee wrote:
> But for work in works: work.like_query.count() causes inefficient 1+N
> queries even if we have the right indices for it. Of course I could query
> like session.query(Work, count()).join(Work.like_set).group_by(Work) but
> it’s somewhat complicated to read and write for me (is it only me?). I
> want to reject such uses in ORM…
>
> Anyway I use PostgreSQL.

But using JOINs is how relational DBs are meant to be used.

I had a similar mindset not that long ago.  I didn't really understand
how the database worked; it was kind of a black box to me.  What helped
me immensely was writing pure SQL queries for Postgres at one of my
jobs.  Not only that, but also reading EXPLAIN query plans, seeing how
and when the indices were used, etc.

There are still many things to learn, but I have a much better
understanding of relational DBs now.  This also helped me see why
SQLAlchemy is the way it is, and appreciate it even more.  Its exposure
of SQL is a good thing, in contrast to other ORMs, which try to hide it
as much as possible, limiting their flexibility and power as a result.

So I strongly encourage you to learn more about the database you are
using and about SQL in general.  This will help you write more efficient
queries and better leverage the power of your RDBMS.

In the case of your JOIN query, even if you have the right indices, it
could still be slow if it's not selective enough (i.e. it selects a big
part of the table).  In that case the index simply won't be used by the
query planner.

--
Audrius Kažukauskas
http://neutrino.lt/