Returning an average from a Related Field.

6 views
Skip to first unread message

MerMer

unread,
Oct 15, 2006, 4:26:18 PM10/15/06
to Django users

I have two Models.

1. Product
2. Review - There can be many reviews for any product (one -to many
relationship)

One of the fields of Review is "Rating". This holds an integer between
1 and 5.

I want to display a list of Products, and display the average review
rating for each product.

I can see how I can easily display the number of reviews for each
product by using
"review_set.count" but it's not clear to me how I go about displaying
the average rating.

Could anybody set me on the right path on this? I'm sure it must be a
common type of requirement. I'm very new to both Python and Django.

MerMer

Jonathan Buchanan

unread,
Oct 16, 2006, 5:50:06 AM10/16/06
to django...@googlegroups.com

The following should give each Product a "rating" attribute containing
the average rating - you'll need to change the table names to match
whatever you have, and you can omit the NOT NULL condition if ratings
are mandatory for your review table.

Product.objects.all().extra(
select={
'rating': 'SELECT AVG(appname_review.rating) FROM
appname_review WHERE appname_review.product_id = appname_product.id
AND appname_review.rating IS NOT NULL',
}
)

The AVG function may be database dependent - I ran this on sqlite to test it.

Jonathan.

MerMer

unread,
Oct 16, 2006, 6:32:16 AM10/16/06
to Django users
Jonathan,

Many thanks for this.

Sorry to be a pain , but I have a couple of follow on questions, so I'm
clear.

1. How should I best implement this code. As a custom tag or in the
view?

2. Any suggestions on how I should pass the product.id to the code?

Thanks

MerMer

Jonathan Buchanan

unread,
Oct 16, 2006, 7:30:34 AM10/16/06
to django...@googlegroups.com

1. The queryset we just defined will return a list of Product objects
which have a 'rating' attribute in addition to their regular
attributes. You would use this when retrieving Products in your view
or defining a queryset for use by generic views in your urls.py.

You can then access the rating with {{ product.rating }} in your templates.

2. You don't need to specify a product id - the extra select query
we've provided results in something like the following in the SQL
generated by Django's ORM:

(SELECT AVG(appname_review.rating) FROM appname_review WHERE
appname_review.product_id = appname_product.id) AS "rating"

As such, you don't have to worry about doing anything else to specify
which Product objects you want to get the average rating for - just
use your Product.objects queryset methods as normal and use
.extra(...) when you need the average rating pulled out as well - the
ORM will already be restricting the resultset to the product rows
representing the Product objects you're interested in and the
.extra(...) query above will take advantage of that.

Just shout if that's not clear enough :)

Jonathan.

MerMer

unread,
Oct 16, 2006, 8:24:24 AM10/16/06
to Django users
Jonathan,

I don't want to the list to be filtered to only those who have a
Rating. I need to display ALL the products, while still showing the
average rating for those products which have a rating.

Example Table:

-------------- Price --- # Reviews ---Average Rating
Product 1 $24 2 4.5
Product 2 $20 0


Sorry if my orginal post wasn't clear enough on this point. Will your
solution still work on this basis? Many thanks.

MerMer

Jonathan Buchanan

unread,
Oct 16, 2006, 8:51:42 AM10/16/06
to django...@googlegroups.com

In the case where a product doesn't have a rating, the product should
be retrieved, with None set as its rating attribute. Again, I've only
verified this on sqlite.

If you want to get both the number of reviews and the average rating
in the same query, instead of doing a new query for each product in
order to count the number of reviews assocated with it, you could use
the following:

Product.objects.all().extra(
select={
'average_rating': 'SELECT AVG(appname_review.rating) FROM
appname_review WHERE appname_review.product_id = appname_product.id',
'rated_reviews': 'SELECT COUNT(*) FROM appname_review WHERE


appname_review.product_id = appname_product.id AND
appname_review.rating IS NOT NULL',
}
)

In the case where there are no reviews or no reviews with scores, the
rated_reviews attribute should be zero.

Jonathan.

MerMer

unread,
Oct 16, 2006, 1:42:42 PM10/16/06
to Django users
Looking at the options, I've also experimented with creating a custom
Model method.
The following almost works ( The division at the end returns an error
and I'm new at Python so need to work that out)

def av_rating(self):
queryset = self.review_set.all()
x=0
y=0
for rating in queryset:
x=x+rating.rating
y=y+1
return x/y

{{ objects.av_rating }} is the tag that is then put in the template.

Can anybody comment on the disadvantage/advantages of using this type
of custom model method? Also, if anybody can advise me on how I get
the division to work - that would also be appreciated - Django returns
a Zero Division error.

MerMer

Tim Chase

unread,
Oct 16, 2006, 2:13:30 PM10/16/06
to django...@googlegroups.com
> def av_rating(self):
> queryset = self.review_set.all()
> x=0
> y=0
> for rating in queryset:
> x=x+rating.rating
> y=y+1
> return x/y
>
> {{ objects.av_rating }} is the tag that is then put in the template.
>
> Can anybody comment on the disadvantage/advantages of using this type
> of custom model method? Also, if anybody can advise me on how I get
> the division to work - that would also be appreciated - Django returns
> a Zero Division error.

I don't know about the efficiency of calculating this, as you're
pulling back all the data just to calculate the average over the
set. I would suspect it would be faster for the server to simply
calculate the average and return it (one datum traversing the
net, rather than a whole dataset).

As for the exception/error, you could do any of a number of things:

1) return y and x/y or 0

2)

try:
return x/y
else:
return 0

3)
if y == 0:
return 0
else:
return x/y

However, having the server calculate the average should obviate
the need for such contortions.

HTH,

-tkc

MerMer

unread,
Oct 16, 2006, 3:30:16 PM10/16/06
to Django users
Tim,

Many thanks - that has helped me understand alittle more about Python.
I'm sure that there is a faster way. It's been more of an exercise
for me to try and get to understand Django, Python etc. I'm practically
a complete newbie.

Any suggestions on how would you get the server to calculate and return
the average?

Tim Chase

unread,
Oct 16, 2006, 3:50:26 PM10/16/06
to django...@googlegroups.com
> Any suggestions on how would you get the server to calculate
> and return the average?

Though I believe someone else already answered this, Django
allows you to fire off live SQL queries, which can do things like

SELECT AVERAGE(column_name1)
FROM tblTable

which will just return the average of the values in column_name1.

Or, if you want summary values base on groups, you can do things like

SELECT column_name1, AVERAGE(column_name2)
FROM tblTable
GROUP BY column_name1

which would return two columns...one for some common aggregate
value, and one for the average for items having that common
aggregate value. E.g. if your source data looks something like

column_name1 column_name2
John 1
John 2
John 3
Mary 5
Mary 7
Pat 8

The output would be something like

column_name1 column_name2
John 2
Mary 6
Pat 8

(note that only one column_name1 value appears because we're
grouping on it, and that the value in column_name2 is the average
of just the values for that person's name).

The result of the first query (that just returns the average)
would just return (1+2+3+5+7+8)/6

-tkc

johnnnnnnn

unread,
Oct 16, 2006, 4:03:34 PM10/16/06
to django...@googlegroups.com
On Sun, Oct 15, 2006 at 01:26:18PM -0700, MerMer wrote:
> I have two Models.
>
> 1. Product
> 2. Review - There can be many reviews for any product (one -to many
> relationship)
>
> One of the fields of Review is "Rating". This holds an integer between
> 1 and 5.
>
> I want to display a list of Products, and display the average review
> rating for each product.

Other responses have pointed you in the direction of custom
SQL. There's another (less data-normalized) way, which can be
implemented without SQL knowledge: use signals. This also has the
benefit of being easily sortable.

Step one is to add an "average rating" attribute to your Product
model. Set it to default to 0.0, or allow nulls and default to null.

Step two is to create a function (a function, not an object method)
which will update the average rating. That function should take this
argument list:

(sender, instance, signal, *args, **kwargs)

"instance" is the important one here, as it will be a Review
object. Use that to get the Product object, and then calculate the
average rating, set the avg_rating attribute of that Product object,
and call save().

Step three is wiring that function to the signal you
want. Specifically, you probably want the pre_save or post_save
signal, fired from the Review model.

You can do all of this in your models.py, which would end up looking
something like this:

# a bunch of imports
from django.db.models import signals
from django.dispatch import dispatcher

class Product:
# blah blah blah

class Review:
# blah blah blah

def update_average_rating(sender, instance, signal, *args, **kwargs):
# 'instance' will be the Review object being saved
#
# use that to get the Product object affected, calculate the new
# average rating, then call .save() on the Product object

dispatcher.connect(update_average_rating, signal=signals.pre_save, sender=Review)

Once you do that, the "average rating" attribute in your Product model
will be updated automatically every time a new Review is created (or
when an old Review is modified).

Do a google search for "django signals" for more information.

-johnnnnnnnn

MerMer

unread,
Oct 16, 2006, 4:22:29 PM10/16/06
to Django users
John,

Excellent. The requirement to make it easily sortable makes your
solution compelling. Thanks for detailing. Plus Im sure that there will
lots of places where this type of Signals method would be handy.

MerMer

Carlos Yoder

unread,
Oct 24, 2006, 8:28:43 AM10/24/06
to django...@googlegroups.com
While I believe this mini tutorial on signals is great, I guess this
functionality would be better accomplished in the DB itself, via a
trigger.

On 10/16/06, MerMer <mer...@googlemail.com> wrote:
>


--
Carlos Yoder
http://blog.argentinaslovenia.com/

johnnnnnnn

unread,
Oct 24, 2006, 2:34:27 PM10/24/06
to django...@googlegroups.com
On Tue, Oct 24, 2006 at 02:28:43PM +0200, Carlos Yoder wrote:
> While I believe this mini tutorial on signals is great, I guess this
> functionality would be better accomplished in the DB itself, via a
> trigger.

There are, i think, three ways of doing what was asked for:

1- signals. Benefits: it works with all dbs (supported by django), the
fields are sortable using the api. Drawbacks: you can't use non-django
code to do data manipulation, the data is not normalized.

2- raw sql. Benefits: the data is normalized, you can use any app code
to access the data. Drawbacks: sql is not fully portable, there is no
current way to use the django api to sort by subquery results, it
requires both sql and python knowledge to maintain.

3- triggers. Benefits: you can use any app code to access the data,
the fields are sortable using the api. Drawbacks: less portable than
vanilla sql, it requires even more sql knowledge to maintain, and the
data is not normalized.

In my mind, triggers are the least successful of those three
options. Triggers have neither the portability benefits, nor the
normalization benefits. But that's just my opinion.

Most important is really recognizing the strengths and weaknesses of
each approach, and choosing based on that.

-johnnnnnnnn

Ed

unread,
Oct 24, 2006, 6:46:21 PM10/24/06
to Django users
Hi John, thanks for your introduction to django dispatchers. I think
this is a greatly overlooked facility and has many applications.

> 1- signals. Benefits: it works with all dbs (supported by django), the
> fields are sortable using the api. Drawbacks: you can't use non-django
> code to do data manipulation, the data is not normalized.

What do you mean when you say that the data is not 'normalized'?
Thanks.

johnnnnnnn

unread,
Oct 26, 2006, 11:13:32 AM10/26/06
to django...@googlegroups.com
On Tue, Oct 24, 2006 at 10:46:21PM -0000, Ed wrote:
> What do you mean when you say that the data is not 'normalized'?

I'm referring to database normalization, which boils down to: store
each piece of data in only one spot. Check here:
http://en.wikipedia.org/wiki/Database_normalization or Google around
for database normalization tutorials.

So, using signals in this case is not normalized because the "average
rating" field is redundant with the existing rating fields. The raw
sql solution avoids that denormalization.

If your primary concern is the design of the relational database, then
you should absolutely try to normalize your system as much as
possible. If your primary concern is the object model, and you have a
decent ORM, then normalization is secondary.

-johnnnnnnnnn

Reply all
Reply to author
Forward
0 new messages