Please help with complex aggregation/annotation issue

24 views
Skip to first unread message

Salvatore Iovene

unread,
Jan 18, 2012, 4:03:04 AM1/18/12
to django...@googlegroups.com
Hi,
I apologize for the lousy title, but I really didn't know how to summarize the problem I'm facing. I have the following model:

class MessierMarathon(models.Model):
    messier_number = models.IntegerField()
    image = models.ForeignKey(Image)
    nominations = models.IntegerField(default = 0)
    nominators = models.ManyToManyField(User, null=True)
    
    def __unicode__(self):
        return 'M %i' % self.messier_number

    class Meta:
        unique_together = ('messier_number', 'image')
        ordering = ('messier_number', 'nominations')


A typical content for this model would be:

   Image A: 5 nominations for messier_number 1
   Image B: 4 nominations for messier_number 1
   Image C: 6 nominations for messier_number 2
   ...and so on.

I would like to formulate a query that returns me one image for each messier_number, picking the one with the most nominations. So, in the previous example, the query would return images A and C. The image B would not be returned because image A has more nominations for messier_number 1.

The images returned must be sorted by messier_number, and of course may repeat. (An image might contain more that one Messier object, and get highest nomination counts for both).

Can anybody please help with this?
Thanks in advance,
  Salvatore.

Bill Beal

unread,
Jan 19, 2012, 3:59:50 PM1/19/12
to django...@googlegroups.com
Each user nominates just one Messier number/image pair?

Do you need to keep track of the users to verify that each one votes only once?


--
You received this message because you are subscribed to the Google Groups "Django users" group.
To view this discussion on the web visit https://groups.google.com/d/msg/django-users/-/GB-T19nk21cJ.
To post to this group, send email to django...@googlegroups.com.
To unsubscribe from this group, send email to django-users...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-users?hl=en.

Ian Clelland

unread,
Jan 19, 2012, 4:13:42 PM1/19/12
to django...@googlegroups.com
The other edge case that you need to consider (and this is the one that makes this more than a simple aggregation query) -- what should be returned if two images have the same number of nominations for a given messier_number? Are both of them returned, or is there another tie-breaking criterion?

It's easy to get the highest nomination count for each messier number:

    MessierMarathon.objects.values('messier_number').annotate(Max('nominations'))

What is harder is getting a unique image attached to each messier_number, given that value for nominations.

A simple solution, which unfortunately requires a database hit for each messier_number, would be:

    highest_counts = MessierMarathon.objects.values('messier_number').annotate(Max('nominations'))
    top_images = dict((x['messier_number'], MessierMarathon.objects.filter(messier_number=x['messier_number'], nominations=x['nominations'])[0].image) for x in highest_counts)

You might be better off writing raw SQL for it, though; you could probably get it all with one (convoluted) query.

--
Regards,
Ian Clelland
<clel...@gmail.com>
Reply all
Reply to author
Forward
0 new messages