A tricky query in one to many relationship - atleast for me:)

48 views
Skip to first unread message

Web Architect

unread,
Aug 30, 2016, 2:33:14 AM8/30/16
to Django users
Hi,

I am looking for an elegant and efficient mechanism to have a query filter or a solution for the following one to many relationship model. Please note the following is just an illustration of the models - hope it should provide what I am looking for:

class A(models.Model):

name = models.CharField(_("Name"), max_length=255, unique=True)


class B(models.Model):
text = models.CharField(_("Text"), max_length=255, unique=True)
date_created = models.DateTimeField(_("Date Created"), auto_now_add=True) 
 a = models.ForeignKey(A, related_name='b')

To get all the instances of B associated with a specific instance of A (say 'a'), I could do the following : a.b.all()

The latest instance of B associated with 'a' would be : a.b.latest('date_created')

Now I would like to have a list of all instances of A where the latest instance of B associated with each instance of A will have the 'text' field as 'ABCD'. 

A.objects.filter(b__text='ABCD') will give all instances of A where each instance of A will have atleast one instance of B with 'text' = 'ABCD'.

There could be a brute force way of getting a solution for the above where in I can do A.objects.filter(b__text='ABCD')  and then go through each instance  of A in a for loop over the queryset and check for the latest instance of B for text='ABCD'.

As mentioned earlier, I am looking for an elegant and optimal way (if any in Django) for the above query.

Thanks.

 

Mike Dewhirst

unread,
Aug 30, 2016, 2:48:56 AM8/30/16
to django...@googlegroups.com
On 30/08/2016 4:33 PM, Web Architect wrote:
> Hi,
>
> I am looking for an elegant and efficient mechanism to have a query
> filter or a solution for the following one to many relationship model.
> Please note the following is just an illustration of the models - hope
> it should provide what I am looking for:
>
> class A(models.Model):
>
> name = models.CharField(_("Name"), max_length=255, unique=True)
>
>
> class B(models.Model):
>
> text = models.CharField(_("Text"), max_length=255, unique=True)
>
> date_created = models.DateTimeField(_("Date Created"),
> auto_now_add=True)Â
>
> Â a =Â models.ForeignKey(A, related_name='b')
>
>
> To get all the instances of B associated with a specific instance of A
> (say 'a'), I could do the following : a.b.all()
>
> The latest instance of B associated with 'a' would be :
> a.b.latest('date_created')
>
> Now I would like to have a list of all instances of A where the latest
> instance of B associated with each instance of A will have the 'text'
> field as 'ABCD'.Â
>
> A.objects.filter(b__text='ABCD') will give all instances of A where
> each instance of A will have atleast one instance of B with 'text' =
> 'ABCD'.
>
> There could be a brute force way of getting a solution for the above
> where in I can do A.objects.filter(b__text='ABCD') Â and then go
> through each instance  of A in a for loop over the queryset and check
> for the latest instance of B for text='ABCD'.

HAve you looked at ...

https://docs.djangoproject.com/en/1.8/topics/db/queries/#chaining-filters


>
> As mentioned earlier, I am looking for an elegant and optimal way (if
> any in Django) for the above query.
>
> Thanks.
>
> Â
> --
> You received this message because you are subscribed to the Google
> Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to django-users...@googlegroups.com
> <mailto:django-users...@googlegroups.com>.
> To post to this group, send email to django...@googlegroups.com
> <mailto:django...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/django-users.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-users/a9ea312a-578e-4a2f-a739-a38e87a4b15b%40googlegroups.com
> <https://groups.google.com/d/msgid/django-users/a9ea312a-578e-4a2f-a739-a38e87a4b15b%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.

Web Architect

unread,
Aug 30, 2016, 3:20:45 AM8/30/16
to Django users
Hi Mike,

Thanks for your response. 

I am aware of chaining filters.

The latest() API in django queryset returns an object and not a queryset. Hence, I cannot add a filter to it. 

I had thought about using chaining filters but couldn't find a way.

Erik Cederstrand

unread,
Aug 30, 2016, 5:21:02 AM8/30/16
to Django Users
I'm not even sure that's possible to express in SQL, but it would probably be quite convoluted if it is. Here's an easier-to-understand solution:

res = set()
for b in B.objects.all().select_related('a').annotate(Max('date_created')):
if b.date_created != b.date_created__max:
continue
if b.text != 'ABCD':
continue
res.add(a)


You'll get too many B objects, but it's only one query. If you want a more optimal query, try to write it in SQL first. Maybe it's possible to express in the ORM.

I used set() because 'date_created' is not unique so there could be two B objects with the same 'a' and 'date_created' values.

Erik
> --
> You received this message because you are subscribed to the Google Groups "Django users" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
> To post to this group, send email to django...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/a9ea312a-578e-4a2f-a739-a38e87a4b15b%40googlegroups.com.

Erik Cederstrand

unread,
Aug 30, 2016, 5:33:44 AM8/30/16
to Django Users
> for b in B.objects.all().select_related('a').annotate(Max('date_created')):

That should probably be: Max('a__b__date_created') instead.

Erik

Web Architect

unread,
Aug 30, 2016, 6:16:18 AM8/30/16
to Django users
Hi Erik,

Thanks for the solution.

So, I understand we get all instances of B which are max (date created)  or latest (date created) for each instance of A and then we check b.text == 'ABCD' and select the corresponding instance of A. 
Can't we add additional filter to check if text='ABCD' instead of running it through the for loop? I am still yet to catch up with all the concepts of Django ORM hence, not sure about it.

Also, if we use Max('a__b__date_created') , do we need the following condition still in the for loop? :
    if b.date_created != b.date_created__max: 
        continue 

Thanks.

Erik Cederstrand

unread,
Aug 30, 2016, 6:29:20 AM8/30/16
to Django Users

> Den 30. aug. 2016 kl. 11.20 skrev Erik Cederstrand <erik+...@cederstrand.dk>:
>
> I'm not even sure that's possible to express in SQL, but it would probably be quite convoluted if it is. Here's an easier-to-understand solution:
>
> res = set()
> for b in B.objects.all().select_related('a').annotate(Max('date_created')):
> if b.date_created != b.date_created__max:
> continue
> if b.text != 'ABCD':
> continue
> res.add(a)

I did some more experimenting. I think this actually does what you want:

res = [
b.a for b in B.objects
.filter(date_created=Max('a__b__date_created'))
.annotate(Max('a__b__date_created'))
.filter(text='ABCD')
.select_related('a')
]

which you can rewrite as:

A.objects.filter(
b__in=B.objects
.filter(date_created=Max('a__b__date_created'))
.annotate(Max('a__b__date_created'))
.filter(text='ABCD')
)

Erik

Marco Silva

unread,
Aug 30, 2016, 7:05:25 AM8/30/16
to Django users

Web Architect

unread,
Aug 30, 2016, 8:38:55 AM8/30/16
to Django users
Thanks Erik. Will try out the solution you mentioned.

Web Architect

unread,
Aug 31, 2016, 2:46:14 AM8/31/16
to Django users
Hi Erik,

I tried your solution but there are some issues:

.filter(date_created=Max('a__b__date_created'))  - this is throwing error saying not proper use of group function.

If I remove the above, the result isn't correct where when I go through each 'a' in the result, associated latest B.text isn't always 'ABCD' - if there are multiple instances of B associated with an instance of A and one of the instances of B has text='ABCD' (might not be the latest date_created), that instance of B is also there in the query result.

Thanks.


On Tuesday, August 30, 2016 at 3:59:20 PM UTC+5:30, Erik Cederstrand wrote:

Michal Petrucha

unread,
Aug 31, 2016, 4:18:38 AM8/31/16
to django...@googlegroups.com
On Tue, Aug 30, 2016 at 11:46:14PM -0700, Web Architect wrote:
> Hi Erik,
>
> I tried your solution but there are some issues:
>
> .filter(date_created=Max('a__b__date_created')) - this is throwing error
> saying not proper use of group function.
>
> If I remove the above, the result isn't correct where when I go through
> each 'a' in the result, associated latest B.text isn't always 'ABCD' - if
> there are multiple instances of B associated with an instance of A and one
> of the instances of B has text='ABCD' (might not be the latest
> date_created), that instance of B is also there in the query result.

Hmm, I haven't actually tried to run any code, but I'd try reordering
some of the method calls, and using an explcit field alias for the
annotation, which should let you then use an F() expression in
subsequent filters; something like this:

A.objects.filter(
b__in=B.objects
.annotate(max_date_created=Max('a__b__date_created'))
.filter(date_created=F('max_date_created'), text='ABCD')
)

Good luck,

Michal
signature.asc

Web Architect

unread,
Aug 31, 2016, 5:53:43 AM8/31/16
to Django users
Hi Michal,

Thanks for the solution. I tried the solution in a different way:

A.objects.annotate( max_date_created=Max('b__date_created')).filter(b__date_created=F('max_date_created'), b__text='ABCD') 

Hope the above amounts to the same thing as you have suggested.

Thanks.
Reply all
Reply to author
Forward
0 new messages