A photo-app I'm working on has tags on an image defined as a many to many
relationship:
class Image(models.Model):
filename = models.CharField(maxlength=255)
description = models.TextField()
tags = models.ManyToManyField(Tag)
Given several tags: TagA, TagB, ..., I'm trying to find images that
contain each of the tags.
I try this:
In [73]:Image.objects.filter(tags__id=1).filter(tags__id=2)
Out[73]:[]
However, I know images with both of those tags exist, eg:
In [73]:i = Image.objects.filter(tags__id=1)[0]
In [74]:i
Out[74]:<Image: 2004-09-23++14-10-03_++_dsc_5994.jpg>
In [75]:[tag.id for tag in i.tags.all()]
Out[75]:[1, 2, 3, 17, 170, 259, 362, 387, 414, 647, 681, 810]
How can I find the image(s) with a given set of tags?
Thanks
Neilen
--
you know its kind of tragic
we live in the new world
but we've lost the magic
-- Battery 9 (www.battery9.co.za)
I think you want to use:
Image.objects.filter(tags__id__in=[1,2]).distinct()
Thanks for the info.
On Thu, 20 Jul 2006 10:05:43 +1000, Matthew Flanagan wrote:
>> In [73]:Image.objects.filter(tags__id=1).filter(tags__id=2)
>> Out[73]:[]
>>
>
> I think you want to use:
>
> Image.objects.filter(tags__id__in=[1,2]).distinct()
Alas, this returns all the images with tag1 OR tag2. I'm looking for
images with tag1 AND tag2.
Thanks
Neilen
Here's a solution that I prepared earlier (it talks about finding all
groups containing a subset of people, but it's the same problem):
http://www.pointy-stick.com/blog/2006/06/14/custom-sql-django/
Regards,
Malcolm
On Thu, 20 Jul 2006 16:56:22 +1000, Malcolm Tredinnick wrote:
>> > I think you want to use:
>> >
>> > Image.objects.filter(tags__id__in=[1,2]).distinct()
>>
>> Alas, this returns all the images with tag1 OR tag2. I'm looking for
>> images with tag1 AND tag2.
>
> Here's a solution that I prepared earlier (it talks about finding all
> groups containing a subset of people, but it's the same problem):
>
> http://www.pointy-stick.com/blog/2006/06/14/custom-sql-django/
Wow, thanks. What a nice writeup! I assume you don't mind me using your
code in whatever way I choose? As part of a baby-killing machine as it
happens ;-) (not really of course)
One comment about the code, it only seems to work if the list of people is
2 or more. There seems to be an error with the tuple()'ed bit in the query
if the list is of length 1. I just special cased len() == 1 to make this
work in my code.
A second more general note about the Django ORM. First off, excuse me if
I'm talking through my ass, I have almost no experience with DBs, but...
The Django ORM API seems to imply that chaining filter() (i.e.
queryset.filter(stuffA).filter(stuffB)...filter(stuffZ)) is the logical
equivalent of ANDing together the conditions stuffA ... stuffZ.
If this is technically infeasible to do on m2m relations I think it would
be better for the ORM to raise an error of some sort rather than just
returning an empty set. Also, would it be feasible for a query like this
with multiple m2m conditions to realise what's going on and to call the
code described on your blog, or am I just dreaming?
Regards
Neilen
>
> Regards,
> Malcolm
[...]
> One comment about the code, it only seems to work if the list of people is
> 2 or more. There seems to be an error with the tuple()'ed bit in the query
> if the list is of length 1. I just special cased len() == 1 to make this
> work in my code.
Excellent point! (For those playing along at home, str(tuple([10])) is
not something you can put into an SQL query.) That tuple() hack is just
that -- a fairly ugly hack. I thought I was avoiding most of the
pitfalls (e.g. it's very dangerous for strings, so don't try that trick
on a model where the primary key is not an integer), but had forgotten
about the length one case. Thanks.
> A second more general note about the Django ORM. First off, excuse me if
> I'm talking through my ass, I have almost no experience with DBs, but...
> The Django ORM API seems to imply that chaining filter() (i.e.
> queryset.filter(stuffA).filter(stuffB)...filter(stuffZ)) is the logical
> equivalent of ANDing together the conditions stuffA ... stuffZ.
>
> If this is technically infeasible to do on m2m relations I think it would
> be better for the ORM to raise an error of some sort rather than just
> returning an empty set.
Because of the way QuerySets are constructed, there's no easy way to
tell when a user is walking into this case. The current behaviour may
reasonably be considered a semi-bug, however it's unavoidable at the
moment since we form the query by combining the "where" clauses as
pieces, rather than taking a more holistic view of the query. Fixing
that would be unbelievably hard in the current implementation (to get it
right for the general case). It might happen one day, but for now it's
the old punchline that follows: "Doctor, it hurts when I do this..."
> Also, would it be feasible for a query like this
> with multiple m2m conditions to realise what's going on and to call the
> code described on your blog, or am I just dreaming?
I've sat down more than once to try and work out if there were ways to
do this (it crops up a bit when trying to construct efficient aggregated
queries as well) and it is a very similar problem to trying to write an
SQL query optimizer in the general case. Not trivial.
Malcolm
> Because of the way QuerySets are constructed, there's no easy way to
> tell when a user is walking into this case. The current behaviour may
> reasonably be considered a semi-bug, however it's unavoidable at the
> moment since we form the query by combining the "where" clauses as
> pieces, rather than taking a more holistic view of the query. Fixing
> that would be unbelievably hard in the current implementation (to get it
> right for the general case). It might happen one day, but for now it's
> the old punchline that follows: "Doctor, it hurts when I do this..."
Haha, well, doctor, I won't hold my breath then :) Would definitely be
nice to have though.
On another note, there seems to be another unhandled corner case. If the
query returns no results, the final
return self.filter(id__in = group_idents)
gets called with group_idents=[]. filter(id__in=[]) seems to make the ORM
unhappy. I "fixed" this with the hacktastic solution of always appending a
0 to group_idents. It seems that all id's are > 0, which means that
id__in=[0] will return an empty query set. Is this a reasonable solution?
Regards
Neilen