I have a simple model with 'id' as a primary key and I have a list of
IDs that were already processed in variable 'processed_ids'.
How can I retrieve all objects that are *not* in that list?
Something like "Model.objects.filter(id__not_in = processed_ids)".
Unlike filter(id__in=...) which works just fine the __not_in modifier
apparently isn't understood by django.
I hope I won't have to fetch everything and then filter manually in a loop.
Any ideas?
Thanks!
Michal
As commented recently on the list, you want "filter()"'s evil
twin "exclude()":
exclude(id_in=...)
http://www.djangoproject.com/documentation/db-api/#exclude-kwargs
-tim
That's it. Thanks.
Now with the above example if I don't have a list of IDs but a list of
the processed objects I'm not able to pass these as an argument to the
__in operator.
This: Model.objects.filter(...).exclude(id__in = processed)
gives me: Truncated incorrect DOUBLE value: 'Model object'
I have to use id__in=[p.id for p in processed] instead.
Am I doing something wrong again or is it expected behaviour?
Thanks
Michal
I've seen the above request on the ML several times now.
Is there any hope that, in the QuerySet refactor, the "__in" form
can sniff the datatype of the parameter for other querysets?
Thus, you could have something like
users = Users.objects.filter(...)
things = Foo.objects.filter(owner_id__in=users)
it seems that this would just translate to an IN/EXISTS query,
using the query definition from "users" to build the sub-query.
It's far more efficient to do that all on the server and just
return the correct results, rather than
things = Foo.objects.filter(owner_id__in=[
user.id for user in users])
which would force the entire set of users to be brought back and
translated into a list, and then ship off a potentially-long list
of IDs back in a 2nd query. (Imagine len(users) > REALLY_BIG_NUM)
The pseudo-code as I'd imagine it would be something like
if query_type = "in":
if issubclass(paramenter, QuerySet):
where_clause = 'IN (SELECT ID FROM %s WHERE %s)' % (
param.from_clause(), param.where_clause()
)
else:
where_clause = do_whatever_it_currently_does()
I haven't explored the QS refactor branch, but am looking forward
to its arrival so I can get started on tinkering with the
aggregate code.
-tkc
No, since I'm not adding every single enhancement possible in that
branch. You're talking about a change that requires nested sub-queries,
which will be easier/possible with that code, but not something I'm
going to implement immediately.
It's also not a showstopper user-case in practice. Even if you do pull
back 10,0000 rows, extract their id values (which you do with
a .values() query the first time) and sent a second query, it isn't that
bad. Databases are fast here in the 21st century. Sometimes it will be a
problem, but for the majority of cases, not so much. Nice behaviour to
have one day, but less important than a number of other issues.
Malcolm
--
Works better when plugged in.
http://www.pointy-stick.com/blog/