[Django] #22709: Pproblem with negating Q object

Django

unread,

May 27, 2014, 11:21:12 AM5/27/14

to django-...@googlegroups.com

#22709: Pproblem with negating Q object
----------------------------------------------+--------------------
Reporter: thierryb@… | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer (models, ORM) | Version: 1.6
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
Hello,

I have 2 models. One is the computer model which has a M2M field linking
to an application model.

If I want to know the computers that have a specific app I can for
instance do:
Q(applications__name='app name', applications_version=12)
Computer.objects.filter(q)

this will translate into the right where clause with both conditions.

Now if I want the opposite, I do:
Computer.objects.filter(~q)

that will generate something totally different and will consider the 2
conditions separately and generate a where clause roughly like that:

NOT "computer"."id" IN
(
SELECT U1."computer_id" FROM "computer" U1
INNER JOIN "generic_application" U2 ON ( U1."application_id" = U2."id" )
WHERE U2."name" = 'MyApp'
)
AND
"computer"."id" IN
(
SELECT U1."computer_id" FROM "computer" U1
INNER JOIN "generic_application" U2 ON ( U1."application_id" = U2."id" )
WHERE U2."version"::text LIKE 10.0%
)

So it treats the 2 conditions are totally separate and handed separately.
Which will give me NOT all the computers that either have my app or an app
which version starts with 10.0.

In django 1.4, that used to work as expected (i.e. that both conditions
would be taken into account at once.

Now I understand that Django has to evolve I'm wonder how I can achieve
what I want.

BTW I purposefully simplified the example. We have a system that
generically generates those Q object instances and moving to Django 1.6 is
currently not really possible for us, given that behavior.

Any help, workaround or advice is very welcome.

Thanks.

--
Ticket URL: <https://code.djangoproject.com/ticket/22709>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,

May 27, 2014, 12:12:55 PM5/27/14

to django-...@googlegroups.com

#22709: Pproblem with negating Q object

-------------------------------------+-------------------------------------

Reporter: thierryb@… | Owner: nobody
Type: Uncategorized | Status: new

Easy pickings: 0 | UI/UX: 0

-------------------------------------+-------------------------------------
Changes (by akaariai):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0

Comment:

Could you include the details of the models used, and what query you got
in Django 1.4?

It might be that the reason for this regression is that both Django 1.4
and Django 1.6 generate a query that isn't actually correct. However, the
1.4 version happened to work for your use case. If that is the case it
will be hard to revert back to providing the 1.4 version of the query.

--
Ticket URL: <https://code.djangoproject.com/ticket/22709#comment:1>

Django

unread,

May 28, 2014, 9:33:38 AM5/28/14

to django-...@googlegroups.com

#22709: Pproblem with negating Q object

-------------------------------------+-------------------------------------

Reporter: thierryb@… | Owner: nobody
Type: Uncategorized | Status: new

Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0

-------------------------------------+-------------------------------------

Comment (by anonymous):

Thank you for your quick reply.
Looking deeper into the generate SQL for django 1.4, it looks like it
worked by chance, so here I'm focusing on 1.6 (and the conclusion +
solution I came to)

Here are a summary of the models I'm using:

class Application(models.Model):
name = models.TextField()
version = models.TextField()

class GenericClient(models.Model):
applications = models.ManyToManyField(Application)

Now, 1 example I'm doing is:
q=Q(applications__name='app name', applications__version='1')
qs = GenericClient.objects.only('pk).filter(q)

# at this point I'm expecting qs will contain the clients that have 'app
name' version '1'
# For reference if I print qs.query it gives me

SELECT "generic_genericclient"."id" FROM "generic_genericclient" INNER
JOIN "generic_clientapplication" ON ( "generic_genericclient"."id" =
"generic_clientapplication"."client_id" ) INNER JOIN "generic_application"
ON ( "generic_clientapplication"."application_id" =
"generic_application"."id" ) WHERE ("generic_application"."version" = 1
AND "generic_application"."name" = app name )

Which is what I'm expecting

Now the problem comes when I try to reverse that condition:
qs = GenericClient.objects.only('pk).filter(~q)

# here I'm expecting to get all the clients that don't have 'app name' in
version '1'
But instead of that I'm getting the clients don't have either 'app_name'
or any other application which version is '1'

> print qs.query
SELECT "generic_genericclient"."id" FROM "generic_genericclient" WHERE NOT
("generic_genericclient"."id" IN (SELECT U1."client_id" FROM
"generic_clientapplication" U1 INNER JOIN "generic_application" U2 ON (
U1."application_id" = U2."id" ) WHERE U2."version" = 1 ) AND
"generic_genericclient"."id" IN (SELECT U1."client_id" FROM
"generic_clientapplication" U1 INNER JOIN "generic_application" U2 ON (
U1."application_id" = U2."id" ) WHERE U2."name" = app name ))

I tried the Queryset.exclude route, but after having a look at the code I
understood it was the same as my Q version.

More thinking about it, I figured that my expected behavior was just an
execration from my side and that generalizing exclude is not easy at all.

So I decided to go with "sub-queryset", that would hopefully work better
and be clearer.

So I just wanted to exclude the clients that have 'app name' version '1'.

I did:
qs =
GenericClient.objects.only('pk').exclude(pk__in=GenericClient.objects.filter(q))

That gives us:
SELECT "generic_genericclient"."id" FROM "generic_genericclient" WHERE NOT
("generic_genericclient"."id" IN (SELECT U0."id" FROM
"generic_genericclient" U0 INNER JOIN "generic_clientapplication" U1 ON (
U0."id" = U1."client_id" ) INNER JOIN "generic_application" U2 ON (
U1."application_id" = U2."id" ) WHERE (U2."version" = 1 AND U2."name" =
app name )))

...Which is finally what I wanted. YAY success.

So for the the conclusion is that exclude only does what I want in a few
cases.
In general I think exclude is hard to implement on the django side. Not
even sure it was not a design mistake in the first place.

I basically wanted to share my experience with using it. On the actual
problem I consider it "FIXED" since I found a workaround and I will be
super careful when using exclude in the future.

Any comment is of course welcome.

--
Ticket URL: <https://code.djangoproject.com/ticket/22709#comment:2>

Django

unread,

May 28, 2014, 10:10:28 AM5/28/14

to django-...@googlegroups.com

#22709: Pproblem with negating Q object

-------------------------------------+-------------------------------------
Reporter: thierryb@… | Owner: nobody
Type: Uncategorized | Status: closed

Component: Database layer | Version: 1.6

(models, ORM) | Resolution: duplicate

Severity: Normal | Triage Stage:
Keywords: | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0

-------------------------------------+-------------------------------------
Changes (by akaariai):

* status: new => closed
* resolution: => duplicate

Comment:

Yes, implementing .exclude() correctly is hard - when multiple filter
conditions target the same multivalued join, a single subquery needs to be
generated in a way that it includes all the conditions. Doing this in a
generic way is surprisingly hard, and often the fix is to manually use a
subquery.

The exclude() behavior isn't a design mistake - the principle is very
sound - when doing .exclude(condition) you get the complement of
.filter(condition) set. The problem is that this is just really hard to
achieve in some cases, generating (or just writing) correct SQL queries is
sometimes just plain messy.

The upshot here is that I think there is some hope to use the style you
did manually - detect a (possibly multipart condition) that needs to be
pushed into a subquery (lets call this condition subq_condition), and then
just generate the subquery as
`qs.filter(pk__in=subq.filter(subq_condition).values_list('pk'))`. This
might not be the most efficient query, but in theory this should work for
many currently broken cases.

I'll close this as duplicate of #14645.

--
Ticket URL: <https://code.djangoproject.com/ticket/22709#comment:3>

Django

unread,

Aug 14, 2014, 10:28:18 AM8/14/14

to django-...@googlegroups.com

#22709: Problem with negating Q object
-------------------------------------+-------------------------------------
Reporter: thierryb@… | Owner: nobody