Incorrect query generated from ~Q/exclude

28 views
Skip to first unread message

kirk

unread,
May 12, 2020, 11:56:57 AM5/12/20
to Django users
Django 2.2.12
Python 3.7.5
Postgres 10.11

I am receiving what I perceive to be an incorrect SQL query from a Django queryset.  I am hoping I've just done something wrong.

Here's the setup:

from django.db.models import DecimalField, DO_NOTHING, F, ForeignKey, Model, Q, Sum, TextField


class TableA(Model):
    name
= TextField()

   
class Meta:
        db_table
= "tablea"


class TableB(Model):
    tablea
= ForeignKey(TableA, DO_NOTHING, null=True)
    value
= DecimalField(max_digits=5, decimal_places=2)

   
class Meta:
        db_table
= "tableb"

Here's some sample data:

insert into tablea (id, name)
select * from (values (1, 'NAME 1'), (2, 'NAME 2'), (3, 'NAME 3')) t;

insert into tableb (id, tablea_id, value)
select * from (values
    (1, 1, 0), (2, 1, 5),
    (3, null, 13),
    (4, 2, 0), (5, 2, 0),
    (6, 3, -1), (7, 3, 1)
) t;

The goal is to group names from tablea and sum values from tableb where value != 0.  If I were to write it purely in SQL I'd probably do something like this:

select   a.name, sum(b.value)
from     tablea a inner join tableb b on b.tablea_id = a.id
where    b.value != 0
group by a.name;

The answer should be:

NAME 3     0
NAME
1     5

If I write the Django queryset as one of the following:

qs = (
   
TableA.objects.filter(~Q(tableb__value=0))
   
.annotate(the_name=F("name"), the_sum=Sum("tableb__value"))
   
.values("the_name", "the_sum")
)

qs
= (
    TableA.objects.exclude(tableb__value=0)
   
.annotate(the_name=F("name"), the_sum=Sum("tableb__value"))
   
.values("the_name", "the_sum")
)

I get SQL that I really didn't expect:

SELECT   "tablea"."name" AS "the_name",
         SUM
("tableb"."value") AS "the_sum"
FROM     
"tablea"
         LEFT OUTER JOIN
"tableb" ON ("tablea"."id" = "tableb"."tablea_id")
WHERE    NOT
(
           
"tablea"."id" IN (
                SELECT  U1
."tablea_id"
                FROM    
"tableb" U1
                WHERE  
(U1."value" = 0 AND U1."tablea_id" IS NOT NULL)
           
)
         
)
GROUP BY
"tablea"."id";

which returns the wrong answer:

NAME 3     0

If I rewrite the queryset as such, I get the correct answer:

qs = (
   
TableA.objects.filter(Q(tableb__value__gt=0) | Q(tableb__value__lt=0))
   
.annotate(the_name=F("name"), the_sum=Sum("tableb__value"))
   
.values("the_name", "the_sum")
)

This translates to:

SELECT   "tablea"."name" AS "the_name",
         SUM
("tableb"."value") AS "the_sum"
FROM     
"tablea"
         INNER JOIN
"tableb" ON ("tablea"."id" = "tableb"."tablea_id")
WHERE    
"tableb"."value" > 0 OR "tableb"."value" < 0
GROUP BY
"tablea"."id";

Am I doing something wrong in my ~Q/exclude querysets?  Is this working as intended?

Thanks!

kirk

unread,
May 18, 2020, 9:21:37 AM5/18/20
to Django users
I struggled to find this in the Django ticketing system when I posted it, but I searched again this morning and found a few bugs that seem to be related.  They were closed with comments indicating that https://code.djangoproject.com/ticket/10060 seems to be the root cause.  Looks like this is a known, longstanding issue.
Reply all
Reply to author
Forward
0 new messages