[Django] #19493: annotate(Count()) does not work properly with django.db.backends.oracle

23 views
Skip to first unread message

Django

unread,
Dec 18, 2012, 7:47:35 AM12/18/12
to django-...@googlegroups.com
#19493: annotate(Count()) does not work properly with django.db.backends.oracle
----------------------------------------------+--------------------
Reporter: kimvais@… | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer (models, ORM) | Version: 1.4
Severity: Normal | Keywords: oracle
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
An QuerySet operation like:
models.SomeModel.objects.filter(somecharfield=value).values("someintegerfield").annotate(Count("someintegerfield"))

return a QuerySet with no aggregate count, but multiple rows with
someintegerfield__count=1

The correct SQL would be: SELECT
tablespace_somemodel.someintegerfield,COUNT(tablespace_somemodel.someintegerfield)
AS someintegerfield__count FROM tablespace_somemodel WHERE
tablespace_somemodel.somecharfield = 'value' GROUP BY
tablespace_somemodel.someintegerfield;

This works fine on other backends.

--
Ticket URL: <https://code.djangoproject.com/ticket/19493>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Dec 18, 2012, 8:17:24 AM12/18/12
to django-...@googlegroups.com
#19493: annotate(Count()) does not work properly with django.db.backends.oracle
-------------------------------------+-------------------------------------

Reporter: kimvais@… | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.4
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: oracle | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by akaariai):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Old description:

> An QuerySet operation like:
> models.SomeModel.objects.filter(somecharfield=value).values("someintegerfield").annotate(Count("someintegerfield"))
>
> return a QuerySet with no aggregate count, but multiple rows with
> someintegerfield__count=1
>
> The correct SQL would be: SELECT
> tablespace_somemodel.someintegerfield,COUNT(tablespace_somemodel.someintegerfield)
> AS someintegerfield__count FROM tablespace_somemodel WHERE
> tablespace_somemodel.somecharfield = 'value' GROUP BY
> tablespace_somemodel.someintegerfield;
>
> This works fine on other backends.

New description:

An QuerySet operation like:
models.SomeModel.objects.filter(somecharfield=value).values("someintegerfield").annotate(Count("someintegerfield"))

return a QuerySet with no aggregate count, but multiple rows with
`someintegerfield__count=1`

The correct SQL would be:
{{{
SELECT tablespace_somemodel.someintegerfield,

COUNT(tablespace_somemodel.someintegerfield) AS someintegerfield__count


FROM tablespace_somemodel WHERE tablespace_somemodel.somecharfield =
'value'
GROUP BY tablespace_somemodel.someintegerfield;
}}}

This works fine on other backends.

--

Comment:

What SQL do you get? settings.DEBUG = True + connection.queries should
tell you the executed query easily.

Quickly checking it seems a QuerySet with multiple rows having
`someintegerfield__count=1` seems the correct answer for the query.

--
Ticket URL: <https://code.djangoproject.com/ticket/19493#comment:1>

Django

unread,
Dec 19, 2012, 3:22:58 AM12/19/12
to django-...@googlegroups.com
#19493: annotate(Count()) does not work properly with django.db.backends.oracle
-------------------------------------+-------------------------------------

Reporter: kimvais@… | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.4
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: oracle | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by kimvais@…):

The SQL produced by Django, which returns incorrect results (1):

{{{#!sql
SELECT "TABLESPACE_SOMEMODEL"."SOMEINTEGERFIELD",
COUNT("TABLESPACE_SOMEMODEL"."SOMEINTEGERFIELD") AS
"SOMEINTEGERFIELD__COUNT" FROM "TABLESPACE_SOMEMODEL" WHERE
"TABLESPACE_SOMEMODEL"."SOMECHARFIELD" = active
GROUP BY "TABLESPACE_SOMEMODEL"."SOMEINTEGERFIELD",
"TABLESPACE_SOMEMODEL"."ID" ORDER BY "TABLESPACE_SOMEMODEL"."ID" DESC
}}}

Which returns a QuerySet containing lots of rows with same values, eg:
[(1,1), (1, 1), (2, 1), (2, 1), (2, 1)] when correct would be [(1, 2), (2,
3)]
where each item is (someintegerfield, someintegerfield_count)

Query (2) that produces correct results on MySQL:
{{{#!sql
SELECT `tablespace_somemodel`.`someintegerfield`,
COUNT(`tablespace_somemodel`.`someintegerfield`) AS
`someintegerfield__count` FROM `tablespace_somemodel` WHERE
`tablespace_somemodel`.`somecharfield` = active GROUP BY
`tablespace_somemode
l`.`someintegerfield` ORDER BY `tablespace_somemodel`.`id` DESC
}}}
Correctly working query (3) for Oracle (tested on sqlplus command line)
{{{#!sql
SELECT tablespace_somemodel.someintegerfield,
COUNT(tablespace_somemodel.someintegerfield) AS someintegerfield__count


FROM tablespace_somemodel WHERE tablespace_somemodel.somecharfield =
'value'
GROUP BY tablespace_somemodel.someintegerfield;
}}}

However, query 3 cannot be passed to SomeModel.objects.raw() as "raw
queries
must contain primary key".

Furthermore,
{{{#!python
SomeModel.objects.raw("SELECT tablespace_somemodel.id,
tablespace_somemodel.someintegerfield, "
"COUNT(tablespace_somemodel.someintegerfield) AS


someintegerfield__count "
"FROM tablespace_somemodel WHERE tablespace_somemodel.somecharfield =

'active' "
"GROUP BY tablespace_somemodel.someintegerfield,
tablespace_somemodel.id")
}}}

goes in to infinite recursion loop of

{{{#!python
Exception RuntimeError: 'maximum recursion depth exceeded while calling a
Python object' in <type 'exceptions.KeyError'> ignored
}}}

But that probably is not a big issues, since I guess the results would in
any case be the same as the "wrong" query 1.
There probably isn't a simple fix for this, as it seems to me that there
is an inherent requirement for the primary key in order to construct a
QuerySet, but in oracle GROUP BY statement MUST contain all the selected
fields - which results the Count() to aggregate the results based on both
type and id.

--
Ticket URL: <https://code.djangoproject.com/ticket/19493#comment:2>

Django

unread,
Dec 19, 2012, 11:35:11 PM12/19/12
to django-...@googlegroups.com
#19493: annotate(Count()) does not work properly with django.db.backends.oracle
-------------------------------------+-------------------------------------
Reporter: kimvais@… | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: 1.4
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: oracle | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by russellm):

* type: Uncategorized => Bug
* stage: Unreviewed => Accepted


Comment:

I'm guessing an important, but unstated part of this bug report is that
your model needs to have an implicit ordering on the primary key; I
haven't been able to reproduce the "working" SQL without that important
element.

The "wrong" results are being generated because the PK is included in the
GROUP BY clause; I'm guessing this is because the ORDER BY clause exists.
PKs aren't required for all querysets -- values() querysts can drop them
under most circumstances, but they will be added back under certain
ordering or join situations. I'm guessing the Oracle backend's logic is
slightly different, and isn't pruning the PK as needed.

Regarding the call to raw() -- Yes, it needs to have a PK field present;
however, the fact that you were able to get an infinite recursion is
alarming. I haven't been able to reproduce on a non-Oracle backend; if
this problem isn't fixed by the fix for the COUNT, then it should be
opened as a separate ticket.

--
Ticket URL: <https://code.djangoproject.com/ticket/19493#comment:3>

Django

unread,
Dec 20, 2012, 1:20:54 AM12/20/12
to django-...@googlegroups.com
#19493: annotate(Count()) does not work properly with django.db.backends.oracle
-------------------------------------+-------------------------------------
Reporter: kimvais@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.4
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: oracle | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by kimvais@…):

You are correct, I have "ordering = ('-pk')" in the model Meta - (to make
sure that the ordering is correct on MySQL - which doesn't store fractions
of second in date fields). I guess I have to pick lesser of two evils :)

--
Ticket URL: <https://code.djangoproject.com/ticket/19493#comment:4>

Django

unread,
Dec 20, 2012, 3:33:52 AM12/20/12
to django-...@googlegroups.com
#19493: annotate(Count()) does not work properly with django.db.backends.oracle
-------------------------------------+-------------------------------------
Reporter: kimvais@… | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 1.4
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: oracle | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by kimvais@…):

Just tested and verified that a query with explicit removal of ordering
works correctly on Oracle, e.g.
{{{#!python
models.SomeModel.objects.order_by().filter(somecharfield=value).values("someintegerfield").annotate(Count("someintegerfield"))
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/19493#comment:5>

Django

unread,
Mar 11, 2019, 11:56:08 PM3/11/19
to django-...@googlegroups.com
#19493: annotate(Count()) does not work properly with django.db.backends.oracle
-------------------------------------+-------------------------------------
Reporter: kimvais@… | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: 1.4
(models, ORM) |
Severity: Normal | Resolution: duplicate
Keywords: oracle | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => closed
* resolution: => duplicate


Comment:

Duplicate of #14357 which started deprecating `Meta.ordering` usage in
such queries in 1b1f64ee5a78cc217fead52cbae23114502cf564.

--
Ticket URL: <https://code.djangoproject.com/ticket/19493#comment:6>

Reply all
Reply to author
Forward
0 new messages