For some awkward reason I have the following queryset:
{{{#!python
qs = MyModel.objects.all()
qs = qs.annotate(
total=Sum('statstable__total'),
closed=Sum('statstable__closed'),
opened=Sum('statstable__opened'),
#opened_percent=Avg('statstable__opened_percent'), # The average of
the percents is not the correct ratio
#closed_percent=Avg('statstable__closed_percent'), # value between
closed and total
).filter(total__gte=min_total)
qs = qs.extra(
select={
'closed_percent':
'SUM(statstable.closed)/SUM(statstable.total)*100',
'opened_percent':
'SUM(statstable.opend)/SUM(statstable.total)*100',
}
)
}}}
When you try to do this on Oracle, it will put 'closed_percent' and
'opened_percent' on GROUP BY list, which will raise a `DatabaseError:
ORA-00934: group function is not allowed here`.
Since we are not supposed to mess with ".group_by" without Django's inner
code, there is no way to solve this problem.
The quick and dirty solution for me right now was to edit
`django/db/models/sql/compiler.py` at line 572 and add the following code:
{{{#!python
if isinstance(col, (list, tuple)):
result.append('%s.%s' % (qn(col[0]), qn(col[1])))
elif hasattr(col, 'as_sql'):
result.append(col.as_sql(qn, self.connection))
else:
if 'avg(' in col.lower() or 'sum(' in col.lower():
# This is what I added, just checking if the extra
select/column has the
# group functions that I need, if it does, skip
it, since you
# can't GROUP BY group functions
continue
result.append('(%s)' % str(col))
}}}
I'm using Django 1.4 on this project (with no ETA to upgrade,
unfortunally), and I see that on HEAD it has changed a bit, but remains
"Unconditionally adding the extra_select items", which is the root of my
problems:
https://github.com/django/django/blob/master/django/db/models/sql/compiler.py#L575
--
Ticket URL: <https://code.djangoproject.com/ticket/21524>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
This "validation" also must happens at line 114:
{{{#!python
if not self.connection.features.allows_group_by_pk:
for col, col_params in ordering_group_by:
if col not in grouping:
if 'avg(' in col.lower() or 'sum(' in
col.lower():
continue
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/21524#comment:1>
Comment (by aaugustin):
You're clearly pushing the boundaries of what the ORM can do... Writing
the query in plain SQL (look for raw SQL in the documentation) could be
easier at this point — and a short term solution.
I'll let someone who knows the aggregation code better than me make a
decision on this ticket.
--
Ticket URL: <https://code.djangoproject.com/ticket/21524#comment:2>
Comment (by shai):
The solution suggested is indeed dirty -- and bogus (I can think up valid
queries it breaks, the most trivial of which adds constant 'sum()' in the
extra columns).
In a similar case, I did something like
{{{#!python
def annotate_percents(qset):
class AnnotatedQuerySet(qset.__class__):
def iterator(self):
for obj in super(AnnotatedQuerySet, self).iterator():
obj.closed_percent = obj.closed/obj.total * 100
obj.opened_percent = obj.opened/obj.total * 100
yield obj
qset.__class__ = AnnotatedQuerySet
return qset
}}}
Now, in the case you have with AVG you may really need to do things in the
database; you can still do this efficiently by judicious use of
`prefetch_related`.
It can allow you to do things that look like you add a new related query
per object returned, but really only cost you one more database access per
the whole queryset.
I hope this helps; note that even if something like your use-case ends up
supported, it will probably not be in Django 1.4.x.
--
Ticket URL: <https://code.djangoproject.com/ticket/21524#comment:3>
Comment (by eu@…):
@Shai, thanks for your help, but this query that I do is used in 2~3
tables with almost same columns (they are statistical tables), and I need
to group data by ranges of dates and then order the result by the highest
and lowest percentage, hence my need for all the math being done by the
database. I don't think getting all the data and then math and sort on the
python would be a good solution for now.
@aaugustin, I do indeed am pushing what the ORM can do, and will totally
accept this ticket be closed as "invalid" or "wontfix", even if I'm just
using two queryset methods: `.extra` and `.annotate`.
--
Ticket URL: <https://code.djangoproject.com/ticket/21524#comment:4>
* status: new => closed
* resolution: => wontfix
Comment:
I don't think we can support this. It is impossible to detect which parts
of the SQL in extra select needs to be in group by. Trying to parse the
SQL doesn't work. Consider for example:
account_balance - SUM(transaction_amount)
Here account_balance must be in group by, but transaction_amount not.
The .extra() as it is currently implemented in Django just doesn't work
for your use case. I hope we will have something better later on, but for
now I'll have to just wontfix this issue.
--
Ticket URL: <https://code.djangoproject.com/ticket/21524#comment:5>