Max function and grouping with Oracle

19 views
Skip to first unread message

Dan Davis

unread,
Nov 26, 2018, 12:32:09 PM11/26/18
to Django users
I have a parent model that has a relationship to some data that is changing:


class Parent(models.Model):
     name = models.CharField(...)
     created_timestamp = models.DateTimeField(auto_now_add=True, null=True)
     updated_timestamp = models.DateTimeField(auto_now=True, null=True)


class Child(models.Model):
     parent = models.ForeignKey(Parent, on_delete=models.CASCADE)
     created_timestamp = models.DateTimeField(auto_now_add=True, null=True)
     updated_timestamp = models.DateTimeField(auto_now=True, null=True)


I am trying to annotate a query with a Max updated_timestamp for the children:

Parent.objects.annotate(child_updated_timestamp=models.Max('child__updated_timestamp', output_field=models.DateTimeField()))

It seems like Oracle backend is attempting to GROUP BY every field in the child model.

Can anyone tell me whether they've seen anything like this and how to constrain the GROUP BY?

Thanks,

-Dan


     
    

Dan Davis

unread,
Nov 26, 2018, 1:03:19 PM11/26/18
to Django users
Looks like there is no problem here.   Django groups by whatever is in the query, so that this would be done as follows:

Parent.objects.values('id').annotate(child_updated_timestamp=models.Max('child_updated_timestamp'))

Simon Charette

unread,
Nov 27, 2018, 3:46:08 AM11/27/18
to Django users
Dan,

The root of the issue here is that you query is asking to retrieve all fields from
the Parent table and perform an aggregation on the JOIN'ed child table.

That results in the following query

SELECT parent.*, MAX(child.updated_timestamp)
FROM parent
JOIN child ON (child.parent_id = parent.id)
...

Since you SELECT all parent columns Django has no choice but to GROUP BY
parent.* as well because on Oracle GROUP BY must contain all non-aggregate
SELECT clauses. That is not the case on MySQL and PostgreSQL for example.

If you use .values('id') as you described you'll only SELECT parent.id which will
solve your issue as you've come to discover.

FWIW .only('id') would have worked as well and returned Parent objects instead
of a dict but all fields accesses would have been deferred.

Also there's a ticket tracking support for aggregation through subquery that would
have worked for your case[0]

Parent.objects.annotate(
    child_updated_timestamp=Child.objects.filter(
        parent=OuterRef('pk'),
    ).aggregate(Max('updated_timestamp'))
)

Cheers,
Simon

Dan Davis

unread,
Feb 26, 2019, 8:35:51 PM2/26/19
to Django users
> That is not the case on MySQL and PostgreSQL for. example.

+1 Insightful.

I'm still trying to get my Ops guys to let us drop and re-create databases for Postgres/MySQL.  They are very afraid to give that permission because they worry about production databases, and it is hard for them to manage grants one way for some databases and other ways for other databases.

So, a long way to go to PostgreSQL


Reply all
Reply to author
Forward
0 new messages