Django ORM generate inefficient SQL

164 views
Skip to first unread message

Alex Lebedev

unread,
Aug 21, 2014, 6:49:41 AM8/21/14
to django...@googlegroups.com
Hi, guys!

I have encountered a problem. When I use the following code in django shell:

    from django.contrib.auth.models import Group
    from django.db.models import Count
    print Group.objects.annotate(cnt=Count('user')).values('id', 'cnt').query.sql_with_params()
 
Django ORM generate the following SQL query:

    'SELECT `auth_group`.`id`, COUNT(`auth_user_groups`.`user_id`) AS `cnt` FROM `auth_group` LEFT OUTER JOIN `auth_user_groups` ON ( `auth_group`.`id` = `auth_user_groups`.`group_id` ) GROUP BY `auth_group`.`id`, `auth_group`.`name` ORDER BY NULL'

"auth_group.name" occurs in "group by" statement. But this column isn't represented in "select" statement. Such query is inefficient (expecially for large tables with many columns and rows).

Debuging of Django SQLCompiler ( https://github.com/django/django/blob/stable/1.6.x/django/db/models/sql/compiler.py#L568 ) gives me the following information:
- postgresql:
    "self.query.select" == "self.query.group_by" == "[(u'auth_group', u'id'), (u'auth_group', 'name')]"
    "self.connection.features.allows_group_by_pk" is False
    "
len(self.query.get_meta().concrete_fields) == len(self.query.select)" is False
   
'auth_group'.'name' appears in result because of "cols = self.query.group_by + having_group_by + select_cols"
- mysql:
    "self.query.select" == "self.query.group_by" == "[(u'auth_group', u'id'), (u'auth_group', 'name')]"
    "self.connection.features.allows_group_by_pk" is True
    "
len(self.query.get_meta().concrete_fields) == len(self.query.select)" is False
   
'auth_group'.'name' appears in result because of "cols = self.query.group_by + having_group_by + select_cols"

In the same time, the following code (without .values()):

    from django.contrib.auth.models import Group
    from django.db.models import Count
    print Group.objects.annotate(cnt=Count('user')).query.sql_with_params()

gives the right SQL query for mysql (because "len(self.query.get_meta().concrete_fields) == len(self.query.select)" is True):

'SELECT `auth_group`.`id`, `auth_group`.`name`, COUNT(`auth_user_groups`.`user_id`) AS `cnt` FROM `auth_group` LEFT OUTER JOIN `auth_user_groups` ON ( `auth_group`.`id` = `auth_user_groups`.`group_id` ) GROUP BY `auth_group`.`id` ORDER BY NULL'

Is it a bug? Should I create a bug report or something?

Thanks in advance!

Simon Charette

unread,
Aug 27, 2014, 2:37:00 AM8/27/14
to django...@googlegroups.com
This is already tracked in #19259.

Alex Lebedev

unread,
Aug 28, 2014, 7:28:18 AM8/28/14
to django...@googlegroups.com
Thanks for the answer! Yes, but this problem occurs regardless of database backend (I tested it on PostgreSQL and MySQL)

среда, 27 августа 2014 г., 12:37:00 UTC+6 пользователь Simon Charette написал:

Simon Charette

unread,
Aug 28, 2014, 10:36:19 AM8/28/14
to django...@googlegroups.com
The issue on MySQL should be tracked in another ticket I guess -- you're right that it should be able to GROUP only by `auth_group`.`id`.

Alex Lebedev

unread,
Aug 29, 2014, 1:03:03 AM8/29/14
to django...@googlegroups.com
Yes, I guess, in this case it should be able to group only by `auth_group`.`id` for each database backend.
Should I create ticket(s) about the issue on MySQL and/or the issue of columns in "group by" statement, or would you like to do it yourself?

четверг, 28 августа 2014 г., 20:36:19 UTC+6 пользователь Simon Charette написал:

Simon Charette

unread,
Aug 29, 2014, 1:50:01 AM8/29/14
to django...@googlegroups.com
I think you should create a single issue pointing to the fact that only selected columns (either through values() or only()) should be grouped by, regardless of the backend.

I suggest you include your initial post body in the report.

Thanks!

Alex Lebedev

unread,
Aug 29, 2014, 2:56:03 AM8/29/14
to django...@googlegroups.com
I have created a ticket - https://code.djangoproject.com/ticket/23383
Should I try to proceed my investigations and create a patch?

пятница, 29 августа 2014 г., 11:50:01 UTC+6 пользователь Simon Charette написал:
Reply all
Reply to author
Forward
0 new messages