[Django] #25377: COUNT(*) not COUNT('*')

16 views
Skip to first unread message

Django

unread,
Sep 10, 2015, 12:04:18 PM9/10/15
to django-...@googlegroups.com
#25377: COUNT(*) not COUNT('*')
----------------------------------------------+--------------------
Reporter: adamchainz | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Database layer (models, ORM) | Version: 1.8
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 1
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
Performance regression on 1.8 expression refactor. Count stopped
outputting `COUNT(*)` and started outputting `COUNT('*')` due to using a
`Value`. Possibly unnoticed until now because it doesn't affect the most
popular database options much - I can't measure a performance drop for
this on MySQL / InnoDb.

However I have managed to measure a performance drop on MariaDB + Aria
(=MySQL fork + MyISAM fork):

{{{
adamj@localhost [3]> select count(*) from count_test;
+----------+
| count(*) |
+----------+
| 10000 |
+----------+
1 row in set (0.00 sec)

adamj@localhost [10]> select benchmark(100 * 1000 * 1000, (select count(*)
from count_test));
+-----------------------------------------------------------------+
| benchmark(100 * 1000 * 1000, (select count(*) from count_test)) |
+-----------------------------------------------------------------+
| 0 |
+-----------------------------------------------------------------+
1 row in set (0.86 sec)

adamj@localhost [11]> select benchmark(100 * 1000 * 1000, (select
count('*') from count_test));
+-------------------------------------------------------------------+
| benchmark(100 * 1000 * 1000, (select count('*') from count_test)) |
+-------------------------------------------------------------------+
| 0 |
+-------------------------------------------------------------------+
1 row in set (1.23 sec)

}}}

This is because MyISAM / Aria store the count in a metadata variable but
once you introduce the expression it figures it has to do a table scan.
The situation would of course only get worse with more rows, 10000 is
tiny.

--
Ticket URL: <https://code.djangoproject.com/ticket/25377>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 10, 2015, 12:07:42 PM9/10/15
to django-...@googlegroups.com
#25377: COUNT(*) not COUNT('*')
-------------------------------------+-------------------------------------
Reporter: adamchainz | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by adamchainz):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Comment:

Pull request: https://github.com/django/django/pull/5260

--
Ticket URL: <https://code.djangoproject.com/ticket/25377#comment:1>

Django

unread,
Sep 10, 2015, 12:51:33 PM9/10/15
to django-...@googlegroups.com
#25377: Regression in expressions refactor causes database queries to run
COUNT('*') instead of COUNT(*)
-------------------------------------+-------------------------------------
Reporter: adamchainz | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_better_patch: 0 => 1
* stage: Unreviewed => Accepted
* severity: Normal => Release blocker


--
Ticket URL: <https://code.djangoproject.com/ticket/25377#comment:2>

Django

unread,
Sep 10, 2015, 12:57:42 PM9/10/15
to django-...@googlegroups.com
#25377: Regression in expressions refactor causes database queries to run
COUNT('*') instead of COUNT(*)
-------------------------------------+-------------------------------------
Reporter: adamchainz | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by adamchainz):

I'm not 100% confident in my performance measurement there, I just tried
on a multi-million row table on my production DB converted to MyISAM and
`COUNT(*)` measured the same as `COUNT('*')`. However since there are so
many database versions and storage engines out there I think it's probable
this caused a regression somewhere.

--
Ticket URL: <https://code.djangoproject.com/ticket/25377#comment:3>

Django

unread,
Sep 11, 2015, 9:56:09 AM9/11/15
to django-...@googlegroups.com
#25377: Regression in expressions refactor causes database queries to run
COUNT('*') instead of COUNT(*)
-------------------------------------+-------------------------------------
Reporter: adamchainz | Owner: nobody

Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_better_patch: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/25377#comment:4>

Django

unread,
Sep 14, 2015, 1:42:57 PM9/14/15
to django-...@googlegroups.com
#25377: Regression in expressions refactor causes database queries to run
COUNT('*') instead of COUNT(*)
-------------------------------------+-------------------------------------
Reporter: adamchainz | Owner: nobody
Type: | Status: closed

Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Release blocker | Resolution: fixed

Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: new => closed
* resolution: => fixed


Comment:

In [changeset:"3fe3887a2ed94f7b15be769f6d81571031ec5627" 3fe3887a]:
{{{
#!CommitTicketReference repository=""
revision="3fe3887a2ed94f7b15be769f6d81571031ec5627"
Fixed #25377 -- Changed Count queries to execute COUNT(*) instead of
COUNT('*').
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/25377#comment:5>

Django

unread,
Sep 14, 2015, 2:14:38 PM9/14/15
to django-...@googlegroups.com
#25377: Regression in expressions refactor causes database queries to run
COUNT('*') instead of COUNT(*)
-------------------------------------+-------------------------------------
Reporter: adamchainz | Owner: nobody

Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 1.8
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"3c2c74f58f9261feaa5ad9dfb596213a654c2e25" 3c2c74f]:
{{{
#!CommitTicketReference repository=""
revision="3c2c74f58f9261feaa5ad9dfb596213a654c2e25"
[1.8.x] Fixed #25377 -- Changed Count queries to execute COUNT(*) instead
of COUNT('*').

Backport of 3fe3887a2ed94f7b15be769f6d81571031ec5627 from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/25377#comment:6>

Reply all
Reply to author
Forward
0 new messages