Is 'LIMIT 0' a valid query?

2 views
Skip to first unread message

Ben Walton

unread,
Dec 11, 2007, 5:20:10 AM12/11/07
to django-d...@googlegroups.com
I was just inspecting a couple of queries my application was
outputting. I noticed that a COUNT(*) query was made, followed by a
SELECT later on, as expected, which used the result from COUNT(*) to
LIMIT the queryset to the same results.

The COUNT(*) in this case returned 0, and the SELECT which followed it
ran the full query but with LIMIT 0 on the end.

I was wondering if this was a valid query, and if it needs to be ran
when the result is going to be an empty queryset. Or if the query will
be short-circuited by the database back end so no joins are actually
made.

I've not raised a ticket as I'm not sure if it's a non-issue so I
thought I'd ask on here first.

Thanks,
Ben Walton

Patryk Zawadzki

unread,
Dec 11, 2007, 5:28:36 AM12/11/07
to django-d...@googlegroups.com
2007/12/11, Ben Walton <benw...@gmail.com>:

>
> I was just inspecting a couple of queries my application was
> outputting. I noticed that a COUNT(*) query was made, followed by a
> SELECT later on, as expected, which used the result from COUNT(*) to
> LIMIT the queryset to the same results.
>
> The COUNT(*) in this case returned 0, and the SELECT which followed it
> ran the full query but with LIMIT 0 on the end.

Speaking of which, since we want a number of all matching rows,
wouldn't COUNT(1) be more optimal?

--
Patryk Zawadzki
PLD Linux Distribution

Malcolm Tredinnick

unread,
Dec 11, 2007, 6:01:08 AM12/11/07
to django-d...@googlegroups.com

Very much depends on the database backends. SOme are optimised for
count(*), others do count(<something>) a bit faster.

Malcolm

--
The only substitute for good manners is fast reflexes.
http://www.pointy-stick.com/blog/

Patryk Zawadzki

unread,
Dec 11, 2007, 6:16:34 AM12/11/07
to django-d...@googlegroups.com
2007/12/11, Malcolm Tredinnick <mal...@pointy-stick.com>:

> On Tue, 2007-12-11 at 11:28 +0100, Patryk Zawadzki wrote:
> > 2007/12/11, Ben Walton <benw...@gmail.com>:
> > >
> > > I was just inspecting a couple of queries my application was
> > > outputting. I noticed that a COUNT(*) query was made, followed by a
> > > SELECT later on, as expected, which used the result from COUNT(*) to
> > > LIMIT the queryset to the same results.
> > >
> > > The COUNT(*) in this case returned 0, and the SELECT which followed it
> > > ran the full query but with LIMIT 0 on the end.
> >
> > Speaking of which, since we want a number of all matching rows,
> > wouldn't COUNT(1) be more optimal?
>
> Very much depends on the database backends. SOme are optimised for
> count(*), others do count(<something>) a bit faster.

Is there any documentation coverage? I'd like to know if switching to
COUNT(1) (which is considered a good practice and taught in DBA
classes) would cause any penalty and if not file a bug to do this at
least as a good example.

Semantics of COUNT require counting all non-NULL values and some more
naive databases might actually try to fetch all the columns while 1
comes from DUAL and requires no actual data fetching.

Ian Kelly

unread,
Dec 11, 2007, 10:43:26 AM12/11/07
to django-d...@googlegroups.com

This surprises me if it's true. count(*) just counts all result rows
without checking anything for null, whereas count(1) must count all
rows for which the expression 1 is not null. count(*) is therefore
simpler and intuitively easier to optimize.

Ian

Patryk Zawadzki

unread,
Dec 11, 2007, 12:20:12 PM12/11/07
to django-d...@googlegroups.com
2007/12/11, Ian Kelly <ian.g...@gmail.com>:

Actually what you describe is COUNT(1), COUNT(*) has been subject to
such abuse that most database engines actually substitute it with
either COUNT(0) or COUNT(1) in early parsing phases. (This is true for
Postrgres, MySQL, MS SQL and Oracle at least.)

A true COUNT(*) would select each row's data for all columns and then
compare the set with NULL.

I believe COUNT(1) was the correct way to do it even before I was born ;)

flo...@gmail.com

unread,
Dec 11, 2007, 12:26:36 PM12/11/07
to Django developers
I'm not sure if you're correct that Postgres substitutes it, as I've
noticed a definite speedup doing count(1) versus count(*) on large
datasets.

-Eric Florenzano

On Dec 11, 11:20 am, "Patryk Zawadzki" <pat...@gmail.com> wrote:
> 2007/12/11, Ian Kelly <ian.g.ke...@gmail.com>:

Ian Kelly

unread,
Dec 11, 2007, 4:32:12 PM12/11/07
to django-d...@googlegroups.com
On Dec 11, 2007 10:20 AM, Patryk Zawadzki <pat...@gmail.com> wrote:
>
> 2007/12/11, Ian Kelly <ian.g...@gmail.com>:
> > On Dec 11, 2007 4:16 AM, Patryk Zawadzki <pat...@gmail.com> wrote:
> > > Semantics of COUNT require counting all non-NULL values and some more
> > > naive databases might actually try to fetch all the columns while 1
> > > comes from DUAL and requires no actual data fetching.
> > This surprises me if it's true. count(*) just counts all result rows
> > without checking anything for null, whereas count(1) must count all
> > rows for which the expression 1 is not null. count(*) is therefore
> > simpler and intuitively easier to optimize.
>
> Actually what you describe is COUNT(1), COUNT(*) has been subject to
> such abuse that most database engines actually substitute it with
> either COUNT(0) or COUNT(1) in early parsing phases. (This is true for
> Postrgres, MySQL, MS SQL and Oracle at least.)

I wouldn't know about the others, but at least for Oracle, this is
exactly the opposite of what I've previously heard, which is that
count(1) is optimized to be equivalent to count(*) [0]

> A true COUNT(*) would select each row's data for all columns and then
> compare the set with NULL.

Not according to any sql documentation that I've ever read. I've
checked all 4 supported engines and the sql-92 standard, and all say
that count(*) is just the cardinality of the full result set.

Ian

[0] http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:40208915257337

Reply all
Reply to author
Forward
0 new messages