LIMIT 21 on primary key lookups

Alex French

Sep 14, 2014, 12:35:43 AM
to django...@googlegroups.com
I'm using Django 1.7 and I noticed something odd in my postgres query logs. Almost every query has a "LIMIT 21" clause, including queries of the type "Thing.objects.get(pk=#)", which could only ever return one row. This behavior seems odd to me, but so far I haven't seen it come up in a place where it would necessarily be harmful. Is this something I should just ignore? Is it a bug?

Josip Lazic

Sep 14, 2014, 4:26:20 AM
to django...@googlegroups.com
The LIMIT clause is included only when printing results, as explained here: http://www.mail-archive.com/django...@googlegroups.com/msg67486.html

Alex French

Sep 14, 2014, 9:12:36 AM
to django...@googlegroups.com
I realize that's how it's supposed to work, but this happens regardless of whether or not results are printed. If you use postgres with the tutorial project and look at the query log, you can see the limit clause applied to primary key lookups.

Alex French

Sep 14, 2014, 11:45:49 AM
to django...@googlegroups.com
Okay, after looking through the source code and documentation, it seems that Django applies the limit to primary key lookups as a cautionary measure, and also to support the MultipleObjectsReturned exception in case something goes wrong.
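
To illustrate the idea, here is a minimal sketch using only the standard library's sqlite3 module. It is not Django's actual code: the `get_one` helper, the `thing` table, the two exception classes, and the `MAX_GET_RESULTS` value of 20 are all assumptions made for the example. The point is just that asking for one row more than the cap is enough to tell "exactly one" apart from "too many" without fetching an unbounded result set.

```python
import sqlite3

MAX_GET_RESULTS = 20  # assumed cap for this sketch


class DoesNotExist(Exception):
    """No row matched (hypothetical stand-in for Model.DoesNotExist)."""


class MultipleObjectsReturned(Exception):
    """More than one row matched (hypothetical stand-in)."""


def get_one(conn, where_sql, params=()):
    """Return the single matching row of `thing`, or raise."""
    # Request at most MAX_GET_RESULTS + 1 rows: enough to detect duplicates
    # and report a count, but never the whole table.
    sql = (
        f"SELECT id, name FROM thing WHERE {where_sql} "
        f"LIMIT {MAX_GET_RESULTS + 1}"
    )
    rows = conn.execute(sql, params).fetchall()
    if len(rows) == 1:
        return rows[0]
    if not rows:
        raise DoesNotExist("thing matching query does not exist")
    if len(rows) <= MAX_GET_RESULTS:
        shown = str(len(rows))
    else:
        shown = f"more than {MAX_GET_RESULTS}"
    raise MultipleObjectsReturned(f"get_one() returned {shown} rows")


# Small in-memory table to try it on.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO thing (name) VALUES (?)", [("spam",)] * 30)
```

With this table, `get_one(conn, "id = ?", (1,))` returns a single row, while `get_one(conn, "name = ?", ("spam",))` raises MultipleObjectsReturned after fetching only 21 of the 30 matching rows.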

Ben Collier

Sep 14, 2014, 12:53:08 PM
to django...@googlegroups.com
So why 21, precisely?

James Bennett

Sep 14, 2014, 1:24:16 PM
to django...@googlegroups.com
On Sun, Sep 14, 2014 at 12:48 PM, Ben Collier <bmco...@gmail.com> wrote:
> So why 21, precisely?

I can't find the exact changeset in which it was introduced, but I do remember why :)

There was an incident involving a large Django installation at World Online, where we were attempting to isolate a bug in our code built on top of Django, and the real error was being hidden by the fact that every time we reproduced the bug, our development server would suddenly have a massive spike in memory use, essentially crashing things before we could see what was going on.

The cause of that was Django attempting to generate a debug page, which included the repr() of variables at various points in the call stack. One of those variables was a QuerySet which, at the point of the error, had not yet been filtered, and so had something like half a million results in it. Generating a string representation of that required instantiating every result in the QuerySet, which was the cause of the huge memory spike.

This was something that could bite any user of Django with large enough QuerySets, so the behavior was changed in upstream Django to only show 20 objects. Specifically, the behavior of repr() on a QuerySet does this:

1. Issue a query with LIMIT 21 to find out if there are more than 20 results.
2. If there are 20 or fewer results, simply display them all.
3. If there are more than 20 results, display only the first 20 plus a message indicating it's been truncated.
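
The three steps above can be sketched in plain Python. This is only an illustration of the technique, not the code Django ships: the `truncated_repr` name, the `REPR_OUTPUT_SIZE` constant, and the truncation message are assumptions for the example, and `itertools.islice` stands in for the `LIMIT 21` on the database side.

```python
from itertools import islice

REPR_OUTPUT_SIZE = 20  # assumed display cap, matching the 20 described above


def truncated_repr(results):
    """Build a repr from an iterable while consuming at most 21 items."""
    # Pull one more item than we intend to show; the extra item only
    # tells us whether anything was left out.
    data = list(islice(results, REPR_OUTPUT_SIZE + 1))
    if len(data) > REPR_OUTPUT_SIZE:
        # Replace the 21st item with a marker instead of displaying it.
        data[-1] = "...(remaining elements truncated)..."
    return repr(data)
```

Passing in a generator over half a million rows consumes only the first 21 of them, which is what keeps a debug page from instantiating the whole result set.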

This means that repr() on a QuerySet will never instantiate huge numbers of objects, which avoids potentially gigantic in-memory sets of objects when trying to debug something. If you ever *need* to see beyond the first 20 objects, of course, you can use the slicing operators to adjust the limit and offset.
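
In Django the slice itself is written directly on the QuerySet, for example `Thing.objects.all()[20:40]`, and the ORM turns it into LIMIT and OFFSET. As a rough sketch of that mapping, the helper below translates a Python slice into the corresponding SQL fragment. `limit_offset_clause` is a hypothetical function written for this example, not part of Django, and it only handles non-negative, step-free slices.

```python
def limit_offset_clause(s):
    """Translate a non-negative, step-free slice into LIMIT/OFFSET SQL."""
    if s.step not in (None, 1):
        raise ValueError("steps are not supported in this sketch")
    start = s.start or 0
    if start < 0 or (s.stop is not None and s.stop < 0):
        raise ValueError("negative indexing is not supported")
    parts = []
    if s.stop is not None:
        # The row count is the width of the slice, never negative.
        parts.append(f"LIMIT {max(s.stop - start, 0)}")
    if start:
        # A bare OFFSET is accepted by PostgreSQL; some other databases
        # require a LIMIT alongside it.
        parts.append(f"OFFSET {start}")
    return " ".join(parts)
```

For instance, `limit_offset_clause(slice(20, 40))` gives the fragment for the second page of 20 rows.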

Tim Chase

Sep 14, 2014, 1:39:27 PM
to django...@googlegroups.com
On 2014-09-14 13:23, James Bennett wrote:
> This was something that could bite any user of Django with large
> enough QuerySets, so the behavior was changed in upstream Django to
> only show 20 objects.
[snip]
> This means that repr() on a QuerySet will never instantiate huge
> numbers of objects,

A change from which I directly benefited, as my initial Django project
involved statements on corporate cell-phone bills, meaning that a
repr() of a bill might dump tens of thousands of line-items (and take
correspondingly long in a traceback where that repr() might appear
multiple times).

So thanks to those that implemented that change!

-tkc



Arnold Krille

Sep 15, 2014, 4:05:41 PM
to django...@googlegroups.com
On Sun, 14 Sep 2014 17:48:18 +0100 Ben Collier <bmco...@gmail.com>
wrote:
> So why 21, precisely?

Because it's half the answer?

Collin Anderson

Sep 17, 2014, 10:52:56 AM
to django...@googlegroups.com
We simply picked a number, not too large and not too small. The number 20 is now hardcoded as MAX_GET_RESULTS at the top of this file:
