On Monday 03 June 2013, Patryk Zawadzki wrote:
> Here's the ticket:
>
>
https://code.djangoproject.com/ticket/6785
>
> tl;dr: Calling .get() on a badly filtered queryset can result in lots
> of database rows being fetched and wrapped in Python objects for no
> gain.
>
tl;dr: There's a general, valuable optimization to be made here, but it should
be implemented at a lower level.
I have raised a related issue about a year and a half ago[0], thinking
(mistakenly) that it was mostly an Oracle issue. Ian Kelly had pointed me in
the right direction, but then life happened... The problem is that for all
single-row queries except aggregates, Django uses the same strategy as used in
get (and which you kept using in your patch): limit the query (usually)
properly, then fetch until no more records are retrieved (fetching in done in
chunks using fetchmany(), and not using fetchall()).
For single-row queries, this means two fetches; an unnecessary network
roundtrip, for each such query, unless the backend or underlying driver take
care to prevent it. On Oracle, at least, nobody does.
What we should do instead, in my opinion, is stop fetching as soon as a chunk
arrives that isn't full. This alone should reduce the number of network
accesses significantly on most Django projects.
While we're doing that, we could also define a new mode of sql execution (see
SQLCompiler.results_iter(), in django.db.models.sql.compiler): Next to the
current SINGLE (which uses fetchone()) and MULTI (the chunked loop described
above) we could have SINGLE_VERIFY, which always fetches a single chunk of
size 2. Make get() use that, and #6785 is fixed, while improving performance --
rather than hurting it on correct usage, like the proposed patch[1] does.
Assuming nobody raises objections, I intend to propose a patch along these
lines sometime in the coming weeks (it's one of a few things on my to-do
list), but I wouldn't mind at all if someone beat me to it.
Hope this helps,
Shai.
[0]
https://groups.google.com/d/msg/django-developers/SQ0Pltt_f0M/3ccQn0aauiEJ
[1]
https://github.com/django/django/pull/1139