This is coming off the back of the report in #33015, which is a nice
pathological case we can use:
{{{
In [1]: from django.db.models import Q
In [2]: from testapp.models import ExampleModel
In [3]: items_to_fetch = [(i, j) for i in range(0,100) for j in
range(0,100)]
In [4]: query = Q()
...: for v1, v2 in items_to_fetch:
...: query |= Q(val1=v1, val2=v2)
%timeit x = ExampleModel.objects.filter(query)
644 ms ± 6.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
}}}
If we cProfile that `ExampleModel.objects.filter(query)` we get something
like:
{{{
ncalls tottime percall cumtime percall filename:lineno(function)
30001/1 0.178 0.000 1.268 1.268 query.py:1221(build_filter)
40000 0.142 0.000 0.157 0.000 query.py:1457(names_to_path)
10002/1 0.077 0.000 1.268 1.268 query.py:1386(_add_q)
20000 0.046 0.000 0.379 0.000 query.py:1156(build_lookup)
}}}
and if we do `%lprun -f Query.build_filter -f Query.build_lookup
ExampleModel.objects.filter(query)` on it we'll get (snipped for brevity):
{{{
Function: build_filter at line 1221
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1329 20000 651989.0 32.6 24.9 condition =
self.build_lookup(lookups, col, value)
Function: build_lookup at line 1156
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1195 20000 200730.0 10.0 38.1 if
(connections[DEFAULT_DB_ALIAS].features.interprets_empty_strings_as_nulls
and
1196
lookup_name == 'exact' and lookup.rhs == ''):
1197 return
lhs.get_lookup('isnull')(lhs, True
}}}
Because I'm using sqlite, that 38% of time is essentially free to reclaim
by switching the if condition as above, in which case we get cProfile
output like so:
{{{
ncalls tottime percall cumtime percall filename:lineno(function)
30001/1 0.171 0.000 1.082 1.082 query.py:1220(build_filter)
40000 0.145 0.000 0.159 0.000 query.py:1456(names_to_path)
10002/1 0.076 0.000 1.082 1.082 query.py:1385(_add_q)
20000 0.042 0.000 0.177 0.000 query.py:1559(setup_joins)
220005 0.039 0.000 0.064 0.000 {built-in method
builtins.isinstance}
20000 0.037 0.000 0.223 0.000 query.py:1156(build_lookup)
}}}
and line profile information like so:
{{{
Function: build_filter at line 1220
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1328 20000 447682.0 22.4 18.9 condition =
self.build_lookup(lookups, col, value)
Function: build_lookup at line 1156
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1195 20000 12744.0 0.6 3.9 if (lookup_name
== 'exact' and lookup.rhs == '' and
1196
connections[DEFAULT_DB_ALIAS].features.interprets_empty_strings_as_nulls):
1197 return
lhs.get_lookup('isnull')(lhs, True
}}}
which ultimately changes the timed performance for non oracle backends to:
{{{
In [5]: %timeit x = ExampleModel.objects.filter(query)
547 ms ± 3.59 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
}}}
(YMMV; it's being tested on a laptop and I've seen it ranging from 519ms
to 600ms average.)
Patch will be forthcoming as a PR shortly, to confirm it doesn't break CI.
--
Ticket URL: <https://code.djangoproject.com/ticket/33025>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: 0 => 1
Comment:
PR is https://github.com/django/django/pull/14769
So far, a linter is telling me off ... because of course it is. I hate it.
And 3 CI jobs died ''immediately'' (shockingly quickly, never even saw
them go to pending) having failed to do a git checkout?
I'm going to hope that at least some of the others pass...
--
Ticket URL: <https://code.djangoproject.com/ticket/33025#comment:1>
* stage: Unreviewed => Accepted
Comment:
Yes asgiref's `Local` is "slow" see #32312, although I'm not sure what's
possible to improve its performance.
--
Ticket URL: <https://code.djangoproject.com/ticket/33025#comment:2>
* needs_better_patch: 1 => 0
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/33025#comment:3>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"bf5abf1bdcedb15e949db419c61eeec7c88414ea" bf5abf1]:
{{{
#!CommitTicketReference repository=""
revision="bf5abf1bdcedb15e949db419c61eeec7c88414ea"
Fixed #33025 -- Avoided accessing the database connections in
Query.build_lookup() when not necessary.
Of the built-in backends, only Oracle treats empty strings and nulls as
equal, so avoid testing the default connection backend for
interprets_empty_strings_as_nulls if it can be established from the
lookup that it wouldn't affect the lookup instance returned. This
improves performance a small amount for most lookups being built,
because accessing the connections requires touching the thread critical
`Local` which is an expensive operation.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33025#comment:4>