#35252: Optimize django.urls.resolvers._route_to_regex()
-------------------------------------+-------------------------------------
Reporter: Adam | Owner: Adam Johnson
Johnson |
Type: | Status: assigned
Cleanup/optimization |
Component: Core | Version: dev
(URLs) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 1
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
`_route_to_regex()` converts Django’s parameterized route syntax into a
regular expression. Whilst working on #35250, I noticed several
opportunities for optimizing this function:
1. It has O(n^2) runtime from slicing the route string per parameter,
repeatedly copying the remainder. A single search for all parameters would
have O(n) runtime instead.
2. Within my 950 URL project, there are many repeat calls with the same
route, such as the empty string or ModelAdmin URL suffixes like
'<path:object_id>/history/'. I think this would be typical of many
projects, so it makes sense to add caching.
3. `match.start()` and `match.end()` are called separately, when
`match.span()` gives both values in one function call.
4. `get_converter()` is unnecessary, the function can fetch all converters
once with `get_converters()` and use the dictionary directly.
4. Casting each parameter string to a set for the whitespace check is a
bit costly, it’s faster to use a whitespace set scanning the string.
5. An f-string can be used for concatenation, which is a little bit
faster.
Applying these optimizations makes the function significantly faster,
especially for more parameters. Below are some benchmarks.
Before optimization stats (Python 3.12, macOS, M1 mac, Django main
branch):
* Converting a seven parameter route:
{{{
In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True)
12.3 µs ± 68 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops
each)
}}}
* Profiling a project’s system checks: 950 calls take 9ms, ~1.5% of the
total runtime.
After optimization:
* Converting that seven parameter route is ~2x faster (with caching
removed):
{{{
In [2]: %timeit _route_to_regex("<a>/<b>/<c>/<d>/<e>/<f>/<g>", True)
5.49 µs ± 18.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops
each)
}}}
* Adding caching makes repeat calls ~100x faster, from 5µs to 50ns.
* The profile shows those 950 calls now take 5ms, ~50% faster, ~0.8% of
total runtime.
--
Ticket URL: <
https://code.djangoproject.com/ticket/35252>
Django <
https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.