[Django] #35518: Avoid regex search for simple route patterns

22 views
Skip to first unread message

Django

unread,
Jun 10, 2024, 12:26:13 PM6/10/24
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake | Owner: Jake Howard
Howard |
Type: | Status: assigned
Cleanup/optimization |
Component: Core | Version: 5.0
(URLs) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 1
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
The `RoutePattern` assumes all routes provided include some form of
converter, and so needs to change it into a regex for matching. However,
if no converters are included in the string, the additional overhead of
using a regex vs simpler string operations is unnecessary.

Replacing this with a simpler string comparison results in between a 50
and 75% reduction in match time, which stacks up quickly as an application
generally has numerous URLs. This can be done by modifying the
`RoutePattern` internally, with no external breakages.

**Before**

{{{#!python
In [2]: endpoint_pattern = RoutePattern("foo/", "name", is_endpoint=True)

In [3]: pattern = RoutePattern("foo/", "name", is_endpoint=False)

In [4]: %timeit pattern.match("foo/")
441 ns ± 2.68 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
each)

In [5]: %timeit endpoint_pattern.match("foo/")
435 ns ± 0.974 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops
each)
}}}

**After**

{{{#!python
In [4]: %timeit pattern.match("foo/")
187 ns ± 1.84 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops
each)

In [5]: %timeit endpoint_pattern.match("foo/")
103 ns ± 0.192 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops
each)
}}}

Theoretically, these improvements get better based on the length of the
route pattern (although at this scale, not notably).

This optimisation could be done by adding a different kind of pattern (eg
`LiteralPattern`), but the added complexity to a project probably isn't
necessary, not to mention the migration effort to take advantage of this.
--
Ticket URL: <https://code.djangoproject.com/ticket/35518>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jun 12, 2024, 11:36:36 AM6/12/24
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: closed
Component: Core (URLs) | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* cc: Adam Johnson (added)
* resolution: => needsinfo
* status: assigned => closed

Comment:

This has similarities to #35252 hence cc-ing Adam here. I also think your
benchmarks are testing against 5.0, is that correct? That won't have those
changes. Can you benchmark against main or 5.1a1?
Closing as "needsinfo" for now 👍
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:1>

Django

unread,
Jun 12, 2024, 11:44:53 AM6/12/24
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: closed
Component: Core (URLs) | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Jake Howard):

These benchmarks were run against `main`, with those changes. Adam's
optimisation specifically targets optimising converting a route into a
regex, which isn't something changed as part of this. This is just about
avoiding matching against said regexes (we do still need to run
`_route_to_regex` to determine whether the route has any converters, and
thus whether it's "simple").
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:2>

Django

unread,
Jun 12, 2024, 1:27:03 PM6/12/24
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: closed
Component: Core (URLs) | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Adam Johnson):

If I understand correctly, the idea is that `_route_to_regex()` may return
a “plain” string if no converters are found. Then in
`RoutePattern.match()`, any such “plain” strings will be matched with
`str.startswith` or `==`, depending on `_is_endpoint`, avoiding the
overhead of performing a regex search?

It seems like a legitimate optimization, but saving nanoseconds is maybe a
little small for the added complexity. I suppose it would multiply across
the number of routes.

Can you make a draft PR with your patch? Does it pass all existing tests?
We’d need to be extra careful here.
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:3>

Django

unread,
Jun 12, 2024, 3:39:45 PM6/12/24
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: closed
Component: Core (URLs) | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Jake Howard):

You're exactly right. It's a 'relatively' simple change, although does add
a little complexity.

~300ns doesn't sound like much I agree, but this saving is per route per
request, as opposed to just once per request, so for large projects with
lots of URLs it can stack up. I can source some benchmarks of `resolve`
from a sensibly-sized project if that would help with review?

[https://github.com/django/django/pull/18270 PR] for your perusal.
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:4>

Django

unread,
Jun 12, 2024, 4:53:10 PM6/12/24
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: closed
Component: Core (URLs) | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Adam Johnson):

Thanks for the PR. The change is a little less invasive than I expected.

A whole-project benchmark would be good for posterity. There are also some
benchmarks in djangobench, called
[https://github.com/django/djangobench/tree/master/djangobench/benchmarks
url_resolve*], which we can check.
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:5>

Django

unread,
Jun 13, 2024, 3:31:41 AM6/13/24
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: closed
Component: Core (URLs) | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Jake Howard):

It looks like `djangobench` only has benchmarks for `re_path` and not
`path` (this change only affects `path`). With some guidance, I'm happy to
add some benchmarks in (somehow ignoring Django versions which didn't have
`path`), but I'll try and get some project benchmarks anyway.
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:6>

Django

unread,
Jun 13, 2024, 3:57:55 AM6/13/24
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: closed
Component: Core (URLs) | Version: 5.0
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Jake Howard):

I've run some benchmarks against https://github.com/wagtail/bakerydemo,
comparing `main` to my branch, intentionally choosing a path without a
match to get a worst-case benchmark:

**Before**:

{{{#!python
In [2]: %timeit -n 10000 resolve("/foo/")
44.8 µs ± 18.5 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
}}}


**After**:

{{{#!python
In [3]: %timeit -n 10000 resolve("/foo/")
39.7 µs ± 4.24 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
}}}

That's much less noticeable than I was expecting, and realistically within
the margin of error. `bakerydemo` isn't the largest project, and I suspect
that because the URL pattern tree is hierarchical, it'll only make a
notable difference when a given node has a large number of children,
rather than being useful in blanket. That's still likely to cover a non-
zero number of projects.
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:7>

Django

unread,
May 2, 2025, 4:04:35 AMMay 2
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: new
Component: Core (URLs) | Version: 5.2
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jake Howard):

* resolution: needsinfo =>
* status: closed => new
* version: 5.0 => 5.2

Comment:

With some (slightly) less real-world examples, it's much clearer to see
the benefit (running against `0f5dd0dff3049189a3fe71a62670b746543335d5`:

{{{#!python
import django
from django.conf import settings
from django.urls import resolve, path
from django.http import HttpResponse

settings.configure(
ROOT_URLCONF=__name__
)

def main_view(request):
return HttpResponse()

urlpatterns = [
path(f"foo{i}", main_view)
for i in range(10_000)
]

django.setup()
}}}

Before:

{{{#!python
In [2]: %timeit resolve("/foo9999")
5.61 ms ± 103 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
}}}


After:

{{{#!python
In [1]: %timeit resolve("/foo9999")
3.08 ms ± 52.3 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
}}}

A 45% improvement, with reduced variance, definitely feels worthwhile to
me. It's _similar_ to #35252, but that optimises the `RoutePattern`
definition (startup cost), whereas this affects runtime.
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:8>

Django

unread,
May 7, 2025, 6:19:18 AMMay 7
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: new
Component: Core (URLs) | Version: 5.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* stage: Unreviewed => Accepted

--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:9>

Django

unread,
May 13, 2025, 5:03:53 AMMay 13
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: new
Component: Core (URLs) | Version: 5.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:10>

Django

unread,
May 13, 2025, 7:06:09 AMMay 13
to django-...@googlegroups.com
#35518: Avoid regex search for simple route patterns
-------------------------------------+-------------------------------------
Reporter: Jake Howard | Owner: Jake
Type: | Howard
Cleanup/optimization | Status: closed
Component: Core (URLs) | Version: 5.2
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce <42296566+sarahboyce@…>):

* resolution: => fixed
* status: new => closed

Comment:

In [changeset:"f920937c8a63df6bea220e4386f59cdb45b2e355" f920937c]:
{{{#!CommitTicketReference repository=""
revision="f920937c8a63df6bea220e4386f59cdb45b2e355"
Fixed #35518 -- Optimized RoutePattern by using string operations for
converter-less routes.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35518#comment:11>
Reply all
Reply to author
Forward
0 new messages