[Django] #35467: Prefer urlsplit() over urlparse()

Django

May 20, 2024, 8:47:59 AM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
------------------------------------------------+------------------------
Reporter: Adam Johnson | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Utilities | Version: dev
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
------------------------------------------------+------------------------
Many places in Django use
[`urlparse()`](https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlparse),
which supports the rarely-used “path parameter” syntax (not to be confused
with query parameters). The `urlsplit()` function is similar but does not
parse such path parameters, which makes it a bit faster.
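
A minimal sketch of the difference, using only the standard library. The URL is made up for illustration; the `;type=a` segment is the "path parameter" that `urlparse()` splits out and `urlsplit()` leaves inside the path.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL with a path parameter (";type=a") on the last path segment.
url = "https://example.com/path;type=a?q=1#frag"

parsed = urlparse(url)  # 6 fields: scheme, netloc, path, params, query, fragment
split = urlsplit(url)   # 5 fields: scheme, netloc, path, query, fragment

print(parsed.path, parsed.params)  # path and params are reported separately
print(split.path)                  # the ";type=a" part stays in the path
print(len(parsed), len(split))
```

For URLs without a `;` in the last path segment, the two results should agree on every field they share.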

I think most or all calls to `urlparse()` can be replaced with
`urlsplit()`. This may make a small but measurable performance difference
in common paths, such as in `CsrfViewMiddleware` or the test `Client`.

See more in this Anthony Sottile video:
https://www.youtube.com/watch?v=ABJvdsIANds
--
Ticket URL: <https://code.djangoproject.com/ticket/35467>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

May 20, 2024, 8:49:12 AM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Utilities | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by Adam Johnson:

Old description:

> Many places in Django use
> [`urlparse()`](https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlparse),
> which supports the rarely-used “path parameter” syntax (not to be
> confused with query parameters). The `urlsplit()` function is similar but
> does not parse such path parameters, which makes it a bit faster.
>
> I think most or all calls to `urlparse()` can be replaced with
> `urlsplit()`. This make make a small but measurable performance
> difference in common paths, such as in `CsrfViewMiddleware` or the test
> `Client`.
>
> See more in this Anthony Sottile video:
> https://www.youtube.com/watch?v=ABJvdsIANds

New description:

Many places in Django use
[`urlparse()`](https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlparse),
which supports the rarely-used “path parameter” syntax (not to be confused
with query parameters). The `urlsplit()` function is similar but does not
parse such path parameters, which makes it a bit faster.

I think most or all calls to `urlparse()` can be replaced with
`urlsplit()`, and similarly `urlunparse()` with `urlunsplit()`. This may
make a small but measurable performance difference in common paths, such
as in `CsrfViewMiddleware` or the test `Client`.
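
As a rough illustration of what such a replacement could look like (this is a sketch, not code taken from Django; the URL and the query rewrite are invented), note that the tuple shapes differ, so any code indexing by position needs adjusting:

```python
from urllib.parse import urlparse, urlsplit, urlunparse, urlunsplit

# Hypothetical redirect URL, used only for illustration.
url = "https://example.com/accounts/login/?next=/profile/"

# Before: urlparse()/urlunparse() work with a 6-tuple (query is index 4).
parts = urlparse(url)
before = urlunparse(parts._replace(query="next=/dashboard/"))

# After: urlsplit()/urlunsplit() work with a 5-tuple (query is index 3).
parts = urlsplit(url)
after = urlunsplit(parts._replace(query="next=/dashboard/"))

print(before)
print(after)
```

Accessing fields by name (`parts.query`) rather than by index keeps the swap mechanical, since the named fields other than `params` are the same in both results.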

See more in this Anthony Sottile video:
https://www.youtube.com/watch?v=ABJvdsIANds , where he reports a 3% import
time improvement on the Stripe project.

--
Ticket URL: <https://code.djangoproject.com/ticket/35467#comment:1>

Django

May 21, 2024, 5:35:41 AM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
--------------------------------------+------------------------------------
Reporter: Adam Johnson | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Utilities | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by Sarah Boyce):

* stage: Unreviewed => Accepted

Comment:

Accepting for someone to make updates and confirm with benchmarks. 👍
--
Ticket URL: <https://code.djangoproject.com/ticket/35467#comment:2>

Django

May 21, 2024, 6:00:05 PM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Jake Howard
Type: Cleanup/optimization | Status: assigned
Component: Utilities | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jake Howard):

* owner: nobody => Jake Howard
* status: new => assigned

Comment:

From some basic testing, it looks like this should have a nice
improvement:

{{{#!python
In [1]: import urllib.parse

In [2]: %timeit urllib.parse.urlparse("https://example.com")
1.52 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [3]: %timeit urllib.parse.urlsplit("https://example.com")
258 ns ± 1.14 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
}}}

The difference stays about constant, even with longer URLs. The specifics
will obviously vary by hardware and Python version (the above is 3.12.3),
but ~6x improvement is definitely worthwhile.
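
For anyone wanting to reproduce this outside IPython, a rough equivalent with the standard `timeit` module might look like the sketch below. The loop count, the `bench` helper, and the best-of-5 approach are arbitrary choices made here, not part of the measurements above, and absolute numbers will differ. Repeated calls with the same URL may also be served from an internal cache on some Python versions, so treat the ratio as indicative only.

```python
import timeit
from urllib.parse import urlparse, urlsplit

URL = "https://example.com"
N = 20_000  # calls per timing run (arbitrary)


def bench(func):
    """Return the best per-call time in seconds over 5 runs of N calls."""
    runs = timeit.repeat(lambda: func(URL), number=N, repeat=5)
    return min(runs) / N


parse_time = bench(urlparse)
split_time = bench(urlsplit)

print(f"urlparse: {parse_time * 1e9:.0f} ns per call")
print(f"urlsplit: {split_time * 1e9:.0f} ns per call")
print(f"ratio:    {parse_time / split_time:.1f}x")
```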
--
Ticket URL: <https://code.djangoproject.com/ticket/35467#comment:3>

Django

May 22, 2024, 4:53:33 AM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Jake Howard
Type: Cleanup/optimization | Status: assigned
Component: Utilities | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jake Howard):

* has_patch: 0 => 1

Comment:

[https://github.com/django/django/pull/18187 PR].
--
Ticket URL: <https://code.djangoproject.com/ticket/35467#comment:4>

Django

May 27, 2024, 1:01:00 PM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Jake Howard
Type: Cleanup/optimization | Status: assigned
Component: Utilities | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* needs_better_patch: 0 => 1
* needs_tests: 0 => 1

--
Ticket URL: <https://code.djangoproject.com/ticket/35467#comment:5>

Django

May 29, 2024, 5:49:39 AM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Jake Howard
Type: Cleanup/optimization | Status: assigned
Component: Utilities | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jake Howard):

* needs_better_patch: 1 => 0
* needs_tests: 1 => 0

Comment:

Comments addressed / replied-to in PR.
--
Ticket URL: <https://code.djangoproject.com/ticket/35467#comment:6>

Django

May 29, 2024, 4:35:43 PM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Jake Howard
Type: Cleanup/optimization | Status: closed
Component: Utilities | Version: dev
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* resolution: => fixed
* status: assigned => closed

Comment:

Fixed by ff308a06047cd60806d604a7cf612e5656ee2ac9
--
Ticket URL: <https://code.djangoproject.com/ticket/35467#comment:7>

Django

May 29, 2024, 5:23:18 PM
to django-...@googlegroups.com
#35467: Prefer urlsplit() over urlparse()
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Jake Howard
Type: Cleanup/optimization | Status: closed
Component: Utilities | Version: dev
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/35467#comment:8>