Should Django use Ada?

Michael Lissner

Apr 1, 2024, 4:36:55 PM
to Django developers (Contributions to Django itself)
Hi all,

A few years ago, I reported a vulnerability in Django because Python wasn't parsing URLs containing tabs or newlines correctly. In this ticket, it was fixed in Python:


But Python, being maintained mostly by volunteers, did the minimum needed work to fix the vulnerability without really fixing the urlparse library properly.

This means that it's probably still possible to send a URL to Django that urlparse doesn't know how to handle. When this happens:

1. It could still be a vulnerability.* If this is the case, Django could redirect people to domains where it shouldn't.

2. It could fail to parse the URL properly, leading to the wrong URL being provided to the user.

3. urlparse could decide it's an invalid URL even though it's not.
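
To make the first failure mode concrete, here is a minimal sketch of a conservative redirect check built on urlsplit. The helper name is_safe_redirect, the allowed-host set, and the choice to reject tabs, newlines, and backslashes outright are assumptions for illustration; this is not Django's implementation and it is not exhaustive.

```python
from urllib.parse import urlsplit

# Characters that browsers strip (tab, CR, LF) or reinterpret (backslash)
# but that a parser may treat literally. Assumption: rejecting them
# outright is acceptable for redirect targets.
SUSPICIOUS_CHARS = ("\t", "\r", "\n", "\\")


def is_safe_redirect(url, allowed_hosts):
    """Hypothetical helper: True only if url clearly stays on an allowed host."""
    if not url or any(ch in url for ch in SUSPICIOUS_CHARS):
        return False
    # Scheme-relative URLs ("//host/path") leave the current site.
    if url.startswith("//"):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        # The parser itself rejected the input.
        return False
    if not parts.scheme and not parts.netloc:
        # Relative reference: accept only absolute paths on this site.
        return url.startswith("/")
    return parts.scheme in ("http", "https") and parts.hostname in allowed_hosts
```

The point of the sketch is that the check refuses anything where the parser and a browser might plausibly disagree, rather than trusting the parsed hostname alone.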

This is all pretty bad, but there is some hope in the form of a tool called Ada, which aims to actually support URL parsing properly:

GitHub (more useful, really): https://github.com/ada-url/ada

It's written in C++ and is used in Node and Cloudflare Workers. It has bindings for Python, Rust, R, and Go. It's licensed under MIT and Apache License 2.0. It's fuzzed by Google's OSS-Fuzz, and it's much faster than urlparse.

I'm curious: Would Django consider switching to this library? I'm not sure if I'll have time to do the work, but I can at least open an issue if it's a useful switch to make, and I might be able to assign a developer to it if this is something we want.

Love to hear thoughts,

Mike


* I'm posting this publicly because this kind of vulnerability is really well known these days, and exists across most general-purpose languages. URLs are just very difficult to parse properly.

Dylan Reinhold

Apr 1, 2024, 5:02:14 PM
to django-d...@googlegroups.com
I always wonder why people feel the need to belittle others' work with statements like " But Python, being maintained mostly by volunteers, did the minimum needed work to fix the vulnerability without really fixing the urlparse library properly."
But then add something about their time being too valuable to work on making it better. 


Dylan


--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/f31bc17b-c0c5-4ce4-9999-7d1ec3dfe90bn%40googlegroups.com.

Jörg Breitbart

Apr 1, 2024, 6:46:09 PM
to django-d...@googlegroups.com
You write:

"It could still be a vulnerability ... / It could fail to parse ... /
could decide it's invalid - This is all pretty bad..."

I agree - this indeed would be really bad, if it can be used in
malicious ways. But note that the fact that Django or an upstream lib
decided to slightly deviate from the latest URL parsing spec incarnation
does not make it vulnerable per se. URL specs (or URI specs in general)
used to contradict each other across various RFCs, so there is some room
for interpretation about which rules to follow in an implementation. Also,
Django has to maintain backwards compat to some degree, and introducing
a foreign C++ lib binding in its default installation is a very bold move.

Anything in this direction needs proper justification and not just
hand-waving arguments (FUD?), unless there actually is a real
vulnerability with the current impl.

Cheers,
Jörg

Adrián Salatino

Apr 2, 2024, 1:20:13 PM
to django-d...@googlegroups.com
You should probably be addressing urllib devs with this inquiry (e.g. such vuln is then probably in many other web frameworks). Anyhow, just out of curiosity, wouldn't it be possible to use functools.partial function to replace urllib.parse.urlparse with ada-python in settings.py? Or make some kind of django extension that integrates this other dependency?

Adam Johnson

Apr 2, 2024, 3:24:05 PM
to django-d...@googlegroups.com
I agree with Jörg. We need evidence of problems before we decide to act, and that those problems aren’t being addressed in Python. Forcing a new dependency on all users is not something we’d do lightly.

On the contradictory standards, see the cURL maintainer’s post: https://daniel.haxx.se/blog/2022/10/18/deviating-from-specs/ .

> Anyhow, just out of curiosity, wouldn't it be possible to use functools.partial function to replace urllib.parse.urlparse with ada-python in settings.py? Or make some kind of django extension that integrates this other dependency?

It should always be possible to create and use custom classes that use ada, such as an alternative URL field.
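
As a rough sketch of what such a custom class could look like, the validator below takes the hostname parser as a plain callable, so a project could pass in a wrapper around the Ada bindings without Django itself depending on them. The names HostAllowlistValidator, stdlib_hostname, and hostname_parser are hypothetical, and the class is a plain callable rather than a Django field or validator; wiring it into a form or model field is left out.

```python
from typing import Callable, Iterable, Optional
from urllib.parse import urlsplit

# A parser backend is any callable mapping a URL string to a hostname or None.
HostParser = Callable[[str], Optional[str]]


def stdlib_hostname(url: str) -> Optional[str]:
    """Default backend: the hostname as urllib.parse sees it."""
    try:
        return urlsplit(url).hostname
    except ValueError:
        return None


class HostAllowlistValidator:
    """Hypothetical helper (not part of Django) with a swappable parser."""

    def __init__(
        self,
        allowed_hosts: Iterable[str],
        hostname_parser: HostParser = stdlib_hostname,
    ) -> None:
        self.allowed_hosts = {host.lower() for host in allowed_hosts}
        # A project using Ada would pass its own wrapper here instead.
        self.hostname_parser = hostname_parser

    def __call__(self, url: str) -> bool:
        host = self.hostname_parser(url)
        return host is not None and host.lower() in self.allowed_hosts
```

Keeping the parser behind a one-function interface means the choice between urllib.parse and a third-party parser stays a per-project decision.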

Michael Lissner

Apr 2, 2024, 6:28:10 PM
to Django developers (Contributions to Django itself)
Thanks for the replies everybody. A few thoughts....

From Adrián:

> You should probably be addressing urllib devs with this inquiry (e.g. such vuln is then probably in many other web frameworks)

I did that in 2021 when I found the issue with newlines in URLs. Python devs had the resources to fix the newlines but not to make urlparse spec/browser compliant. They had concerns about backwards compatibility (fair enough), and it seemed like they'd have to largely rewrite the library to do things properly b/c URLs are so nasty these days. I think they also felt like they couldn't keep up with the standard and that the language shouldn't try. Instead, they argued the usual thing: folks should use third party libraries that can be better and that can change more quickly (fair enough). 

Until Ada, I hadn't heard of better solutions, so I let it lie.

From Jörg:

> the fact that django or an upstream lib decided to slightly deviate from the latest URL parsing spec incarnation does not make it vulnerable per se

From Adam:

> On the contradictory standards, see the cURL maintainer’s post

I agree, Jörg, and thanks for the cURL reference, Adam. Specs aren't my target so much as "what browsers do," and wherever our URL parser diverges from what browsers do, we risk a redirect vulnerability. It's been a few years since I worked on this issue, but IIRC, this particular spec is actually well aligned with what browsers do, so they're essentially one and the same.

I understand the pushback that we need proof of an issue here before we'd move forward with anything. WHATWG provides a test suite of nasty URLs. I guess what I should do is run that through urlparse and look for places where it fails. If, for example, it's possible to send a valid URL to urlparse and have it get the wrong (sub)domain name, we would consider that a vulnerability or at least an issue, right?
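
A minimal sketch of that comparison, assuming the test data has the shape used by the web-platform-tests URL suite: a JSON array that mixes comment strings with objects carrying "input", "base", and either the expected "hostname" or "failure": true. The function name and the three sample entries are made up for illustration, and the sketch skips entries that need a base URL or are expected to fail.

```python
from urllib.parse import urlsplit


def find_hostname_mismatches(cases):
    """Return (input, expected, actual) where urlsplit's hostname differs."""
    mismatches = []
    for case in cases:
        # The suite interleaves comment strings with test objects.
        if not isinstance(case, dict):
            continue
        # Simplification: only absolute inputs that are expected to parse.
        if case.get("failure") or case.get("base") is not None:
            continue
        expected = case.get("hostname", "")
        try:
            actual = urlsplit(case["input"]).hostname or ""
        except ValueError as exc:
            actual = f"<ValueError: {exc}>"
        if actual != expected:
            mismatches.append((case["input"], expected, actual))
    return mismatches


# Illustrative entries in the assumed format, not copied from the real suite.
SAMPLE_CASES = [
    "A comment string, as the real file contains",
    {"input": "http://example.org/path", "base": None, "hostname": "example.org"},
    # Browsers treat backslashes as slashes for http(s); urlsplit does not.
    {"input": "http:\\\\example.org\\path", "base": None, "hostname": "example.org"},
    {"input": "http://[", "base": None, "failure": True},
]

for url, expected, actual in find_hostname_mismatches(SAMPLE_CASES):
    print(f"{url!r}: expected host {expected!r}, urlsplit gave {actual!r}")
```

To run it against the real data, load a local copy of the suite's JSON file with json.load and pass the resulting list to find_hostname_mismatches. Any row where the expected host is a real domain and urlsplit reports a different one (or none) is a candidate for closer inspection.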

-----

That's it for my substantive comments, but I want to reply to this too:

> But then add something about their time being too valuable to work on making it better.

I don't know C, so I can't help much with the Python language, but I am here, where I have more expertise. You'll note that I offered to assign a paid developer to adding Ada to Django if we wanted to. That's me being busy with other priorities, but offering resources from my org. If that's not good enough, I don't know what is. 

Also, I'm not denigrating Python by saying it's maintained by volunteers — in my experience, it is.  The fact that Python doesn't have tons of resources is one of the reasons it was difficult to get this vulnerability fixed in the language itself. They did a minimal fix because a bigger one wasn't possible given the resources of those that work on the language (and concerns about backwards compat). 

Finally, you might also consider that I spent a lot of my time working on the vulnerability above, and that I have contributed to other open source projects practically every day for the last decade. Point being: If you want to drive me away from contributing here, you're on your way, but I'm here trying to help, and I've got a record of doing so here and elsewhere in various ways.

Sorry to others here for the off topic response. Though I probably should, I can't let that kind of comment go in the Django discussion group.

Thank you all,

Mike