[Django] #35607: Improve Storage base backend API Flexibility

6 views
Skip to first unread message

Django

unread,
Jul 16, 2024, 10:39:56 AM7/16/24
to django-...@googlegroups.com
#35607: Improve Storage base backend API Flexibility
-------------------------------------+-------------------------------------
Reporter: Natalia Bidart | Type:
| Cleanup/optimization
Status: new | Component: Core
| (Other)
Version: | Severity: Normal
Keywords: storages | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Currently, Django's Storage base backend API provides limited flexibility
with regards to customization, particularly concerning filename
validation. This rigidity has historically led to challenges and security
concerns. See for example [https://nvd.nist.gov/vuln/detail/CVE-2024-39330
CVE-2024-39330], [https://nvd.nist.gov/vuln/detail/CVE-2021-45452
CVE-2021-45452], and [https://nvd.nist.gov/vuln/detail/CVE-2021-31542
CVE-2021-31542].

To address this, I'm proposing revisiting and enhancing the public API of
storage backends to support customizable validation methods. This was also
discussed internally in the Django Security mailing list. This public
validation method would provide a default implementation very similar to
the current validations, and should be used to replace the ad-hoc
validations being done in save, generate_filename, and get_available_name.
--
Ticket URL: <https://code.djangoproject.com/ticket/35607>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jul 18, 2024, 12:10:00 PM7/18/24
to django-...@googlegroups.com
#35607: Improve Storage base backend API Flexibility
-------------------------------------+-------------------------------------
Reporter: Natalia Bidart | Owner: (none)
Type: | Status: new
Cleanup/optimization |
Component: Core (Other) | Version:
Severity: Normal | Resolution:
Keywords: storages | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Natalia Bidart):

To add some extra context and details, below some extracts of
conversations held while discussing the security reports around `Storage`
API:

From Shai:
> [...] I think the validation itself should be overridable, but I feel
weary about adding an API in a security release; can we add it as
"internal" in the security release, and document it (making it public)
later?

From Josh Schneier:

> Untangling the various ways that validation was peppered through the
code base in a couple of competing ways led me to drafting a local change
that unified them all. I’m happy to hear you’re taking a stab at that,
shows my head was in the right place.
> I think you are already on top of `get_available_name` and
`generate_filename` (which uses `validate_file_name`) but want to also
call your attention to `get_valid_name` which is also part of the publicly
overridable API. Currently it calls into `get_valid_filename` which has
yet another implementation of some of the protection logic, it also
appears to be its only used place in the code base.

From me:

> At a high level, I believe the core challenges to resolve are:
> 1. The entanglement and tight coupling of file name and file path
validations in the code.
> 2. The scattering of validations across multiple methods and files.
> To fix 1, I think we should take path validation out of
`validate_file_name` and put it in a dedicated `validate_file_path`. This
makes the `allow_relative_path` switch disappear. Backward compatibility
is an issue with this plan though.
> There is one more change that I considered but postponed, which is
potentially removing the call to `validate_file_name` in
FileField.generate_filename[1] since this method calls it's storage's
`generate_filename` which would *also* call `validate_file_name` (at least
in its default implementation). My main concern here is how often/likely
is to have custom storages overriding `generate_filename` without
validating the given name
> [1]
https://github.com/django/django/blob/main/django/db/models/fields/files.py#L336

Outcome up to this point: we did evaluate adding a public and overridable
method to do validation in the
[https://www.djangoproject.com/weblog/2024/jul/09/security-releases/
recent security release] but we decided not to, to keep the patch small
and self contained, with the counterpart of having to duplicate file name
validation in various places.

I'll be proposing a PR (no logic change) to land in `main` to extend test
coverage in preparation for future refactoring.
--
Ticket URL: <https://code.djangoproject.com/ticket/35607#comment:1>
Reply all
Reply to author
Forward
0 new messages