[Django] #36023: Update content_disposition_header to handle control chars.

17 views
Skip to first unread message

Django

unread,
Dec 17, 2024, 11:48:16 AM12/17/24
to django-...@googlegroups.com
#36023: Update content_disposition_header to handle control chars.
-------------------------------+-----------------------------------------
Reporter: Alex Vandiver | Type: Uncategorized
Status: new | Component: Uncategorized
Version: 5.1 | Severity: Normal
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+-----------------------------------------
content_disposition_header does not correctly handle newlines and other
control characters in values that are given to it.

See https://github.com/django/django/pull/18890 for a fix.
--
Ticket URL: <https://code.djangoproject.com/ticket/36023>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Dec 17, 2024, 2:33:51 PM12/17/24
to django-...@googlegroups.com
#36023: Update content_disposition_header to handle control chars.
-------------------------------+--------------------------------------
Reporter: Alex Vandiver | Owner: (none)
Type: Bug | Status: closed
Component: HTTP handling | Version: dev
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Changes (by Natalia Bidart):

* component: Uncategorized => HTTP handling
* resolution: => needsinfo
* status: new => closed
* type: Uncategorized => Bug
* version: 5.1 => dev

Comment:

Hello Alex, thank you for taking the time to create this report. I have
seen the PR but I think this ticket need more background information about
why this change is needed. What I mean is that the PR shows the *what*,
but in order to accept the ticket, we need to understand the *why*.

I'll close as `needsinfo` but please reopen when you can provide a use
case describing the need/issue being solved. A small Django project or a
way to trigger the bug would also help. Thanks again!
--
Ticket URL: <https://code.djangoproject.com/ticket/36023#comment:1>

Django

unread,
Dec 18, 2024, 1:41:01 PM12/18/24
to django-...@googlegroups.com
#36023: Update content_disposition_header to handle control chars.
-------------------------------+--------------------------------------
Reporter: Alex Vandiver | Owner: (none)
Type: Bug | Status: new
Component: HTTP handling | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+--------------------------------------
Changes (by Alex Vandiver):

* resolution: needsinfo =>
* status: closed => new

Comment:

Headers cannot contain newline characters and other control characters.
The current code passes them through unchanged, and filenames should not
be assumed to be free from them. For instance:


{{{
def index(request):
filename = "foo\nbar.pdf"
return HttpResponse(
"Some PDF content",
headers={"Content-Disposition": content_disposition_header(True,
filename)},
)
}}}

...will produce:
{{{
Traceback (most recent call last):
File "/srv/zulip-py3-venv/lib/python3.10/site-
packages/django/core/handlers/exception.py", line 55, in inner
response = get_response(request)
File "/srv/zulip-py3-venv/lib/python3.10/site-
packages/django/core/handlers/base.py", line 197, in _get_response
response = wrapped_callback(request, *callback_args,
**callback_kwargs)
File "/home/zulipdev/newline-disposition/newlines/views.py", line 7, in
index
return HttpResponse(
File "/srv/zulip-py3-venv/lib/python3.10/site-
packages/django/http/response.py", line 374, in __init__
super().__init__(*args, **kwargs)
File "/srv/zulip-py3-venv/lib/python3.10/site-
packages/django/http/response.py", line 115, in __init__
self.headers = ResponseHeaders(headers)
File "/srv/zulip-py3-venv/lib/python3.10/site-
packages/django/http/response.py", line 41, in __init__
self[header] = value
File "/srv/zulip-py3-venv/lib/python3.10/site-
packages/django/http/response.py", line 87, in __setitem__
value = self._convert_to_charset(value, "latin-1", mime_encode=True)
File "/srv/zulip-py3-venv/lib/python3.10/site-
packages/django/http/response.py", line 61, in _convert_to_charset
raise BadHeaderError(
django.http.response.BadHeaderError: Header values can't contain newlines
(got 'attachment; filename="foo\nbar.pdf"')
}}}

The PR switches that to correctly percent-encode them, so that they are
can be passed as an HTTP header.
--
Ticket URL: <https://code.djangoproject.com/ticket/36023#comment:2>

Django

unread,
Dec 18, 2024, 8:36:39 PM12/18/24
to django-...@googlegroups.com
#36023: Update content_disposition_header to handle control chars.
-------------------------------+-----------------------------------------
Reporter: Alex Vandiver | Owner: Alex Vandiver
Type: Bug | Status: assigned
Component: HTTP handling | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+-----------------------------------------
Changes (by JaeHyuckSa):

* owner: (none) => Alex Vandiver
* status: new => assigned

--
Ticket URL: <https://code.djangoproject.com/ticket/36023#comment:3>

Django

unread,
Dec 19, 2024, 2:52:47 AM12/19/24
to django-...@googlegroups.com
#36023: Update content_disposition_header to handle control chars.
-------------------------------+-----------------------------------------
Reporter: Alex Vandiver | Owner: Alex Vandiver
Type: Bug | Status: assigned
Component: HTTP handling | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+-----------------------------------------
Changes (by Sarah Boyce):

* stage: Unreviewed => Accepted

Comment:

Replying to [comment:2 Alex Vandiver]:
> filenames should not be assumed to be free from them.

I guess it is possible control characters might be in a filename but very
rare. This is also mentioned here:
https://datatracker.ietf.org/doc/html/rfc6266#section-4.3

> Recipients SHOULD strip or replace character sequences that are known to
cause confusion both in user interfaces and in filenames, such as control
characters and leading and trailing whitespace.
--
Ticket URL: <https://code.djangoproject.com/ticket/36023#comment:4>

Django

unread,
Dec 19, 2024, 5:05:30 PM12/19/24
to django-...@googlegroups.com
#36023: Update content_disposition_header to handle control chars.
-------------------------------+-----------------------------------------
Reporter: Alex Vandiver | Owner: Alex Vandiver
Type: Bug | Status: assigned
Component: HTTP handling | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------+-----------------------------------------
Comment (by Alex Vandiver):

We ran into this in the context of [https://zulip.com/ Zulip]; it stores
files uploaded by arbitrary web browsers, and attempts to serve the files
back to them with the same filenames they were uploaded as. In our
production deploy, we observe filenames with newlines on an infrequent but
non-zero frequency.

I agree that non-newline control characters is a bit more of a "did you
really mean that," but filtering them should be the job of the application
calling the function. The function should either produce the requested
output correctly, or throw an exception -- not corrupt the HTTP response.
Given that it's a SHOULD, not a MUST, I feel it's reasonable to produce
the requested output.
--
Ticket URL: <https://code.djangoproject.com/ticket/36023#comment:5>

Django

unread,
Jan 6, 2025, 11:21:30 AM1/6/25
to django-...@googlegroups.com
#36023: Update content_disposition_header to handle control chars.
-------------------------------------+-------------------------------------
Reporter: Alex Vandiver | Owner: Alex
| Vandiver
Type: Bug | Status: assigned
Component: HTTP handling | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/36023#comment:6>

Django

unread,
Jan 7, 2025, 3:22:18 AM1/7/25
to django-...@googlegroups.com
#36023: Update content_disposition_header to handle control chars.
-------------------------------------+-------------------------------------
Reporter: Alex Vandiver | Owner: Alex
| Vandiver
Type: Bug | Status: closed
Component: HTTP handling | Version: dev
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce <42296566+sarahboyce@…>):

* resolution: => fixed
* status: assigned => closed

Comment:

In [changeset:"8914b571eb5f93722b9741b1da9eb69347271b11" 8914b57]:
{{{#!CommitTicketReference repository=""
revision="8914b571eb5f93722b9741b1da9eb69347271b11"
Fixed #36023 -- Handled controls chars in content_disposition_header.

To use the simple `filename="..."` form, the value must conform to the
official grammar from RFC6266[^1]:

filename-parm = "filename" "=" value
value = <value, defined in [RFC2616], Section 3.6>
; token | quoted-string

The `quoted-string` definition comes from RFC 9110[^2]:

```
quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE
qdtext = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text

The backslash octet ("\") can be used as a single-octet quoting
mechanism within quoted-string and comment constructs. Recipients that
process the value of a quoted-string MUST handle a quoted-pair as if
it were replaced by the octet following the backslash.

quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text )

A sender SHOULD NOT generate a quoted-pair in a quoted-string except
where necessary to quote DQUOTE and backslash octets occurring within
that string.
```

That is, quoted strings are able to express horizontal tabs, space
characters, and everything in the range from 0x21 to 0x7e, expect for
0x22 (`"`) and 0x5C (`\`), which can still be expressed but must be
escaped with their own `\`.

We ignore the case of `obs-text`, which is defined as the range
0x80-0xFF, since its presence is there for permissive parsing of
accidental high-bit characters, and it should not be generated by
conforming implementations.

Transform this character range into a regex and apply it in addition
to the "is ASCII" check. This ensures that all simple filenames are
expressed in the simple format, and that all filenames with newlines
and other control characters are properly expressed with the
percent-encoded `filename*=...`form.

[^1]: https://datatracker.ietf.org/doc/html/rfc6266#section-4.1
[^2]: https://datatracker.ietf.org/doc/html/rfc9110#name-quoted-strings
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36023#comment:7>
Reply all
Reply to author
Forward
0 new messages