* cc: net147 (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/14035#comment:12>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Comment (by ptone):
Taking a bit of a look at this - and the problem ends up not being
trivial.
The first and most obvious approach to a fix would be to delete _files in
the property setter for encoding, as is already done for _post.
But taking that step just uncovers a bit more of a yak.
First off, WSGIRequest and HttpRequest differ in the way the POST
attribute works: the former will refresh it if _post is deleted, the
latter will not. This is perhaps by design, but the differences between
these classes are not documented at all.
Second, there are problems with re-parsing the multipart data, as the
first parse involves reading the post data, which sets the
_read_started flag.
This prevents _load_post_and_files from being able to re-parse with the new
encoding; instead it shortcuts to _mark_post_parse_error.
Even if one changes that detail, by allowing the object to set the full
_body attribute on the first parse, at least this test still fails:
test_body_after_POST_multipart
That test exists specifically to verify that accessing body is not allowed
after parsing POST, because we don't want to repeat the expensive
operation of parsing the multipart data.
Basically, there are aspects here that assume parts of this machinery will
only run through once, in a certain order.
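As a rough illustration of that run-once assumption, here is a toy model.
It is not Django's code: the class OneShotRequest, its parse() method and
the parse_error attribute are made up for this sketch, and only the
_read_started flag name is borrowed from the description above.

```python
import io


class OneShotRequest:
    """Toy stand-in for a request whose body stream can be read only once."""

    def __init__(self, stream):
        self._stream = stream
        self._read_started = False
        self.parse_error = False  # hypothetical flag, set on a failed re-parse

    def parse(self):
        # Once the stream has been consumed there is nothing left to read,
        # so a second parse silently returns an empty result.
        if self._read_started:
            self.parse_error = True
            return {}
        self._read_started = True
        data = self._stream.read()
        return dict(pair.split(b"=", 1) for pair in data.split(b"&") if pair)


req = OneShotRequest(io.BytesIO(b"text=hello&charsets=utf-8"))
first = req.parse()   # consumes the stream
second = req.parse()  # nothing left to parse
```

The second call returns an empty dict without raising, which is the kind
of silent failure a caller only notices by inspecting the error flag.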
--
Ticket URL: <https://code.djangoproject.com/ticket/14035#comment:13>
Comment (by claudep):
Generally speaking, content encoding should be set with the charset
parameter of the Content-Type header. So we are talking about corner cases
here. I'd be more inclined to fix the docs and not support changing
encoding after parsing, unless we have a common use case showing a clear
need for this.
--
Ticket URL: <https://code.djangoproject.com/ticket/14035#comment:14>
Comment (by mark@…):
I think I've stumbled on the same bug, or a related side effect. I'm
receiving some data ("Content-Type: multipart/alternative;") in a POST
from an external site (SendGrid) as per their API (
http://sendgrid.com/docs/API_Reference/Webhooks/parse.html ). The
encoding type for one of the fields (text) in the POST varies, and is
given in another POST field (charsets). If I read the encoding type in
charsets and then set request.encoding to this encoding type, the
QueryDict vanishes when I try to get to request.POST['text'], as in
MultiValueDictKeyError: "Key 'text' not found in <QueryDict: {}>"
This sounds like an edge use case to me. I'm going to assume that this
will not get fixed, and look for a workaround, but I thought it would be
worth having a more helpful exception, stating that you can't change the
encoding in this scenario, and updating the docs.
--
Ticket URL: <https://code.djangoproject.com/ticket/14035#comment:15>
Comment (by zimnyx):
Consider the following scenario:
1) read HttpRequest.POST (raw request body is consumed)
2) change HttpRequest.encoding (discards parsed POST entirely)
3) read HttpRequest.POST again
Current behaviour for 3) depends on content-type:
- for multipart/form-data: returns {}
- for application/x-www-form-urlencoded: returns brand new dictionary
with values decoded using the new encoding, just like
[https://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpRequest.encoding docs say]
For "multipart/form-data" the behaviour contradicts the documentation and
gives the false impression that everything went fine. The only way for a
user to notice such an error is to check the HttpRequest._post_parse_error
attribute, but I doubt that's the way to go, because it's neither public
nor documented.
My proposal is to always raise an exception when we try to re-parse POST
for multipart/form-data. Patch attached.
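The urlencoded case in step 3 can be reproduced with the standard library
alone, since the same percent-encoded bytes can simply be decoded again
under a different charset. This is only a sketch of the principle: the
helper decode_form and the sample body are made up for illustration, and
it does not show how Django itself implements the re-parse.

```python
from urllib.parse import parse_qs

# "café" sent as ISO-8859-1: the final character is the single byte 0xE9.
RAW_BODY = b"text=caf%E9"


def decode_form(raw_body: bytes, encoding: str) -> dict:
    """Decode a percent-encoded form body using the given charset."""
    # The body is assumed to be pure ASCII (percent-encoded) on the wire.
    return parse_qs(raw_body.decode("ascii"), encoding=encoding, errors="replace")


as_utf8 = decode_form(RAW_BODY, "utf-8")         # 0xE9 alone is invalid UTF-8
as_latin1 = decode_form(RAW_BODY, "iso-8859-1")  # decodes to the intended text
```

Because the raw bytes are still available, the second decode succeeds; a
multipart body read from a consumed stream offers no such second chance.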
--
Ticket URL: <https://code.djangoproject.com/ticket/14035#comment:17>
* owner: nobody => zimnyx
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/14035#comment:18>
Comment (by Silvan Spross):
hi all, I just found this ticket after I created a related question on
stackoverflow regarding the same sendgrid problem:
http://stackoverflow.com/questions/42862082/reencode-django-request-body-with-another-encoding
is there a way to get around this at the moment? thank you for your help!
--
Ticket URL: <https://code.djangoproject.com/ticket/14035#comment:19>
Comment (by Stephane Poss):
Hello,
5 years later and 3h of debugging... I stumbled on this. There was a
discrepancy between my production code and my unit tests: it works in
prod, not in tests. My code listens to PayPal IPNs, which are not
necessarily encoded in utf-8; the encoding is given in a field of the POST
data. They do not document the content-type used. The unit tests use
multipart/form-data.
Reading
https://docs.djangoproject.com/en/3.2/ref/unicode/#form-submission
led me to believe I could change the encoding after reading the POST data
-> this cannot work in unit tests without changing the content type (all
good once the actual content type to use is 'guessed'). I believe a note
must be added to
https://docs.djangoproject.com/en/3.2/ref/unicode/#form-submission
explaining that changing the encoding based on POST data content **only**
works if the content-type is application/x-www-form-urlencoded.
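
For anyone landing here with the same problem, a possible workaround for
urlencoded payloads is to decode the raw bytes yourself in two passes:
once to find the declared charset, once to decode the values with it. This
is a minimal sketch using only the standard library. The helper name, the
"charset" field name and the sample payload are assumptions for
illustration; in a Django view the bytes would presumably come from
request.body.

```python
from urllib.parse import parse_qs


def decode_with_declared_charset(raw_body, charset_field="charset",
                                 default="utf-8"):
    """Decode a urlencoded body using the charset named in one of its fields."""
    # Assumes the body is percent-encoded, i.e. pure ASCII on the wire.
    text = raw_body.decode("ascii")
    # First pass: charset names are plain ASCII, so latin-1 reads them safely.
    probe = parse_qs(text, encoding="latin-1")
    charset = probe.get(charset_field, [default])[0]
    # Second pass: decode every value with the declared charset.
    # An unknown charset name raises LookupError here.
    return parse_qs(text, encoding=charset, errors="replace")


# Hypothetical payload: "René" encoded as windows-1252 (0xE9 is "é").
body = b"charset=windows-1252&first_name=Ren%E9"
fields = decode_with_declared_charset(body)
```

This sidesteps request.encoding entirely, so it does not depend on whether
the framework can re-parse POST, but it does not help with multipart
bodies.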
--
Ticket URL: <https://code.djangoproject.com/ticket/14035#comment:20>