>>> from webob import Request
>>> req = Request.blank('/', method='POST')
>>> req.body = "<empty/>"
>>> req.body
'<empty/>'
>>> req.POST
MultiDict([('<empty/>', '')])
>>> req.body
'%3Cempty%2F%3E='
>>>
After assignment, I check req.body
It's good.
Then I access req.POST, which is also fine.
But then I access req.body again, and it's been altered!
This is not expected!
--
Jon
ping?
--
Jon
Even when I am using a content-type of application/x-www-form-urlencoded,
if I access req.body *before* req.POST, body looks like this:
'%3Cempty%2F%3E%0A%0A'
if I merely access req.POST, req.body becomes this:
'%3Cempty%2F%3E%0A%0A='
The former was supplied by curl. The fact that webob alters req.body
regardless of the content type, and that it does so after accessing
req.POST, certainly violates the principle of least surprise.
--
Jon
Wait, you're saying that you're using
'application/x-www-form-urlencoded' content-type, right? And the body
is not a valid urlencoded string, correct? What accessing .POST does
is normalize the request body, which is arguably the right thing to
do. This can be a problem if you don't set the correct content-type
on requests, but I'm not sure why it would be a problem otherwise.
I think you're wrong in saying "webob alters req.body regardless of
the content type".
No. In some cases I'm using that content type, and sometimes the body
matches that content type and other times it doesn't. However, in
*both* cases accessing req.POST *changes* req.body.
For example, given:
content-type: application/x-www-form-urlencoded
body: properly URL-encoded content
accessing req.POST does actually change req.body
and when given:
content-type: application/x-www-form-urlencoded
body: content that has *not* been URL-encoded (but is not invalid, either)
accessing req.POST renders req.body into a url-encoded version.
Part of the problem is that content-type is not necessarily correct.
All sorts of clients do all sorts of broken stuff, and I found that
when accessing req.POST req.body changed, and this was very much a
surprise to me. Accessing (but not changing) req.POST I would have
treated as a "read" operation (currying and memoization aside) but I
got a "write" operation as an unexpected side-effect.
Absolutely it can be argued that accessing req.POST normalizes the
body (using the content type header to do so) but it still violates
the principle of least surprise.
> I think you're wrong in saying "webob alters req.body regardless of
> the content type".
It appears to do so, however. In the example I posted content still
changed, even when the content was url-encoded and the header was set
appropriately - note the addition of the trailing '='. It's
irrelevant if the /decoded/ content is the same, a byte comparison of
req.body /before/ and req.body /after/ still shows that they are
different.
Another way to say it might be that accessing req.POST will rewrite req.body.
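
To make the byte-level difference concrete, here is a minimal sketch using only the standard library. It is not WebOb's code; round_trip is a hypothetical helper that parses a form body and re-serializes it, which is my assumption about the kind of normalization being discussed. It reproduces both bodies quoted in this thread.

```python
from urllib.parse import parse_qsl, urlencode


def round_trip(body: str) -> str:
    """Parse a urlencoded form body, then serialize the parsed pairs again.

    Hypothetical helper for illustration only; it is not taken from WebOb.
    """
    # keep_blank_values=True keeps a bare key such as "<empty/>" as (key, "").
    pairs = parse_qsl(body, keep_blank_values=True)
    # urlencode always writes "key=value", so a bare key gains a trailing "=".
    return urlencode(pairs)


print(round_trip("<empty/>"))              # %3Cempty%2F%3E=
print(round_trip("%3Cempty%2F%3E%0A%0A"))  # %3Cempty%2F%3E%0A%0A=
print(round_trip("a=1&b=2"))               # a=1&b=2
```

In this sketch a body that already consists of key=value pairs survives unchanged, while a body that parses as a single bare key comes back with an added '=', so it no longer matches the original byte for byte.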
--
Jon
> Well... stuff happens ;) When the request is invalid in some sense, there's
> only so much you can promise. Most other libraries simply make the body
> inaccessible after parsing the form. I'm not sure if it is the case, but it
> *should* be the case that if you access req.body, then it should stay
> constant, as there's no particular reason with respect to optimization to
> throw away the cached body; that is, if it's not the case, it's a bug that
> should be fixed. If you access req.POST before req.body, I don't think
> WebOb can promise to keep the original body.
It seems to me, then, that perhaps it's a documentation issue.
Maybe req.body should have a big fat caveat which states that
accessing req.POST will always normalize/re-write req.body /according
to the currently set content-type/ and that req.body is "raw". An
alternative might be to say that setting req.body *before*
req.content_type (if it is not already set) is bad/deprecated/a
warning, and then *when* req.body is set it is encoded using the
current content-type (if encoding is appropriate). I'm not sold on
that idea, either, but it seems to me that the disconnect is between
req.body being /sometimes/ treated as though it is raw, and other
times encoded - depending on the order of access between .POST and
.body (and the value of .content_type). Perhaps this is an area that
could use some clarification, at least when it comes to the documentation?
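
Until the documentation settles this, one defensive pattern is to take your own copy of the raw bytes before anything parses the form. The sketch below works on a plain WSGI environ dict and uses only the standard library; snapshot_body is a hypothetical helper, not a WebOb API, and it assumes CONTENT_LENGTH is present and accurate.

```python
import io


def snapshot_body(environ: dict) -> bytes:
    """Read the raw request body once and leave a rewound copy in its place.

    Hypothetical helper, not part of WebOb. Assumes CONTENT_LENGTH is
    trustworthy; a missing or malformed value is treated as an empty body.
    """
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    raw = environ["wsgi.input"].read(length) if length > 0 else b""
    # Put a fresh stream back so later consumers can still read the body.
    environ["wsgi.input"] = io.BytesIO(raw)
    environ["CONTENT_LENGTH"] = str(len(raw))
    return raw


environ = {"CONTENT_LENGTH": "8", "wsgi.input": io.BytesIO(b"<empty/>")}
raw = snapshot_body(environ)
print(raw)  # b'<empty/>'
```

The copy returned here is an ordinary bytes object, so it stays the same regardless of what a later form parser does to the request.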
>> Another way to say it might be that accessing req.POST will rewrite
>> req.body
... if it hasn't been re-written already.
Unless you are saying that every time I access req.POST it
normalizes/re-writes req.body, in which case that seems inefficient.
> What Sergey is saying, which I believe is true, is that req.body won't
> change if you have a request content type that can't be confused for a form
> submission (i.e., empty or one of the two form content types).
That seems to be the key here.
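
As a rough sketch of that rule as I read it from the quoted text (not checked against WebOb's source): a request is at risk only when its content type is empty or one of the two form types. The name may_be_parsed_as_form and the constant below are hypothetical.

```python
# Content types that, per the quoted statement, can be taken for a form
# submission. This set is an assumption drawn from this thread.
FORM_CONTENT_TYPES = {
    "",
    "application/x-www-form-urlencoded",
    "multipart/form-data",
}


def may_be_parsed_as_form(content_type_header) -> bool:
    """Return True if the header value could be treated as a form submission."""
    # Drop parameters such as "; charset=UTF-8" and normalize case.
    media_type = (content_type_header or "").split(";", 1)[0].strip().lower()
    return media_type in FORM_CONTENT_TYPES


print(may_be_parsed_as_form("text/xml"))                           # False
print(may_be_parsed_as_form("application/x-www-form-urlencoded"))  # True
print(may_be_parsed_as_form(None))                                 # True
```

A client posting XML with an explicit text/xml content type would fall outside this set, which is the case where the body is said to stay untouched.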
Thanks for the replies! I think this bears thinking about, however.
The interaction between .POST, .body, and .content_type seems like a
bit of a booby trap.
--
Jon