Fail to recognize json payload in a content-type: application/json POST request


Ray (a.k.a. Iceberg)

Aug 1, 2013, 2:54:39 AM
To whom it may concern:

I tried posting vars in JSON format, with a content-type: application/json header, to my web2py application.

When I run this test on my laptop, i.e. with requests sent directly to web2py's Rocket web server, a recent web2py (2.5.1-stable) successfully decodes the payload into request.vars, as expected and as described in https://groups.google.com/d/msg/web2py/9YdxVpuJlA8/ek0zJae5U9YJ

But when I deploy my application to my production server, the same web2py 2.5.1-stable cannot recognize the payload, and request.vars is always empty. How come?

My production server runs web2py (originally 2.4.6, then manually overwritten with the files from 2.5.1's web2py_src.zip) behind Apache's mod_wsgi, which itself sits behind a global nginx; that architecture is a WebFaction convention, which I think is not relevant.

Below is the full content of the request:

POST /examples/simple_examples/status/foo/bar.json HTTP/1.1
content-type: application/json
Host: example.com
Content-Length:14
user-agent: fake
Connection:Keep-Alive

{"foo": "bar"}
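For anyone who wants to reproduce the request above from Python, here is a minimal sketch using only the standard library. The helper name build_json_post is made up for illustration, and the URL simply reuses the example.com host from the request; the request is built but not sent.

```python
import json
import urllib.request


def build_json_post(url, payload):
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "User-Agent": "fake"},
        method="POST",
    )


# Hypothetical target: same path as in the raw request above.
req = build_json_post(
    "http://example.com/examples/simple_examples/status/foo/bar.json",
    {"foo": "bar"},
)
# urllib.request.urlopen(req) would actually send it.
```

Passing the request to urllib.request.urlopen sends it; the body is the same 14 bytes as in the raw request.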


And below is part of the web2py response, showing the request content. You can see that web2py got the correct content-type header, but still fails to decode the JSON payload in the request.

env:
  content_length: 14
  content_type: application/json
  http_connection: close
  http_forwarded_request_uri: /examples/simple_examples/status/foo/bar.json
  http_host:
  http_http_x_forwarded_proto: http
  http_https: off
  http_user_agent: fake
  http_x_forwarded_for: 107.23.xxx.xxx
  http_x_forwarded_host:
  http_x_forwarded_proto: http
  http_x_forwarded_server:
  http_x_forwarded_ssl: off

post_vars:

vars:

wsgi:
  environ:
    CONTENT_LENGTH: 14
    CONTENT_TYPE: application/json




Niphlod

Aug 1, 2013, 5:49:24 AM
to web...@googlegroups.com
I really can't reproduce it. Can you log what happens somewhere within the parse_get_post_vars function?

dhmorgan

Aug 1, 2013, 8:46:14 AM
to web...@googlegroups.com
The local server will automatically use generic views if a corresponding view is not found, which is the case with those examples; this is disabled by default in production environments, but the default can be overridden.

In chapter 4 (the Core), there is this from the book:

  • If a view is not found, web2py tries to use a generic view. By default, generic views are disabled, although the 'welcome' app includes a line in /models/db.py to enable them on localhost only. They can be enabled per extension type and per action (using response.generic_patterns). In general, generic views are a development tool and typically should not be used in production. If you want some actions to use a generic view, list those actions in response.generic_patterns (discussed in more detail in the chapter on Services).


check 

Ray (a.k.a. Iceberg)

Aug 1, 2013, 10:25:07 AM
to web...@googlegroups.com
Thanks for trying to help, Niphlod. Here are more details.

Strangely, the problem does NOT exist when I test with http://web2py.com/examples/simple_examples/status

So I compared my output page with the one from web2py.com, and I found something.

In my server's output, "application/json" appears only TWO times: once in request.env.content_type and once in request.wsgi.environ.CONTENT_TYPE.

In web2py.com's output, "application/json" appears FOUR times: in request.env.content_type, request.env.http_content_type, request.wsgi.environ.CONTENT_TYPE, and request.wsgi.environ.HTTP_CONTENT_TYPE.

And it turns out that line 343 of gluon/main.py, in parse_get_post_vars(), uses only "http_content_type" to trigger the JSON support. http://code.google.com/p/web2py/source/browse/gluon/main.py

That explains the symptom. So the next question is: what makes request.env.http_content_type missing in my case? Is it somehow because my Apache sits behind an nginx (a WebFaction convention), so request.is_local is always True (which is not the case for web2py.com)?

Or shall we simply change line 343 of gluon/main.py to this:

   
is_json = (env.get('http_content_type', '') or env.get('content_type', ''))[:16] == 'application/json'
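To see how this proposed check would behave, here is a minimal sketch that wraps it in a function. Plain dicts stand in for web2py's request.env (an assumption for illustration), and the names is_json_request, apache_like_env and rocket_like_env are hypothetical.

```python
def is_json_request(env):
    """Proposed check: fall back to content_type when http_content_type is absent."""
    content_type = env.get('http_content_type', '') or env.get('content_type', '')
    return content_type[:16] == 'application/json'


# Stand-ins for request.env, modelled on the two outputs compared above.
apache_like_env = {'content_type': 'application/json'}
rocket_like_env = {
    'content_type': 'application/json',
    'http_content_type': 'application/json',
}

print(is_json_request(apache_like_env))  # only content_type is present
print(is_json_request(rocket_like_env))  # both keys are present
```

With the fallback in place, both environments are treated as JSON requests; without it, only the second one would be.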

Thoughts?

Jonathan Lundell

Aug 1, 2013, 11:38:17 AM
to web...@googlegroups.com
The specs associated with the Content-Type header are a little complicated. WSGI inherits them from CGI. 

The general idea is that (if the protocol is HTTP) the environ members beginning with http_ were supplied by the client. Members not starting with http_ are supplied by the server (possibly based on client headers).

The bottom line: the server is required to provide content_type if the client supplies one (otherwise it's optional); it is *not* required to supply the client header http_content_type.


6.1.3. CONTENT_TYPE
...
   Servers MUST provide this metavariable to scripts if a
   "Content-Type" field was present in the original request
   header. If the server receives a request with an attached
   entity but no "Content-Type" header field, it MAY attempt to
   determine the correct datatype, or it MAY omit this
   metavariable when communicating the request information to the
   script.

6.1.5. Protocol-Specific Metavariables
...
   Metavariables with names beginning with "HTTP_" contain values
   from the request header, if the scheme used was HTTP. Each
   HTTP header field name is converted to upper case, has all
   occurrences of "-" replaced with "_", and has "HTTP_"
   prepended to form the metavariable name.
...
   Servers are not required to create metavariables for all the
   request header fields that they receive. In particular, they
   MAY decline to make available any header fields carrying
   authentication information, such as "Authorization", or which
   are available to the script via other metavariables, such as
   "Content-Length" and "Content-Type".
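As a small illustration of the naming rule in the excerpt, the sketch below maps a request header name to the metavariable a script can rely on. The function name is made up, and the special-casing of Content-Type and Content-Length reflects the reading of the spec given in this message, not code from any particular server.

```python
def reliable_metavariable_name(header_name):
    """Return the metavariable name a script can count on for a request header.

    Header names are upper-cased and '-' becomes '_'. Content-Type and
    Content-Length have dedicated metavariables without the HTTP_ prefix;
    the HTTP_-prefixed copies of those two are optional.
    """
    name = header_name.upper().replace('-', '_')
    if name in ('CONTENT_TYPE', 'CONTENT_LENGTH'):
        return name
    return 'HTTP_' + name


for header in ('Content-Type', 'Content-Length', 'User-Agent', 'X-Forwarded-For'):
    print(header, '->', reliable_metavariable_name(header))
```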

Rocket, for example, complies:

        if 'HTTP_CONTENT_TYPE' in environ:
            environ['CONTENT_TYPE'] = environ['HTTP_CONTENT_TYPE']


So, if anything, web2py should be looking at content_type only; it's a mistake to look at http_content_type only, and (by spec) unnecessary to look at http_content_type at all.

If we're really worried about spec-breaking servers, we could repeat the Rocket logic (above) early on incoming requests (and the same for content_length, which has similar rules).

Ray (a.k.a. Iceberg)

Aug 1, 2013, 11:44:58 AM
to web...@googlegroups.com


Sounds convincing! So it is a bug that needs to be fixed. Let's see what Massimo or others say.

Niphlod

Aug 1, 2013, 12:43:45 PM
to web...@googlegroups.com
+1 for the proposed fix. I learned something new in the process ^_^

PS @all: I'd like to backport web3py's lazy request.vars, request.get_vars, request.post_vars, request.cookies, and request.env to web2py, starting tomorrow. If other things like this should be patched in globals.py or in main.py, I think it would be a good occasion to fix them once and for all.

Derek

Aug 1, 2013, 12:58:40 PM
to web...@googlegroups.com
You should not be posting JSON strings to the web server. The 'post_vars' can contain anything that JSON itself can represent, and it is more efficient. If you post a JSON string with $.post, it will convert that JSON string to HTTP variables, because that's what you should be doing.

Niphlod

Aug 1, 2013, 2:51:42 PM
to web...@googlegroups.com
@derek and @dhmorgan: actually, what Iceberg posted is fine. It's really a subtle bug that needs to be addressed, as per the docs posted by our own omniscient Jonathan, and it can happen with some particular (although allowed) server architectures.

@jonathan: before diving into Rocket's own "patching of spec-breaking servers", is there any other header we need to address?

Jonathan Lundell

Aug 1, 2013, 3:03:34 PM
to web...@googlegroups.com
content_length is the other one in this category.

A clarification, though: Rocket is not patching spec-breaking servers; it's just a server complying with the spec, which mandates content_type if the client has supplied one (which would optionally appear as http_content_type).

A spec-breaking server would be one that does not include content_type when one is provided by the client.

The bug is that web2py relies on http_content_type, even though the spec does not require the server to include it. 

My comment about working around a spec break is purely hypothetical, and applies to the case where the client provides Content-Type, and the server passes that along as http_content_type (as it should, but is not required to do) and does not also pass it as content_type (which it *is* required to do). 

Niphlod

Aug 1, 2013, 3:11:02 PM
to web...@googlegroups.com
OK, thanks for the additional explanation.

tl;dr: As we don't "want to support" any spec-breaking servers (+1 on that), the only thing to take care of is to read both the content-type and content-length headers directly from env (as content_type and content_length), without expecting http_content_type or http_content_length to be present.

Did I get that right?

Jonathan Lundell

Aug 1, 2013, 3:21:28 PM
to web...@googlegroups.com
Yes. 

I'm not sure I entirely agree about broken servers, though. Paraphrasing Postel's Law: "Be conservative in what you send, be liberal in what you accept."



Niphlod

Aug 1, 2013, 3:30:27 PM
to web...@googlegroups.com
OK. So, to be on the safe side, if env.http_content_type and env.http_content_length are provided, gluon.main should update the env accordingly, and then the code can happily always use env.content_length and env.content_type.

Jonathan Lundell

Aug 1, 2013, 3:44:30 PM
to web...@googlegroups.com
That would be the idea. I don't actually like the extra complication, but the thought that somebody might be relying on bogus behavior makes me just *slightly* nervous.

I'd do this (pseudo-code):

if not env.content_type and env.http_content_type:
    env.content_type = env.http_content_type

...and so on. That is, don't touch variables that the server has already set.
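Spelled out as runnable Python, the idea could look like the sketch below. It works on a copy of a plain WSGI-style environ dict rather than on web2py's request.env, the function name is hypothetical, and this is not the code in gluon/main.py.

```python
def fill_missing_content_metavariables(environ):
    """Copy HTTP_CONTENT_* into CONTENT_* only where the server left them unset."""
    fixed = dict(environ)  # work on a copy; the caller's environ is untouched
    for name in ('CONTENT_TYPE', 'CONTENT_LENGTH'):
        http_name = 'HTTP_' + name
        if not fixed.get(name) and fixed.get(http_name):
            fixed[name] = fixed[http_name]
    return fixed


# Hypothetical spec-breaking server: only the HTTP_-prefixed copies are present.
broken = {'HTTP_CONTENT_TYPE': 'application/json', 'HTTP_CONTENT_LENGTH': '14'}
print(fill_missing_content_metavariables(broken)['CONTENT_TYPE'])
```

Values that the server has already set are left alone, even if the HTTP_-prefixed copy differs.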

I wouldn't argue too hard for doing that, though, esp. if Massimo's OK with leaving it out. Which would mean just changing our is_json test to look at content_type. (I scanned the rest of the source, and that seems to be the only place this happens.)

Massimo Di Pierro

Aug 2, 2013, 3:11:45 AM
to web...@googlegroups.com
Our policy is that request.env is just the wsgi environment, without computed variables.
Perhaps this?

if not request.env.content_type and request.env.http_content_type:
    request.content_type = request.env.http_content_type
else:
    request.content_type = request.http_content_type

Michele Comitini

Aug 2, 2013, 8:12:22 AM
to web...@googlegroups.com
Ray,


Just to be sure... after upgrading to 2.5.1, did you restart Apache?
If you did and it still does not work, open a ticket on Google Code.

In the meantime, you can try reading request.body; that should work.

mic
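A rough sketch of that workaround, for illustration only: decode the JSON yourself from the body stream instead of relying on request.vars. The helper name is made up, and io.BytesIO stands in for web2py's request.body, which is assumed here to be a file-like object.

```python
import io
import json


def decode_json_body(body, content_type):
    """Return the decoded JSON payload, or None if the request is not JSON."""
    if not (content_type or '').startswith('application/json'):
        return None
    body.seek(0)  # start from the beginning in case the stream was already read
    return json.loads(body.read().decode('utf-8'))


# Stand-in for request.body carrying the payload from the original post.
fake_body = io.BytesIO(b'{"foo": "bar"}')
print(decode_json_body(fake_body, 'application/json'))
```

In a controller the call would presumably be decode_json_body(request.body, request.env.content_type).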



---
You received this message because you are subscribed to the Google Groups "web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Jonathan Lundell

Aug 2, 2013, 10:12:23 AM
to web...@googlegroups.com
On 2 Aug 2013, at 12:11 AM, Massimo Di Pierro <massimo....@gmail.com> wrote:
Our policy is that request.env is just the wsgi environment, without computed variables.

Except for fixup_missing_path_info.

Perhaps this?

if not request.env.content_type and request.env.http_content_type:
    request.content_type = request.env.http_content_type
else:
    request.content_type = request.http_content_type

Are you suggesting a new request variable to hold content_type? 

I don't think we really need to do that, and regardless that's not the right logic. The server is not required to give us env.http_content_type (nor *any* content_type if there's no content). If we really, really want request.content_type:

if request.env.content_type:
    request.content_type = request.env.content_type
elif request.env.http_content_type:
    request.content_type = request.env.http_content_type


There are two issues here. 

1. web2py has a bug: it's using env.http_content_type to set is_json, and it should be using env.content_type. That's because the server is required to give us env.content_type (if there's content; note that we don't get env.content_type for a GET), but is not required to give us env.http_content_type. The fix is easy; just change the is_json line to use the right variable.

wrong: is_json = env.get('http_content_type', '')[:16] == 'application/json' 
right: is_json = env.get('content_type', '')[:16] == 'application/json'
or: is_json = env.get('content_type', '').startswith('application/json')
(because I don't like magic numbers)
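A quick sketch comparing the slice form and the startswith form, both reading content_type. Plain dicts stand in for request.env and the function names are hypothetical.

```python
JSON_TYPE = 'application/json'


def is_json_slice(env):
    # 16 is len('application/json'): the "magic number" variant
    return env.get('content_type', '')[:16] == JSON_TYPE


def is_json_startswith(env):
    # same prefix test, without the hard-coded length
    return env.get('content_type', '').startswith(JSON_TYPE)


samples = [
    {'content_type': 'application/json'},
    {'content_type': 'application/json; charset=utf-8'},
    {'content_type': 'text/plain'},
    {},  # e.g. a GET with no content type at all
]
for env in samples:
    print(env, is_json_slice(env), is_json_startswith(env))
```

Both variants are prefix matches, so they agree on every input; note that either one would also accept a type such as application/json-patch+json.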
 

2. Phantom issue: should we try to anticipate servers that do not behave as they're required to do, that is, give us env.http_content_type but not env.content_type? We don't actually know that such servers exist; hopefully not. However, if it *did* happen (we get env.http_content_type and not env.content_type), then it's obvious what to do. So do we do it proactively? 

We do it already for fcgi in fixup_missing_path_info, and that may be a (policy) mistake. It's obscure, there's no good way of testing it, and we don't know whether there's a single web2py installation using a broken server that doesn't have path_info. But we're sorta stuck with it, because taking it out might break something, somewhere (maybe it should have gone in fcgihandler in the first place).




Niphlod

Aug 2, 2013, 12:19:21 PM
to web...@googlegroups.com
Hold on for a few hours: I'm going to submit a patch for request that uses lazy evaluation of vars (à la web3py). It should be a good occasion to do a general cleanup of all those bits, shouldn't it?

Jonathan Lundell

Aug 2, 2013, 12:37:52 PM
to web...@googlegroups.com
No reason not to hold off, but content_type can't be lazy.

BTW, I think there's another minor bug in the is_json logic: the seek(0) call should be *after* the entire try/except. We want to allow rereading the content regardless of whether there was a load exception.
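A minimal sketch of that ordering, assuming the body is a seekable file-like object; the function name is hypothetical and this is not the actual parse_get_post_vars code.

```python
import io
import json


def parse_json_body(body):
    """Try to decode a JSON body; always rewind the stream afterwards."""
    json_vars = {}
    try:
        json_vars = json.loads(body.read().decode('utf-8'))
    except ValueError:
        # malformed JSON (or bad UTF-8): fall back to no vars
        pass
    # Rewind after the whole try/except, so the body can be re-read
    # whether or not decoding succeeded.
    body.seek(0)
    return json_vars


bad_body = io.BytesIO(b'{not json')
print(parse_json_body(bad_body), bad_body.tell())
```

Even when decoding fails, the stream position is back at the start, so later code can still read the raw content.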

Also, this might be a good opportunity for var laziness, depending on how it works. For json-rpc apps like mine, parsing incoming application/json payloads into vars is a complete waste of time.
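To make the laziness point concrete, here is a toy sketch in which the payload is parsed only on first access to post_vars. LazyRequest and its parse_count counter are invented for illustration; this is not how web2py or web3py implement their request object.

```python
import io
import json


class LazyRequest(object):
    """Toy request whose post_vars are parsed on first access only."""

    def __init__(self, body, content_type):
        self.body = body
        self.content_type = content_type or ''
        self._post_vars = None
        self.parse_count = 0  # for demonstration: how often parsing ran

    @property
    def post_vars(self):
        if self._post_vars is None:
            self.parse_count += 1
            parsed = {}
            if self.content_type.startswith('application/json'):
                try:
                    parsed = json.loads(self.body.read().decode('utf-8'))
                except ValueError:
                    parsed = {}
                self.body.seek(0)  # leave the body re-readable
            self._post_vars = parsed
        return self._post_vars


request = LazyRequest(io.BytesIO(b'{"foo": "bar"}'), 'application/json')
raw = request.body.read()   # a json-rpc handler can read the raw body...
request.body.seek(0)
print(request.parse_count)  # ...without post_vars ever being parsed
```

An application that only ever reads the raw body never pays for the parsing; one that touches post_vars pays once.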




Niphlod

Aug 2, 2013, 12:57:15 PM
to web...@googlegroups.com
Let me rephrase: I sent a patch for laziness that also includes the content-type fix ^_^
Let's see what @massimo thinks of it. I think this is a good occasion to refactor lots of bits and pieces, added from time to time, in a more general and consistent way.

Niphlod

Aug 2, 2013, 1:14:19 PM
to web...@googlegroups.com

Also, this might be a good opportunity for var laziness, depending on how it works. For json-rpc apps like mine, parsing incoming application/json payloads into vars is a complete waste of time.

 
PS: you're right. If we parse POST with application/json already automatically, the gluon/tools.py file needs a little fix too, it doesn't need to re-parse the body.

Vincent Audebert

Dec 10, 2013, 9:00:27 PM
to web...@googlegroups.com
Trying to catch up on this issue. Has it been fixed on trunk? Also, when is the next release expected?

At the moment, I just hacked main.py by replacing http_content_type with content_type. Does that sound OK?

Cheers.
Vincent.

Niphlod

Dec 11, 2013, 2:05:28 PM
to web...@googlegroups.com
are you still experiencing the issue in 2.8.2 ?

Vincent Audebert

Dec 11, 2013, 8:46:51 PM
to web...@googlegroups.com
Oh it has been fixed with 2.8.2? (that was the purpose of my first question. Sorry if I was not clear)

Niphlod

Dec 12, 2013, 5:34:22 AM
to web...@googlegroups.com
it should have been. If you're still experiencing issues please post your findings.

Jonathan Lundell

Dec 12, 2013, 9:59:20 AM
to web2py
Do you recall what the issue was? I'm curious, because I have a 2.5.1 server at the moment serving up application/json.

Niphlod

Dec 12, 2013, 10:37:24 AM
to web...@googlegroups.com



If I recall correctly, it was due to the fact that the content type appeared in env both as content_type and as a header (http_content_type), and the code checked "the wrong one".

Vincent Audebert

Dec 12, 2013, 5:13:47 PM
to web...@googlegroups.com
@Niphlod yes, it's exactly this.

@Jonathan I am on 2.5.1 too, and it works fine on my Mac OS X machine, but once I move to web2py running under Apache, the header is passed as content_type and web2py checks only http_content_type.

I will post my findings if it's not fixed in 2.8.2. I plan to migrate next month...

Cheers.



Jonathan Lundell

Dec 12, 2013, 9:41:32 PM
to web2py
Thanks for the clarification. FWIW, I'm running 2.5.1 under Apache and Content-Type is working. Go figure...