WebOb: problem with req.body vs. req.body_file

Jon Nelson

Aug 27, 2009, 9:57:56 AM
to Paste Users
I'm using WebOb 0.9.6.1 and recently ran into a problem.
I have a project that takes XML in the body of the request and passes
it to lxml.etree.parse.

When I parse from the body string, everything works:
lxml.etree.fromstring(req.body)

When I parse from the body file, I get timeout errors:
lxml.etree.parse(req.body_file)

It seems to me that the likely source of the bug is in WebOb, perhaps
not returning EOF or some such when sufficient data has been read. Any
ideas?

--
Jon

Sergey Schetinin

Aug 27, 2009, 7:42:40 PM
to Jon Nelson, Paste Users
Do you have a sample request to replicate this? It works for me.

Could you be reusing the same body file object twice? For example with
this setup:

from webob import Request
from lxml.etree import parse

req = Request.blank('/', method='POST')
req.body = '''<test><foo/></test>'''

The following would work:

tree1 = parse(req.body_file)
tree2 = parse(req.body_file)

But not this:

f = req.body_file
tree1 = parse(f)
tree2 = parse(f)
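
The difference comes down to the read position of a file object: once a parser has consumed it, a second parse of the same object sees an empty stream. Below is a minimal sketch using only the standard library. Here io.BytesIO stands in for the request body file, which is an assumption for illustration and not a claim about how WebOb implements body_file.

```python
import io

body = b"<test><foo/></test>"

# A fresh file object per parse: each read starts at position 0.
first = io.BytesIO(body).read()
second = io.BytesIO(body).read()

# One shared file object: the first read leaves the position at EOF.
f = io.BytesIO(body)
shared_first = f.read()
shared_second = f.read()  # nothing left to read

# Rewinding makes the shared object readable again.
f.seek(0)
after_rewind = f.read()
```

If a single file object really has to be parsed twice, rewinding it with seek(0) between parses is one option, provided the underlying object is seekable.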





--
Best Regards,
Sergey Schetinin

http://s3bk.com/ -- S3 Backup
http://word-to-html.com/ -- Word to HTML Converter

Jon Nelson

Aug 27, 2009, 9:56:15 PM
to Paste Users
On Thu, Aug 27, 2009 at 6:42 PM, Sergey Schetinin<mal...@gmail.com> wrote:
> Do you have a sample request to replicate this? Cause it works for me.

I have a complete example below.

> Could you be reusing the same body file object twice? For example with
> this setup:

I am definitely not doing that.


Here is a complete example.
It requires a sample XML file and Paste Script.
You can create a sample xml file with this:

echo "<empty/>" > t.xml

Yes, one that small is sufficient.
I test with curl, like this:

curl -D - --data-binary @t.xml http://localhost:8080/

When using the cherrypy server (paste.script) it does not work.
When using paste.httpserver it does.


#! /usr/bin/python

import webob
import lxml.etree as etree
from paste.httpserver import serve
from paste.script import wsgiserver

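def handle(environ, start_response):
    req = webob.Request(environ)
    # Parsing straight from the body file is what times out under CherryPy.
    root = etree.parse(req.body_file)
    res = webob.Response()
    res.body = "looks good"
    return res(environ, start_response)

# Works: paste.httpserver
#serve(handle)
# Times out: CherryPy WSGI server from paste.script
server = wsgiserver.CherryPyWSGIServer(('0.0.0.0', 8080), handle)
server.start()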

--
Jon

Sergey Schetinin

Aug 28, 2009, 1:15:17 AM
to Jon Nelson, Paste Users
I tracked it down to cherrypy.wsgiserver.CP_fileobject.read, which tries
to read as much data from the socket as was requested.
lxml.etree.parse immediately asks for 4000 bytes, and
CP_fileobject.read doesn't just return the eight bytes that are
present in the request body; instead it blocks on the socket read,
trying to get 4000 bytes that aren't there. Hence the timeout.
Unless I'm missing something, that's a CherryPy problem. Or are WSGI
apps prohibited from trying to read more than CONTENT_LENGTH bytes from
wsgi.input?
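
Whichever side is at fault, an application can protect itself by never asking the server for more than CONTENT_LENGTH bytes. The sketch below is one way to do that. LimitedInput and limit_input are hypothetical names made up for this example, not part of WebOb, Paste, or CherryPy, and only read() is wrapped. A BytesIO with trailing bytes stands in for the socket, so it shows the clamping but not the actual blocking.

```python
import io


class LimitedInput(object):
    """Hypothetical wrapper: never asks the underlying stream for more
    than CONTENT_LENGTH bytes in total."""

    def __init__(self, stream, content_length):
        self.stream = stream
        self.remaining = content_length

    def read(self, size=-1):
        # Clamp the request to what is left of the body.
        if size is None or size < 0 or size > self.remaining:
            size = self.remaining
        if size == 0:
            return b""  # behave like EOF once the body is consumed
        data = self.stream.read(size)
        self.remaining -= len(data)
        return data


def limit_input(environ):
    """Replace wsgi.input with a length-limited wrapper (sketch)."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    environ["wsgi.input"] = LimitedInput(environ["wsgi.input"], length)
    return environ


# Simulate a socket holding an 8-byte body followed by unrelated bytes.
environ = limit_input({
    "CONTENT_LENGTH": "8",
    "wsgi.input": io.BytesIO(b"<empty/>NEXT REQUEST"),
})
chunk = environ["wsgi.input"].read(4000)  # what a parser might ask for
rest = environ["wsgi.input"].read(4000)   # body exhausted, acts like EOF
```

Calling limit_input(environ) before handing wsgi.input to a parser means a 4000-byte read request turns into a read of at most the declared body length, so the server is never asked to wait for bytes that will not arrive.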

BTW, CherryPy wsgiserver in svn trunk is completely broken at the
moment -- it doesn't populate environ properly and collapses on itself
for that reason (for example, the server itself tries to look up
['wsgi.version'], which is not there).

