RE: [cherrypy-users] $180 reward for help in tackling the hardest problem in the universe.


Robert Brewer

Jan 26, 2007, 12:59:29 AM
to cherryp...@googlegroups.com
Jesse James wrote:
> Subject: [cherrypy-users] $180 reward for help in
> tackling the hardest problem in the universe.

First, the repeated posts with no new information (other than a new bounty amount) are getting tedious. The people who know how to fix this aren't going to be motivated by the amount. I understand you want this fixed as soon as possible, but multiple carrot-and-stick attempts are making your case less desirable to work on, not more.

> What I'm seeing is that cherrypy is trying to read the file
> and timing out.

Without seeing the data that's sent but not received, it's difficult to ascertain *exactly* what the problem is. However, I've run some tests with the POST body you provided and was able to reproduce the timeouts. In all cases, the timeout occurs because 1) the POST body doesn't follow the multipart MIME spec, and 2) when reading multipart bodies, Python's CGI module doesn't use the global Content-Length value to know when to stop--it just keeps reading until it sees the correct "closing boundary"; if that's malformed, it gets stuck in a "while 1:" loop until timeout. There are two ways to work around this behavior.
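
As a rough illustration of point 2, here is a simplified sketch of that kind of boundary-scanning loop. It is not the cgi module's actual code: `scan_to_boundary` is a hypothetical helper, and it reads from an in-memory stream, where a missing terminator shows up as end-of-file. On a live socket that stays open, the same readline() would instead sit waiting until the timeout.

```python
import io


def scan_to_boundary(fp, boundary):
    """Read lines until a boundary line is seen; report how the loop ended."""
    next_marker = b"--" + boundary
    last_marker = next_marker + b"--"
    while True:
        line = fp.readline(1 << 16)
        if not line:
            # In-memory stream: end of data. On an open socket, the read
            # would block here (or on a partial line) until it timed out.
            return "eof"
        stripped = line.strip()
        if stripped == next_marker:
            return "next"
        if stripped == last_marker:
            return "closed"


# A well-formed body ends the loop at the closing boundary.
print(scan_to_boundary(io.BytesIO(b"hello\r\n--XYZ--\r\n"), b"XYZ"))
# A damaged closing boundary is never recognized, so the loop runs off the end.
print(scan_to_boundary(io.BytesIO(b"hello\r\n--XYZ-"), b"XYZ"))
```

Note that nothing in the loop consults Content-Length; the only exits are a recognized boundary line or the stream running dry.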

One, you could expect the timeout, trap it, and get on with life. This will mean a few seconds extra time to finish receiving an upload, but shouldn't have any adverse effects on the actual received body. This can be done by changing cherrypy/_cpcgifs.py in the following manner:

    def read_lines_to_outerboundary(self):
        """Internal: read lines until outerboundary."""
        next = "--" + self.outerboundary
        last = next + "--"
        delim = ""
        last_line_lfend = True

        import socket

        while 1:

            # Changed: trap the socket timeout and treat it as end of input.
            try:
                line = self.fp.readline(1<<16)
            except socket.timeout:
                self.done = -1
                break

            if not line:
                self.done = -1
                break
            if line[:2] == "--" and last_line_lfend:
                strippedline = line.strip()
                if strippedline == next:
                    break
                if strippedline == last:
                    self.done = 1
                    break
            odelim = delim
            if line[-2:] == "\r\n":
                delim = "\r\n"
                line = line[:-2]
                last_line_lfend = True
            elif line[-1] == "\n":
                delim = "\n"
                line = line[:-1]
                last_line_lfend = True
            else:
                delim = ""
                last_line_lfend = False
            self.__write(odelim + line)


Two, the Content-Length for the entire request (as provided by the Flex app) seems to be correct. So you could in theory read the entire body based on that Content-Length, stick it in a StringIO object or temporary file, and then hand that to the FieldStorage call (or just set request.rfile to it, which does the same thing). This could be done in a before_request_body filter. I would check the headers for ('User-Agent', 'Shockwave Flash'), a multipart Content-Type, and a valid (nonzero) Content-Length before executing such special-case code. You would also want to read it in chunks, to limit exposure to Denial-of-Service attacks with huge files. Then, at the end of the whole read(), have a look at the received data and see exactly how it's malformed. A multipart message should end with a "closing boundary"; in your example, the last line should be:

'------------KM7Ij5cH2KM7Ef1gL6ae0ae0cH2gL6--\r\n'

I got timeouts whether I removed either or both of the last "--" or the "\r\n", so my guess is one of those is not present. Once you figure out exactly how it's malformed, you can add what's missing to your StringIO or temp file, and then it should work normally when you hand it off to FieldStorage inside _cphttptools.processBody. But again, you should only do this rewriting of the body if the appropriate headers exist.
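
A minimal sketch of the header check and the chunked, Content-Length-bounded read might look like the following. Everything here is an assumption for illustration: `looks_like_flash_upload` and `read_body` are hypothetical helpers, the headers are modeled as a plain dict, the size cap is arbitrary, and the wiring into a before_request_body filter is left out. It is written against current Python, so it uses io.BytesIO where the text above says StringIO.

```python
import io


def looks_like_flash_upload(headers):
    """Return True only for the special case: Flash agent, multipart, nonzero length."""
    agent = headers.get("User-Agent", "")
    ctype = headers.get("Content-Type", "")
    try:
        length = int(headers.get("Content-Length", "0"))
    except ValueError:
        return False
    return (agent == "Shockwave Flash"
            and ctype.lower().startswith("multipart/")
            and length > 0)


def read_body(rfile, content_length, chunk_size=8192, max_bytes=10 * 1024 * 1024):
    """Read at most content_length bytes in chunks; return them in a BytesIO."""
    if content_length > max_bytes:
        # Arbitrary cap (an assumption) so a huge upload cannot exhaust memory.
        raise ValueError("request body too large")
    buf = io.BytesIO()
    remaining = content_length
    while remaining > 0:
        chunk = rfile.read(min(chunk_size, remaining))
        if not chunk:
            break  # the client sent less than it promised
        buf.write(chunk)
        remaining -= len(chunk)
    buf.seek(0)
    return buf
```

Because the read stops at Content-Length rather than at a boundary line, it never waits on a line ending that is not coming, and the buffered copy can be inspected or repaired before it is handed on.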

If you can figure out exactly how the data is malformed, we'd really appreciate a note so we can distribute an official fix.

Robert Brewer
System Architect
Amor Ministries
fuma...@amor.org


Sylvain Hellegouarch

Jan 26, 2007, 2:50:34 AM
to cherryp...@googlegroups.com
Robert Brewer wrote:
> Jesse James wrote:
>> Subject: [cherrypy-users] $180 reward for help in
>> tackling the hardest problem in the universe.
>
> First, the repeated posts with no new information (other than a new bounty amount) are getting tedious. The people who know how to fix this aren't going to be motivated by the amount. I understand you want this fixed as soon as possible, but multiple carrot-and-stick attempts are making your case less desirable to work on, not more.
>

I concur. I would also like to stress that offering a cash reward does
not entitle anyone to pressure either the list or the people on it.
I'd appreciate it if the tone could change.

- Sylvain

Jesse James

Jan 26, 2007, 9:57:51 PM
to cherrypy-users
OK. I sent the same file twice, once with Flex and once with the
simple HTML form.
This time I printed out the contents of wsgi.input in the
before_request_body filter each time, and here's what I observed (with
my own ---start body--- and ---end body--- markers included for clarity):

SIMPLE HTML:
---------- start body ---------------
-----------------------------21718375316139
Content-Disposition: form-data; name="file_handle";
filename="hello.txt"
Content-Type: text/plain

hello

world

each man is an island.

-----------------------------21718375316139--

---------- end body ---------------


FLEX:
---------- start body ---------------
------------ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0
Content-Disposition: form-data; name="Filename"

hello.txt
------------ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0
Content-Disposition: form-data; name="Filedata"; filename="hello.txt"
Content-Type: application/octet-stream

hello

world

each man is an island.

------------ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0
Content-Disposition: form-data; name="Upload"

Submit Query
------------ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0--
---------- end body ---------------

as you can see, flex is doing some strange things here.

I still have not figured out how to re-inject the stream back into CP
correctly so the controller method is not getting properly invoked. I
get this:
Traceback (most recent call last):
  File "c:\python24\lib\site-packages\CherryPy-2.2.1-py2.4.egg\cherrypy\_cphttptools.py", line 105, in _run
    self.main()
  File "c:\python24\lib\site-packages\CherryPy-2.2.1-py2.4.egg\cherrypy\_cphttptools.py", line 254, in main
    body = page_handler(*virtual_path, **self.params)
TypeError: upload() takes at least 2 arguments (1 given)


fumanchu

Jan 26, 2007, 10:45:04 PM
to cherrypy-users
On Jan 26, 6:57 pm, "Jesse James" <joel.re...@gmail.com> wrote:
> as you can see, flex is doing some strange things here.

Yes; it's omitting the last CRLF that most multipart-MIME producers
include.
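
For illustration only, a repair step for that specific defect could be sketched as below. This is not the patch from the ticket; `ensure_closing_crlf` is a hypothetical helper, the short boundary `XYZ` stands in for the real one, and it assumes the whole body has already been buffered as bytes.

```python
def ensure_closing_crlf(body, boundary):
    """Append CRLF if the body ends with the closing boundary but no line ending."""
    closing = b"--" + boundary + b"--"
    if body.endswith(closing):
        return body + b"\r\n"
    return body


boundary = b"XYZ"  # stand-in for the boundary taken from the Content-Type header
flex_like = (b"--XYZ\r\n"
             b'Content-Disposition: form-data; name="Upload"\r\n'
             b"\r\n"
             b"Submit Query\r\n"
             b"--XYZ--")
fixed = ensure_closing_crlf(flex_like, boundary)
```

A body that already ends with CRLF passes through unchanged, so the same step is harmless for well-formed uploads.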

> I still have not figured out how to re-inject the stream back into CP
> correctly so the controller method is not getting properly invoked.

> TypeError: upload() takes at least 2 arguments (1 given)

That's a problem with the number of args in your page handler function.
Your example needs to have a page handler signature like:

def upload(Upload, Filename, Filedata):
....
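
To show why the names matter, here is a plain-Python sketch (no CherryPy involved) of the call pattern seen in the traceback, `page_handler(*virtual_path, **self.params)`, where parsed form fields arrive as keyword arguments. The classes `Root` and `OldRoot` and the helper `call_handler` are hypothetical, and the handlers are written as methods, so they take `self` first.

```python
class Root:
    # Hypothetical handler whose parameter names match the Flex field names.
    def upload(self, Upload, Filename, Filedata):
        return "received %s (%d bytes)" % (Filename, len(Filedata))


class OldRoot:
    # Hypothetical handler written for a form with a single file_handle field.
    def upload(self, file_handle):
        return "received upload"


def call_handler(handler, params):
    """Call the handler with the parsed form fields as keyword arguments."""
    return handler(**params)


flex_params = {"Filename": "hello.txt", "Filedata": b"hello", "Upload": "Submit Query"}
print(call_handler(Root().upload, flex_params))

try:
    # If parsing yields no fields at all, the handler is called with only self.
    call_handler(OldRoot().upload, {})
except TypeError as exc:
    print("TypeError:", exc)
```

An empty set of parsed fields and a set of fields with unexpected names both end in a TypeError, so the exception alone does not say which of the two happened.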

I've opened a ticket and posted a working, tested patch for this issue
[1]. Although it hasn't been approved for inclusion in the distro yet
(and might never be because it's so rare), I suggest you try it out.


Robert Brewer
System Architect
Amor Ministries

fuma...@amor.org

[1] http://www.cherrypy.org/ticket/648

Jesse James

Jan 26, 2007, 11:16:26 PM
to cherrypy-users

> > I still have not figured out how to re-inject the stream back into CP
> > correctly so the controller method is not getting properly invoked.
> > TypeError: upload() takes at least 2 arguments (1 given)
>
> That's a problem with the number of args in your page handler function.
> Your example needs to have a page handler signature like:
>
> def upload(Upload, Filename, Filedata):
> ....

I don't understand the statement above. If the error says that upload
takes 2 arguments (self and file_handle), and it is only getting one (1
given) (and BTW, it works fine without the before_request_body filter),
why would I need any additional args?
Seems like I was somehow messing up the request processing with my
filter logic. BTW, if I commented out the line where I read the
request.rfile in that filter, the request works.

I will test the patch and post again with the results.
Thanks for your help in this matter.


Robert Brewer

Jan 26, 2007, 11:59:17 PM
to cherryp...@googlegroups.com
Jesse James wrote:
> > > I still have not figured out how to re-inject the stream
> > > back into CP correctly so the controller method is not
> > > getting properly invoked.
> > > TypeError: upload() takes at least 2 arguments (1 given)
> > That's a problem with the number of args in your page handler function.
> > Your example needs to have a page handler signature like:
> >
> > def upload(Upload, Filename, Filedata):
> > ....
>
> I don't understand the statement above. If the error says
> that upload takes 2 arguments (self and file_handle),
> and it is only getting one (1 given) (and BTW, it works
> fine without the before_request_body filter), why would I
> need any additional args?

Those are the args I see with my patch, and which you would see if the request included the trailing CRLF as expected. You're right that your filter may change the expected signature. Sorry for any confusion.


Robert Brewer

Jan 27, 2007, 12:54:28 PM
to cherryp...@googlegroups.com
Jesse James wrote:
> ok, I installed the patch, now I'm getting this:
> FieldStorage(None, None, [FieldStorage('Filename', None, 'hello.txt'),
> FieldStorage('file', 'hello.txt', 'hello\r\n\r\nworld\r\n\r
> \neach man is an island.\r\n\t\r\n!'), FieldStorage('Upload', None,
> 'Submit Query')])
> TypeError: upload() takes at least 2 non-keyword arguments (1 given)

Did you revert your other changes?

Try printing forms.keys(); what does it contain? On mine, it's always ['Filename', 'Filedata', 'Upload']; those are then used as keyword arguments to the page handler (the 'upload' method). Given the output you showed above, forms.keys() should be a 3-item list. If you get something different, then...there's something wrong with your install.


Sylvain Hellegouarch

Jan 27, 2007, 2:35:10 PM
to cherryp...@googlegroups.com
Jesse James wrote:

> On 27 Jan, 11:54, "Robert Brewer" <fuman...@amor.org> wrote:
>> Jesse James wrote:
>>> ok, I installed the patch, now I'm getting this:
>>> FieldStorage(None, None, [FieldStorage('Filename', None, 'hello.txt'),
>>> FieldStorage('file', 'hello.txt', 'hello\r\n\r\nworld\r\n\r
>>> \neach man is an island.\r\n\t\r\n!'), FieldStorage('Upload', None,
>>> 'Submit Query')])
>>> TypeError: upload() takes at least 2 non-keyword arguments (1 given)
>>
>> Did you revert your other changes?
>>
>> Try printing forms.keys(); what does it contain? On mine, it's always ['Filename', 'Filedata', 'Upload']; those are then used as keyword arguments to the page handler (the 'upload' method). Given the output you showed above, forms.keys() should be a 3-item list. If you get something different, then...there's something wrong with your install.
>
> Robert,
> I am using your patch exactly as you posted it. I suspect there is
> something wrong with the logic in cherrypy that deals with the
> boundary markers.
>
> Again, the traceback reveals that upload is expecting 2 args (one
> being self) but is only being called with 1 (thus the parsing of the
> request is not producing any args).
>

Can't you look up for yourself which arg is missing, then?

- Sylvain


Sylvain Hellegouarch

Jan 27, 2007, 5:02:45 PM
to cherryp...@googlegroups.com

> however, when I send it from flex, with Robert's filter in place to
> handle the missing CRLF, it does not produce any arguments to pass to
> the handler (forms is empty). Thus, my attempt at convincing someone
> who cares that cherrypy is broken, probably in the read_multi method
> of the FieldStorage class. I guess I'll just have to fix it myself.
> thanks.
> Joel

That's what they call OSS. People find a defect and work at providing a
fix. If you get a proper one then submit it to the ticket and it will be
reviewed and applied or rejected.

- Sylvain
