First, the repeated posts with no new information (other than a new bounty amount) are getting tedious. The people who know how to fix this aren't going to be motivated by the amount. I understand you want this fixed as soon as possible, but multiple carrot-and-stick attempts are making your case less desirable to work on, not more.
> What I'm seeing is that cherrypy is trying to read the file
> and timing out.
Without seeing the data that's sent but not received, it's difficult to ascertain *exactly* what the problem is. However, I've run some tests with the POST body you provided and was able to reproduce the timeouts. In all cases, the timeout occurs because 1) the POST body doesn't follow the multipart MIME spec, and 2) when reading multipart bodies, Python's CGI module doesn't use the global Content-Length value to know when to stop--it just keeps reading until it sees the correct "closing boundary"; if that's malformed, it gets stuck in a "while 1:" loop until timeout. There are two ways to work around this behavior.
One, you could expect the timeout, trap it, and get on with life. This will mean a few seconds extra time to finish receiving an upload, but shouldn't have any adverse effects on the actual received body. This can be done by changing cherrypy/_cpcgifs.py in the following manner:
    def read_lines_to_outerboundary(self):
        """Internal: read lines until outerboundary."""
        next = "--" + self.outerboundary
        last = next + "--"
        delim = ""
        last_line_lfend = True
        import socket
        while 1:
            try:
                line = self.fp.readline(1<<16)
            except socket.timeout:
                self.done = -1
                break
            if not line:
                self.done = -1
                break
            if line[:2] == "--" and last_line_lfend:
                strippedline = line.strip()
                if strippedline == next:
                    break
                if strippedline == last:
                    self.done = 1
                    break
            odelim = delim
            if line[-2:] == "\r\n":
                delim = "\r\n"
                line = line[:-2]
                last_line_lfend = True
            elif line[-1] == "\n":
                delim = "\n"
                line = line[:-1]
                last_line_lfend = True
            else:
                delim = ""
                last_line_lfend = False
            self.__write(odelim + line)
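To see what trapping the timeout buys you without involving a real socket, here is a rough, simplified simulation in modern Python 3. It is not the CherryPy or cgi code: `StalledSocketFile` and `read_part` are hypothetical names, and the fake file object merely assumes that a blocking socket raises `socket.timeout` when asked for a line that never gets its terminating newline.

```python
import io
import socket


class StalledSocketFile:
    """Hypothetical stand-in for a socket file object.

    Assumption: a real blocking socket would wait for the rest of an
    unterminated line and eventually time out, so we raise socket.timeout
    whenever no complete line is available.
    """

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def readline(self, limit):
        line = self._buf.readline(limit)
        if not line.endswith(b"\n") and len(line) < limit:
            raise socket.timeout("timed out")
        return line


def read_part(fp, boundary):
    """Collect lines until a boundary line is seen or the read times out.

    Returns (payload, done) where done is 1 for the closing boundary,
    0 for an ordinary boundary, and -1 if the timeout was trapped.
    Unlike the patched method above, this sketch does not strip the
    line ending that precedes the boundary.
    """
    next_marker = b"--" + boundary
    last_marker = next_marker + b"--"
    chunks = []
    done = 0
    while True:
        try:
            line = fp.readline(1 << 16)
        except socket.timeout:
            done = -1  # give up waiting, keep what was received
            break
        stripped = line.strip()
        if stripped == next_marker:
            break
        if stripped == last_marker:
            done = 1
            break
        chunks.append(line)
    return b"".join(chunks), done
```

Feeding it a body whose closing boundary lacks the final CRLF should return the payload with `done == -1` instead of propagating the timeout, which is the behaviour the change above is after.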
Two, the Content-Length for the entire request (as provided by the Flex app) seems to be correct. So you could, in theory, read the entire body based on that Content-Length, stick it in a StringIO object or temporary file, and then hand that to the FieldStorage call (or just set request.rfile to it, which does the same thing). This could be done in a before_request_body filter. I would check the headers for ('User-Agent', 'Shockwave Flash'), a multipart Content-Type, and a valid (nonzero) Content-Length before executing such special-case code. You would want to read it in chunks, to limit exposure to Denial-of-Service attacks with huge files. Then, at the end of the whole read(), have a look at the received data and see exactly how it's malformed. A multipart message should end with a "closing boundary"; in your example, the last line should be:
'------------KM7Ij5cH2KM7Ef1gL6ae0ae0cH2gL6--\r\n'
I got timeouts whether I removed either or both of the last "--" or the "\r\n", so my guess is one of those is not present. Once you figure out exactly how it's malformed, you can add what's missing to your StringIO or temp file, and then it should work normally when you hand it off to FieldStorage inside _cphttptools.processBody. But again, you should only do this rewriting of the body if the appropriate headers exist.
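A minimal sketch of that second approach, written for modern Python 3 (so `io.BytesIO` stands in for StringIO). The helper names, the chunk size, and the size cap are all hypothetical, and the wiring into a before_request_body filter and request.rfile is deliberately left out since it depends on your CherryPy version. The repair step only covers the two malformations guessed at above (a missing trailing "--" and a missing final CRLF).

```python
import io

CHUNK_SIZE = 8192              # hypothetical read size
MAX_BODY = 10 * 1024 * 1024    # hypothetical cap on accepted uploads


def needs_special_case(headers):
    """True only for Flash-style multipart uploads with a usable length."""
    try:
        length = int(headers.get("Content-Length", "0"))
    except ValueError:
        return False
    return (headers.get("User-Agent", "") == "Shockwave Flash"
            and headers.get("Content-Type", "").lower().startswith("multipart/")
            and length > 0)


def read_body(rfile, content_length, chunk_size=CHUNK_SIZE, max_body=MAX_BODY):
    """Read at most content_length bytes from rfile in bounded chunks."""
    if content_length <= 0 or content_length > max_body:
        raise ValueError("refusing body of %d bytes" % content_length)
    buf = io.BytesIO()
    remaining = content_length
    while remaining > 0:
        chunk = rfile.read(min(chunk_size, remaining))
        if not chunk:  # connection closed before the full body arrived
            break
        buf.write(chunk)
        remaining -= len(chunk)
    return buf.getvalue()


def repair_closing_boundary(body, boundary):
    """Make the body end with the closing boundary line plus CRLF.

    boundary is the value from the Content-Type header, without the
    leading "--" that prefixes it inside the body.
    """
    opening = b"--" + boundary
    closing = opening + b"--"
    if body.endswith(closing + b"\r\n"):
        return body                    # already well formed
    if body.endswith(closing):
        return body + b"\r\n"          # only the final CRLF is missing
    if body.endswith(opening + b"\r\n"):
        return body[:-2] + b"--\r\n"   # trailing "--" is missing
    if body.endswith(opening):
        return body + b"--\r\n"        # both "--" and CRLF are missing
    return body                        # unknown damage: leave untouched
```

The repaired bytes can then be wrapped in `io.BytesIO(...)` and handed on in place of the original stream. Leaving unrecognised bodies untouched is a deliberate choice here, so that ordinary browsers' uploads are never rewritten.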
If you can figure out exactly how the data is malformed, we'd really appreciate a note so we can distribute an official fix.
Robert Brewer
System Architect
Amor Ministries
fuma...@amor.org
I concur. I would actually like to stress that offering a cash
reward does not entitle anyone to pressure either the list or the people
around here. I'd appreciate it if the tone could change.
- Sylvain
SIMPLE HTML:
---------- start body ---------------
-----------------------------21718375316139
Content-Disposition: form-data; name="file_handle"; filename="hello.txt"
Content-Type: text/plain
hello
world
each man is an island.
-----------------------------21718375316139--
---------- end body ---------------
FLEX:
---------- start body ---------------
------------ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0
Content-Disposition: form-data; name="Filename"
hello.txt
------------ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0
Content-Disposition: form-data; name="Filedata"; filename="hello.txt"
Content-Type: application/octet-stream
hello
world
each man is an island.
------------ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0
Content-Disposition: form-data; name="Upload"
Submit Query
------------ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0--
---------- end body ---------------
As you can see, Flex is doing some strange things here.
I still have not figured out how to re-inject the stream back into CP
correctly, so the controller method is not getting properly invoked. I
get this:
Traceback (most recent call last):
  File "c:\python24\lib\site-packages\CherryPy-2.2.1-py2.4.egg\cherrypy\_cphttptools.py", line 105, in _run
    self.main()
  File "c:\python24\lib\site-packages\CherryPy-2.2.1-py2.4.egg\cherrypy\_cphttptools.py", line 254, in main
    body = page_handler(*virtual_path, **self.params)
TypeError: upload() takes at least 2 arguments (1 given)
On 25 Jan, 23:59, "Robert Brewer" <fuman...@amor.org> wrote:
> Jesse James wrote:
> > Subject: [cherrypy-users] $180 reward for help in
> > tackling the hardest problem in the universe.
Yes; it's omitting the last CRLF that most multipart-MIME producers
include.
> I still have not figured out how to re-inject the stream back into CP
> correctly so the controller method is not getting properly invoked.
> TypeError: upload() takes at least 2 arguments (1 given)
That's a problem with the number of args in your page handler function.
Your example needs to have a page handler signature like:
    def upload(Upload, Filename, Filedata):
        ....
I've opened a ticket and posted a working, tested patch for this issue
[1]. Although it hasn't been approved for inclusion in the distro yet
(and might never be because it's so rare), I suggest you try it out.
Robert Brewer
System Architect
Amor Ministries
I don't understand the statement above. If the error says that upload
takes 2 arguments (self and file_handle), and it is only getting one (1
given) (and BTW, it works fine without the before_request_body filter),
why would I need any additional args?
Seems like I was somehow messing up the request processing with my
filter logic. BTW, if I commented out the line where I read the
request.rfile in that filter, the request works.
I will test the patch and post again with the results.
Thanks for your help in this matter.
Those are the args I see with my patch, and which you would see if the request included the trailing CRLF as expected. You're right that your filter may change the expected signature. Sorry for any confusion.
Did you revert your other changes?
Try printing forms.keys(); what does it contain? On mine, it's always ['Filename', 'Filedata', 'Upload']; those are then used as keyword arguments to the page handler (the 'upload' method). Given the output you showed above, forms.keys() should be a 3-item list. If you get something different, then...there's something wrong with your install.
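To make the link between the form field names and the handler signature concrete, here is a small sketch in modern Python 3. It uses the standard library `email` parser purely as a stand-in for FieldStorage, and it makes several assumptions: the helper names are hypothetical, the boundary value is reconstructed from the FLEX dump above (body delimiter minus its leading "--"), and the blank lines between part headers and part content are assumed to have been present in the original request even though the pasted dump does not show them.

```python
from email.parser import BytesParser

# Assumed boundary, taken from the FLEX dump above without the leading "--".
BOUNDARY = "-" * 10 + "ei4gL6ei4GI3GI3ei4ae0gL6ei4ae0"


def build_flex_like_body(boundary):
    """Rebuild a body shaped like the FLEX dump, without a final CRLF."""
    delim = "--" + boundary
    lines = [
        delim,
        'Content-Disposition: form-data; name="Filename"',
        "",
        "hello.txt",
        delim,
        'Content-Disposition: form-data; name="Filedata"; filename="hello.txt"',
        "Content-Type: application/octet-stream",
        "",
        "hello",
        "world",
        "each man is an island.",
        delim,
        'Content-Disposition: form-data; name="Upload"',
        "",
        "Submit Query",
        delim + "--",
    ]
    return "\r\n".join(lines).encode("ascii")


def multipart_field_names(body, content_type):
    """Return the form field names in order of appearance."""
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = BytesParser().parsebytes(raw)
    return [part.get_param("name", header="content-disposition")
            for part in message.get_payload()]


def upload(Upload, Filename, Filedata):
    """Handler whose parameters match the three Flash field names."""
    return Filename


def old_upload(file_handle):
    """Handler written for the plain HTML form's single field."""
    return file_handle
```

Calling `upload(**params)` with a dict keyed by the parsed names succeeds, while `old_upload(**params)` raises a TypeError, because the field names are passed as keyword arguments and none of them is `file_handle`.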
Can't you look up for yourself which arg is missing, then?
- Sylvain
That's what they call OSS. People find a defect and work at providing a
fix. If you get a proper one then submit it to the ticket and it will be
reviewed and applied or rejected.
- Sylvain