Tested it with the old web2py and Tornado 1.2.1 via anyserver.py
and the download is OK.
Tested with web2py 1.91.6. Were there any changes regarding this?
(I'm still very reluctant to upgrade this project.)
--
Web (en): http://www.no-spoon.de/ -*- Web (de): http://www.frell.de/
You asked on Reddit if the only constant is the browser. No, it
isn't. But it was the browser that had the problem first, with
smaller files.
For the smaller files it was enough to raise the chunk_size. IE8 is
slow; maybe that is the reason?
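To be concrete about what I changed: the download action is a plain
streamed download, roughly like the sketch below (the path and the
chunk_size value are only examples, not the real project code):

def download_big():
    # Streamed download via web2py's response.stream(); raising chunk_size
    # was enough for the small (160 KiB) files, but not for the 33 MiB one.
    path = '/srv/files/big_report.bin'   # example path only
    response.headers['Content-Type'] = 'application/octet-stream'
    return response.stream(open(path, 'rb'), chunk_size=10 ** 6)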
Firefox 4 failed when I tried to download a 33 MiB file remotely.
IE8 failed for anything above 64 KiB on localhost.
It's almost as if Rocket is so fast because it sends without
regard for the receiver, whether direct (localhost) or behind a
proxy (Apache 2.2 on the remote Linux server).
Don't know how this could happen. HTTP isn't ZModem. ;-)
All combinations break the download for big files (33 MiB),
regardless of chunk_size or server.
Only Internet Explorer 8 (all servers) had problems with small
files (160 KiB), before increasing chunk_size for the streamed
download.
Made the changes to rocket.py (1.2.2), restarted web2py, and the
download was still broken.
By the way: one of the new examples (Dog and owner registration,
with picture upload/download) doesn't use streamed download. It
reads the whole file and sends it. This method doesn't work for
big files either.
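What I mean by "reads the whole file" is essentially this pattern, as
opposed to the streamed action sketched above (again only an
illustration, not the example app's actual code):

def download_picture():
    # Reads the complete file into memory and returns it in one piece;
    # fine for small pictures, but not an option for big files.
    path = '/srv/files/big_report.bin'   # example path only
    response.headers['Content-Type'] = 'application/octet-stream'
    return open(path, 'rb').read()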
Massimo Di Pierro <massimo....@gmail.com> wrote:
It's only Rocket. And it was reported by a co-worker and a
customer.
One silly idea: What if Rocket has problems with some
proxies/firewalls? The customer has a broken proxy which caused
different problems for other projects.
I don't know about any proxy at work, but it could be that they
reroute everything on port 80 through a proxy without me knowing.
(Ignore the non-technical implications of this assumption.)
Massimo Di Pierro <massimo....@gmail.com> wrote:
Forgot about the problems on localhost.
Thanks for your offer to help with this. The best way to help right
now would be to provide me a smallish pcap file that records it
happening so I can see which parts of the files are missing.
Thanks,
Timothy Farrell
OK, the culprit is definitely that exceptions raised in sendall() are being ignored. In my humble opinion this is serious enough to be on the 2.0 blocker list.
How to reproduce: you need a WSGI worker that produces output in parts (that is, returns a list or yields parts as a generator), e.g. use web2py's "static" file server (which uses WSGI and does not use the FileSystemWorker).
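If you want to reproduce it without web2py, a minimal WSGI app of that
kind looks roughly like this (file name and chunk size are arbitrary):

def application(environ, start_response):
    # Serve a large file in many small pieces, so the server's write path
    # is exercised repeatedly instead of receiving one big string.
    start_response('200 OK', [('Content-Type', 'application/octet-stream')])

    def body():
        with open('/tmp/bigfile.bin', 'rb') as f:
            while True:
                chunk = f.read(64 * 1024)
                if not chunk:
                    break
                yield chunk

    return body()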
A better idea of where the problem lies can be seen from the following ugly patch (applied against web2py's "one file" rocket.py):
@@ -1929,6 +1929,9 @@ class WSGIWorker(Worker):
                     self.conn.sendall(b('%x\r\n%s\r\n' % (len(data), data)))
                 else:
                     self.conn.sendall(data)
+            except socket.timeout:
+                self.closeConnection = True
+                print 'Exception lost'
             except socket.error:
                 # But some clients will close the connection before that
                 # resulting in a socket error.
Running the same experiment with the patched rocket.py will show that files get corrupted whenever 'Exception lost' is printed to web2py's terminal.
Discussion: the only way to use sendall() reliably is to terminate the connection immediately upon any error (including timeout), because there is no way to know how many bytes were actually sent. (That the number of sent bytes is unknowable is clearly stated in the documentation; the implication that it is impossible to reliably recover from this is not.) However, there are sendall() calls all over rocket.py, and some will result in additional sendall() calls following a failed one. The worst offender seems to be WSGIWorker.write(), but I'm not sure the other sendall() calls are safe either.
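What I mean by terminating immediately is roughly this pattern (a
sketch, not Rocket's actual code):

import socket

def safe_sendall(conn, data, state):
    # After *any* failed sendall() the number of bytes already on the wire
    # is unknown, so the only safe reaction is to mark the connection dead
    # and never write to it again; the caller must then close it.
    if state.get('dead'):
        return False
    try:
        conn.sendall(data)
        return True
    except (socket.timeout, socket.error):
        state['dead'] = True
        return False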
Temporary workaround: increase SOCKET_TIMEOUT significantly (the default is 1 second; bump it to e.g. 10), and do not swallow socket.timeout in WSGIWorker.write().
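Concretely, the first half of the workaround is just the module-level
constant in web2py's rocket.py (10 is an arbitrary example value):

SOCKET_TIMEOUT = 10  # default is 1 second; a larger value makes mid-transfer timeouts less likely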
Increasing the chunk size is NOT helpful, because it only changes the number of bytes sent before the first loss (at a given bandwidth); from that point on, the problem is the same.
It's 2.7, as all my servers are (distro default).
In trunk the socket timeout is 60, and this resulted in another problem:
Ctrl-C waits for 60 seconds before joining the worker processes.
Perhaps we should increase the socket timeout, catch Ctrl-C, and then
kill the process instead of joining the workers.
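Something along these lines, just a sketch of the idea (the names are
not the actual ones from trunk):

import os

def run_rocket(server):
    # Hypothetical wrapper around the Rocket server that web2py starts.
    try:
        server.start()  # blocks until interrupted
    except KeyboardInterrupt:
        # server.stop() would join() the workers, which can sit in a socket
        # timeout for up to 60 seconds; exit hard instead.
        os._exit(0)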