#1095: Patches to improve large file upload performance
---------------------------+------------------------------------------------
Reporter: l...@cern.ch | Owner: fumanchu
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: CherryPy code | Keywords:
---------------------------+------------------------------------------------
We have a server which frequently receives file / JSON upload POSTs, with
bodies up to half a gigabyte. We've found that the server's performance in
processing the body is limited, especially for a POST multipart MIME body
whose individual parts have no `content-length` header.
Based on our profiling, there are at least two problems in CherryPy 3.1.2
that we'd like to see fixed. I'm attaching our patches to this ticket: one
to add a test that measures the performance, and two to fix the issues below.
The first issue is that the default network read size is small: 8 kB for
the initial request line and headers, and 64 kB to read a body with
`content-length`. For a file of several hundred megabytes, that is too small.
We added a configuration option, `request.body_io_size`, to allow the size
to be tuned; for large uploads, 128 - 512 kB seems to be a more reasonable
value.
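To illustrate why the read size matters, here is a minimal, self-contained Python 3 sketch. It is not CherryPy code and does not use the patch: `CountingStream`, `drain` and `reads_needed` are hypothetical names, and an in-memory stream stands in for the socket. It only counts how many `read()` calls are needed to drain a body at a given chunk size.

```python
import io


class CountingStream(io.BytesIO):
    """In-memory stand-in for a socket file object; counts read() calls."""

    def __init__(self, data):
        super().__init__(data)
        self.calls = 0

    def read(self, size=-1):
        self.calls += 1
        return super().read(size)


def drain(stream, chunk_size):
    """Read the stream to EOF in chunk_size pieces; return total bytes read."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total


def reads_needed(body_size, chunk_size):
    """Number of read() calls (including the final empty one) for one body."""
    stream = CountingStream(b"x" * body_size)
    assert drain(stream, chunk_size) == body_size
    return stream.calls


if __name__ == "__main__":
    body = 16 * 1024 * 1024  # 16 MB stand-in for a much larger upload
    for kb in (8, 64, 256, 512):
        print(kb, "kB ->", reads_needed(body, kb * 1024), "read() calls")
```

The number of calls falls in proportion to the chunk size. On a real socket each call also carries system call and buffer handling overhead, which this sketch does not measure.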
The second issue is that the `CP_fileobject` read() and readline()
implementations cause a very large amount of `cStringIO` object churn,
creating a new object for every line of input read. For some of the files
we read, this typically breaks the input into pieces of just a couple of
hundred bytes to a kilobyte at a time. We found that returning slices from
an internal buffer, sized by the `body_io_size` parameter, works much
better. Performance for a body without a content-length header is still
poor (~15 MB/s) compared to one with it (up to 300 MB/s), but it's ~50%
better than without the patch.
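The idea of returning slices from one internal buffer can be sketched as follows. This is a minimal Python 3 illustration under our own assumptions, not the attached patch and not the `CP_fileobject` code: the class name `SliceLineReader` and its `io_size` argument (which plays the role of `body_io_size`) are hypothetical.

```python
import io


class SliceLineReader:
    """Hypothetical reader: refills one internal buffer, returns slices of it."""

    def __init__(self, raw, io_size=256 * 1024):
        self.raw = raw          # any object whose read(n) returns bytes
        self.io_size = io_size  # stands in for the body_io_size setting
        self.buf = b""          # bytes read from raw but not yet fully consumed
        self.pos = 0            # offset of the first unconsumed byte in buf

    def _fill(self):
        """Drop consumed bytes and append one more chunk; False at EOF."""
        chunk = self.raw.read(self.io_size)
        if not chunk:
            return False
        self.buf = self.buf[self.pos:] + chunk
        self.pos = 0
        return True

    def readline(self):
        """Return the next line including its newline, or b'' at EOF."""
        while True:
            nl = self.buf.find(b"\n", self.pos)
            if nl >= 0:
                # A full line is already buffered: hand back a slice.
                line = self.buf[self.pos:nl + 1]
                self.pos = nl + 1
                return line
            if not self._fill():
                # EOF: return whatever is left, possibly nothing.
                line = self.buf[self.pos:]
                self.buf = b""
                self.pos = 0
                return line


if __name__ == "__main__":
    reader = SliceLineReader(io.BytesIO(b"first\nsecond\nlast"), io_size=4)
    while True:
        line = reader.readline()
        if not line:
            break
        print(line)
```

With a large `io_size`, many short lines are served from a single network read, and no temporary file-like object is created per line.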
We added a test which demonstrates the problems. You can run the basic
test simply with '`python test/test_upload.py`', or with the
`--server/--client` options to run it between machines. To run a basic
single-host benchmark, please run it like this, with and without the other
patches, to measure the impact:
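{{{
#!sh
# Run each test ten times in a row, first "smart", then "dumb".
for x in smart dumb; do
  T=$(for i in $(seq 1 10); do echo UploadTest.test_big_$x; done)
  # $T is unquoted on purpose so that sh/bash split it into ten arguments;
  # under zsh, write $=T instead.
  (set -x; python test/test_upload.py $T)
done
}}}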
Note that because `FieldStorage` spools bodies larger than ~1 kB to disk,
the above will write 500 MB temporary files, so how fast it goes may
depend on how much RAM you have and on the device that holds temporary
files.
On a Linux server we see 10 'smart' uploads go from about 40 seconds and
150 MB/s to 20 seconds and 300 MB/s, both over localhost. The 10 'dumb'
ones go from about 470 seconds and 10 MB/s to 350 seconds and 14 MB/s. On
a Mac laptop the smart uploads run at about half that speed, and the dumb
ones are the same.
--
Ticket URL: <http://www.cherrypy.org/ticket/1095>
CherryPy <http://www.cherrypy.org>
CherryPy - a pythonic, object-oriented HTTP framework