Sending large files to browser


Sutabi

Jul 1, 2011, 1:29:12 PM
to cherrypy-users
I am working on a personal project programming a video library.
However, I ran into a big problem: when I try to serve my videos
(MP4s), my RAM maxes out and my I/O skyrockets. It's only running on a
netbook, so it doesn't even finish serving. The files range from 500 MB
to 6 GB.

So I was wondering if there is a way to serve large files without
pulling the entire file into memory. I read that CherryPy can do
chunked transfers when running with HTTP 1.1, but I'm not sure how to
make static files actually serve as chunked data. I realise I could
just set up Apache to serve static files for me, but I'm hoping to
avoid this.

Tim Roberts

Jul 1, 2011, 2:56:16 PM
to cherryp...@googlegroups.com

I think you would be happier if you let Apache handle this. It's very
good at that kind of thing. You really need to do some low-level I/O
performance tweaks in order to stream multigigabyte files, and you can't
do that in Python.

--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

Daniel Dotsenko

Jul 6, 2011, 5:18:36 PM
to cherrypy-users
I think I remember hearing about a separate API for file uploads in
CherryPy.

If that turns out to be unicorns (i.e. it doesn't actually exist), see
if you can drop down to the level of the CherryPy WSGI server (not the
app stack). Since the last release it supports PEP 3333's bottomless
I/O on the incoming stream through proper support of HTTP 1.1 chunked
encoding (which is what your web client is likely using when you send
the file, unless it's a form, where things are messier).

There, the incoming stream is throttled and you can read from it in
chunks (and write in chunks) until it just ends. Very, very nice!

I have Git servers built on top of CherryPy's WSGI server
(http://github.com/dvdotsenko/git_http_backend.py and
http://github.com/dvdotsenko/ges) and have tested very large transfers
just fine.

I did not try CherryPy's full app stack, but from the sound of your
message it seems that the stack hides the reading of the incoming
stream by trying to read all of it into memory. That's nice in some
cases, but obviously not in this one.
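
To make that concrete, here is a minimal, library-agnostic sketch of the
chunked read loop; an in-memory BytesIO stands in for the server's
incoming stream (environ['wsgi.input']), and the function name is mine,
not a CherryPy API:

```python
import io
import tempfile

def copy_in_chunks(source, sink, chunk_size=64 * 1024):
    """Read from a file-like source in fixed-size chunks and write each
    chunk to sink, so the full body never sits in memory at once."""
    total = 0
    while True:
        data = source.read(chunk_size)
        if not data:  # an empty read signals end of stream
            break
        sink.write(data)
        total += len(data)
    return total

# A BytesIO stands in for the incoming request stream.
incoming = io.BytesIO(b'\x00' * 200000)
with tempfile.TemporaryFile() as out:
    written = copy_in_chunks(incoming, out)
    print(written)  # prints 200000
```

The same loop works against any file-like object, which is the point:
peak memory is bounded by chunk_size, not by the size of the transfer.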

Daniel.

Tony Caduto

Jul 6, 2011, 11:16:00 PM
to cherryp...@googlegroups.com
On 7/6/2011 4:18 PM, Daniel Dotsenko wrote:
> I think I remember hearing about a separate API for file uploads in
> CherryPy.
>
> If that turns out to be unicorns (i.e. it doesn't actually exist), see
> if you can drop down to the level of the CherryPy WSGI server (not the
> app stack). Since the last release it supports PEP 3333's bottomless
> I/O on the incoming stream through proper support of HTTP 1.1 chunked
> encoding (which is what your web client is likely using when you send
> the file, unless it's a form, where things are messier).
>
>
Are there any docs on how to use this API?

Lakin Wecker

Jul 6, 2011, 11:33:51 PM
to cherryp...@googlegroups.com
CherryPy (at least 3.x) reads the uploaded file into a temporary file
(not into memory); you can then read it a chunk at a time using
something like:

def upload(self, myFile):
    out = """<html>
    <body>
        myFile length: %s<br />
        myFile filename: %s<br />
        myFile mime-type: %s
    </body>
    </html>"""

    # Although this just counts the file length, it demonstrates
    # how to read large files in chunks instead of all at once.
    # CherryPy reads the uploaded file into a temporary file;
    # myFile.file.read reads from that.
    size = 0
    while True:
        data = myFile.file.read(8192)
        if not data:
            break
        size += len(data)

    return out % (size, myFile.filename, myFile.content_type)
upload.exposed = True

(taken from - http://docs.cherrypy.org/dev/progguide/files/uploading.html)

If that's not good enough - you can specify custom request body
processors to read the file in chunks and directly write it to where
you want it to go (although you should probably validate it before
doing this) - http://www.cherrypy.org/wiki/RequestBodies

Lakin


Lakin Wecker

Jul 6, 2011, 11:34:28 PM
to cherryp...@googlegroups.com
Whoops - I'm talking about uploading - you guys are talking about
downloading. :)

Lakin

Lakin Wecker

Jul 6, 2011, 11:40:29 PM
to cherryp...@googlegroups.com
http://www.cherrypy.org/wiki/ReturnVsYield#HowstreamingoutputworkswithCherryPy
covers how to stream larger responses to the browser. I'm not sure if
there is a predefined static media recipe that uses this pattern, but I
suspect someone could merge the current static media handlers with this
pattern and it would work.

Lakin


Michiel Overtoom

Jul 7, 2011, 8:36:04 AM
to cherryp...@googlegroups.com
On 2011-07-01 19:29, Sutabi wrote:

> I am working on a personal project programming a video library.
> However, I ran into a big problem: when I try to serve my videos
> (MP4s), my RAM maxes out and my I/O skyrockets. It's only running on a
> netbook, so it doesn't even finish serving. The files range from
> 500 MB to 6 GB.

May I suggest a change in your application/environment architecture? I
too made a site that streams large video files to visitors. I hosted
all the big content on Amazon S3 and had CloudFront serve it. The
webapp only serves small pages containing links to these
CloudFront-hosted videos.

Serving static content, especially large files, is a burden better not
placed on CherryPy (or Django, or Rails, for that matter). Instead, use
something like nginx for that (this will also keep many slow clients
from tying up your webapp).

Greetings,

--
"Good programming is not learned from generalities, but by seeing how
significant programs can be made clean, easy to read, easy to maintain
and modify, human-engineered, efficient, and reliable, by the application
of common sense and good programming practices. Careful study and
imitation of good programs leads to better writing."
- Kernighan and Plauger, motto of 'Software Tools'
