sending binary stream directly to browser


fred.dixon

Apr 24, 2005, 10:23:33 AM
to cherryp...@googlegroups.com
I was playing around with this last year using Python CGI, but I can't
find the code I was using.

I don't see a clear way to do this in CherryPy.
I'm pretty sure I need to use the response object.
Does anyone have a simple example,
or can anyone point me to one?

Ravi

Apr 24, 2005, 11:28:41 AM
to cherryp...@googlegroups.com

Fred:
Ran into this problem last month. See post on XMLHttpRequest() on
this list from last week. If your binary content is static and larger
than 1MB or so, use Apache2 to serve it. CherryPy does not scale well
when trying to transmit binary data. For example: in the
_cphttpserver.py file, there is code that measures "Content-Length"
line by line. For large, static binary files, that approach is not
recommended. I added two lines to the static content check portion
(doRequest or handleRequest method...don't have code in front of me
right now...)
.....
contentLength=os.path.getsize(fname)
<Header Map dictionary variable>["Content-Length"] = contentLength

This improved things quite a bit in terms of beginning the download.
However, the download still took about 15 min to complete. By
comparison, Apache2 serves the same file in 4 seconds or so !!!

The speed issue probably has something to do with how CherryPy handles
socket communication with the browser (my client is Mozilla-based
JavaScript).

Another problem serving large binary content with CherryPy is that the
slow download would terminate when about 90% of the content was
transferred (tested with varying files). This happened about 50-75% of
the time with a variety of large binary files. In this case, I am not
sure CherryPy is responsible; could be the browser code. Anyway having
deadlines to meet, I have deferred investigation of this matter to a
later date.

BOTTOM LINE: Run Apache2 on a separate port on the server. Point all
binary download requests to this port and get your content served
quickly and scalably, while freeing CherryPy to do what it does
best...server side computations using Python.

Thanks to Remco and nopas for answering my XMLHttpRequest question of
last week.

Regards
Ravi

fred.dixon

Apr 24, 2005, 11:44:14 AM
to cherryp...@googlegroups.com
crapola!
I was trying to avoid that.

Is there a way, similar to the upload recipe, to have CherryPy NOT process
the output, or to sidestep CP for this one process?

Remi Delon

Apr 24, 2005, 11:48:55 AM
to cherryp...@googlegroups.com
> CherryPy does not scale well
> when trying to transmit binary data. For example: in the
> _cphttpserver.py file, there is code that measures "Content-Length"
> line by line.

This was true in cherrypy-2.0-beta but has been fixed in Subversion
since then. It now uses os.stat to get the size of the file and uses a
generator to read/serve the file on the fly, without loading it
completely in memory.

If you're generating the data dynamically, you can tell CP what the size
is beforehand (if you know it) so that CP doesn't call len(...) (which
is bad if you're using generators).
If you don't know the size and can't compute it beforehand, you can just
remove the "Content-Length" key from the dictionary and CP won't send
that header at all.

Remi.
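
The rule described above (set the length when you know it, drop the header when you don't) can be sketched without any framework. This is only an illustration: the plain dict stands in for CherryPy's header map, and prepare_headers and generate_rows are hypothetical names, not CherryPy APIs.

```python
def prepare_headers(header_map, size=None):
    """Set Content-Length when the size is known, otherwise drop it.

    header_map is a plain dict standing in for the framework's header map
    (an assumption for illustration); size is the body length in bytes.
    """
    if size is not None:
        header_map["Content-Length"] = str(size)
    else:
        # Unknown size: remove the key so no Content-Length header is sent.
        header_map.pop("Content-Length", None)
    return header_map


def generate_rows(n):
    """Hypothetical dynamic body produced lazily, one piece at a time."""
    for i in range(n):
        yield ("row %d\n" % i).encode("ascii")


# Size known in advance: each of the 10 rows "row 0\n".."row 9\n" is 6 bytes.
known = prepare_headers({"Content-Type": "text/plain"}, size=60)

# Size unknown: the header is removed rather than computed with len().
unknown = prepare_headers({"Content-Type": "text/plain", "Content-Length": "0"})
```

Passing the size explicitly means the generator is never consumed just to measure it, which is the point of the advice.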

Ravi

Apr 24, 2005, 11:56:09 AM
to cherryp...@googlegroups.com
Fred:
You can always start with Python's BaseHTTPServer class and write
your own fileserver, but getting it to work efficiently will require a
lot of experimentation ;-). You might try looking at other frameworks
such as Twisted/Nevow which can do asynchronous socket communication
very well. For those who are time constrained, Apache2 or a similar web
server may be the fastest answer.

I believe that the CherryPy code needs to be revamped to support
HTTP/1.1, especially with respect to chunked data, in order to serve
binary files efficiently...maybe someday I will take a stab at it myself....

BTW, I meant to thank Remi Delon and nopa90 for responding to me last
week, not the names I misattributed in my previous post...

Regards
Ravi
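
For readers unfamiliar with the chunked data mentioned above: HTTP/1.1 chunked transfer coding lets a server send a body whose total length is not known up front. Each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. A minimal sketch of the framing follows; encode_chunked is a hypothetical helper, not CherryPy code, and a real response would also need the "Transfer-Encoding: chunked" header.

```python
def encode_chunked(blocks):
    """Frame an iterable of byte blocks with HTTP/1.1 chunked transfer coding."""
    for block in blocks:
        if not block:
            # Skip empty blocks: a zero-length chunk would end the body early.
            continue
        # Chunk size in hex, CRLF, the data itself, CRLF.
        yield b"%x\r\n" % len(block) + block + b"\r\n"
    # The zero-length chunk marks the end of the body.
    yield b"0\r\n\r\n"


# Usage: frame two data blocks (the empty one in the middle is skipped).
wire = b"".join(encode_chunked([b"hello", b"", b"world!"]))
```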

Sylvain Hellegouarch

Apr 24, 2005, 1:29:55 PM
to cherryp...@googlegroups.com
Hi,

I've added a simple recipe to handle download requests.

http://www.cherrypy.org/wiki/FileDownload

For some reason I haven't understood yet, for binary files,
although the content of the files is correctly transferred, their
header is not.

Let me know if you understand why :)

- Sylvain

Remi Delon

Apr 24, 2005, 1:53:16 PM
to cherryp...@googlegroups.com
> I've added a simple recipe to handle download request.
>
> http://www.cherrypy.org/wiki/FileDownload

You don't need a filter for this ... How about this:

    import os

    from cherrypy import cpg

    def file2Generator(f):
        # Yield the open file one line at a time
        for l in f:
            yield l

    class Root:
        def download(self, path):
            fStat = os.stat(path)
            cpg.response.headerMap["Content-Type"] = "application/x-download"
            # Set the size up front so CP does not need to compute it
            cpg.response.headerMap['Content-Length'] = int(fStat.st_size)
            f = open(path, 'rb')
            return file2Generator(f)
        download.exposed = True

    cpg.root = Root()
    cpg.server.start()

Remi

PS: The "file2Generator" function could probably be improved to read
fixed amounts of data instead of splitting over newlines.

Sylvain Hellegouarch

Apr 24, 2005, 2:00:42 PM
to cherryp...@googlegroups.com
Remi,

I just hate you now :)

One might say that by using a filter, one can reuse the same code in
different places ;)

- Sylvain

fred.dixon

Apr 24, 2005, 4:36:25 PM
to cherryp...@googlegroups.com
I would like to thank you both for your help.
I'm going to play with Remi's code for now because I want to send back
binary files.

fred.dixon

Apr 24, 2005, 5:23:10 PM
to cherryp...@googlegroups.com


I was thinking about what you said about reading fixed amounts, so I
experimented with the approach from the upload recipe. While not as
pretty, it seemed to be faster.


Still learning Python, though.


    def file2Generator(self, f):
        while 1:
            data = f.read(1024 * 8)  # Read blocks of 8KB at a time
            if not data: break
            yield data
        #for l in f:
        #    yield l

Remi Delon

Apr 25, 2005, 4:14:52 AM
to cherryp...@googlegroups.com
Sylvain Hellegouarch wrote:
>
> Remi,
>
> I just hate you now :)

Can you update your page in the Wiki ? I don't want people to think it's
that complicated to provide a download facility in CP ;-)

Remi.

Sylvain Hellegouarch

Apr 25, 2005, 6:44:46 AM
to cherryp...@googlegroups.com
Hi Remi,

What part exactly ?

I've added your code snippet. Is there anything else?

- Sylvain

Remi Delon wrote:

Remi Delon

Apr 25, 2005, 7:13:06 AM
to cherryp...@googlegroups.com
> What part exactly ?
>
> I've added your code snippet. Is there anything else?

That's fine, thanks :-)

Remi.