wsgi.file_wrapper and trac.web.api performance issues????

Graham Dumpleton

Jul 20, 2008, 8:30:32 AM
to Trac Development
Can someone tell me what I am missing in the following analysis?
Because I am on holiday, I can only read code and don't have access to
a runnable Trac instance to play with and see what happens in practice.

When Trac runs under a WSGI server such as mod_wsgi, the Request class
in trac.web.api is used. The send_file method of that class contains
this code:

    if self.method != 'HEAD':
        self._response = file(path, 'rb')
        file_wrapper = self.environ.get('wsgi.file_wrapper')
        if file_wrapper:
            self._response = file_wrapper(self._response, 4096)

What this code does is open the file specified by 'path'. It then checks
whether the WSGI server has provided the wsgi.file_wrapper extension and,
if it has, wraps the file object by calling wsgi.file_wrapper. This
allows the WSGI server to use any optimised method it can to send the
file. In the case of Apache/mod_wsgi (>= 2.0), which provides this
extension, it will use the operating system's sendfile() or similar,
depending on what is available.

Now, wsgi.file_wrapper is optional and a WSGI server doesn't have to
provide it (and mod_wsgi < 2.0 doesn't implement it). In that case the
actual file object gets returned as the response. The problem with that
is that when a file object is used as an iterable, as it would then be
by a WSGI adapter, the file is broken up based on text lines in the
file, i.e., with LF as the separator. Thus the content of the file can
drastically affect performance, either in terms of throughput or in
terms of memory usage.
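
To make the line-based chunking concrete, here is a minimal sketch in Python 3 (the Trac snippet above is Python 2). It assumes io.BytesIO as a stand-in for a file opened in 'rb' mode, and the payloads are made up for illustration.

```python
import io

# BytesIO iterates the same way a binary file object does:
# one chunk per LF-terminated line, not fixed-size blocks.
text_like = io.BytesIO(b"a\nbb\nccc\n")
line_chunks = list(text_like)

# A payload with no LF at all comes back as a single chunk.
binary_like = io.BytesIO(b"\x00" * 10000)
blob_chunks = list(binary_like)

print(line_chunks)       # three short chunks, one per line
print(len(blob_chunks))  # a single chunk holding the whole payload
```

The chunk boundaries are decided entirely by where the LF bytes happen to fall in the data, which is why the content of the file ends up dictating the block sizes the server has to write.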

The worst case throughput scenario is a large file containing lots of
lines with very few characters on each line. The problem there is that
because WSGI requires each block returned by the iterable to be
separately flushed, you end up writing and flushing a large number of
small blocks. Thus network performance sucks really bad.

At the other end of the scale is a binary file that doesn't have any
LFs in it. Here the complete file would be read into memory in one go,
causing a spike in transient memory usage. When the request completes,
the memory is freed, but the process has still allocated it. That would
be bad if you had really large attachments.
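
The two extremes can be compared against fixed-size reads with a rough sketch like the following. The helper names fixed_blocks and chunk_stats are hypothetical, the payload sizes are arbitrary, and io.BytesIO again stands in for a real file; this only counts chunks and does not measure actual network or memory behaviour.

```python
import io

def fixed_blocks(fileobj, blksize=4096):
    """Yield successive blocks of at most blksize bytes."""
    while True:
        data = fileobj.read(blksize)
        if not data:
            break
        yield data

def chunk_stats(chunks):
    """Return (number of chunks, largest chunk size in bytes)."""
    sizes = [len(c) for c in chunks]
    return len(sizes), max(sizes)

short_lines = b"x\n" * 1000             # many tiny lines
no_newlines = b"\x00" * (1024 * 1024)   # 1 MiB without any LF

# Iterating the file object directly splits on LF.
lines_by_line = chunk_stats(io.BytesIO(short_lines))
blob_by_line = chunk_stats(io.BytesIO(no_newlines))

# Reading fixed-size blocks ignores the content entirely.
lines_by_block = chunk_stats(fixed_blocks(io.BytesIO(short_lines)))
blob_by_block = chunk_stats(fixed_blocks(io.BytesIO(no_newlines)))
```

With line iteration the short-line payload becomes 1000 two-byte writes and the binary payload becomes one 1 MiB chunk. With 4096-byte blocks the same payloads become one 2000-byte chunk and 256 chunks of 4096 bytes respectively.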

What I don't understand is why, when the wsgi.file_wrapper extension
doesn't exist, the code isn't wrapping the file object in its own
wrapper object to return it in fixed-length blocks, just as the WSGI
PEP describes:

    class FileWrapper:

        def __init__(self, filelike, blksize=8192):
            self.filelike = filelike
            self.blksize = blksize
            if hasattr(filelike, 'close'):
                self.close = filelike.close

        def __getitem__(self, key):
            data = self.filelike.read(self.blksize)
            if data:
                return data
            raise IndexError
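
For comparison, here is a hedged sketch of the same idea written against the iterator protocol in Python 3. The class name BlockFileWrapper is hypothetical and this is neither the PEP's code nor Trac's _FileWrapper; it only illustrates that the block size is fixed regardless of where LFs fall.

```python
import io

class BlockFileWrapper:
    """Hypothetical variant of the PEP's FileWrapper using __iter__."""

    def __init__(self, filelike, blksize=8192):
        self.filelike = filelike
        self.blksize = blksize
        # Expose close() only if the wrapped object has one, so the
        # WSGI server can release the file when the response is done.
        if hasattr(filelike, 'close'):
            self.close = filelike.close

    def __iter__(self):
        # Two-argument iter() keeps calling read() until it returns b''.
        return iter(lambda: self.filelike.read(self.blksize), b'')

source = io.BytesIO(b"\x01" * 20000)   # no LF anywhere in the payload
wrapper = BlockFileWrapper(source)
sizes = [len(block) for block in wrapper]
wrapper.close()
```

Here the 20000-byte payload comes back as two 8192-byte blocks and one 3616-byte remainder, and close() on the wrapper closes the underlying file object.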

I am well aware that there is a similar _FileWrapper in trac.web.wsgi,
but that doesn't seem to be used by trac.web.api Request object unless
it is through some bit of magic that I cannot find.

So, my question is: does _FileWrapper somehow get used by the
trac.web.api Request object in some way I haven't found yet, or is Trac
shooting itself in the foot by making static file serving perform worse
or cause unexpected memory usage when the WSGI server doesn't implement
wsgi.file_wrapper? Last time I looked, various WSGI servers and
adapters didn't implement wsgi.file_wrapper, and neither do the older
mod_wsgi 1.X versions.

Oh, and in case you think it isn't a problem because people would be
serving static files from the web server itself, realise that the
above code, i.e., send_file(), is used when returning attachments, and
as far as I know the web server can't be made to handle those directly.

Feedback???

Graham

Christopher Lenz

Jul 21, 2008, 10:46:10 AM
to trac...@googlegroups.com
Hey Graham,

Good catch, I think this problem has simply not been noticed because
mod_python (with the Trac WSGI adapter), mod_wsgi and tracd all have a
sensible implementation of the file_wrapper interface.

The fallback should be fixed to use the FileWrapper from trac.web.wsgi
as you suggest.
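
A rough sketch of what such a fallback could look like, in Python 3. This is not the code from the Trac changeset: wrap_response is a hypothetical helper, and wsgiref.util.FileWrapper from the standard library is used here only as a stand-in for the wrapper in trac.web.wsgi.

```python
import io
from wsgiref.util import FileWrapper  # stand-in for Trac's own wrapper

def wrap_response(fileobj, environ, blksize=4096):
    """Return an iterable that yields fileobj in fixed-size blocks.

    Hypothetical helper, not Trac code: it prefers the server's
    wsgi.file_wrapper and otherwise falls back to a generic wrapper,
    so the bare file object is never returned as the response.
    """
    file_wrapper = environ.get('wsgi.file_wrapper') or FileWrapper
    return file_wrapper(fileobj, blksize)

# Server without the extension: the fallback still yields fixed blocks,
# even though the payload consists of 5000 two-byte lines.
blocks = list(wrap_response(io.BytesIO(b"x\n" * 5000), {}))
block_sizes = [len(b) for b in blocks]
```

With an empty environ the 10000-byte payload is returned as blocks of 4096, 4096 and 1808 bytes instead of 5000 separate lines.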

Thanks,
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/

Graham Dumpleton

Jul 21, 2008, 9:15:04 PM
to Trac Development
Are you able to create ticket etc? If I have an account, can't
remember what it is and so can't log into issue tracker right now. :-)

Graham

Emmanuel Blot

Jul 22, 2008, 5:22:28 AM
to trac...@googlegroups.com
You don't need to have an account on trac.edgewall.org to create or
edit a ticket (if your request is about the Trac web site).

HTH,
Manu

--
Manu

Christopher Lenz

Jul 22, 2008, 7:38:05 AM
to trac...@googlegroups.com
On 22.07.2008, at 03:15, Graham Dumpleton wrote:
> Are you able to create ticket etc? If I have an account, can't
> remember what it is and so can't log into issue tracker right now. :-)

You don't need an account, but I've just checked the change in anyway :)

<http://trac.edgewall.org/changeset/7376>

Cheers,
