Revisiting Performance


Randy Syring

Feb 11, 2011, 5:06:47 PM
to isapi_wsgi-dev
One of the things that has always bothered me about isapi-wsgi is that
the performance degradation seems so significant, especially since the
PyISAPIe project doesn't seem to suffer from the same problem.

http://code.google.com/p/isapi-wsgi/wiki/PerformanceEstimates

This led me to do some quick benchmarking of my own today (for
relative purposes only) and I found the following (numbers are req/
sec):

static file: 1283, 1207, 1302
isapi-wsgi: 642, 734, 708
isapi-extension direct: 1552, 1469, 1354

ran with: ab -c 5 -n 1000 on Windows 7

I know my method can be nitpicked, but it's much more reassuring to
know that in my tests there is only a ~50% performance difference
between isapi-wsgi and static files, as opposed to the 83% on the wiki page.

In addition, it's encouraging to know that the raw isapi extension is
much faster. I am assuming, therefore, that if push came to shove,
isapi-wsgi could be optimized to perform better.

Just wondering if anyone else has thoughts on this and if that wiki
page might be out of date.

Thanks.

Preston Landers

Feb 15, 2011, 1:23:06 PM
to isapi_w...@googlegroups.com
I don't have much intelligent to say about this yet (such as pointers on where to improve performance), but I just wanted to say I am interested in the topic and will follow any potential developments.

I did some performance profiling of my own recently and did find that my company's Windows IIS / isapi-wsgi solution has noticeably lower throughput than an equivalent Linux + Lighttpd + flup FastCGI solution on the same hardware.  Unfortunately that's about as far as I got.  There's a chance my company may be able to allocate some of my time in the next 12 months to investigating this further, as a lot of our customers are on Windows.

It's good to see that a raw ISAPI solution is pretty fast, as it does indicate there is room for improvement.  I guess one of the first things to look at would be how much of this is isapi-wsgi itself and how much is Python interpreter overhead.  It would be good to do performance profiling both at the Python level and at the C level.  I'm not sure if IIS allows this with ISAPI extensions or if it would require a debug/checked build of the entire Windows install.  If I find out anything else on this topic I'll be sure to post here.

thanks,
Preston Landers
Chief Architect and Software Developer at Journyx, Inc.

Randy Syring

Feb 15, 2011, 3:36:16 PM
to isapi_w...@googlegroups.com, Preston Landers
On 02/15/2011 01:23 PM, Preston Landers wrote:
> It's good to see that a raw ISAPI solution is pretty fast as it does
> indicate there is room for improvement. I guess one of the first
> things to look at would be how much of this is isapi-wsgi itself and
> how much is Python interpreter overhead. It would be good to do
> performance profiling both at the Python level and at the C level.
> I'm not sure if IIS allows this with ISAPI extensions or if it would
> require the entire Windows install to be debug/checked enabled. If I
> find out anything else on this topic I'll be sure to post here.

Preston,

Glad to hear you are interested in this.

Just to be clear, when I refer to "isapi-extension direct" in my tests,
it involved the Python interpreter. The code I used to test that is here:

https://gist.github.com/827989

Also, I did some tinkering with isapi_wsgi.py and figured out that the
main slow-down seems to be related to prepping the environ dict. Here
are my test files. Set up this way, any exceptions generated when
modifying isapi-wsgi should come through to the browser for easier
debugging:

https://gist.github.com/828168

When I started off (no modifications to isapi-wsgi), I was getting ~900
reqs/sec. I then changed _run_app() to what is shown in the gist and
got ~1000 reqs/s. I then modified add_cgi_vars() and got ~1400
reqs/sec, which is just about even with my "isapi-extension" direct numbers.

So it looks to me like any performance improvement efforts should be
directed towards _run_app(), add_cgi_vars() and related environ
manipulations.
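To make the shape of those environ changes concrete, here is a rough sketch of the idea (illustrative only, not the actual gist code; the ecb interface and variable list are assumptions): build the environ dict in a single pass over just the variables WSGI needs, skipping any the server doesn't set, instead of copying everything the server exposes.

```python
# Hypothetical sketch of a trimmed add_cgi_vars(): fetch only the
# server variables WSGI actually needs, in one pass, tolerating
# variables that are not set for this request.
WSGI_VARS = (
    "REQUEST_METHOD", "SCRIPT_NAME", "PATH_INFO", "QUERY_STRING",
    "CONTENT_TYPE", "CONTENT_LENGTH", "SERVER_NAME", "SERVER_PORT",
    "SERVER_PROTOCOL", "REMOTE_ADDR",
)

def build_environ(ecb):
    environ = {}
    for name in WSGI_VARS:
        try:
            environ[name] = ecb.GetServerVariable(name)
        except Exception:
            pass  # variable not set for this request; skip it
    return environ
```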

--------------------------------------
Randy Syring
Intelicom
Direct: 502-276-0459
Office: 502-212-9913

For the wages of sin is death, but the
free gift of God is eternal life in
Christ Jesus our Lord (Rom 6:23)


Mark Hammond

Feb 15, 2011, 5:53:16 PM
to isapi_w...@googlegroups.com, Randy Syring
On 16/02/2011 7:36 AM, Randy Syring wrote:
> So it looks to me like any performance improvement efforts should be
> directed towards _run_app(), add_cgi_vars() and related environ
> manipulations.

FWIW, the Python profiler should work fine in this environment, so long
as you are sure to send the output somewhere other than the non-existent
console. I just made a small change to the C code so the GIL isn't
acquired and released when fetching server variables as that would just
have added overhead for no gain (as no IO is performed during that
operation.) If the profiler shows this fetch to still be significant,
we could look at the option of adding a method to allow multiple
variables to be fetched at once - but the profiler should tell us if
that is worthwhile.
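A minimal sketch of running the profiler that way (the wrapper and dump path are assumptions for illustration, not pywin32 API): profile one call and write the stats to a file on disk rather than printing to a console.

```python
import cProfile

def profiled_call(func, path, *args, **kwargs):
    # Profile a single call and dump the stats to `path`, e.g. a
    # well-known file such as r"c:\temp\isapi_wsgi.prof" under IIS,
    # since there is no console to print to.
    prof = cProfile.Profile()
    result = prof.runcall(func, *args, **kwargs)
    prof.dump_stats(path)
    # Inspect offline with:
    #   pstats.Stats(path).sort_stats("cumulative").print_stats(20)
    return result
```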

Cheers,

Mark

Randy Syring

Feb 15, 2011, 9:09:39 PM
to Mark Hammond, isapi_w...@googlegroups.com
I was able to do some initial profiling, thanks to Preston for the code
for the profiler. However, I have run into the problem that I can't get
the decorator on a function "high enough" in the stack in order to be
able to profile multiple requests. I can decorate HttpExtensionProc()
or even _run_app(), but that gets me a new profile for every request. I
haven't done a lot of profiling, but my instincts tell me that we would
want to be able to get cumulative profile info over say 10K requests.

I managed to do that by "faking" the IIS part of the server with FakeECB:

https://gist.github.com/828716

and then running the profiler. That should give a "pure" profile of
just isapi-wsgi, but it seems like it would be better to have the
profiling done with real requests through IIS. Any suggestions on how I
should go about doing that?


jame...@googlemail.com

Feb 16, 2011, 5:10:11 AM
to isapi_w...@googlegroups.com
Hi guys,

I just wanted to say good luck, and thanks for making isapi-wsgi better. I'm stuck with IIS7 at work, and isapi-wsgi is a great tool for those of us who are stuck on Windows.

Kind regards,
James


Preston Landers

Feb 16, 2011, 5:23:41 PM
to isapi_w...@googlegroups.com
Did you try attaching the profile to __ExtensionFactory__? I think that's the highest level function. 

While profiling across multiple page views would be helpful, in practice it may be tough to do that.  Finding the right place to call the profiler is a problem, but also knowing when to cut off and dump out the profile.  It may be tough to do that with the function decorator I provided since it's pretty simple. 
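One way the cut-off could be handled (a hedged sketch; the decorator name, count, and dump path are all illustrative): keep a single cProfile.Profile alive across requests and dump the cumulative stats to disk every N calls.

```python
import cProfile
import functools

def profile_every(n, path):
    """Decorator that accumulates one profile across calls and dumps
    the cumulative stats to `path` after every `n` calls."""
    prof = cProfile.Profile()
    state = {"calls": 0}
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = prof.runcall(func, *args, **kwargs)
            state["calls"] += 1
            if state["calls"] % n == 0:
                prof.dump_stats(path)  # cumulative stats so far
            return result
        return wrapper
    return decorator
```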

Did you see anything interesting in the profiles of single page views?  It wouldn't surprise me if fetching the OS environ for each request is the major culprit.  But I'm not sure what could be done about that.  Caching or reusing these across multiple requests may be problematic in some (possibly rare) cases.

thanks,
Preston


Randy Syring

Feb 16, 2011, 5:47:38 PM
to isapi_w...@googlegroups.com
> Did you try attaching the profile to __ExtensionFactory__? I think that's the highest level function.

Yes, but that function just returns the class instance, so the decorator exits immediately, even before the first HTTP request is processed.


> While profiling across multiple page views would be helpful, in practice it may be tough to do that.  Finding the right place to call the profiler is a problem, but also knowing when to cut off and dump out the profile.  It may be tough to do that with the function decorator I provided since it's pretty simple.
Well, I think if I can massage the profiling code out of the decorator, I can probably write the profiling output after a specific number of requests or something.  Not sure exactly how it would work, but I would be surprised if something couldn't be worked out.


> Did you see anything interesting in the profiles of single page views?
I didn't pay much attention in the little bit of time I had; I was more concerned with trying to get the profiler set up.  Here are the results from testing isapi-wsgi by itself outside of IIS over 10K requests:

https://gist.github.com/830438

And here is just a single request:

https://gist.github.com/830452

But I'm not sure how helpful that will be.  It may be another week or so before I can revisit this, but it seems like profiling over multiple requests inside IIS would be the best solution if it's possible.



Chris Lambacher

Feb 18, 2011, 11:08:15 AM
to isapi_w...@googlegroups.com, Randy Syring
On Wed, Feb 16, 2011 at 5:47 PM, Randy Syring <rsy...@gmail.com> wrote:
> Well, I think if I can massage the profiling code out of the decorator, I
> can probably write the profiling output after a specific number of requests
> or something.  Not sure how exactly it would work, but I would be surprised
> if something couldn't be worked out.

I have been working on profiling some Python in classic ASP, which
suffers from a similar issue with the lifetime of the profile
(i.e. wanting to capture multiple requests). I have had some success
with a module that has the following:

_profiler = None

def start_profiler():
    """Start the profiler. Needs to happen every request in classic ASP
    because it seems like the profiler is stopped after each request."""
    global _profiler
    if not _profiler:
        from cProfile import Profile
        _profiler = Profile()
    _profiler.enable()

def stop_profiler():
    """Stop the profiler on some specific in-request cue, like a
    Profile=Stop query param. I do this as the last thing in the request."""
    global _profiler
    if _profiler:
        _profiler.dump_stats(r'c:\well\known\path.prof')
        _profiler = None


The trick here is the low-level (undocumented) .enable() method. The
dump_stats() method calls the profiler's .disable() method. I still
have to look at the code for _lsprof.Profiler (the base class for
cProfile.Profile) to ensure that enable() and disable() maintain the
current profile information when called multiple times.

PyISAPIe has done some performance optimization work, and the author
specifically calls out attempting to do more of the boilerplate WSGI
work in C and without the GIL (which probably doesn't come into play
much for sequential requests but will play heavily into concurrent
ones): http://sourceforge.net/apps/trac/pyisapie/wiki/ReleaseInfo and
http://sourceforge.net/apps/trac/pyisapie/wiki/WikiStart (in both
cases search for GIL).

If you are using Python 2.6 or lower, I would also try using psyco. I
found it helped a lot with speed, doubling or tripling my requests per
second (in ASP, not ISAPI, calls). My profiling showed that a lot of
time was spent transitioning between Python and other languages in
ASP.

I am wondering if the benchmark from the OP is more than a little
suspect. If the goal of your app is to serve a static string, then all
the effort spent converting the environment info up front is wasted.
Any benchmark that compares spending effort to do nothing against
spending no effort to do nothing will always show the former as
slower.

Is it worth getting the environment data lazily? Is it worth making a
Cython extension to do it faster? Can we make a C extension to do some
of the work of extracting the info from the ecb with the GIL released?
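To illustrate the lazy option (a rough sketch; the ecb interface is assumed, and note that WSGI strictly requires a real dict, so this only sketches the cost model): a dict-like environ that calls GetServerVariable only when a key is first accessed, then caches the result.

```python
from collections.abc import Mapping

class LazyEnviron(Mapping):
    """Fetch server variables on first access instead of up front."""

    def __init__(self, ecb, names):
        self._ecb = ecb
        self._names = tuple(names)
        self._cache = {}

    def __getitem__(self, name):
        if name not in self._cache:
            if name not in self._names:
                raise KeyError(name)
            self._cache[name] = self._ecb.GetServerVariable(name)
        return self._cache[name]

    def __iter__(self):
        return iter(self._names)

    def __len__(self):
        return len(self._names)
```

An app that never touches, say, REMOTE_ADDR would then never pay for fetching it, which is exactly the cost the static-string benchmark is measuring.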

When I first looked at isapi-wsgi, PyISAPIe did not have WSGI support.
Now it does, and it looks mature, so I am wondering what the advantage
of isapi-wsgi over PyISAPIe is at this point? Note, I am not saying
there isn't an advantage; I am just asking what people think the
advantage is.

I am also wondering (given the advances in PyPy lately) whether
isapi-wsgi can use PyPy instead. That could potentially be an
advantage for this use case, since the free threading allowed by PyPy
*should* allow higher levels of concurrency. Also PyPy has a jit for
x64, something for which there is no support in CPython.

-Chris

--
Christopher Lambacher
ch...@kateandchris.net

Mark Rees

Feb 20, 2011, 5:33:19 PM
to isapi_w...@googlegroups.com
On Sat, Feb 19, 2011 at 3:08 AM, Chris Lambacher <ch...@kateandchris.net> wrote:
> On Wed, Feb 16, 2011 at 5:47 PM, Randy Syring <rsy...@gmail.com> wrote:


> Is it worth getting the environment data lazily? Is it worth making a
> Cython extension to do it faster? Can we make a C extension to do some
> of the work of extracting the info from the ecb with the GIL released?
>

In a previous post on this thread, Mark Hammond mentioned he has made
a change to the C code so the GIL isn't acquired and released when
fetching server variables. It would be interesting to see if this has
improved the benchmarking results.

Mark Hammond

Feb 20, 2011, 8:28:58 PM
to isapi_w...@googlegroups.com, Mark Rees

A quick look at the profile data sent before implies to me that it will
not make much difference - but I guess the proof is in the pudding!
FWIW, I just released build 215 of pywin32 which has that change, so now
it can be measured...

Cheers,

Mark
