threading.local, apache & wsgi


Jonathan Lundell

Sep 7, 2010, 5:18:13 PM
to mod...@googlegroups.com
Background: I've been writing an enhancement to the web2py application framework, which for purposes of this question can (I think) be viewed as just another wsgi app.

The legacy code has a module variable that contains a set of URL-rewriting parameters. That variable is referenced at various points in a request's lifetime. The enhancement (I won't bother with a lot of detail) establishes multiple sets of these parameters. A set is chosen when a request first comes in, and is used for its lifetime.

I think it's obvious that this isn't thread-safe. The easiest and most elegant fix would be to use threading.local to create a bit of thread-local storage to store the parameters during a request.

This should work fine with web2py's embedded server (Rocket, which is native Python and uses the threading module), and I assume it would work for threads created by mod_wsgi.

What I'm less certain of is whether it will also be safe for Apache worker threads. I'm frankly more than a little confused on the subject, and I don't really understand the mechanism that threading.local is using in the first place.

So: a) should threading.local be thread-safe against Apache worker threads, and b) if not, is there another approach that might work better? That is, some other approach to thread-local storage that would work with Apache threads.

Jason Garber

Sep 7, 2010, 10:13:42 PM
to mod...@googlegroups.com

Just from experience... Yes. I use threading.local extensively with mod_wsgi. Check out threading.enumerate() sometime, and you will see the actual Python threads in use, despite the fact that they are created outside of Python.

Now, I was using daemon mode when last testing, so maybe someone else can comment on embedded mode... But I suspect that it will work as advertised in any event.
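A quick way to try the threading.enumerate() check is a tiny WSGI app that reports which thread served the request. This is only a sketch: the callable name `application` follows the usual mod_wsgi convention, and the response format is made up for illustration.

```python
import threading


def application(environ, start_response):
    # Thread handling this request. For a thread created outside Python,
    # the threading module hands back a placeholder Thread object.
    current = threading.current_thread()
    # Every thread the threading module currently knows about.
    names = sorted(t.name for t in threading.enumerate())
    body = "current: %s\nknown: %s\n" % (current.name, ", ".join(names))
    data = body.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(data)))])
    return [data]
```

Hitting it a few times under each deployment mode should show which threads Python sees.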

Jason Garber



Jonathan Lundell

Sep 8, 2010, 1:47:36 AM
to mod...@googlegroups.com
On Sep 7, 2010, at 7:13 PM, Jason Garber wrote:

> Just from experience... Yes. I use threading.local extensively with mod_wsgi. Check out threading.enumerate() sometime, and you will see the actual Python threads in use, despite the fact that they are created outside of Python.
>
> Now, I was using daemon mode when last testing, so maybe someone else can comment on embedded mode... But I suspect that it will work as advertised in any event.

If daemon mode gives wsgi a dedicated process, then I'm fairly sure it'd work fine. But because this code is going into the web2py framework, and not a specific application where I have control over deployment, I'd like to be sure that it's thread-safe in all deployment modes.

Graham Dumpleton

Sep 8, 2010, 2:06:08 AM
to mod...@googlegroups.com
On 8 September 2010 07:18, Jonathan Lundell <jlun...@pobox.com> wrote:
> Background: I've been writing an enhancement to the web2py application framework, which for purposes of this question can (I think) be viewed as just another wsgi app.
>
> The legacy code has a module variable that contains a set of URL-rewriting parameters. That variable is referenced at various points in a request's lifetime. The enhancement (I won't bother with a lot of detail) establishes multiple sets of these parameters. A set is chosen when a request first comes in, and is used for its lifetime.
>
> I think it's obvious that this isn't thread-safe. The easiest and most elegant fix would be to use threading.local to create a bit of thread-local storage to store the parameters during a request.
>
> This should work fine with web2py's embedded server (Rocket, which is native Python and uses the threading module), and I assume it would work for threads created by mod_wsgi.
>
> What I'm less certain of is whether it will also be safe for Apache worker threads. I'm frankly more than a little confused on the subject, and I don't really understand the mechanism that threading.local is using in the first place.
>
> So: a) should threading.local be thread-safe against Apache worker threads,

Yes, because although it is a foreign thread, a Python thread state
object is still created for the thread, and it is in that Python thread
state object that threading.local() instances for a thread are
stored.

So, as far as you're concerned, you should see no difference from a
standalone pure Python WSGI server using threads.
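To make the per-thread behaviour concrete, here is a small sketch using ordinary Python threads (not Apache threads, which can't be created from a plain script). The names `params`, `handle` and the "set-a"/"set-b" labels are invented for the example.

```python
import threading

# One shared object, but each thread gets its own attribute namespace.
params = threading.local()
seen = {}
barrier = threading.Barrier(2)


def handle(name):
    params.selected = name      # visible only to the calling thread
    barrier.wait(timeout=5)     # make sure both threads have written
    seen[name] = params.selected


threads = [threading.Thread(target=handle, args=(n,))
           for n in ("set-a", "set-b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread read back its own value; the main thread never set one.
assert seen == {"set-a": "set-a", "set-b": "set-b"}
assert not hasattr(params, "selected")
```

With a plain module-level variable in place of `params`, the second write would have overwritten the first.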

Just remember that, as good practice, you should clean up any cached
data held within the thread local object for that request when the
request ends. This is because thread local data can persist
across requests which happen to be handled by the same thread. If
you don't either clear the data out at the start of the request or
clean it up at the end, then you could pollute the cached data for a
subsequent request. Obviously, if you don't clean up at the end of the
request, you also hold on to that memory, and if the thread is not used
for a while, that memory is wasted. If it is a small amount of memory,
that is not an issue, but if it is a lot, it is not ideal. The best
option is to clear at the start of the request to ensure a reset, and
clear at the end of the request, even if the request fails.
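The reset-at-start, clear-at-end pattern could be sketched as a WSGI middleware along these lines. Everything named here (`_state`, `select_params`, `current_params`, `PerRequestParams`) is hypothetical and not web2py's actual code; `select_params` just picks the first path segment as a stand-in for the real selection logic.

```python
import threading

_state = threading.local()


def select_params(environ):
    # Hypothetical selector: use the first path segment as the "set" name.
    first = environ.get("PATH_INFO", "/").strip("/").split("/")[0]
    return {"app": first or "default"}


def current_params():
    # Accessor for code that has no reference to the request.
    return getattr(_state, "params", None)


class PerRequestParams(object):
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Start of request: overwrite whatever a previous request left behind.
        _state.params = select_params(environ)
        try:
            return self.app(environ, start_response)
        finally:
            # End of request: drop the reference even if the app raised.
            _state.params = None
```

One caveat with this simple form: the cleanup runs when the wrapped app returns, so a response generator that reads the parameters lazily while the server iterates over it would need the cleanup tied to the iterable's close() instead.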

> and b) if not, is there another approach that might work better? That is, some other approach to thread-local storage that would work with Apache threads.

The alternate approach to using threading.local() is to bind the per
request data to the request object itself. This is actually the more
traditional approach, although it does presume that all code which
needs access to your data has access to the request object.
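In a plain WSGI app the closest thing to a request object is the environ dict, so binding the data to the request might look like the sketch below. The key name `example.rewrite_params` and the helpers `select_params` and `with_params` are assumptions made up for the illustration.

```python
PARAMS_KEY = "example.rewrite_params"  # hypothetical environ key


def select_params(environ):
    # Hypothetical selector: choose a parameter set from the host name.
    host = environ.get("HTTP_HOST", "")
    return {"prefix": "/admin" if host.startswith("admin.") else "/"}


def with_params(app):
    def middleware(environ, start_response):
        # The data travels with this request only; nothing is shared
        # between threads and nothing outlives the request.
        environ[PARAMS_KEY] = select_params(environ)
        return app(environ, start_response)
    return middleware


def application(environ, start_response):
    params = environ[PARAMS_KEY]
    data = ("prefix=%s\n" % params["prefix"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [data]


application = with_params(application)
```

The trade-off is the one noted above: every function that needs the parameters must be handed the environ (or the request object built from it).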

Graham

Jonathan Lundell

Sep 8, 2010, 3:32:50 PM
to mod...@googlegroups.com
On Sep 7, 2010, at 11:06 PM, Graham Dumpleton wrote:
>
>> and b) if not, is there another approach that might work better? That is, some other approach to thread-local storage that would work with Apache threads.
>
> The alternate approach to using threading.local() is to bind the per
> request data to the request object itself. This is actually the more
> traditional approach, although it does presume that all code which
> needs access to your data has access to the request object.

Thanks. My original plan was to do something like that, since the web2py request object is already thread-safe. The problem was that it wasn't universally accessible, so I'd have to pass it around. threading.local will work well for me; all I need is one reference, and it will be explicitly initialized on each request.
