[Web-SIG] question about connection pool, task queue in WSGI

est

Jul 12, 2012, 10:50:01 PM
to web...@python.org
Hi list,

I am running a site with Django + uWSGI, and I have a few questions about how WSGI works.

1. Is the opening and closing of db connections handled by Django? If connections are opened/closed per request, can we make a connection pool at the WSGI level so that multiple Django views can share it?

2. As a general design consideration, can we execute some task *after* the response has been returned to the client? I have some heavy data processing that needs to be done after returning HttpResponse() in Django. The standard way to do this seems to be Celery or another task queue with a broker, but that's just too heavyweight. Is it possible to run a simple background task in WSGI directly?

Thanks in advance!

Graham Dumpleton

Jul 12, 2012, 11:31:43 PM
to est, web...@python.org
On 12 July 2012 19:50, est <electr...@gmail.com> wrote:
> Hi list,
>
> I am running a site with Django + uWSGI, and I have a few questions about
> how WSGI works.
>
> 1. Is the opening and closing of db connections handled by Django? If
> connections are opened/closed per request,

Yes it is.

> can we make a connection pool at the WSGI level so that multiple Django
> views can share it?

From memory, only by changing the Django code base. You'd be better
off asking on the Django users list.

> 2. As a general design consideration, can we execute some task *after* the
> response has been returned to the client? I have some heavy data processing
> that needs to be done after returning HttpResponse() in Django. The standard
> way to do this seems to be Celery or another task queue with a broker, but
> that's just too heavyweight. Is it possible to run a simple background task
> in WSGI directly?

Read:

http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode

Doing it this way, though, ties up the request thread, and so it would
not be able to handle other requests until your task has finished.
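For illustration, here is a minimal sketch of the close()-based cleanup
technique that page describes; the class and task names here are made
up, and this is a sketch rather than the canonical mod_wsgi recipe:

    class CleanupIterable:
        """Wraps the response iterable; a WSGI server calls close()
        once the response has been fully sent."""

        def __init__(self, iterable, callback):
            self.iterable = iterable
            self.callback = callback

        def __iter__(self):
            return iter(self.iterable)

        def close(self):
            try:
                if hasattr(self.iterable, 'close'):
                    self.iterable.close()
            finally:
                self.callback()

    class ExecuteOnCompletion:
        """WSGI middleware: run `callback` after each response is sent."""

        def __init__(self, application, callback):
            self.application = application
            self.callback = callback

        def __call__(self, environ, start_response):
            result = self.application(environ, start_response)
            return CleanupIterable(result, self.callback)

    def my_cleanup_task():
        pass  # hypothetical placeholder for the heavy processing

    # application = ExecuteOnCompletion(django_wsgi_application, my_cleanup_task)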

Creating background threads at the end of a request is not a good idea
unless you do it via a pooling mechanism, so that you limit the number
of worker threads for your tasks. And because the process can crash or
be shut down, you lose the job, as it exists only in memory and is thus
not persistent.
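As a rough sketch of such a pooled approach, assuming Python's
concurrent.futures is available (the function name is illustrative,
and the in-memory caveat above still applies):

    from concurrent.futures import ThreadPoolExecutor

    # One small, bounded pool per worker process.
    executor = ThreadPoolExecutor(max_workers=4)

    def heavy_processing(data):
        pass  # hypothetical placeholder for the real work

    # In a Django view, after building the response object:
    #     executor.submit(heavy_processing, data)
    #     return response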

Better to use Celery, or, if you think that is too heavyweight, have a
look at Redis Queue (RQ) instead.

Graham

Graham Dumpleton

Jul 13, 2012, 10:56:10 AM
to est, Web SIG
Please keep replies on the mailing list.

Graham

On 13 July 2012 07:18, est <electr...@gmail.com> wrote:
> Thanks for the answer. That's very helpful info.


>
>> From memory, only by changing the Django code base. You'd be better
>> off asking on the Django users list.
>

> Was my idea good or bad? (Making WSGI handle connection pools, instead of
> WSGI apps.)
>
> I read about Tarek Ziadé's experiment last month with re-using a TCP port by
> specifying socket FDs. It's an awesome idea and code, btw. I have a couple of
> questions about it:
>
> 1. In theory, I presume it's also possible with db connections? (After a
> WSGI hosting worker ends, hand the db connection FD to the next WSGI worker.)
>
> 2. Is the socket FD the same mechanism as in nginx? If you upgrade the nginx
> binary and restart nginx, existing HTTP connections won't break.
>
> 3. Is my following understanding of the WSGI model right?
>
> A WSGI worker process runs the WSGI app (like Django); multiple requests are
> handled by the same process, and the Django views process these requests and
> return responses within the same process (possibly in a forked or threaded
> way, or even both?). After a defined number of requests, the WSGI worker
> terminates and the next WSGI worker process is spawned.
>
> Before hacking together a task queue based on pure WSGI code, I want to make
> sure my view of WSGI is correct. :)
>
> Please advise! Thanks in advance!

Graham Dumpleton

Jul 14, 2012, 12:07:01 AM
to est, Web SIG
> On 13 July 2012 07:18, est <electr...@gmail.com> wrote:
>> Thanks for the answer. That's very helpful info.
>>
>>> From memory, only by changing the Django code base. You'd be better
>>> off asking on the Django users list.
>>
>> Was my idea good or bad? (Making WSGI handle connection pools, instead of
>> WSGI apps.)
>>
>> I read about Tarek Ziadé's experiment last month with re-using a TCP port
>> by specifying socket FDs. It's an awesome idea and code, btw. I have a
>> couple of questions about it:
>>
>> 1. In theory, I presume it's also possible with db connections? (After a
>> WSGI hosting worker ends, hand the db connection FD to the next WSGI
>> worker.)

Unlikely. HTTP connections are stateless; open database connections
are highly unlikely to be stateless, with the client likely caching
certain session information.

>> 2. Is the socket FD the same mechanism as in nginx? If you upgrade the
>> nginx binary and restart nginx, existing HTTP connections won't break.

I would be very surprised if you could upgrade nginx, perform a
restart, and preserve the HTTP listener socket. If you are talking
about some other socket, I don't know what you mean.

As with Apache, you can likely make a configuration file change and
perform a restart, or trigger a rereading of the configuration, and it
would maintain the HTTP listener socket across the configuration
restart. But an upgrade implies changing the binary, and I know of no
way that you could easily persist an HTTP listener socket across to an
invocation of a new web server instance using a new executable. In
Apache you certainly cannot do it, and unless nginx has some magic
whereby the existing nginx execs the new nginx version and somehow
communicates the open socket connections to the new process, I very
much doubt it would, as it would be rather messy to do so.

>> 3. Is my following understanding of the WSGI model right?
>>
>> A WSGI worker process runs the WSGI app (like Django); multiple requests
>> are handled by the same process, and the Django views process these
>> requests and return responses within the same process (possibly in a forked
>> or threaded way, or even both?). After a defined number of requests, the
>> WSGI worker terminates and the next WSGI worker process is spawned.

Different WSGI servers would behave differently, especially around
process control, but your model of understanding is close enough.

>> Before hacking together a task queue based on pure WSGI code, I want to
>> make sure my view of WSGI is correct. :)

I would still suggest you just use an existing solution.

Graham

Roberto De Ioris

Jul 14, 2012, 1:52:46 AM
to est, web...@python.org
You can abuse one of the features you have already found in uWSGI.

The simplest approach would be to use the Spooler (check the uWSGI docs).

It is a simplified Celery, where the queue is a simple 'spool directory'
(as in a printing system).
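As a rough illustration, assuming a spool directory is configured (e.g.
--spooler /tmp/spool) and the app runs under uWSGI; the task and
argument names here are made up:

    from uwsgidecorators import spool

    @spool
    def process_upload(arguments):
        # `arguments` is a dict of the string values passed to .spool()
        path = arguments['path']
        # ... do the heavy processing on `path` here ...

    # From a Django view, enqueue the job and return the response at once:
    #     process_upload.spool(path='/tmp/upload-123')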

A non-uWSGI-related trick would be to have a thread pool (one for each
worker) onto which you enqueue tasks from the request handler:

http://projects.unbit.it/uwsgi/wiki/Example#threadqueue
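In the spirit of that example, a per-worker queue might look roughly
like this (the names are illustrative, and jobs are in-memory only, so
they are lost if the worker is recycled):

    import threading
    try:
        import queue            # Python 3
    except ImportError:
        import Queue as queue   # Python 2

    # Bounded, so a burst of tasks blocks instead of piling up forever.
    tasks = queue.Queue(maxsize=100)

    def worker():
        while True:
            func, args = tasks.get()
            try:
                func(*args)
            finally:
                tasks.task_done()

    # A small pool of daemon threads in each worker process.
    for _ in range(4):
        t = threading.Thread(target=worker)
        t.daemon = True
        t.start()

    # From a request handler:
    #     tasks.put((heavy_processing, (data,)))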

There are other solutions to your problem, but none of them are
relevant to WSGI, so you may want to move the discussion to the uWSGI
list directly.

--
Roberto De Ioris
http://unbit.it

est

Jul 14, 2012, 3:38:15 AM
to rob...@unbit.it, web...@python.org
These uWSGI features are pretty neat! Thank you! I'll try this.

Simon Sapin

Jul 15, 2012, 11:14:35 AM
to web...@python.org
On 14/07/2012 06:07, Graham Dumpleton wrote:
>>> 2. Is the socket FD the same mechanism as in nginx? If you upgrade the
>>> nginx binary and restart nginx, existing HTTP connections won't break.
> I would be very surprised if you could upgrade nginx, perform a
> restart, and preserve the HTTP listener socket. If you are talking
> about some other socket, I don't know what you mean.
>
> As with Apache, you can likely make a configuration file change and
> perform a restart, or trigger a rereading of the configuration, and it
> would maintain the HTTP listener socket across the configuration
> restart. But an upgrade implies changing the binary, and I know of no
> way that you could easily persist an HTTP listener socket across to an
> invocation of a new web server instance using a new executable. In
> Apache you certainly cannot do it, and unless nginx has some magic
> whereby the existing nginx execs the new nginx version and somehow
> communicates the open socket connections to the new process, I very
> much doubt it would, as it would be rather messy to do so.

I think that est refers to this:
http://wiki.nginx.org/CommandLine#Upgrading_To_a_New_Binary_On_The_Fly

Apparently yes, there is specific code in nginx to start the new binary
and give it the existing socket.

And I think that yes, Tarek’s new Circus is similar to the nginx magic
upgrade in that an open socket is passed between processes. Maybe nginx
even does this in normal operation with multiple worker processes, but I
don’t know.
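The underlying mechanism is plain file-descriptor inheritance across
exec(). A toy Python 3 sketch under that assumption (the environment
variable name is made up; nginx uses its own variable and signal
protocol):

    import os
    import socket
    import sys

    LISTEN_FD = "LISTEN_FD"  # made-up name; nginx uses its own variable

    def get_listener():
        fd = os.environ.get(LISTEN_FD)
        if fd is not None:
            # Upgraded process: adopt the listening socket inherited
            # from the old binary instead of binding again.
            return socket.socket(socket.AF_INET, socket.SOCK_STREAM,
                                 fileno=int(fd))
        # First start: create, bind and listen as usual.
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("0.0.0.0", 8000))
        sock.listen(128)
        return sock

    def upgrade(sock):
        # Re-exec a (possibly newer) binary, handing it the open fd;
        # connections already accepted stay with the old process.
        os.set_inheritable(sock.fileno(), True)
        os.environ[LISTEN_FD] = str(sock.fileno())
        os.execv(sys.executable, [sys.executable] + sys.argv)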

Regards,
--
Simon Sapin

Benoit Chesneau

Jul 15, 2012, 4:42:53 PM
to web...@python.org
Gunicorn can upgrade itself using the USR2 signal, just like nginx,
and shares the socket by passing an fd between OS processes.

However, the case of a db is a little different, since you hold a
connection to a db rather than listening on a port. You will need
either a multiprocess queue passing messages to one process, or to
open a connection per process. You can do that using Gunicorn's hook
system.
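A minimal sketch of that per-process approach, using sqlite3 only as a
stand-in database (the hook names are Gunicorn's; everything else is
illustrative):

    # gunicorn.conf.py
    import sqlite3

    def post_fork(server, worker):
        # Runs in each worker process just after fork(): open one
        # connection per process instead of sharing one across forks.
        worker.db = sqlite3.connect("/tmp/app.db")

    def worker_exit(server, worker):
        # Runs in the worker process as it exits: close its connection.
        db = getattr(worker, "db", None)
        if db is not None:
            db.close()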


- benoît