unable to start a background process using the development server

stevecrozz

Nov 13, 2009, 3:15:00 PM
to Django developers
I tried to reopen this bug http://code.djangoproject.com/ticket/9286,
but apparently that's a no-no since it was closed as invalid by a core
developer.

I'd like someone to take another look at it since I think Jacob didn't
understand the bug because the original poster's example was
ambiguous. I supplied my own, more concise example which I'll outline
here. Use the development server to serve any project where a view
tries to start a background process using the subprocess module:

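import subprocess

from django.http import HttpResponse

def my_view(request):
    # Popen only launches the process; it does not wait for it to finish.
    subprocess.Popen(['/bin/sleep', '5'])
    return HttpResponse(u'That sure took a while!')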

This example should start a background process which would run for 5
seconds. Since it's a background process, it should not block, but it
still does. For comparison, try the same call from the CLI:

$ python -i
>>> import subprocess
>>> subprocess.Popen(['/bin/sleep', '30'])
<subprocess.Popen object at 0xb77c606c>

This example does not block; it properly starts the background process
and returns its Popen object immediately.

Also, I tried to send this message a few days ago but I don't see that
it ever made it to the list. Hopefully I'm not double-posting.

--stevecrozz

Russell Keith-Magee

Nov 13, 2009, 10:29:45 PM
to django-d...@googlegroups.com
On Sat, Nov 14, 2009 at 4:15 AM, stevecrozz <steve...@gmail.com> wrote:
> I tried to reopen this bug http://code.djangoproject.com/ticket/9286,
> but apparently that's a no-no since it was closed as invalid by a core
> developer.
>
> I'd like someone to take another look at it since I think Jacob didn't
> understand the bug because the original poster's example was
> ambiguous. I supplied my own, more concise example which I'll outline
> here. Use the development server to serve any project where a view
> tries to start a background process using the subprocess module:
>
> import subprocess
>
> def my_view(request):
>    subprocess.Popen(['/bin/sleep', '5'])
>    return HttpResponse(u'That sure took a while!')
>
> This example should start a background process which would run for 5
> seconds. Since its a background process, it should not block (but it
> still does), try it from the cli:
>
> $ python -i
> >>> import subprocess
> >>> subprocess.Popen(['/bin/sleep', '30'])
> <subprocess.Popen object at 0xb77c606c>
>
> This example does not block, it properly starts the background process
> and returns its Popen object immediately.

Ah - but so does your example. Try putting some debug output in your original view:

def my_view(request):
    subprocess.Popen(['/bin/sleep', '5'])
    print("Hi there!")  # appears on the console straight away
    return HttpResponse(u'That sure took a while!')

When you request this view, the call to subprocess doesn't block - the
print statement is issued immediately. However, the web server process
will wait until all subprocesses have completed before it serves the
response to the browser.

The fact that it is waiting for the process to finish doesn't block
the server either - it can serve other requests while it is waiting
for the first request to complete its subprocesses.
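
A quick way to check the "Popen doesn't block" half of this outside
Django is to time the Popen call itself. This is only a sketch: the
helper name time_popen is made up for illustration, and it uses the
running Python interpreter as the child instead of /bin/sleep so it
does not depend on a particular path. It says nothing about when the
web server sends the response, only about when Popen returns.

```python
import subprocess
import sys
import time


def time_popen(child_seconds):
    """Start a child that sleeps for child_seconds and time the Popen call.

    Hypothetical helper; returns (elapsed_seconds, still_running, proc).
    """
    command = [sys.executable, "-c",
               "import time; time.sleep(%s)" % child_seconds]
    started = time.time()
    proc = subprocess.Popen(command)     # returns once the child is launched
    elapsed = time.time() - started
    still_running = proc.poll() is None  # None means the child has not exited
    return elapsed, still_running, proc


if __name__ == "__main__":
    elapsed, still_running, proc = time_popen(1)
    print("Popen returned after %.3f seconds" % elapsed)
    print("child still running: %s" % still_running)
    proc.wait()  # reap the child so it does not linger as a zombie
```

If Popen were the call that blocked, the elapsed time would be at least
as long as the child's sleep and the child would already have exited.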

Personally, I agree with Jacob - I don't see this as much of a
problem, for several reasons.

Firstly, Django's development server isn't multithreaded, and it isn't
intended for production use. I'm almost in favour of having
performance problems like this in the development server, specifically
to act as a discouragement for production use.

Secondly, while I haven't actually tested subprocess behaviour on
"real" webservers, I'd be willing to bet that at least some
implementations will do exactly the same thing that Django's webserver
is doing. After all, all we're doing here is *not* making the
subprocesses daemons by default.

Lastly, if your application is relying on HTTP to kick off
orphan/daemon subprocesses to do some background processing, I
strongly suspect that UR DOIN IT WRONG. HTTP requests are supposed to
be short-lived. Spawning long-lived behaviour out of a short-lived
request isn't really a good idea. If you want a web request to spawn
some long-lived processing, you should probably be looking at
implementing a Message Queue for your app. There are plenty of options
out there, ranging from a ghetto queue (write a "job" to the database,
and have a cron job to pick up and run the jobs) all the way up to
all-singing, all-dancing projects with graphical management interfaces
etc.
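
The database-backed queue could look roughly like the following. This
is a minimal sketch under several assumptions: it uses the standard
library's sqlite3 rather than Django's ORM, the table, column and
function names are all made up, and each job is stored as a
JSON-encoded argument list. The view would call enqueue() and return
at once; a script run from cron would call run_pending().

```python
import json
import sqlite3
import subprocess
import sys


def open_queue(path):
    """Open the job database and create the (hypothetical) jobs table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " argv TEXT NOT NULL,"
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def enqueue(conn, argv):
    """Called from the view: record the job and return immediately."""
    cur = conn.execute("INSERT INTO jobs (argv) VALUES (?)",
                       (json.dumps(argv),))
    conn.commit()
    return cur.lastrowid


def run_pending(conn):
    """Called from cron: run every job not yet marked done, oldest first."""
    rows = conn.execute(
        "SELECT id, argv FROM jobs WHERE done = 0 ORDER BY id"
    ).fetchall()
    for job_id, argv in rows:
        # Waiting here is fine: no HTTP request is held open by this script.
        subprocess.call(json.loads(argv))
        conn.execute("UPDATE jobs SET done = 1 WHERE id = ?", (job_id,))
        conn.commit()
    return len(rows)


if __name__ == "__main__":
    queue = open_queue(":memory:")
    enqueue(queue, [sys.executable, "-c", "print('job ran')"])
    print("jobs run: %d" % run_pending(queue))
```

Nothing here handles two cron runs overlapping or a job that fails; a
real queue would need both.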

Yours,
Russ Magee %-)

Stephen Crosby

Nov 14, 2009, 12:01:53 AM
to django-d...@googlegroups.com
Thanks for the response Russell,

We can get into exactly what I'm doing if we need to, but for now let's just assume that I've thought this through fairly thoroughly and I do in fact want to start background processes from certain very specific web requests. We can certainly get into the details if I have to defend my rationale.

Here's the outline of what I'm doing...
1) Make an Ajax request to the Django view
2) Check to see if a process is running (lockfile)
   a) it's running: return response A
   b) it's not running: start it and return response B
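
Step 2 of the outline above can be sketched as follows. The function
name and the "A"/"B" return values are placeholders, and the sketch
assumes the background command deletes the lock file itself when it
finishes. It creates the lock with O_CREAT | O_EXCL so that checking
for the file and creating it happen in one step, which avoids two
requests both deciding that nothing is running.

```python
import os
import subprocess
import sys
import tempfile


def start_if_not_running(lock_path, command):
    """Return ("A", None) if the lock file exists, otherwise start command
    and return ("B", proc).

    Hypothetical helper; the command is assumed to remove lock_path on exit.
    """
    try:
        # Fails with FileExistsError if another request created the lock first.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return "A", None          # already running
    os.close(fd)
    try:
        proc = subprocess.Popen(command)
    except OSError:
        os.remove(lock_path)      # could not start, so leave no stale lock
        raise
    return "B", proc


if __name__ == "__main__":
    lock = os.path.join(tempfile.mkdtemp(), "job.lock")
    # Stand-in background job: sleep briefly, then release the lock.
    job = [sys.executable, "-c",
           "import os, sys, time; time.sleep(0.5); os.remove(sys.argv[1])",
           lock]
    print(start_if_not_running(lock, job)[0])   # first request starts the job
    print(start_if_not_running(lock, job)[0])   # second request finds the lock
```

A job that crashes before removing its lock would leave it behind, so
a real version needs some way to detect stale locks.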

I'll let Jacob speak for himself if he wants to, but he said, "I'm not sure this is a bug at all. That is, Popen.communicate() is supposed to block". Since his rationale for this not being a bug is that Popen.communicate() is supposed to block, he's left open the possibility of this being a bug. So both of us agree with Jacob right now.

I'll respond to some of your other arguments also.

The development server isn't for production use. This is true, and I can live with that, but it makes this type of development more cumbersome (this is something I'd want to test in development before I deploy it). If it's a huge design change then maybe it's not worth the effort to carry out, but I'm not convinced yet. Maybe the development server can't or shouldn't be made to fit my needs, but I'd still like to look at that rationale more closely and add that information to the bug report once the community comes to a fuller decision.

I'm not sure I understand the second argument about single-threadedness yet. Aren't we talking about multiple processes and not multiple threads? Can't one process start another process without needing multiple threads?

As for the third, sure there are many other ways to do what I want to do, but I have certain rationale for choosing the way I have. Of course I agree that HTTP requests are to be short lived. That's why I've opened up this discussion. Currently, using subprocess with the development server takes a very long time. I plan to run certain background tasks that will not be part of the HTTP response so there's no reason to wait for them. There could also be other reasons someone might want to start background processes from django. I'm betting there are since there are a few other related bug reports although they're all pretty light on details.

If anyone has a better idea for starting background processes from django, I'd be happy to consider them for my project.

--Stephen






Russell Keith-Magee

Nov 14, 2009, 1:32:50 AM
to django-d...@googlegroups.com
On Sat, Nov 14, 2009 at 1:01 PM, Stephen Crosby <steve...@gmail.com> wrote:
> Thanks for the response Russell,
>
> We can get into exactly what I'm doing if we need to, but for now lets just
> assume that I've thought this through fairly thoroughly and I do in fact
> want to start background processes from certain very specific web requests.
> We can certainly get into the details if I have to defend my rationale.

Ultimately, you're the person who has to live with your code, so I'll
defer to your judgement. However, put me on record as officially
advising against it :-)

> The development server isn't for production use. This is true, and I can
> live with that, but it makes this type of development more cumbersome (this
> is something I'd want to test in development before I deploy it), if its a
> huge design change then maybe its not worth the effort to carry out, but I'm
> not convinced yet. Maybe the development server can't or shouldn't be made
> to fit my needs, but I'd still like to look at that rationale more closely
> and add that information to the bug report once the community comes to a
> more full decision.

Django's development server isn't doing anything specific here with
regard to subprocesses. We're just providing a WSGI handler to
Python's BaseHTTPServer. Waiting for a subprocess to terminate before
terminating a parent process is not uncommon behaviour, and that is
exactly what BaseHTTPServer is doing in this case. There might be ways
to fix this, but I doubt they will be simple.

I would also like to repeat a comment from my last mail - have you
confirmed that your deployment platform of choice will actually
support the approach you are taking here? I haven't tested this
myself, and I can't find any obvious documentation to this effect,
but I would not be in the least bit surprised to find that a
long-lived subprocess is a one-way path to server resource starvation.

> I'm not sure I understand the second argument about single-threadedness yet.
> Aren't we talking about multiple processes and not multiple threads? Can't
> one process start another process without needing multiple threads?

I raised this point because historically, problems with multiple/long
lived requests in the Django development server are closely followed
by requests to make the development server multithreaded. The Django
core has repeatedly rejected these suggestions, specifically because
we don't want Django's development server to be used in production,
and this is one way to make sure that this request is honoured.

I was trying to point out that in this case, while you are waiting for
the subprocess to complete, you actually can serve other responses. It
isn't locking anything - it's just waiting for the subprocess to
complete before the server completes the response.

> If anyone has a better idea for starting background processes from django,
> I'd be happy to consider them for my project.

Sure - and my original response contains 2 of them:

1) Use a message queue :-)
2) Start your subprocess in daemon mode.
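
The subprocess module has no single "daemon" switch, so the following
is only one reading of option 2: give the child its own session and
make sure it shares none of the server's file descriptors. It is a
sketch for POSIX systems (start_new_session is not available on
Windows), the function name is made up, and nobody in this thread has
confirmed that it stops the development server from delaying the
response.

```python
import os
import subprocess
import sys


def start_detached(command):
    """Start command in its own session with no inherited descriptors.

    Hypothetical helper; POSIX only because of start_new_session.
    """
    return subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,    # do not share the server's stdin
        stdout=subprocess.DEVNULL,   # or its stdout and stderr
        stderr=subprocess.DEVNULL,
        close_fds=True,              # do not inherit other open descriptors
        start_new_session=True,      # child calls setsid() before exec
    )


if __name__ == "__main__":
    proc = start_detached([sys.executable, "-c",
                           "import time; time.sleep(0.5)"])
    print("child pid: %d" % proc.pid)
    print("child session id: %d" % os.getsid(proc.pid))
    proc.wait()
```

The parent still has to call wait() or poll() at some point, otherwise
the finished child stays around as a zombie until the parent exits.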

Yours,
Russ Magee %-)