Worker restart like in mod_wsgi


Thomas Güttler

Mar 28, 2013, 5:37:52 AM
to celery...@googlegroups.com
Hi,

I like the way mod_wsgi restarts its workers (in daemon mode).

You need to "touch" a file, and all new request get the code. The python reload() method is *not* used.
It is like starting a new python interpreter.

I like this, because running tasks/requests are not interrupted, and all new tasks/requests get the new code in a way
suitable for production servers. It is a graceful restart without downtime.

Is it possible to implement this?

Of course this would add one more system call before executing each task: checking the mtime of one file. But
that is not a problem in my environment.
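
To make this concrete, here is a rough sketch of the check I have in mind (the task_prerun signal is real Celery API; the trigger file path, handler name, and restart step are just placeholders I made up):

{{{
# Hypothetical sketch: stat() a trigger file before each task runs.
import os

from celery.signals import task_prerun

TRIGGER_FILE = '/etc/celery/restart.trigger'  # made-up path
_last_mtime = os.path.getmtime(TRIGGER_FILE)

@task_prerun.connect
def check_trigger(**kwargs):
    global _last_mtime
    mtime = os.path.getmtime(TRIGGER_FILE)  # the one extra system call
    if mtime != _last_mtime:
        _last_mtime = mtime
        # Here the worker would have to arrange a graceful restart,
        # which is exactly the part Celery does not provide today.
}}}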

mod_wsgi daemon mode can be compared to Celery workers. Here is how mod_wsgi does it:

http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode

{{{

Reloading In Daemon Mode

If using mod_wsgi daemon mode, what happens when the script file is changed is different to what happens in embedded mode. In daemon mode, if the script file changed, rather than just the script file being reloaded, the daemon process which contains the application will be shutdown and restarted automatically.

Detection of the change in the script file will occur at the time of the first request to arrive after the change has been made. The way that the restart is performed does not affect the handling of the request, with it still being processed once the daemon process has been restarted.

In the case of there being multiple daemon processes in the process group, then a cascade effect will occur, with successive processes being restarted until the request is again routed to one of the newly restarted processes.

In this way, restarting of a WSGI application when a change has been made to the code is a simple matter of touching the script file if daemon mode is being used. Any daemon processes will then automatically restart without the need to restart the whole of Apache.

So, if you are using Django in daemon mode and needed to change your 'settings.py' file, once you have made the required change, also touch the script file containing the WSGI application entry point. Having done that, on the next request the process will be restarted and your Django application reloaded.

}}}

Thomas Güttler

Ask Solem

Mar 28, 2013, 11:16:27 AM
to celery...@googlegroups.com

On Mar 28, 2013, at 9:37 AM, Thomas Güttler <h...@tbz-pariv.de> wrote:

> Hi,
>
> I like the way mod_wsgi restarts its workers (in daemon mode).
>
> You need to "touch" a file, and all new requests get the new code. Python's reload() is *not* used.
> It is like starting a new python interpreter.
>
> I like this, because running tasks/requests are not interrupted, and all new tasks/requests get the new code in a way
> suitable for production servers. It is a graceful restart without downtime.
>
> Is it possible to implement this?
>
> Of course this would add one more system call before executing each task: checking the mtime of one file. But
> that is not a problem in my environment.
>
> mod_wsgi daemon mode can be compared to Celery workers. Here is how mod_wsgi does it:
>
> http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode
>

Sounds doable for anyone willing to experiment with a patch.

Possible caveats:

- It would have to depend on sys.argv for the command-line path and arguments
to start a new worker.

- The current directory may have changed since startup (the --workdir argument).

- Privileges may have been dropped after the --uid and --gid arguments were applied.


You can change back to the previous directory, but the current user/group may
no longer have the same permissions, and there's nothing to do about that
short of having a dedicated process to do the restarting.
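
For illustration, the bare-bones re-exec could look something like this (untested sketch, the names are mine):

{{{
import os
import sys

ORIGINAL_CWD = os.getcwd()  # would have to be captured at startup,
                            # before any --workdir change takes effect

def restart_using_argv():
    os.chdir(ORIGINAL_CWD)
    # NOTE: privileges dropped via --uid/--gid cannot be regained here.
    os.execv(sys.executable, [sys.executable] + sys.argv)
}}}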


Also, while I think the idea is neat, I'm not sure how useful it will be.
Most people do not change the files directly, instead they have a deployment
procedure that also includes restarting workers. I guess you are considering this
mostly for development?

--
Ask Solem
twitter.com/asksol

joshj

Mar 29, 2013, 10:54:57 AM
to celery...@googlegroups.com
> Also, while I think the idea is neat, I'm not sure how useful it will be.
> Most people do not change the files directly, instead they have a deployment
> procedure that also includes restarting workers. I guess you are considering this
> mostly for development?

The point is that in production we would NOT have to restart workers. This means we would NOT have to worry about safely shutting down workers and existing tasks. This would be great for tasks that are non-trivial: we would NOT have to wait until the tasks complete, nor worry about a task being killed halfway through a worker shutdown and having to resume it.

Thus, it makes great sense not only for development, but also for PRODUCTION.

Thomas Güttler

Apr 2, 2013, 3:09:22 AM
to celery...@googlegroups.com


On Friday, March 29, 2013 at 3:54:57 PM UTC+1, joshj wrote:
> > Also, while I think the idea is neat, I'm not sure how useful it will be.
> > Most people do not change the files directly, instead they have a deployment
> > procedure that also includes restarting workers. I guess you are considering this
> > mostly for development?

> The point is that in production we would NOT have to restart workers. This means we would NOT have to worry about safely shutting down workers and existing tasks. This would be great for tasks that are non-trivial: we would NOT have to wait until the tasks complete, nor worry about a task being killed halfway through a worker shutdown and having to resume it.


I don't understand you. Sooner or later you need to restart workers in production. Of course you restart workers more often in development.

I guess most people write their code so that tasks don't run for a long time. A graceful restart that leaves
currently running tasks running would be a good restart, at least in my environment.
 

> Thus, it makes great sense not only for development, but also for PRODUCTION.


Yes, that is my opinion, too.

Unfortunately I can't provide a patch in the next few weeks. But if someone else can, I will have time to test it.

  Thomas

Ask Solem

Apr 2, 2013, 7:43:01 AM
to celery...@googlegroups.com

On Mar 29, 2013, at 2:54 PM, joshj <josh.j...@gmail.com> wrote:

> Also, while I think the idea is neat, I'm not sure how useful it will be.
> Most people do not change the files directly, instead they have a deployment
> procedure that also includes restarting workers. I guess you are considering this
> mostly for development?
>
> The point is that in production we would NOT have to restart workers. This means we would NOT have to worry about safely shutting down workers and existing tasks. This would be great for tasks that are non-trivial: we would NOT have to wait until the tasks complete, nor worry about a task being killed halfway through a worker shutdown and having to resume it.
>
> Thus, it makes great sense not only for development, but also for PRODUCTION.
>
>

You would still have to worry about workers restarting, and that they do so safely and that currently executing tasks complete in a timely manner.

The only difference here is that it would happen on any file change,
does that really make it better?



We currently have an experimental HUP handler, but it has been very prone to failure.
The safest way I have found so far is to do:

import atexit  # plus a restart_using_argv along the lines of the earlier os.execv sketch

atexit.register(restart_using_argv)
raise SystemExit()

But since everything happens in the same process, there is no protection
against failure here, and many things can go wrong between these steps.
The only way to solve that is to have a dedicated
process that monitors the worker.
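
Such a dedicated process could be as dumb as this sketch (the exit-code convention and the 'proj' app name are invented, not something Celery defines):

{{{
# Hypothetical watchdog: re-spawn the worker whenever it exits
# asking for a restart.
import subprocess

RESTART_EXIT_CODE = 64  # invented convention

while True:
    code = subprocess.call(['celery', 'worker', '-A', 'proj'])
    if code != RESTART_EXIT_CODE:
        break  # normal shutdown or crash: don't loop forever
}}}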


Theoretically it should be unnecessary to restart the MainProcess to
reload task code, or even to add new tasks. It would be a rather complex
problem to fix, but it is definitely possible to reload tasks just by restarting
the child pool worker processes (and not using Python's reload()).
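
As a starting point, there is already an experimental remote control command that restarts just the pool processes. From memory it requires the CELERYD_POOL_RESTARTS setting to be enabled, and would be invoked roughly like this:

{{{
# Sketch: restart only the pool processes, re-importing the task modules.
# Requires CELERYD_POOL_RESTARTS = True on the workers.
from celery import Celery

app = Celery('proj', broker='amqp://')  # placeholder app and broker

app.control.broadcast('pool_restart', arguments={'reload': True})
}}}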

--
Ask Solem
twitter.com/asksol

joshj

Apr 2, 2013, 8:40:52 PM
to celery...@googlegroups.com
> You would still have to worry about workers restarting, and that they do so safely and that currently executing tasks complete in a timely manner.
> The only difference here is that it would happen on any file change,
> does that really make it better?

My point was that in-progress tasks keep executing. Once a code change is detected, new tasks use the new code.

So what do you mean by "You would still have to worry about workers restarting, and that they do so safely and that currently executing tasks complete in a timely manner"? Why is the worker restarting?

Yes, it is better. I just gave a reason why. So what is your question? We have long-running tasks. We deploy new code. We want new tasks to use the new code and existing running tasks to complete. We don't want to stop a task midway just because we push new code.

YES, it really does make it better.


> The only way to solve that is to have a dedicated
> process that monitors the worker.

I already asked in a separate email how to do monitoring with Celery. But since you want to mix the two issues: how does one do monitoring with Celery? Are you saying there is no monitoring for workers? And why is this related to deploying new code and having new tasks use the new code?

It would help if you said that the Celery architecture is limited, that you don't have time, or gave some other reasonable answer instead of skirting the question by asking "Is this really better?". Obviously it is. Common sense says so, and we now have two different people on this thread asking for it. Not to mention that Python has reloading, and so does mod_wsgi.

Now the question is: what does it take to make it happen?

George Reilly

Oct 30, 2017, 7:54:58 PM
to celery-users
[Reviving an old thread]

I'd like to use a mixture of [etcd](https://coreos.com/etcd/) and [confd](https://github.com/kelseyhightower/confd) to distribute config updates to our *production* services. If I'm using Nginx + uWSGI, then I can use `uwsgi --touch-chain-reload CONFIG_FILE` to restart my uWSGI worker processes, one at a time, after confd has updated CONFIG_FILE.

I'd like to be able to recycle our Celery worker processes in a similar manner.  We're using `worker_max_tasks_per_child=200` so every worker process in the cluster will eventually age out and pick up the new config file. We're also using [pyramid-celery](https://github.com/sontek/pyramid_celery) to load config from an INI file.

Is there a way to gracefully and promptly recycle the worker processes?
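
The closest I've found is the `pool_restart` broadcast; something like this rolling version is roughly what I have in mind (just a sketch, assuming `pool_restart` still exists in current Celery and `worker_pool_restarts` is enabled; the app setup and pacing are made up):

{{{
# Rough sketch of a chained recycle, one worker node at a time,
# in the spirit of uwsgi --touch-chain-reload.
import time

from celery import Celery

app = Celery('proj')  # placeholder; ours is configured via pyramid_celery

replies = app.control.ping(timeout=2.0)  # discover live worker nodes
nodes = [name for reply in replies for name in reply]

for node in nodes:
    app.control.broadcast('pool_restart',
                          arguments={'reload': True},
                          destination=[node])
    time.sleep(5)  # crude pacing between nodes
}}}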

/George Reilly