Problems stopping celeryd

Zach Smith

Mar 21, 2011, 5:50:54 PM
to celery...@googlegroups.com

I'm running celeryd as a daemon, but I sometimes have trouble stopping it gracefully. When I send the TERM signal (in this case via service celeryd stop) while there are items in the queue, celeryd stops taking new jobs and shuts down all the worker processes. The parent process, however, won't shut down.
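For reference, this is roughly what the stop boils down to on my box; the pidfile path is just a guess for my layout, so treat it as an assumption:

# Rough sketch of a manual graceful stop (pidfile path is an assumption).
import os
import signal
import time

PIDFILE = "/var/run/celeryd.pid"  # wherever your init script writes it

with open(PIDFILE) as f:
    pid = int(f.read().strip())

os.kill(pid, signal.SIGTERM)   # ask celeryd for a warm shutdown

# Poll for up to 30s; the master should exit once active tasks return.
for _ in range(30):
    time.sleep(1)
    try:
        os.kill(pid, 0)        # signal 0 only checks that the process exists
    except OSError:
        print("celeryd master exited")
        break
else:
    print("master still alive; reserved messages not released yet")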

I've just run into a scenario where I had celeryd running on two separate worker machines: A and B. With about 1000 messages on the RabbitMQ server, I shut down A and experienced the situation I've explained above. B continued to work, but then stalled with about 40 messages left on the server. I was, however, able to stop B correctly.

I restarted B to see if it would take the 40 items off the queue, but it would not. Next, I hard-killed A, after which B grabbed and completed the tasks.

My conclusion is that the parent process has reserved the 40 items from our RabbitMQ server for its children. It will reap the children correctly, but will not release the items back to RabbitMQ unless I manually kill it.
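If my theory is right, I assume the number of reserved messages comes from the consumer prefetch; a celeryconfig sketch of the knob I believe controls it in 2.2 (so take the exact behaviour with a grain of salt):

# celeryconfig.py (sketch)
CELERYD_CONCURRENCY = 8

# As I understand it, prefetch = CELERYD_PREFETCH_MULTIPLIER * concurrency,
# so with the default multiplier of 4 the parent can hold dozens of messages;
# setting it to 1 keeps the reservation small.
CELERYD_PREFETCH_MULTIPLIER = 1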

Has anyone experienced something similar?

I'm running Celery 2.2.2

http://stackoverflow.com/questions/5045687/problems-stopping-celeryd

Ask Solem Hoel

Mar 22, 2011, 10:29:49 AM
to celery...@googlegroups.com

On Mar 21, 2011, at 10:50 PM, Zach Smith wrote:

> I'm running celeryd as a daemon, but I sometimes have trouble stopping it gracefully. When I send the TERM signal and there are items in the queue (in this case service celeryd stop) celeryd will stop taking new jobs, and shut down all the worker processes. The parent process, however, won't shut down.
>
> I've just ran into a scenario where I had celeryd running on two separate worker machines: A and B. With about 1000 messages on the RabbitMQ server, I shut down A, and experienced the situation I've explained above. B continued to work, but then stalled with about 40 messages left on the server. I was however, able to stop B correctly.
>
> I restarted B, to see if it would take the 40 items off the queue, but it would not. Next, I hard killed A, after which B grabbed and completed the tasks.
>
> My conclusions is that the parent process has reserved the 40 items from our RabbitMQ server for its children. It will reap the children correctly, but will not release the items back to RabbitMQ unless I manually kill it.
>

celeryd will not shut down until all active tasks have been processed, where active
means tasks that have actually been started (not all reserved tasks).

The reserved messages will be released and redelivered once the connection channel
is closed. This happens after the active tasks have returned.

If you don't have a --time-limit enabled, celeryd will never kill your tasks at shutdown, even
if they take DAYS to complete.
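
For example, a minimal celeryconfig sketch with a time limit (setting name as in the 2.x series; the value is arbitrary):

# celeryconfig.py (sketch) -- hard time limit in seconds; the worker
# process running the task is killed when it is exceeded.
CELERYD_TASK_TIME_LIMIT = 300

# There is also a soft variant (CELERYD_TASK_SOFT_TIME_LIMIT) that raises
# an exception inside the task instead of killing the process.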


There is currently a bug where CELERY_DISABLE_RATE_LIMITS can make celeryd
freeze at shutdown: https://github.com/ask/celery/issues#issue/264
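
So if your celeryconfig has something like the line below, that could be what is holding up the shutdown until the bug is fixed:

# celeryconfig.py (sketch) -- with the bug above, enabling this can make
# celeryd hang at shutdown; the default (False) avoids that code path.
CELERY_DISABLE_RATE_LIMITS = True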

--
{Ask Solem,
+47 98435213 | twitter.com/asksol }.

Zach Smith

Mar 22, 2011, 12:02:33 PM
to celery...@googlegroups.com, Ask Solem Hoel

Ask,

I'm seeing all active tasks finish, but the celeryd master never seems to close the connection unless I manually kill it.

I will also look at CELERY_DISABLE_RATE_LIMITS and see if it is affecting anything.

Thanks



