Celery memory leak?


Axel Hansen

Jan 13, 2011, 7:35:35 PM
to celery...@googlegroups.com
Hi everyone,

I'm running an app using celery that has recently gotten a lot of users. We use celery to run some relatively long tasks for users, and the tasks themselves launch a few threads of their own. We've been seeing celery consume almost all of the system memory, and the problem doesn't seem to be in our own application code. We're running celery 2.1.1 and python 2.5, and had 20 celery processes (now 15, to try to reduce memory usage). Has anyone else seen this type of behaviour, or have any ideas on how to fix it? At its peak, all the celery processes were using over a gig of memory.

Thanks.
Axel

mike

Jan 14, 2011, 12:36:37 AM
to celery-users
2 things:

1) Check that settings.DEBUG = False (for Django). If DEBUG is True,
it will "leak" memory (i.e., hold objects like queries in memory
permanently).

2) That's a lot of processes to be running. If you're consuming data
from feeds or processing images with PIL, that doesn't seem too bad.

axrh

Jan 18, 2011, 9:58:02 PM
to celery-users
We aren't using Django. Does celery perform better with fewer
processes? It looks like only one celery process (I think the main
one) is getting extremely large memory-wise.

Any other ideas?

Axel

Ask Solem

Jan 19, 2011, 9:03:42 AM
to celery...@googlegroups.com

On Jan 19, 2011, at 3:58 AM, axrh wrote:

> We aren't using Django. Does celery perform better with fewer
> processes? It looks like only one celery process (I think the main
> one) is getting extremely large memory-wise.
>


Are you using ghettoq?

--
{Ask Solem,
+47 98435213 | twitter.com/asksol }.

axrh

Jan 20, 2011, 12:28:12 PM
to celery-users
Nope, we're not using ghettoq. Our message broker is rabbitmq.

Axel

Harel Malka

Jan 21, 2011, 9:29:47 AM
to celery...@googlegroups.com
Are you passing fully loaded objects to celery tasks?
If so, pass simple values such as IDs instead and reload the objects in the task.
Also, is your rabbitmq server on the same box as the celery tasks?
Have you considered splitting celery across two separate machines?
I also found that running many instances of celery with a small number
of workers each is somewhat better resource-wise than running one celery
instance with many workers.
Also, see if you can break your bigger tasks into smaller tasks.
For example, I had a task which downloaded a very large data file and
performed a few hundred thousand select/insert/update queries based on
the data. It worked an order of magnitude better when I made the actual
database query its own task and passed simple values to it. I could
then also throttle this task in a separate queue to make it run slower
so the database was not overwhelmed.
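
To illustrate the first suggestion above (pass IDs rather than loaded objects), here is a minimal sketch using only the standard library. `FAKE_DB`, `process_user`, and `message_sizes` are hypothetical names, and `pickle` is only an assumed stand-in for whatever serializer the task messages use. With celery, `process_user` would be registered as a task and called with the ID.

```python
import pickle

# Hypothetical stand-in for a database table keyed by ID.
FAKE_DB = {
    42: {"user_id": 42, "history": ["event %d" % i for i in range(10000)]},
}

def process_user(user_id):
    """Task body: receives only the ID and reloads the record itself."""
    record = FAKE_DB[user_id]
    return len(record["history"])

def message_sizes(user_id):
    """Compare the serialized size of the full record with that of its ID."""
    full_size = len(pickle.dumps(FAKE_DB[user_id]))
    id_size = len(pickle.dumps(user_id))
    return full_size, id_size

if __name__ == "__main__":
    full_size, id_size = message_sizes(42)
    print("full record: %d bytes, ID only: %d bytes" % (full_size, id_size))
    print("events processed: %d" % process_user(42))
```

The ID-only message stays a few bytes no matter how large the record grows, which is why this keeps both the broker and the worker processes lighter.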

If none of these pointers help, could you tell me what your tasks
are doing? What are you feeding your rabbit with?

Harel


Ask Solem

Jan 22, 2011, 6:04:40 PM
to celery...@googlegroups.com

On Jan 20, 2011, at 6:28 PM, axrh wrote:

> Nope, we're not using ghettoq. Our message broker is rabbitmq.
>
> Axel
>

That is very strange; the MainProcess is not known to leak memory. Can you confirm
it's the main process? (Install setproctitle, restart, and check the ps output.)
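
As a rough sketch of that check: once the processes have readable titles, the output of `ps -eo pid,rss,args` can be sorted by resident memory to see which celery process is the large one. The helper `top_by_rss` and the `SAMPLE` text below are hypothetical and invented for illustration; real process titles and numbers will look different.

```python
def top_by_rss(ps_output, limit=5):
    """Parse `ps -eo pid,rss,args` text.

    Returns (rss_kb, pid, command) tuples for celery processes,
    largest resident set size first.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, rss_kb, command = parts
        if "celery" in command:
            rows.append((int(rss_kb), int(pid), command))
    rows.sort(reverse=True)
    return rows[:limit]

# Invented sample output, not taken from a real system.
SAMPLE = """\
  PID   RSS COMMAND
 1201 204800 celeryd: MainProcess
 1202  12288 celeryd: PoolWorker-1
 1203  12100 celeryd: PoolWorker-2
  900   3000 /usr/sbin/sshd
"""

if __name__ == "__main__":
    for rss_kb, pid, command in top_by_rss(SAMPLE):
        print("%8d KB  pid %-6d %s" % (rss_kb, pid, command))
```

On a live machine, the text passed to `top_by_rss` would be the actual output of the `ps` command rather than `SAMPLE`.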

What version of RabbitMQ and Python are you running?
What is your Celery related configuration?
What platform/distribution?

Note that there is a known bug in RabbitMQ 2.2 (maybe in combination
with earlier Erlang versions) where the eventual symptom
is growing memory and CPU load, and I'm advising everyone having issues to downgrade
to 2.1.1 until there is a fix available.

axrh

Feb 15, 2011, 10:17:01 PM
to celery-users
Sorry for the slow response on the memory leak. We've
reimplemented our celery task to use eventlet rather than full threads
in case that was the issue (to be clear, we're not using eventlet as
the pool in celery, but natively inside a celery task). Unfortunately,
celery still seems to use a lot of memory... Most of the 15 processes
are using ~12 MB of memory, but the top 4 use around 200 MB.

Our rabbitmq version is 2.1.1, Celery is 2.1.1, and we're running
python 2.5 on fedora 8 with over 1 GB of memory.

Any ideas? Are there tools like valgrind for python?
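
Not a valgrind equivalent, but as a rough starting point the standard library's `gc` module can list live container objects, and counting them by type before and after running some tasks shows which types keep growing. This is only a sketch: the helper names are hypothetical, it only sees objects tracked by the garbage collector, and it assumes Python 2.7 or later for `collections.Counter` (on 2.5 a plain dict would be needed).

```python
import gc
from collections import Counter

def count_objects_by_type():
    """Return a Counter mapping type name -> number of gc-tracked live objects."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())

def growth(before, after, limit=10):
    """Return the types whose counts grew the most between two snapshots."""
    diff = after - before  # Counter subtraction keeps only positive counts
    return diff.most_common(limit)

if __name__ == "__main__":
    before = count_objects_by_type()
    leaked = [[i] for i in range(1000)]  # simulate a leak: 1000 lists kept alive
    after = count_objects_by_type()
    for type_name, extra in growth(before, after):
        print("%-20s +%d" % (type_name, extra))
```

Taking a snapshot inside a worker every few hundred tasks and comparing it with the previous one is one way to narrow down which objects are accumulating.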

Thanks!

Jean Rougé

Sep 18, 2013, 9:24:19 AM
to celery...@googlegroups.com
Hi Axel,

Sorry for posting on this old thread, but we seem to have the same issue... Any luck with your problems? Did you end up finding a solution?

Thanks!

Jean

Darren Govoni

Dec 12, 2013, 6:05:02 PM
to celery...@googlegroups.com
Hi.

I am also seeing this issue. 

I am using Celery 3.0.22
RabbitMQ 3.0.2
AMQP backend
python 2.7.4
Ubuntu 13.10 64bit


After about 400-500 tasks my memory is nearly gone. I run all servers and services on the same laptop, but I have 8 GB of RAM.