Celeryd subprocesses seem to consume a lot of memory together even when no DAGs are running

Bob Muscovite

Oct 6, 2020, 4:43:29 AM

to cloud-composer-discuss

Greetings,

I have been encountering this known issue in Composer on and off. In the process, I noticed that my Composer GKE cluster, which is set to autoscale between 1 and 6 n1-standard-1 nodes, never seems to scale down below 6 nodes.

Running kubectl top nodes, I see that each node always hovers at around 70% RAM usage, even when idle (i.e. no DAGs are running)!

Investigating further, I narrowed the high RAM usage down mostly to the airflow-worker deployment, whose 6 pods together take 8 GB of RAM (airflow-scheduler is also quite hungry, but only at 600 MB total).

Once I logged into a pod belonging to the deployment, I identified the culprit: there are very many celeryd subprocesses that each use about 119.96 MB of RAM. They show up in the process list with names of the form:

[celeryd: celery@airflow-worker-776fb6f5b7-sl8h2:ForkPoolWorker-7]

Each airflow-worker pod can have around a dozen of these subprocesses at any given time, if not more, all eating memory, in a Cloud Composer GKE cluster backing an Airflow deployment that is mostly dormant in terms of DAG runs. What is going on?
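
For anyone who wants to reproduce the per-pod tally described above, here is a minimal Python sketch. It assumes you have captured the output of `ps -eo rss,args` inside a worker pod (RSS is reported in KiB); the function name `celeryd_memory` and the sample text are hypothetical, not something from the original post.

```python
from typing import Tuple


def celeryd_memory(ps_output: str) -> Tuple[int, float]:
    """Count celeryd processes and sum their RSS in MiB.

    Expects the text produced by `ps -eo rss,args`, where the first
    column is the resident set size in KiB.
    """
    count = 0
    total_kib = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header row and anything that is not "<number> <command>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        rss_kib, command = parts
        if "celeryd:" in command:
            count += 1
            total_kib += int(rss_kib)
    return count, total_kib / 1024


# Hypothetical sample output, for illustration only.
sample = """\
   RSS COMMAND
122880 [celeryd: celery@airflow-worker-776fb6f5b7-sl8h2:ForkPoolWorker-7]
122880 [celeryd: celery@airflow-worker-776fb6f5b7-sl8h2:ForkPoolWorker-8]
  3072 bash
"""
count, total_mib = celeryd_memory(sample)
print(count, total_mib)
```

Two caveats: forked pool workers share pages with their parent, so summing RSS likely overstates the real footprint, and the figures in the sample are made up. As a rough cross-check of the numbers in the post, 6 pods × about 12 processes × about 120 MB comes to roughly 8.6 GB, which is in line with the 8 GB observed for the deployment.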

Bob Muscovite

Oct 6, 2020, 4:49:52 AM

to cloud-composer-discuss

I think this default setting must be either the culprit, or pretty close to it.

Bob Muscovite

Oct 18, 2020, 9:25:04 AM

to cloud-composer-discuss

It seems the concurrency does not matter as much as the autoscaling settings. These actually affect the number of processes and hence memory consumption.
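
To put rough numbers on this, here is a small Python sketch. It assumes the autoscaling setting in question is Airflow's `[celery] worker_autoscale` option, whose value has the form "max,min", and that an idle worker keeps roughly the minimum number of pool processes alive. The helper names, the "16,12" and "6,2" values, and the 120 MB per-process figure are illustrative assumptions, not the actual configuration of this environment.

```python
from typing import Tuple


def parse_worker_autoscale(value: str) -> Tuple[int, int]:
    """Parse a "max,min" worker_autoscale string into (max, min)."""
    max_str, min_str = value.split(",")
    max_procs, min_procs = int(max_str), int(min_str)
    if min_procs > max_procs:
        raise ValueError("minimum must not exceed maximum")
    return max_procs, min_procs


def idle_memory_mb(min_procs: int, pods: int, mb_per_proc: float = 120.0) -> float:
    """Estimate idle memory, assuming each pod keeps min_procs processes alive."""
    return min_procs * pods * mb_per_proc


# Illustrative values only: 6 worker pods, about 120 MB per process.
for setting in ("16,12", "6,2"):
    max_procs, min_procs = parse_worker_autoscale(setting)
    print(setting, idle_memory_mb(min_procs, pods=6), "MB idle (estimate)")
```

Under these assumptions, lowering the minimum is what shrinks the idle footprint, which would fit the observation that halving the concurrency alone changed little.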

On Friday, October 16, 2020 at 3:55:27 PM UTC+2 boris....@easyfairs.com wrote:

I have halved the concurrency to 8, but I still see lots of celeryd processes and the memory footprint is still large; in fact, I am not sure it changed much. Any advice on how to dig into this?
