How to gather metrics from Celery tasks


Tim Bell

Feb 6, 2017, 4:12:24 PM
to Prometheus Users
Hi,

I'm looking for a way to gather metrics from within Celery tasks. (I'm not talking about monitoring the Celery queue itself.)

It seems like I'll have to use Pushgateway, but I'm worried about individual Celery workers overwriting each other's metrics. Does someone have an example of how to do this, or is there a better way of scraping stats from tasks that I haven't considered?

Thanks,

Tim

mhernand...@gmail.com

Jul 2, 2018, 6:27:21 PM
to Prometheus Users
Hello there Tim.

I know it's been over a year, but if you could provide some insight into how you figured this out, please let us know.

I'm facing the same problem at the company I'm working for: we want to gather metrics from within Celery tasks, such as timings, error counts, etc.
We are planning to use the Pushgateway for gathering metrics from the Celery tasks, since they are short-lived compared to a long-running daemon.
To keep the key-value pairs separate, we are going to add labels on each metric identifying the name of the job.
I think the pushadd_to_gateway() method provided by prometheus_client has a job='' keyword argument that we can set. Perhaps this will prevent Celery workers from overwriting each other's metrics? I could be wrong, though. Let me know what you think.
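
Something like this is what I have in mind (a rough sketch only; the gateway address, metric name, and worker naming are placeholders, not anything we've settled on):

from prometheus_client import CollectorRegistry, Counter, pushadd_to_gateway

registry = CollectorRegistry()
task_errors = Counter('task_errors', 'Task error count', registry=registry)

def report_errors(worker_name):
    task_errors.inc()
    # job= becomes part of the Pushgateway grouping key, so each
    # worker's series are stored (and overwritten) independently.
    pushadd_to_gateway('localhost:9091', job=worker_name, registry=registry)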

Tim Bell

Jul 2, 2018, 6:54:12 PM
to Prometheus Users
Hi,

On Tuesday, 3 July 2018 08:27:21 UTC+10, mhernand...@gmail.com wrote:

I know it's been over a year, but if you could provide some insight into how you figured this out, please let us know.

Of course. Having posted the question, I meant to also post the answer we settled on, but never quite got around to it.
 
I'm facing the same problem at the company I'm working for: we want to gather metrics from within Celery tasks, such as timings, error counts, etc.
We are planning to use the Pushgateway for gathering metrics from the Celery tasks, since they are short-lived compared to a long-running daemon.
To keep the key-value pairs separate, we are going to add labels on each metric identifying the name of the job.
I think the pushadd_to_gateway() method provided by prometheus_client has a job='' keyword argument that we can set. Perhaps this will prevent Celery workers from overwriting each other's metrics?

That's exactly the approach we took. Here are some code snippets to illustrate.


from functools import lru_cache

from billiard import current_process
from prometheus_client import Counter, CollectorRegistry, push_to_gateway

# If you also have batch Celery tasks, use a separate registry for those.
registry = CollectorRegistry()

# Each metric should be defined against the above registry, like this:
count_retries = Counter(
    'retries', 'Processing retries', ['task'], registry=registry,
)


@lru_cache(maxsize=2)
def worker_id():
    """
    current_process() doesn't have an "index" attribute at import time;
    use a function to return it at runtime, and cache the result, since
    it doesn't change for a given worker.
    """
    try:
        index = current_process().index
    except AttributeError:
        # Management commands also don't have "index" defined.
        index = 0
    return 'worker-%d' % index


def push_stats():
    """
    Call this function at the end of each Celery task; if you have
    batch tasks, have another function push those to the batch registry.
    """
    # Push our stats, identified by our worker_id, so that workers
    # don't overwrite each other's metrics in the Pushgateway.
    push_to_gateway('localhost:9091', job=worker_id(), registry=registry)
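
For completeness, a task would then look roughly like this (a hypothetical task for illustration; the Celery app setup is not shown):

from celery import shared_task

@shared_task(bind=True)
def process_item(self, item_id):
    try:
        ...  # whatever the task actually does
    except Exception:
        count_retries.labels(task='process_item').inc()
        raise self.retry(countdown=10)
    finally:
        push_stats()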


 
I hope this helps.

Cheers,

Tim

mhernand...@gmail.com

Jul 3, 2018, 1:03:20 PM
to Prometheus Users
Thanks Tim!  I appreciate it!

mhernand...@gmail.com

Jul 9, 2018, 1:15:30 PM
to Prometheus Users
Why use a separate registry for Celery batch tasks?

