Hello,
I am new to Prometheus, so apologies if this has already been discussed, but I didn't find anything from digging through the list and the web.
I would like some suggestions on the recommended way of sending metrics to Prometheus from a set of Kafka consumers.
A simplified view of what I am trying to track is the following:
I have a set of Kafka consumers, i.e. C1, C2 and C3.
C1, C2 and C3 consume messages from different topics; each consumer is triggered by some of the messages and performs a task.
What I want to track is the number of messages each consumer has processed and the time it takes to process each message.
At the moment I am using the Pushgateway option (from the Java client), using the consumer name as the job name, e.g.
pushAdd(registry,"C1") from C1
pushAdd(registry,"C2") from C2
pushAdd(registry,"C3") from C3
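In case it helps, each consumer does roughly the following (a simplified sketch; the metric names and the Pushgateway address are placeholders, not our real ones):

```java
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.Counter;
import io.prometheus.client.Histogram;
import io.prometheus.client.exporter.PushGateway;

public class ConsumerMetrics {
    public static void main(String[] args) throws Exception {
        CollectorRegistry registry = new CollectorRegistry();
        Counter processed = Counter.build()
                .name("messages_processed_total")
                .help("Number of messages processed.")
                .register(registry);
        Histogram duration = Histogram.build()
                .name("message_processing_seconds")
                .help("Time spent processing a message.")
                .register(registry);

        // ... for each consumed message:
        Histogram.Timer timer = duration.startTimer();
        try {
            // process the message here
        } finally {
            timer.observeDuration();
            processed.inc();
        }

        // push everything under the consumer's name as the job
        PushGateway pg = new PushGateway("pushgateway:9091");
        pg.pushAdd(registry, "C1");
    }
}
```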
Now, because each consumer is stateless and I want to speed up processing, I create a set of replicas, i.e. 5 replicas of C1, 5 of C2 and another 5 of C3, all managed by a k8s cluster.
Each replica pushes metrics in the same manner, using the consumer name as the job, i.e. all C1 replicas execute pushAdd(registry,"C1"), all C2 replicas use "C2", etc.
When I visualise the metrics in Grafana for each consumer, the graphs seem a bit off when multiple replicas are running.
What I cannot understand is whether this is a Grafana tuning issue or whether we are pushing metrics to Prometheus in the wrong way.
It would be great if I could get some feedback on the way we send metrics to Prometheus.
e.g. Is the Pushgateway appropriate for this case, or should scraping be preferred?
If the Pushgateway is the preferred option, should each replica continue using the same job name, or use a different one (i.e. C1-{podid}) and group them together in Prometheus/Grafana?
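For the second option, I imagine something like the Java client's grouping-key overload of pushAdd, e.g. (a sketch; using "instance" as the grouping-key name and HOSTNAME for the pod name are just my assumptions):

```java
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.exporter.PushGateway;
import java.util.Collections;
import java.util.Map;

public class ReplicaPush {
    public static void main(String[] args) throws Exception {
        CollectorRegistry registry = new CollectorRegistry();

        // k8s usually sets HOSTNAME to the pod name
        String podName = System.getenv().getOrDefault("HOSTNAME", "unknown");

        // same job ("C1") for all replicas, but a distinct grouping key per pod,
        // so replicas do not overwrite each other's metrics on the Pushgateway
        Map<String, String> groupingKey = Collections.singletonMap("instance", podName);

        PushGateway pg = new PushGateway("pushgateway:9091");
        pg.pushAdd(registry, "C1", groupingKey);
    }
}
```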
If scraping is the way to go, can the metrics be persistent, given that we scale the replicas up/down on demand?
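For scraping, my understanding is that each replica would expose a /metrics endpoint and Prometheus would discover the pods itself, with a scrape config along these lines (a sketch; the job name and the "app" pod label are my guesses, not a tested config):

```yaml
scrape_configs:
  - job_name: 'kafka-consumers'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # keep only our consumer pods, assuming they carry a label
      # such as app=C1 / app=C2 / app=C3
      - source_labels: [__meta_kubernetes_pod_label_app]
        regex: C[123]
        action: keep
      # expose the app label as a "consumer" label on the metrics
      - source_labels: [__meta_kubernetes_pod_label_app]
        target_label: consumer
```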
Thank you in advance!
Dimitris