Metrics for monitoring fluentd heartbeat


twelcome

Oct 7, 2017, 1:37:55 PM
to Fluentd Google Group
Hi

I have fluentd heartbeat enabled, and I am sending fluentd metrics to prometheus to monitor fluentd itself.
I would like to have graphs and metrics for the fluentd heartbeat, but I cannot find any metric exposed by fluentd that measures heartbeats from fluentd nodes or fluent-bit.

How can I get this metric and graph it in prometheus?

FYI, I'm using fluent-plugin-prometheus (https://github.com/kazegusuri/fluent-plugin-prometheus) to export data from fluentd to prometheus.

In addition, here is the list of metrics I can see from scraping port 24231 of the fluentd service:

fluentd_output_status_buffer_queue_length
fluentd_output_status_buffer_total_bytes
fluentd_output_status_emit_count
fluentd_output_status_emit_records
fluentd_output_status_num_errors
fluentd_output_status_retry_count
fluentd_output_status_retry_wait
fluentd_output_status_rollback_count
fluentd_output_status_write_count
fluentd_status_buffer_queue_length
fluentd_status_buffer_total_bytes
fluentd_status_retry_count
fluentd_tail_file_inode
fluentd_tail_file_position
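
A list like the one above can be pulled out of a scrape with a few lines of Python. This is only a sketch: it assumes the endpoint returns the standard Prometheus text exposition format, and the sample text, label names and label values below are made up for illustration rather than copied from a real fluentd instance. The helper name metric_names is hypothetical.

```python
import re

# Matches the metric name at the start of a sample line in the
# Prometheus text exposition format: name, optional {labels}, value.
_SAMPLE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{.*\})?\s+\S+")


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names found in a scrape."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


# Made-up sample (assumption). A real scrape would be fetched from
# port 24231 of the fluentd service instead.
SAMPLE_SCRAPE = """\
# HELP fluentd_output_status_retry_count illustrative help text
fluentd_output_status_retry_count{plugin_id="out_fwd",type="forward"} 3
fluentd_output_status_num_errors{plugin_id="out_fwd",type="forward"} 1
fluentd_output_status_buffer_queue_length{plugin_id="out_fwd",type="forward"} 0
"""

if __name__ == "__main__":
    for name in metric_names(SAMPLE_SCRAPE):
        print(name)
```

Running this against a real scrape is a quick way to confirm whether any heartbeat-related series is exposed at all.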


Thanks in advance,
Traiano 

Mr. Fiber

Oct 10, 2017, 7:56:42 AM
to Fluentd Google Group
I'm not sure about the details of the prometheus plugin, but the out_forward plugin doesn't store the history of heartbeat results, e.g. latency values or error counts.
Which metrics do you want? Latency?


Masahiro

--
You received this message because you are subscribed to the Google Groups "Fluentd Google Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to fluentd+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

twelcome

Oct 10, 2017, 10:39:21 PM
to Fluentd Google Group
On Tuesday, 10 October 2017 19:56:42 UTC+8, repeatedly wrote:
I'm not sure about the details of the prometheus plugin, but the out_forward plugin doesn't store the history of heartbeat results, e.g. latency values or error counts.
Which metrics do you want? Latency?



It would be useful to have a count of failed heartbeats.
Alternatively, is there a mechanism I can configure to alert when heartbeats from fluentd nodes are lost (from the aggregator's point of view)?
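
Since out_forward is said above not to keep heartbeat history, any such check would have to live outside it. The following is a minimal sketch of the general idea, not a fluentd feature: it assumes that something on the aggregator side can call record() whenever a node is heard from. That hook, the class name HeartbeatTracker, and the 30-second timeout are all hypothetical.

```python
import time


class HeartbeatTracker:
    """Remembers when each node was last heard from and reports stale ones."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # node name -> timestamp of the latest heartbeat

    def record(self, node, now=None):
        """Note that `node` was heard from at `now` (defaults to the current time)."""
        self.last_seen[node] = time.time() if now is None else now

    def lost_nodes(self, now=None):
        """Return the nodes whose latest heartbeat is older than the timeout."""
        now = time.time() if now is None else now
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    tracker = HeartbeatTracker(timeout_seconds=30)  # hypothetical timeout
    tracker.record("node-a", now=100)
    tracker.record("node-b", now=125)
    # At t=140, node-a has been silent for 40s and node-b for 15s.
    print(tracker.lost_nodes(now=140))
```

The length of lost_nodes() could then be exported as a gauge and alerted on from prometheus, which would approximate the "count of failed heartbeats" asked for here.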




 