How to monitor a service or process running on a Linux machine using Prometheus

rajki...@gmail.com

Sep 1, 2017, 12:42:30 PM

to Prometheus Users

I have Prometheus set up, and I would like to monitor processes running on a Linux box. I have node exporter installed on each server and added as targets to the Prometheus server. The following are the processes that I want to monitor.

Processes:

ntpd
crond
ssh

I enabled the systemd collector in the node_exporter config, but I couldn't get the individual service-specific metrics in my Prometheus server. I am able to get the number of running processes using node_procs_running. However, I want to know whether a process is running on a specific host or not. If it is not, I need to alert a specific group of users. Can anyone help me out?

Ben Kochie

Sep 2, 2017, 5:32:29 AM

to rajki...@gmail.com, Prometheus Users

For NTPd, I would recommend using the textfile metrics collector[0].

I don't know of any specific exporters for the other two.

You can use the systemd service state collector in the node_exporter, but I would recommend setting a whitelist.

-collector.systemd.unit-whitelist '.+\.service'

This will avoid exporting metrics for the huge number of non-service units that are configured on most systems. You're still going to be adding nearly 1000 metrics this way. You may want to consider a specific whitelist like '(cron|ssh)\.service'.
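
To see how the two whitelist patterns above differ, here is a minimal Python sketch. It assumes the pattern has to match the whole unit name (hence re.fullmatch), and the unit names are hypothetical examples. Python's re module is not the regex engine node_exporter uses, so treat this only as a way to reason about the patterns, not as a test of the exporter itself.

```python
import re
from typing import List

# Hypothetical unit names, chosen only for illustration.
UNITS = ["cron.service", "ssh.service", "ntp.service", "dev-sda.device", "tmp.mount"]


def select_units(pattern: str, units: List[str]) -> List[str]:
    """Return the units whose full name matches the pattern.

    Assumption: the whitelist must match the entire unit name,
    so re.fullmatch is used rather than re.search.
    """
    compiled = re.compile(pattern)
    return [unit for unit in units if compiled.fullmatch(unit)]


if __name__ == "__main__":
    # Broad whitelist: every .service unit, no devices or mounts.
    print(select_units(r".+\.service", UNITS))
    # Narrow whitelist: only the two named services.
    print(select_units(r"(cron|ssh)\.service", UNITS))
```

With these sample names, the broad pattern keeps cron.service, ssh.service and ntp.service, while the narrow one keeps only cron.service and ssh.service.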

Another option would be to use mtail[1] to parse syslog output to produce metrics.

[0]: https://github.com/prometheus/node_exporter/blob/master/text_collector_examples/ntpd_metrics.py

[1]: https://github.com/google/mtail

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/02a61776-d911-48a3-92b1-991e0aa304d8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Brian Brazil

Sep 2, 2017, 6:17:03 AM

to Ben Kochie, rajki...@gmail.com, Prometheus Users

On 2 September 2017 at 10:32, Ben Kochie <sup...@gmail.com> wrote:

For NTPd, I would recommend using the textfile metrics collector[0].

I don't know of any specific exporters for the other two.

For SSH you could use the blackbox exporter, see https://www.robustperception.io/checking-if-ssh-is-responding-with-prometheus/
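
As a rough illustration of what such a network probe checks, the following Python sketch connects to a TCP port and looks for an SSH identification string. It is not the blackbox exporter's implementation; the function names, the default timeout and the 256-byte read are assumptions made for this example, and it ignores the case where a server sends other lines before its banner.

```python
import re
import socket

# An SSH-2.0 server identifies itself with a line starting "SSH-2.0-".
SSH_BANNER = re.compile(rb"^SSH-2\.0-")


def is_ssh_banner(data: bytes) -> bool:
    """Return True if the received bytes start with an SSH-2.0 banner."""
    return SSH_BANNER.match(data) is not None


def probe_ssh(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Connect to host:port and report whether an SSH banner was received.

    Any connection error or timeout counts as a failed probe.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            return is_ssh_banner(conn.recv(256))
    except OSError:
        return False
```

A probe like this only tells you that something answering as SSH is reachable over the network. It says nothing about the state of the process on the host.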

If you're using the textfile collector via cron, you're also implicitly testing cron.

Brian

--

Brian Brazil

www.robustperception.io

Ben Kochie

Sep 2, 2017, 6:37:55 AM

to Brian Brazil, rajki...@gmail.com, Prometheus Users

Yup, you could run a once-per-minute cron job that updates a textfile with a last-run-time metric in seconds.
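
A minimal sketch of such a cron job in Python might look like the following. The metric name cron_last_run_timestamp_seconds and the file name cron_heartbeat.prom are made up for this example, and the textfile collector directory is passed as an argument because it depends on how node_exporter is configured on your hosts.

```python
import os
import sys
import tempfile
import time
from typing import Optional

# Hypothetical metric name, chosen for this example.
METRIC_NAME = "cron_last_run_timestamp_seconds"


def write_heartbeat(directory: str, now: Optional[float] = None) -> str:
    """Write a .prom file holding the Unix time of this run and return its path."""
    timestamp = time.time() if now is None else now
    body = (
        f"# HELP {METRIC_NAME} Unix time of the last cron heartbeat.\n"
        f"# TYPE {METRIC_NAME} gauge\n"
        f"{METRIC_NAME} {timestamp:.0f}\n"
    )
    target = os.path.join(directory, "cron_heartbeat.prom")
    # Write to a temporary file in the same directory, then rename it,
    # so a scrape never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    # mkstemp creates the file as 0600; make it readable for the exporter.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, target)
    return target


if __name__ == "__main__":
    if len(sys.argv) == 2:
        write_heartbeat(sys.argv[1])
    else:
        print("usage: cron_heartbeat.py <textfile-collector-directory>")
```

An alert can then compare the current time with this value. If the difference grows well beyond a minute, cron has stopped running the job.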

Brian Brazil

Sep 2, 2017, 6:48:44 AM

to Ben Kochie, rajki...@gmail.com, Prometheus Users

On 2 September 2017 at 11:37, Ben Kochie <sup...@gmail.com> wrote:

Yup, you could run a once per min cron job that updates a textfile with a last run time seconds metric.

There's also the node_textfile_mtime metric, which provides this for you when you're using the textfile collector.

Brian

rajki...@gmail.com

Sep 6, 2017, 7:49:52 PM

to Prometheus Users

Thank you very much.

rajkiran....@tekinvaderz.com

Sep 15, 2017, 2:01:28 PM

to Prometheus Users

Hello Ben,

I am able to use the textfile collector and get the metric, but my manager is not happy about having a cron job on each machine.

By any chance, can we make use of the blackbox exporter and scrape service-specific metrics?

Thank you,

Raj Kiran.

Ben Kochie

Sep 15, 2017, 2:07:59 PM

to rajkiran....@tekinvaderz.com, Prometheus Users

I'm not sure what you mean. The blackbox_exporter is for probing network services.

Service-specific metrics must come from the services using our instrumentation libraries[0] or an exporter[1] that is tailored to that service.

[0]: https://prometheus.io/docs/instrumenting/clientlibs/

[1]: https://prometheus.io/docs/instrumenting/exporters/

robu...@gmail.com

Dec 28, 2018, 8:57:02 AM

to Prometheus Users

Sorry to revive this question...

--collector.systemd.unit-whitelist='.+\.service'

works great now, but there are a bunch of services of Type=oneshot that do not set RemainAfterExit=yes, so node_exporter correctly reports them as active=0. That makes monitoring ugly, with a long, manually constructed exclude list in the query/rule.

Is there any simple solution to exclude all oneshot services from being collected by node_exporter, or has anybody come up with another simple solution to monitor all services without individual configuration?

TIA

Ben Kochie

Dec 28, 2018, 1:20:29 PM

to robu...@gmail.com, Prometheus Users

That would require a new type of filter on the exporter side. If you could file an issue, that would help.
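
Until the exporter has such a filter, one possible workaround is to generate an explicit whitelist that leaves the oneshot services out. The Python sketch below is only an illustration: build_whitelist is a hypothetical helper, and it assumes you have already collected each service's Type= value yourself (for example from systemctl show), which is not shown here.

```python
import re
from typing import Dict

SUFFIX = ".service"


def build_whitelist(service_types: Dict[str, str]) -> str:
    """Build a whitelist regex covering every non-oneshot service.

    service_types maps unit names such as "cron.service" to the value
    of their systemd Type= setting. This helper is hypothetical and is
    not part of node_exporter.
    """
    keep = sorted(
        name[: -len(SUFFIX)]
        for name, unit_type in service_types.items()
        if name.endswith(SUFFIX) and unit_type != "oneshot"
    )
    if not keep:
        raise ValueError("no non-oneshot services to whitelist")
    # Escape each name so characters like "." and "-" are matched literally.
    return "(" + "|".join(re.escape(name) for name in keep) + r")\.service"
```

The resulting string would be passed as the whitelist value, and it has to be regenerated whenever services are added or removed, so this trades the long exclude list in the query for a generated include list in the exporter flags.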

robu...@gmail.com

Dec 31, 2018, 5:38:13 AM

to Prometheus Users

done:

https://github.com/prometheus/node_exporter/issues/1210
