How to monitor a service or process running on a Linux machine using Prometheus


rajki...@gmail.com

Sep 1, 2017, 12:42:30 PM
to Prometheus Users

I have Prometheus set up, and I would like to monitor processes running on a Linux box. I have node exporter installed on each server and added as targets to the Prometheus server. The following are the processes that I want to monitor.


Processes:

  • ntpd
  • crond
  • ssh
I enabled systemd in the node_exporter config, but I couldn't get the individual service-specific metrics in my Prometheus server. I am able to get the number of running processes using node_procs_running. What I want to know is whether a process is running on a specific host or not. If not, I need to alert a specific group of users. Can anyone help me out?

Ben Kochie

Sep 2, 2017, 5:32:29 AM
to rajki...@gmail.com, Prometheus Users
For NTPd, I would recommend using the textfile metrics collector[0].

I don't know of any specific exporters for the other two.

You can use the systemd service state collector in the node_exporter, but I would recommend setting a whitelist.

-collector.systemd.unit-whitelist '.+\.service'

This will avoid exporting metrics for the huge number of non-service units that are configured on most systems.  You're still going to be adding nearly 1000 metrics this way.  You may want to consider a specific whitelist like '(cron|ssh)\.service'.
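
To get a feel for what each whitelist pattern would select, here is a minimal Python sketch. It assumes the exporter matches the pattern against the whole unit name (anchored at both ends), and the unit names are made-up examples rather than output from a real host.

```python
import re

# Hypothetical unit names, for illustration only.
UNITS = [
    "cron.service",
    "ssh.service",
    "ntp.service",
    "dbus.socket",
    "tmp.mount",
    "user-1000.slice",
]


def select_units(pattern, units):
    """Return the units whose full name matches the whitelist pattern."""
    # Assumption: the pattern is anchored, so fullmatch is used here.
    compiled = re.compile(pattern)
    return [unit for unit in units if compiled.fullmatch(unit)]


if __name__ == "__main__":
    # Broad pattern: every .service unit, but no sockets, mounts or slices.
    print(select_units(r".+\.service", UNITS))
    # Narrow pattern: only the two named services.
    print(select_units(r"(cron|ssh)\.service", UNITS))
```

On distributions where the units are named crond.service and sshd.service, the narrow pattern would need adjusting to match those names.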

Another option would be to use mtail[1] to parse syslog output to produce metrics.


--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/02a61776-d911-48a3-92b1-991e0aa304d8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Brian Brazil

Sep 2, 2017, 6:17:03 AM
to Ben Kochie, rajki...@gmail.com, Prometheus Users
On 2 September 2017 at 10:32, Ben Kochie <sup...@gmail.com> wrote:
For NTPd, I would recommend using the textfile metrics collector[0].

I don't know of any specific exporters for the other two.


If you're using the textfile collector via cron, you're also implicitly testing cron.

Brian
 


Ben Kochie

Sep 2, 2017, 6:37:55 AM
to Brian Brazil, rajki...@gmail.com, Prometheus Users
Yup, you could run a once-per-minute cron job that updates a textfile with a last-run-time metric in seconds.
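
A rough sketch of what such a cron-driven script might look like, in Python. The directory path, file name, and metric name are assumptions made up for illustration; the directory has to match whatever the node_exporter textfile collector is configured to read.

```python
import os
import tempfile
import time

# Assumed path; it must match the directory the textfile collector reads.
TEXTFILE_DIR = "/var/lib/node_exporter/textfile_collector"
# Hypothetical metric name.
METRIC = "cron_heartbeat_last_run_timestamp_seconds"


def render(metric, now):
    """Build the Prometheus text exposition lines for one gauge."""
    return (
        f"# HELP {metric} Unix time of the last heartbeat cron run.\n"
        f"# TYPE {metric} gauge\n"
        f"{metric} {now:.0f}\n"
    )


def write_heartbeat(directory=TEXTFILE_DIR, metric=METRIC, now=None):
    """Write the metric to <directory>/cron_heartbeat.prom atomically."""
    now = time.time() if now is None else now
    target = os.path.join(directory, "cron_heartbeat.prom")
    # Write to a temp file in the same directory, then rename it, so the
    # exporter never reads a half-written file. The .tmp suffix keeps the
    # temp file from being picked up, since only *.prom files are read.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(render(metric, now))
    # mkstemp creates the file as 0600; make it readable by the exporter.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, target)
    return target


# Do nothing if the collector directory is absent on this host.
if __name__ == "__main__" and os.path.isdir(TEXTFILE_DIR):
    write_heartbeat()
```

An alert could then compare time() against this metric and fire when the gap grows beyond a few minutes; the exact threshold is a judgment call.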

Brian Brazil

Sep 2, 2017, 6:48:44 AM
to Ben Kochie, rajki...@gmail.com, Prometheus Users
On 2 September 2017 at 11:37, Ben Kochie <sup...@gmail.com> wrote:
Yup, you could run a once per min cron job that updates a textfile with a last run time seconds metric.

There's also the node_textfile_mtime metric, which provides this for you when you're using the textfile collector.

Brian




rajki...@gmail.com

Sep 6, 2017, 7:49:52 PM
to Prometheus Users
Thank you very much.

rajkiran....@tekinvaderz.com

Sep 15, 2017, 2:01:28 PM
to Prometheus Users
Hello Ben,

I am able to use the textfile collector and get the metric, but my manager is not happy about having a cron job on each machine.

By any chance, can we make use of the blackbox exporter to scrape service-specific metrics?

Thank you,

Raj Kiran.



Ben Kochie

Sep 15, 2017, 2:07:59 PM
to rajkiran....@tekinvaderz.com, Prometheus Users
I'm not sure what you mean.  The blackbox_exporter is for probing network services.

Service-specific metrics must come from the services using our instrumentation libraries[0] or an exporter[1] that is tailored to that service.



robu...@gmail.com

Dec 28, 2018, 8:57:02 AM
to Prometheus Users
Sorry to revive this question...

--collector.systemd.unit-whitelist='.+\.service'

works great now, but...

there are a bunch of services of
Type=oneshot
which have not set
RemainAfterExit=yes

so node_exporter is correctly reporting them as active=0.

But that makes monitoring ugly, with a long, manually constructed exclude list in the query/rule.

Is there any simple solution to exclude all oneshot services from being collected by node_exporter,
or has anybody come up with another simple solution to monitor all services without
individual configuration?

TIA





Ben Kochie

Dec 28, 2018, 1:20:29 PM
to robu...@gmail.com, Prometheus Users
That would require a new type of filter on the exporter side. If you could file an issue, that would help.
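
In the absence of such a filter, one possible stopgap is to generate the exclude list instead of building it by hand. The Python sketch below does not change what node_exporter collects; it only lists oneshot services that lack RemainAfterExit=yes and joins them into a regex. It assumes that `systemctl show` accepts a glob pattern with a comma-separated `--property` list and separates units with blank lines; the helper names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_units(show_output):
    """Split `systemctl show` output into one dict of properties per unit."""
    units = []
    for block in show_output.strip().split("\n\n"):
        props = dict(
            line.split("=", 1) for line in block.splitlines() if "=" in line
        )
        if props:
            units.append(props)
    return units


def oneshot_without_remain(units):
    """Names of oneshot units that do not set RemainAfterExit=yes."""
    return sorted(
        unit["Id"]
        for unit in units
        if "Id" in unit
        and unit.get("Type") == "oneshot"
        and unit.get("RemainAfterExit") != "yes"
    )


def exclude_regex(names):
    """Join unit names into a regex alternation, escaping special characters."""
    return "|".join(re.escape(name) for name in names)


# Only query systemd when systemctl is actually present on this host.
if __name__ == "__main__" and shutil.which("systemctl"):
    result = subprocess.run(
        ["systemctl", "show", "--property=Id,Type,RemainAfterExit", "*.service"],
        capture_output=True,
        text=True,
        check=False,
    )
    print(exclude_regex(oneshot_without_remain(parse_units(result.stdout))))
```

The resulting string could be dropped into a negative regex matcher on the unit name label in the query or rule. Backslashes would likely need doubling inside a double-quoted PromQL string, and the list differs per host, so it is a workaround rather than a clean solution.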

robu...@gmail.com

Dec 31, 2018, 5:38:13 AM
to Prometheus Users