Best way to record system state?


John Dexter

Jun 8, 2021, 6:36:14 AM
to Prometheus Users
Our software has a single state value indicating whether the system is running, starting, stopping, or stopped (in the stopped state the software itself is still running).
I would like to record this as a metric, but I can't decide which is better: a single metric whose value maps to the state (my_label with stopped=0, starting=1, ...), or one meta-metric series per state, such as my_label_info{state="starting"}.

In our visualisation dashboard (Grafana) I will want to be able to:
- show how many systems are in each state
- show the state of a single system in a per-system dashboard.

Is there an obvious answer to this or are there arguments for each?

Thanks.

Harald Koch

Jun 8, 2021, 9:02:47 AM
to Prometheus Users
I'd do the second. The first has the problem that the labels change, creating a new time series each time: you'll have series appearing and disappearing as the state changes.

The systemd module for the node exporter does this, similar to your second choice:

node_systemd_unit_state{name="nginx.service",state="activating",type="forking"} 0
node_systemd_unit_state{name="nginx.service",state="active",type="forking"} 1
node_systemd_unit_state{name="nginx.service",state="deactivating",type="forking"} 0
node_systemd_unit_state{name="nginx.service",state="failed",type="forking"} 0
node_systemd_unit_state{name="nginx.service",state="inactive",type="forking"} 0

There is a metric / label set for each state, and the values change from 0 to 1 and back as the service transitions through each state.
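
For the four states in the original question, the same pattern could be produced along these lines. This is only a minimal sketch using plain string formatting rather than a client library; the metric name my_system_state, the label name system, and the value "plant-a" are made up for illustration.

```python
# Hypothetical state list, taken from the states named in the question.
STATES = ["running", "starting", "stopping", "stopped"]


def render_state_metric(name, system, current, states=STATES):
    """Return text-exposition lines: one series per state, 1 for the current state."""
    if current not in states:
        raise ValueError(f"unknown state: {current!r}")
    lines = [f"# TYPE {name} gauge"]
    for state in states:
        value = 1 if state == current else 0
        # Every state is always exported, so no series appears or disappears.
        lines.append(f'{name}{{system="{system}",state="{state}"}} {value}')
    return "\n".join(lines)


print(render_state_metric("my_system_state", "plant-a", "starting"))
```

Because all four series exist at every scrape, a state change only flips two values instead of creating or removing series.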

Counting systems in each state is just a sum by (name,state) (node_systemd_unit_state) operation. Displaying the current state in Grafana can be done by querying node_systemd_unit_state == 1 and then using Grafana's ability to extract and display the value of the "state" label.
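
To make the two queries concrete, here is a rough Python emulation of what they compute over a handful of in-memory samples. It is not how Prometheus evaluates PromQL, just an illustration; the system names "a", "b", "c" and the helper names are invented.

```python
from collections import Counter

STATES = ["running", "starting", "stopping", "stopped"]


def samples_for(system, current):
    """Build (labels, value) pairs for one system: 1 for its current state, else 0."""
    return [({"system": system, "state": s}, 1 if s == current else 0) for s in STATES]


# Hypothetical fleet: two systems running, one stopped.
samples = samples_for("a", "running") + samples_for("b", "running") + samples_for("c", "stopped")


def count_by_state(samples):
    """Roughly what a sum by (state) aggregation yields: systems per state."""
    totals = Counter()
    for labels, value in samples:
        totals[labels["state"]] += value
    return dict(totals)


def current_state(samples, system):
    """Roughly what filtering with == 1 yields for one system: its active state label."""
    active = [l["state"] for l, v in samples if l["system"] == system and v == 1]
    return active[0] if len(active) == 1 else None


print(count_by_state(samples))
print(current_state(samples, "c"))
```

Note that states with no systems in them still show up with a count of 0, which is convenient for dashboards that should always list every state.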

--
Harald
