How to monitor Windows and Linux server Temperature


fahad

Nov 15, 2019, 7:10:29 AM
to Prometheus Users
Hello Team,

How to monitor Windows and Linux server Temperature?
I have set up Node exporter on a Linux VM and WMI exporter on a Windows VM. I can collect uptime, processor, RAM, CPU load, memory load, disk usage, etc.,
but I am unable to monitor temperature.

Any help would be appreciated.

Thank you.
WindowsMonitoring.png

Christian Hoffmann

Nov 17, 2019, 8:39:42 AM
to fahad, Prometheus Users
Hi,

On 11/15/19 1:10 PM, fahad wrote:
> How to monitor Windows and Linux server Temperature?
> I have setup Node exporter on Linux VM and WMI exporter on Windows VM, I
> can collect Uptime, Processor, RAM, CPU load, Memory Load, Disk Usage etc.
> Unable to monitor Temperature.

node_exporter on Linux already exports temperature data if the kernel
can read the appropriate sensor. The hwmon collector is used for this.
Example:

node_hwmon_temp_celsius{chip="pci0000:00_0000:00:18_3",sensor="temp1"} 43.75
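
To check quickly whether a host exports any hwmon temperatures, the /metrics output can be filtered by metric name. Below is a minimal sketch in Python, assuming the text exposition format of the example line above. The function name hwmon_temperatures and the sample text are made up for illustration, and the simple regular expressions do not handle escaped quotes or braces inside label values.

```python
import re

# Matches lines such as:
#   node_hwmon_temp_celsius{chip="...",sensor="temp1"} 43.75
_SAMPLE_RE = re.compile(
    r'^node_hwmon_temp_celsius\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def hwmon_temperatures(metrics_text):
    """Return a list of (labels_dict, celsius) for each hwmon temperature sample."""
    results = []
    for line in metrics_text.splitlines():
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # comment lines and all other metrics are skipped
        labels = dict(_LABEL_RE.findall(match.group("labels")))
        results.append((labels, float(match.group("value"))))
    return results


# Illustrative sample only; real output from /metrics is much longer.
sample = '''# TYPE node_hwmon_temp_celsius gauge
node_hwmon_temp_celsius{chip="pci0000:00_0000:00:18_3",sensor="temp1"} 43.75
node_scrape_collector_duration_seconds{collector="hwmon"} 0.0021
'''

for labels, celsius in hwmon_temperatures(sample):
    print(labels["chip"], labels["sensor"], celsius)
```

An empty result means the kernel is not exposing a temperature sensor to node_exporter on that machine.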

I don't know or use wmi_exporter, but it looks like it can export thermal
data as well, although the collector may need to be enabled first:

https://github.com/martinlindhe/wmi_exporter/blob/master/docs/collector.thermalzone.md

Also keep in mind that hypervisors often do not expose the host's
temperature data to a virtual machine. So you would rather want to run
node_exporter or wmi_exporter on the virtualization host (maybe in
addition to the actual VMs).

The exposed metrics look different. You could also try to find another
way. If all your servers are one particular brand, you can try scraping
their management interfaces (such as iLO for HP) instead. There seem to
be exporters for such cases as well.

Kind regards,
Christian

fahad

Nov 18, 2019, 11:21:19 AM
to Prometheus Users
Hello Christian,
Thanks for the reply,

I tried to build a dashboard for node_exporter (Linux VM) using the hwmon collector.
Please review it. Is it right?
The temperature is shown twice, even though I am selecting a particular host (variable).

I tried the same for Windows, but without success.
Any suggestions, please?

Your effort will be appreciated.

thank you.
temperature.png
hwmon.png

Christian Hoffmann

Nov 18, 2019, 3:28:41 PM
to fahad, Prometheus Users
Hi,

On 11/18/19 5:21 PM, fahad wrote:
> I tried to build the dashboard for node-exporter(Linux VM) using hwmon
> collector.
> Please review it, is it right?
> There is showing temperature 2 times, even I am selecting particular
> host (variable)
I think there are two problems:

First, you are graphing how long the collector took. As the metric name
says, this is a duration in seconds which you try to graph as a
temperature in Celsius. This would explain the strange values. :)
Rather look for node_hwmon_temp_celsius etc. If there are no such
metrics, you may first have to make your kernel expose this
information. Maybe some sensor module has to be loaded first.

Second, you say that you get two values rather than one. This is
probably related to a missing label selector. You selected a single
target in your dashboard. However, that's just the Grafana part of it.
You will also have to add this information into your PromQL query. Try
editing one of the existing panels to see how it is done (especially
what the variable is called).

You probably want something like:

node_hwmon_temp_celsius{instance="$instance"}
where $instance will be replaced with your selected target by Grafana
(Prometheus will never see this variable).
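
To see what Prometheus actually receives after the substitution, the same selector can be built by hand and sent to the instant-query endpoint of the Prometheus HTTP API (/api/v1/query). This is only a sketch: the target 10.0.0.5:9100 and the Prometheus address http://localhost:9090 are placeholders, and the helper names are made up for illustration.

```python
from urllib.parse import urlencode


def temperature_query(instance):
    """Build the PromQL selector as it looks once $instance has been replaced."""
    # Escape backslashes and double quotes so the value stays a valid PromQL string.
    escaped = instance.replace("\\", "\\\\").replace('"', '\\"')
    return 'node_hwmon_temp_celsius{instance="%s"}' % escaped


def query_url(prometheus_base, instance):
    """Return the instant-query URL for the Prometheus HTTP API."""
    params = urlencode({"query": temperature_query(instance)})
    return prometheus_base.rstrip("/") + "/api/v1/query?" + params


# Placeholder target and Prometheus address, not taken from the thread.
print(temperature_query("10.0.0.5:9100"))
print(query_url("http://localhost:9090", "10.0.0.5:9100"))
```

If the query without the instance matcher returns several series and the one with it returns a single series, the missing selector was the cause of the duplicate lines.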

> Tried for windows but no success.
> Any suggestions please.
What exactly have you tried? Did you enable the collector? How? What does
curl http://some-windows-machine:wmi_exporter_port/metrics | grep temp

yield?


Kind regards,
Christian

fahad

Nov 19, 2019, 7:43:37 AM
to Prometheus Users
Hi Christian,

Thanks for the reply,

Linux VM
I installed the lm-sensors package.
At VMIP:9100/metrics
I can see node_hwmon_temp_celsius{chip="platform_coretemp_0",sensor="temp1"} 100

Every host shows 100.

I even checked at the VM level using the sensors command.

I created a sample dashboard using this query:
node_hwmon_temp_celsius{chip="platform_coretemp_0",instance="$host",job="$job",sensor="temp1"}

The value is 100. Is this correct, or do I need to apply some filtering (mean, derivative, etc.)?

Windows
WMI exporter service - running.

> What exactly have you tried? Did you enable the collector? How? What does
>         curl http://some-windows-machine:wmi_exporter_port/metrics | grep temp
no thermal 
no temp
no hwmon
I am still looking into how I can measure the temperature of a Windows VM.



Thank you.


On Friday, 15 November 2019 17:40:29 UTC+5:30, fahad wrote:
metrics.png
sensors.png
prometheus.png
graph.png
graph2.png

Christian Hoffmann

Nov 19, 2019, 11:31:34 AM
to fahad, Prometheus Users
Hi,

On 11/19/19 1:43 PM, fahad wrote:
> Linux VM 
> installed lm-sensors package
> VMIP:9100/metrics 
> I can see
> node_hwmon_temp_celsius{chip="platform_coretemp_0",sensor="temp1"} 100
>
> For every host it is showing 100
>
> Even I checked at VM level using sensor command
>
> Created a sample dashboard using this query
> node_hwmon_temp_celsius{chip="platform_coretemp_0",instance="$host",job="$job",sensor="temp1"}
>
> Value is 100, is this correct? or need to do filtering (mean, derivative
> etc)
This should be the actual temperature, so no need for further filtering.
I don't know if 100°C is correct -- it might be, although it sounds
rather high.

As said previously, I don't know if/how hypervisors pass such
data through to the VMs. It is also possible that this is virtualized fake
data which is constantly set to 100. In this case I would suggest
running node_exporter on the virtualization host instead (KVM/Xen host).
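
One way to test the "constant fake value" idea is to look at the series over a longer window and check whether it ever changes. Below is a minimal sketch, assuming the samples have already been fetched as (timestamp, value) pairs, for example from a range query. The function name looks_constant, the min_samples threshold and the sample data are all made up for illustration.

```python
def looks_constant(samples, min_samples=10):
    """Return True if every sample in the window has the same value.

    samples: list of (unix_timestamp, value) pairs, oldest first.
    min_samples: arbitrary threshold; with fewer samples no judgement is made.
    """
    if len(samples) < min_samples:
        return False
    values = {value for _, value in samples}
    return len(values) == 1


# Made-up data: one reading per minute for 15 minutes.
flat = [(1574150000 + 60 * i, 100.0) for i in range(15)]
varying = [(1574150000 + 60 * i, 43.0 + (i % 4) * 0.25) for i in range(15)]

print(looks_constant(flat))
print(looks_constant(varying))
```

A series that never moves while the load on the machine changes is a hint, not proof, that the VM is seeing a placeholder value rather than a real sensor.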

> Windows
> Installed wmi_exporter v0.9.0
> (https://github.com/martinlindhe/wmi_exporter/releases)
> wmi_exporter-0.9.0-amd64.msi
> <https://github.com/martinlindhe/wmi_exporter/releases/download/v0.9.0/wmi_exporter-0.9.0-amd64.msi>
> WMI exporter service - running.
>
> What exactly have you tried? Did you enable the collector? How? What does
>         curl http://some-windows-machine:wmi_exporter_port/metrics | grep temp
> http://localhost:9182/metrics 
> no thermal 
> no temp
> no hwmon
> I am looking into it, how I can measure temperature of Windows VM.
I previously linked to this doc:
https://github.com/martinlindhe/wmi_exporter/blob/master/docs/collector.thermalzone.md

Did you have a look already?

Kind regards,
Christian