Hello all,
I have struggled for weeks trying to figure this one out. I even took a
course on learning Go (thank you Todd McLeod!) for this and future
endeavors. I have a working shell script that reports the CPU and drive
temperatures from a cron job, but I would like to have that data
collected, visualized, and alerted on within a more comprehensive and
configurable monitoring system. Looking forward, I chose Prometheus as
that tool, partly for my ongoing learning/labbing. Using the steps from
https://blog.yo61.com/installing-prometheus-node_exporter-on-freenas/ I
was able to run node_exporter on FreeNAS; however, it reported the
CPU temperature as 400,000,000 degrees Centigrade. I have gone so far
as to write my own collector using functions converted from the shell
script (os/exec calls), but I have come back to reconsider the path of
least resistance. Within the collector cpu_freebsd.go we have line 129
(within the method documented and declared on lines 104 and 105):
Line 104: // Expose CPU stats using sysctl.
Line 105: func (c *statCollector) Update(ch chan<- prometheus.Metric) error {
...
Line 129: temp, err := unix.SysctlUint32(fmt.Sprintf("dev.cpu.%d.temperature", cpu))
What is this Sprintf even pulling from? I get that it is taking the data
and converting it to base 10, then to Uint32, but how are we getting
400M C? Later, on line 143 of the same file, we have:
Line 143: ch <- c.temp.mustNewConstMetric(float64(temp-2732)/10, lcpu)
Again, what in the world is this conversion? Looking back, "cpu" from
line 129 is the index from the loop over cpuTimes, which is created and
then looped over in lines 117-121:
cpuTimes, err := getCPUTimes()
if err != nil {
    return err
}
for cpu, t := range cpuTimes {
and this is where I get a bit lost. I don't see the temperature defined
in getCPUTimes() or NewStatCollector(). If we look in
golang.org\sys\unix\syscall_bsd.go we find the SysctlUint32(name string)
function, lines 457-476, below:
func SysctlUint32(name string) (uint32, error) {
    return SysctlUint32Args(name)
}
func SysctlUint32Args(name string, args ...int) (uint32, error) {
    mib, err := sysctlmib(name, args...)
    if err != nil {
        return 0, err
    }
    n := uintptr(4)
    buf := make([]byte, 4)
    if err := sysctl(mib, &buf[0], &n, nil, 0); err != nil {
        return 0, err
    }
    if n != 4 {
        return 0, EIO
    }
    return *(*uint32)(unsafe.Pointer(&buf[0])), nil
}
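To check my reading of line 129 and of SysctlUint32Args, here is a minimal sketch that runs without FreeBSD. As far as I can tell, the Sprintf only builds the sysctl name for one CPU index, and the kernel's 4-byte answer is then reinterpreted as a uint32. The helper names sysctlName and decodeUint32 are mine (not part of node_exporter or x/sys), the raw bytes are made up for illustration, and the decoding assumes a little-endian machine such as amd64.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// sysctlName builds the sysctl name for one CPU index, using the same
// format string as line 129. Nothing is read from the system here.
func sysctlName(cpu int) string {
	return fmt.Sprintf("dev.cpu.%d.temperature", cpu)
}

// decodeUint32 interprets a 4-byte buffer as a uint32.
// Assumption: little-endian host. The real code casts through
// unsafe.Pointer, which uses whatever byte order the machine has.
func decodeUint32(buf []byte) uint32 {
	return binary.LittleEndian.Uint32(buf)
}

func main() {
	fmt.Println(sysctlName(0)) // dev.cpu.0.temperature

	// Hypothetical raw bytes standing in for what sysctl would fill in.
	raw := []byte{0x6e, 0x0c, 0x00, 0x00}
	fmt.Println(decodeUint32(raw)) // 3182
}
```

So the "%d" is only formatting the loop index into the name; the temperature itself comes back from the sysctl call as raw bytes.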
I am getting deep into the weeds here and am not sure where to go. I
don't see the call to Sysctl, but it gives me a place to start. I am
booting the machine and will try running sysctl directly to see what
it returns. That alone may be the issue. The custom collector I wrote
can be found here -
https://forums.freenas.org/index.php?threads/custom-metric-collector-for-prometheus-node-exporter.63462/
- and eventually on GitHub. Since I may split the CPU and drive temps
into two files anyway, I may look at simply using the IPMI exporter
first to get things working if I can't get node_exporter fixed.
Still a little lost. It may be time to make a custom exporter with my one custom collector once I figure out how to get that working. Any advice or help is greatly appreciated.
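For reference, here is a minimal sketch of how I currently read the line 143 conversion. It assumes the sysctl value is in tenths of a kelvin, so subtracting 2732 and dividing by 10 gives degrees Celsius. It also shows one possible explanation for the 400M reading: because the subtraction happens on a uint32, any raw value below 2732 wraps around instead of going negative. I have not confirmed that my machine actually returns such a value. The function names celsiusAsWritten and celsiusSigned are mine, and the raw inputs are made up.

```go
package main

import "fmt"

// celsiusAsWritten mirrors the arithmetic on line 143: the subtraction
// is done in uint32 first, then the result is converted to float64.
func celsiusAsWritten(raw uint32) float64 {
	return float64(raw-2732) / 10
}

// celsiusSigned converts to float64 before subtracting, so a small raw
// value yields a negative number instead of wrapping around.
func celsiusSigned(raw uint32) float64 {
	return (float64(raw) - 2732) / 10
}

func main() {
	// 3182 is a plausible reading (assumed to be tenths of a kelvin);
	// 0 stands in for a missing or bogus reading.
	for _, raw := range []uint32{3182, 0} {
		fmt.Printf("raw=%d as-written=%.1f signed=%.1f\n",
			raw, celsiusAsWritten(raw), celsiusSigned(raw))
	}
}
```

With these inputs, a raw value of 3182 gives 45.0 both ways, while a raw value of 0 gives 429496456.4 as written and -273.2 with the signed version, which is in the same range as the number node_exporter reported for me.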