I am running Armbian (https://www.armbian.com/) on small ARM computers (NanoPi), and I am monitoring them, along with a few other machines, using Prometheus.
My goal is to set up a console page that summarizes the major health indicators (CPU, RAM use, _disk use_) for all the machines on my network. I started from the consoles/node.html example.
Armbian uses a special filesystem (log2ram) to cache log files in RAM and avoid writing to the microSD too often. This shows up as the mount point /var/log.hdd. The size reported for this mount point is the same as that of the microSD it caches. The df command reports this mount as Filesystem = log2ram, while mount reports it as type = ext4 and Prometheus reports it as fstype = ext4 (probably from the same information as mount?):
# df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             10M     0   10M   0% /dev
tmpfs           100M   14M   86M  14% /run
/dev/mmcblk0p1   14G  1.5G   13G  11% /
tmpfs           249M  4.0K  249M   1% /dev/shm
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           249M     0  249M   0% /sys/fs/cgroup
tmpfs           249M     0  249M   0% /tmp
log2ram          50M  5.4M   45M  11% /var/log
tmpfs            50M     0   50M   0% /run/user/0
# mount |grep mmcblk0p1
/dev/mmcblk0p1 on / type ext4 (rw,noatime,nodiratime,errors=remount-ro,commit=600)
/dev/mmcblk0p1 on /var/log.hdd type ext4 (rw,noatime,nodiratime,errors=remount-ro,commit=600)
To simplify reporting for machines with various mount points, I decided to create a single "global" disk usage indicator based on the sum of node_filesystem_free and the sum of node_filesystem_size over all mount points of each machine.
The log2ram mount point causes these sums to be double the actual microSD size. (The % used should not be affected much in my use case, however :-)
Has anyone dealt with a similar setup? How can I skip these log2ram entries?
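A minimal sketch of one possible way to leave such an entry out, assuming the duplicate shows up in Prometheus with the label mountpoint='/var/log.hdd' (as the mount output above suggests, but not verified here). It only builds the query string with a negative label matcher (!=). The instance value and the sizeQuery helper are hypothetical names used for illustration.

```go
package main

import "fmt"

// sizeQueryTemplate sums filesystem sizes for one instance while leaving out
// a single mount point through a negative label matcher (mountpoint!='...').
// Kept as a variable so the same text can be reused for the _free metric.
var sizeQueryTemplate = "sum(node_filesystem_size{job='node',instance='%s',fstype='ext4',mountpoint!='%s'})"

// sizeQuery is a hypothetical helper: it fills in the instance and the
// mount point to skip.
func sizeQuery(instance, skipMount string) string {
	return fmt.Sprintf(sizeQueryTemplate, instance, skipMount)
}

func main() {
	// "nanopi:9100" is a made-up instance label; "/var/log.hdd" is assumed
	// to be the mountpoint label value of the duplicated entry.
	fmt.Println(sizeQuery("nanopi:9100", "/var/log.hdd"))
}
```

In a console template, the same selector could be written directly inside the printf expression; whether mountpoint or device is the better label to match on depends on what the exporter actually reports for each machine.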
As a side note, I am trying to compute the % used as: args (printf "100 * sum(node_filesystem_free{job='node',instance='%s',fstype='ext4'}) / sum(node_filesystem_size{job='node',instance='%s',fstype='ext4'})" .Labels.instance). This always shows "-".
I tested that args (printf "sum(node_filesystem_size{job='node',instance='%s',fstype='ext4'})" .Labels.instance) gives me the right number (doubled :), same for the "_free" number.
What obvious mistake did I make?
Any help will be appreciated.
Thanks.
Pascal.