node_exporter - non-root user can't see all disks with node_filesystem

Soph N

Aug 14, 2021, 12:38:38 AM

to Prometheus Users

Hello everyone,

I am struggling to identify the permission issue that forces me to run node_exporter as root instead of as its own user.

Here is my issue.

df shows me the two disks /dev/nvme0n1p1 and /dev/nvme1n1p1:

df -Th

Filesystem     Type      Size  Used Avail Use% Mounted on
devtmpfs       devtmpfs  7.6G     0  7.6G   0% /dev
tmpfs          tmpfs     7.7G     0  7.7G   0% /dev/shm
tmpfs          tmpfs     7.7G  476K  7.7G   1% /run
tmpfs          tmpfs     7.7G     0  7.7G   0% /sys/fs/cgroup
/dev/nvme0n1p1 xfs        30G  1.9G   29G   7% /
/dev/nvme1n1p1 ext4      184G  168G  6.2G  97% /home/ec2-user/data
tmpfs          tmpfs     1.6G     0  1.6G   0% /run/user/1000

However, the node_exporter metric node_filesystem_files does not show both of them:

curl -s "http://localhost:9100/metrics" | grep "node_filesystem_files"

# HELP node_filesystem_files Filesystem total file nodes.

# TYPE node_filesystem_files gauge

node_filesystem_files{device="/dev/nvme0n1p1",fstype="xfs",mountpoint="/"} 1.57276e+07

node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 1.993667e+06

node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 1.993667e+06

# HELP node_filesystem_files_free Filesystem total free file nodes.

# TYPE node_filesystem_files_free gauge

node_filesystem_files_free{device="/dev/nvme0n1p1",fstype="xfs",mountpoint="/"} 1.5679285e+07

node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 1.993227e+06

node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 1.993666e+06

Both disks only show up if I run node_exporter as root:

curl -s "http://localhost:9100/metrics" | grep "node_filesystem_files"

# HELP node_filesystem_files Filesystem total file nodes.

# TYPE node_filesystem_files gauge

node_filesystem_files{device="/dev/nvme0n1p1",fstype="xfs",mountpoint="/"} 1.57276e+07

node_filesystem_files{device="/dev/nvme1n1p1",fstype="ext4",mountpoint="/home/ec2-user/data"} 1.2214272e+07

node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 1.993667e+06

node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 1.993667e+06

# HELP node_filesystem_files_free Filesystem total free file nodes.

# TYPE node_filesystem_files_free gauge

node_filesystem_files_free{device="/dev/nvme0n1p1",fstype="xfs",mountpoint="/"} 1.5679285e+07

node_filesystem_files_free{device="/dev/nvme1n1p1",fstype="ext4",mountpoint="/home/ec2-user/data"} 1.2124279e+07

node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 1.993227e+06

node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/1000"} 1.993666e+06

A few more details that I hope will help with the issue:

My /etc/fstab doesn't have the mount information for the second disk, but /proc/mounts does. Could this be the issue?

Thanks in advance for the help,

Soph
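One way to narrow this down without running the exporter at all is to check whether a given user can query each mount point directly. As far as I understand, the filesystem collector takes its list of mounts from the kernel's mount table under /proc rather than from /etc/fstab, and then calls statfs on each mount point, so a missing fstab entry should not matter by itself. The sketch below is only an approximation of that behaviour: check_mounts is a hypothetical helper name, and it assumes GNU coreutils, where stat -f performs a statfs call.

```shell
# Hypothetical helper: for every mount point listed in a mounts file
# (default: /proc/mounts), report whether the current user can statfs it.
# Note: /proc/mounts escapes spaces in paths as \040; such mount points
# are not decoded here and will be reported as FAILED.
check_mounts() {
    mounts_file="${1:-/proc/mounts}"
    while read -r _device mountpoint _rest; do
        # GNU "stat -f" queries the filesystem (statfs) instead of the file.
        if stat -f -- "$mountpoint" >/dev/null 2>&1; then
            echo "ok      $mountpoint"
        else
            echo "FAILED  $mountpoint"
        fi
    done < "$mounts_file"
}
```

Run as the account that node_exporter uses, any line marked FAILED is a mount point that account cannot query. A single mount can be checked the same way with sudo -u <exporter-user> stat -f /home/ec2-user/data, where <exporter-user> is a placeholder for whatever account the service runs under. A common cause of such a failure is a parent directory that the account is not allowed to traverse, though that is an assumption to verify here, not something the output above proves.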

Brian Candler

Aug 14, 2021, 8:22:05 AM

to Prometheus Users

I presume you're running the latest version 1.2.2?

Maybe strace will help you understand what's going on:

strace -f -s128 -p <pid-of-node-exporter> 2>strace.out

You can see which files it's trying to access and whether it's getting permission errors.
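Because the trace is large, it helps to keep only the calls that failed with a permission error. The following is a minimal sketch, assuming the trace was written to strace.out as in the command above; perm_errors is a hypothetical helper name, and EACCES and EPERM are the two errno values the kernel normally returns for permission problems.

```shell
# Hypothetical helper: print only the syscalls in an strace log that
# failed with a permission error (EACCES or EPERM).
# Reads the files given as arguments, or stdin when none are given.
perm_errors() {
    grep -E '= -1 (EACCES|EPERM) ' -- "$@"
}
```

For example, perm_errors strace.out lists every denied call, and perm_errors strace.out | grep statfs narrows that down to filesystem queries. Lines ending in ENOENT are filtered out on purpose: they mean the file does not exist, which is not a permission problem.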

FWIW, I just tried running node_exporter as root and as a normal user, and I got 3906 and 3900 lines of metrics respectively. The only ones missing were:

# HELP node_rapl_core_joules_total Current RAPL core value in joules

# TYPE node_rapl_core_joules_total counter

node_rapl_core_joules_total{index="0"} 79345.889568

# HELP node_rapl_package_joules_total Current RAPL package value in joules

# TYPE node_rapl_package_joules_total counter

node_rapl_package_joules_total{index="0"} 129630.134976

Soph N

Aug 15, 2021, 11:51:55 PM

to Prometheus Users

Hi Brian,

Yes, I am running 1.2.2.

Regarding strace, is there a keyword I could grep for to identify the permission issue? The output is enormous, and although I can see some errors when I grep for "file", I am not sure they explain why I receive metrics for one disk and not the other.

Here is an example of the output:

[pid 26439] openat(AT_FDCWD, "/sys/devices/system/cpu/cpu13/thermal_throttle/core_throttle_count", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

[pid 26439] openat(AT_FDCWD, "/sys/devices/system/cpu/cpu13/thermal_throttle/package_throttle_count", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)