I've got the OOM issue figured out now, and it's not a bug; it's an
actual legitimate OOM situation. I feel kinda silly now - it's pretty
obvious in hindsight. md/stripe_cache_size uses memory, and
increasing it uses more memory. Turns out the GnuBee's 512M RAM is
enough for one array w/ 8192 md/stripe_cache_size, but not two -
trying to set the second array to the larger stripe_cache_size uses up
all the memory, and triggers a legitimate OOM. Also, obviously, swap
space won't help here, since the stripe cache is kernel memory and
can't be swapped out.
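As a rough sanity check, the commonly cited rule of thumb for raid5/6 is memory_consumed = system_page_size * nr_disks * stripe_cache_size. Here is a minimal sketch of that arithmetic. The function name `stripe_cache_mib` is made up for illustration, and the 6-disk count and 4096-byte page size in the usage lines are assumptions, not values measured on the GnuBee (check `getconf PAGESIZE` and `mdadm --detail` for the real ones).

```shell
#!/bin/sh
# Estimate stripe cache memory in MiB using the rule of thumb:
#   stripe_cache_size * nr_disks * page_size
# stripe_cache_mib is a hypothetical helper, not part of mdadm or the kernel.
stripe_cache_mib() {
    entries=$1      # value written to md/stripe_cache_size
    disks=$2        # number of member devices (supplied by the caller)
    page_size=$3    # bytes per page, e.g. from `getconf PAGESIZE`
    echo $(( entries * disks * page_size / 1024 / 1024 ))
}

# Example: 8192 vs. the default 256, ASSUMING 6 disks and 4K pages.
stripe_cache_mib 8192 6 4096
stripe_cache_mib 256 6 4096
```

With those assumed inputs this prints 192 and 6, so two arrays at 8192 would want somewhere around 384 MiB on a box that reports 496 MiB total, which fits what I saw.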
The light bulb went on for me when I was testing: I'd switched one of
the two raid6 arrays to raid10 & had it running as GNUBEE-ROOT; the
other raid6 I kept for testing, re-adding the missing drives so it's
not degraded any more. Setting 8192 md/stripe_cache_size worked w/out
triggering OOM, same as before for a single array.

Then I created a new raid5 array, so I'd have two arrays w/
md/stripe_cache_size to test with, and when I tried `echo 8192 >
/sys/block/md124/md/stripe_cache_size`, it did trigger the OOM. But
to my surprise this time it actually managed to recover from the OOM -
and checking `cat /sys/block/md124/md/stripe_cache_size` showed 7184 -
not the 8192 I had echoed into it. And that's when the light bulb
went on.

To confirm, I kept the first array at 8192 & set the second array to
512, then doubled it to 1024, 2048, and 4096, checking free memory
after each step; it succeeded each time, w/ less free memory. I kept
increasing at smaller intervals, until 7168 succeeded, showing 8M free
memory, and the next test at 7680 triggered the OOM.
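The stepping procedure above could be scripted along these lines. This is only a sketch: `step_stripe_cache` and the `MEMINFO` override are hypothetical names I'm introducing here, and it does nothing to prevent an OOM. It just writes each size, reads back what the kernel actually accepted, reports MemFree, and stops at the first size that doesn't stick.

```shell
#!/bin/sh
# step_stripe_cache is a hypothetical helper: write each size to the given
# stripe_cache_size file, read back the value the kernel kept, and report
# MemFree. MEMINFO may be overridden (defaults to /proc/meminfo).
step_stripe_cache() {
    knob=$1; shift
    for size in "$@"; do
        echo "$size" > "$knob" || return 1
        actual=$(cat "$knob")
        free_kb=$(awk '/^MemFree:/ {print $2}' "${MEMINFO:-/proc/meminfo}")
        echo "requested=$size actual=$actual MemFree=${free_kb}kB"
        # Stop as soon as the read-back value differs from the request.
        [ "$actual" = "$size" ] || return 1
    done
}

# Usage (as root, on a real array):
# step_stripe_cache /sys/block/md125/md/stripe_cache_size 512 1024 2048 4096
```

The read-back comparison is the useful part: when the OOM hit me, the file showed 7184 rather than the 8192 I'd asked for.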
So now I'm happy I understand what was triggering the OOM, and it's
not a bug. I'm just going to leave the OMV udev rule disabled (w/
dpkg-divert to prevent it getting installed again), and stay at the
default 256 stripe_cache_size.

Thanks,
Conway S. Smith

P.S.
Here's some of my testing, getting close to but not triggering the OOM.

root@GnuBee0 ~# cat /sys/block/md12?/md/stripe_cache_size
256
256
root@GnuBee0 ~# free -m
              total        used        free      shared  buff/cache   available
Mem:            496         132          45          29         318         320
Swap:          9143           2        9141
root@GnuBee0 ~# echo 8192 > /sys/block/md126/md/stripe_cache_size
root@GnuBee0 ~# cat /sys/block/md12?/md/stripe_cache_size
256
8192
root@GnuBee0 ~# free -m
              total        used        free      shared  buff/cache   available
Mem:            496         303          38          29         155         149
Swap:          9143          22        9121
root@GnuBee0 ~# echo 4096 > /sys/block/md125/md/stripe_cache_size
root@GnuBee0 ~# cat /sys/block/md12?/md/stripe_cache_size
4096
8192
root@GnuBee0 ~# free -m
              total        used        free      shared  buff/cache   available
Mem:            496         395          20          16          80          70
Swap:          9143          40        9103
root@GnuBee0 ~# echo 7168 > /sys/block/md125/md/stripe_cache_size
root@GnuBee0 ~# cat /sys/block/md12?/md/stripe_cache_size
7168
8192
root@GnuBee0 ~# free -m
              total        used        free      shared  buff/cache   available
Mem:            496         437           5           4          53          40
Swap:          9143          89        9054
root@GnuBee0 ~# echo 256 > /sys/block/md125/md/stripe_cache_size
root@GnuBee0 ~# echo 256 > /sys/block/md126/md/stripe_cache_size
root@GnuBee0 ~# cat /sys/block/md12?/md/stripe_cache_size
256
256
root@GnuBee0 ~# free -m
              total        used        free      shared  buff/cache   available
Mem:            496          66         368           4          61         411
Swap:          9143          88        9055