Monitoring HP raid


Bjørn Jentoft

Jul 3, 2016, 6:48:10 PM
to esos-users
I feel I could use some advice.

So I put my new esos box in semi-production, as storage for some not-so-important stuff. But I fear I put too much disk in the box, and someone might find this extra space and take advantage of it for something a little more important.

What I am scared of is disk failure. I need to monitor my RAID.

As this is an HP RAID, I installed the hpacucli tool for it when creating my esos USB. And it works, I can use it to check the status of my disks. But it's not an agent that can be used to monitor the RAID.

I am aware of smartctl being able to look into individual disks in an array, but I have no clue how to expose this data to the snmpd service for generic monitoring software or the nrpe service for Nagios.

Tried brainstorming with myself. Is there any cron scripting I can do with the hpacucli software and expose the result?

Any suggestions?

Regards, Bjorn

Bjørn Jentoft

Jul 3, 2016, 6:54:14 PM
to esos-users
I will have a go at something like this and post back my results.

http://www.betweendots.com/topic/23-simple-script-i-use-to-monitor-my-smart-array-p410-raid-in-linux/

Regards

Bjørn Jentoft

Jul 3, 2016, 8:04:12 PM
to esos-users
A rewrite of the script in the link above seems to work:

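#!/bin/bash
###
# If something goes wrong with the Compaq Smart Array disks, this script sends an error email
###
MAIL=some@one
# cron runs with a minimal PATH, so stop early if hpacucli cannot be found
HPACUCLI=$(command -v hpacucli) || { echo "hpacucli not found in PATH" >&2; exit 1; }
HPACUCLI_TMP=/tmp/hpacucli.log

"$HPACUCLI" controller slot=4 physicaldrive all show > "$HPACUCLI_TMP"

# Any drive line mentioning Failed or Rebuilding triggers the alert
if grep -q -e Failed -e Rebuilding "$HPACUCLI_TMP"
then
    msg="TEST - RAID Controller Errors"
    echo "$msg"
    logger -p syslog.err -t RAID "$msg"
    # A blank line has to separate the mail headers from the body
    { printf 'Subject: %s [ERROR] - %s\n\n' "$HOSTNAME" "$msg"; cat "$HPACUCLI_TMP"; } | sendmail "$MAIL"
fi

rm -f "$HPACUCLI_TMP"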

  

I'll cron this and wait for my first disk failure... :)
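
For the nrpe route I asked about earlier, the same grep idea could probably be turned into a Nagios-style check instead of an email. This is only a rough sketch and untested against a real failure: the function name check_hp_drives is made up, it reads the hpacucli output from stdin, and it assumes the words Failed and Rebuilding appear in the drive lines just like in the script above. Exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/bash
# Hypothetical helper: classify "physicaldrive all show" text read from stdin.
# Prints one status line and returns a Nagios plugin exit code.
check_hp_drives() {
    local report failed rebuilding
    report=$(cat)

    # No output at all usually means the tool or the controller was not found
    if [ -z "$report" ]; then
        echo "UNKNOWN - no output from the controller tool"
        return 3
    fi

    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true"
    failed=$(printf '%s\n' "$report" | grep -c 'Failed' || true)
    rebuilding=$(printf '%s\n' "$report" | grep -c 'Rebuilding' || true)

    if [ "$failed" -gt 0 ]; then
        echo "CRITICAL - $failed drive(s) reported as Failed"
        return 2
    elif [ "$rebuilding" -gt 0 ]; then
        echo "WARNING - $rebuilding drive(s) reported as Rebuilding"
        return 1
    fi

    echo "OK - no Failed or Rebuilding drives reported"
    return 0
}
```

A plugin script would end with something like `hpacucli controller slot=4 physicaldrive all show | check_hp_drives; exit $?` and be registered in nrpe.cfg with a `command[check_hp_raid]=...` line pointing at wherever the script is saved (name and path are placeholders). The nrpe user would probably need permission to run hpacucli for this to work.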

Bjorn

Marc Smith

Jul 5, 2016, 9:16:50 AM
to esos-...@googlegroups.com
Hi Bjorn,

In ESOS we have this health check script:
https://github.com/parodyne/esos/blob/master/scripts/health_chk.sh

It runs in a cron, and currently checks Avago/LSI and Adaptec RAID
controllers for logical volume faults, and disk faults. Would you be
up for making a patch to this script to support HP Smart Array RAID?
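
As a starting point, the logical volume side of such a patch might look roughly like the sketch below. It is not taken from health_chk.sh and does not follow its structure. The function name hp_logical_faults is hypothetical, and it assumes hpacucli prints one line per logical drive that begins with "logicaldrive" and ends with ", OK)" when the volume is healthy. That format should be confirmed on real hardware first.

```shell
#!/bin/bash
# Hypothetical helper: read "logicaldrive all show" text from stdin and
# print every logical drive line whose final status field is not "OK".
# Returns 0 when no faulty logical drive is found, 1 otherwise.
hp_logical_faults() {
    local faults
    # Keep only logicaldrive lines, then drop the ones ending in ", OK)".
    # "|| true" keeps an empty result from being treated as an error.
    faults=$(grep -E '^[[:space:]]*logicaldrive ' \
        | grep -v -E ', OK\)[[:space:]]*$' || true)

    if [ -n "$faults" ]; then
        printf '%s\n' "$faults"
        return 1
    fi
    return 0
}
```

It could be fed with something like `hpacucli controller slot=4 logicaldrive all show | hp_logical_faults`, with any printed lines going into the alert email the health check already sends.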


--Marc
> --
> You received this message because you are subscribed to the Google Groups
> "esos-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to esos-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Bjorn Jentoft

Jul 5, 2016, 9:32:02 AM
to esos-...@googlegroups.com
I actually saw your script, while trying to troubleshoot my own, as it didn't run that well under cron. I thought about adding to it, but didn't.

However, I found similar scripts for PowerShell and Python that would let me improve the HP RAID health check. So I might as well have a go at patching your script rather than improving my own.

I start my holiday this weekend, so don't expect any results anytime soon.

Bjorn
