Date created: Monday, April 29, 2013 2:58:17 PM. Last modified: Tuesday, April 30, 2013 11:29:44 AM

Check Dell RAID, Disks and Battery via SNMP

Check RAID state, virtual disk state, physical disk state, and RAID battery backup unit (BBU) health with two bash scripts. The scripts are exposed through SNMP and return OK, or FAULT plus a fault description when there is a problem.

These can be checked from Nagios with check_snmp (add `-H` and the target host) using `check_snmp -o .1.3.6.1.4.1.8072.1.3.2.3.1.2.8.99.104.101.99.107.98.98.117 -C c0mMunITY -P 2c -r OK` for the battery and `check_snmp -o .1.3.6.1.4.1.8072.1.3.2.3.1.2.9.99.104.101.99.107.114.97.105.100 -C c0mMunITY -P 2c -r OK` for the RAID state.
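The long numeric tail of each OID is not arbitrary: it is the length of the extend name followed by the ASCII code of each character, so `8.99.104.101.99.107.98.98.117` spells `checkbbu`. A minimal sketch that derives the OID from a name, assuming the base `.1.3.6.1.4.1.8072.1.3.2.3.1.2` copied from the commands above; the function name `extend_oid` is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: build the OID for a net-snmp "extend" entry.
# The base OID is copied from the check_snmp commands above.
extend_oid() {
    local name=$1
    # The index starts with the number of characters in the name.
    local oid=".1.3.6.1.4.1.8072.1.3.2.3.1.2.${#name}"
    local i code
    for (( i = 0; i < ${#name}; i++ )); do
        # A leading quote makes printf emit the character's numeric code.
        printf -v code '%d' "'${name:i:1}"
        oid+=".$code"
    done
    printf '%s\n' "$oid"
}

extend_oid checkbbu
extend_oid checkraid
```

This is handy when an extend entry is renamed in snmpd.conf, since the OID used by check_snmp changes with the name.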

Add two extend options to /etc/snmp/snmpd.conf (don't forget to restart snmpd!)

extend checkbbu /usr/scripts/checkbbu.sh
extend checkraid /usr/scripts/checkraid.sh

Add MegaCLI and omreport to the sudoers file for the snmpd user `snmp`:

# Cmnd alias specification
Cmnd_Alias MCLI = /opt/MegaRAID/MegaCli/MegaCli
Cmnd_Alias OMREPORT = /usr/sbin/omreport

# User privilege specification
snmp ALL=NOPASSWD: MCLI, OMREPORT

Add the two scripts below, one for the RAID disks and one for the battery. In each script you can enable omreport, MegaCLI, or both, depending on which you have installed.

Script for checking the RAID battery via omreport and/or MegaCLI, checkbbu.sh

#!/bin/bash

fault="No"

# Use omreport output for RAID testing
omenable=1
omreport=/usr/sbin/omreport

# Use MegaCLI for RAID testing
megaenable=1
megacli=/opt/MegaRAID/MegaCli/MegaCli

sudo=/usr/bin/sudo

if [ $omenable -eq 1 ]
then
    for status in $($sudo $omreport storage battery | grep Status | grep -v "Capacity Status" | awk '{print $NF}')
    do
        if [ "$status" != "Ok" ]
        then
            fault="Battery has status: $status"
        fi
    done

    for state in $($sudo $omreport storage battery | grep State | grep -v "Learn State" | awk '{print $NF}')
    do
        if [ "$state" != "Ready" ]
        then
            fault="Battery has state: $state"
        fi
    done
fi

if [ $megaenable -eq 1 ]
then
    # mktemp creates a unique temporary file instead of a guessable /tmp/$RANDOM path
    output=$(mktemp)
    $sudo $megacli -AdpBbuCmd -aALL > "$output"

    # Battery Replacement required            : No
    batreplace=$(grep "Battery Replacement required" $output | awk '{print $5}')
    # Pack is about to fail & should be replaced : No
    batabout=$(grep "Pack is about to fail & should be replaced" $output | awk '{print $11}')
    # Remaining Time Alarm    : No
    battime=$(grep "Remaining Time Alarm" $output | awk '{print $5}')
    # Remaining Capacity Alarm: No
    batcapac=$(grep -m 1 "Remaining Capacity Alarm" $output | awk '{print $4}')

    if [ "$batreplace" != "No" ] || [ "$batabout" != "No" ] || [ "$battime" != "No" ] || [ "$batcapac" != "No" ] ; then
        fault="Battery Replacement required: $batreplace, Pack is about to fail and should be replaced: $batabout, Remaining Time Alarm: $battime, Remaining Capacity Alarm: $batcapac"
    fi

    rm -f "$output"
fi

if [ "$fault" != "No" ]
then
    echo -n "FAULT: $fault"
else
    echo -n "OK"
fi
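The omreport filters in checkbbu.sh can be exercised without a RAID controller by feeding them canned text. The sketch below wraps the same grep and awk pipeline in a function that reads from stdin. The function name `check_battery` is hypothetical, and both samples are invented for illustration (they only contain the field names the script greps for), not captured from real hardware.

```shell
#!/bin/bash

# Hypothetical helper: apply the omreport battery filters to text on stdin.
check_battery() {
    local input fault="No" status state
    input=$(cat)

    # Every "Status" line except "Capacity Status" must end in "Ok".
    for status in $(grep Status <<<"$input" | grep -v "Capacity Status" | awk '{print $NF}'); do
        if [ "$status" != "Ok" ]; then
            fault="Battery has status: $status"
        fi
    done

    # Every "State" line except "Learn State" must end in "Ready".
    for state in $(grep State <<<"$input" | grep -v "Learn State" | awk '{print $NF}'); do
        if [ "$state" != "Ready" ]; then
            fault="Battery has state: $state"
        fi
    done

    if [ "$fault" != "No" ]; then
        echo -n "FAULT: $fault"
    else
        echo -n "OK"
    fi
}

# Invented samples, assumed to resemble the fields omreport prints.
sample_ok='Status                    : Ok
State                     : Ready
Predicted Capacity Status : Ready
Learn State               : Idle'

sample_bad='Status                    : Ok
State                     : Degraded
Predicted Capacity Status : Ready
Learn State               : Idle'

check_battery <<<"$sample_ok"; echo
check_battery <<<"$sample_bad"; echo
```

Running a saved copy of real `omreport storage battery` output through the function is a quick way to confirm the filters match your OMSA version before wiring the script into snmpd.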

Script for checking the RAID virtual disk state and physical disks via omreport and/or MegaCLI, checkraid.sh

#!/bin/bash

fault="No"

# Use omreport output for RAID testing
omenable=1
omreport=/usr/sbin/omreport

# Use MegaCLI for RAID testing
megaenable=1
megacli=/opt/MegaRAID/MegaCli/MegaCli

sudo=/usr/bin/sudo

if [ $omenable -eq 1 ]
then
    for state in $($sudo $omreport storage pdisk controller=0 | grep State | awk '{print $NF}')
    do
        if [ "$state" != "Online" ] && [ "$state" != "Ready" ]
        then
            fault="Physical disk has state: $state"
        fi
    done

    for failure in $($sudo $omreport storage pdisk controller=0 | grep Failure | awk '{print $NF}')
    do
        if [ "$failure" != "No" ]
        then
            fault="Physical disk failure predicted: $failure"
        fi
    done

    for state in $($sudo $omreport storage vdisk | grep State | awk '{print $NF}')
    do
        if [ "$state" != "Ready" ]
        then
            fault="Virtual disk has state: $state"
        fi
    done
fi

if [ $megaenable -eq 1 ]
then
    # -F matches "S.M.A.R.T" literally; unquoted, each dot is a regex wildcard
    for status in $($sudo $megacli -LdPdInfo -aALL | grep -F "S.M.A.R.T" | awk '{print $NF}')
    do
        if [ "$status" != "No" ]
        then
            fault="Drive has flagged a S.M.A.R.T alert: $status"
        fi
    done

    # -m 1 keeps only the first matching "State" line, so only the first virtual disk is checked
    for state in $($sudo $megacli -LdPdInfo -aALL | grep -m 1 State | awk '{print $NF}')
    do
        if [ "$state" != "Optimal" ]
        then
            fault="Virtual Disk State: $state"
        fi
    done
fi

if [ "$fault" != "No" ]
then
    echo -n "FAULT: $fault"
else
    echo -n "OK"
fi
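Both scripts keep only the last fault found, because each failed check overwrites `fault`. One possible variation, which is not part of the original scripts, collects every fault in an array and joins them in the final message. The names `faults`, `add_fault` and `report` are hypothetical, and the two fault strings are examples only.

```shell
#!/bin/bash

# Hypothetical variation: collect every fault instead of keeping only the last one.
faults=()

add_fault() {
    faults+=("$1")
}

report() {
    local joined="" f
    for f in "${faults[@]}"; do
        # Prefix "; " only when something has already been joined.
        joined+="${joined:+; }$f"
    done

    if [ -n "$joined" ]; then
        echo -n "FAULT: $joined"
    else
        echo -n "OK"
    fi
}

# Example usage with two made-up faults.
add_fault "Physical disk has state: Failed"
add_fault "Virtual disk has state: Degraded"
report; echo
```

In the scripts, each `fault="..."` assignment would become an `add_fault "..."` call and the final if/else would become a call to `report`. The output still starts with OK or FAULT, so the `-r OK` match in check_snmp keeps working.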
