Date created: Friday, January 20, 2012 11:54:00 PM. Last modified: Tuesday, September 19, 2017 12:03:08 PM
RRDTool Traffic Drop Alert
This script uses rrdtool to pull two values from an RRA file. If the more recent value is lower than the older one by a given threshold, an alert email is sent.
Various RRDtool-based monitoring platforms have "threshold" style alerting, like the "Thold" plugin for Cacti, but if a situation requires alerting on sudden drops or sudden rises in link utilisation, Thold might not do exactly what is needed. An example would be one link dropping by 90% in usage since it was last polled 5 minutes ago, while another link increases in utilisation by 90% over the same interval. This may indicate a transfer problem across the first link that has caused a routing protocol re-convergence without raising a layer 1 link-down alert, because the link is still up.
Care must be taken against false reports. This is most useful in scenarios where a data stream is being sent over a link continuously and indefinitely.
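The drop check described above comes down to one comparison: the newer sample is tested against a fraction of the older one. The following is a minimal sketch of that comparison only. The function name is_drop and its argument order are hypothetical and not part of the script on this page, and it uses awk for the floating point arithmetic.

```bash
#!/bin/sh
# is_drop OLDER RECENT THRESHOLD  (hypothetical helper, for illustration only)
# Succeeds (exit status 0) when RECENT is below OLDER * THRESHOLD.
# With THRESHOLD=0.1 this means usage has fallen by more than 90 percent.
is_drop() {
    awk -v older="$1" -v recent="$2" -v threshold="$3" \
        'BEGIN { exit !(recent < older * threshold) }'
}
```

For example, is_drop 1000 150 0.2 succeeds because 150 is below 20 percent of 1000, while is_drop 1000 250 0.2 does not.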
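One possible source of false reports is a sample that is missing rather than low. As far as I know, rrdtool fetch prints unknown values as "nan" or "-nan", so a guard along the following lines could be applied to a sample before comparing it. This is only a sketch under that assumption. The function name valid_sample is hypothetical and it is not part of the script on this page.

```bash
#!/bin/sh
# valid_sample VALUE  (hypothetical helper, for illustration only)
# Succeeds only when VALUE looks like a plain or exponent-notation number,
# e.g. 1500 or 1.2345678900e+03. Empty strings and "nan"/"-nan" are rejected.
valid_sample() {
    printf '%s\n' "$1" | grep -Eq '^-?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?$'
}
```

A caller could skip the alert for that polling cycle whenever either sample fails this check.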
traffic_in_watch.sh
#!/bin/bash
# Usage:
#   ./traffic_in_watch.sh /path/to/rrafile 'description of error' trigger_threshold
#
# Example:
#   ./traffic_in_watch.sh /var/lib/cacti/rra/router1_fa0-1.rra "core router 1 fa0-1 link has dropped very low" 0.2
#
# The trigger_threshold is a floating point fraction: the newer sample must be below
# this fraction of the older sample for an alert to be sent.
# For example, 0.2 means the newer sample must be below 20 percent of the older sample,
# so usage has dropped by more than 80 percent. To trigger on a 90 percent drop, set it to 0.1

# Time of the last update, rounded down to a 5 minute (300 second) boundary, minus one second
recentepoch=$(rrdtool last "$1")
recentepoch=$(echo "(($recentepoch/300)*300)-1" | bc)
previousepoch=$((recentepoch - 300))

# Take the first data row of each fetch; grep "e" keeps rows whose values are in exponent notation
recentsample=$(rrdtool fetch "$1" AVERAGE -s "$recentepoch" | grep "e" | head -n 1 | awk '{print $2}')
recentsample=$(printf "%.f" "$recentsample")
oldersample=$(rrdtool fetch "$1" AVERAGE -s "$previousepoch" -e "$recentepoch" | grep "e" | head -n 1 | awk '{print $2}')
oldersample=$(printf "%.f" "$oldersample")

# Alert level: the older sample scaled by the threshold, rounded to an integer
trigger=$(printf "%.f" "$(echo "$oldersample * $3" | bc)")

if [ "$recentsample" -lt "$trigger" ]
then
    # Convert to Kbps (this conversion assumes the samples are in bytes per second)
    recentspeed=$(echo "($recentsample*8)/1000" | bc)
    olderspeed=$(echo "($oldersample*8)/1000" | bc)
    {
        echo ""
        echo "$2"
        echo "$1"
        echo "$(date -d @"$recentepoch") ($recentepoch) : $recentspeed Kbps"
        echo "$(date -d @"$previousepoch") ($previousepoch) : $olderspeed Kbps"
    } | mail -s "Link Utilisation Alert" "user@email.com"
fi
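The script rounds the last update time down to a 300 second boundary before fetching. That rounding relies on integer division discarding the remainder, which the sketch below shows in plain shell arithmetic. The function name align_epoch is hypothetical and the timestamp in the usage note is an arbitrary example value.

```bash
#!/bin/sh
# align_epoch EPOCH STEP  (hypothetical helper, for illustration only)
# Prints EPOCH rounded down to the nearest multiple of STEP seconds.
# Integer division drops the remainder, so multiplying back gives the boundary.
align_epoch() {
    echo $(( ($1 / $2) * $2 ))
}
```

For example, align_epoch 1326585450 300 prints 1326585300, which is 150 seconds earlier and falls exactly on a 5 minute boundary.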
This can be triggered with a crontab entry like the following:
*/5 * * * * /path/to/traffic_in_watch.sh /var/lib/cacti/rra/router1-01_link1_traffic_in_1234.rrd "There has been an 80 percent or greater drop in traffic from Router-01 link-1, within 5 minutes." 0.2