Another Nagios/bash script that checks whether a cluster has performed a failover. This one works with any Microsoft cluster.
#!/bin/bash
# 2008-07-23 Nereu
# edit: Felipe Ferreira 01-2010
TMPDIR=/usr/local/nagios/var
PLGDIR=/usr/local/nagios/libexec
OK=0
WARN=1
CRIT=2
UNKN=3
if [ "$#" -ne 3 ]
then
echo "$0 Service cluster_node1 cluster_node2"
exit $UNKN
else
SERVICE=$1
NODE1=$2
NODE2=$3
fi
# Strip "$" from the service name so it can be used as a file name
FILENAME=$(echo "$SERVICE" | tr -d '$')
LAST_FILE=${TMPDIR}/.${FILENAME}
if [ -e $LAST_FILE ]
then
LAST=`cat $LAST_FILE`
else
LAST=""
fi
RESULT=`$PLGDIR/check_nt -H ${NODE1} -v SERVICESTATE -l ${SERVICE}|grep running`
### An empty RESULT (service stopped, or no return from check_nt) falls through to NODE2
if [ "$RESULT" = "All services are running" ]
then
CURRENT=$NODE1
else
CURRENT=$NODE2
fi
echo $CURRENT >$LAST_FILE
# OK on the first run (no recorded node) or when the node is unchanged
if [ "$LAST" = "$CURRENT" ] || [ -z "$LAST" ]
then
echo "Service in ${CURRENT}"
exit $OK
else
echo "Service change from ${LAST} to ${CURRENT}"
exit $CRIT
fi
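
The final comparison is the part that decides the alert, so it can help to look at it in isolation. Below is a minimal sketch of that decision, assuming the intent is "OK on the first run or when the node is unchanged, CRITICAL when the node changed". The function name `failover_status` is hypothetical and is not part of the script above.

```shell
#!/bin/bash
# Nagios return codes used by the script
OK=0
CRIT=2

# Hypothetical helper (not in the original script): compares the node
# recorded on the previous run with the node found on this run.
failover_status() {
    local last=$1 current=$2
    # First run (nothing recorded yet) or same node as before: no failover.
    if [ -z "$last" ] || [ "$last" = "$current" ]; then
        echo "Service in ${current}"
        return $OK
    fi
    # The recorded node differs from the current one: report a failover.
    echo "Service change from ${last} to ${current}"
    return $CRIT
}

# Example calls with made-up node names
failover_status "" "node1"
failover_status "node1" "node1"
failover_status "node1" "node2" || echo "exit code: $?"
```

Only the third call returns a non-zero code, which is what Nagios would treat as CRITICAL.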

Hello,
Looks like a nice script, but it won't catch cluster resources that fail over, fail again, and come back online on the same node. Or am I wrong? I really need a script that detects failing cluster resources, even when they don't fail over.
Hello,
I had some trouble getting the correct return values when the cluster is not working at all, and also when using nscp instead of the standard check_nt.
I have also added a check that the cluster is running on the second node in case the first one is broken. If the second node is broken as well, that case is now handled too.
#!/bin/bash
# 2008-07-23 Nereu
# edit: Felipe Ferreira 01-2010
# edit: Martin Mahnert 2014-02-12
TMPDIR=/usr/local/nagios
# PLGDIR=/usr/local/nagios/libexec
PLGDIR=/usr/lib/nagios/plugins
OK=0
WARN=1
CRIT=2
UNKN=3
if [ "$#" -ne 3 ]
then
echo "$0 Service cluster_node1 cluster_node2"
exit $UNKN
else
SERVICE=$1
NODE1=$2
NODE2=$3
fi
FILENAME=$(echo "$SERVICE" | tr -d '$')
LAST_FILE=${TMPDIR}/.${FILENAME}
if [ -e $LAST_FILE ]
then
LAST=`cat $LAST_FILE`
else
LAST=""
fi
ACCESS=`grep 12489 /etc/nagios-plugins/config/nt.cfg | awk '{print $9}'`
PASS=`echo ${ACCESS//\"/}`
#RESULT1=`${PLGDIR}/check_nt -H ${NODE1} -v SERVICESTATE -l ${SERVICE} | grep -c OK`
RESULT1=`${PLGDIR}/check_nt -H ${NODE1} -u -p 12489 -s ${PASS} -v SERVICESTATE -l ${SERVICE} | grep -c OK`
if [ -n "${RESULT1}" ]; then ### ADDED THIS SO IT WILL NOT FAIL IN CASE NO RETURN FROM CHECK_NT
if [ "${RESULT1}" = "1" ]
then
CURRENT=$NODE1
elif [ "${RESULT1}" = "0" ]
then
RESULT2=`${PLGDIR}/check_nt -H ${NODE2} -u -p 12489 -s ${PASS} -v SERVICESTATE -l ${SERVICE} | grep -c OK`
if [ -n "${RESULT2}" ]; then ### ADDED THIS SO IT WILL NOT FAIL IN CASE NO RETURN FROM CHECK_NT
if [ "${RESULT2}" = "1" ]
then
CURRENT=$NODE2
elif [ "${RESULT2}" = "0" ]
then
CURRENT="Not active."
fi
fi
fi
fi
echo $CURRENT >$LAST_FILE
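# OK only if the node is unchanged (or this is the first run) and the service is active somewhere
if { [ "$LAST" == "$CURRENT" ] || [ "$LAST" == "" ]; } && [ "$CURRENT" != "Not active." ]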
then
echo "Service active on: ${CURRENT}"
echo "OK"
exit $OK
else
if [ "$CURRENT" == "Not active." ]
then
echo "Service active on: ${CURRENT}"
echo "CRIT"
exit $CRIT
else
echo "Service change from ${LAST} to ${CURRENT}"
echo "CRIT"
exit $CRIT
exit $CRIT
fi
fi
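
The second-node check described above comes down to "ask each node in turn, and report 'Not active.' if none answers OK". Here is a rough sketch of that idea, which can be tried without a Windows host. `probe_node`, `pick_active_node` and `ACTIVE_NODE` are hypothetical names used only for illustration. In real use `probe_node` would wrap the check_nt call from the script.

```shell
#!/bin/bash
# Hypothetical stand-in for the check_nt call: succeeds when the service is
# reported as running on the given node. ACTIVE_NODE simulates the cluster.
probe_node() {
    [ "$1" = "$ACTIVE_NODE" ]
}

# Print the first node on which the probe succeeds, or "Not active." when
# no node reports the service as running.
pick_active_node() {
    local node
    for node in "$@"; do
        if probe_node "$node"; then
            echo "$node"
            return 0
        fi
    done
    echo "Not active."
    return 1
}

# Simulate a cluster whose service currently runs on the second node
ACTIVE_NODE=node2
pick_active_node node1 node2

# Simulate a cluster where the service runs nowhere
ACTIVE_NODE=none
pick_active_node node1 node2 || true
```

Writing it as a loop over the nodes avoids the nested if blocks and would also cover clusters with more than two nodes.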
Cool Martin. Good work.