Nagios / Centreon Distributed Monitoring – Clustering
This document refers to my original post on the same subject.
There are many ways to use Nagios for load balancing. The basic architecture is Master/Slave. With plain Nagios, the recommended approach is the NSCA daemon, which is developed by Ethan Galstad (the creator of Nagios). The setup I describe does not use NSCA; instead it uses centcore to keep both Nagios instances and their .cfg files updated. Distributed architecture, or load balancing, in Nagios is based on a central monitoring server (Master) and one or several satellite monitors (Remote).
The master server consolidates all monitoring data and offers a user interface from which you can monitor and manage both the master server and the remote monitors. The remote monitors send their check results to the master server; everything relies on NDO to keep the current status updated. The centcore script is also crucial for keeping the configuration (.cfg) files on both servers aligned and updated.
This type of setup lets you distribute checks for any reason: remote locations, or simply too many checks for one server to handle. Example: all network checks could run on Poller1 and all system checks on Poller2; the same goes for remote sites, etc.
In practice, centcore takes care of the data transfers between the different servers. The master server has to be equipped with a complete monitoring installation (Nagios, Centreon, NDOutils, MySQL, etc.), in contrast with the remote monitors that only have Nagios and NDOutils installed.
[Figure: distributed architecture schema]
Contents
1 Installing NDOUtils
2 Setting up key authentication using SSH
3 Plugin and configuration duplication
4 Centreon configuration
5 SUDO configuration
6 Finalization
7 Some remarks/tips/advice
8 Troubleshooting
NOTE:
This setup was tested with Nagios 3.2.0, Centreon 2.1.3 and NDOUtils 1.4b9 under CentOS 5.4; it should also work on RedHat/Fedora. On the remote it is enough to have just Nagios and NDO set up. You can also install another Centreon there, but there is no real need for it.
1. Installing NDOUtils
# wget http://prdownloads.sourceforge.net/sourceforge/nagios/ndoutils-1.4b9.tar.gz
# tar -zxvf ndoutils-1.4b9.tar.gz
# cd ndoutils-1.4b9
# ./configure --with-ndo2db-user=nagios
# make
# make install
(Do not run the script that creates the DB; Centreon will do it.)
Check that the binaries are in place:
# ll /usr/local/nagios/bin/
2. Setting up key authentication using SSH
On the master server, generate a key pair using ssh-keygen. Accept all defaults and leave the passphrase empty.
# su nagios
# ssh-keygen
> Enter file in which to save the key (/usr/local/nagios/.ssh/id_rsa):
> Created directory '/usr/local/nagios/.ssh'.
> Enter passphrase (empty for no passphrase):
> Enter same passphrase again:
> Your identification has been saved in /usr/local/nagios/.ssh/id_rsa.
Transfer the public key to the remote monitor for the Nagios daemon owner. (Replace <hostname> with the hostname or IP address of the remote monitor.)
# ssh-copy-id -i ~/.ssh/id_rsa.pub nagios@<hostname>
You should get an answer like:
Now try logging into the machine, with "ssh 'nagios@<hostname>'", and check in: .ssh/authorized_keys
For this to work, you may have to manually allow the nagios user to log in over SSH on the remote, and create the .ssh folder plus the authorized_keys file there.
If these steps are successfully completed, you should be able to log on to the remote monitor via SSH without entering a password. Test the SSH connection by running:
$ ssh nagios@remoteserver ls
If you get the directory listing without being asked for a password, key authentication works. Be sure to run this test from the nagios account, and use the same account to start and stop the centcore service.
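The manual test above can also be scripted so that it fails instead of prompting. This is only a minimal sketch, assuming the OpenSSH client; the function name check_key_login and the host name remoteserver are illustrative. BatchMode=yes makes ssh give up rather than ask for a password, so a non-zero exit status means key authentication is not working yet.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 only if the given account can log in
# to the remote monitor with its key, without any password prompt.
check_key_login() {
    target="$1"   # e.g. nagios@remoteserver (placeholder host name)
    # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$target" true; then
        echo "OK: key login to $target works"
    else
        echo "FAIL: key login to $target did not work" >&2
        return 1
    fi
}

# Usage (run as the nagios user on the master):
#   check_key_login nagios@remoteserver
```

Running it from the nagios account matters, because that is the account centcore will use for its own SSH connections.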
An alternative way to do this is here
3. Plugin and configuration duplication
Restart Centreon/Nagios from the Centreon interface:
Configuration > Nagios
Nagios Server: Central
[x] Move Export Files [x] Restart Nagios
Method: Restart OK
Copy all plugins from the master server to the remote monitor:
# scp /usr/local/nagios/libexec/* nagios@{IP_ADDRESS}:/usr/local/nagios/libexec/
# scp /usr/local/nagios/etc/* nagios@{IP_ADDRESS}:/usr/local/nagios/etc/
Log in to each server and check that the contents of /usr/local/nagios/etc are the same.
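One way to make that comparison less error-prone is to compare checksums instead of reading the files side by side. The following is a minimal sketch, assuming md5sum is available on both machines; the function names cfg_checksums and same_checksums are made up for this example, and {IP_ADDRESS} is the same placeholder as in the scp commands.

```shell
#!/bin/sh
# Hypothetical helper: print "checksum  filename" for every .cfg file
# directly inside a directory, sorted by file name.
cfg_checksums() {
    ( cd "$1" && md5sum ./*.cfg | sort -k 2 )
}

# Compare two checksum listings; prints the differing lines and
# returns non-zero if the listings differ.
same_checksums() {
    a=$(mktemp) && b=$(mktemp) || return 2
    printf '%s\n' "$1" > "$a"
    printf '%s\n' "$2" > "$b"
    rc=0
    diff "$a" "$b" || rc=$?
    rm -f "$a" "$b"
    return $rc
}

# Usage on the master (the remote address is a placeholder):
#   local_sums=$(cfg_checksums /usr/local/nagios/etc)
#   remote_sums=$(ssh nagios@{IP_ADDRESS} 'cd /usr/local/nagios/etc && md5sum ./*.cfg | sort -k 2')
#   same_checksums "$local_sums" "$remote_sums" && echo "configs match"
```

Any line printed by diff names a file whose content differs between the two servers.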
4. Centreon configuration
Connect to the Master Centreon interface and configure the remote monitor.
You must configure 4 things:
pollers : configure a poller for each server (2)
Configuration > Centreon > Pollers > Add
(Status:enabled, Localhost: no, IP address, etc.)
Satellite Name : give a new name, <UNIQUE>
Status : enabled
localhost : no
ip address : ip address of the remote server
Nagios Init Script : /etc/init.d/nagios
nagios Binary : /usr/local/nagios/bin/nagios
nagiostats Binary : /usr/local/nagios/bin/nagiostats
ndomod : configure a file for each remote server and master (2)
Next, duplicate the ndomod configuration for the new poller.
Configuration > Centreon > ndomod.cfg.
Select action “Duplicate”.
(Status: enabled, Requester: the name of the freshly created poller, IP address: the IP address of the master server, Instance name: must be unique)
Description : <UNIQUE>
Status : enabled
Instance Name : Must be UNIQUE
Requester : select the correct poller
Interface type : tcp socket
output : IP address of the master
TCP Port : 5668
ndomod.cfg (remote)
instance_name=remote
output_type=tcpsocket
output=172.17.33.79
tcp_port=5668
output_buffer_items=5000
buffer_file=/usr/local/nagios/var/ndomod2.tmp
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=15
reconnect_warning_interval=900
data_processing_options=-1
config_output_options=3
ndo2db : configure a file for each remote server and master too (2)
Next, duplicate the ndo2db configuration for the new poller.
Configuration > Centreon > ndo2db.cfg.
Description : remote <UNIQUE NAME>
Status : enabled
Requester : select the correct poller
Socket Type : tcp
Socket Name : (leave empty, not needed with TCP)
TCP Port : 5668
GENERAL:
Select action “Duplicate”.
Status: enabled
Requester: the name of the new Remote poller
SocketType: tcp
SocketName: not needed since it is TCP
TCPPort: 5668
DATABASE:
Database Type: MySQL
DatabaseHoster:<IP of MASTER>
Databasename:nagios
Listening Port: 3306
Prefix: nagios_
User:nagios
Password: pass
ndo2db.cfg (remote)
ndo2db_user=nagios
ndo2db_group=nagios
socket_type=tcp
socket_name=/var/run/centreon/ndo.sock
tcp_port=5668
db_servertype=mysql
db_host=<IP of MASTER>
db_name=nagios
db_port=3306
db_prefix=nagios_
db_user=nagios
db_pass=passwordnagiosuser
max_timedevents_age=1440
max_systemcommands_age=1440
max_servicechecks_age=1440
max_hostchecks_age=1440
max_eventhandlers_age=1440
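Since ndomod and the ndo2db instance it connects to must agree on the transport and the port, a quick consistency check on the two files can save some debugging time. This is a sketch only: cfg_get and check_ndo_pair are hypothetical helper names, and the parser assumes plain key=value lines without spaces around the equals sign, as in the listings above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key from a key=value .cfg file.
# $1 = file, $2 = key; comment lines are ignored, the last match wins.
cfg_get() {
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }
        $1 == key  { val = substr($0, index($0, "=") + 1) }
        END        { print val }
    ' "$1"
}

# Check that ndomod.cfg and ndo2db.cfg agree on TCP transport and port.
check_ndo_pair() {
    ndomod_cfg="$1"; ndo2db_cfg="$2"
    [ "$(cfg_get "$ndomod_cfg" output_type)" = "tcpsocket" ] ||
        { echo "ndomod: output_type is not tcpsocket" >&2; return 1; }
    [ "$(cfg_get "$ndo2db_cfg" socket_type)" = "tcp" ] ||
        { echo "ndo2db: socket_type is not tcp" >&2; return 1; }
    [ "$(cfg_get "$ndomod_cfg" tcp_port)" = "$(cfg_get "$ndo2db_cfg" tcp_port)" ] ||
        { echo "tcp_port differs between the two files" >&2; return 1; }
    echo "ndomod.cfg and ndo2db.cfg are consistent"
}

# Usage:
#   check_ndo_pair /usr/local/nagios/etc/ndomod.cfg /usr/local/nagios/etc/ndo2db.cfg
```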
nagios.cfg : configure a file for each server(2)
Next, also duplicate the nagios configuration for the new poller.
Master Centreon : Configuration > Nagios > nagios.cfg
Select action “Duplicate”.
(Status: enabled, Server Nagios configured: the name of the freshly created poller)
Status : enabled
Server Nagios configured : select the correct poller
5. SUDO configuration
In order to allow the master server to manage the Nagios daemon on the remote monitor,
sudo has to be configured. Edit /etc/sudoers and add the following lines:
# visudo
nagios ALL=NOPASSWD: /etc/init.d/nagios restart
nagios ALL=NOPASSWD: /etc/init.d/nagios stop
nagios ALL=NOPASSWD: /etc/init.d/nagios start
nagios ALL=NOPASSWD: /etc/init.d/nagios reload
nagios ALL=NOPASSWD: /usr/local/nagios/bin/nagiostats
nagios ALL=NOPASSWD: /usr/local/nagios/bin/nagios *
nagios ALL=NOPASSWD: /usr/local/nagios/bin/ndo2db *
Note: Using the "visudo" command is the preferred way to edit the /etc/sudoers file.
6. Finalization
Make sure centcore is running on the master server only, under the nagios user. If it is not running, start it:
# /etc/init.d/centcore start
Configure a host on the new remote poller, restart both Nagios instances, and watch whether it works.
7. Some remarks/tips/advice
* Remote pollers are only supported from Centreon 2 beta 5 onwards.
* Nagios 3.x is required.
* NDOutils does not give a lot of information on why things just won't work, so make sure NDOutils is compiled with MySQL support and review the config.log carefully. If NDO2DB is working, you should see a MySQL session for the configured user on the configured database.
* The procedures to restart or reload Nagios, as well as the transfer of configs to remote pollers, are called via a command file (/var/lib/centreon/centcore.cmd). Make sure that both the Apache owner and the centcore owner can create and modify this command file.
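For the MySQL support remark, one quick check besides config.log is to look at the libraries the ndo2db binary is linked against. A minimal sketch, assuming ndo2db is dynamically linked (the usual case); the function name linked_with_mysql is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and succeed if the
# binary is linked against the MySQL client library.
linked_with_mysql() {
    grep -q 'libmysqlclient'
}

# Usage:
#   ldd /usr/local/nagios/bin/ndo2db | linked_with_mysql \
#       && echo "MySQL support compiled in" \
#       || echo "no MySQL client library found, check config.log"
```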
8. Troubleshooting
If you get it working on the first try, congratulations! For me it took a whole day, but it was well worth it.
ALLOW MYSQL REMOTE ACCESS:
If NDO cannot reach the MySQL database, MySQL must be set up to listen on the correct IP. By default MySQL listens only on 127.0.0.1, so no inbound connections are accepted.
MySQL5 – connect from Remote.
Log in via SSH and edit /etc/mysql/my.cnf so that bind-address = 192.168.X.X (this server's own IP, not 127.0.0.1). MySQL must be restarted for the change to take effect.
# vim /etc/mysql/my.cnf
# mysql -u root -p mysql
use centreon
GRANT ALL ON centreon.* TO nagios@'192.168.X.X' IDENTIFIED BY 'PASSWORD';
Or just use phpMyAdmin to set the permissions.
CENTCORE PROBLEMS
Check the centcore log file:
# tail /usr/local/centreon/log/centcore.log
Debug is disabled by default. I recommend turning it on by editing the file:
# vim /usr/local/centreon/bin/centcore
Find every debug=0 and set it to debug=1, restart centcore (using the nagios account), and watch the log:
# tail -f /usr/local/centreon/log/centcore.log
Centcore always has to run under the nagios account, so be sure to do:
# chown nagios:nagios  /var/lock/subsys/centcore
# sudo -u nagios service centcore restart
Confirm it is running:
# ps ax |grep centco
At first my centcore script was giving me errors like: "Try `mv --help' for more information.
Use of uninitialized value in concatenation (.) or string at /usr/local/centreon/bin/centcore line 293."
I now know this happened because the script queries the nagios database for the location of the files, and in my case that entry was missing.
On both servers, go into nagios/etc and verify manually that the files are correct. The important ones are nagios.cfg, ndo2db.cfg and ndomod.cfg.
NDO PROBLEMS
First check if the daemon is running
# ps ax | grep ndo
26032 ? SNs 0:00 /usr/local/nagios/bin/ndo2db -c /usr/local/nagios/etc/ndo2db.cfg
Check if the service is listening on port 5668
# netstat -an | grep :5668
tcp 0 0 0.0.0.0:5668 0.0.0.0:* LISTEN
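If you want to run this check from a script (or a Nagios plugin), the netstat output can be filtered with a small helper. This is a sketch only: is_listening is a hypothetical function name, and it assumes the Linux netstat column layout shown above, where the local address is the fourth field.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -an` output on stdin and succeed
# if some TCP socket is in LISTEN state on the given port.
is_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, parts, ":")      # local address is field 4
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on the machine running ndo2db:
#   netstat -an | is_listening 5668 && echo "ndo2db port is open"
```

Matching on the LISTEN state avoids false positives from established connections that merely involve port 5668.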
Command to startup ndo2db:
# /usr/local/nagios/bin/ndo2db-3x -c /usr/local/nagios/etc/ndo2db.cfg
Look in the nagios.log to check if the broker (aka ndo) started successfully, like:
ndomod: NDOMOD 1.4b7 (10-31-2007) Copyright (c) 2005-2007 Ethan Galstad (nagios@nagios.org)
[1249550302] ndomod: Successfully connected to data sink.  1325 queued items to flush.
[1249550304] ndomod: Successfully flushed 1325 queued items to data sink.
[1249550304] Event broker module '/usr/local/nagios/bin/ndomod-3x.o' initialized successfully.
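The same log check can be scripted. The sketch below looks at the last ndomod line that mentions the data sink and reports "connected" only if that line starts with the "Successfully" wording shown above; any other data sink line is treated as a problem. The function name ndomod_status and the log path in the usage line are assumptions, so adjust them to your installation.

```shell
#!/bin/sh
# Hypothetical helper: summarise the ndomod state from a nagios.log file.
ndomod_status() {
    log="$1"
    # Take the last ndomod line that mentions the data sink.
    last=$(grep 'ndomod:' "$log" | grep -i 'data sink' | tail -n 1)
    case "$last" in
        "")               echo "unknown"; return 2 ;;
        *"Successfully"*) echo "connected" ;;
        *)                echo "problem: $last"; return 1 ;;
    esac
}

# Usage (log path assumed for a source install under /usr/local/nagios):
#   ndomod_status /usr/local/nagios/var/nagios.log
```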
It is critical that NDO works correctly; I monitor it every 5 minutes using this very cool plugin
In my /etc/init.d/centcore I commented out the su line and call the binary directly:
#su - nagios -c "nice -n $NICE $Bin >> $DemLog 2>&1"
$Bin >> $DemLog 2>&1
Just be 100% sure that centcore is started under the nagios account.

3 thoughts on “Centreon Distributed Monitoring (v2)”

  1. Thanks for the information but the question that I have concerns network availability between the remote poller and the central server. What happens if communication is lost? Will the remote poller still monitor its list of hosts and services? Will it send out notifications as necessary? When the network communication is restored will everything go back to normal? In short I’m finding very little information about the remote polling feature in Centreon and am looking for any information you can provide.
    Thanks again!

  2. Man… thanks a lot, and thank you for sharing this knowledge.. it helped me a lot.. yeah
