Troubleshooting Nagios/Centreon

I. If the Centreon monitor does not update itself

(Nagios shows a different state than Centreon.)

This could be a permission problem related to status.log.
Read the nagios user's email for problems with cron jobs and permissions, and check the logs at Oreon > Monitor > Event-logs for any errors.

II. “Unable to connect to data sink” error

On the web front end you also see the error message:
“Connection Error to NDO DataBase !”
1. Check ndo2db.cfg: verify that the MySQL credentials are valid and that the user has permissions on the database.
2. Check whether the ndo.sock file exists and which PID holds it: fuser ndo.sock
3. If ndo.sock has no PID, ndomod.cfg probably has the socket type set to TCP (wrong). Change it to UNIX and the problem is solved.
4. It could also be a permission problem on the ndo.sock file.
5. By default Centreon sets NDO to use TCP; it should be UNIX.
Change it here:
Oreon > Configuration > Centreon > ndo2db.cfg
Socket Type: UNIX
Socket Name: /usr/local/nagios/var/ndo.sock
On the Database tab, change the wrong default (ndo) to nagios:
Database Name: nagios
Oreon > Configuration > Centreon > ndomod.cfg
Change the type to UNIX, and set Output and Buffer correctly:
Output should point to the ndo.sock file (default: /usr/local/nagios/var/ndo.sock)
Buffer should point to the ndomod.tmp file (default: /usr/local/nagios/var/ndomod.tmp)
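The settings above end up in the generated ndo2db.cfg and ndomod.cfg files, so they can also be checked from the command line. The sketch below is only an illustration: the key names (socket_type, socket_name, output_type, output) are assumed from the NDOUtils sample configs, and the function names cfg_value and check_ndo_socket are made up for this example.

```shell
# Sketch only: key names are assumed from the NDOUtils sample configs.

# cfg_value FILE KEY: print the value of the last uncommented KEY=value line.
cfg_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# check_ndo_socket NDO2DB_CFG NDOMOD_CFG: succeed only when both files are
# set to a UNIX socket and point at the same socket path.
check_ndo_socket() {
    sock_type=$(cfg_value "$1" socket_type)
    sock_name=$(cfg_value "$1" socket_name)
    out_type=$(cfg_value "$2" output_type)
    out_path=$(cfg_value "$2" output)

    if [ "$sock_type" != "unix" ]; then
        echo "ndo2db.cfg: socket_type is '$sock_type', expected 'unix'"
        return 1
    fi
    if [ "$out_type" != "unixsocket" ]; then
        echo "ndomod.cfg: output_type is '$out_type', expected 'unixsocket'"
        return 1
    fi
    if [ "$sock_name" != "$out_path" ]; then
        echo "socket paths differ: '$sock_name' vs '$out_path'"
        return 1
    fi
    echo "OK: both files use UNIX socket $sock_name"
}
```

With the paths used in this article, the call would be: check_ndo_socket /usr/local/nagios/etc/ndo2db.cfg /usr/local/nagios/etc/ndomod.cfg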
Command to start ndo2db:
/usr/local/nagios/bin/ndo2db-3x -c /usr/local/nagios/etc/ndo2db.cfg
Make sure ndo2db is always running:
ps aux | grep ndo2
Make sure that both files (ndo.sock and ndomod.tmp) exist and that their permissions are correct.
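The "is it running, and is the socket there" check can be wrapped in a small script, for example to call from cron. This is a minimal sketch, not part of Centreon or NDOUtils: the function names socket_ok and ensure_ndo2db are hypothetical, the default paths are the ones used in this article, and it assumes pgrep is available.

```shell
# socket_ok PATH: succeed only if PATH exists and is a UNIX socket.
socket_ok() {
    [ -S "$1" ]
}

# ensure_ndo2db [BIN] [CFG] [SOCK]: start ndo2db when no matching process
# is found, then verify that the socket file is present.
ensure_ndo2db() {
    bin=${1:-/usr/local/nagios/bin/ndo2db-3x}
    cfg=${2:-/usr/local/nagios/etc/ndo2db.cfg}
    sock=${3:-/usr/local/nagios/var/ndo.sock}

    if ! pgrep -f "$bin" >/dev/null 2>&1; then
        echo "ndo2db not running, starting it"
        "$bin" -c "$cfg" || return 1
        sleep 1   # give the daemon a moment to create its socket
    fi

    if socket_ok "$sock"; then
        echo "socket present: $sock"
    else
        echo "socket missing: $sock"
        return 1
    fi
}
```

Calling ensure_ndo2db with no arguments uses the default paths. A non-zero exit status means ndo2db could not be started or the socket is still missing.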
You can also try recreating the DB.
If you get an NDO utils error, do this:
# mysql -u root
mysql> CREATE DATABASE `ndo` DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
mysql> exit
Pay attention to the path; look for createNDODB.sql ( find / -name createNDODB.sql )
# mysql -u root ndo < /usr/local/src/centreon-2.0-RC8/www/install/createNDODB.sql
# mysql -u root
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON `ndo`.* TO 'centreon'@'localhost';
mysql> exit

III. Centreon ACL BUG

Error Message: Fatal error: Call to undefined method DB_Error::fetchRow() in /usr/local/centreon/www/include/home/tacticalOverview/tacticalOverview.php
Version 2 Beta, from RC1 to RC8, has the ACL bug. It needs a table in the Nagios database that does not exist by default.
This table must exist before you can restrict a user to a certain group of hosts.
Solution:
# mysql -u root -p
mysql> use nagios;

CREATE TABLE IF NOT EXISTS `centreon_acl` (
`id` int(11) NOT NULL auto_increment,
`host_name` varchar(255) default NULL,
`service_description` varchar(255) default NULL,
`group_id` int(11) default NULL,
PRIMARY KEY (`id`),
KEY `host_name` (`host_name`),
KEY `service_description` (`service_description`),
KEY `group_id` (`group_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

IV. Reinstall Centreon

Uninstall:
Delete both DBs: centreon and CentStorage (via phpMyAdmin)
Delete the Centreon files:
rm -R /etc/centreon
rm -R /usr/local/centreon
Remove the scheduled tasks (crontab):
rm /etc/cron.d/centreon

Remove the status (.rrd) files:
rm /var/lib/centreon/status/*.*

Remove the plugins folder:
rm -R /var/lib/centreon/centplugins/

Leave the folders metrics and nagios-perf under /var/lib/centreon in place.
Installing (version 2.0 RC7, 5.7 MB):
cd /tmp
wget -O centreon-2.0-RC7.tar.gz "http://download.oreon-project.org/index.php?id=93"
tar -zxvf centreon-2.0-RC7.tar.gz
cd centreon-2.0-RC7
bash install.sh -i

First output should be all [OK]
Then the questions are:
Do you accept GPL license ?
[y/n], default to [n]: y
Do you want to install : Centreon Web Front
[y/n], default to [n]:
> y
Do you want to install : Centreon CentCore (ALLOWS MULTIPLE NAGIOS SERVERS TO WORK TOGETHER Oreon2.0 RC5.0 and UP)
[y/n], default to [n]:

The default answer should work for everything up to this point:
Where is your NDO ndomod binary ?
default to [/usr/sbin/ndomod.o]
>
/usr/sbin/ndomod.o is not a valid file. CRITICAL
Where is your NDO ndomod binary ?
default to [/usr/sbin/ndomod.o]
> /usr/local/nagios/bin/ndomod-3x.o
/usr/local/nagios/bin/ndomod-3x.o OK

Non-default answers here:
Do you want me to install CentStorage init script ?
[y/n], default to [n]:
> y
Do you want me to install CentStorage init script ?
[y/n], default to [n]:
> y
Do you want me to install CentStorage run level ?
[y/n], default to [n]:
> y
Finally, go to:
http:///centreon/install/setup.php
After it finishes, you will get the error:
“Connection Error to NDO DataBase !”
Don't worry, the solution is in Troubleshooting section II.

V. How to Upgrade Centreon

bash install.sh -u </usr/local/centreon>
Just follow the steps, then log in via the web interface and keep clicking Next. It worked fine for me.

VI. How to create the hoststatus table

After a clean installation some people reported this error on the first page:
Fatal error: Call to undefined method DB_Error::fetchRow() in /usr/local/centreon/www/include/home/tacticalOverview/tacticalOverview.php on line 96
Somehow the SQL script didn't run correctly, so here is the SQL code to create the table.
DOWNLOAD
# mysql -u root -p
mysql> use nagios;
Then paste the code at the mysql prompt (watch the quote characters; sometimes they need to be fixed).

VII. NDO MySQL errors when updating nagios_configfilevariables

The error says varname isn’t unique.
Edit the ndo2db mysql.sql and remove the following line under
-- Table structure for table `nagios_configfilevariables`
UNIQUE KEY `instance_id` (`instance_id`,`configfile_id`,`varname`)
Then execute the MySQL script in phpMyAdmin or run ./installdb from the db folder.

VIII. PNP plugin doesn’t compile or is not included

Download it here and compile it with ./configure and make all:
http://sourceforge.net/project/showfiles.php?group_id=191615&package_id=225647&release_id=662412

IX. Auto-install script for Nagios/Debian

Please provide feedback: I have not tried it, but the idea is quite cool. I would like to try it and update it, but I am out of time right now.
The script is here. It was made by Dex; I found it on the Centreon forum, so take a look there.
Its last update was to have:

  • centreon-2.0-RC8
  • nagios-3.0.5
  • nagios-plugins-1.4.13
  • ndoutils-1.4b7

X. After upgrading Perl to 5.10, Nagios stops working
ERROR:
~# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
/usr/local/nagios/bin/nagios: error while loading shared libraries: libperl.so.5.8: cannot open shared object file: No such file or directory

We just need to recreate the symbolic link in /usr/lib so that the old library name points to the new one:
# ln -s /usr/lib/libperl.so.5.10 /usr/lib/libperl.so.5.8
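To see which shared libraries a binary cannot find, before or after creating the link, the ldd output can be filtered. This is a small sketch; the function name missing_libs is made up for this example, and it assumes ldd is installed.

```shell
# missing_libs BINARY: print the name of every shared library that the
# dynamic linker cannot resolve for BINARY (prints nothing if all resolve).
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}
```

In the situation described above, running missing_libs /usr/local/nagios/bin/nagios should list libperl.so.5.8 until the link exists.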

XI. Nagios will not start

Sometimes, for no obvious reason, Nagios will not start. It has happened to me more than once, and sometimes I had to go back to the last modification. That is why I make a daily backup of all .cfg files.
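A daily backup of the .cfg files could be sketched as follows. This is not the author's own script: the function name backup_cfg and the default destination /var/backups/nagios are assumptions for illustration, the destination must be an absolute path, and the tar options used (--null, -T) assume GNU tar.

```shell
# backup_cfg [SRC_DIR] [DEST_DIR]: archive every .cfg file under SRC_DIR into
# DEST_DIR/nagios-cfg-YYYY-MM-DD.tar.gz and print the archive path.
# DEST_DIR must be an absolute path because the function changes directory.
backup_cfg() {
    src=${1:-/usr/local/nagios/etc}
    dest=${2:-/var/backups/nagios}   # hypothetical default location
    archive="$dest/nagios-cfg-$(date +%F).tar.gz"

    mkdir -p "$dest" || return 1
    # Paths are stored relative to SRC_DIR; -print0/--null keeps odd names safe.
    (cd "$src" && find . -name '*.cfg' -print0 | tar --null -czf "$archive" -T -) || return 1
    echo "$archive"
}
```

Saved as a script that ends with a call to backup_cfg, it could be run from a file in /etc/cron.d once a day. Running it twice on the same day overwrites that day's archive.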
1. Check that nagios.cfg is OK
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Any errors here MUST be fixed or Nagios will not start.
2. Check the file permissions
Are the .cfg permissions OK for the nagios user?
3. Run the Nagios init script in verbose mode
# bash -x /etc/init.d/nagios start
(nagios ver 3.0.3 output)
+ kconfig: 345 99 01
/etc/init.d/nagios: line 1: kconfig:: command not found
+ '[' -f /etc/rc.d/init.d/functions ']'
+ '[' -f /etc/init.d/functions ']'
+ prefix=/usr/local/nagios
+ exec_prefix=/usr/local/nagios
+ NagiosBin=/usr/local/nagios/bin/nagios
+ NagiosCfgFile=/usr/local/nagios/etc/nagios.cfg
+ NagiosStatusFile=/usr/local/nagios/var/status.dat
+ NagiosRetentionFile=/usr/local/nagios/var/retention.dat
+ NagiosCommandFile=/usr/local/nagios/var/rw/nagios.cmd
+ NagiosVarDir=/usr/local/nagios/var
+ NagiosRunFile=/usr/local/nagios/var/nagios.lock
+ NagiosLockDir=/var/lock/subsys
+ NagiosLockFile=nagios
+ NagiosCGIDir=/usr/local/nagios/sbin
+ NagiosUser=nagios
+ NagiosGroup=nagios
+ '[' '!' -f /usr/local/nagios/bin/nagios ']'
+ '[' '!' -f /usr/local/nagios/etc/nagios.cfg ']'
+ case "$1" in
+ echo -n 'Starting nagios:'
Starting nagios:+ /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
+ '[' 0 -eq 0 ']'
+ su - nagios -c 'touch /usr/local/nagios/var/nagios.log /usr/local/nagios/var/retention.dat'
+ rm -f /usr/local/nagios/var/rw/nagios.cmd
+ touch /usr/local/nagios/var/nagios.lock
+ chown nagios:nagios /usr/local/nagios/var/nagios.lock
+ /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
+ '[' -d /var/lock/subsys ']'
+ echo ' done.'
done.
+ exit 0
4. Check that the processes are running
# ps ax | grep nagios
Remember the NDO errors as well; check this page for more information.
______________________________________

APPENDIX: Understanding the DBs

    CentStorage – the Centreon 2 name for the module formerly known as Oreon Data Storage (ODS). It was developed for storing data in databases.
    The main goal of CentStorage is to let a Centreon user easily manipulate the logs and data returned by Nagios checks.
    First of all, CentStorage stores numerical data in a database (SQL and/or RRDtool) so that graphs can be generated quickly.
    Centreon – this is where the Centreon web interface stores all Nagios and Centreon configuration. It is also from here that Centreon generates the Nagios configuration files (services.cfg, hosts.cfg, etc.).
    Nagios – this DB holds information about graphs and host status; it is needed by both Nagios and Centreon. NDO also communicates directly with it.

More places to find help:

FORUM : http://forums.centreon.com
IRC: irc.freenode.net #Centreon #Nagios
BRAIN: Inside your head 🙂


40 thoughts on “Nagios/Centreon TroubleShooting”

  1. OMG, your tutorial works. I had “sink error” because official outline on centreon wiki recommends use tcp socket.
    great thanks!

  2. Where is the RRD perl module installed [RRDs.pm]
    default to [/usr/lib/perl5/RRDs.pm]
    >
    /usr/lib/perl5/RRDs.pm is not a valid file.
    It is my problem.Thanks!!!

  3. Hi,
    I’ve installed the the Centreon and Nagios using the tutorial, but I’m getting the following error:
    DB Error : SELECT count(nagios_hoststatus.current_state) , nagios_hoststatus.current_state FROM nagios_hoststatus, nagios_objects WHERE nagios_objects.object_id = nagios_hoststatus.host_object_id AND nagios_objects.is_active = 1 GROUP BY nagios_hoststatus.current_state ORDER by nagios_hoststatus.current_state [nativecode=1146 ** Table ‘nagios.nagios_hoststatus’ doesn’t exist]
    Fatal error: Call to undefined method DB_Error::fetchRow() in /usr/local/centreon/www/include/home/tacticalOverview/tacticalOverview.php on line 94
    I did all the troubleshoots, but no joy.
    Could anyone help me with this?
    Cheers,
    Vinicius.

  4. Vinicius,
    What version are you installing? I have never seen that error, but it looks like the SQL script didn't work correctly and some tables were not created correctly.
    Make sure that at least Nagios itself works; then I would try to uninstall and install again. Check out forums.centreon.com, we can discuss it there.

  5. The same problem as Vinicius.tb happened to me. Yesterday I installed Centreon using this great how-to.
    When I click the Home button the same error occurs:
    DB Error : SELECT count(nagios_hoststatus.current_state) , nagios_hoststatus.current_state FROM nagios_hoststatus, nagios_objects WHERE nagios_objects.object_id = nagios_hoststatus.host_object_id AND nagios_objects.is_active = 1 GROUP BY nagios_hoststatus.current_state ORDER by nagios_hoststatus.current_state [nativecode=1146 ** Table ‘nagios.nagios_hoststatus’ doesn’t exist]
    Fatal error: Call to undefined method DB_Error::fetchRow() in /usr/local/centreon/www/include/home/tacticalOverview/tacticalOverview.php on line 94
    Is there anything new about it?

  6. It is obviously something related to the DB; check the DB credentials in nagios.cfg and NDO. This could be an error related to the ACL bug, so check the solution for it here. I would install phpMyAdmin and take a closer look at the DB, raise the Centreon log level, and check the logs for detailed errors: Configuration > Nagios > nagios.cfg > debug, and the logs should be at /usr/local/centreon/log

  7. Hi xoroz,
    thanks for your reply. By taking your advice and checking the
    error again, the table ‘nagios.nagios_hoststatus’ wasn’t created
    by the installation. credentials are all alright. so I think,
    I have to create the missing table by myself. do you have
    the structure of this table for me?
    Thanks a lot.
    Cheerz
    Faxe

  8. Hi xoroz,
    thank you for the solution. The table was created without any errors.
    Afterwards another table was missing. To solve it I uninstalled
    Centreon and deleted all the DBs, including the ndo DB. After reinstalling
    the ndo DB and Centreon it all worked well.
    The solution I found was to run /ndoutils-1.4b7/db/installdb
    ./installdb -u -p -h localhost -d
    After installing the database this way everything was okay.

  9. correcting
    ./installdb -u root -p password -h localhost -d nagios
    replace password with userpassword
    replace nagios with databasename
    afterward create nagiosuser for database

  10. I am glad you found a solution. Yes reinstalling can sometimes be a way to fix things.

  11. Felipe
    First of all, congratulations on the excellent work.
    I'm a bit lost configuring hosts, templates, services, etc. I'd like to know how to configure several “check_commands” for the same host. That is, for a single host I'd like to use, for example, check_snmp and check_dns. When I add a new host it only allows 1 “check command”. So what are the correct steps? Create a service template? Could you write a short post explaining the steps from creating the host to adding several “check_commands”?
    Thank you very much

  12. There’s yet another table missing. “objects”
    Does anyone have the sql for creating this table and it’s structure?

    1. Mark, here is the table OBJECTS
      -- Database: `nagios`
      -- Table structure for table `nagios_objects`
      CREATE TABLE `nagios_objects` (
      `object_id` int(11) NOT NULL auto_increment,
      `instance_id` smallint(6) NOT NULL default '0',
      `objecttype_id` smallint(6) NOT NULL default '0',
      `name1` varchar(128) NOT NULL default '',
      `name2` varchar(128) default NULL,
      `is_active` smallint(6) NOT NULL default '0',
      PRIMARY KEY (`object_id`),
      KEY `objecttype_id` (`objecttype_id`,`name1`,`name2`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Current and historical objects of all kinds' AUTO_INCREMENT=2504 ;

  13. Hi Felipe,
    The Centreon troubleshooting is very good... it solved a lot of the configuration problems I had while installing Centreon.
    But I still have this problem when I log in: Connection Error to NDO DataBase ! Without that connection it doesn't pull any data.
    The ndo2 log is OK: ndomod: Successfully flushed 1302 queued items to data sink.
    It also shows up in ps -aux... running...
    Do you know of anything else that might be missing?
    Thanks for the help.
    Regards,
    jeff

  14. Jeff, did you follow it step by step? Look at the MySQL logs; any errors?
    I think the best approach is to go through all the logs; /var/log/syslog may also have important information.
    Another way I found to make sure NDO is updating correctly is with a script (plugin), check_ndo.pl
    The result looks like this:
    # perl check_ndo.pl -H 127.0.0.1 -d nagios -u nagios -p 321 -i default -t 10
    Instance “default” is running and database was updated during the last 10 seconds. OK
    Link to the plugin:
    http://exchange.nagios.org/directory/Plugins/Network-and-Systems-Management/Nagios/Check-NDO-update-status/details
    Good luck,
    Felipe

  15. Hi Felipe, thanks for the help. It was a permission problem on the database.
    I no longer have problems with NDO, but I can't manage the hosts and other resources; they don't show up in Centreon. Is there any version restriction between Nagios and Centreon? I'm using one of the latest, 3.1.2, with Centreon 2.0.2.
    Once again, thanks.
    Regards,
    jeff

  16. Hi Felipe, what would we do without your BLOG... (laughs)
    Look, the explanations above say to use the UNIX socket, right? But in my case NDO only worked when configured for TCPSOCKET. Another question: the graphs in Centreon show only a single line... is there a way to change the graph types?
    Regards,
    Breno

    1. Breno,
      Glad the site helped you. NDO works for me with a UNIX socket; the file is /usr/local/nagios/var/ndo.sock.
      For the graphs:
      Views > Graph template and Curve templates. Then associate your service to the graph template in extended info
      Check more info here

  17. Hi Felipe...
    Great documentation... Centreon is working... awesome!!!
    Thanks
    Regards,
    jeff

  18. Felipe, congratulations on the blog, it has been a great help to everyone. I'd like to know whether you could send me your mini manual, as it is of great interest to me. If possible, please send it to my e-mail. I'd also like to know whether I could get your advice on using Centreon for my particular needs. Thanks in advance.

    1. Hi Elton,
      I'm glad the site helped you. Which mini manual do you mean? I have one about Linux; take a look, it's on the site.
      As for Centreon/Nagios, it's very flexible; it's easier if you tell me what you want to monitor.
      Regards

  19. Felipe, thank you very much for the earlier help. Now I have a new problem: when I create a user and try to use ACLs in Centreon 2.0, a non-admin user doesn't work. Have you run into this?
    Regards,
    Breno
    Thank you very much

    1. Breno,
      I've had that kind of problem. Older versions even had a bug in the ACL DB; take a look at the DB
      and at the troubleshooting above, and if that doesn't help, upgrade Centreon. What's the error message?

  20. Felipe, I have the following problem now. I solved the problems with the NDO connection, but no hosts show up under Monitoring... Looking at the log, it is collecting data correctly, but nothing appears in the Centreon front end. I'd like to know whether you have any tips on what might be happening.
    Regards,
    Rafael Gumiero
    Nagios.log
    [1256095273] Finished daemonizing... (New PID=2788)
    [1256095274] INITIAL HOST STATE: Centreon-Server;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.07 ms
    [1256095274] INITIAL HOST STATE: rafa;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.08 ms
    [1256095274] INITIAL SERVICE STATE: Centreon-Server;Disk-/;OK;HARD;1;Disk OK - / TOTAL: 7.487GB USED: 1.157GB (15%)
    [1256095274] INITIAL SERVICE STATE: Centreon-Server;Load;OK;HARD;1;load average: 0.26, 0.20, 0.10.
    [1256095274] INITIAL SERVICE STATE: Centreon-Server;Memory;OK;HARD;1;total memory used : 14% ram used : 56%, swap used 0%
    [1256095274] INITIAL SERVICE STATE: Centreon-Server;Ping;OK;HARD;1;GPING OK - rtt min/avg/max/mdev = 0.043/0.459/0.845/0.329 ms
    [1256095274] INITIAL SERVICE STATE: rafa;Ping;OK;HARD;1;GPING OK - rtt min/avg/max/mdev = 0.059/0.167/0.315/0.108 ms
    [1256095289] ndomod: Successfully connected to data sink. 342 queued items to flush.
    [1256095289] ndomod: Successfully flushed 342 queued items to data sink.

  21. Rafael,
    Did you follow all the steps?
    According to nagios.log, NDO did connect to the data sink, so what error do you get? Does everything show up correctly in Nagios? Check the permissions on the file and on the DB.
    And also:
    Oreon > Configuration > Centreon > ndomod.cfg
    Change to UNIX, and correctly set the Output and Buffer
    Output should point to the ndo.sock file (default: /usr/local/nagios/var/ndo.sock)
    Buffer should point to the ndomod.tmp file (default: /usr/local/nagios/var/ndomod.tmp)

  22. Hello,
    I have two questions:
    -The first one is how do i add a host to be monitored
    -The second is, how can i set up an alert system that can send a message such as sms, email or pager just in case there is a problem with the servers that are being monitored

  23. Felipe, congratulations on the excellent blog. It's helping me a lot with my Centreon deployment.
    My problem is this. Centreon is working normally, no problems at all, except that if I set the 60 hosts and about 150 services to be checked at a one-minute interval, the Centreon graphs show nothing. With checks every five minutes everything works OK.
    Do you know what might be happening?

    1. Paulo,
      Take a look at the logs in /var/log/messages; any MySQL errors? I've seen this happen when too much information is being written to the DB at once.

  24. Hi, I followed your whole tutorial but I have a problem.
    After logging in to Centreon I get a 500 error. In the logs I have:
    [Thu Mar 03 14:52:01 2011] [error] [client 10.134.132.15] PHP Fatal error: Call to undefined method DB_Error::fetchRow() in /usr/local/centreon/www/class/centreonGMT.class.php on line 181, referer: http://10.134.132.98/centreon/index.php
    Any tips?

  25. Hi Felipe,
    I have a question. I followed the Centreon/Nagios installation step by step and everything was going very well. This morning when I logged in, the Hosts tab and the Services tab under Configuration had disappeared. I searched the internet and it says this can be caused by a bug that occurs when you set up a lot of servers. Can you help me with this? I have to put this into production in 2 weeks and the tool is already showing problems.
    Thank you
