An internet-based company may run anywhere from a couple of hundred to several thousand servers. Those servers may be geographically isolated from one another and yet provide a combined service to the end users (customers). Ideally, no single issue should affect daily operations, and that is where Nagios comes into the picture: you can use it to detect and fix issues before they escalate to the end users.
In this tutorial, I will install and configure Nagios with Chef, a server configuration management tool like Ansible, Puppet and SaltStack. Nagios will be used to monitor a Redis database and an Nginx web server. The tutorial covers installing and configuring the Nagios server, plus the Nagios plugins on two other servers: a Redis server and an Nginx web server.
Please visit the post on how to get started with Chef: http://blog.kaliloudiaby.com/index.php/opscode-chef/
Nagios is a very popular open source tool specialized in system monitoring. It is built on a server/client architecture: the Nagios server runs on a host machine, and plugins installed on the client machines report their resource usage and service state to the Nagios server at regular intervals.
It monitors host resources and services, and its design helps you be more proactive: you are informed about incidents before they reach the end users. Since I've mentioned host resources and services, let me explain what I mean by that.
In a nutshell, Nagios monitors those resources and services and raises alerts (notifications). For example, you may configure an alert with a limit of 5 attempts: once the same alert has been triggered 5 times, Nagios sends out a notification, which can be an email or an SMS.
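To make the "5 times" rule concrete, here is a minimal plain-Ruby sketch of the counting logic. It is not Nagios code: the helper name `first_notification_at` is hypothetical, and I am assuming the failures have to be consecutive, which is how the `max_check_attempts 5` setting used later in this tutorial is usually described.

```ruby
# Hypothetical helper (not part of Nagios): returns the 1-based number of the
# check at which a notification would be sent, or nil if it never is.
# Assumption: only consecutive failures count, and a recovery resets the counter.
def first_notification_at(check_results, max_check_attempts: 5)
  consecutive_failures = 0
  check_results.each_with_index do |result, index|
    if result == :ok
      consecutive_failures = 0
    else
      consecutive_failures += 1
      return index + 1 if consecutive_failures == max_check_attempts
    end
  end
  nil
end

flapping = [:critical, :critical, :ok, :critical, :critical]
outage   = [:ok] + [:critical] * 5

puts first_notification_at(flapping).inspect # nil: never 5 failures in a row
puts first_notification_at(outage).inspect   # 6: the fifth consecutive failure
```

The point of the limit is to avoid paging someone for a single failed check that recovers on its own.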
I will be installing Nagios with Chef without using any external Nagios cookbook, even though that is the simplest and quickest way, because this tutorial is aimed at anyone new to Chef. I will not use any Nagios RPMs for the server either: the installation is done by fetching the source, then building and compiling it, all driven by Chef. But as mentioned, the simplest and quickest way would be to use RPMs and an external Nagios Chef cookbook.
If you haven't installed the ChefDK, please visit the Opscode Chef post linked above. And if you don't want to follow all the steps in the tutorial, you can download the nagios-cookbook-tutorial Git repo from GitHub by running:
$ git clone https://github.com/kalilou/nagios-cookbook-tutorial
Now let's get started with all the steps:
Create the nagios-cookbook with knife
$ knife cookbook create nagios-cookbook -o .
You should see the following folder structure
$ cd nagios-cookbook
$ tree -L 1
.
├── CHANGELOG.md
├── README.md
├── attributes
├── chefignore
├── definitions
├── files
├── libraries
├── metadata.rb
├── providers
├── recipes
├── resources
├── templates
Let's now set up Test Kitchen for testing our nagios-cookbook locally. The command below generates some additional files, such as .kitchen.yml
$ cd nagios-cookbook
$ kitchen init
Now we also need to set up Berkshelf for downloading external cookbooks and resolving our cookbook dependencies. Make sure not to overwrite the .kitchen.yml file.
$ berks init
Let's modify the default generated .kitchen.yml ($ vi nagios-cookbook/.kitchen.yml) so that it contains the following.
---
driver:
  name: vagrant

provisioner:
  name: chef_zero

platforms:
  - name: centos-7.1

suites:
  # Nagios Server
  - name: Nagios
    run_list:
      - recipe[nagios-cookbook::default]
    attributes:
    driver:
      network:
        - ["private_network", {ip: "192.168.50.200"}]
      customize:
        memory: 4048
        cpuexecutioncap: 50

  # Redis Server
  - name: Redis
    run_list:
      - recipe[nagios-cookbook::redis]
    attributes:
    driver:
      network:
        - ["private_network", {ip: "192.168.50.201"}]

  # Nginx Server
  - name: nginx
    run_list:
      - recipe[nagios-cookbook::nginx]
    attributes:
    driver:
      network:
        - ["private_network", {ip: "192.168.50.202"}]
The file above describes 3 virtual machines to spin up: the Nagios, Redis and Nginx servers. We will use the Nagios server to monitor the Redis server, the Nginx server and the Nagios server itself. Run the following command to list the VMs:
$ kitchen list all
Output
Run this command to create the virtual machines
$ kitchen create all
Output
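As a quick sanity check on the suites defined in .kitchen.yml, the sketch below parses a trimmed copy of that file with Ruby's standard YAML library and prints each suite with its private IP. The trimmed YAML and the helper name `suite_ips` are mine, for illustration only; Test Kitchen does its own parsing.

```ruby
require 'yaml'

# Trimmed copy of the suites section from .kitchen.yml (illustration only).
KITCHEN_YAML = <<~KITCHEN
  suites:
    - name: Nagios
      driver:
        network:
          - ["private_network", {ip: "192.168.50.200"}]
    - name: Redis
      driver:
        network:
          - ["private_network", {ip: "192.168.50.201"}]
    - name: nginx
      driver:
        network:
          - ["private_network", {ip: "192.168.50.202"}]
KITCHEN

# Hypothetical helper: map each suite name to the IP of its first network entry.
def suite_ips(yaml_text)
  config = YAML.safe_load(yaml_text)
  config.fetch('suites').each_with_object({}) do |suite, ips|
    _kind, options = suite.dig('driver', 'network').first
    ips[suite['name']] = options['ip']
  end
end

SUITE_IPS = suite_ips(KITCHEN_YAML)
SUITE_IPS.each { |name, ip| puts "#{name}: #{ip}" }
```

If the indentation of the real file is off, a parse like this fails early, which is easier to debug than a VM that comes up without its private network.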
Now that the VMs are up and running, we can SSH into one by running $ kitchen login suite_name, and you can get the list of the suites (VMs) by running $ kitchen list all. Below, you can see I've SSHed into the 3 virtual machines: the Nagios, Redis and Nginx servers.
By the way, to get colors in the shell I've used the PS1 environment variable. For example, to get a red prompt, just run export PS1="\e[0;31m[\u@\h \W]\$ \e[m " or add it to the .bash_profile file.
Examples of PS1 for a bash terminal
$ export PS1="\e[0;31m[\u@\h \W]\$ \e[m " # Red color
$ export PS1="\e[0;32m[\u@\h \W]\$ \e[m " # Green Color
$ export PS1="\e[0;33m[\u@\h \W]\$ \e[m " # Yellow color
Now let's define the Nagios recipe, which basically means telling Chef how to install and configure Nagios. As I mentioned before, the best way would be to use an external Nagios cookbook or the RPM package, but for the sake of learning Chef I will install and configure Nagios from scratch by downloading the source, then building and compiling it.
Create a file called server.rb: $ vim nagios-cookbook/recipes/server.rb
It should contain the following:
#
# Cookbook Name:: nagios-cookbook
# Recipe:: server
#
# All rights reserved
#

# Install some RPM packages needed to build and compile the nagios source
['gcc', 'glibc', 'glibc-common', 'gd', 'gd-devel', 'make', 'net-snmp', 'openssl-devel', 'xinetd', 'unzip', 'nodejs', 'npm'].each do |pkg|
  package pkg do
    action :install
  end
end

# nagios-redis will be used to check on
# redis services (server running, memory usage)
execute 'Install nagios-redis' do
  command 'npm install -g nagios-redis'
  action :run
end

# Create nagios user
user 'nagios' do
  comment 'A nagios user'
  shell '/bin/bash'
end

# Create nagios group
group 'nagios' do
  action :create
  members ['nagios', 'apache']
end

# Download the nagios source
remote_file '/home/vagrant/nagios-4.1.1.tar.gz' do
  source 'https://assets.nagios.com/downloads/nagioscore/releases/nagios-4.1.1.tar.gz'
  action :create_if_missing
end

# Build and compile the nagios source
# The condition here is a simple one for the sake of this tutorial;
# a better way would be to check for the nagios binaries, so that the
# build and compile steps are skipped on every later Chef run.
unless Dir.exist?('/home/vagrant/nagios-4.1.1')
  execute 'Extract the nagios-4.1.1.tar.gz' do
    cwd '/home/vagrant'
    command 'tar xvf nagios-*.tar.gz'
    action :run
  end

  execute 'Before building, configure it' do
    cwd '/home/vagrant/nagios-4.1.1'
    command './configure --with-command-group=nagios'
    action :run
  end

  execute 'Compile it' do
    cwd '/home/vagrant/nagios-4.1.1'
    command 'make all'
    action :run
  end

  execute 'Install nagios' do
    cwd '/home/vagrant/nagios-4.1.1'
    user 'root'
    command 'make install && make install-commandmode'
  end

  execute 'Install init script' do
    cwd '/home/vagrant/nagios-4.1.1'
    user 'root'
    command 'make install-init'
  end

  execute 'Install configs' do
    cwd '/home/vagrant/nagios-4.1.1'
    user 'root'
    command 'make install-config && make install-webconf'
  end
end

# Install nagios plugin check_nginx for monitoring nginx
# (declared after the build so that /usr/local/nagios/libexec already exists)
cookbook_file '/usr/local/nagios/libexec/check_nginx' do
  source 'check_nginx'
  owner 'nagios'
  group 'nagios'
  mode 0664
  action :create
end
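The comment in the recipe suggests checking for the Nagios binaries instead of the source directory. Here is a minimal plain-Ruby sketch of that check. The helper name `build_needed?` is hypothetical, and the path in the demo is a temporary directory standing in for the install prefix. In a recipe, the same idea is usually expressed with a `not_if` guard or the `creates` property of the `execute` resource.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: the build is needed only when the installed binary is
# missing or not executable. Plain Ruby, nothing here is Chef API.
def build_needed?(binary_path)
  !File.executable?(binary_path)
end

Dir.mktmpdir do |prefix|
  binary = File.join(prefix, 'bin', 'nagios') # stand-in for the real install path

  puts build_needed?(binary) # true: nothing installed yet

  FileUtils.mkdir_p(File.dirname(binary))
  File.write(binary, "#!/bin/sh\n")
  File.chmod(0o755, binary)

  puts build_needed?(binary) # false: the binary exists and is executable
end
```

Checking the end result rather than an intermediate directory also covers the case where the source was extracted but the build failed halfway.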
But before we are able to run Chef, we need a web server for the Nagios web interface. Among other choices, I've chosen Apache, so create another recipe called lamp_server.rb: `$ vim nagios-cookbook/recipes/lamp_server.rb`
#
# Cookbook Name:: nagios-cookbook
# Recipe:: lamp_server
#
# All rights reserved
#

# Install epel-release repo
package 'Install epel release repo' do
  package_name 'epel-release'
  action :install
end

# Install httpd RPM package
package 'Install httpd package' do
  package_name 'httpd'
  action :install
end

# Start the httpd service
# Enable the httpd service so that it is started at boot
service 'httpd' do
  service_name 'httpd'
  action [ :enable, :start ]
end

# Install the PHP RPM Package
package 'Install php package' do
  package_name 'php'
  action :install
  notifies :restart, 'service[httpd]'
end
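The `notifies :restart, 'service[httpd]'` line deserves a word. By default Chef notifications are delayed: they are queued during the run, duplicates are collapsed, and the action fires once at the end. The toy class below is a plain-Ruby sketch of that queueing behaviour. `NotificationQueue` is a made-up name, not Chef's implementation.

```ruby
# Toy model of delayed notifications: queue now, run once at the end.
class NotificationQueue
  def initialize
    @pending = []
  end

  # Queue an action for a resource; a duplicate request is ignored.
  def notify(action, resource)
    entry = [action, resource]
    @pending << entry unless @pending.include?(entry)
  end

  # Run everything that was queued, in order, then empty the queue.
  def flush
    fired = @pending.map { |action, resource| "#{action} #{resource}" }
    @pending.clear
    fired
  end
end

queue = NotificationQueue.new
queue.notify(:restart, 'service[httpd]') # e.g. the php package was installed
queue.notify(:restart, 'service[httpd]') # e.g. a config file changed as well
puts queue.flush.inspect # httpd is restarted once, not twice
```

This is why several resources can notify the same service without causing a restart storm.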
It's time to create another recipe file for downloading, building and compiling the nagios plugins: $ vim nagios-cookbook/recipes/nagios_plugin.rb
#
# Cookbook Name:: nagios-cookbook
# Recipe:: nagios_plugin
#
# All rights reserved
#

# Download the nagios-plugins source
remote_file '/home/vagrant/nagios-plugins-2.1.1.tar.gz' do
  source 'http://nagios-plugins.org/download/nagios-plugins-2.1.1.tar.gz'
  action :create_if_missing
end

# Build and install the plugins unless the source tree is already there
unless Dir.exist?('/home/vagrant/nagios-plugins-2.1.1')
  execute 'Extract nagios-plugins-2.1.1.tar.gz' do
    cwd '/home/vagrant'
    command 'tar xvf nagios-plugins-*.tar.gz'
    action :run
  end

  execute 'Configure nagios-plugins' do
    cwd '/home/vagrant/nagios-plugins-2.1.1'
    command './configure --with-nagios-user=nagios --with-nagios-group=nagios --with-openssl'
    action :run
  end

  execute 'Compile nagios-plugins' do
    cwd '/home/vagrant/nagios-plugins-2.1.1'
    command 'make'
    action :run
  end

  execute 'Install nagios-plugins' do
    cwd '/home/vagrant/nagios-plugins-2.1.1'
    user 'root'
    command 'make install'
    action :run
  end
end

# Download the NRPE source
remote_file '/home/vagrant/nrpe-2.15.tar.gz' do
  source 'http://downloads.sourceforge.net/project/nagios/nrpe-2.x/nrpe-2.15/nrpe-2.15.tar.gz'
  action :create_if_missing
end

# Build and install NRPE unless the source tree is already there
unless Dir.exist?('/home/vagrant/nrpe-2.15')
  execute 'Extract nrpe-2.15.tar.gz' do
    cwd '/home/vagrant'
    command 'tar xvf nrpe-*.tar.gz'
    action :run
  end

  # /usr/lib64 is where 64-bit CentOS keeps the OpenSSL libraries
  execute 'Configure nrpe' do
    cwd '/home/vagrant/nrpe-2.15'
    command './configure --enable-command-args --with-nagios-user=nagios --with-nagios-group=nagios --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib64'
    action :run
  end

  execute 'Build nrpe' do
    cwd '/home/vagrant/nrpe-2.15'
    command 'make all'
    action :run
  end

  execute 'Install nrpe' do
    cwd '/home/vagrant/nrpe-2.15'
    user 'root'
    command 'make install && make install-xinetd && make install-daemon-config'
    action :run
  end

  template '/etc/xinetd.d/nrpe' do
    source 'nrpe.erb'
    owner 'root'
    group 'root'
    mode '0644'
    action :create
    variables(
      node_ip: node['ipaddress']
    )
  end

  service 'xinetd' do
    action [ :enable, :restart ]
  end
end
Now we will use some template files to overwrite some of the default Nagios configuration files, such as cgi.cfg, contacts.cfg and nagios.cfg.
Create the following files under nagios-cookbook/templates/default/
$ vim nagios-cookbook/templates/default/cgi.cfg.erb
For the entire content, with comments and descriptions, please copy from https://github.com/kalilou/nagios-cookbook-tutorial/blob/master/templates/default/cgi.cfg.erb
$ vim nagios-cookbook/templates/default/nagios.cfg.erb
For the content, please copy from https://github.com/kalilou/nagios-cookbook-tutorial/blob/master/templates/default/nagios.cfg.erb
$ vim nagios-cookbook/templates/default/contacts.cfg.erb
For the content, please copy from https://github.com/kalilou/nagios-cookbook-tutorial/blob/master/templates/default/contacts.cfg.erb
$ vim nagios-cookbook/templates/default/nrpe.erb
For the content, please copy from https://github.com/kalilou/nagios-cookbook-tutorial/blob/master/templates/default/nrpe.erb
Now that we're done with the templates, let's create some static files under nagios-cookbook/files/default
$ vim nagios-cookbook/files/default/nagios.conf
For the content, copy from https://github.com/kalilou/nagios-cookbook-tutorial/blob/master/files/default/nagios.conf
Now that we have all the configuration files under the templates and files directories, let's add more code to the default recipe: $ vim nagios-cookbook/recipes/default.rb
#
# Cookbook Name:: nagios-cookbook
# Recipe:: default
#
# All rights reserved
#

include_recipe 'nagios-cookbook::lamp_server'
include_recipe 'nagios-cookbook::server'
include_recipe 'nagios-cookbook::nagios_plugin'

# Create a servers directory which will hold
# the configuration of every monitored server
directory '/usr/local/nagios/etc/servers' do
  mode 0755
  owner 'nagios'
  group 'nagios'
  recursive true
  action :create
end

template '/usr/local/nagios/etc/objects/contacts.cfg' do
  source 'contacts.cfg.erb'
  owner 'nagios'
  group 'nagios'
  mode 0664
  action :create
  variables(
    contact_info: node['contact_info']
  )
end

template '/usr/local/nagios/etc/cgi.cfg' do
  source 'cgi.cfg.erb'
  owner 'nagios'
  group 'nagios'
  mode 0664
  action :create
  variables(
    auth_nagios: node['auth_nagios']
  )
end

# This is just for this tutorial, otherwise use a data bag
# to store this confidential info
htpasswd '/usr/local/nagios/etc/htpasswd.users' do
  user "nagiosadmin"
  password "your_password"
end

# This will copy the file from nagios-cookbook/files/default/commands.cfg
# to the remote machine (VM) /usr/local/nagios/etc/objects/commands.cfg
cookbook_file '/usr/local/nagios/etc/objects/commands.cfg' do
  source 'commands.cfg'
  owner 'nagios'
  group 'nagios'
  mode 0664
  action :create
end

cookbook_file '/etc/httpd/conf.d/nagios.conf' do
  source 'nagios.conf'
  owner 'root'
  group 'root'
  mode 0664
  action :create
  notifies :restart, 'service[httpd]'
end

service 'nagios' do
  action [ :enable, :restart ]
end
Modify the attribute default.rb file: $ vim nagios-cookbook/attributes/default.rb
default['contact_info'] = 'your_email_address'
default['auth_nagios'] = 0
default['nagios_pid_file'] = '/usr/local/nagios/var/nagios.pid'
Modify the metadata.rb file $ vim nagios-cookbook/metadata.rb
name 'nagios-cookbook'
maintainer 'Kalilou Diaby'
maintainer_email 'YOUR_EMAIL'
license 'All rights reserved'
description 'Installs/Configures nagios-cookbook'
long_description IO.read(File.join(File.dirname(__FILE__), 'README.md'))
version '0.1.0'
depends 'htpasswd', '~> 0.2.4'
Run $ berks install to download the external htpasswd cookbook
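A note on the `'~> 0.2.4'` constraint in metadata.rb: the pessimistic operator accepts 0.2.4 and any later 0.2.x release, but not 0.3.0. The sketch below demonstrates this with RubyGems' `Gem::Requirement` as a stand-in. Chef and Berkshelf have their own constraint code, which I am assuming treats this operator the same way. The helper name `satisfies?` is mine.

```ruby
require 'rubygems'

# Hypothetical helper built on RubyGems, used here only to illustrate "~>".
def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

%w[0.2.3 0.2.4 0.2.9 0.3.0].each do |version|
  puts "#{version}: #{satisfies?('~> 0.2.4', version)}"
end
# 0.2.3: false
# 0.2.4: true
# 0.2.9: true
# 0.3.0: false
```

Pinning like this lets bug-fix releases of the htpasswd cookbook through while keeping out releases that may change behaviour.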
Now that everything is set up for Nagios (recipes, files, templates, attributes), let's converge the node, which means running Chef to install and configure Nagios.
Run the converge command
$ kitchen converge Nagios
Output
Nagios should now be up and running. To check, visit http://192.168.50.200/nagios/ and you should see the Nagios web interface.
In the picture above, you can see Nagios monitoring itself (localhost). It monitors metrics like Current Load, Current Users, PING, Root Partition, SSH, Swap Usage and Total Processes. For more on how to define services, check out the file /usr/local/nagios/etc/objects/localhost.cfg
Next, we will set up the Redis server so that Nagios can monitor the Redis service.
Create the redis.rb: $ vim nagios-cookbook/recipes/redis.rb
#
# Cookbook Name:: nagios-cookbook
# Recipe:: redis
#
# All rights reserved
#

package 'epel-release'
package 'nrpe'
package 'redis'
package 'nagios-plugins-all'

service 'redis' do
  action [ :enable, :start ]
end

template '/etc/nagios/nrpe.cfg' do
  source 'redis_nrpe.cfg.erb'
  owner 'nagios'
  group 'nagios'
  mode 0664
  action :create
  variables(
    nagios_server_ip: node['nagios_server_ip']
  )
end

service 'nrpe' do
  action [ :enable, :start ]
end
Create the template redis_nrpe.cfg.erb: $ vim nagios-cookbook/templates/default/redis_nrpe.cfg.erb , and please get the content from https://github.com/kalilou/nagios-cookbook-tutorial/blob/master/templates/default/redis_nrpe.cfg.erb
Add this line to nagios-cookbook/attributes/default.rb : default['nagios_server_ip'] = '192.168.50.200'
Now converge the redis node by running $ kitchen converge Redis and you should see the following output
You can SSH to the VMs and make sure redis is properly installed and up and running
$ kitchen login Redis
$ ps aux | grep redis
Output
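Besides grepping the process list, you can confirm that something is actually listening on the Redis port. The following is a small plain-Ruby sketch. The helper name `port_open?` is hypothetical, and the demo starts a throwaway local listener instead of talking to Redis so that it runs anywhere. Inside the Redis VM you would call it with 127.0.0.1 and 6379 (the port used in the Nagios checks later on).

```ruby
require 'socket'

# Hypothetical helper: true if a TCP connection to host:port succeeds
# within the timeout, false on refusal, timeout or name resolution errors.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  false
end

# Demo against a temporary local listener (stand-in for a running service).
server = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
port = server.addr[1]
puts port_open?('127.0.0.1', port) # true while the listener is up
server.close
puts port_open?('127.0.0.1', port) # false once it is closed
```

A process can be running and still not accept connections, so a port check is a slightly stronger signal than `ps`.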
Now let's configure the Nagios server again, this time to monitor the Redis server
Create a template redis_server.cfg.erb
$ vim nagios-cookbook/templates/default/redis_server.cfg.erb
define host {
    use                     linux-server
    host_name               redis-server
    alias                   My redis server
    address                 <%= @redis_server_ip %>
    max_check_attempts      5
    check_period            24x7
    notification_interval   30
    notification_period     24x7
}

define service {
    use                     generic-service
    host_name               redis-server
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}

define service {
    use                     generic-service
    host_name               redis-server
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
}

define service {
    use                     generic-service
    host_name               redis-server
    service_description     Redis ping
    check_command           check_redis_ping!6379
}

define service {
    use                     generic-service
    host_name               redis-server
    service_description     Redis memory
    check_command           check_redis_memory!6379!1GB!2GB
}
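The `check_command` values above pack the command name and its arguments into one string separated by `!`. For `check_ping!100.0,20%!500.0,60%`, the first argument is the warning threshold (round-trip time in ms, packet loss) and the second the critical one. Below is a plain-Ruby sketch of how such a string maps to the standard plugin states (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper names are mine, and the "greater than or equal" comparison is my assumption about how the real check_ping plugin decides.

```ruby
STATES = { 0 => 'OK', 1 => 'WARNING', 2 => 'CRITICAL', 3 => 'UNKNOWN' }.freeze

# Split "name!arg1!arg2" into the command name and its arguments.
def parse_check_command(check_command)
  name, *args = check_command.split('!')
  { name: name, args: args }
end

# Parse a "100.0,20%" pair into round-trip time (ms) and packet loss (%).
def parse_ping_threshold(text)
  rta, loss = text.split(',')
  { rta_ms: Float(rta), loss_pct: Float(loss.delete('%')) }
end

# Assumption: a value at or above a threshold trips it; critical wins.
def ping_state(rta_ms, loss_pct, check_command)
  warning, critical = parse_check_command(check_command)[:args]
                      .map { |arg| parse_ping_threshold(arg) }
  return 2 if rta_ms >= critical[:rta_ms] || loss_pct >= critical[:loss_pct]
  return 1 if rta_ms >= warning[:rta_ms] || loss_pct >= warning[:loss_pct]

  0
end

command = 'check_ping!100.0,20%!500.0,60%'
[[12.5, 0], [150.0, 0], [30.0, 80]].each do |rta, loss|
  puts "rta=#{rta}ms loss=#{loss}% -> #{STATES[ping_state(rta, loss, command)]}"
end
```

The same `!` convention applies to `check_redis_memory!6379!1GB!2GB`, although what each argument means there depends on how the command is defined in commands.cfg.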
Create a recipe called nagios_redis.rb:
$ vim nagios-cookbook/recipes/nagios_redis.rb
#
# Cookbook Name:: nagios-cookbook
# Recipe:: nagios_redis
#
# All rights reserved
#

template '/usr/local/nagios/etc/servers/redis.cfg' do
  source 'redis_server.cfg.erb'
  owner 'nagios'
  group 'nagios'
  mode 0664
  action :create
  variables(
    redis_server_ip: node['redis_server_ip']
  )
  notifies :reload, 'service[nagios]'
end
Now include the recipe above in default.rb: $ vim nagios-cookbook/recipes/default.rb , adding this line at the bottom of the file: include_recipe 'nagios-cookbook::nagios_redis'. Before running $ kitchen converge Nagios , make sure to add default['redis_server_ip'] = '192.168.50.201' to nagios-cookbook/attributes/default.rb , since the template reads node['redis_server_ip']. After the node has been converged, check your Nagios web interface and you should be able to see redis-server being monitored.
The Redis server is now being monitored. The most important point to take away here is that many Nagios plugins are available, so it is up to you to spend some time identifying the critical metrics (services) you would like to monitor. In this tutorial I have not covered notifications, but you could set them up so that, for example, you receive an SMS or email about disk, memory or CPU usage and can take action before the problem reaches your end users.
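If you wonder how `variables(redis_server_ip: ...)` in the recipe ends up as `<%= @redis_server_ip %>` in the template, here is a minimal sketch using Ruby's standard ERB library. The `TemplateContext` class is a made-up stand-in for what Chef does internally: it turns each variable into an instance variable and renders the template in that context.

```ruby
require 'erb'

# Made-up stand-in for Chef's template context: each key in the hash
# becomes an instance variable that the ERB template can read.
class TemplateContext
  def initialize(variables)
    variables.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def render(template_text)
    ERB.new(template_text).result(binding)
  end
end

# Shortened version of redis_server.cfg.erb, for illustration only.
HOST_TEMPLATE = <<~'TEMPLATE'
  define host {
    host_name redis-server
    address   <%= @redis_server_ip %>
  }
TEMPLATE

RENDERED_HOST = TemplateContext.new(redis_server_ip: '192.168.50.201').render(HOST_TEMPLATE)
puts RENDERED_HOST
```

If the attribute is missing, the variable is nil and the address line renders empty, which is a common reason for a host definition that Nagios rejects.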
Create a chef recipe for installing nginx: $ vim nagios-cookbook/recipes/nginx.rb
#
# Cookbook Name:: nagios-cookbook
# Recipe:: nginx
#
# All rights reserved
#

# Install epel-release repo
package 'epel-release'

# Install nginx
package 'nginx'
package 'nrpe'
package 'nagios-plugins-all'

template '/etc/nagios/nrpe.cfg' do
  source 'nginx_nrpe.cfg.erb'
  owner 'nagios'
  group 'nagios'
  mode 0664
  action :create
  variables(
    nagios_server_ip: node['nagios_server_ip']
  )
end

cookbook_file '/etc/nginx/conf.d/status.conf' do
  source 'nginx_status.conf'
  owner 'nginx'
  group 'nginx'
  mode 0664
  action :create
end

# Enable and start nginx
service 'nginx' do
  action [ :enable, :start ]
end

service 'nrpe' do
  action [ :enable, :start ]
end
Create a template file called nginx_nrpe.cfg.erb: $ vim nagios-cookbook/templates/default/nginx_nrpe.cfg.erb and copy the content from https://github.com/kalilou/nagios-cookbook-tutorial/blob/master/templates/default/nginx_nrpe.cfg.erb. Now run Chef to converge the nginx node with $ kitchen converge nginx (the suite is named nginx, in lowercase, in .kitchen.yml) and you should see the following output.
You can SSH into the Nginx VM and make sure nginx is up and running
$ kitchen login nginx
$ ps aux | grep nginx
In your browser, please visit http://192.168.50.202/ and you should see the following:
Now let's configure the Nagios server to monitor the Nginx server. Create a template called nginx_server.cfg.erb:
$ vim nagios-cookbook/templates/default/nginx_server.cfg.erb
define host {
    use                     linux-server
    host_name               nginx-server
    alias                   My nginx server
    address                 <%= @nginx_server_ip %>
    max_check_attempts      5
    check_period            24x7
    notification_interval   30
    notification_period     24x7
}

define service {
    use                     generic-service
    host_name               nginx-server
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
}

define service {
    use                     generic-service
    host_name               nginx-server
    service_description     SSH
    check_command           check_ssh
    notifications_enabled   0
}

define service {
    use                     generic-service
    host_name               nginx-server
    service_description     Nginx ping
    # Note: check_redis_ping on port 6379 looks carried over from the Redis
    # host; replace it with the nginx check defined in your commands.cfg
    check_command           check_redis_ping!6379
}
Now create a recipe called nagios_nginx.rb:
$ vim nagios-cookbook/recipes/nagios_nginx.rb
#
# Cookbook Name:: nagios-cookbook
# Recipe:: nagios_nginx
#
# All rights reserved
#

template '/usr/local/nagios/etc/servers/nginx.cfg' do
  source 'nginx_server.cfg.erb'
  owner 'nagios'
  group 'nagios'
  mode 0664
  action :create
  variables(
    nginx_server_ip: node['nginx_server_ip']
  )
  notifies :reload, 'service[nagios]'
end
Create a file called nginx_status.conf:
$ vim nagios-cookbook/files/default/nginx_status.conf
server {
    listen 0.0.0.0:80;
    server_name localhost;

    location /nginx_status {
        stub_status on;
        access_log off;
        allow all;
    }
}

server {
    listen 192.168.50.202:80;
    server_name localhost nginx-centos-71;

    location /nginx_status {
        stub_status on;
        access_log off;
        allow all;
    }
}
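The `/nginx_status` location configured above returns a small plain-text report, and a monitoring plugin has to turn that into numbers. Here is a plain-Ruby sketch of such a parser. The helper name `parse_stub_status` and the sample figures are made up; only the layout follows the stub_status format.

```ruby
# Sample of what GET /nginx_status returns (the figures are invented).
SAMPLE_STATUS = <<~STATUS
  Active connections: 3
  server accepts handled requests
   120 120 340
  Reading: 0 Writing: 1 Waiting: 2
STATUS

# Hypothetical helper: turn the stub_status text into a hash of integers.
def parse_stub_status(text)
  active   = text[/Active connections:\s*(\d+)/, 1]
  counters = text.match(/^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$/)
  states   = text.match(/Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)/)
  raise ArgumentError, 'not stub_status output' unless active && counters && states

  keys   = %i[accepts handled requests reading writing waiting]
  values = (counters.captures + states.captures).map(&:to_i)
  { active: active.to_i }.merge(keys.zip(values).to_h)
end

stats = parse_stub_status(SAMPLE_STATUS)
puts "active=#{stats[:active]} requests=#{stats[:requests]} waiting=#{stats[:waiting]}"
```

From here a check could compare, say, the active connections against warning and critical thresholds, in the same spirit as the ping thresholds used in the host definitions.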
Now include the recipe above in default.rb: $ vim nagios-cookbook/recipes/default.rb , adding this line at the bottom of the file: include_recipe 'nagios-cookbook::nagios_nginx'. Before running $ kitchen converge Nagios , make sure to add default['nginx_server_ip'] = '192.168.50.202' to nagios-cookbook/attributes/default.rb . After the node has been converged, check your Nagios web interface and you should be able to see nginx-server being monitored.
Now you have the Nginx web server being monitored.
Now that we've finished writing the nagios-cookbook, we would like to make sure the server is in the desired state. We will test things such as whether a user exists and belongs to the right group, or whether the files and folders have been created and contain the right content.
Create these directories: $ mkdir -p nagios-cookbook/test/integration/Nagios/serverspec and create a file:
$ vim nagios-cookbook/test/integration/Nagios/serverspec/default_spec.rb
require 'serverspec'

# Required by serverspec
set :backend, :exec

describe package('nodejs') do
  it { should be_installed }
end

describe package('openssl-devel') do
  it { should be_installed }
end

describe service('httpd') do
  it { should be_enabled }
  it { should be_running }
end

describe service('nagios') do
  it { should be_enabled }
  it { should be_running }
end

describe user('nagios') do
  it { should exist }
end

describe user('nagios') do
  it { should belong_to_group 'nagios' }
end

describe file('/usr/local/nagios/etc/cgi.cfg') do
  it { should be_file }
  it { should be_mode 664 }
  it { should be_owned_by 'nagios' }
  it { should be_grouped_into 'nagios' }
  it { should contain 'use_authentication= 0' }
  it { should contain 'authorized_for_system_information=nagiosadmin' }
  it { should contain 'main_config_file=/usr/local/nagios/etc/nagios.cfg' }
end

describe file('/usr/local/nagios/etc/servers/nginx.cfg') do
  it { should be_file }
  it { should be_mode 664 }
  it { should be_owned_by 'nagios' }
  it { should be_grouped_into 'nagios' }
  it { should contain 'define host {' }
  it { should contain 'use linux-server' }
  it { should contain 'host_name nginx-server' }
  it { should contain 'alias My nginx server' }
  it { should contain 'address 192.168.50.202' }
  it { should contain 'define service {' }
  it { should contain 'use generic-service' }
  it { should contain 'service_description PING' }
  it { should contain 'check_command check_ping!100.0,20%!500.0,60%' }
  it { should contain 'host_name nginx-server' }
  it { should contain 'service_description SSH' }
  it { should contain 'check_command check_ssh' }
  it { should contain 'use generic-service' }
  it { should contain 'host_name nginx-server' }
end

describe file('/usr/local/nagios/etc/servers/redis.cfg') do
  it { should be_file }
  it { should be_mode 664 }
  it { should be_owned_by 'nagios' }
  it { should be_grouped_into 'nagios' }
  it { should contain 'define host {' }
  it { should contain 'use linux-server' }
  it { should contain 'host_name redis-server' }
  it { should contain 'alias My redis server' }
  it { should contain 'address 192.168.50.201' }
  it { should contain 'define service {' }
  it { should contain 'use generic-service' }
  it { should contain 'service_description PING' }
  it { should contain 'check_command check_ping!100.0,20%!500.0,60%' }
  it { should contain 'service_description SSH' }
  it { should contain 'check_command check_ssh' }
end
Now run the tests with $ kitchen verify Nagios and you should see the following output
All green :)
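For readers curious what assertions like `be_mode 664` and `contain` boil down to, here is a rough plain-Ruby equivalent that runs against a temporary file. It is only a sketch: the helper name `file_problems` is mine, and serverspec itself runs shell commands on the target machine rather than calling these Ruby methods.

```ruby
require 'tempfile'

# Hypothetical helper: list what is wrong with a file, or return an empty
# array when the mode and every expected line are in place.
def file_problems(path, expected_mode:, expected_lines:)
  problems = []
  actual_mode = File.stat(path).mode & 0o777 # keep only the permission bits
  unless actual_mode == expected_mode
    problems << format('mode is %o, expected %o', actual_mode, expected_mode)
  end
  content = File.read(path)
  expected_lines.each do |line|
    problems << "missing: #{line}" unless content.include?(line)
  end
  problems
end

Tempfile.create('redis.cfg') do |file|
  file.write("define host {\n  address 192.168.50.201\n}\n")
  file.flush
  File.chmod(0o664, file.path)

  puts file_problems(file.path,
                     expected_mode: 0o664,
                     expected_lines: ['define host {', 'address 192.168.50.202']).inspect
end
```

Note that `0o664` is an octal literal, just like the `mode 0664` used in the recipes: the three digits are the owner, group and other permissions.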