We have already discussed that the sample configuration files are placed in the /usr/local/nagios/etc folder. The following are the basic configuration files; if any one of them is missing, you need to create it with the exact syntax.
We will explain each file with the complete syntax in the following sections.
Nagios depends on a number of important files. These range from the config files to the plugins, logs, command files etc.
The following are the files of importance in Nagios:
Note: The file path is assumed based on the default locations of the files.
Main Configuration File
This is the configuration file that defines the various directives Nagios uses. These directives include the paths to the folders and files Nagios needs (the object config files, the command file etc.) and various other parameters which decide how Nagios operates.
Resource File
This file holds the user-defined macros and other sensitive configuration information, and the CGIs are denied access to it.
Commands Config File
CGI Config file
Other object configuration files include, but are not limited to, the following:
Nagios Command File
Nagios checks this file for external commands to process. The command CGI writes commands to this file. Other third-party programs can write to this file if the proper file permissions have been granted. The external command file is implemented as a named pipe (FIFO), which is created when Nagios starts and removed when it shuts down. If the file exists when Nagios starts, the Nagios process will terminate with an error message.
Nagios Log Files
Downtime Log File
Comment log File
Nagios Lock File
Nagios creates this file when it runs as a daemon. This file contains the process id (PID) number of the running Nagios process.
Nagios Temp File
State Retention File
This is the file that Nagios will use for storing service and host state information before it shuts down. When Nagios is restarted it will use the information stored in this file for setting the initial states of services and hosts before it starts monitoring anything. This file is deleted after Nagios reads in initial state information when it (re)starts.
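As a rough sketch, the files described above map onto entries in the main configuration file along these lines. The directive names are standard nagios.cfg options, but every path here is an assumption based on the default /usr/local/nagios prefix, and file names vary between Nagios versions, so check them against your own install.

```cfg
# Sketch of the relevant entries in /usr/local/nagios/etc/nagios.cfg.
# All paths are assumptions based on the default install prefix.

# Main log file
log_file=/usr/local/nagios/var/nagios.log

# Object configuration files (one cfg_file line per file)
cfg_file=/usr/local/nagios/etc/checkcommands.cfg
cfg_file=/usr/local/nagios/etc/hosts.cfg

# Resource file holding the $USERx$ macros (not readable by the CGIs)
resource_file=/usr/local/nagios/etc/resource.cfg

# External command file (named pipe)
command_file=/usr/local/nagios/var/rw/nagios.cmd

# Lock file containing the PID of the running daemon
lock_file=/usr/local/nagios/var/nagios.lock

# Temp file
temp_file=/usr/local/nagios/var/nagios.tmp

# State retention across restarts
retain_state_information=1
state_retention_file=/usr/local/nagios/var/retention.dat
```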
Configure nagios Files
These are the object configuration files for Nagios. They are referenced from nagios.cfg, which is the main configuration file. If you don't have the following files, just create them using the following command
Of course, other users can be set up with different privileges. Remember to create them in $NAGIOSHOME/etc/htpasswd.users.
Also, you need to make sure that the relevant users have the correct permissions for nagios. Usually, you will want the admin user to be able to do everything. So, edit these lines in $NAGIOSHOME/etc/cgi.cfg as follows:-
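As a sketch, assuming the admin account is called nagiosadmin (a placeholder; substitute the user you created in htpasswd.users), the authorization directives in cgi.cfg could be set like this:

```cfg
# Sketch for $NAGIOSHOME/etc/cgi.cfg.
# "nagiosadmin" is a placeholder user name (assumption).
use_authentication=1
authorized_for_system_information=nagiosadmin
authorized_for_configuration_information=nagiosadmin
authorized_for_system_commands=nagiosadmin
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin
```

Each directive takes a comma-separated list of user names, so further users can be appended to the lines that apply to them.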
Check through the $NAGIOSHOME/etc/nagios.cfg to see which are the best options for you with things like whether nagios allows external commands to be executed through the web interface, how often to rotate log files etc.
If you decide to make external commands accessible to Nagios, then you must ensure that the directory $NAGIOSHOME/var/rw is readable and writable by the web server user (usually 'www-data').
If you do want to allow external commands to be parsed and acted on by Nagios, you need to set the directive check_external_commands=1 in $NAGIOSHOME/etc/nagios.cfg.
Then we need a new user group and relevant permissions on $NAGIOSHOME/var/rw and $NAGIOSHOME/var/rw/nagios.cmd accordingly:-
You'll need to restart apache so that it can take advantage of being part of the nagiocmd group.
Templating Configuration Files
With all of the object configuration files, you can use templates to make the files smaller and save you time and effort when you need to make changes to them. Let's take the example of the services definitions (see later for more explanation):-
# Generic service definition template
define service{
name generic-service ; The 'name' of this service template, referenced in other service definitions
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
check_command $COMMAND $ARGUMENTS
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
# Service definition
# Service definition
Any directives that are common to most service checks can go into the template section at the top; the service definition sections then specify only the bits that differ for specific (groups of) hosts. You can also override templated settings in the specific service definition sections.
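To illustrate, here is a minimal sketch of two service definitions that build on the generic-service template, written in Nagios 2.x object syntax. The host names come from the example company used later in this guide; the check commands, intervals and contact groups are assumptions chosen for illustration.

```cfg
# Inherits every setting from generic-service unchanged
define service{
        use                     generic-service
        host_name               LON1
        service_description     HTTPS
        check_command           check_apache    ; assumes this command is defined
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    1
        check_period            24x7
        notification_interval   120
        notification_period     24x7
        notification_options    c,r
        contact_groups          webbies         ; assumed contact group
        }

# Same template, but overrides one templated setting
define service{
        use                     generic-service
        host_name               LON2
        service_description     SMTP
        check_command           check_smtp      ; assumes this command is defined
        notifications_enabled   0               ; overrides the template's value of 1
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    1
        check_period            24x7
        notification_interval   120
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          mail-admins
        }
```

Only the directives that differ from the template need to appear in each definition; a value given locally takes precedence over the inherited one.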
Configure time periods (timeperiods.cfg)
You need to think about which time periods you want to use for separating out notifications and the checking of services, e.g.
# '24x7' timeperiod definition
alias 24 Hours A Day, 7 Days A Week
# 'none' timeperiod definition
alias No Time Is A Good Time
Notice that time period definitions are allowed to overlap.
For most purposes, the existing configuration is pretty good, though you may just want to tweak the "workhours" definition (and thus "nonworkhours") from 9am-5pm to your local requirements. This edit can be made in $NAGIOSHOME/etc/timeperiods.cfg. If you plan to make no changes from the supplied timeperiods.cfg-sample file, then just copy it to timeperiods.cfg and you're done.
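For reference, complete time period definitions in object syntax look roughly like the sketch below. The "24x7" alias is the one quoted above; the "workhours" alias and the 09:00-17:00 weekday hours are the values to adjust to local requirements.

```cfg
# '24x7' timeperiod definition
define timeperiod{
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        }

# 'workhours' timeperiod definition (overlaps with 24x7, which is allowed)
define timeperiod{
        timeperiod_name workhours
        alias           Normal Working Hours
        monday          09:00-17:00
        tuesday         09:00-17:00
        wednesday       09:00-17:00
        thursday        09:00-17:00
        friday          09:00-17:00
        }
```

Days that are left out of a definition (Saturday and Sunday in "workhours") are treated as outside the time period.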
Configure contacts (contacts.cfg)
Obviously, the point of monitoring is that the relevant people know when something isn't right. So, one thing we need to do is to set up a list of people who will be notified in the event of problems. e.g.:- Let's say we have 6 servers, 2 in London (LON1 and LON2), 2 in New York (NY1 and NY2) and 2 in Hong Kong (HK1 and HK2). Each location has one machine that is a gateway and firewall (machine 1) and the other machine is mail and webcache (machine 2) and the webserver runs on LON1. There are people in the company responsible for various services and hardware and there are those who would need to know in the event of an outage, for escalation purposes.
You will need one section per person. Let's take two people; Fred Bloggs (login ID fbloggs, email address [email protected]), who is the operations manager and needs to know 24x7x365 about problems and Joanna Smith (login ID jsmith, email address [email protected]), who is a web architect and needs to know about critical problems with her web servers on weekdays, in working hours, but someone else covers at weekends and warnings aren't of interest.
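A sketch of the two contact sections, in Nagios 2.x object syntax, might look like this. The e-mail addresses are placeholders on example.com, and the notification command names are assumed to match those defined in your misccommands.cfg.

```cfg
# Fred Bloggs: operations manager, notified around the clock
define contact{
        contact_name                    fbloggs
        alias                           Fred Bloggs
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-by-email
        host_notification_commands      host-notify-by-email
        email                           fbloggs@example.com     ; placeholder
        }

# Joanna Smith: web architect, critical problems in working hours only
define contact{
        contact_name                    jsmith
        alias                           Joanna Smith
        service_notification_period     workhours
        host_notification_period        workhours
        service_notification_options    c,r     ; no warnings
        host_notification_options       d,r
        service_notification_commands   notify-by-email
        host_notification_commands      host-notify-by-email
        email                           jsmith@example.com      ; placeholder
        }
```

The option letters stand for warning, unknown, critical and recovery for services, and down, unreachable and recovery for hosts.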
In our hypothetical company, we have various functional groups responsible for technical issues:-
Mail admins - Fred
New York admins - Fred, Joanna
... etc. and we can define these groups in the $NAGIOSHOME/etc/contactgroups.cfg file:-
# 'mail-admins' contact group definition
alias Mail Admins
# 'ny-admins' contact group definition
alias New York Admins
...and so on.
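Written out in full, and assuming the contact names fbloggs and jsmith from the contacts section, the two groups listed above could be defined as follows:

```cfg
# 'mail-admins' contact group definition
define contactgroup{
        contactgroup_name       mail-admins
        alias                   Mail Admins
        members                 fbloggs
        }

# 'ny-admins' contact group definition
define contactgroup{
        contactgroup_name       ny-admins
        alias                   New York Admins
        members                 fbloggs,jsmith
        }
```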
Configure host groups (hostgroups.cfg)
Host groups are useful to separate different physical locations, functions and services. Hosts can be members of one or more groups. We could group them as follows:-
Hong Kong Group: HK1,HK2
New York Group: NY1,NY2
London Group: LON1,LON2,LON3
Mail Servers: HK2,NY2,LON2
So, in the host group view, the hosts are laid out logically by location and by function, making it easier to spot problems. We can specify the groups in $NAGIOSHOME/etc/hostgroups.cfg for this example like this:-
# 'hong-kong' host group definition
alias Hong Kong Group
# 'new-york' host group definition
alias New York Group
# 'london' host group definition
alias London Group
# 'mail' host group definition
alias Mail Servers
# 'gateway' host group definition
alias Gateway Servers
# 'firewall' host group definition
# 'cache' host group definition
# 'www' host group definition
alias Web Servers
contact_groups infrastructure, webbies*
* - host groups do not take contact_groups as a directive in Nagios 2.0.
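As a sketch in Nagios 2.x syntax (so without contact_groups), two of the groups above could be written out like this, using the membership lists given earlier:

```cfg
# 'hong-kong' host group definition
define hostgroup{
        hostgroup_name  hong-kong
        alias           Hong Kong Group
        members         HK1,HK2
        }

# 'mail' host group definition
define hostgroup{
        hostgroup_name  mail
        alias           Mail Servers
        members         HK2,NY2,LON2
        }
```

Every host named in a members line must also be defined in hosts.cfg, and a host may appear in several groups.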
Configure hosts (hosts.cfg)
This is the part where you tell Nagios which hosts you are interested in. In $NAGIOSHOME/etc/hosts.cfg you can specify the hosts by IP address, give them a label, set which check command to use for testing whether each one is alive and, finally, choose which time period to use for notifications. e.g. for our company's webserver, LON3, we reference the generic host definition given at the top of the hosts.cfg-sample file, which we retain in hosts.cfg, and then add the host-specific settings:-
# 'LON1' host definition
Now, when it comes to the status map, where you will want to make the map look like the physical layout, you can use the "parents" parameter to specify which host is the parent to the one you are defining. For example, if you want the map to show LON1, LON2 and LON3 connected to a router "Route1" on the way to NY1 and NY2, you would specify that LON1, LON2, LON3, NY1 and NY2 have the parent "Route1" like this in the hosts.cfg:-
# 'LON3' host definition
# 'LON2' host definition
alias Solaris/Mail server
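A complete host definition with a parent could look roughly like the sketch below, in Nagios 2.x syntax. The address is a placeholder from the documentation range, the contact group is an assumption, and "Route1" must itself be defined as a host for the parents reference to resolve.

```cfg
# 'LON2' host definition
define host{
        use                     generic-host    ; template kept from hosts.cfg-sample
        host_name               LON2
        alias                   Solaris/Mail server
        address                 192.0.2.12      ; placeholder address
        parents                 Route1          ; drawn beneath Route1 on the status map
        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   120
        notification_period     24x7
        notification_options    d,u,r
        contact_groups          mail-admins     ; assumed contact group
        }
```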
Also in the status map, you would probably like to have pretty icons for each of the hosts. Download and unpack imagepak-base.tar.gz (http://prdownloads.sourceforge.net/nagios/imagepak-base.tar.gz) and copy the contents to $NAGIOSHOME/share/images/logos. Now, we need to tell Nagios which icons to use for each host. In $NAGIOSHOME/etc/cgi.cfg you need to point to an external template file which will contain the definitions:-
where the *_image files are appropriately selected from those in $NAGIOSHOME/share/images/logos, though you must use a .gd2 file for the statusmap_image. The 2d_coords are where the icon should appear on the status map if you are using a statusmap layout option (set in $NAGIOSHOME/etc/cgi.cfg) that allows the location to be specified. It is a good idea to start out using the default layout 5 (Circular, Marked Up), which does not require co-ordinates to be set. You can modify the setting later (or not), when you have a better idea of where you want them placed.
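An extended host information entry along these lines is one way to attach icons to a host. The image file names below are assumptions; use whichever files you actually find in $NAGIOSHOME/share/images/logos after unpacking the image pack.

```cfg
# Extended information for LON2 (image file names are assumptions)
define hostextinfo{
        host_name               LON2
        icon_image              sunlogo.png
        icon_image_alt          Solaris
        vrml_image              sunlogo.png
        statusmap_image         sunlogo.gd2     ; must be a .gd2 file
        2d_coords               100,250         ; only used by layouts that honour coordinates
        }
```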
Configure commands (commands.cfg)
This part is quite complex, so I've made the details a separate guide. However, basically what you need to do is to look in the $NAGIOSHOME/libexec directory to see what commands are there, check out the switches and flags (usually by running the command with a --help option) and configure the ones you want in $NAGIOSHOME/etc/checkcommands.cfg.
Here is a basic example for the command to check whether a secure apache is running on a host:-
# 'check_apache' command definition
command_line $USER1$/check_https -H $HOSTADDRESS$
$USER1$ refers to a configuration in the $NAGIOSHOME/etc/resource.cfg file which usually (and in the frame of this installation guide) refers to the location of the executable checking commands/plugins. $HOSTADDRESS$ is the variable passed into the command denoting on which host that service should be checked.
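Written out in full object syntax, the definition above would look roughly like this. The plugin name check_https is taken from the example as given; if your libexec directory does not contain it, check which plugin your installation provides (the standard check_http plugin, for instance, has an SSL option).

```cfg
# 'check_apache' command definition
define command{
        command_name    check_apache
        command_line    $USER1$/check_https -H $HOSTADDRESS$
        }
```

A service definition then refers to it simply as check_command check_apache.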
Dependencies between services can be configured in $NAGIOSHOME/etc/dependencies.cfg. For the moment, this will not be covered by this set of guidelines.
Notification escalations can be configured in $NAGIOSHOME/etc/escalations.cfg. For the moment, this will not be covered by this set of guidelines.
The $NAGIOSHOME/etc/resource.cfg file is where some common variables and macros are defined. You can define up to 32 $USERx$ macros, which can in turn be used in command definitions in your host config file(s). $USERx$ macros are useful for storing sensitive information such as usernames, passwords, etc. They are also handy for specifying the path to plugins and event handlers - if you decide to move the plugins or event handlers to a different directory in the future, you can just update one or two $USERx$ macros, instead of modifying a lot of command definitions.
Most importantly, the CGIs will not attempt to read the contents of resource files, so you can set restrictive permissions (600 or 660) on them.
After installing nagios, the default resource.cfg-sample file is generally good enough to be used as resource.cfg, unless you have some fancy stuff to configure in.
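For orientation, a minimal resource.cfg might contain little more than the plugin path. The paths below assume the default /usr/local/nagios prefix, and the commented-out entries are purely illustrative.

```cfg
# $NAGIOSHOME/etc/resource.cfg (sketch; paths assume the default prefix)

# Path to the plugins, referenced as $USER1$ in command definitions
$USER1$=/usr/local/nagios/libexec

# Path to event handlers (illustrative, uncomment if used)
#$USER2$=/usr/local/nagios/libexec/eventhandlers

# A sensitive value kept out of reach of the CGIs (illustrative)
#$USER3$=someuser
```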
nrpe Addon Configuration in Nagios
nrpe is the commonly used client application or agent that runs on the hosts to be monitored to gather local data which cannot (or is less logical to) be retrieved directly from the Nagios host.
Download a copy of nrpe-<your version>.tar.gz and untar somewhere sensible. Now build it:-
Configure services (services.cfg)
This is quite a large part of the configuration. The basics are as follows.
In the file $NAGIOSHOME/etc/services.cfg, you need to specify which services are to be monitored for each host. This ranges from the basic ping to checking apache is running, SMTP is working etc. For each server, you must at least specify a ping service. The example I'll give is generic and based on the generic-service template which is supplied in the file services.cfg-sample (which must be included in services.cfg if you want to reference it).
One thing to note... if you are probing the availability of machines/services which are not owned by you, it is probably best to set the normal_check_interval to a conservative time period, say 10 minutes. The interval_length is set in $NAGIOSHOME/etc/nagios.cfg and defaults to 60 (seconds). The normal_check_interval is expressed in multiples of interval_length, so for 10 minutes, leave interval_length at the default and set normal_check_interval to 10.
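As a sketch, a conservative ping service for a machine you do not own might look like this. The host name PARTNER1 is hypothetical, the check_ping thresholds are example values, and the contact group is an assumption. With interval_length at 60, a normal_check_interval of 10 works out to 10 x 60 = 600 seconds between checks.

```cfg
# Ping check every 10 minutes (10 x interval_length of 60 seconds)
define service{
        use                     generic-service
        host_name               PARTNER1        ; hypothetical external host
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        max_check_attempts      3
        normal_check_interval   10
        retry_check_interval    2
        check_period            24x7
        notification_interval   120
        notification_period     24x7
        notification_options    c,r
        contact_groups          ny-admins       ; assumed contact group
        }
```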
Configure service groups (servicegroup.cfg, only for Nagios v2.0 or higher)
As with host groups, you can group services into logical clumps, specifying the host and service name for each service in the group:-
# 'Live Databases' service group definition
alias Live Databases
Service groups do not take contact_groups as a directive.
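A complete service group definition could look like the sketch below. The group name and the host and service names in the members line are hypothetical; the members directive takes alternating host and service description values.

```cfg
# 'Live Databases' service group definition
define servicegroup{
        servicegroup_name       live-databases
        alias                   Live Databases
        members                 LON1,MySQL,NY1,MySQL    ; host,service pairs (hypothetical)
        }
```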
Configure mail alerts (misccommands.cfg)
This is specific to Solaris. The default notification setup uses mail, which does not take -s under Solaris, so the subject lines of the alert emails will be blank. You need to use mailx instead. So, edit $NAGIOSHOME/etc/misccommands.cfg and find the lines:-
and change mail to mailx. Also in this section, you can configure what will appear on the subject line. Basically, just modify the section in quotes after mailx -s, using relevant variables for what you want to see.
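After the change, the service notification command might look something like the sketch below. The paths to printf and mailx are assumptions for a typical Solaris system, and the macro names follow Nagios 2.x conventions, so compare them with what your sample file already contains.

```cfg
# 'notify-by-email' command definition using mailx
# (binary paths are assumptions; macro names follow Nagios 2.x)
define command{
        command_name    notify-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/bin/mailx -s "** $NOTIFICATIONTYPE$ alert - $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
        }
```

The quoted string after -s is the subject line, so that is the part to edit if you want different information to appear there.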
Troubleshooting Nagios Configuration
If you have problems with the status map, histograms etc., then you do need to make sure that your libraries are linked as follows:-
Remember, your system may be using libraries in other places in addition to these locations. Take care to include those if you need to.
Also, for problems with the status map and histograms, check back to when you installed the GD, jpeg and png libraries. Did you install them in the correct order, and did gd report jpeg and png support, something like this:-
** Configuration summary for gd 2.0.33:
Support for PNG library: yes
Support for JPEG library: yes
Support for Freetype 2.x library: no
Support for Fontconfig library: no
Support for Xpm library: yes
Support for pthreads: yes
If not, you may need to re-visit your gd installation.
Start her up and see what happens
Then point your browser at: http://yourserver/nagios/ and attempt to log in.