DeutschEnglish

Submenu

 - - - By CrazyStat - - -

7. February 2013

Icinga / Nagios: Notify a group of contacts about a group of hosts

In Nagios/Icinga, you can easily define which contacts or contact groups get notified for a certain service in the service definition:

 define service{
        host_name               linux-server
        service_description     check-disk-sda1
        check_command           check-disk!/dev/sda1
        max_check_attempts      5
        check_interval          5
        retry_interval          3
        check_period            24x7
        notification_interval   30
        notification_period     24x7
        notification_options    w,c,r
        contact_groups          linux-admins
        }

(Source of this example: Icinga documentation)

So only contacts of the contact group “linux-admins” would be informed about problems regarding this service.

You could also use the “contacts” directive to list individual contacts or list multiple contact groups.

But often, the responsibility of admins is not defined through services, but through hosts. Usually, there is a group of admins for linux servers and one for windows servers. Or a group for intranet servers and one for internet servers. As admins usually are annoyed if they get notifications about servers they are not responsible for, it is usually a good idea to only notify those admins that are responsible.

So you can also do this at the host-definition:

 define host{
        host_name                       bogus-router
        alias                           Bogus Router #1
        address                         192.168.1.254
        parents                         server-backbone
        check_command                   check-host-alive
        check_interval                  5
        retry_interval                  1
        max_check_attempts              5
        check_period                    24x7
        process_perf_data               0
        retain_nonstatus_information    0
        contact_groups                  router-admins
        notification_interval           30
        notification_period             24x7
        notification_options            d,u,r
        }

(Source of example: icinga documentation)

So only the contact_group “router_admins” would be notified for this host.

But one thing where the “contacts” and “contact_groups” directive is missing, is the hostgroups definition. It is not possible to directly assign a contact groupĀ  or list of contacts to a hostgroup or the other way round. So here is how it can be done with another type of definition.

Group your hosts

First, define a group of hosts for each group of admins. So for example, group all intranet servers in one and all internet servers in another group. You probably already did this.

define hostgroup{
        hostgroup_name          intranet-servers
        alias                   Intranet Servers
        members                 intra1, intra2, intra3
}
define hostgroup{
        hostgroup_name          internet-servers
        alias                   Internet Servers
        members                 inter1, inter2, inter3
}

See the icinga documentation for details. Note to use the shortnames in “members”.

You can also define things the other way round: When defining a host, say which hostgroup it belongs to:

define host{
        use                     generic-host
        host_name               intra1
        alias                   intra1.local
        address                 192.168.10.1
        hostgroups              intranet-servers
        }

See documentation for details.

Group your contacts

Next, group your contacts. So create a contact-group for each group of admins so we can later assign this contact group to the corresponding group of hosts.

Example:

define contactgroup{
        contactgroup_name       intranet-admins
        alias                   Intranet Administrators
        members                 alice, bob
        }
define contactgroup{
        contactgroup_name       internet-admins
        alias                   Internet Administrators
        members                 charley
        }

See documentation. Again, you can also define it the other way round (list the contact groups at the contact-definition).

Assign contact groups to host groups

Now comes the interesting part. To do this, we use a “Hostescalation definition“.

Example:

 define hostescalation{
        hostgroup_name          intranet-servers
        first_notification      1
        last_notification       0
        notification_interval   60
        contact_groups          intranet-admins
        }

 define hostescalation{
        hostgroup_name          internet-servers
        first_notification      1
        last_notification       0
        notification_interval   60
        contact_groups          internet-admins
        }

This will make sure internet-admins get informed about internet-servers and intranet-admins about intranet-servers. “last-notification 0” means that all notifications will get sent to this group of contacts. You can adjust the notification_interval (in minutes) if you want.

The cool thing here is that you can also define that if the problem still occurs after 5 notifications, the other team of admins gets notified:

define hostescalation{
        hostgroup_name          intranet-servers
        first_notification      1
        last_notification       3
        notification_interval   30
        contact_groups          intranet-admins
        }
define hostescalation{
        hostgroup_name          intranet-servers
        first_notification      4
        last_notification       0
        notification_interval   60
        contact_groups          internet-admins, intranet-admins
        }

This would notify “intranet-admins” 3 times (every 30 minutes) about problems with “intranet-servers”. If the problem is still not solved, “internet-admins” will get notified as well. So the internet-admins won’t get bothered with short problems that the intranet-admins can fix, but will still get informed if the problem is not solved for some time.

More information on hostescalation and serviceescalation in the documentation here, here and here.

I hope this helped somebody.

Recommendation

Try my Open Source PHP visitor analytics script CrazyStat.