DeutschEnglish

Submenu

 - - - By CrazyStat - - -

9. December 2014

Icinga: Group all services in a servicegroup instead of using a wildcard

Filed under: Linux,Server Administration — Tags: , , , , , , — Christopher Kramer @ 16:15

At some places you can use the * wildcard as a service description (which requires use_regexp_matching=0), but sometimes it does not seem to work:

Error: Could not expand services specified

Therefore, I simply wanted to group all services in one servicegroup. That’s quite easy if you use a generic service template as a basis for all services. First, create a servicegroup “allservices”:

define servicegroup {
        servicegroup_name               allservices
        alias                           All Services
}

Then edit your generic service template (see servicegroups line):

# generic service template definition
define service{
        name                            generic-service ; The 'name' of this service template
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
        parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance probl$
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service 'freshness'
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        failure_prediction_enabled      1       ; Failure prediction is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        notification_interval           0               ; Only send notifications on status change by default.
        is_volatile                     0
        check_period                    24x7
        normal_check_interval           5
        retry_check_interval            1
        max_check_attempts              4
        notification_period             24x7
        notification_options            w,u,c,r
        contact_groups                  admins
        servicegroups                   allservices ; ADD THIS TO ADD ALL SERVICES INTO THE allservices GROUP
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

This assumes all your services use this template like this (see use line):

define service {
        hostgroup_name                  ssh-servers
        service_description             SSH
        check_command                   check_ssh
        use                             generic-service  ; USE THE TEMPLATE ABOVE
        notification_interval           0 ; set > 0 if you want to be renotified
}

 

Now you can easily use this servicegroup for example in serviceescalations (see servicegroup_name line):

 define serviceescalation{
        hostgroup_name          intranet-servers
        servicegroup_name       allservices
        first_notification      1
        last_notification       0
        notification_interval   1440
        contact_groups          intranet-admins
        }

Hope this helps somebody. I guess it works the same way in Nagios.

Recommendation

Try my Open Source PHP visitor analytics script CrazyStat.

7. February 2013

Icinga / Nagios: Notify a group of contacts about a group of hosts

In Nagios/Icinga, you can easily define which contacts or contact groups get notified for a certain service in the service definition:

 define service{
        host_name               linux-server
        service_description     check-disk-sda1
        check_command           check-disk!/dev/sda1
        max_check_attempts      5
        check_interval          5
        retry_interval          3
        check_period            24x7
        notification_interval   30
        notification_period     24x7
        notification_options    w,c,r
        contact_groups          linux-admins
        }

(Source of this example: Icinga documentation)

So only contacts of the contact group “linux-admins” would be informed about problems regarding this service.

You could also use the “contacts” directive to list individual contacts or list multiple contact groups.

But often, the responsibility of admins is not defined through services, but through hosts. Usually, there is a group of admins for linux servers and one for windows servers. Or a group for intranet servers and one for internet servers. As admins usually are annoyed if they get notifications about servers they are not responsible for, it is usually a good idea to only notify those admins that are responsible.

So you can also do this at the host-definition:

 define host{
        host_name                       bogus-router
        alias                           Bogus Router #1
        address                         192.168.1.254
        parents                         server-backbone
        check_command                   check-host-alive
        check_interval                  5
        retry_interval                  1
        max_check_attempts              5
        check_period                    24x7
        process_perf_data               0
        retain_nonstatus_information    0
        contact_groups                  router-admins
        notification_interval           30
        notification_period             24x7
        notification_options            d,u,r
        }

(Source of example: icinga documentation)

So only the contact_group “router_admins” would be notified for this host.

But one thing where the “contacts” and “contact_groups” directive is missing, is the hostgroups definition. It is not possible to directly assign a contact groupĀ  or list of contacts to a hostgroup or the other way round. So here is how it can be done with another type of definition.

Group your hosts

First, define a group of hosts for each group of admins. So for example, group all intranet servers in one and all internet servers in another group. You probably already did this.

define hostgroup{
        hostgroup_name          intranet-servers
        alias                   Intranet Servers
        members                 intra1, intra2, intra3
}
define hostgroup{
        hostgroup_name          internet-servers
        alias                   Internet Servers
        members                 inter1, inter2, inter3
}

See the icinga documentation for details. Note to use the shortnames in “members”.

You can also define things the other way round: When defining a host, say which hostgroup it belongs to:

define host{
        use                     generic-host
        host_name               intra1
        alias                   intra1.local
        address                 192.168.10.1
        hostgroups              intranet-servers
        }

See documentation for details.

Group your contacts

Next, group your contacts. So create a contact-group for each group of admins so we can later assign this contact group to the corresponding group of hosts.

Example:

define contactgroup{
        contactgroup_name       intranet-admins
        alias                   Intranet Administrators
        members                 alice, bob
        }
define contactgroup{
        contactgroup_name       internet-admins
        alias                   Internet Administrators
        members                 charley
        }

See documentation. Again, you can also define it the other way round (list the contact groups at the contact-definition).

Assign contact groups to host groups

Now comes the interesting part. To do this, we use a “Hostescalation definition“.

Example:

 define hostescalation{
        hostgroup_name          intranet-servers
        first_notification      1
        last_notification       0
        notification_interval   60
        contact_groups          intranet-admins
        }

 define hostescalation{
        hostgroup_name          internet-servers
        first_notification      1
        last_notification       0
        notification_interval   60
        contact_groups          internet-admins
        }

This will make sure internet-admins get informed about internet-servers and intranet-admins about intranet-servers. “last-notification 0” means that all notifications will get sent to this group of contacts. You can adjust the notification_interval (in minutes) if you want.

The cool thing here is that you can also define that if the problem still occurs after 5 notifications, the other team of admins gets notified:

define hostescalation{
        hostgroup_name          intranet-servers
        first_notification      1
        last_notification       3
        notification_interval   30
        contact_groups          intranet-admins
        }
define hostescalation{
        hostgroup_name          intranet-servers
        first_notification      4
        last_notification       0
        notification_interval   60
        contact_groups          internet-admins, intranet-admins
        }

This would notify “intranet-admins” 3 times (every 30 minutes) about problems with “intranet-servers”. If the problem is still not solved, “internet-admins” will get notified as well. So the internet-admins won’t get bothered with short problems that the intranet-admins can fix, but will still get informed if the problem is not solved for some time.

More information on hostescalation and serviceescalation in the documentation here, here and here.

I hope this helped somebody.