DeutschEnglish

Submenu

 - - - By CrazyStat - - -

26. February 2017

Postgrey: let nagios/icinga tell you what domains might need whitelisting

Filed under: Linux,Server Administration — Tags: , , , , , — Christopher Kramer @ 16:58

The problem: greylisting significantly delays mail from big mail providers with many servers

I still think greylisting with postgrey is very effective against spam. But now and then, mail from some senders get delayed for hours. This especially is the case for big mail service providers like Amazon SES, Microsoft Exchange Online Protection or Mailchimp. These providers have lots of servers and IP-addresses and the first attempt to deliver an email usually comes from a different server than the second. Thus, postgrey will not find a matching triplet and block the second attempt again. This will continue until one of the mail servers tried twice and postgrey found a matching tripled. This might take hours and thus cause significant delay of mails.

Therefore, if you want to use postgrey without delaying lots of mail for several hours, you need to whitelist these big email services with many servers. As new services come up and others go, you need some way to find out which servers try to send mail to your server and frequently get denied because no triplet is found.

The solution: Check your mail logs

To get an idea of what is happening and which servers might need whitelisting, I wrote a small command that analyzes your mail.log-files. Here it is:

grep -ohP 'reason=new, client_name=[^,.]+\.[^,.]+\.\K[^,.]+\.[^,]+' /var/log/mail.log* | sort | uniq -c | sort -nr | less

So what does this do?

It scans your /var/log/mail.log* files  for occurrences containing reason=new, which is what postgrey logs when no triplet was found. It then takes the domain of the server that is trying to send the mail to you, stripping the first subdomain, as this might vary from server to server that the mail service provider uses. It then sorts, counts and orders the result.

The result of this command might look like this (stripped after some lines):

    579 protection.outlook.com
    503 eu-west-1.amazonses.com
    113 facebook.com
     54 mcsv.net
     52 rsgsv.net
     48 mcdlv.net
     28 amazonses.com
     17 asianet.co.th

You should check especially all the lines that have a lot of occurrences. They might be domains from big email service providers that need whitelisting for the reasons explained above. For example, the mail from protection.outlook.com and amazonses.com is mail sent by customers of these mail providers that usually is not spam. Especially, the mail servers of these providers do retry delivery lots of times, so greylisting would not block spam anyway, it would just cause significant delays. So these domains should be whitelisted.

But the domains in this list might also only be domains that are used by an ISP for dynamic IP addresses that are actually sending spam. For example, asianet.co.th seems to be a big ISP from Thailand and multiple dynamic IP-addresses from this ISP tried to send spam, resulting in 17 occurrences where no triplet was found. This is mail that you want to be blocked/delayed.

Once you decided which domains you want to whitelist, you can do so by adding them in your whitelist-file (on Debian this is usually /etc/postgrey/whitelist_clients).

The entries should look like this:

/.*\.protection\.outlook\.com$/
/.*\.amazonses\.com$/
/.*\.facebook\.com$/
/.*\.booking\.com$/
/.*\.mcsv\.net$/
/.*\.rsgsv\.net$/
/.*\.mcdlv\.net$/
/.*\.mandrillapp\.com$/

Note: You should escape all dots with a backslash, as done in the example. Otherwise, a dot matches any character, so protection.outlook.com would also whitelist protectionAoutlook.com, which might be a domain that spammers register on purpose to get through servers that use inaccurate whitelists.

Automation with nagios/icinga

You could run the above command from time to time to check if new services need whitelisting. But it would be more easy, if you get notified when a new domain pops up in this list that you have not checked yet, right?

Therefore, I wrote a small nagios/icinga plugin that checks if any new domains appear on this list. It is a bit quick and dirty but does its job.

The main part is a bash script, that you could place in /usr/lib/nagios/plugins/check_postgrey_whitelist :

#!/bin/bash
log='/var/log/mail.log.1'
ignoreFile='/etc/nagios-plugins/postgrey_no_whitelist'

hosts=`grep -ohP 'reason=(new|early-retry[^,]*), client_name=[^,.]+\.[^,.]+\.\K[^,.]+(\.co|\.com)\.?[^,]+' $log | sort | uniq -c | awk '$1>=15{print $2}' | sort`;
noWhitelist=`cat $ignoreFile | sort`;

diff=`comm -23 <(echo "$hosts") <(echo "$noWhitelist") | sed ':a;N;$!ba;s/\n/ /g'`;

OK=0;
WARNING=1;
CRITICAL=2;
UNKNOWN=3;

if [ -z "$diff" ]; then
        echo "Okay, the top domains are whitelisted or ignored";
        exit $OK;
else
        echo "Whitelist or ignore these senders: $diff";
        exit $WARNING;
fi

This script ignores all domains listed in /etc/nagios-plugins/postgrey_no_whitelist  (one per line), assuming you checked that you do not want to whitelist these. So create this file as well and put the domains in that you checked and do not want to whitelist:

asianet.co.th
example.com

Now we need to define the command for this plugin:

Create the configuration where your other plugins are defined, e.g. /etc/nagios-plugins/config/postgrey_whitelist.cfg

define command {
        command_name    check_postgrey_whitelist
        command_line    /usr/lib/nagios/plugins/check_postgrey_whitelist
}

Now you need to create a service for your localhost that runs this command.

In your localhost service definition, e.g. /etc/icinga/objects/localhost_icinga.cfg , add this:

define service{
        use                             generic-service
        host_name                       localhost
        service_description             Postgrey Whitelist
        check_command                   check_postgrey_whitelist
}

Finally, make sure that nagios/icinga has read-access to your mail-logfile. You can adjust which file to check in the script above, I chose mail.log.1. So you might chown this file to the nagios user like this:

chown nagios:adm mail.log.1

Of course logrotation needs to know that new rotated files should have this owner. On a debian system, you can configure logrotate to do so in /etc/logrotate.d/rsyslog. Adjust it similar to this one:

/var/log/mail.info
/var/log/mail.warn
/var/log/mail.err
/var/log/daemon.log
/var/log/kern.log
/var/log/auth.log
/var/log/user.log
/var/log/lpr.log
/var/log/cron.log
/var/log/debug
/var/log/messages
{
        rotate 4
        weekly
        missingok
        notifempty
        compress
        delaycompress
        sharedscripts
        postrotate
                invoke-rc.d rsyslog rotate > /dev/null
        endscript
}

/var/log/mail.log
{
        rotate 4
        weekly
        missingok
        notifempty
        compress
        delaycompress
        sharedscripts
        create 440 nagios adm
        postrotate
                invoke-rc.d rsyslog rotate > /dev/null
        endscript
}

So I removed the mail.log from the first logrotate definition and created a new one below which is exactly the same but contains an extra line create 440 nagios adm to tell logrotate which user and permission to assign the logfiles to.

You might need to adjust some stuff depending on your distro, but I hope this helps somebody to set this up.

Update 06.03.2017: Adjusted script so it does not consider second-level domains like co.uk or com.br as single senders anymore.

Recommendation

Try my Open Source PHP visitor analytics script CrazyStat.