Unixwiz.net - Steve Friedl's Weblog: SpamAssassin indirect whitelisting and message munging

March 28, 2003

SpamAssassin indirect whitelisting and message munging

I recently installed Postfix + SpamAssassin at a customer site as a front end to MS Exchange, and they are reporting that alerts from CBS Marketwatch arrive garbled. This apparently didn't happen with the previous sendmail front end, so I've been tasked with figuring it out. I believe that neither SpamAssassin nor Postfix is messing anything up, but it's necessary to prove it. This is how we're doing so.

The first step was to modify the Postfix filter that runs the mail through SpamAssassin so that it captures each email before and after processing. We modified the /etc/postfix/spamassassin-filter.sh script:

#!/bin/sh

# Localize these.
INSPECT_DIR=/var/spool/filter
SENDMAIL="/usr/sbin/sendmail -i"

# Exit codes from <sysexits.h>
EX_TEMPFAIL=75
EX_UNAVAILABLE=69

# Clean up when done or when aborting.
# (Disabled while debugging so the in.$$ and out.$$ captures are kept.)
# trap "rm -f in.$$ out.$$" 0 1 2 3 15

# Start processing.
cd $INSPECT_DIR || { echo $INSPECT_DIR does not exist; exit $EX_TEMPFAIL; }

echo "$$: running $*" >> filter.log

tee in.$$ |
/usr/local/bin/spamc > out.$$ || {
        echo "Message content rejected"; exit $EX_UNAVAILABLE; }

$SENDMAIL "$@" <out.$$

exit $?

For our purposes, the script disables the cleanup trap, logs each invocation to filter.log, and uses tee to save a copy of the input. The idea is that the /var/spool/filter directory holds both an in.$$ and an out.$$ file for each email, and filter.log keeps a record of every invocation. Once a message has been identified as garbled, it's a simple matter to diff its input and output files: when the diff shows that only the headers have changed and nothing in the body has, we can be sure that SpamAssassin is not messing anything up.
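The header-versus-body check can also be done mechanically. What follows is only a rough sketch, not part of the filter itself: the function names body and same_body are made up for illustration, and it assumes the captured files use plain LF line endings with the usual blank line separating headers from body. It compares checksums rather than the bodies byte for byte, which is good enough for a quick sanity check.

```shell
#!/bin/sh

# Print everything after the first blank line, i.e. the message body.
# Assumes the file starts with a non-empty header block.
body() {
    sed '1,/^$/d' "$1"
}

# Succeed (exit status 0) when both files have the same body checksum.
same_body() {
    [ "$(body "$1" | cksum)" = "$(body "$2" | cksum)" ]
}
```

With these defined, something like `same_body in.1234 out.1234 && echo "body unchanged"` reports whether the message body made it through untouched (1234 standing in for whatever process ID appears in filter.log).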

But what if we want to take SpamAssassin out of the loop entirely, so that it doesn't even add its few headers? This can be done on an ugly, ad-hoc basis for testing (though we're working on a much cleaner whitelisting system for production use).

The filter is typically called from Postfix with flags indicating sender and recipient, and we can use these to bypass SpamAssassin processing. We don't want to do real parsing of the command line, so our approach is simply to look for the trigger address anywhere in it. When we find a match, we exec sendmail directly; otherwise we run the mail through SpamAssassin.

...
echo "$$: running $*" >> filter.log

case "$*" in
  *marketwatchmail.com*) exec $SENDMAIL "$@" ;;

  *)
        /usr/local/bin/spamc > out.$$  ||
                { echo "Message content rejected"; exit $EX_UNAVAILABLE; }
        ;;
esac

$SENDMAIL "$@" <out.$$

exit $?
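To see what "anywhere in the command line" means in practice, the match can be pulled out into a small function and tried by hand. This is just an illustrative sketch: should_bypass is a hypothetical name, the addresses in the usage note are invented, and the `-f sender -- recipient` argument layout is an assumption about how Postfix is configured to call the filter.

```shell
#!/bin/sh

# Hypothetical helper: succeed when the trigger string appears anywhere
# in the arguments. "$*" joins all arguments into one string (separated
# by spaces with the default IFS), so no real parsing is done.
should_bypass() {
    case "$*" in
        *marketwatchmail.com*) return 0 ;;
        *)                     return 1 ;;
    esac
}
```

For example, `should_bypass -f alerts@marketwatchmail.com -- user@example.com` succeeds, but so does a call where the trigger shows up only in a recipient address. That looseness is fine for a quick test and is one reason this isn't meant for production.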

At this point my customer is convinced that SpamAssassin is not munging his messages, so we're going to set up a parallel sendmail system to find out if it's Postfix. We both suspect that it's Exchange, but we don't know why.

Posted by Steve at March 28, 2003 11:17 AM
