Below is a simple Spamassassin setup for use with Postfix; I also cover how to train Spamassassin on a Maildir mail box (I use Dovecot).
Some of the below is cribbed from http://www.debuntu.org/postfix-and-pamassassin-how-to-filter-spam. I've simplified it and left more of the configuration at the defaults. You don't have to do this on a locally-hosted Postfix server: the instructions are the same for any email server which can receive email (it just so happens mine isn't open to the outside world and is just passed email by fetchmail). I'm working on Ubuntu Breezy.
Disclaimer: Use these instructions at your own risk. I am not an expert on spam filtering: this set of instructions just gave me what I needed to do my own local spam filtering. If you set it up and it loses you the email clinching a one million dollar contract, I assume no responsibility.
By the way, you need to do all of the below as root.
apt-get install spamassassin spamc
We want it to run as non-root, so add a spamd user and group:
# groupadd spamd
# useradd -g spamd -s /bin/false -d /var/log/spamassassin spamd
# mkdir /var/log/spamassassin
# chown spamd:spamd /var/log/spamassassin
(This sets the spamd user's home directory as /var/log/spamassassin.)
Edit /etc/default/spamassassin so these options are set:
ENABLED=1
SAHOME="/var/log/spamassassin/"
OPTIONS="--create-prefs --max-children 2 --username spamd \
-H ${SAHOME} -s ${SAHOME}spamd.log"
--max-children sets the maximum number of child processes spamd will spawn (you might need more on a busy server), --username specifies the user spamd runs as, -H sets the home directory, and -s sets the log file.
(I left the rest of the file alone, and didn't touch /etc/spamassassin/local.cf.)
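A quick sanity check on the file before starting the daemon can save a restart or two. The sketch below is my own illustration, not something shipped with Spamassassin: check_sa_defaults is a hypothetical helper name, and it assumes the file consists of plain shell variable assignments (which is how the init script reads it).

```shell
# Hypothetical helper: report whether a defaults file enables spamd
# and asks it to run as an unprivileged user.
check_sa_defaults() {
    defaults_file=$1
    # Source the file in a subshell so its variables do not leak into ours.
    (
        . "$defaults_file" || exit 2
        if [ "$ENABLED" != "1" ]; then
            echo "spamd is not enabled"
            exit 1
        fi
        # Pad with spaces so the option matches as a whole word.
        case " $OPTIONS " in
            *" --username "*) echo "ok" ;;
            *) echo "spamd has no --username option"; exit 1 ;;
        esac
    )
}
```

Pass it the full path, for example check_sa_defaults /etc/default/spamassassin; it prints "ok" when both settings are in place and a short complaint otherwise.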
/etc/init.d/spamassassin start
(By default, spamassassin gets added to the startup scripts by Ubuntu, so it will start/stop with your system.)
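This makes Postfix pipe email to Spamassassin once it's been received. Edit /etc/postfix/master.cf and add these lines at the top of the file. If the file already has an smtp inet line, add the -o option beneath that line instead of creating a second one. The -o line must start with whitespace so that Postfix reads it as a continuation of the line above: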
smtp inet n - - - - smtpd
    -o content_filter=spamassassin
Add this to the end of the same file:
spamassassin unix - n n - - pipe
    user=spamd argv=/usr/bin/spamc -f -e
    /usr/sbin/sendmail -oi -f ${sender} ${recipient}
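I bet you're asking yourself "What the hell?!". I know I was. As far as I can tell, this sets up an after-queue content filter: once Postfix has received a message, it hands it to a pipe service called spamassassin, which runs the following command as the spamd user:
/usr/bin/spamc -f -e /usr/sbin/sendmail -oi -f ${sender} ${recipient}
spamc passes the message to the spamd daemon for checking, and the -e option pipes the result into sendmail, which puts it back into Postfix for delivery.
Reload Postfix for the configuration to take effect: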
/etc/init.d/postfix reload
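If the reload complains or mail stops flowing, the usual culprit is master.cf. Here is a rough check I'd sketch for it. check_master_cf is a made-up helper name, and it only looks for the two pieces described above: the content_filter option (with whitespace in front of the -o) and a pipe service named spamassassin.

```shell
# Hypothetical helper: confirm that a master.cf file wires in the filter.
check_master_cf() {
    master_cf=$1
    # Ignore commented-out lines so they cannot satisfy the checks.
    active=$(grep -v '^[[:space:]]*#' "$master_cf")
    # The -o option needs whitespace before it: either on the smtpd line
    # itself or on an indented continuation line.
    if ! printf '%s\n' "$active" |
        grep -Eq '[[:space:]]-o[[:space:]]+content_filter=spamassassin'; then
        echo "missing: -o content_filter=spamassassin"
        return 1
    fi
    # A pipe service with the same name must exist to receive the mail.
    if ! printf '%s\n' "$active" |
        grep -Eq '^spamassassin[[:space:]]+unix[[:space:]].*[[:space:]]pipe[[:space:]]*$'; then
        echo "missing: spamassassin pipe service"
        return 1
    fi
    echo "ok"
}
```

Run it as check_master_cf /etc/postfix/master.cf. It is deliberately simple and does not replace reading the Postfix log after a reload.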
This was the bit where I had to do the most research. I'm using Maildir format for my email, under dovecot; email for localhost accounts goes in /home/user/Maildir.
To do my training, I created a new Junk folder for my localhost Postfix account: I did this by adding the folder via Thunderbird. Next I got a load of existing spam from my Trash folder, and moved it to the Junk folder.
Then, to train the filter on the Junk folder, I used sa-learn like this (you need to be root, which is why the sudo is there):
sudo sa-learn --spam -u spamd --dir /home/ell/Maildir/.Junk/* -D
You can also teach it what the good stuff looks like, e.g. by running it over a clean inbox (no spam):
sudo sa-learn --ham -u spamd --dir /home/ell/Maildir/.INBOX/* -D
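It helps to know how many messages you are feeding it. As far as I know, the Bayes filter is not used for scoring until it has learned a minimum number of both spam and ham messages (200 of each by default, via the bayes_min_spam_num and bayes_min_ham_num settings), so a handful of examples will not do much. A minimal sketch for counting what is in a folder, where maildir_count is a hypothetical helper of my own:

```shell
# Hypothetical helper: count the messages in one Maildir folder.
maildir_count() {
    folder=$1
    # Delivered mail lives in cur/ (seen) and new/ (unseen);
    # tmp/ only holds deliveries still in progress, so it is skipped.
    find "$folder/cur" "$folder/new" -type f 2>/dev/null | wc -l | tr -d ' '
}
```

For example, maildir_count /home/ell/Maildir/.Junk prints the number of messages sa-learn will see in the Junk folder.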
There are probably spam corpuses (corpi?) you can use for this or some other smart method I missed, but this seems to work.
Get the content out of an existing spam, send it to an account on your protected server from a free email account (you could even set up a Hotmail account to get the most accurate spam scenario), and see if Spamassassin stops it. Check the logs (/var/log/spamassassin and /var/log/mail.log) to see what happened to the email. If Postfix is using Spamassassin properly, you should see something like this in the logs:
Jan 26 14:56:10 localhost postfix/pipe[12139]: 9CBD5DA4BF: to=<ell@localhost>, relay=spamassassin, delay=17, status=sent (localhost)
(Notice the mention of postfix/pipe.)
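Rather than scrolling through the log by eye, you can count these lines. This is only a sketch: sa_filtered_count is a name I made up, and the pattern assumes your log lines look like the one above.

```shell
# Hypothetical helper: count log lines showing mail handed to the filter.
sa_filtered_count() {
    log_file=$1
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'postfix/pipe.*relay=spamassassin' "$log_file"
}
```

Running sa_filtered_count /var/log/mail.log before and after sending a test message should show the number go up by one.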
Spamassassin might not stop the email, as it will be coming from a legitimate email address. The important thing is that the logs report that Spamassassin was applied.
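Another way to confirm the filter ran is to look at the delivered message itself. By default, as far as I know, Spamassassin adds X-Spam-Checker-Version and X-Spam-Status headers to everything it scans, and X-Spam-Flag: YES to messages it classifies as spam. A small sketch for pulling those headers out of a Maildir message file (spam_headers is a hypothetical helper name):

```shell
# Hypothetical helper: print the X-Spam-* headers of one message file.
spam_headers() {
    message_file=$1
    # Headers end at the first blank line, so stop there and ignore the body.
    # Indented lines continue the previous header and are kept with it.
    awk '
        /^$/       { exit }
        /^X-Spam-/ { show = 1; print; next }
        /^[ \t]/   { if (show) print; next }
                   { show = 0 }
    ' "$message_file"
}
```

Point it at any file under the cur or new directory of the account's Maildir. If it prints nothing, the message never went through Spamassassin.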
I now get very little spam delivered to my inbox: most of it gets stopped by the Postfix server using Spamassassin, and Thunderbird catches the rest. I periodically train the filter to make it improve, using the stuff Thunderbird has decided is junk.
Originally, this was at the top of this post, but it's kind of irrelevant. I've included it here so you know why I put myself through this lunacy.
I have a couple of legacy email accounts which still get the occasional proper email, plus endless spam. I use the excellent fetchmail to download email from these moribund accounts and send it to my local Postfix server instead. I then read my local email account via the Dovecot IMAP server (set up to use Maildir layouts) in the Thunderbird email client. A fairly complex setup, but one which has worked well for me over the past three or four years.
However, a vast amount of spam used to hit my old email accounts, and I'd have to wait for Thunderbird to sort it out (which it does well) once fetchmail passed it to the local Postfix server. I got tired of having to teach the Thunderbird junk controls, so decided instead to set up my local Postfix server with Spamassassin: the stuff that fetchmail pulls down and sends to my local Postfix server now goes through Spamassassin before it hits Thunderbird. As you can tell Thunderbird to trust Spamassassin's spam headers, it will automatically delete anything which Spamassassin has already decided is spam. This means I don't have to teach Thunderbird, and can just teach Spamassassin instead (which is an automatic process), and let it learn by itself too as it finds more spam.
Comments
thanks
Just thanking you for the time you contributed to this article
Spamassassin and Spam Deletion
"most of it gets stopped by the Postfix server using Spamassassin"
I don't think this part is right. From wiki.apache.org/spamassassin:
"SpamAssassin itself will not delete any emails. It's only a filter which reads email in, and passes that same email out, modified in some way. If you want to delete emails, or redirect emails, you need to do it in whatever program calls SpamAssassin. "
The site then goes on to explain how to set up a procmail filter to drop spam at the server, if that is what you want to do.
In your case, do you think what is happening is that Thunderbird's Bayesian filter can learn what is junk instantly because of SpamAssassin headers? Thus you can achieve your end result, very little spam in the Inbox, but all the filtering is done by Thunderbird. I don't see Postfix dropping mail on the floor. Please correct me if I am wrong.
Thank you for this mini-howto. I used it to set up Spamassassin on an Ubuntu server running Courier IMAP in literally minutes, and it is working great. I also use Thunderbird. It is also easy to set a filter in Squirrelmail (web mail) to filter spamassassinated messages to the Inbox.Junk folder:
options/message filtering:
If Header contains X-Spam-Flag: YES then move to Junk
Thanks for the guide. I'd been running with just RBLs, but spam was getting through more and more. I installed Spamassassin and it correctly identifies most spam; I train it every time it misses something.
I forgot to mention I'm running it on a VM, and I/O and memory can be an issue, but so far so good. I only let 2 Spamassassin processes run at once.