Improve SpamAssassin accuracy – sa-learn and Spam Trap

SpamAssassin is probably the most popular anti-spam service to run on your own server. Its mail analysis stops a large part of the spam, but spam bots keep improving their content to avoid being flagged. SpamAssassin comes with a large set of rules, but its accuracy won’t change much unless you teach it!

Training SA helps improve its accuracy. To be effective, however, you need a similar amount of Ham (non-spam) and Spam mails, or more Ham than Spam: if you only train on Spam, SA will be biased toward Spam and could generate false positives.

Obviously, the teaching has to come from you. If you find Ham in your Spam folder, move it to your regular Inbox, and if you find Spam in your regular Inbox, move it to Spam. (Basic learning, I’d say :) )

I assume you have a working mail configuration, with SpamAssassin running and moving Spam into a Spam folder through Procmail or Sieve rules.

If not, here is how to install a Postfix + Dovecot mail system with SSL and here is how to protect your mails with SpamAssassin + ClamAV with Amavis and Procmail.

I train SA continuously in two ways. If you just want a quick script that does it all, skip to the end of the article.


1) SA Learning on current INBOX and SPAM mailboxes

The official documentation has a lot of details, and I recommend checking it if you want to know more.

SpamAssassin comes with its own tool to learn Spam and Ham, called sa-learn. Very easy to use with Maildir or mbox, …

To learn Spam, simply run
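A minimal sketch of that command, wrapped in a small function so it can be previewed safely. It assumes a Maildir under ~/Maildir with the Spam folder named .Junk; learn_spam and the SA_LEARN override are my own illustrative names, not part of SpamAssassin.

```shell
#!/bin/sh
# Learn every message in the Spam folder as spam.
# $1: Maildir root (assumed default: ~/Maildir)
# SA_LEARN: hypothetical override, e.g. SA_LEARN=echo to preview the command
learn_spam() {
    maildir="${1:-$HOME/Maildir}"
    # --no-sync defers the database rebuild until all folders are processed
    "${SA_LEARN:-sa-learn}" --no-sync --spam "$maildir/.Junk/cur" "$maildir/.Junk/new"
}
```

Typed directly at a bash prompt, this boils down to sa-learn --no-sync --spam ~/Maildir/.Junk/{cur,new}.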

where .Junk is the name of your Spam folder inside your Maildir folder. (If you are using mbox, just add the option --mbox before the folder name.)

and to learn Ham:
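For Ham, the folders to point at are the cur and new folders of the inbox itself, directly under the Maildir, and not .Junk. A sketch under the same assumptions (Maildir in ~/Maildir; learn_ham and SA_LEARN are hypothetical names used only for illustration):

```shell
#!/bin/sh
# Learn every message in the inbox as ham (non-spam).
# $1: Maildir root (assumed default: ~/Maildir)
# SA_LEARN: hypothetical override, e.g. SA_LEARN=echo to preview the command
learn_ham() {
    maildir="${1:-$HOME/Maildir}"
    # The inbox lives directly in the Maildir root, so no folder name is needed
    "${SA_LEARN:-sa-learn}" --no-sync --ham "$maildir/cur" "$maildir/new"
}
```

Only run this once the inbox has been cleaned of spam, otherwise SA learns those spam mails as Ham.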

Note that we don’t need to run the learning on the tmp folder, as it should be empty most of the time.

Once done, you need to tell SA to rebuild its database with the command:
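A sketch of that step: sa-learn --sync merges what was learned with --no-sync into the database, and sa-learn --dump magic then prints the counters (number of Spam and Ham messages learned, number of tokens) so you can check the result. The function name sync_bayes and the SA_LEARN override are hypothetical.

```shell
#!/bin/sh
# Rebuild the Bayes database, then show its statistics.
# SA_LEARN: hypothetical override, e.g. SA_LEARN=echo to preview the commands
sync_bayes() {
    # Stop here if the sync fails, the statistics would be misleading
    "${SA_LEARN:-sa-learn}" --sync || return 1
    "${SA_LEARN:-sa-learn}" --dump magic
}
```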

You can then build your own rules, scripts or cron jobs based on these commands.


2) Spam Trap

You may want to create a dedicated mailbox for known spammy websites, like (Go ahead bots, send emails to this mailbox, they will all be flagged as spam).

To do so, after creating this mailbox, you can either adapt the previous commands or run a cron job so that SA learns from it (regardless of whether a mail is in the Spam folder or not), or set up a rule in your procmail so that any mail sent there is learned as Spam, such as:
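A possible recipe, as a sketch only: spamtrap@example.com is a placeholder for your own trap address, and I am assuming sa-learn is in procmail’s PATH. The message is piped to sa-learn, which reads it from standard input.

```
# Placeholder address: replace spamtrap@example.com with your trap mailbox
# c: work on a copy, so the message is still delivered afterwards
:0 c
* ^TO_spamtrap@example\.com
| sa-learn --spam
```

Without --no-sync, every trapped message is synced into the database right away, which is simple but slower on a busy trap.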

But in my case, I prefer to have a single script to deal with all of this.


SA-Learn script coupled with Spam Trap

If you are looking for a full script that scans all users’ ham and spam folders, automatically deletes old spam from your system and provides a backup of the SpamAssassin rules (in case you want to use them on a different system, or just as a pure backup), here is a script you could use.

Create the script file (wherever you want), for example:

and paste:
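What follows is a sketch of such a script, written from the description in this article rather than a definitive version. Every setting name (USERS, SPAMTRAP, OLD_SPAM_DAYS, BACKUP_DIR, HOME_BASE, LOG, ENABLED, SA_LEARN) and every default value is an assumption to adapt. It also assumes mailboxes live in /home/USER/Maildir, that the Spam folder is .Junk, and that the script runs as a user whose Bayes database is the one SpamAssassin really uses (for example a site-wide bayes_path).

```shell
#!/bin/sh
# Sketch of a training script; all settings below are assumed examples.
USERS=""                      # space-separated users, empty = everyone with a Maildir
SPAMTRAP="spamtrap"           # hypothetical spam trap account, empty = none
OLD_SPAM_DAYS=30              # delete spam older than N days, 0 = keep everything
BACKUP_DIR="/var/backups/spamassassin"
HOME_BASE="/home"
LOG="/var/log/sa-learn.log"
SA_LEARN="${SA_LEARN:-sa-learn}"   # SA_LEARN=echo gives a dry run
ENABLED=0                     # set to 1 once the settings above are reviewed

# Learn the cur and new folders of one mail folder; $1 = --ham or --spam, $2 = folder.
learn() {
    [ -d "$2/cur" ] || return 0
    "$SA_LEARN" --no-sync "$1" "$2/cur" "$2/new"
}

# Print the accounts to process: $USERS, or every directory holding a Maildir.
list_users() {
    if [ -n "$USERS" ]; then
        printf '%s\n' $USERS
        return
    fi
    for d in "$HOME_BASE"/*/Maildir; do
        [ -d "$d" ] || continue
        basename "$(dirname "$d")"
    done
}

# Delete spam older than $2 days from the Spam folder of Maildir $1.
purge_old_spam() {
    [ -d "$1/.Junk/cur" ] || return 0
    find "$1/.Junk/cur" "$1/.Junk/new" -type f -mtime +"$2" -delete
}

main() {
    for u in $(list_users); do
        [ "$u" = "$SPAMTRAP" ] && continue      # the trap is handled below
        echo "$u Ham Scan";  learn --ham  "$HOME_BASE/$u/Maildir"
        echo "$u Spam Scan"; learn --spam "$HOME_BASE/$u/Maildir/.Junk"
    done
    if [ -n "$SPAMTRAP" ]; then
        # Everything the trap receives is spam, inbox and Spam folder alike
        echo "$SPAMTRAP Spam Trap Scan"
        learn --spam "$HOME_BASE/$SPAMTRAP/Maildir"
        learn --spam "$HOME_BASE/$SPAMTRAP/Maildir/.Junk"
    fi
    "$SA_LEARN" --sync
    if [ "$OLD_SPAM_DAYS" -gt 0 ]; then
        for u in $(list_users); do
            purge_old_spam "$HOME_BASE/$u/Maildir" "$OLD_SPAM_DAYS"
        done
    fi
    "$SA_LEARN" --dump magic
    mkdir -p "$BACKUP_DIR" && "$SA_LEARN" --backup > "$BACKUP_DIR/sa-bayes-backup.txt"
}

if [ "$ENABLED" = 1 ]; then main >> "$LOG" 2>&1; fi
```

Because it deletes mail, the sketch does nothing until ENABLED is set to 1. A backup made this way can be loaded on another system with sa-learn --restore.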

Just set the users you want to monitor (leave it blank for all), the spam trap if any, the old spam deletion timeframe and the backup folder path.

The script will then scan the selected users (or all of them, excluding the spam trap at first), learning Ham from the cur/new folders of the INBOX and Spam from the cur/new folders of the SPAM box. Then it will scan the spam trap account and flag all its mails as spam (from both the INBOX and SPAM folders). Finally it will sync the SA database, remove old spam (if requested), print the statistics and create a backup file of the SA rules.

Don’t forget to make it executable (chmod +x) and set up a cron job, for example every day at 1am:
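A crontab entry along these lines would do it, assuming the script was saved as /usr/local/bin/sa-train.sh (a hypothetical path, use your own):

```
# minute hour day-of-month month day-of-week command
0 1 * * * /usr/local/bin/sa-train.sh
```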

or, if the mailboxes are not used often, only once a month, on the first day of the month at 1am:
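The monthly schedule sets the day-of-month field to 1 (again with /usr/local/bin/sa-train.sh as a hypothetical path):

```
# minute hour day-of-month month day-of-week command
0 1 1 * * /usr/local/bin/sa-train.sh
```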



7 thoughts on “Improve SpamAssassin accuracy – sa-learn and Spam Trap”

  1. Sure Johna,
    What part is complicated? Maybe I can add a few details if needed. Let me know where your issues are and I’ll check what I can do!

  2. Great post – though looks like an error on line 24 of the script, which needs to be double square bracketed, e.g.

    if [[ -z ${user[@]} ]]; then

    1. Hi Jerrold,

      Thanks for your feedback.
      I’m not certain why it needs double square brackets? The syntax with single square brackets should work normally…

      Are you having issues with the script?

      Anyone else with this issue?

      Thanks for the feedback. I’ll run a few tests and see if I need to revise the script.

  3. Howdy,
    You have a typo in the top walk through. When talking about the HAM scan, you still have it pointing to the junk mail folder.


    1. Hi Nathan,
      Which part are you referring to?

      The one in the script:
      echo $u Ham Scan>> $log 2>&1
      sa-learn --no-sync --ham /home/$u/Maildir/{cur,new} >> $log 2>&1

      Seems to be linked to ham folder, not spam.

  4. @karibu

    hi, i think he is referring to the text further up

    and to learn Ham:
    sa-learn --no-sync --ham ~/Maildir/.Junk/{cur,new}
