Setting up Site-Wide Bayesian Filtering

In local.cf, tell SpamAssassin where to find the Bayesian database files:

bayes_path /var/spamassassin/bayes/bayes
bayes_file_mode 0777

This tells SpamAssassin that the Bayesian filter database files will be /var/spamassassin/bayes/bayes_msgcount, _seen and _toks. Feel free to move them wherever you want. Please note that this directory needs to be readable, writable and executable (RWX) by every user that SpamAssassin will be executed as; many sites use world RWX to simplify this. The directory also shouldn't contain any files other than your Bayes database. If it contains any other files whose names start with "bayes", they can break the locking mechanisms SpamAssassin uses.
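As a rough sketch of the world-RWX approach described above, the directory could be prepared as follows. The helper name setup_bayes_dir is made up for illustration, and the example path simply mirrors the directory part of the bayes_path shown earlier; adjust both to your site.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin): create the Bayes
# directory and make it world-RWX.
setup_bayes_dir() {
    dir="$1"
    mkdir -p "$dir" || return 1
    # 0777 = read/write/execute for owner, group and world.
    chmod 0777 "$dir"
}

# Example invocation (assumed path, normally run as root):
# setup_bayes_dir /var/spamassassin/bayes
```

World RWX is the simplest option, but it lets any local user modify the database, so sites with untrusted shell users may prefer the group-based setup discussed further down.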

Now start feeding the Bayesian filter spam and ham messages.

sa-learn --spam --showdots --dir /path/to/directory/full/of/spam/msgs
sa-learn --ham --showdots --dir /path/to/directory/full/of/ham/msgs

See SiteWideBayesFeedback for more tips on getting an entire site to feed back spam and ham messages into the Bayesian filter.

Also, if you're running spamd, restart it so that it re-reads local.cf and enables the Bayes filter:

/etc/init.d/spamassassin restart
-or-
service spamassassin restart

Your method of restarting spamd may differ, but the above is typical. If you're using an MTA integration that invokes SpamAssassin through its Perl API (e.g. MailScanner or MIMEDefang), that process will need to be restarted or told to reload its configuration, as it is effectively its own spamd.

You may experience difficulties with file permissions. Make sure you chmod any existing Bayes files so that they are readable and writable by your user groups (or by world, if that is the approach you chose).
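One possible way to fix up existing database files is sketched below. The helper name fix_bayes_file_modes and the default mode of 0666 (world read/write) are assumptions for illustration; with group rights you would pass a mode such as 0660 instead.

```shell
#!/bin/sh
# Hypothetical helper: make existing Bayes database files readable and
# writable. Assumes the files are named bayes_* inside the given directory,
# as they are with the bayes_path shown earlier.
fix_bayes_file_modes() {
    dir="$1"
    mode="${2:-0666}"   # default: read/write for owner, group and world
    for f in "$dir"/bayes_*; do
        [ -f "$f" ] || continue   # skip if the glob matched nothing
        chmod "$mode" "$f" || return 1
    done
}

# Example invocation (assumed path):
# fix_bayes_file_modes /var/spamassassin/bayes 0666
```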

If you are going to use group rights instead of world RWX, there are some additional issues you will need to consider. If you use spamd and mail gets scanned on behalf of "root", spamd will use "nobody" as its effective user for Bayes database access. You should take this user into account when planning your group memberships. Also, be aware that the files are deleted and recreated by whichever user happens to be running SpamAssassin when an expiration is due. If you are not using world RWX, this means the files will lose any group ownership you have set unless you make the directory setgid.
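A minimal sketch of the setgid variant follows. The helper name setup_group_bayes_dir and the group name "spamusers" are hypothetical; the group must already exist and contain every user that runs SpamAssassin (including "nobody" if spamd scans mail for root). With this layout you would presumably also change bayes_file_mode in local.cf to something like 0770.

```shell
#!/bin/sh
# Hypothetical helper: create a group-owned, setgid Bayes directory so that
# files recreated during expiry keep the directory's group.
setup_group_bayes_dir() {
    dir="$1"
    group="$2"
    mkdir -p "$dir" || return 1
    chgrp "$group" "$dir" || return 1
    # 2770: the leading 2 is the setgid bit; 770 gives owner and group RWX
    # and denies access to everyone else.
    chmod 2770 "$dir"
}

# Example invocation (assumed path and group name, normally run as root):
# setup_group_bayes_dir /var/spamassassin/bayes spamusers
```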

See Mail::SpamAssassin::Conf(3) for details.


CategoryBayes
