...

If you're having trouble with Bayes, see BayesFaq for help.

Things to remember

  • Do not train Bayes on different mail streams or public spam corpora. These methods will mislead Bayes into believing certain tokens are spammy or hammy when they are not.
  • To train SpamAssassin, take a mailbox full of messages that you know are spam and use the sa-learn program to extract the tokens and remember them for later:
    sa-learn --showdots --mbox --spam spam-file
    Then take a mailbox full of messages you're sure are ham and teach Bayes about those:
    sa-learn --showdots --mbox --ham ham-file
    It is important to do both.
  • The Bayesian classifier can only score new messages once it has learned at least 200 known spams and 200 known hams.
  • If SpamAssassin fails to identify a spam, teach it so it can do better next time: run the message through sa-learn --spam, and SpamAssassin will be more likely to identify similar mail as spam. Likewise, if SpamAssassin puts a ham in your spam folder, run that message through sa-learn --ham.
  • It's OK to feed emails with SpamAssassin markup into the sa-learn command: sa-learn ignores any standard SpamAssassin headers, and if the original email has been encapsulated into an attachment, it decapsulates the email. In other words, sa-learn undoes any changes SpamAssassin has made before learning the spam/ham character of the email.
  • If you or an upstream service have added headers to the emails that may mislead Bayes, those headers should probably be removed before feeding the email to sa-learn. Alternatively, use the bayes_ignore_header setting in your local.cf (as detailed in the man page for Mail::SpamAssassin::Conf).
  • An example of a ham-file could be ~/mail/saved-messages, or wherever your email client saves messages. Make sure all spam is deleted before using sa-learn on a ham-file.
    As with the training example above, for a Maildir-format mailbox the commands should be altered as shown below.

...
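
The 200-spam / 200-ham requirement mentioned above can be checked from the command line. The following is a minimal sketch, not an official tool: it assumes that sa-learn --dump magic prints lines ending in "non-token data: nspam" and "non-token data: nham" with the message count in the third column, and that the thresholds are the 200/200 figures quoted above. The function names bayes_counts and bayes_ready are made up for this illustration.

```shell
#!/bin/sh
# Sketch: report whether Bayes has learned enough spam and ham to start scoring.
# Assumption: thresholds of 200 spam and 200 ham, as stated in the text above.
MIN_SPAM=200
MIN_HAM=200

# Hypothetical helper: reads `sa-learn --dump magic` output on stdin
# and prints "<nspam> <nham>" (0 for any count that is missing).
bayes_counts() {
    awk '
        /non-token data: nspam/ { spam = $3 }
        /non-token data: nham/  { ham  = $3 }
        END { print spam + 0, ham + 0 }
    '
}

# Hypothetical helper: reads the same output on stdin, prints the counts,
# and succeeds only if both counts reach the minimums.
bayes_ready() {
    set -- $(bayes_counts)
    echo "spam learned: $1, ham learned: $2"
    [ "$1" -ge "$MIN_SPAM" ] && [ "$2" -ge "$MIN_HAM" ]
}

# Only query the live database when sa-learn is actually installed.
if command -v sa-learn >/dev/null 2>&1; then
    if sa-learn --dump magic | bayes_ready; then
        echo "Bayes has enough training data to score messages."
    else
        echo "Bayes needs more training before it will score messages."
    fi
fi
```

Run it as the same user whose Bayes database you train, since sa-learn reads that user's database by default. If the counts are short, keep feeding known spam and ham through sa-learn as described above.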