Nightly MassCheck runs are how contributors submit data on how well the current rules perform against their recent spam and ham. The results are used to generate the rule scores that determine the effectiveness of SpamAssassin (distributed via sa-update), and to evaluate rules via the RuleQaApp. The accuracy of SpamAssassin is directly related to the number of people contributing to nightly MassChecks.
Usually a script run from cron automatically downloads the latest development version of SpamAssassin, runs it against your spam and ham, and then uploads a log of the results. The log contains one line per email, listing the SpamAssassin rules that email hit.
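As a rough illustration of what such a log makes possible, the sketch below tallies how often each rule appears. It assumes each result line has whitespace-separated fields in the order flag, score, message path, comma-separated rule names, followed by metadata; that layout is an assumption here and should be checked against your own logs. The function name count_rule_hits is hypothetical and is not part of SpamAssassin.

```shell
#!/bin/sh
# Tally rule hits in one or more mass-check style log files.
# ASSUMPTION: each result line looks like
#   <flag> <score> <path> <RULE_A,RULE_B,...> <key=value metadata>
# Lines starting with "#" are treated as comments.
# count_rule_hits is a hypothetical helper name used for illustration.
count_rule_hits() {
    awk '
        /^#/ || NF < 4 { next }   # skip comment lines and lines too short to carry a rule list
        $4 ~ /=/       { next }   # no rules hit: field 4 is already key=value metadata
        {
            n = split($4, rules, ",")
            for (i = 1; i <= n; i++)
                if (rules[i] != "") hits[rules[i]]++
        }
        END { for (r in hits) printf "%d %s\n", hits[r], r }
    ' "$@" | sort -k1,1nr -k2,2   # highest count first, then by rule name
}
```

For example, count_rule_hits ~/masscheckwork/nightly_mass_check/spam-*.log | head would show the rules that hit most often in your spam, if the assumed layout matches.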
To set up the auto-mass-check script:
1. Clone the repository: git clone git://git.fedorahosted.org/auto-mass-check.git
2. Copy auto-mass-check/auto-mass-check.sh to ~/bin/
3. Copy auto-mass-check/auto-mass-check.cf to ~/.auto-mass-check.cf
4. Edit ~/.auto-mass-check.cf to point at your ham and spam folders. Be sure to configure it properly for mbox (mbox) or Maildir (dir) folder formats. Leave the RSYNC options unchanged for now, because you will be running auto-mass-check in test mode at first.
5. Run auto-mass-check.sh.
6. Look in ~/masscheckwork/nightly_mass_check/ for ham-*.log and spam-*.log files. (Or weekly_mass_check on Saturday.) The file names follow the pattern ham-username.log or ham-net-username.log.
7. Edit ~/.auto-mass-check.cf and set RSYNC_USERNAME and RSYNC_PASSWORD to the username and password you were given.
8. Run auto-mass-check.sh again, which will upload your results.
9. To list the contents of the upload area, run: rsync --old-d username@rsync.spamassassin.org::corpus/
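After a test run, the check for ham-*.log and spam-*.log files can be scripted. The following is a minimal sketch, not part of auto-mass-check: the function name check_masscheck_logs is hypothetical, and the only details taken from the steps above are the default directory and the two file name patterns.

```shell
#!/bin/sh
# Report ham-*.log and spam-*.log files in a mass-check work directory,
# flagging kinds that are missing and files that are empty.
# check_masscheck_logs is a hypothetical helper name used for illustration.
check_masscheck_logs() {
    dir=${1:-"$HOME/masscheckwork/nightly_mass_check"}
    status=0
    for kind in ham spam; do
        found=0
        for f in "$dir"/"$kind"-*.log; do
            [ -e "$f" ] || continue          # the glob matched nothing
            found=1
            if [ -s "$f" ]; then
                lines=$(wc -l < "$f" | tr -d ' ')
                echo "ok: $f ($lines lines)"
            else
                echo "empty: $f"
                status=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "missing: no $kind-*.log in $dir"
            status=1
        fi
    done
    return "$status"                          # 0 only if both kinds exist and are non-empty
}
```

Calling check_masscheck_logs with no argument inspects the nightly directory; pass ~/masscheckwork/weekly_mass_check to inspect a Saturday run instead.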
(External documentation for auto-mass-check script.)
The easiest method of all is to upload your corpora and let us process them for you: UploadedCorpora
The corpus-nightly script is a less maintained alternative to the auto-mass-check script: CorpusNightlyScript
Or you can do it manually: ManualNightlyMassCheck