Rescore Mass-checks for Set 0 and Set 1

(THIS IS ONLY A DRAFT RIGHT NOW)

The mass-check runs for 3.0.0 will be starting shortly. Here's the procedure you'll need to follow, if you wish to submit data for the rescoring run:

First, send mail to <submit.at.spamassassin.org>, and ask for a log-submission account if you haven't already got one.

If you're running nightly mass-checks, you can turn them off if you want; they aren't important while this is going on.

Then run these commands:

  wget http://SpamAssassin.apache.org/released/Mail-SpamAssassin-3.0.0-pre2.tar.gz
  tar xvfz Mail-SpamAssassin-3.0.0-pre2.tar.gz
  cd Mail-SpamAssassin-3.0.0
  perl Makefile.PL < /dev/null; make

  cd masses
  mkdir -p spamassassin
  rm -f spamassassin/bayes*
  echo "use_bayes 0" > spamassassin/user_prefs
  echo "bayes_auto_learn 0" >> spamassassin/user_prefs
  rm -f ham.log spam.log

  ./mass-check --net -j 4 --restart=400 --all <targets>

<targets> is the list of corpora to scan (directories, mboxes, etc.), each written like
spam:dir:~/Mail/spam. See the comments at the top of "mass-check" for details.
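
As a rough sketch, a run over one ham corpus and one spam corpus might be assembled like this. The two paths are hypothetical, and the ham target simply mirrors the shape of the spam:dir:~/Mail/spam example above; check the comments at the top of "mass-check" before relying on it. The command is only echoed here so it can be reviewed before starting a long run.

```shell
# Hypothetical corpus locations; substitute your own.
HAM_TARGET="ham:dir:$HOME/Mail/ham"
SPAM_TARGET="spam:dir:$HOME/Mail/spam"

# Print the full command for review instead of running it straight away.
echo ./mass-check --net -j 4 --restart=400 --all "$HAM_TARGET" "$SPAM_TARGET"
```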

This takes *ages* to run. -j 4 controls the number of processes to use; 4 should be OK for a single-processor machine, since most of the time they'll be waiting for network results to arrive.

If you have an unusual network layout, you may need to specify
trusted_networks in the spamassassin/user_prefs file. But SA should be able to infer it in most cases.
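
If you do need to set it, something along these lines, run from the masses directory, should work. The address range is purely an illustrative assumption; use the relays you actually trust.

```shell
# Hypothetical address range; replace it with your own trusted relays.
mkdir -p spamassassin
echo "trusted_networks 192.168.0.0/16" >> spamassassin/user_prefs
```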

Once it finishes:

  USER="[whatever your username is]"
  RSYNC_PASSWORD="[whatever your password is]"
  export RSYNC_PASSWORD

  rsync -CPcvuzb ham.log $USER@rsync.spamassassin.org::submit/ham-nobayes-net-$USER.log
  rsync -CPcvuzb spam.log $USER@rsync.spamassassin.org::submit/spam-nobayes-net-$USER.log
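
Before running the rsync commands above, it may be worth checking that both logs exist and are non-empty. The helper below is a hypothetical sketch, not part of the SpamAssassin tools, and it assumes that comment lines in the logs start with "#".

```shell
# check_logs is a hypothetical helper, not shipped with SpamAssassin.
# It fails if a log is missing or empty, and otherwise reports how many
# non-comment lines it holds (assuming comment lines begin with "#").
check_logs() {
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 1
    fi
    echo "$f: $(grep -vc '^#' "$f") result lines"
  done
}

check_logs ham.log spam.log || echo "fix the logs before uploading"
```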

That's it! The bayes+nonet and bayes+net runs will follow later on.

The results for this run will need to be in by Monday July 19th.
