Versions Compared

Comment: [Original edit by JustinMason] move administrivia to new page

...

If you rsync your corpus up to our server, as described in UploadedCorpora, it can be mass-checked there. Unfortunately, this means sharing your mail corpus with whoever has access to that machine. (It's not expected that anyone will ever actually look, but the possibility is there nonetheless.) If you are very concerned about privacy, you may want to strip out the more private mails before uploading, or mass-check on your own machine instead. (This is what I do --jm)

Details for PMC members on how to set up new accounts for this are at NewUploadedCorporaUser.

How? (Less Easy, The Corpus-Nightly Script)

...

(The version of the tree available at rsync://rsync.spamassassin.org/tagged_builds/nightly_mass_check and .../weekly_mass_check already has this file included.)

(Administrivia: setting up a nightly mass-check user on spamassassin.zones.apache.org)

This section is for PMC members who want to set up a user for the "Easiest" method. Log in to the zone and run:

No Format

MCUSER=[username]
MCPWD=[random password]

sudo mkdir /export/home/nitemc/$MCUSER
sudo chmod 1777 /export/home/nitemc/$MCUSER
cd /export/home/nitemc/$MCUSER
echo "$MCPWD" > rsync_password
chmod 600 rsync_password

sed -e "s/MCUSER/$MCUSER/g" -e "s/MCPWD/$MCPWD/g" > .corpus

And paste in these lines:

No Format

opts_weekly="--net -j 8 --reuse --cache --cachedir=/tmpfs/aicache_nightly --cs_schedule_cache --cs_cachedir=/export/home/nitemc/cache --restart=500 ham:detect:/export/home/bbmass/uploadedcorpora/MCUSER/ham/* --after="-15552000" --tail=25000 spam:detect:/export/home/bbmass/uploadedcorpora/MCUSER/spam/*"
opts_nightly=" --reuse --cache --cachedir=/tmpfs/aicache_nightly --cs_schedule_cache --cs_cachedir=/export/home/nitemc/cache --restart=500 ham:detect:/export/home/bbmass/uploadedcorpora/MCUSER/ham/* --after="-15552000" --tail=25000 spam:detect:/export/home/bbmass/uploadedcorpora/MCUSER/spam/*"
tmp=$HOME/tmp
tree=$HOME/svn
prefs_weekly=$HOME/user_prefs.weekly
prefs_nightly=$HOME/user_prefs.nightly
username=bb-MCUSER
password=__RSYNC_PASSWORD__
serverhost=spamassassin.zones.apache.org.:38899
clienthosts=__CLIENTHOSTS__
clienttree=nightlymc_MCUSER

Then press CTRL-D to end the input to sed.
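The placeholder substitution in this step can be tried out safely before touching the real .corpus file. The following is a minimal sketch, not part of the documented procedure: the helper name fill_template, the sample user "jm", and the shortened opts_demo line are all hypothetical stand-ins. It uses the "g" flag so that lines containing MCUSER more than once (such as the opts lines, which name both the ham and spam directories) are substituted completely.

```shell
#!/bin/sh
# Sketch only: fill_template, the sample username and the opts_demo line
# are hypothetical and exist purely for illustration.

# Read a template on stdin and replace every MCUSER placeholder with $1.
# The trailing "g" matters because some template lines contain MCUSER twice.
fill_template() {
    sed -e "s/MCUSER/$1/g"
}

MCUSER=jm   # sample value

# Shortened stand-in for the real template; the result goes to stdout
# instead of .corpus so nothing is overwritten.
fill_template "$MCUSER" <<'EOF'
opts_demo="ham:detect:/export/home/bbmass/uploadedcorpora/MCUSER/ham/* spam:detect:/export/home/bbmass/uploadedcorpora/MCUSER/spam/*"
username=bb-MCUSER
clienttree=nightlymc_MCUSER
EOF
```

This assumes the username contains no characters that are special to sed, such as "/" or "&".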

No Format

mkdir tmp
svn co http://svn.apache.org/repos/asf/spamassassin/trunk svn
# (if prompted, accept the certificate 'p'ermanently)

sudo chown -R nitemc .

In SVN trunk, edit build/nightlymc/run_nitemc, add the new user's username to the list, and check that file in.

Then in the zone, as the uid "automc", do this:

No Format

  cd /home/automc/svn/spamassassin
  svn up

so that the latest version of the script is in place when cron next runs.

Finally, edit /home/corpus-rsync/secrets and add a line to the end, like so:

No Format

$MCUSER:$MCPWD

e.g. if MCUSER was "bb-jm" and the generated MCPWD was "Wi0FdPWg":

...