High-level how

sa-update rule updates for the 3.3 branch are generated automatically, and published unless publishing is disabled (see DisableAutoRuleUpdates), by a script that runs from the updatesd user's crontab on spamassassin.zones.apache.org.

The script is /export/home/updatesd/svn/mkupdates-with-scores/do-stable-update-with-scores

A little lower how

The script in turn runs two other scripts. The first (/home/updatesd/svn/new-rule-score-gen/do-nightly-rescore-example) generates new scores based on the last nightly mass-check and the last weekly mass-check. The second (/home/updatesd/svn/mkupdates-with-scores/mkupdate-with-scores) generates an update package that includes the scores generated by the first script, tests the package against all stable versions of 3.3, and then publishes it in DNS (unless this is disabled) for every stable 3.3 version against which the package passed a lint test.
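
The publish decision described above can be sketched roughly as follows. This is only an illustration, not the actual logic in mkupdate-with-scores: the function name, the dict of lint results, and the publishing_enabled flag are all hypothetical stand-ins.

```python
def versions_to_publish(lint_results, publishing_enabled=True):
    """Pick the stable versions an update package may be published for.

    lint_results maps a version string to True if the package passed the
    lint test against that version (hypothetical representation).
    """
    if not publishing_enabled:
        # Publishing disabled: the package is built and tested,
        # but nothing goes out in DNS.
        return []
    # Only versions whose lint test passed receive the update.
    return sorted(version for version, passed in lint_results.items() if passed)
```

For instance, with hypothetical results where 3.3.1 fails lint, only the other versions would be selected, and nothing is selected when publishing is disabled.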

Score Generation How

Voodoo!

Basically, the script uses the latest weekly and nightly mass-check results and the active.list rules that go with those mass-check versions. It feeds the mass-check results to the garescorer. It locks the scores on the non-sandbox rules (the ones in the rules/ dir, rather than rulesrc/) so that the garescorer can't change them (the theory is that they were set with a more concentrated mass-check effort). The garescorer then goes to town on the new rules and assigns them scores, taking into account the existing rules' scores and the nightly mass-check logs.
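
The locking step can be pictured as splitting the rules by where they live. The sketch below assumes a hypothetical mapping from rule name to the relative path of the file that defines it. It is not how the real script does it, and the rule and file names in the usage note are made up.

```python
from pathlib import PurePosixPath


def split_locked(rule_sources):
    """Split rules into (locked, free) sets by their top-level directory.

    Rules defined under rules/ keep their existing scores (locked);
    everything else, e.g. sandbox rules under rulesrc/, is left for the
    rescorer to assign.
    """
    locked, free = set(), set()
    for rule, path in rule_sources.items():
        top_dir = PurePosixPath(path).parts[0]
        if top_dir == "rules":
            locked.add(rule)
        else:
            free.add(rule)
    return locked, free
```

Called with {"OLD_RULE": "rules/20_example.cf", "NEW_RULE": "rulesrc/sandbox/someone/20_new.cf"}, it would lock OLD_RULE and leave NEW_RULE free to be rescored.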

Currently, any rules scored at zero are reset to 0.001. There might be a flaw right now where net rules get turned on by this, too, when they shouldn't be. Although tflags net might turn them off anyway (I can't remember).
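
To make the concern concrete: SpamAssassin does not run a rule whose score is 0, so bumping every zero to 0.001 keeps a rule enabled. A minimal sketch of that reset, assuming scores are held as a dict of rule name to a list of per-scoreset floats (a hypothetical representation, not the script's actual data structure):

```python
def bump_zero_scores(scores, floor=0.001):
    """Return a copy of scores with every exact zero replaced by floor."""
    return {
        rule: [floor if value == 0 else value for value in values]
        for rule, values in scores.items()
    }
```

A net rule that is deliberately scored 0 in the non-network scoresets, e.g. [0, 1.5, 0, 1.2], would come out as [0.001, 1.5, 0.001, 1.2], which is the possible flaw mentioned above.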
