Rules Project: BuildBot

(part of RulesProjectPlan)

For active rule development, it's important to get rules mass-checked very quickly.

Loren outlined the system used in SARE:

For active rule development, this is obviously quite useful! If you can't run mass-check locally for whatever reason, it offers a way to do this using other people's corpora in almost-real-time.

JustinMason: We had been considering a pretty direct copy of this. However I think I've thought of an alternative that improves on it, in my opinion...

The Preflight Mass-checking BuildBot

We set up a BuildBot. It is dedicated to performing mass-checks in the zone, in a chroot, solely on the rules found in the "rulesrc/sandbox" tree. It uses a small(ish) corpus so that mass-checks run quickly.

Mail-submitted rules: if a mail arrives from a known good source at a certain address, and contains a text/plain attachment, that attachment is extracted and passed through "spamassassin --lint". If it passes, it's checked in as "rulesrc/sandbox/mailed/latest.cf", triggering the preflight buildbot to start its mass-checks.

Results are visible immediately through the normal buildbot web UI. They are presented as 'freqs' output (as usual).

Because mail-submitted rules are OK, we have to be a little more careful than normal. For example, plugins are ignored in mail-submitted rules. Nothing happens if --lint fails. In addition, the preflight mass-check script runs with very paranoid ulimits, to trap CPU-load problems.

Why use BuildBot?

In terms of results of mail-in rule checking; there's no automated check-ins into the "sandboxes" or "core" from this. Instead rules considered suitable for use are manually checked into the "sandbox" area by one of the committers who has privs to do that. With luck, they'll go into the core based on the automated testing described in RulesProjStreamlining. This is a rule-QA system, not an alternative to "svn commit".

Issues

The mailed-in rules are checked into SVN temporarily, therefore will be publicly viewable.

JustinMason: In my opinion, we should go with this for now anyway. The use of SVN makes it hard to make it privately-viewable only, but at the same time offers some of the biggest advantages – such as keeping archive copies of the submitted rules after the mass-check has completed, providing network transparency, and ensuring that submitted rules get mass-checked (eventually).