
...

  1. The "not enough rules" problem: SpamAssassin does not have enough high-quality spam-catching rules. Anecdotally, our FN ratio seems to be much higher with 3.1 than with 3.0 (we won't know for sure until the mass-checks are done). There may be a variety of reasons for this:
    • The SpamAssassin committers are not spending much time writing rules. Attempts to recruit people to become committers to write rules have been somewhat unsuccessful. We could always use more committers and contributors; what can we do to encourage more contribution?
    • People who write rules for their own use are not willing to go through the fairly elaborate process required to submit them to SpamAssassin (currently, rules must go through Bugzilla, then through 70_testing.cf, and eventually into our distribution). What can we do to make this process easier and more inviting?

  2. The "release cycle" problem: Any high-quality rules that are incorporated into SpamAssassin are not distributed until the next release. Since rules and code are tied together, the release cycle for rules is too long. Submitted rules are not distributed while they are most effective, and rules lose their effectiveness too quickly.

...

Outstanding Tasks/Votes

Here's a list of the tasks that have fallen out of the above plan so far... we now need to vote to go forward with these, then put them into action.

...

  • PMC vote to approve the sandboxes project (RulesProjSandboxes).
    • VOTE: passed!
    • Done
  • reorganise the rules directory into core/, sandbox/, and extra/; link the rules project SVN repository to 3.2.0's 'rules' dir, using SVN externals to do this.
    • VOTE: passed
    • Done
  • move current ruleset into a new "core" area
    • Done
  • write scripts to test, filter, and pull rules from sandboxes automatically into core/ production ruleset
    • Dropped in favour of:
  • write scripts to test, filter, and pull rules from sandboxes and core, as a compilation step, into an output directory (see RulesProjPromotion)
    • Done
  • start using the above scripts to generate ruleset in svn
    • Done!

Phase two – mass-checking systems:

  • Weekly mass-check
    • DONE: all rules, with --net
  • Nightly mass-check: web-based user interface for the following data:
    • DONE: freqs for all rules
    • DONE: freqs collated across all users' corpora, or individually
    • DONE: overlaps between rules
    • DONE: historical rule hits data?
    • TODO: rule-by-rule comparative performance figures?
    • DONE: promotion criteria as defined in RulesProjPromotion, so rules that can be promoted can be identified at a glance

Phase three:

  • DONE: the RulesProjBuildBot system, comprising these tasks:
    • DONE: set up new buildbot master - http://buildbot.spamassassin.org:8011/
    • DONE: set up user in zone
    • DONE: set up new buildbot slave
    • DONE: set up chroot jail
    • DONE: get mass-check running in chroot
    • DONE: copy in corpus
    • DONE: set up additional slaves for additional corpora
    • DONE: write mass-check wrapper script to:
      • DONE: mass-check that corpus, using rules from .../rulesrc/sandbox/ only
      • DONE: implement strict ulimits
    • DONE: write mass-check-completed script to:
      • DONE: output freqs so it's visible through the Buildbot UI
    • DONE: write mail-handling script to extract mail-submitted rule attachments (PreflightByMail)
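
The "implement strict ulimits" step in the wrapper task above can be illustrated with a minimal sketch. This is not the project's actual wrapper script: the limit values, the function names, and the command being run are all assumptions made for illustration. It only shows the general technique of applying resource limits to a child process before it executes, so that an untrusted sandbox rule cannot consume unbounded CPU or memory.

```python
import resource
import subprocess
import sys

# Hypothetical limits; the plan does not state the values actually used.
CPU_SECONDS = 60
ADDRESS_SPACE_BYTES = 1024 * 1024 * 1024  # 1 GiB


def _cap(kind, value):
    """Set soft and hard limits for `kind`, never exceeding the current hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def apply_limits():
    """Runs in the child just before exec, so only the child is restricted."""
    _cap(resource.RLIMIT_CPU, CPU_SECONDS)
    _cap(resource.RLIMIT_AS, ADDRESS_SPACE_BYTES)


def run_limited(argv):
    """Run `argv` under the limits above and return (exit_code, combined_output)."""
    proc = subprocess.run(
        argv,
        preexec_fn=apply_limits,  # POSIX only
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Placeholder command; a real wrapper would invoke mass-check here.
    code, output = run_limited([sys.executable, "-c", "print('ok')"])
    print(code, output, end="")
```

A process that exceeds the CPU limit is terminated by a signal, which `subprocess` reports as a negative exit code, so a wrapper along these lines could flag runaway rules instead of hanging the build slave.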

Loose-ends-tying-up:

  • TODO: still need to nail down promotion criteria, in particular performance figures
  • TODO: fix bugs as they crop up