The Brain{Storm,Dump}

This is a page for SpamAssassin developers and other interested parties to collaboratively jot down some ideas. Some of these might evolve into real code one day; others might stay here forever. But as long as they are here, they are generally up for adaptation, and proof-of-concept implementations are welcome. Actual feature requests should still go to Bugzilla. If you write down an idea, try to think of all the pros and cons that come to mind. And don't forget to mark your submissions with your name.

Multi-level rules

MalteStretz: The idea:

Predicted Rule Reliability

BobMenschel: Close to these ideas, but different, is a specification (possibly via tflags) indicating that a rule is expected to be "highly reliable" or "less reliable" than most.

For instance, a rule which tests for obfuscation in a long string, or one that's not subject to false hits, would be "highly reliable" (obfu on "viagra" comes to mind), while a single-word URI test would be "less reliable" than most.

During perceptron scoring, we would adjust the scores of highly reliable rules higher, and those of less reliable rules lower, than they would otherwise be. This gives a boost to the former and avoids false positives from the latter.
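A minimal sketch of what such an adjustment could look like, written in Python purely for illustration. It is not how the perceptron works today: it simply scales already-trained scores after the fact. The reliability labels, the multipliers, and the rule names are all assumptions made up for this example, and no such tflags exist yet.

```python
# Hypothetical reliability labels; no such tflags are defined today.
HIGH, NORMAL, LOW = "high", "normal", "low"

# Assumed multipliers, chosen only for illustration.
FACTORS = {HIGH: 1.25, NORMAL: 1.0, LOW: 0.75}


def adjust_scores(scores, reliability):
    """Scale each rule's trained score by its predicted reliability.

    Rules without a label are treated as NORMAL and keep their score.
    """
    adjusted = {}
    for rule, score in scores.items():
        label = reliability.get(rule, NORMAL)
        adjusted[rule] = round(score * FACTORS[label], 3)
    return adjusted


# Made-up rule names and scores, not taken from any real rule set.
trained = {"OBFU_VIAGRA": 2.0, "URI_SINGLE_WORD": 1.2, "SOME_OTHER_RULE": 0.8}
labels = {"OBFU_VIAGRA": HIGH, "URI_SINGLE_WORD": LOW}

adjusted = adjust_scores(trained, labels)
for rule, score in adjusted.items():
    print(rule, score)
```

A fixed multiplier is the simplest possible choice. Feeding the label into the scoring run itself, for example as a constraint on each rule's allowed score range, would probably be closer to the idea above, but that is left open here.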

Bayes-driven rules

MalteStretz: The idea (see also "Multi-level rules"):

MalteStretz: I just noticed that BartSchaefer submitted bug 3785 "Suggestion: Use Bayes classifier to select among score sets" just a few hours ago. Very similar to this idea.

SVM and RF

DanielQuinlan: trusted networks heuristic:

JustinMason: tried both of these before. They are expensive in terms of time and require network lookups to compute the trusted network boundary. I also got false negatives! (spams that were "close enough" to the MX record). Also, this algo is very similar to SpamCop's.

DanielQuinlan: limit number of messages per sender (from morning talk)

MalteStretz: What do "SVM" and "RF" mean? (Too lazy to Google)

Trustwebs for Whitelisting

MalteStretz: See [TrustNetNotes].