The Brain{Storm,Dump}

_This is a page for SpamAssassin developers and other interested parties to collaboratively jot down some ideas. Some of these might evolve into real code one day; others might stay here forever. But as long as they are here, they are generally up for adaptation, and proof-of-concept implementations are welcome. If you write down an idea, try to think of all the pros and cons that come to mind. And don't forget to mark your submissions with your name._

Multi-level rules

MalteStretz: The idea:

  • Each rule is assigned an integer level of reliability. There are only a few (three to five) of those classes/blocks.
  • The rules with the highest reliability are run first.
  • If their conclusion already marks the message as spam or ham with a very high probability, all lower levels are skipped.
  • This is similar to the now-gone feature of shortcutting other tests when a certain score was reached, but has the advantage that it doesn't have the overhead of checking the limit after each rule, only after each block/level (of which there are only a few).
  • (Hopefully) tests are run in a pre-defined, as-optimal-as-possible order, which could be determined by some algorithm (GA, perceptron).
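
A minimal sketch of the level-by-level idea, in Python for brevity rather than SpamAssassin's own code. Everything here is an assumption made for illustration: the `Rule` type, the convention that a higher `level` means more reliable, and the `spam_cutoff` / `ham_cutoff` values are hypothetical and not part of SpamAssassin. The point it shows is that the cutoff is checked once per level, not once per rule.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record; not a SpamAssassin data structure."""
    name: str
    level: int                      # assumed: higher level = more reliable
    score: float                    # positive = spammy, negative = hammy
    matches: Callable[[str], bool]  # predicate over the raw message text


def scan(message: str,
         rules: Iterable[Rule],
         spam_cutoff: float = 10.0,   # illustrative value only
         ham_cutoff: float = -10.0    # illustrative value only
         ) -> Tuple[float, List[int]]:
    """Run rules one level at a time, most reliable level first.

    Returns the accumulated score and the list of levels that were run.
    """
    by_level: Dict[int, List[Rule]] = {}
    for rule in rules:
        by_level.setdefault(rule.level, []).append(rule)

    total = 0.0
    levels_run: List[int] = []
    for level in sorted(by_level, reverse=True):
        for rule in by_level[level]:
            if rule.matches(message):
                total += rule.score
        levels_run.append(level)
        # The limit is checked once per level, not after every rule.
        if total >= spam_cutoff or total <= ham_cutoff:
            break  # verdict is already clear; skip all lower levels
    return total, levels_run
```

With one high-reliability rule at level 3 and one weak rule at level 1, a message that triggers the level-3 rule strongly enough never reaches level 1; any other message runs both levels.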

SVM and RF

DanielQuinlan: trusted networks heuristic:

  • do a reverse lookup for the HELO; if the result matches the user's domain and the forward DNS of that name matches the IP address, then trust it. Does that help? (It may be helpful for extending the boundary.)
  • find an IP that matches an external MX; we then know it is external (probably not as helpful for finding a boundary?)

JustinMason: I tried both of these before. They are expensive in terms of time, since computing the trusted network boundary requires network lookups, and I also got false negatives (spams that were "close enough" to the MX record). Also, this algorithm is very similar to SpamCop's.
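
One possible reading of the first heuristic, sketched in Python: reverse-resolve the relay's IP, require the resulting name to fall inside the user's domain, then require the forward lookup of that name to include the same IP. This is an interpretation, not SpamAssassin code. The function names are hypothetical, the default resolvers use the standard `socket` module and handle IPv4 only, and each call costs network lookups, which is the expense noted above.

```python
import socket
from typing import Callable, List, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """PTR lookup via the system resolver; None if it fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(host: str) -> List[str]:
    """IPv4 address lookup via the system resolver; empty list if it fails."""
    try:
        return socket.gethostbyname_ex(host)[2]
    except OSError:
        return []


def in_domain(host: str, domain: str) -> bool:
    """True if host equals domain or is a subdomain of it."""
    host = host.rstrip(".").lower()
    domain = domain.rstrip(".").lower()
    return host == domain or host.endswith("." + domain)


def relay_is_trusted(ip: str,
                     user_domain: str,
                     reverse: Callable[[str], Optional[str]] = reverse_lookup,
                     forward: Callable[[str], List[str]] = forward_lookup) -> bool:
    """Hypothetical check: reverse name is in the user's domain AND
    the forward lookup of that name leads back to the same IP."""
    host = reverse(ip)
    if host is None or not in_domain(host, user_domain):
        return False
    return ip in forward(host)
```

The resolvers are passed in as parameters so the logic can be exercised with canned data instead of live DNS, for example `relay_is_trusted("192.0.2.10", "example.org", ptr_table.get, lambda h: a_table.get(h, []))` with two dictionaries of test records.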

DanielQuinlan: limit number of messages per sender (from morning talk)

  • for Bayes learning to avoid over-training for that user (expiry, rolling over - how?)
  • for rescoring process (like this idea)
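
A rough sketch of the per-sender cap as a corpus filter, in Python. The function name `cap_per_sender`, the `(sender, body)` tuple shape, and the choice to keep the first `limit` messages per sender are all assumptions made for illustration; the open question above about expiry and rolling over is deliberately not addressed.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

Message = Tuple[str, str]  # assumed shape: (sender address, message body)


def cap_per_sender(messages: Iterable[Message], limit: int) -> Iterator[Message]:
    """Yield at most `limit` messages per sender, in input order.

    Senders are compared case-insensitively with surrounding
    whitespace stripped; anything beyond the cap is dropped.
    """
    seen: Counter = Counter()
    for sender, body in messages:
        key = sender.strip().lower()
        if seen[key] < limit:
            seen[key] += 1
            yield sender, body
```

The filtered stream could then be fed to Bayes learning or to a rescoring run, so that one prolific sender cannot dominate either.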

MalteStretz: What do "SVM" and "RF" mean? (Too lazy to Google)

Trustwebs for Whitelisting

MalteStretz: See [TrustNetNotes].
