We're looking for people to volunteer and make code contributions: patches, code, Perl, regression tests, rules, you get the picture. You'll have to send in a [http://www.apache.org/licenses/#clas Contributor License Agreement] before your contribution can be accepted, but that's easy.

So, what are we looking for right now?

The Top N items

Speed

  • Submit code to speed something up without breaking anything. The minimum worth submitting is probably about a 1% speed-up in overall check speed.
  • Have spamassassin use [ArchiveIterator], which will let it work directly on mbox files and similar formats. That is much faster than a formail loop. [http://bugzilla.spamassassin.org/show_bug.cgi?id=1890 bug 1890]

  • Make spamd prefork children instead of forking a child per incoming message. [http://bugzilla.spamassassin.org/show_bug.cgi?id=3097 bug 3097]

Size

  • The auto-whitelist ([http://bugzilla.spamassassin.org/show_bug.cgi?id=3082 bug 3082]) and bayes_seen databases need automatic expiry.
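
As a rough illustration of what automatic expiry could look like, the sketch below drops entries that have not been seen within a cutoff period. It is written in Python for brevity rather than in SpamAssassin's Perl, and everything in it is an assumption made for the example: the in-memory dictionary, the `expire_old_entries` name, and the 180-day default. It does not reflect the real auto-whitelist or bayes_seen storage formats.

```python
import time

SECONDS_PER_DAY = 86400


def expire_old_entries(db, max_age_days=180, now=None):
    """Remove entries not seen within max_age_days; return how many were removed.

    `db` is a hypothetical in-memory stand-in for a database. It maps a key
    (for example a sender address or a message ID) to the Unix time at which
    that key was last seen.
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * SECONDS_PER_DAY
    # Collect the stale keys first so the dict is not modified while iterating.
    stale = [key for key, last_seen in db.items() if last_seen < cutoff]
    for key in stale:
        del db[key]
    return len(stale)
```

A real implementation would also have to decide when expiry runs (for example, every N writes or on a timer) and would need to keep a last-seen timestamp for every entry, which is part of what the bug report has to settle.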

Bayes accuracy and speed

  • Code and corpus tests for ramping up the probability for previously unseen tokens. This could be done either heuristically or by keeping real counts of unseen tokens in the Bayes token database. The idea is that words that have never been learned before get high probabilities.
  • Custom database file and code for faster performance and space savings (probably to be compared against qdbm and tdb since they look most promising right now as non-custom databases).
  • Bi-grams: that is, multi-word windowing as used in CRM-114, using two-word tokens (or possibly even longer ones).
  • Implementing Dobly noise-reduction - [http://bugzilla.spamassassin.org/show_bug.cgi?id=3078 bug 3078].

  • Dynamically determining the autolearning thresholds based on incoming email rather than using hard-coded numbers. See [http://bugzilla.spamassassin.org/show_bug.cgi?id=1829 bug 1829] for more.

  • Looking for specific header tokens when they change location between the original message and the reply. See [http://bugzilla.spamassassin.org/show_bug.cgi?id=2129 bug 2129] for more.
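
The bi-gram idea in the list above can be sketched as follows. This is a minimal Python illustration only: it is neither SpamAssassin's Perl tokenizer nor CRM-114's actual windowing scheme. The `window_tokens` function, its `max_n` parameter, and the naive whitespace splitting are all assumptions made for the example.

```python
def window_tokens(text, max_n=2):
    """Return single-word tokens plus every run of 2..max_n adjacent words."""
    words = text.lower().split()  # naive whitespace tokenizer (assumption)
    tokens = list(words)          # keep the ordinary one-word tokens
    for n in range(2, max_n + 1):
        # Slide a window of n words across the message.
        for i in range(len(words) - n + 1):
            tokens.append(" ".join(words[i:i + n]))
    return tokens


print(window_tokens("Buy cheap pills now"))
# ['buy', 'cheap', 'pills', 'now', 'buy cheap', 'cheap pills', 'pills now']
```

Each multi-word token would then be learned and scored like any other token. The trade-off is a larger token database, which is one reason this item sits next to the database size and speed items above.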

Other ideas

  • Translation: translating the rule descriptions, the manual, and the website into other languages.