The SpamAssassin Challenge

(THIS IS A DRAFT; see [http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5376 bug 5376 for discussion])

The [http://www.netflixprize.com/ Netflix Prize] is a machine-learning challenge from Netflix which 'seeks to substantially improve the accuracy of predictions about how much someone is going to love a movie based on their movie preferences.'

We in SpamAssassin have similar problems; maybe we can solve them in a similar way. We have:

...

Input: the test data: mass-check logs

...

We will take the [SpamAssassin] 3.2.0 mass-check logs and split them into training and test sets; the traditional split is 90% for training and 10% for testing. Any cleanups that we had to do during [http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5270 bug 5270] are re-applied.

The test set is saved, and not published.
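The 90%/10% split described above could be sketched roughly as follows. This is only an illustration, not the project's actual tooling: the function name `split_mass_check`, the fixed seed, and the assumption that lines beginning with `#` are comments rather than per-message results are all hypothetical choices made for the example.

```python
import random


def split_mass_check(lines, test_fraction=0.1, seed=0):
    """Split mass-check log lines into (training, test) lists.

    Hypothetical helper for illustration only.
    """
    # Assumption: blank lines and lines starting with '#' are
    # comments or headers, not per-message results, so skip them.
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    # A seeded generator makes the split reproducible, so the held-back
    # test set can be regenerated from the same input if needed.
    rng = random.Random(seed)
    rng.shuffle(records)

    n_test = round(len(records) * test_fraction)
    test = records[:n_test]
    training = records[n_test:]
    return training, test
```

With 100 result lines and the default `test_fraction`, this yields 90 training lines and 10 test lines; only the training list would be published.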

...