Rules Project: Rules Not in English

(part of RulesProjectPlan)

Problem description: SA rules development handles English-language spam best, because most SA rule developers who feed the distribution system speak and correspond in English, and the great majority of the test corpora are in English. We're not as good at developing, validating, testing, or scoring rules for other languages.

The Problem

Because of this,

Potential Solutions

We could invent a class of 'test rules'. They would have a zero score and would not appear in the mail summary if they hit, but they would show up in the report-home summary, which would record whether they hit and whether the message was ham or spam.
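To make the idea concrete, here is a minimal Python sketch of how a 'test rule' might behave. It is only an illustration of the proposal, not existing SA behaviour: the `Rule` class, the `is_test` flag, the function names, and the rule names in the usage example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    score: float
    is_test: bool = False  # hypothetical flag marking a 'test rule'


def mail_summary(hits):
    """Rule names shown in the mail summary; test rules are hidden."""
    return [r.name for r in hits if not r.is_test]


def total_score(hits):
    """Message score; test rules contribute nothing."""
    return sum(r.score for r in hits if not r.is_test)


def report_home(hits, is_spam):
    """Records sent home: which test rules hit, and on ham or spam."""
    label = "spam" if is_spam else "ham"
    return [(r.name, label) for r in hits if r.is_test]
```

For example, with `hits = [Rule("KNOWN_RULE", 2.5), Rule("T_NEW_PHRASE", 0.0, is_test=True)]`, only `KNOWN_RULE` would appear in the mail summary and count toward the score, while `T_NEW_PHRASE` would appear only in the report sent home.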

Then we can take rules that pass initial testing and publish them for what we believe is good use, or even purely for testing. SA systems around the world would pick up these rules with sa-update and report home with the hit statistics. If a rule hits well overall but performs badly on 'de' mail, we either move it to an English-only ruleset or put an exclude-de option on the front of the rule or rule grouping. Provided the sysadmin has set the local language correctly, the right rules should then apply.
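One way the reported hit stats could drive that decision is sketched below in Python. This is an assumption-laden illustration, not a proposed implementation: the report format `(rule, locale, label)`, the function names, and the thresholds `max_ham_ratio` and `min_hits` are all invented here, and "performs badly" is interpreted as "hits ham too often in that locale".

```python
from collections import defaultdict


def hit_stats(reports):
    """Tally (rule, locale) -> {'ham': n, 'spam': n} from (rule, locale, label) reports."""
    stats = defaultdict(lambda: {"ham": 0, "spam": 0})
    for rule, locale, label in reports:
        stats[(rule, locale)][label] += 1
    return stats


def locales_to_exclude(stats, rule, max_ham_ratio=0.05, min_hits=20):
    """Locales where the rule hits ham too often to be trusted.

    Both thresholds are arbitrary placeholders for illustration.
    """
    excluded = []
    for (name, locale), counts in stats.items():
        if name != rule:
            continue
        total = counts["ham"] + counts["spam"]
        # Ignore locales with too few reports to judge.
        if total >= min_hits and counts["ham"] / total > max_ham_ratio:
            excluded.append(locale)
    return sorted(excluded)
```

A rule that only ever hits spam in 'en' reports but hits ham in a large share of 'de' reports would come back with `["de"]`, which is the case where the exclude-de option or an English-only ruleset would apply.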