...
- MISSING_HEADERS: if a message doesn't have all the normal headers, such as From, To, and Subject, this will fire. Be sure to hand-verify any ham and spam messages that hit this to ensure that they're formatted correctly (in RFC-2822 format).
- NO_HEADERS_MESSAGE (or a combination of MISSING_HEADERS, MISSING_DATE, and MISSING_SUBJECT in versions < 3.2.0): generally means you've got a message without most of the important RFC-822 headers (often errors generated by MUAs/MDAs).
- EMPTY_MESSAGE: generally indicates zero-length files, especially if accompanied by NO_RECEIVED.
- MISSING_HB_SEP: This is another danger sign, typically indicating that a header line has had a newline inserted incorrectly somehow, or an mbox "From" line has been inserted between RFC-822 headers.
- ANY_BOUNCE_MESSAGE: this indicates that the mail was a bounce message, a C/R challenge, or a "virus warning" from a broken scanner. These should be removed from both the ham and spam corpora, in general.
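When hand-verifying messages that hit the rules above, a quick pre-scan of the corpus can help narrow down which files to open. The following is a minimal sketch using only Python's standard `email` package; it is not how SpamAssassin implements these rules. The function names, the problem labels, and the choice of which headers count as "expected" are assumptions made for illustration, and it assumes one message per file.

```python
from email import message_from_bytes
from email.errors import MissingHeaderBodySeparatorDefect
from pathlib import Path

# Assumption: the headers we treat as "normal" for this sketch.
EXPECTED_HEADERS = ("From", "To", "Subject", "Date")


def check_message(raw: bytes) -> list:
    """Return a list of illustrative problem labels for one raw message."""
    # Zero-length (or whitespace-only) files, roughly the EMPTY_MESSAGE case.
    if not raw.strip():
        return ["empty"]

    msg = message_from_bytes(raw)
    problems = []

    # Header lookups on Message objects are case-insensitive.
    missing = [name for name in EXPECTED_HEADERS if name not in msg]
    if missing:
        problems.append("missing-headers: " + ", ".join(missing))

    if "Received" not in msg:
        problems.append("no-received")

    # The stdlib parser records a defect when the header block ends on a
    # line that is neither a header nor the blank separator line.
    if any(isinstance(d, MissingHeaderBodySeparatorDefect) for d in msg.defects):
        problems.append("broken-header-block")

    return problems


def scan_corpus(directory: str) -> dict:
    """Map each file in a one-message-per-file corpus to its problems."""
    report = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            problems = check_message(path.read_bytes())
            if problems:
                report[path.name] = problems
    return report
```

Calling `scan_corpus()` on a ham or spam directory returns only the files with at least one problem, which can then be inspected by hand. The labels only loosely mirror the rule names, so a clean result here does not guarantee that the corresponding SpamAssassin rules will stay silent.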