Comment: added a section: What can go wrong

...

As you can see from these examples, the AWL operates by averaging out the "score spikes" between different emails. By the very nature of averages, this means that it will push the high points down, and the low points up, but it will always push them towards the average for that sender. As long as the average for that sender is on the right side of the spam/nonspam fence, it will do its job nicely.

What can go wrong

That said, the AWL can become polluted and cause problems. Generally this results from past misconfiguration or scoring problems that have since been fixed: the AWL still retains the old average, which skews scores and pushes messages onto the wrong side of the spam/ham threshold.

AWL database entries pair a sender's e-mail address with the IP address from which the mail entered the site's trusted zone. It is essential that SpamAssassin extracts the correct client IP address from the Received header fields. To do so, the following parameters must be correctly configured: trusted_networks, internal_networks (and the more exotic msa_networks). A misconfiguration can cause an incorrect IP address to be stored in AWL records and used in lookups, potentially treating both internal and faked inbound sender addresses as belonging to the same network IP address space.
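To see why these settings matter, here is a minimal Python sketch of the idea of picking the client IP address at the trust boundary. It is a simplified model, not SpamAssassin's actual Received-header parser: the function name first_untrusted_ip, the relay list, and the network ranges are all hypothetical and chosen only for illustration.

```python
import ipaddress

def first_untrusted_ip(relay_ips, trusted_networks):
    """Return the first relay IP (newest hop first) that lies outside
    the trusted networks, or None if every hop is trusted."""
    nets = [ipaddress.ip_network(n) for n in trusted_networks]
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None  # all hops trusted: no external IP to record

# Hypothetical relay chain, newest hop first:
# internal MX, the site's own gateway, then the real external client.
relays = ["10.0.0.5", "192.0.2.10", "198.51.100.7"]

# Gateway listed as trusted: the external client is found.
correct = first_untrusted_ip(relays, ["10.0.0.0/8", "192.0.2.0/24"])

# Gateway missing from the list: the gateway itself is mistaken for the client.
too_narrow = first_untrusted_ip(relays, ["10.0.0.0/8"])

print(correct)     # 198.51.100.7
print(too_narrow)  # 192.0.2.10
```

In the second case every inbound message would appear to come from the gateway, so unrelated senders would be recorded under the same IP address.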

Another potential problem is that the AWL keeps only the first two octets of an IPv4 address (a /16 network block). If a spammer happens to send a message from a compromised (zombie) computer in the same /16 network block as a valid correspondent, using that correspondent's faked e-mail address, both share the same AWL record. Consequently, a future legitimate mail could receive an inappropriately high spam score from the AWL, or, vice versa, a future spam could benefit from legitimate correspondence originating in the same /16 network block. A variation of this problem occurs when mail arrives over IPv6: as of version 3.2.5 the AWL is unable to store such an IP address in the database, so it treats the mail as if no IP address could be determined and uses only the sender's e-mail address as the key.
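The collision can be illustrated with a short Python sketch. The helper awl_key and the exact key layout shown here are assumptions made for illustration; only the behaviour described above (two octets kept for IPv4, address-only fallback otherwise) is taken from the text.

```python
def awl_key(sender, ip=None):
    """Build an illustrative AWL lookup key from a sender address and client IP.

    The "|ip=" layout is a hypothetical format used for this sketch.
    """
    if ip is None or ":" in ip:
        # No usable IPv4 address (for example IPv6): key on the address only.
        network = "none"
    else:
        # Keep only the first two octets, i.e. the /16 network block.
        network = ".".join(ip.split(".")[:2])
    return f"{sender.lower()}|ip={network}"

# A legitimate sender and a forger in the same /16 block share one record.
legit = awl_key("alice@example.com", "192.0.2.10")
forged = awl_key("alice@example.com", "192.0.77.200")
print(legit == forged)  # True

# IPv6 mail is keyed as if no IP address had been found.
print(awl_key("alice@example.com", "2001:db8::1") == awl_key("alice@example.com"))  # True
```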

Blind white- or blacklisting can cause large score values to be entered into AWL records. Such values take many messages to be evened out. Make sure that white- or blacklisting and other rules with high scores do not apply to messages for which they were not intended. For whitelisting, use only selective methods such as whitelist_from_rcvd, whitelist_auth, whitelist_from_dkim or whitelist_from_spf; never use a plain whitelist_from.
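A rough Python sketch shows how long one extreme score can linger. It assumes a record is simply a running total and a message count, and it ignores any scaling the AWL applies when adjusting a score; the function name, the scores, and the tolerance are hypothetical values picked for illustration.

```python
def messages_to_even_out(outlier, typical, tolerance=1.0):
    """Count how many messages at the typical score are needed before the
    record's mean is within `tolerance` of that typical score again."""
    total, count = outlier, 1
    while abs(total / count - typical) > tolerance:
        total += typical
        count += 1
    return count - 1  # exclude the outlier message itself

# One blindly whitelisted message at -100, followed by messages scoring 2.0.
print(messages_to_even_out(-100.0, 2.0))  # 101
```

Under these assumptions, a single message scored at -100 needs about a hundred ordinary messages before the sender's average is back within one point of its usual value.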

Solution

If you have this problem, you can use spamassassin --remove-addr-from-whitelist to remove any prior knowledge about a given address from the AWL database. If you consult the main spamassassin manpage, there are other commands to force an AWL entry towards the black or white, but use these somewhat cautiously.

...