Fighting Spam with SpamAssassin

Posted on March 11, 2019 by Scott

A wise man once said that only two things are certain: death and spam.

Because of the way email works, it’s always attracted people who want to stuff your inbox full of unsolicited bulk email, called spam. Spam ranges from merely annoying to downright malicious, and spammers can be pretty sophisticated in how they get their messages to your eyeballs.

Your inbox is a valuable place, and it should be protected. Enter the venerable Apache SpamAssassin. SpamAssassin is a suite of tools for detecting spam. Purelymail has used SpamAssassin since launch, so the news here is that we’re now taking advantage of its automatic learning capabilities.

These come in two forms:

  • The first uses the Bayes plugin. This scans messages classified as spam (or its opposite, ham) and learns which text patterns are likely to appear in legitimate versus illegitimate emails.
  • The second is the TxRep plugin, which maintains a “reputation” for each person who emails you. People who send you legitimate messages, and people you send messages to, earn a good reputation; spammers earn a bad one. A sender with a good reputation is less likely to have their emails flagged as spam (even if they look very spammy!), and vice versa.
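
To make the Bayes idea a bit more concrete, here is a toy sketch in Python. It is not SpamAssassin's actual algorithm or API: the `ToyBayes` class, its tokenizer, and the smoothing constants are all made up for illustration. It only shows the general shape of learning which words lean spam and which lean ham.

```python
import math
import re
from collections import Counter


class ToyBayes:
    """Toy word-based classifier. Hypothetical; NOT how SpamAssassin works."""

    def __init__(self):
        # For each label: how many learned messages contained each token.
        self.tokens = {"spam": Counter(), "ham": Counter()}
        # For each label: how many messages have been learned in total.
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercased unique words; real tokenizers are far more elaborate.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, label):
        """Record one message as 'spam' or 'ham'."""
        self.messages[label] += 1
        self.tokens[label].update(self.tokenize(text))

    def spam_probability(self, text):
        """Naive Bayes estimate that `text` is spam (add-one smoothing)."""
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for tok in self.tokenize(text):
            p_spam = (self.tokens["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Numerically stable logistic function.
        if log_odds >= 0:
            return 1 / (1 + math.exp(-log_odds))
        e = math.exp(log_odds)
        return e / (1 + e)


bayes = ToyBayes()
bayes.learn("win free money now", "spam")
bayes.learn("free prize click now", "spam")
bayes.learn("lunch meeting tomorrow", "ham")
bayes.learn("project notes attached", "ham")
```

After those four examples, `bayes.spam_probability("free money")` comes out above 0.5 and `bayes.spam_probability("meeting notes")` comes out below it. With so little data the numbers mean very little, which is also why a real filter needs some time to learn.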

Learning happens whenever you move a message into or out of the Junk folder. Messages moved into the Junk folder count as spam, and messages moved out of it count as ham. Reputational learning also happens when you send outgoing emails, since if you email someone, you likely want to hear their response.
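
The triggers described above could be sketched roughly like this in Python. This is an illustration only, not Purelymail's code and not TxRep's real scoring: the `ToyReputation` class, its method names, and the plus or minus 1.0 weights are all assumptions.

```python
from collections import defaultdict

JUNK = "Junk"  # folder name taken from the post


class ToyReputation:
    """Hypothetical per-user reputation table; the weights are made up."""

    def __init__(self):
        # Address -> running reputation score (0.0 means unknown/neutral).
        self.scores = defaultdict(float)

    def on_move(self, sender, source_folder, dest_folder):
        """Handle a message move; return the label learned, if any."""
        if dest_folder == JUNK and source_folder != JUNK:
            self.scores[sender] -= 1.0  # moved into Junk: counts as spam
            return "spam"
        if source_folder == JUNK and dest_folder != JUNK:
            self.scores[sender] += 1.0  # moved out of Junk: counts as ham
            return "ham"
        return None  # moves that don't involve Junk teach nothing

    def on_send(self, recipient):
        """Emailing someone suggests you want to hear back from them."""
        self.scores[recipient] += 1.0

    def reputation(self, address):
        # .get() avoids creating entries for addresses we have never seen.
        return self.scores.get(address, 0.0)


rep = ToyReputation()
rep.on_move("spammer@example.com", "Inbox", JUNK)
rep.on_move("friend@example.com", JUNK, "Inbox")
rep.on_send("friend@example.com")
```

In this sketch the friend ends up with a positive score and the spammer with a negative one. The label returned by `on_move` is what you would also hand to the text-based learner, so that one folder move feeds both kinds of learning.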

All learning currently happens on a per-user basis. No spam system is perfect, so have a little patience. In time, Purelymail should learn how to filter your inbox.

On a happy note, we’ve reduced the price for sent emails by 50%. Our setup now checks outgoing emails for spamminess, and we should now receive reports from other providers when outgoing emails register as spam. Almost all of the pricing for sent emails right now exists to deter spammers from abusing our service and ruining our deliverability reputation, so the faster we get at detecting and banning spammers, the lower we can price outgoing emails.

Ultimately, we hope to bring the price of sent emails down to a negligible amount, but this is a good first step.

Not a whole lot of other news for this week, mostly because we’ve been working on our internal monitoring and logging.