Fighting Spam and Viruses at the Server, Part V: The Linux Edition
In parts I through III of this series, we looked at spam and spam-fighting issues. Part IV covered Windows-based solutions, along with appliances and third-party services. In this article we look at the bulk of the anti-spam products and projects out there, which are aimed at the Linux and Unix market. We refer to Unix throughout this article, but in fact most of the items here can be run under any flavor of Unix--Solaris OS, Linux, Mac OS X, FreeBSD, and so on.
Why are most spam-fighting options aimed at the Unix world? It probably has to do with the fact that the vast majority of mail servers on the Internet run on Unix-based machines. That factor, in turn, stems from the close relationship between Unix and the Internet, which dates back decades before the arrival of Windows. It could also be argued, these days, that Unix attracts a more tech-savvy and activist crowd, which is exactly the group you'd expect to be interested in developing spam-fighting solutions. When developing for yourself, you're most likely to write programs for the platform(s) you use.
There are two classes of anti-spam software available in the Unix world: commercial and free (as in "free beer"). Within each class, the available solutions can be sorted further by type. We've broken them down into groupings that we feel make sense, to help keep things clear.
Keep in mind that particularly high-volume sites (say, those that handle over 100,000 pieces of E-mail a day) often will want to separate these components onto different machines. For example, you can have one machine acting as the main E-mail server, receiving E-mail from the Internet and passing it to one or more content filters running your spam filter and virus scanners. Those filters then pass the scanned mail to another E-mail server, which is the one the users actually access to collect their mail.
Let's take a look at the most popular spam-fighting offerings in the Unix space.
Free (As In Beer) Solutions
Compared to the Windows world, the Unix world offers a staggering number of free spam-fighting solutions. Many of these tools are open source, and definitely fall into the "scratching your own itch" category from which many of these software projects began, before growing into projects used internationally.
We've loosely placed the free tools into four categories: spam detectors/classifiers, virus scanners (hey, if you're going to block one, you should block the other!), content filtering frameworks, and quarantine managers. Often, you'll use all four of these tools together, in order to build a super-tool extraordinaire.
Arguably the most popular of the free spam detectors out there is SpamAssassin, and for good reason: it's not just free, it's among the most effective spam-fighting tools available for any platform. It offers the best-of-breed spam-fighting techniques in one nice, neat package: a feature recognizer, DNSBL and SPF lookups, collaborative reporting networks, and Bayesian filtering. SpamAssassin also includes a comprehensive scoring framework that takes all of these tests into account before making a diagnosis, rather than making a quick decision based on a single test (a technique that's terribly prone to error).
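To make the scoring idea concrete, here is a minimal Python sketch of how weighted tests can be combined into a single verdict. This is not SpamAssassin's own code: the rule names, the weights, the message fields, and the 5.0 cutoff are all assumptions we chose for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]  # hypothetical, simplified view of a parsed message


@dataclass(frozen=True)
class Rule:
    name: str
    score: float  # positive = evidence of spam, negative = evidence of ham
    matches: Callable[[Message], bool]


# Hypothetical rules and weights, for illustration only.
RULES: List[Rule] = [
    Rule("SUBJ_ALL_CAPS", 1.5, lambda m: m.get("subject", "").isupper()),
    Rule("BODY_FREE_MONEY", 2.5, lambda m: "free money" in m.get("body", "").lower()),
    Rule("SENDER_ON_DNSBL", 3.0, lambda m: m.get("dnsbl_listed") == "yes"),
    Rule("SPF_PASS", -1.0, lambda m: m.get("spf") == "pass"),
]

THRESHOLD = 5.0  # assumed cutoff; a real deployment would tune this


def classify(message: Message) -> Tuple[float, bool, List[str]]:
    """Run every rule, sum the scores of the ones that hit, compare to the cutoff."""
    hits = [rule for rule in RULES if rule.matches(message)]
    total = sum(rule.score for rule in hits)
    return total, total >= THRESHOLD, [rule.name for rule in hits]
```

Note that no single rule in this sketch can push a message over the cutoff by itself, which is the point of scoring: an all-caps subject line from a sender who passes SPF stays well below the threshold.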
On top of all of this, whitelists and blacklists are supported, and an "auto-whitelist" feature can bias an E-mail message's "this is spam" score downward, toward the ham end of the spectrum, if it comes from someone who has a history of sending you mostly ham. Add custom rulesets for the feature recognizer, contributed by spam-fighters from around the world--and collected at sites like the SpamAssassin Custom Rules Emporium--and who wouldn't be amazed at the high levels of accuracy SpamAssassin boasts?
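The auto-whitelist idea can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions, not SpamAssassin's implementation: we keep a running average of each sender's past scores and pull every new score part of the way toward that average. The class name, the method name, and the 0.5 factor are hypothetical.

```python
from typing import Dict, Tuple


class SenderHistory:
    """Tracks past scores per sender and biases new scores toward the average."""

    def __init__(self, factor: float = 0.5) -> None:
        self.factor = factor  # 0 = ignore history, 1 = use only the history
        self.totals: Dict[str, Tuple[float, int]] = {}  # sender -> (sum, count)

    def adjust(self, sender: str, score: float) -> float:
        total, count = self.totals.get(sender, (0.0, 0))
        if count:
            mean = total / count
            adjusted = score + (mean - score) * self.factor
        else:
            adjusted = score  # no history yet, so nothing to bias toward
        # Record the raw score so the history reflects what the tests saw.
        self.totals[sender] = (total + score, count + 1)
        return adjusted
```

A correspondent with a long record of low scores gets the benefit of the doubt when one message happens to trip a few rules, while a stranger's message keeps its raw score. The same arithmetic pulls scores upward for senders with a history of spam.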
Yet another of SpamAssassin's strengths is that you can use it on the desktop, even if your mail administrator isn't filtering spam on the server. (But now that you've read this series, you're going to start doing this, aren't you?) Unix desktop users can use SpamAssassin directly, while Windows desktop users can install McAfee's "SpamKiller" product, which is based on the SpamAssassin engine.
SpamAssassin can be integrated directly with many popular Unix-based mail servers, including Sendmail and Postfix, along with other content-filtering applications.
The field is narrower when it comes to free anti-virus tools, in part because someone has to pay attention and keep updating the tools' virus signatures for users to download when a virus or worm is discovered. Open source products like Clam AntiVirus aim to fill this void with free, community-supported anti-virus protection. Clam AntiVirus offers the same features found in most commercial anti-virus products, including the ability to run in command-line or daemonized mode, automate signature updates, and scan archives of various types.
This program's remaining weakness lies in its relatively limited database of virus signatures--about 20,000 at the moment, as compared with the 60,000 or so that many commercial anti-virus vendors advertise. Arguably, however, database sizes are misleading, since many signatures from Symantec, NAI, Kaspersky, and Sophos are for ancient viruses that haven't been seen since the days of MS-DOS, back when viruses were spread primarily through infected floppy disks.
When it comes to modern mail-borne viruses, Clam AntiVirus competes head-to-head with commercial products. New threats are identified at lightning speed, with signature updates submitted by contributors around the world, often within hours of an outbreak. Certainly, if you're going to integrate spam and virus-fighting with your mail server, it doesn't hurt to include Clam AntiVirus along with any commercial anti-virus products you may use. Having more than one virus scanner increases your chances of catching the latest worms and other nasties infecting the E-mail pipes.
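The benefit of running more than one scanner is easy to see in a small Python sketch. The scanners here are stand-ins we made up (simple byte-pattern matchers with invented signature names), not interfaces to Clam AntiVirus or any commercial product; the point is only that a message is flagged if any scanner recognizes it.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A scanner takes raw message bytes and returns a virus name, or None if clean.
Scanner = Callable[[bytes], Optional[str]]


def make_signature_scanner(signatures: Dict[str, bytes]) -> Scanner:
    """Build a toy scanner from a hypothetical name -> byte-pattern table."""
    def scan(data: bytes) -> Optional[str]:
        for name, pattern in signatures.items():
            if pattern in data:
                return name
        return None
    return scan


def scan_with_all(data: bytes, scanners: List[Tuple[str, Scanner]]) -> List[Tuple[str, str]]:
    """Run every scanner and collect (scanner label, virus name) for each hit."""
    findings = []
    for label, scanner in scanners:
        result = scanner(data)
        if result is not None:
            findings.append((label, result))
    return findings


# Two hypothetical scanners whose signature sets only partly overlap.
scanner_a = make_signature_scanner({"Worm.Old": b"OLDWORM"})
scanner_b = make_signature_scanner({"Worm.Old": b"OLDWORM", "Worm.New": b"NEWWORM"})
SCANNERS = [("scanner_a", scanner_a), ("scanner_b", scanner_b)]
```

A message carrying the newer pattern slips past the first scanner but is caught by the second, so treating a non-empty findings list as "infected" gives you the union of both signature databases.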
Content Filtering Frameworks
Content filtering frameworks help you to integrate multiple scanning tools, in order to enhance your server's spam-fighting effectiveness. A couple of good examples of content-filtering applications based on SpamAssassin are amavisd-new and MailScanner, both of which are free and open source projects.
Adding one of these tools on top of SpamAssassin increases the number of content-filtering tests available, including the ability to call multiple virus scanners, check for invalid mail headers, and identify dangerous file attachment types (as defined by the mail administrator).
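As an illustration of the attachment test, the following Python sketch uses the standard library's email package to list attachments whose file extension appears on an administrator-defined ban list. It is not how amavisd-new or MailScanner implement the check; the extension list and the helper names are our own assumptions.

```python
import os
from email import message_from_bytes
from email.message import EmailMessage
from typing import List

# Hypothetical ban list; a mail administrator would define their own.
BANNED_EXTENSIONS = {".exe", ".scr", ".pif", ".vbs"}


def banned_attachments(raw: bytes) -> List[str]:
    """Return the filenames of attachments whose extension is on the ban list."""
    message = message_from_bytes(raw)
    found = []
    for part in message.walk():  # visits every MIME part, including nested ones
        filename = part.get_filename()
        if filename and os.path.splitext(filename)[1].lower() in BANNED_EXTENSIONS:
            found.append(filename)
    return found


def sample_message(filename: str) -> bytes:
    """Build a small test message with one attachment (illustration only)."""
    msg = EmailMessage()
    msg["Subject"] = "Your invoice"
    msg.set_content("See attached.")
    msg.add_attachment(b"MZ", maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg.as_bytes()
```

Checking only the final extension catches the familiar "invoice.pdf.exe" trick. A production filter would typically also inspect the file contents, since a filename is trivial to change.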
This framework approach adds thoroughness to the filtering process. For example, it provides central quarantining for blocked mail, even when different messages were marked as "bad" by different tools, and it lets administrators handle each type of mail differently. You can choose to do one thing with spam, another with viruses, and yet a third with the ham. It also gives your end users some control over their content-filtering settings, without their having to figure out the separate interfaces for every tool under that framework.
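A per-type policy of this kind might look like the following Python sketch. The verdict names, the actions, and the severity ordering are assumptions we picked for the example, not the configuration vocabulary of any particular framework.

```python
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class Verdict(Enum):
    VIRUS = "virus"
    BANNED_FILE = "banned_file"
    SPAM = "spam"
    HAM = "ham"


class Action(Enum):
    QUARANTINE = "quarantine"
    TAG_AND_DELIVER = "tag_and_deliver"
    DELIVER = "deliver"


# Hypothetical site policy: one action per type of mail.
POLICY: Dict[Verdict, Action] = {
    Verdict.VIRUS: Action.QUARANTINE,
    Verdict.BANNED_FILE: Action.QUARANTINE,
    Verdict.SPAM: Action.TAG_AND_DELIVER,
    Verdict.HAM: Action.DELIVER,
}

# When several tools flag the same message, the most severe verdict wins.
SEVERITY: List[Verdict] = [Verdict.VIRUS, Verdict.BANNED_FILE, Verdict.SPAM]


def decide(findings: Iterable[Verdict]) -> Tuple[Verdict, Action]:
    """Pick the most severe verdict reported by any tool and look up its action."""
    reported = set(findings)
    for verdict in SEVERITY:
        if verdict in reported:
            return verdict, POLICY[verdict]
    return Verdict.HAM, POLICY[Verdict.HAM]
```

Because the policy lives in one table, changing what happens to spam is a one-line edit, no matter which underlying tool produced the verdict.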
Quarantine Management Tools
And finally, in the "free as in beer" department, there's quarantine management. We'll note up front that this is one of our own obsessions, as we're both firm believers in Lose No Mail (especially our own), and neither of us really trusts any computer-based filter to reach nearly 100% efficiency without first losing a pile of things we needed to see.
Since there was nothing out there that quite fit what we needed, Rob "scratched his own itch" and started what turned into his own open source project, Maia Mailguard. This tool takes amavisd-new yet a step further, adding a rich set of quarantine management facilities with a Web-based interface.
Without needing to be technophiles, Maia Mailguard users can adjust their personal filter settings at the mail server end, manage their own whitelists and blacklists, and rescue quarantined items from their own quarantine areas. Maia Mailguard also lets users "confirm" SpamAssassin's educated guesses, so that false positives and false negatives can be used to train the Bayesian learning system to not make those mistakes again, and confirmed spam can automatically be reported to the collaborative spam-fighting networks. Understandably, we think this is the best thing since sliced bread! You might not, and that's okay. That's the beauty of the great software multiverse. Use whatever tool best meets your needs.
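For readers curious about how confirmations feed a Bayesian learner, here is a deliberately tiny Python sketch. It is neither Maia Mailguard's nor SpamAssassin's code; the class, its methods, the tokenizer, and the smoothing are all our own simplifications. Each confirmed message updates per-token counts, and those counts yield an estimate of how "spammy" a token is.

```python
import re
from collections import Counter
from typing import Set


class TokenStats:
    """Counts how many confirmed spam and ham messages contain each token."""

    def __init__(self) -> None:
        self.spam_tokens: Counter = Counter()
        self.ham_tokens: Counter = Counter()
        self.spam_msgs = 0
        self.ham_msgs = 0

    @staticmethod
    def tokenize(text: str) -> Set[str]:
        # A set, so each token counts once per message.
        return set(re.findall(r"[a-z0-9$']+", text.lower()))

    def confirm(self, text: str, is_spam: bool) -> None:
        """Record a user's confirmation that a message was spam or ham."""
        tokens = self.tokenize(text)
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_msgs += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_msgs += 1

    def token_spamminess(self, token: str) -> float:
        """Estimate P(spam | token); 0.5 means the token carries no evidence."""
        # Add-one smoothing keeps unseen tokens from producing 0 or 1.
        p_spam = (self.spam_tokens[token] + 1) / (self.spam_msgs + 2)
        p_ham = (self.ham_tokens[token] + 1) / (self.ham_msgs + 2)
        return p_spam / (p_spam + p_ham)
```

Every time a user rescues a false positive or confirms a piece of spam, a call like confirm() nudges the token estimates, which is how the filter gradually adapts to each site's mail.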