Learn how to tune and extend SpamAssassin to catch more spam.
The December 2003 “Tech Support” (http://www.linux-mag.com/content/view/1524/) showed you how to install SpamAssassin (SA), a free, open source, and effective spam filter.
After installing SA, you should immediately notice a dramatic decrease in the amount of spam that makes it to your Inbox. The SA developers do a tremendous job of writing and thoroughly testing the rules that determine each incoming message’s spam score. Moreover, a default install of SA incorporates several means of detecting spam, including Bayesian-style probabilistic classification, email header and body analysis, and much more. (For a full list of rules, see http://spamassassin.apache.org/tests_3_1_x.html.)
While the default SA rules are comprehensive, spammers are extremely tenacious and inventive. By utilizing the modular nature of SA, you can enable additional plug-ins to help you catch a significant amount of additional spam, while keeping false positives relatively low.
To ensure you’re getting the most out of SA, first verify that your existing setup is working as intended. Do you have Bayes enabled and working properly? Just as importantly, are you training it? Are your network tests running properly? A cursory check should be able to verify all this, and can make a substantial difference.
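One way to run that cursory check is sketched below. The `spamassassin --lint` and `sa-learn --dump magic` commands ship with SA; the `bayes_counts` helper is a hypothetical name introduced here for illustration. It assumes the usual `--dump magic` layout, where the third column of the `nspam` and `nham` lines holds the number of learned messages.

```shell
# bayes_counts: hypothetical helper that reads `sa-learn --dump magic`
# output on stdin and prints the learned spam and ham message counts.
bayes_counts() {
    awk '/non-token data: nspam$/ { spam = $3 }
         /non-token data: nham$/  { ham  = $3 }
         END { printf "nspam=%d nham=%d\n", spam, ham }'
}

# Run the checks only where SpamAssassin is actually installed.
if command -v spamassassin >/dev/null 2>&1 && command -v sa-learn >/dev/null 2>&1; then
    # --lint parses the configuration and reports syntax errors.
    spamassassin --lint && echo "configuration parses cleanly"
    sa-learn --dump magic | bayes_counts
fi
```

With default settings, Bayes is not used for scoring until it has learned about 200 spam and 200 ham messages, so low counts here usually mean more training is needed (`sa-learn --spam <dir>` and `sa-learn --ham <dir>`). For the network tests, running `spamassassin -D -t` on a saved message prints debug output on stderr that should show which lookups were attempted.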
Next, look at the “Optional Modules” section of the INSTALL file. This section details which CPAN modules are needed to enable additional SA functionality. Some of them, such as Net::DNS, are absolutely vital (Net::DNS is used for all DNS-based tests, including SBL, XBL, SpamCop, and DSBL, among a variety of other DNS-related tasks). Once you have the required CPAN modules installed, it’s time to move on to SA plug-ins. For a myriad of reasons, including licensing and terms-of-service issues, some SA plug-ins are disabled by default. Among them are DCC and Razor2.
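To see whether a given CPAN module is already present, you can ask Perl to load it. The following is a minimal sketch: `have_perl_module` is a hypothetical helper name, and the list contains only Net::DNS because that is the module named above. Extend it with whatever else the INSTALL file lists for your SA version.

```shell
# have_perl_module: hypothetical helper; succeeds if Perl can load the module.
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Add further module names from SA's INSTALL file to this list as needed.
for mod in Net::DNS; do
    if have_perl_module "$mod"; then
        echo "$mod: installed"
    else
        echo "$mod: missing"
    fi
done
```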
DCC, or Distributed Checksum Clearinghouse, is an anti-spam content filter that uses fuzzy checksums to recognize unsolicited bulk mail. It is available from http://www.rhyolite.com/anti-spam/dcc/. Read the DCC license before you deploy it, because it is not Open Source software.
In-depth installation instructions are beyond the scope of this column, but for a client-only setup, it’s essentially as easy as…
$ ./configure --disable-dccm \
--disable-server && make
$ sudo make install
Once installed, edit the appropriate configuration files per DCC’s INSTALL.txt, start dccifd, and uncomment the line loadplugin Mail::SpamAssassin::Plugin::DCC in the SA v310.pre file.
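If you would rather script the uncommenting step than edit the file by hand, something like the following may do. It is a sketch under assumptions: `enable_sa_plugin` is a hypothetical helper, and it expects the stock form of the line (`#loadplugin Mail::SpamAssassin::Plugin::DCC`). It keeps a `.bak` copy of the file before rewriting it.

```shell
# enable_sa_plugin FILE PLUGIN
# Hypothetical helper: uncomments the loadplugin line for PLUGIN in FILE,
# leaving a backup in FILE.bak. Other lines are left untouched.
enable_sa_plugin() {
    pre_file=$1
    plugin=$2
    cp "$pre_file" "$pre_file.bak" || return 1
    sed "s/^#[[:space:]]*\(loadplugin[[:space:]][[:space:]]*Mail::SpamAssassin::Plugin::$plugin\)[[:space:]]*\$/\1/" \
        "$pre_file.bak" > "$pre_file"
}
```

For example, `enable_sa_plugin /etc/mail/spamassassin/v310.pre DCC` (the path is an assumption and varies by installation), followed by `spamassassin --lint` to confirm that the configuration still parses.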
Similar to DCC, Vipul’s Razor is a distributed, collaborative, spam detection and filtering network. Through user contribution, Razor establishes a distributed and constantly updating catalog of spam in propagation that is consulted by email clients to filter out known spam. Available from http://razor.sourceforge.net/, Razor2 is distributed under the Artistic License.
While Razor2 has a few CPAN module requirements, they’ve all been conveniently bundled in the razor-agents-sdk package. You can choose to install it via whichever mechanism you are comfortable with. Once the prerequisites are installed, razor-agents can be installed as most Perl modules are:
$ perl Makefile.PL
$ make && make test
$ sudo make install
$ razor-admin -create
$ razor-admin -register
Finally, you must uncomment the loadplugin Mail::SpamAssassin::Plugin::Razor2 line in v310.pre.
With DCC and Razor2 now installed, look through the SA *.pre files for other disabled-by-default tests that may help in your environment.
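A quick way to survey what is still switched off is to search the *.pre files for commented-out `loadplugin` lines. The helper below is a hypothetical sketch that takes the configuration directory as its argument, since that location differs between installations.

```shell
# list_disabled_plugins DIR
# Hypothetical helper: prints every commented-out loadplugin line found in
# DIR/*.pre, prefixed with the file it came from. Passing /dev/null as an
# extra file makes grep print file names even when only one .pre file exists.
list_disabled_plugins() {
    grep '^#[[:space:]]*loadplugin' "$1"/*.pre /dev/null
}
```

On many systems the call would be `list_disabled_plugins /etc/mail/spamassassin`, but treat that path as an assumption and check where your SA installation keeps its configuration.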
With SA properly tuned and running with some optional modules, you should notice less spam. To take things a step further, look into adding more SA rules. The best place for this is http://www.rulesemporium.com/, which has a vast repository of tested SA rules. Be sure to read the notes carefully before implementing any third-party rules, and keep in mind what impact they may have on legitimate mail. Being overzealous can cause you as much grief as it saves.
Your goal should be getting as little spam as possible, while keeping false positives to an absolute minimum. Always test the rules you add thoroughly and keep a close eye on things after you make changes. If you notice that you consistently get a kind of spam that SA lets through, even with the additional modules enabled and third-party rules added, don’t be afraid to write a custom rule that targets your specific problem exactly.
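As an illustration of what such a custom rule can look like, the sketch below writes a single body rule to a file. Everything specific in it is hypothetical: the rule name, the phrase it matches, the score, and the file name. The `body`, `describe`, and `score` directives themselves are standard SA configuration keywords.

```shell
# Hypothetical destination; set RULE_FILE to override it. To take effect,
# the file has to live in SA's site configuration directory.
rule_file=${RULE_FILE:-./99_local_custom.cf}

cat > "$rule_file" <<'EOF'
# Hypothetical example: a phrase that keeps slipping past the stock rules.
body     LOCAL_REPLICA_WATCHES  /\bcheap\s+replica\s+watches\b/i
describe LOCAL_REPLICA_WATCHES  Body mentions cheap replica watches
score    LOCAL_REPLICA_WATCHES  1.0
EOF

echo "wrote $rule_file"
```

The score is deliberately small compared with the default spam threshold of 5.0, so the rule nudges a message rather than condemning it on its own. After copying the file into place, run `spamassassin --lint`, then feed a few saved spam and legitimate messages through `spamassassin -t` to see whether the rule fires where you expect before raising the score.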