Unless you just got an e-mail address recently, you’ve had to deal with what we in the industry refer to as a “metric crapload” of spam. By some estimates, spam now makes up roughly 80 percent of all e-mail traffic. How can you stop the flood and keep e-mail usable?

Server-side tools like SpamAssassin are a good start, but they’re useless if your mail server administrator doesn’t set them up properly. In many cases, ISPs don’t even use SpamAssassin, or they use older versions that are significantly worse at identifying spam. Whatever the case, the typical end user sees far more junk e-mail than they should.

Enter SpamSieve, by Michael Tsai. This $25 piece of shareware (with a 30-day fully functional trial) is the answer to spam on Mac OS X. It uses Bayesian filtering techniques, coupled with intelligent whitelisting based on Mac OS X’s Address Book and a blacklist of spam addresses, to identify and separate your junk e-mail (“spam”) from legitimate e-mail (“ham”).
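
To make the idea concrete, here is a minimal Python sketch of how Bayesian scoring can be combined with a whitelist and a blacklist. It is a toy illustration only, not SpamSieve’s actual algorithm: the class name `ToyBayesFilter`, the tokenizer, the smoothing constants, and the example addresses are all assumptions invented for this example.

```python
import math
import re
from collections import Counter


class ToyBayesFilter:
    """Toy naive-Bayes-style spam filter (hypothetical; not SpamSieve's code)."""

    def __init__(self, whitelist=(), blacklist=()):
        # Address lists are checked before any word statistics.
        self.whitelist = {a.lower() for a in whitelist}
        self.blacklist = {a.lower() for a in blacklist}
        # Number of training messages of each kind that contain each word.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Very crude tokenizer: lowercase runs of letters, digits, $ and '.
        return re.findall(r"[a-z0-9$']+", text.lower())

    def train(self, text, label):
        """Record one message as 'spam' or 'ham'."""
        self.word_counts[label].update(set(self.tokenize(text)))
        self.message_counts[label] += 1

    def spam_log_odds(self, text):
        """Log-odds that a message is spam; positive means 'more likely spam'."""
        n_spam = self.message_counts["spam"]
        n_ham = self.message_counts["ham"]
        # Prior odds from how much spam vs. ham has been seen (add-one smoothed).
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for word in set(self.tokenize(text)):
            # Smoothed estimate of P(word appears | spam) and P(word appears | ham).
            p_spam = (self.word_counts["spam"][word] + 1) / (n_spam + 2)
            p_ham = (self.word_counts["ham"][word] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return log_odds

    def classify(self, sender, text):
        sender = sender.lower()
        if sender in self.whitelist:
            return "ham"
        if sender in self.blacklist:
            return "spam"
        return "spam" if self.spam_log_odds(text) > 0 else "ham"
```

The order of checks in `classify` is the point of the sketch: known-good and known-bad senders short-circuit the decision, and only mail from strangers is judged on word statistics learned during training.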

Installation is straightforward for most mail clients, but I recommend giving the manual a thorough read if you’re installing SpamSieve for the first time, because there are minor “gotchas” for some of the more esoteric clients, particularly Eudora (which I happen to love and use). The only major Mac OS X mail client not covered by SpamSieve is Thunderbird, which has pretty good Bayesian filtering built in, rendering most of SpamSieve’s functionality redundant.

Once SpamSieve is installed, it can go to work right away. However, because Bayesian filtering is probabilistic, the first few weeks (fewer if you get a lot of e-mail; more if you get very little) are necessarily a “training period.” If you keep archives of past spam and ham, training SpamSieve is a piece of cake and takes only a few minutes. I used the past three months’ worth of spam and ham, roughly 4,000 messages, to train the filter, and with the initial learning period included, total accuracy is at 96.5 percent. (If the first two weeks are discarded, accuracy rises above 99 percent.) If you don’t have a large body of junk to train SpamSieve with, don’t worry: it will still do an excellent job, but the more you train it, the better it gets. All but the most casual e-mail users should see acceptable filtering after a month of training, “acceptable” being defined as “I trust SpamSieve to put the junk directly into the trash without double-checking it first.”

SpamSieve keeps great statistics, so stats geeks will have plenty to keep them entertained. Since I installed SpamSieve, there have been 258 “hams” and 603 spams. Legitimate e-mail represents just 30 percent of my incoming e-mail traffic, and I’m averaging 36 spams per day over the past two weeks. There have been 23 false positives (legitimate e-mail marked as spam, in my case almost entirely low-priority automated mailing list traffic) and seven false negatives (all of which were immediately used to train the filter further).
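
For anyone who wants to check the math, the figures above can be reproduced with a few lines of Python. This is a rough sketch under two assumptions of mine: that “accuracy” means correctly filed messages divided by all messages, and that the 258 and 603 counts are the true numbers of legitimate and junk messages. The function name `filter_stats` is made up for illustration and has nothing to do with SpamSieve itself.

```python
def filter_stats(hams, spams, false_positives, false_negatives):
    """Summarize filter performance from raw message counts."""
    total = hams + spams
    errors = false_positives + false_negatives
    return {
        # Share of all messages that ended up in the right place.
        "accuracy": (total - errors) / total,
        # Share of incoming mail that is legitimate.
        "ham_share": hams / total,
        # Share of legitimate mail wrongly marked as spam.
        "false_positive_rate": false_positives / hams,
    }


stats = filter_stats(hams=258, spams=603, false_positives=23, false_negatives=7)
print(f"accuracy: {stats['accuracy']:.1%}")    # accuracy: 96.5%
print(f"ham share: {stats['ham_share']:.1%}")  # ham share: 30.0%
```

With 861 messages and 30 mistakes in total, 831 were filed correctly, which is where the 96.5 percent figure comes from.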

I don’t normally go for shareware in the fairly rarefied over-$20 realm, but SpamSieve is absolutely worth it. Eudora users will find it half the cost of an upgrade to Eudora’s Paid mode (which is required to use the vastly inferior built-in junk mail filters). SpamSieve will pay for itself within a month by freeing you from worrying about whether your junk mail filter caught anything important. For people who archive all their incoming mail (like me), SpamSieve will easily save $25 worth of time in the first month.

Go try it free for 30 days. If a month of spam-free e-mail doesn’t convince you it’s worth the price, I don’t know what will.

Full disclosure: as a Man of Many Hats™, Chris Lawson also writes for About This Particular Macintosh, where Michael Tsai, the author of SpamSieve, is editor-in-chief. This review was in no way influenced by his professional relationship with SpamSieve’s author.

  1. And for those who don’t use the metric system, it’s just a “crapload”…

  2. I have been using SpamSieve since November. I too trained the product using old junk mail and old saved messages, and SpamSieve is unbelievably accurate.

    Statistics report that 83% of my mail is spam, with 8 false positives and 2 false negatives.

    This product has an amazing 99.8% accuracy rate and has had it from about day one. (I trained it with 2 years’ worth of mail, then went through the corpus and removed all entries that had only 1 or 2 hits.) The product does yell at me every day that my corpus is too large, and that I should start over and retrain for accuracy and performance reasons. As far as accuracy goes, I couldn’t ask for a more accurate product. I am happy with the performance even with a monstrous corpus. Periodically I clean out the rules with only a single hit, which in my case usually removes about 100,000 rules from the corpus.

    I don’t have any relation to the author of the software other than having sent him $25.00. One of the best investments ever made.
    If only it interfaced with Thunderbird! HINT HINT HINT!!!

    Jeffrey Anderson

  3. Richard Brown Monday, November 6, 2006

    I’ve used SpamSieve for about two years with Apple Mail. My statistics show an accuracy rate of 98.1%. Frankly, I thought the percentage was even higher. I don’t recall the last time SpamSieve made a mistake.
