June 15, 2004

Greylisting Enabled

I upgraded our Postfix installation today to the latest-n-greatest, primarily so we can make use of a great new technique in spamfighting: greylisting.

Here's how it works: a program called postgrey runs on hexogen, and postfix consults it every time someone from the outside world tries to send us an email. Postgrey maintains a little database of (sending IP, sender address, recipient address) triples, along with the time each combination was first seen. If the current message's triple is new to the database, or was added less than five minutes ago, postfix tells the sender to try again later (with a '450' temporary SMTP error); if the triple already exists and is more than five minutes old, the mail is accepted.
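For the curious, here is a rough Python sketch of that decision. It is not postgrey's actual code, just the triple-and-timestamp idea described above; the Greylist class, its check() method, and the in-memory dict standing in for the database are all made up for illustration.

```python
import time

GREYLIST_DELAY = 5 * 60  # seconds; the five-minute window described above


class Greylist:
    """Toy in-memory stand-in for the triple database (hypothetical)."""

    def __init__(self, delay=GREYLIST_DELAY):
        self.delay = delay
        self.first_seen = {}  # (client_ip, sender, recipient) -> timestamp

    def check(self, client_ip, sender, recipient, now=None):
        """Return an SMTP-style verdict for one delivery attempt."""
        now = time.time() if now is None else now
        triple = (client_ip, sender, recipient)
        # Record the triple the first time we see it; keep the old time otherwise.
        seen = self.first_seen.setdefault(triple, now)
        if now - seen < self.delay:
            return "450 try again later"  # temporary failure: a real MTA retries
        return "250 accepted"
```

A real MTA that gets the 450 queues the message and retries a few minutes later, at which point the triple is old enough and the mail goes through. Fire-and-forget spamware never comes back, so its triple never ages into acceptance.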

"But how does this block spam?" you may ask. The whitepaper states (and my own observations have shown this to be true):

[.. The] vast majority of spam appears to be sent from applications designed specifically for spamming. These applications appear to adopt the "fire-and-forget" methodology. That is, they attempt to send the spam to one or several MX hosts for a domain, but then never attempt a true retry as a real MTA would ... In addition, with the recent rampant proliferation of email-based viruses, Greylisting has been shown to be extremely effective in blocking these viruses, as they also do not tend to retry deliveries.
I don't have hard stats yet -- I'll update when I do -- but so far it looks like the trend shown in this graph from the postgrey page applies to us, too.


Posted by eric at 02:13 PM | Comments (0)

March 11, 2004

Lump of Coal upgraded to Kryptonite: Amavisd-New

I've blogged before about the ongoing war to combat the unholy blight of spam around here, and last week I deployed a new weapon in the effort. It's called amavisd-new, and it takes the place of the previous SMTP proxy we were using (called spampd, which has served us nobly and well). The upshot is, we get some nice additional customization options and potent antivirus protection (courtesy of clamav) all under one roof. Techie details follow... The most useful customization option, and the one which took me a bit of headscratching to get right, is the idea of variable spam thresholds. As the FAQ says:
  • tag level is where X-Spam-Status and X-Spam-Level header fields start to appear (e.g. setting tag level to 0 (or even better to -999) would turn this on permanently);
  • tag2 level is where a message is considered spam as far as mail header fields are concerned: the X-Spam-Flag: YES header field appears, the X-Spam-Status gets a YES, Subject gets a ***SPAM*** if subject editing is enabled;
  • kill level is where a message is considered spam and countermeasures are taken: (reject/bounce/discard/pass), quarantine, notify, adding optional recipient address extension). It is common to set tag2 level the same as kill level, but some may prefer to set kill level even higher, perhaps combined with $final_spam_destiny=D_DISCARD;
I had to chew on that a bit, so let me post the relevant bits of our amavisd.conf and describe what it ends up doing to the spam.
$sa_tag_level_deflt  = -999.0; 
$sa_tag2_level_deflt = 5.0;
$sa_kill_level_deflt = { 'noc@explosive.net' => 5, 'ac@explosive.net' => 600, '.' => 20};
$final_spam_destiny  = D_DISCARD;  
So with $sa_tag_level_deflt = -999.0, we make sure that everything gets tagged. This is mostly just to make sure that SpamAssassin is working properly. Our $sa_tag2_level_deflt = 5.0 sets our threshold for what ought to be considered spam. The value for $sa_kill_level_deflt is an anonymous hash reference to key => value pairs where the key is somebody's email address (the '.' sets the default) and the value is the spam score at which the destiny will be invoked.
Since $final_spam_destiny is set to "discard", anything that gets an SA score of 20 or above goes straight into the bit bucket while crafty amavisd-new tells the sending MTA that it accepted the message, so no bounces are generated. There is a low kill level for 'noc@explosive.net' because that address goes into our Request Tracker queue, and spam auto-generating tickets causes me to curse and gnash my teeth. There's a high kill level for 'ac@explosive.net' because it is a spamtrap address that feeds directly into the SpamAssassin Bayes database.
(You hear that, trawling spam-address harvesters? ac@explosive.net is a great address to send some email to! I hear they're particularly interested in performance-enhancing pharmaceuticals...)
I somewhat arbitrarily picked the 20-point default because it seems high enough that the odds of a false positive are very slim, but low enough that we can cut down on the amount of outrageously spammy spam we have to store on disk, even in your spam folder.

At any rate, this is a lengthy way of explaining how amavisd-new implements something that I had previously thought was not possible; namely, per-recipient preferences for SpamAssassin thresholds. Because it's an SMTP proxy, it knows each of a message's ultimate recipients, receives and saves a copy of the whole message, and then decides for each destination address whether to tag-and-forward or discard it.
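To make the per-recipient behaviour concrete, here is a small Python sketch of the decision using the numbers from our config. It only illustrates the thresholds, not how amavisd-new is actually written: the function names are invented, the lookup is a plain exact-address match with '.' as the fallback, and eric@explosive.net in the usage note is just a placeholder for "any ordinary mailbox".

```python
TAG_LEVEL = -999.0  # at or above this, X-Spam-Status / X-Spam-Level are added
TAG2_LEVEL = 5.0    # at or above this, the message is marked as spam
KILL_LEVELS = {"noc@explosive.net": 5, "ac@explosive.net": 600, ".": 20}


def kill_level_for(recipient):
    """Per-recipient kill level, falling back to the '.' default."""
    return KILL_LEVELS.get(recipient.lower(), KILL_LEVELS["."])


def disposition(score, recipient):
    """What happens to one copy of the message for one recipient."""
    if score >= kill_level_for(recipient):
        return "discard"  # D_DISCARD: dropped silently, no bounce
    if score >= TAG2_LEVEL:
        return "deliver, flagged as spam"
    if score >= TAG_LEVEL:
        return "deliver, tagged with score"
    return "deliver"


def route(score, recipients):
    """Decide separately for every recipient of the same message."""
    return {rcpt: disposition(score, rcpt) for rcpt in recipients}
```

Calling route(12.0, [...]) with noc@, ac@ and eric@ as recipients discards the copy bound for noc@ (kill level 5), while the other two are delivered with the spam flag set, since 12 is above the tag2 level but below their kill levels.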

And I haven't even talked about the Antivirus part, which is also really cool -- hourly automatic virus signature updates? same-or-better coverage and response compared to the big boy AV vendors? for FREE?! Hell yeah! Big props to the clamav and amavisd-new developers, this is one little corner of the net that's a brighter place thanks to you.

Posted by eric at 03:03 PM

December 18, 2003

A Lump of Coal for the Spammers

Some December SpamAssassin updates:
  • Chris Santerre's comprehensive EvilRules.cf has been updated to the latest "bigevil.cf"
  • Bayesian filtering is turned on -- a misplaced configuration directive had been causing us to miss out on this great Force For Good in the world. Here are Paul Graham's latest thoughts regarding Bayesian classification.
  • I made some rules to catch the incredibly annoying "Re: MANJHOY, a ticket exploding" type spams since they were slipping by as false negatives. The additions are below...
My local.cf additions for these "dada spam":
header      ERIC_2THRU8 Subject =~ /^Re:\s+[[:upper:]]{2,8},(?:\s\w+){3}/
describe    ERIC_2THRU8 Annoying spams with 2-8 caps, then 3 words
score       ERIC_2THRU8 2.5

header      ERIC_2THRU0 Subject =~ /^Re: \%RND_UC_CHAR\[2-8\],/
describe    ERIC_2THRU0 When spamware screws up
score       ERIC_2THRU0 2.5
They seem to be doing pretty well. Between these, the new EvilRules, and the BAYES_90 rules, my miss rate has gone down to zero over the past 12 hrs. Let's see how long it lasts.
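If you want to poke at the two Subject patterns outside of SpamAssassin, here is a quick Python sketch. One caveat: Python's re module does not understand the POSIX [[:upper:]] class, so I've swapped in [A-Z] as an approximation; the matched_rules() helper and the sample subjects other than the MANJHOY one are made up for the test.

```python
import re

# Approximations of the two local.cf header rules above.
# [A-Z] stands in for Perl's [[:upper:]], which Python's re lacks.
RULES = {
    "ERIC_2THRU8": re.compile(r"^Re:\s+[A-Z]{2,8},(?:\s\w+){3}"),
    "ERIC_2THRU0": re.compile(r"^Re: %RND_UC_CHAR\[2-8\],"),
}


def matched_rules(subject):
    """Return the names of the rules whose pattern matches this Subject."""
    return [name for name, pattern in RULES.items() if pattern.search(subject)]


if __name__ == "__main__":
    for subject in (
        "Re: MANJHOY, a ticket exploding",      # the dada spam itself
        "Re: %RND_UC_CHAR[2-8], buy now pls",   # spamware template left unfilled
        "Re: Lunch on Friday?",                 # ordinary reply
    ):
        print(subject, "->", matched_rules(subject))
```

The ordinary reply matches neither rule because the first pattern wants two to eight capitals immediately followed by a comma.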

Posted by eric at 08:25 AM | Comments (1)

November 18, 2003

More SA updates

Brought us up to the newest Rules Emporium SpamAssassin rulesets. There's some pretty wily stuff in here. Check this out:


meta FVGT_combo_IMAGEONLY1 ((HTML_IMAGE_ONLY_02 + MIME_HTML_ONLY + MIME_HTML_ONLY_MULTI) > 1)
describe FVGT_combo_IMAGEONLY1 FVGT - Image only type spam?
score FVGT_combo_IMAGEONLY1 4.3

This is a "combo rule" that (just like a Tony Hawk combo) gives a spam an additional 4.3-point bonus if at least two of the three standard "image only" type rules have already been matched (the sum of the three has to be greater than 1). I think they just added this "meta" capability recently, and it's a great idea because it should help to put marginal spams over the top.
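Reading the expression literally, each sub-rule counts as 1 when it hits and 0 when it doesn't, and the meta rule fires when the sum is greater than 1, so two hits are enough. Here is that arithmetic as a tiny Python sketch; the function names and the idea of passing in a set of hit rule names are mine, not anything from SpamAssassin itself.

```python
COMPONENTS = ("HTML_IMAGE_ONLY_02", "MIME_HTML_ONLY", "MIME_HTML_ONLY_MULTI")
COMBO_SCORE = 4.3


def combo_fires(hits):
    """True when more than one of the component rules is in the set of hits."""
    return sum(name in hits for name in COMPONENTS) > 1


def combo_bonus(hits):
    """Extra points contributed by the meta rule for this message."""
    return COMBO_SCORE if combo_fires(hits) else 0.0
```

So a message that trips only MIME_HTML_ONLY gets no bonus, while one that trips MIME_HTML_ONLY and HTML_IMAGE_ONLY_02 picks up the extra 4.3 points on top of whatever those two rules already scored.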

UPDATE: However, some of the new rules seem to be a bit aggressive. I've adjusted the threshold score back up to 5 (from 4.5) and bumped down some of the point values for overzealous rules. Please let me know if you're getting false positives.

Posted by eric at 08:07 AM

October 23, 2003

SpamAssassin Updates

I made some updates to the sitewide SpamAssassin setup today. Things look good so far -- no false positives and only three misses out of sixty SPAMs I've gotten.

We're now running the latest version of Mail::SpamAssassin (2.60) via spampd, a fast and safe pseudo-SMTP proxy which runs everyone's mail through SA without having to use procmail. However, the spammers have been getting ever-craftier, and recent weeks have seen a miss rate of about 30%, so more stringent measures were clearly needed.

I found this great site after some googling and catching up on the sa-talk mailing list. I incorporated all of the popcorn, evil rules, and header checks in their current incarnations. Additionally I poked around a bit on the SA Twiki (whose domain name is strangely, coincidentally close to Malcolm's site) but haven't incorporated any of the rules there just yet.

I'd be happy to hear about any additional resources for user-contributed rules, especially if you're using something to catch those random letter groups they use to pad out the message bodies... x ltgvilujlaaaedo cjgmvybmux

-=Eric

Posted by eric at 06:10 PM