Email without filters

… or Find the False Positive.
Anyone sending a lot of email has complained about spam filters and false positives at some point. But most people haven’t run a mailbox with no spam filters in front of it in recent years, so they don’t have much of a feel for what an unfiltered mailbox looks like, how important filters are, or how difficult their job is.
I run no transaction-level filters in front of my mailbox, just content filters that route mail to one of several inboxes or a junk folder, so if I want to I can see what unfiltered email looks like. I took data from all the mail that was sent to me yesterday and put it in a format that shows the problem filters face, especially the difficulty of spotting which mail in the junk folder is a false positive.
An inbox with no filters looks like this.

Running a spam filter against it, one that simply categorizes each email as spam (pink) or not-spam (green), looks like this.
 

Even with the messages categorized as spam vs not-spam it’s hard to work out which messages are important and which aren’t, let alone where the false positives might be.
If I sort the categories by hand I get this, where you can see that out of 1200 or so mails about three quarters were spam. Of the three false positives, two were bulk email that I didn’t mind missing and only one was email that I considered important.
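To put rough numbers on that, here is a minimal Python sketch of the arithmetic. The 1200 total and the three false positives come from the counts above; the 900 spam messages are simply "about three quarters" rounded, and the default of zero false negatives is an assumption made only to keep the sum simple, not a figure from my mailbox. The function name is made up for illustration.

```python
def false_positive_stats(total, true_spam, false_positives, false_negatives=0):
    """Rough false-positive arithmetic for one day's mail.

    false_negatives (spam that reached an inbox) defaults to 0 here;
    that is an assumption for illustration, not a measured value.
    """
    legitimate = total - true_spam
    # Mail that ended up in the junk folder: caught spam plus wrongly junked mail.
    flagged = (true_spam - false_negatives) + false_positives
    return {
        "legitimate": legitimate,
        "flagged": flagged,
        "fp_share_of_junk": false_positives / flagged,
        "fp_share_of_legit": false_positives / legitimate,
    }


# Approximate counts: ~1200 mails, ~three quarters spam, 3 false positives.
stats = false_positive_stats(total=1200, true_spam=900, false_positives=3)
print(f"junk folder size: {stats['flagged']}")
print(f"false positives as share of junk folder: {stats['fp_share_of_junk']:.2%}")
print(f"false positives as share of wanted mail: {stats['fp_share_of_legit']:.2%}")
```

Under those assumptions the false positives are a tiny fraction of the junk folder, which is exactly why they are so hard to spot by eye, even though they are a noticeably larger share of the mail I actually wanted.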
 
 

Related Posts

Spamtraps

There is a lot of mythology surrounding spamtraps: what they are, what they mean, how they’re used and how they get on lists.
Spamtraps are, very simply, unused addresses that receive spam. They come from a number of places, but the most common spamtraps can be classified in a few ways.


Why do ISPs do that?

One of the most common things I hear is “but why does the ISP do it that way?” The generic answer to that question is: because it works for them and meets their needs. Anyone designing a mail system has to implement some sort of spam filtering and will have to accept the potential for lost mail. Even those recipients who run no software filtering may lose mail. Their spamfilter is the delete key and sometimes they’ll delete a real mail.
Every mailserver admin, whether managing an MTA for a corporation, an ISP or themselves, inevitably looks at the question of false positives and false negatives. Some are more sensitive to false negatives and would rather block real mail than have to wade through a mailbox full of spam. Others are more sensitive to false positives and would rather deal with unfiltered spam than risk losing mail.
At the ISPs, many of these decisions aren’t made by one person; they are driven by the business philosophy, requirements and technology. The different consumer ISPs have different philosophies and these show in their spamfiltering.
Gmail, for instance, has a lot of faith in their ability to sort, classify and rank text. This is, after all, what Google does. Therefore, they accept most of the email delivered to Gmail users and then sort after the fact. This fits their technology, their available resources and their business philosophy. They leave as much filtering at the enduser level as they can.
Yahoo, on the other hand, chooses to filter mail at the MTA. While their spamfoldering algorithms are good, they don’t want to waste CPU and filtering effort on mail that they think may be spam. So, they choose to block heavily at the edge, going so far as to rate limit senders that they don’t know. Endusers are protected from malicious mail and senders have the ability to retry mail until it is accepted.
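As a generic illustration of rate limiting unknown senders at the edge, here is a minimal Python sketch. It is not a description of how Yahoo or any other ISP actually implements this; the class name, the fixed-window counting and the limits are all assumptions chosen for brevity. A "defer" result stands in for a temporary failure that a well-behaved sender would retry later.

```python
class EdgeRateLimiter:
    """Hypothetical fixed-window rate limiter for senders the receiver doesn't know."""

    def __init__(self, known_senders, limit_per_window, window_seconds=3600):
        self.known = set(known_senders)
        self.limit = limit_per_window
        self.window = window_seconds
        self.counts = {}  # sender -> (window_start, messages accepted in this window)

    def decide(self, sender, now):
        """Return "accept" or "defer" for one delivery attempt at time `now` (seconds)."""
        if sender in self.known:
            return "accept"  # senders with history are not rate limited in this sketch
        start, count = self.counts.get(sender, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the previous window has expired, start a new one
        if count >= self.limit:
            self.counts[sender] = (start, count)
            return "defer"  # temporary failure: the sender can retry later
        self.counts[sender] = (start, count + 1)
        return "accept"


limiter = EdgeRateLimiter({"known.example"}, limit_per_window=2, window_seconds=60)
for t in (0, 1, 2, 61):
    print(t, limiter.decide("new.example", now=t))
```

In this sketch the third attempt inside the window is deferred, and the retry after the window expires is accepted, which mirrors the idea that senders can retry mail until it is accepted.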
The same types of entries could be written about Hotmail or AOL. They could even be written about the various spam filter vendors and blocklists. Every company has their own way of doing things and their way reflects their underlying business philosophy.


Change is required

I get a lot of calls from senders who tell me that they have not changed what they were doing, but all of a sudden their mail isn’t performing the way it used to. Sometimes it’s simply less effective marketing, but more often than not the issue is mail being blocked or filtered to the bulk folder.
What works today won’t work tomorrow. Spammers are forever evolving new techniques to get past spam filters. ISPs are forever evolving new techniques to stop them.
One of the current driving forces in spam filter development is a focus on individual recipients. Recipient wants and needs are king in the world of ISP mail filtering. Much of that is driven by the underlying business models of the free ISPs. They are selling eyeballs to their advertisers and that relies on keeping as many eyeballs around for as long as possible.
An early version of the recipient driven filtering was “add to your address book” where individual users could override ISP delivery decisions by actively adding a From: address to their address book. The ISPs have been refining this over time. For instance, if you reply to an email in some clients, you are prompted to add that address to your address book. If you take an email out of your bulk folder and move it to your inbox then that address is automatically added to your address book.
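The address book override described above can be sketched in a few lines of Python. This is a toy model, not any ISP’s actual logic: the `Mailbox` class, its method names and the "inbox"/"bulk" folder labels are all hypothetical, and real systems weigh many more signals.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    """Hypothetical mailbox where the user's address book overrides the ISP verdict."""

    address_book: set = field(default_factory=set)

    def folder_for(self, from_address, isp_says_spam):
        # A From: address the user has added always goes to the inbox.
        if from_address.lower() in self.address_book:
            return "inbox"
        return "bulk" if isp_says_spam else "inbox"

    def move_to_inbox(self, from_address):
        # Rescuing a message from the bulk folder adds its sender to the address book.
        self.address_book.add(from_address.lower())


mailbox = Mailbox()
print(mailbox.folder_for("news@example.com", isp_says_spam=True))  # filtered to bulk
mailbox.move_to_inbox("news@example.com")
print(mailbox.folder_for("news@example.com", isp_says_spam=True))  # user override applies
```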
But the refinements haven’t stopped there. ISPs are now making smart decisions about what emails a particular recipient will want to receive. This raises a number of challenges for senders. How do you send email to ten thousand or a hundred thousand or a million people and make it relevant to all of them?
Smart senders will take the individual delivery challenge in stride. They will change along with the ISPs, to send mail that their recipients want to receive. Change is inevitable and required.
