Content based filters

Content based filters are incredibly complex and entire books could be written about how they work and what they look at. Of course, by the time the book was written it would be entirely obsolete. Because of their complexity, though, I am always looking for new ways to explain them to folks.
Content based filters look at a whole range of things, from the actual text in the message, to the domains, to the IP addresses those domains and URLs point to. They look at the hidden structure of an email. They look at what’s in the body of the message and what’s in the headers. There isn’t a single bit of a message that content filters ignore.
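To make that list a little more concrete, here is a minimal Python sketch of pulling apart a message into some of the pieces a filter might examine: the headers, the body text, and the hostnames behind any links. The function name `message_features`, the URL pattern, and the sample message are my own illustrative assumptions, not how any particular filter works, and real filters look at far more than this.

```python
import re
from email import message_from_string
from urllib.parse import urlparse

# Simple pattern for http/https links; an assumption for illustration only.
URL_RE = re.compile(r"https?://[^\s<>\"']+")

def message_features(raw):
    """Return headers, body text, and link hostnames from a raw message."""
    msg = message_from_string(raw)
    body = msg.get_payload()  # assumes a single-part, plain-text message
    hosts = {urlparse(url).hostname for url in URL_RE.findall(body)}
    hosts.discard(None)  # drop anything that did not parse to a hostname
    return {
        "headers": dict(msg.items()),
        "body": body,
        "link_hosts": sorted(hosts),
    }

# Hypothetical sample message.
SAMPLE = (
    "From: news@clientdomain.example.com\n"
    "Subject: Weekly update\n"
    "\n"
    "Read more at http://clickthrough.clientdomain.example.com/a1\n"
)

features = message_features(SAMPLE)
print(features["link_hosts"])
```

Each of those three pieces (headers, body, link hostnames) can be scored separately, which is part of why changing a word or two rarely fixes a filtering problem.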
Clients usually ask me what words they should change to avoid the filters. But this isn’t the right question to ask. Usually it’s not a word that causes the problem. Let me give you a few examples of what I mean.
James H. has an example over on the Cloudmark blog of how a single missing space in an email caused delivery problems for a large company. That missing space changed a domain name in the message enough that a number of filters caught it. This is one type of content filter: one that focuses on what the message is advertising or who benefits from the message. Some of my better clients get caught by these types of filters occasionally. A website they’re linking to, or a domain name they’re using in the text of the message, has a bad reputation. The mail gets bulked or blocked because of that domain in the message.
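I don't have the details of that particular message, so the following is only a hypothetical illustration of how a missing space can change what a filter sees. It is a small Python sketch in which a naive pattern, `DOMAIN_RE`, picks bare domain names out of text. Both the pattern and the sample sentences are assumptions of mine.

```python
import re

# Naive bare-domain pattern: dot-separated labels ending in 2+ letters.
# An assumption for illustration; real filters are more careful.
DOMAIN_RE = re.compile(r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b", re.IGNORECASE)

def bare_domains(text):
    """Return the domain-like strings found in text, lowercased."""
    return [match.lower() for match in DOMAIN_RE.findall(text)]

with_space = "Visit shop.example.com. It is open now."
without_space = "Visit shop.example.com.It is open now."

print(bare_domains(with_space))     # ['shop.example.com']
print(bare_domains(without_space))  # ['shop.example.com.it']
```

With the space missing, the start of the next sentence is absorbed into the hostname, and the message now appears to mention a completely different domain, one whose reputation the sender has no control over.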
One of my clients went from 100% inbox every day to random failures at different domains. Their overall inbox rate was still in the 96–98% range, but there was a definite change. The actual content of their mail hadn’t changed, but we kept looking for underlying causes. At one point we were on the phone and they mentioned their new content management system. Sure enough, the content management company had a poor reputation, and the delivery problems started exactly when the client started using the new system. The tricky part was that the actual domains and URLs in the messages never changed; they were still clickthrough.clientdomain.example.com. But those URLs now pointed to an IP address that a lot of spammers were abusing. So there were delivery problems. We made some changes to their setup and the delivery problems went away.
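The kind of check that catches this can be sketched in a few lines of Python: resolve the hostname in the link and see whether the resulting IP address is on a list of addresses with a bad reputation. Everything here is assumed for illustration: the function name `link_ip_flagged`, the `BAD_IPS` set, and the addresses (taken from the ranges reserved for documentation). A real filter would consult live reputation data rather than a hard-coded set.

```python
import socket

def link_ip_flagged(hostname, bad_ips, resolve=socket.gethostbyname):
    """Resolve a link hostname and report whether its IP is on a bad list.

    `resolve` defaults to a real DNS lookup; it can be swapped for a stub.
    """
    ip = resolve(hostname)
    return ip, ip in bad_ips

# Hypothetical data: the same hostname before and after a hosting change.
HOST = "clickthrough.clientdomain.example.com"
BAD_IPS = {"203.0.113.7"}
dns_before = {HOST: "192.0.2.10"}
dns_after = {HOST: "203.0.113.7"}

print(link_ip_flagged(HOST, BAD_IPS, resolve=dns_before.__getitem__))
print(link_ip_flagged(HOST, BAD_IPS, resolve=dns_after.__getitem__))
```

The hostname passed in is identical in both calls. Only the address behind it differs, which is why nothing in the message itself looked different when the problems started.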
The third example is one from quite a long time ago, but it illustrates a key point. A client was testing email sends through a new ESP. They were sending one-line mail through the ESP’s platform to their own email account. Their corporate spam filter was blocking the mail. After much investigation and a bit of string pulling, I finally got to talk to an engineer at the spam filtering company. He told me that they were blocking the mail because it “looked like spam.” When pressed, he told me they blocked anything that had a single line of text and an unsubscribe link. Once the client added a second line of text, the filtering issue went away.
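The engineer didn't share how the rule was implemented, so the following Python sketch is only my guess at the general shape of such a check: count the non-empty lines of text, and flag the message if there is exactly one plus a line that mentions unsubscribing. The function name `looks_like_spam` and the way an unsubscribe link is detected are assumptions.

```python
import re

# Assumed detection: any line mentioning "unsubscribe" counts as the link.
UNSUB_RE = re.compile(r"unsubscribe", re.IGNORECASE)

def looks_like_spam(body):
    """Flag a body with exactly one line of text plus an unsubscribe link."""
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    unsub_lines = [line for line in lines if UNSUB_RE.search(line)]
    text_lines = [line for line in lines if not UNSUB_RE.search(line)]
    return len(text_lines) == 1 and len(unsub_lines) >= 1

one_line = "This is a test.\nUnsubscribe: http://esp.example.com/u/1\n"
two_lines = "This is a test.\nSecond line.\nUnsubscribe: http://esp.example.com/u/1\n"

print(looks_like_spam(one_line))   # True
print(looks_like_spam(two_lines))  # False
```

Notice that no word in the message matters to this rule at all. It is purely structural, which is why adding a second line fixed the problem.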
These are just a few examples of how complex content based filters are. Content is almost a misnomer for them, as they look at so many other things, including layout, URLs, domains and links.
