When did the reject happen?

Earlier today I approved a comment from Mike on a post about problems at AOL from 2012. The part of the comment that caught my attention:

SMTP error from remote mail server after end of data:
521 5.2.1 : AOL will not accept delivery of this message.

Mike also mentioned that his IP reputation is good when he checks at AOL, so he doesn’t understand why mail is being blocked.
I think the big clue is that the rejection came after the end of data. I would look at the full content of the mail, particularly domains and URLs, to identify what is triggering the block.
In the SMTP transaction there are only a few places the ISP can stop the transaction, and each spot tells us different things about why the ISP is rejecting the message.

After connection

A block after connection is a block either against the IP address or against the domain in the rDNS of the IP. IPs with no rDNS or generic rDNS can also be blocked here. Blocks here do happen, but many receivers will let the SMTP transaction continue.

After HELO/EHLO

A block after HELO/EHLO is often a block against the domain in the HELO/EHLO or against a particular HELO/EHLO. Malware and bots often have distinctive HELO/EHLO patterns and it’s common for those kinds of senders to be blocked at this point.

After Mail From

A block after Mail From is often directed at the domain in the bounce string. Some receivers do check to make sure the domain has an MX and will block if it doesn’t. Blocks don’t happen here very often.

After RCPT To

Blocks here are not always spam related. Most of the delivery failures at this point have to do with non-existent addresses.

After DATA

Blocks after DATA mean the ISP has actually seen the full content of the email. If a block comes after DATA, evaluate the full content of the message, including the recipient and their permission status, when determining what is triggering the block.
Knowing when the rejection happened is an important part of understanding why a block happened. For instance, if a block happens before DATA, you know that content isn’t relevant, because the ISP never saw the content. If a block happened before Mail From, you know it’s the IP address reputation or configuration. If a block happened after DATA, you know you need to look at the whole message.
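As a rough illustration of that triage, here is a minimal Python sketch that maps the stage where a rejection happened to what is worth investigating first. The names `Stage`, `WHAT_TO_CHECK`, `content_was_seen` and `triage` are made up for this example, and the hints simply paraphrase the sections above. It is not how any particular ISP or tool classifies rejections.

```python
from enum import Enum


class Stage(Enum):
    """Points in the SMTP transaction where a receiver can reject (hypothetical names)."""
    CONNECT = 1
    HELO = 2
    MAIL_FROM = 3
    RCPT_TO = 4
    DATA = 5


# First things to look at for each stage, paraphrased from the sections above.
WHAT_TO_CHECK = {
    Stage.CONNECT: "IP address reputation, rDNS domain, missing or generic rDNS",
    Stage.HELO: "the HELO/EHLO domain or string",
    Stage.MAIL_FROM: "the bounce domain, including whether it has an MX",
    Stage.RCPT_TO: "whether the address exists; often not spam related",
    Stage.DATA: "the whole message: content, domains, URLs, recipient and permission",
}


def content_was_seen(stage: Stage) -> bool:
    """The receiver only sees the message content once DATA has been sent."""
    return stage is Stage.DATA


def triage(stage: Stage) -> str:
    """Return a one-line hint for a rejection seen at the given stage."""
    scope = "content was seen" if content_was_seen(stage) else "content was never seen"
    return f"{stage.name}: {scope}; check {WHAT_TO_CHECK[stage]}"


if __name__ == "__main__":
    for stage in Stage:
        print(triage(stage))
```

For a bounce like Mike’s, which came after the end of data, `triage(Stage.DATA)` points at the whole message rather than the IP address.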
 

Related Posts

Deliverability and IP addresses

Almost 2 years ago I wrote a blog post titled The Death of IP Based Reputation. These days I’m even more sure that IP based reputation is well and truly dead for legitimate senders.
There are a lot of reasons for this continued change. Deliverability is hard when some people like the same email other people think is spam


Filter complexity

During the Q&A last week, I mentioned an example of a type of filter trying to demonstrate how complex the filters are. There was some confusion about what I was saying, so I thought I’d write a blog post explaining this.


Pattern matching primates

Why do we see faces where there are none? Pareidolia.
Why do we look at random noise and see patterns? Patternicity.
Why do we think we have discovered what’s causing filtering if we change one thing and email gets through?
It’s all because we’re pattern matching primates, or as Michael Shermer puts it “people believe weird things because of our evolved need to believe nonweird things.”
Our brains are amazing and complex and filter a lot of information so we don’t have to think about it. Our brains also fill in a lot of holes. We’re primed to see patterns, even when there’s no real pattern. Our brains can, and do, lie to us all the time. For me, one of the important parts of my Ph.D. work was learning to NOT trust what I thought I saw, and rather to effectively observe and test. Testing means setting up experiments in different ways to make it easier to not draw false conclusions.
Humans are also prone to confirmation bias, where we assign more weight to things that agree with our preconceived notions.
Take the email marketer who makes a number of changes to a campaign. They change some of the recipient targeting, they add in a couple of URLs, they restructure the mail to change the text to image ratio and they add the word free to the subject line. The mail gets filtered to the bulk folder and they immediately jump to the word free as the proximate cause of the filtering. They changed a lot of things but they focus on the word free.
Then they remove the word free from the subject line and all of a sudden the emails are delivering. Clearly the filter in question is blocking mail with free in the subject line.
Well, no. Not really. Filters are bigger and more complex than any of us can really understand. I remember a couple of years ago, when a few of my close friends were working at AOL on their filter team. A couple of times they related stories where the filters were doing things that not even the developers really understood.
That was a good 5 or 6 years ago, and filters have only gotten more complex and more autonomous. Google uses an artificial neural network as their spam filter. I don’t really believe that anything this complex just looks at free in the subject line and filters based on that.
It may be that one thing used to be responsible for filtering, but those days are long gone. Modern email filters evaluate dozens or hundreds of factors. There’s rarely one thing that causes mail to go to the bulk folder. So many variables are evaluated by filters that there’s really no way to pinpoint the EXACT thing that caused a filter to trigger. In fact, it’s usually not one thing. It could be any number of things all adding up to mean this may not be mail that should go to the inbox.
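To make the “adding up” point concrete, here is a toy Python sketch. Every factor name, weight and the threshold are invented for this example; real filters evaluate far more signals and do not work like this simple sum. It only shows why removing one change can flip the result without that change being “the” cause.

```python
# Hypothetical factors and weights, invented purely for illustration.
WEIGHTS = {
    "new_recipient_targeting": 2.0,
    "extra_urls": 1.5,
    "image_heavy_layout": 1.5,
    "free_in_subject": 1.0,
}
BULK_THRESHOLD = 5.5  # made-up cutoff: at or above this, mail goes to bulk


def score(factors):
    """Add up the weights of every factor present in the campaign."""
    return sum(WEIGHTS[name] for name in factors)


def goes_to_bulk(factors):
    """True when the combined score reaches the made-up threshold."""
    return score(factors) >= BULK_THRESHOLD


if __name__ == "__main__":
    campaign = set(WEIGHTS)  # the marketer made all four changes at once
    print("all changes:", goes_to_bulk(campaign))
    for name in sorted(WEIGHTS):
        # Undo one change at a time and see where the mail lands.
        print("without", name, "->", goes_to_bulk(campaign - {name}))
```

In this toy setup the campaign with all four changes lands in bulk, and undoing any single change gets it delivered. The marketer who only tried removing free would see exactly the pattern described above, even though free carries the smallest weight here.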
There are, of course, some single-factor filters. Filters that listen to p=reject requests can and do discard mail that fails authentication. Virus filters will often discard mail if they detect a virus in the mail. Filters that use blocklists will discard mail simply due to a listing on the blocklist.
Those filters address the easy mail. They leave the hard decisions to the more complex filters. Most of those filters are a lot more accurate than we are at matching patterns. We pattern matching primates want to see patterns and so we find them.
 
