Bounce handling is hard

Sometimes I find it hard to find a new topic to write about. I decide I’m going to write about X and then realize I did, often more than once. Other times I think I can blog about some issue only to realize that it’s too complex to handle in a quick post. There are concepts or issues that need background or I have to work a little harder to explain them.
One thing I haven’t blogged about before is bounce handling. That particular topic falls into the other category of posts that take a lot of time to write and need a significant amount of work to make sense. I was even joking with my fellow panel members at EEC a few months ago about how that’s a post that so needs to be written but I’m avoiding it because it’s so hard. There’s so much to be conceptualized and explained and I realize it’s not a blog post but multiple blog posts, or a white paper or even a book.
So let’s start with some simple definitions. Those of you who work at ISPs are probably thinking of bounces in terms of accept then reject; that’s not exactly what I’m talking about here. I’m writing these for senders, who usually call rejects during the SMTP transaction bounces.

What is bounce handling?

In the bulk mail space, bounce handling describes the process of what to do with future email to addresses that have not accepted one or more emails.
Most ESPs and senders segregate bounces into two categories: hard and soft. I’ll be honest, I’ve never been able to get a good definition for what a hard bounce vs. a soft bounce is in this context. I think every ESP defines them a little differently. So what I’m going to do here is describe them as I understand most people to use them. Anyone with different ideas or definitions, feel free to address it in comments.

What is a hard bounce?

A hard bounce is an email address that is not accepting email and is unlikely to accept email in the future. The most common example is address not found or unknown user responses from the ISP.

What is a soft bounce?

A soft bounce is an email address that did not accept an email but is likely to accept email in the future. These can be things like spam blocks or temp failures or rate limiting by the receiving mail server.

That sounds so simple.

Well, yeah, it does. But at what point do you make the determination that an address is good or bad? We’re trying to make decisions about what to do in the future based on SMTP response codes. SMTP response codes do not address what to do with future mail to an email address. They just tell us what to do with that message.
In order to deal with bulk email, senders take the SMTP response codes and the text of the response and try to infer what to do with future emails. Each ESP and bulk SMTP server handles the interpretation differently. Incidentally, the interpretation of SMTP codes for future mails is not a feature found in most of the open source SMTP software. That software implements the RFCs. Anyone using these packages for bulk mail needs to build their own bounce handling. That’s part of why open source servers are a bad match for bulk mail.

What are SMTP response codes?

SMTP response codes are the ways mail servers communicate while sending mail. A sending server basically issues a command (HELO, EHLO, MAIL FROM, DATA) and the receiving server responds to those commands with 3 digit codes. The first number in the code is (with one exception) a 2, 4 or 5.
Any response that starts with a 2 means: Yup! We’re good! Receiving servers respond with 2 codes throughout the SMTP transaction. After the sending server completes the send and says “that’s the whole email” the final 2xx means that the message was received and it’s no longer the sending server’s responsibility. Usually these are 250 responses, but there are others.
Any response that starts with a 5 means: Woah! Stop right there! This response can be due to a number of things from the address not existing to a protocol error to a spam block. Errors starting with a 5 can also happen any time during the SMTP transaction. When receiving a 5xx the sending server should stop the transaction and not try to send that message again.
Any response that starts with a 4 means: Stop, briefly, let me catch my breath! These are temporary errors. When the sending server gets a 4xx code, it should stop the transaction, queue the mail and attempt it in the future.
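As a rough illustration of the three classes above, here is a minimal Python sketch that maps a reply code to what the sending server should do with the current message. The function name and the action labels are my own inventions, not part of any standard or library. The 3xx branch covers intermediate replies such as the 354 a receiver sends after DATA, which is presumably the exception mentioned earlier.

```python
def action_for_current_message(reply_code: int) -> str:
    """Map an SMTP reply code to what to do with *this* message only.

    The returned labels are illustrative names, not standard terms.
    """
    first_digit = reply_code // 100
    if first_digit == 2:
        return "accepted"     # e.g. 250: the command succeeded
    if first_digit == 3:
        return "continue"     # e.g. 354 after DATA: go ahead and send the body
    if first_digit == 4:
        return "retry_later"  # temporary failure: queue the mail and reattempt
    if first_digit == 5:
        return "give_up"      # permanent failure: do not resend this message
    raise ValueError(f"not a valid SMTP reply code: {reply_code}")
```

Note that nothing here says anything about the next message to the same address. That gap is the whole problem.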

A hard bounce starts with 5 and a soft starts with 4, right?

Sorta. In the SMTP world a 5xx is a hard bounce and means that message will never be delivered. A 4xx is a soft bounce and means the message can be queued and reattempted at a later date. In the bulk mail world a hard bounce means that no further mail should ever be sent to that address from that sender. A soft bounce means that future emails can be sent to that address. Because we’re trying to map apples onto oranges, there are some grotty corners.
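To show where the apples and oranges collide, here is a hedged sketch of the kind of inference a sender might make about future mail. The phrase lists, the function name and the fallback rules are all assumptions made up for illustration; every ESP does this differently and real rejection messages are worded in many more ways.

```python
import re

# Hypothetical phrase lists, for illustration only.
UNKNOWN_USER = re.compile(
    r"user unknown|unknown user|no such user|address not found", re.I
)
BLOCK_OR_LIMIT = re.compile(r"spam|blocked|rate limit|too many", re.I)


def classify_for_future_mail(reply_code: int, reply_text: str) -> str:
    """Guess what to do with *future* mail to an address.

    This is a sender-side inference, not something SMTP defines.
    """
    if 200 <= reply_code < 300:
        return "delivered"
    if UNKNOWN_USER.search(reply_text):
        return "hard"  # address is unlikely to ever accept mail
    if 400 <= reply_code < 500 or BLOCK_OR_LIMIT.search(reply_text):
        return "soft"  # address may well accept mail later
    if 500 <= reply_code < 600:
        return "hard"  # fallback for other 5xx replies; a policy choice
    return "unknown"


print(classify_for_future_mail(550, "5.1.1 User unknown"))
print(classify_for_future_mail(554, "Message blocked as spam"))
```

The second call is one of the grotty corners: the reply is a 5xx, so SMTP says this message is dead, but a spam block says nothing about whether the address exists, so this sketch treats it as soft.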

You lost me.

I think I lost me. (This is usually the point where I start pacing around the office and deciding this is not a blog post I want to write. I hit that about half way through the 2xx description…)
Here’s the thing, there is no published standard for how a receiver should alert a sender as to what to do with future emails to an email address that’s currently undeliverable. All the RFCs talk about is what to do with the current message. We’ve tried to interpret those messages to make sane decisions about how to send mail. But there is no right way to do it.

Let’s add new codes!

Yeah, no. That’s not going to work. First, some folks have proposed some changes in the past, and that’s never gotten anywhere through the IETF. Second, it’s complicated. I can come up with half a dozen reasons for why this is a challenge. Some of them start with agreeing on the problem space. Others involve having to update SMTP services across the internet. It’s hard and it’s complicated, and email is such an entrenched protocol that making substantive changes to the SMTP transaction is really a non-starter.

So… where now?

ESPs do their best at classifying response codes and phrasing to make good decisions for what to do with future email. But there is no real right way to do it. Everyone processes bounces a little differently. Sometimes addresses that have bounced off a list will still be deliverable. It happens. There are any number of “right” things to do with the address, depending on why it initially bounced off the list.
There is no one way to do things, but the better informed you are about how your ESP handles bounces the better you can deal with the issues.

Related Posts

Bounces, complaints and metrics

In the email delivery space there are a lot of numbers we talk about including bounce rates, complaint rates, acceptance rates and inbox delivery rates. These are all good numbers to tell us about a particular campaign or mailing list. Usually these metrics all track together. Low bounce rates and low complaint rates correlate with high delivery rates and high inbox placement.


20% of email doesn't make it to the inbox

Return Path released their global delivery report for the second half of 2009. To put together the report, they look at mail delivery to the Mailbox Monitor accounts at 131 different ISPs for 600,000+ sends. In the US, 20% of the email sent by Mailbox Monitor customers to Return Path seed accounts doesn’t make it to the inbox. In fact, 16% of the email just disappears.
I’ve blogged in the past about previous Return Path deliverability studies. The recommendations and comments in those previous posts still apply. Senders must pay attention to engagement, permission, complaints and other policy issues. But none of those things really explain why email is missing.
Why is so much mail disappearing? It doesn’t match with the philosophy of the ISPs. Most ISPs do their best to deliver email that they accept and I don’t really expect that ISPs are starting to hard block so many Return Path customers in the middle of a send. The real clue came looking at the Yahoo numbers. Yahoo is one of those ISPs that does not delete mail they have accepted, but does slow down senders. Other ISPs are following Yahoo’s lead and using temporary failures as a way to regulate and limit email sent by senders with poor to inadequate reputations. They aren’t blocking the senders outright, but they are issuing lots of 4xx “come back later” messages.
What is supposed to happen when an ISP issues a 4xx message during the SMTP transaction is that email should be queued and retried. Modern bulk MTAs (MessageSystems, Port25, Strongmail) allow senders to fine tune bounce handling, and designate how many times an email is retried, even allowing no retries on a temporary failure.
What if the missing mail is a result of senders aggressively handling 4xx messages? Some of the companies I’ve consulted for delete email addresses from mailing lists after 2 or 3 4xx responses. Other companies only retry for 12 – 24 hours and then the email is treated as hard bounced.
Return Path is reporting this as a delivery failure, and the tone of discussion I’m seeing seems to be blaming ISPs for overly aggressive spam filtering. I don’t really think it’s entirely an ISP problem, though. I think it is indicative of poor practices on the part of senders. Not just the obvious permission and engagement issues that many senders deal with, but also poor policy on handling bounces. Perhaps the policy is fine, but the implementation doesn’t reflect the stated policy. Maybe they’re relying on defaults from their MTA vendor.
In any case, this is yet another example of how senders are in control of their delivery problems. Better bounce handling for temporary failures would lower the amount of email that never makes it to the ISP. This isn’t sufficient for 100% inbox placement, but if the email is never handed off to the ISP it is impossible for that email to make it to the inbox.
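To make the difference between aggressive and patient handling of 4xx responses concrete, here is a small Python sketch of a configurable policy. The class, its field names and its default values are assumptions for illustration only. They are not recommendations and not the settings of any real MTA.

```python
from dataclasses import dataclass


@dataclass
class SoftBouncePolicy:
    # Assumed defaults, chosen only to contrast with the aggressive example.
    retry_window_hours: float = 72.0  # how long to keep retrying one message
    max_soft_bounces: int = 5         # 4xx responses tolerated before removal

    def should_retry(self, hours_since_first_attempt: float) -> bool:
        """Keep the message queued while it is inside the retry window."""
        return hours_since_first_attempt < self.retry_window_hours

    def should_remove(self, soft_bounce_count: int) -> bool:
        """Drop the address from the list once the threshold is reached."""
        return soft_bounce_count >= self.max_soft_bounces


# The aggressive end of what is described above: 12 hours, 2 responses.
aggressive = SoftBouncePolicy(retry_window_hours=12, max_soft_bounces=2)
patient = SoftBouncePolicy()

print(aggressive.should_retry(20), patient.should_retry(20))
print(aggressive.should_remove(2), patient.should_remove(2))
```

With the aggressive settings, a message that has been deferred for 20 hours is abandoned and the address is removed after its second 4xx, even though the receiver only asked the sender to come back later.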


Yahoo DMARC articles worth reading

There are a bunch of them and they’re all worth reading.
I have more to say about DMARC, both in terms of advice for senders and list managers affected by this, and in terms of the broader implications of this policy decision. But those articles are going to take me a little longer to write.
How widespread is the problem? Andrew Barrett publishes numbers, pulled from his employer, related to the number of senders using @yahoo.com addresses in their commercial emails. Short version: a low percentage but a lot of users and emails in raw numbers.
What can mailing list managers do? Right now the two answers seem to be stop Yahoo.com addresses from posting or fix your mailing list software. Al has posted how he patched his software to cope, and linked to a post by OnlineGroups.net about how they patched their software.
A number of people are recommending adding an Original-Authentication-Results header as recommended in the DMARC.org FAQ. I’m looking for more information about how that would work.
For commercial mailers, there doesn’t seem to be that much to do except to not use an @yahoo.com address as your header-From address. Yes, this may affect delivery while you’re switching to the new From address, but right now your mail isn’t going to any mailbox provider that implements DMARC checking.
One other thing that commercial mailers and ESPs should be aware of: depending on your bounce handling processes, this may cause other addresses to bounce off the list. Once the issue of the header-From address is settled, you can reactivate addresses that bounced off the list due to authentication failures since April 4.
 
