Sometimes I find it hard to find a new topic to write about. I decide I’m going to write about X and then realize I did, often more than once. Other times I think I can blog about some issue only to realize that it’s too complex to handle in a quick post. There are concepts or issues that need background or I have to work a little harder to explain them.
One thing I haven’t blogged about before is bounce handling. That particular topic falls into the other category of posts that take a lot of time to write and need a significant amount of work to make sense. I was even joking with my fellow panel members at EEC a few months ago about how that’s a post that so needs to be written but I’m avoiding it because it’s so hard. There’s so much to be conceptualized and explained and I realize it’s not a blog post but multiple blog posts, or a white paper or even a book.
So let’s start with some simple definitions. Those of you who work at ISPs are probably thinking of bounces in terms of accept then reject; that’s not exactly what I’m talking about here. I’m writing this for senders, who usually call rejects during the SMTP transaction bounces.
What is bounce handling?
In the bulk mail space, bounce handling describes the process of what to do with future email to addresses that have not accepted one or more emails.
Most ESPs and senders segregate bounces into two categories: hard and soft. I’ll be honest, I’ve never been able to get a good definition for what a hard bounce vs. a soft bounce is in this context. I think every ESP defines them a little differently. So what I’m going to do here is describe them as I understand most people to use them. Anyone with different ideas or definitions, feel free to address it in comments.
What is a hard bounce?
A hard bounce is an email address that is not accepting email and is unlikely to accept email in the future. The most common example is address not found or unknown user responses from the ISP.
What is a soft bounce?
A soft bounce is an email address that did not accept an email but is likely to accept email in the future. These can be things like spam blocks or temp failures or rate limiting by the receiving mail server.
That sounds so simple.
Well, yeah, it does. But at what point do you make the determination that an address is good or bad? We’re trying to make decisions about what to do in the future based on SMTP response codes. SMTP response codes do not address what to do with future mail to an email address. They just tell us what to do with that message.
In order to deal with bulk email, senders take the SMTP response codes and the text of the response and try to infer what to do with future emails. Each ESP and bulk SMTP server handles the interpretation differently. Incidentally, the interpretation of SMTP codes for future mails is not a feature found in most of the open source SMTP software. That software implements the RFCs. Anyone using these packages for bulk mail needs to build their own bounce handling. That’s part of why open source servers are a bad match for bulk mail.
What are SMTP response codes?
SMTP response codes are the ways mail servers communicate while sending mail. A sending server basically issues a command (HELO, EHLO, MAIL FROM, DATA) and the receiving server responds to those commands with three-digit codes. The first number in the code is (with one exception) a 2, 4 or 5.
Any response that starts with a 2 means: Yup! We’re good! Receiving servers respond with 2 codes throughout the SMTP transaction. After the sending server completes the send and says “that’s the whole email” the final 2xx means that the message was received and it’s no longer the sending server’s responsibility. Usually these are 250 responses, but there are others.
Any response that starts with a 5 means: Woah! Stop right there! This response can be due to a number of things from the address not existing to a protocol error to a spam block. Errors starting with a 5 can also happen any time during the SMTP transaction. When receiving a 5xx the sending server should stop the transaction and not try to send that message again.
Any response that starts with a 4 means: Stop, briefly, let me catch my breath! These are temporary errors. When the sending server gets a 4xx code, it should stop the transaction, queue the mail and attempt it in the future.
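To make the three classes above concrete, here is a minimal sketch of how a sender might bucket a response code by its first digit and decide what to do with the current message. The function names and the action strings are my own placeholders, not anything defined by SMTP or by a particular mail server. The 3 class is included for the one exception mentioned earlier: the 354 reply to DATA, which means "go ahead and send the message body".

```python
def reply_class(code: int) -> str:
    """Map a three-digit SMTP response code to its broad class by first digit."""
    if not 200 <= code <= 599:
        raise ValueError(f"not an SMTP response code: {code}")
    return {
        2: "success",
        3: "intermediate",  # e.g. 354 after DATA: send the message body
        4: "temporary failure",
        5: "permanent failure",
    }[code // 100]


def action_for_message(code: int) -> str:
    """What the sending server does with the *current* message only."""
    return {
        "success": "continue",
        "intermediate": "continue",
        "temporary failure": "queue and retry later",
        "permanent failure": "stop and do not retry this message",
    }[reply_class(code)]


if __name__ == "__main__":
    for code in (250, 354, 451, 550):
        print(code, reply_class(code), "->", action_for_message(code))
```

Note that nothing here says anything about the address itself. It only covers the message in hand, which is all the response code is defined to cover.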
A hard bounce starts with 5 and a soft starts with 4, right?
Sorta. In the SMTP world a 5xx is a hard bounce and means that message will never be delivered. A 4xx is a soft bounce and means the message can be queued and reattempted at a later date. In the bulk mail world a hard bounce means that no further mail should ever be sent to that address from that sender. A soft bounce means that future emails can be sent to that address. Because we’re trying to map apples onto oranges, there are some grotty corners.
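One way to picture the apples-and-oranges problem is to look at the address-level decision separately from the message-level one. The sketch below is purely illustrative: it assumes the sender has already labeled each attempt as "delivered", "soft" or "hard", and it assumes a policy where several soft bounces in a row eventually suppress the address. The `SOFT_BOUNCE_LIMIT` value, the `AddressState` class and the `record_result` function are all made up for this example; every ESP picks its own rules.

```python
from dataclasses import dataclass

SOFT_BOUNCE_LIMIT = 3  # hypothetical threshold, not a standard value


@dataclass
class AddressState:
    consecutive_soft: int = 0
    suppressed: bool = False  # True means: send no further mail to this address


def record_result(state: AddressState, kind: str) -> AddressState:
    """Update one address's state after a send attempt.

    kind is "delivered", "soft" or "hard", as already classified by the sender.
    """
    if kind == "delivered":
        state.consecutive_soft = 0
    elif kind == "soft":
        state.consecutive_soft += 1
        if state.consecutive_soft >= SOFT_BOUNCE_LIMIT:
            state.suppressed = True
    elif kind == "hard":
        state.suppressed = True
    else:
        raise ValueError(f"unknown result kind: {kind}")
    return state


if __name__ == "__main__":
    state = AddressState()
    for kind in ("soft", "soft", "delivered", "soft"):
        record_result(state, kind)
    print(state)
```

The point of the sketch is that "soft" and "hard" here are decisions the sender makes about future mail. The SMTP code is only one input to that decision.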
You lost me.
I think I lost me. (This is usually the point where I start pacing around the office and deciding this is not a blog post I want to write. I hit that about half way through the 2xx description…)
Here’s the thing: there is no published standard for how a receiver should alert a sender as to what to do with future emails to an email address that’s currently undeliverable. All the RFCs talk about is what to do with the current message. We’ve tried to interpret those messages to make sane decisions about how to send mail. But there is no right way to do it.
Let’s add new codes!
Yeah, no. That’s not going to work. First, some folks have proposed some changes in the past, and that’s never gotten anywhere through the IETF. Second, it’s complicated. I can come up with half a dozen reasons for why this is a challenge. Some of them start with agreeing on the problem space. Others involve having to update SMTP services across the internet. It’s hard and it’s complicated, and email is such an entrenched protocol that making substantive changes to the SMTP transaction is really a non-starter.
So… where now?
ESPs do their best at classifying response codes and phrasing to make good decisions for what to do with future email. But there is no real right way to do it. Everyone processes bounces a little differently. Sometimes addresses that have bounced off a list will still be deliverable. It happens. There are any number of “right” things to do with the address, depending on why it initially bounced off the list.
There is no one way to do things, but the better informed you are about how your ESP handles bounces the better you can deal with the issues.
Actually, the IETF adds new SMTP status codes all the time. I got 521 and 556 for null MX, and gmail at least says they use them.
What you’re not going to get is new response codes to make life easier for ESPs, and in particular you’ll never see a code that means “this is spam but if you change it a little and try again maybe it won’t be”.
I was thinking a new class of codes, rather than just some tweaks to the 2nd and 3rd digits. Something like 6 series for “never mail this address again.” Adding a new first digit makes it a lot harder for interoperability, IMO.
You could probably do it with just one new 5xx code that explicitly means “never email this user again”, that existing mailservers could see as a hard bounce and people could code for in the future. …but that’d hand some of the cranks another tool to break their email with.
This seems like a good topic for a shouting match in a hotel bar.
Looking forward to your discussion of what senders should do when faced with a situation where there is no one to give you a SMTP response code, aka NXDOMAIN. The mohawk-haired guy from Optivo thought it was a no-brainer. Apparently few others do, which is also the general sentiment I got in Dublin…
The discussion in Dublin was less about what to do with an NXDomain and more about how to season a spamtrap feed. Folks running spamtraps Should Not use NXDomain as a way to season.
An example of a hard bounce that is not permanent is one caused by an SPF problem with the sender’s email address.
ESPs need to treat that very differently to a hard bounce such as ‘recipient does not exist’.
In the case of the former, we need to advise the customer to correct their SPF record and try again. In the case of the latter, we need to actively block attempts to send to that email address again.
The only current way to do this is to analyze the text of the 5xx response and use our own rules.
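As a rough illustration of what "analyze the text and use our own rules" can look like, here is a minimal sketch. The patterns, the category names and the sample response strings are all made up for the example; real response text varies a great deal between receivers, so an actual rule set would be much longer and would need regular tuning.

```python
import re

# Hypothetical rules, checked in order: (pattern, category).
RULES = [
    (re.compile(r"\bspf\b", re.IGNORECASE), "sender-config"),
    (
        re.compile(
            r"user unknown|unknown user|no such user|does not exist|address not found",
            re.IGNORECASE,
        ),
        "bad-address",
    ),
]


def classify_5xx(text: str) -> str:
    """Return a category for the text of a 5xx response, or "unclassified"."""
    for pattern, category in RULES:
        if pattern.search(text):
            return category
    return "unclassified"


if __name__ == "__main__":
    samples = [
        "550 5.7.1 SPF check failed for example.com",  # invented sample text
        "550 5.1.1 User unknown",                      # invented sample text
        "554 Transaction failed",                      # invented sample text
    ]
    for line in samples:
        print(classify_5xx(line), "<-", line)
```

In this sketch a "sender-config" result would prompt advice to the customer to fix their SPF record and retry, while "bad-address" would block further sends to that address. Anything "unclassified" still needs a default policy.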
I don’t think it is very hard. We use Port25 and I have written our own bounce/FBL/unsubscribe handlers which integrate with Port25.
All of this runs in real-time and is available for free on Github – https://github.com/magicdude4eva/port25-bouncehandler
[…] less obvious difference is in how the ESP processes and categorizes bounces. Since many receivers send unclear or deceptive bounce replies, ESPs fine-tune their bounce processing to make sure responses are categorized correctly. This […]