One of the delivery challenges that regularly comes up in various delivery discussion spaces is the “Gmail suddenly put my mail in spam” complaint. From my perspective, there is rarely a “suddenly” about Gmail’s decision-making process.
As I was answering one of these questions I had a number of thoughts. I’ll share them here on the blog so I can find them in the future.
The first thing that occurred was that I’d shifted my thinking to considering Gmail filtering, in particular, as a filter based on a mailstream, not on a message. Thus my tweet:
The other was a realization that most people don’t consider how what they’re doing might be not OK, but might not be bad enough to be filtered this time. That not OK behaviour builds up over time and eventually tips into mail being filtered. Not because of one message, but because of the dozen or 40 or 75 or hundred messages behind it.
Gmail is really good about just watching and monitoring and watching and scoring. They measure mailstreams over time, and reputation is the sum of all your sends. There may be nothing different or new about a particular send, but Gmail’s been seeing overall reputation decrease just a little bit every time you send.
To put it more succinctly: Senders see themselves “doing the same things” and don’t realise that every time they do this, they’re slowly eroding their reputation with Gmail. Once the erosion hits the tipping point, mail starts going to the bulk folder.
As I explained it on Facebook.
The effect (moving mail to spam) happens suddenly. The cause builds up over time. Gmail is really good about just watching and monitoring and watching and scoring. They measure mailstreams over time; they don’t really measure individual sends.
What happens is that Gmail starts moving mail sent to people who don’t engage with it to the bulk folder. Your open rates don’t change because these people aren’t opening the mail anyway. But every message delivered to the bulk folder is a ding on your reputation. Those dings build up until your reputation hits a tipping point and mail to some of your engaged users sometimes goes to bulk. You might see a small decrease in open rates, but nothing major. You continue mailing as you are and then “all of a sudden” your mail is going to bulk. But it’s just mail to the people who were receiving it in the inbox before. The majority of your mail was always going to bulk.
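The tipping-point behaviour described here can be illustrated with a toy model. This is only a sketch: the starting score, the size of each ding and the threshold are invented numbers, and the function names are hypothetical. It is not how Gmail actually scores mail. It only shows how a steady, unchanging cause produces an effect that looks sudden.

```python
def simulate_reputation(sends, start=100, ding=2, tipping_point=50):
    """Toy model: each send costs a few reputation points.

    All numbers are invented for illustration only.
    Returns a list of (send_number, reputation, inboxed) tuples.
    """
    history = []
    reputation = start
    for n in range(1, sends + 1):
        # The sender does "the same thing" every time, so the ding is constant.
        reputation = max(0, reputation - ding)
        # The visible outcome is binary, even though the score moves gradually.
        inboxed = reputation > tipping_point
        history.append((n, reputation, inboxed))
    return history


def first_bulked_send(history):
    """Return the send number at which mail first lands in bulk, or None."""
    for n, _, inboxed in history:
        if not inboxed:
            return n
    return None
```

In this model nothing about send 25 differs from send 24. The only difference is the accumulated history behind it.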
This is why any changes you might make to mail, changing an IP or a domain name or using a slightly different format, can sometimes work for a little while. It’s basically decoupling that individual message from the broader history of sending. So you can keep doing the same things over and over again and not hit the inbox.
If, however, you don’t fix the underlying problem, one of two things will happen. Google will connect the new mailstream with the old mailstream and your reputation will fall to the old reputation in a few days. Alternatively, Google won’t connect the new mailstream with the old mailstream and you’ll have many months to slowly erode the reset reputation. In either case it’s just a matter of time before you end up back in the bulk folder again.
Gmail looks at the whole message stream when making decisions. In order to improve delivery at Gmail we must also look at the whole message stream.
I’m seeing some questions about TLS and Gmail. Folks are seeing a correlation between sending without TLS and the mail going to bulk.
Has anyone seen this? Are you sending mail with TLS and can’t get to the inbox? Or are you sending mail without TLS and getting to the inbox?
Inquiring minds and all that.
Many people believe that if they remove non-existent addresses from their mailing lists, their mail will make it to the inbox without a problem. In fact, an entire industry has grown up around the idea that sending mail to valid addresses can never be spam. This isn’t true, of course. Spammers use many of the same techniques legitimate mailers do to clean their lists.
I don’t think it’s much of a secret that I don’t have time for many of the data hygiene companies. I think they are selling something that the vast majority of companies don’t need. Furthermore, at least in the beginning, many of them actively abused mailbox providers to gather the data they were selling.
Why did data hygiene become such a thing? Because one way that mailbox providers could identify spam is to look at the number of non-existent email addresses a particular IP was attempting to send to. Too many non-existent email addresses from an IP, and the IP was blocked and mail from that IP went to the bulk folder or was discarded.
Once it was clear that non-existent addresses were a metric used by mailbox providers, it became a target for senders to hit. This led to an entire industry designed to help senders hit this target of low bounces.
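The kind of per-IP measurement described above might look something like the sketch below. The log format, the function names and the 10% threshold are all assumptions made for illustration; no mailbox provider publishes its actual logic or limits.

```python
from collections import defaultdict


def unknown_user_rate_by_ip(delivery_log):
    """Compute the share of attempts to non-existent addresses per sending IP.

    delivery_log is a hypothetical iterable of (ip, recipient_exists) pairs.
    """
    attempts = defaultdict(int)
    unknown = defaultdict(int)
    for ip, recipient_exists in delivery_log:
        attempts[ip] += 1
        if not recipient_exists:
            unknown[ip] += 1
    return {ip: unknown[ip] / attempts[ip] for ip in attempts}


def ips_over_threshold(rates, threshold=0.10):
    """Return IPs whose unknown-user rate exceeds an (invented) threshold."""
    return sorted(ip for ip, rate in rates.items() if rate > threshold)
```

Note what this metric cannot see: a list with every dead address stripped out scores 0.0 here, whether or not anyone on it asked for the mail.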
There are multiple fallacies all wrapped up in the data hygiene business model.
Fallacy 1: Spam never has a low bounce rate. This is untrue. In fact, spammers were some of the first groups to clean lists of bounces. Some of the early data hygiene companies even grew out of spammers offering services to other spammers.
Fallacy 2: Spam always has a high bounce rate. See above.
Fallacy 3: Filters act on mail with high bounce rates, so if I take off bad addresses then my mail will get to the inbox. Attempting to send mail to non-existent addresses is not, inherently, a bad thing. It happens. In the days before address books it was actually pretty common. The issue is that a list with lots of non-existent addresses on it often lacks permission.
The bounces are not the problem. The problem is all the other addresses on the list that never asked for the mail. And this is the reason I have such a problem with the majority of data hygiene companies. Their business model is to remove all signals that a list might be bad, without actually doing anything to make the list good.
Most data hygiene is a waste of money. There are better ways to ensure a good quality list that don’t involve handing over address lists to third parties.
I’m working on a blog post about correlation and causation and how cleaning a list doesn’t make it opt-in and permission isn’t actually as outdated as many think and is still important when it comes to delivery. Today is a hard-to-word day, so I headed over to twitter. Only to find someone in my personal network re-tweeted this:
I don’t think I can sum it up much better than that.
Explicit consent to receive mail Is Still Important in order to reach the inbox. Anything else is just sending spam.
All too often folks come to me with delivery problems and lead off with all of the things they’ve done to send mail right. They assure me they’re using SPF and DKIM and DMARC and they can’t understand why things are bad. There is this pervasive belief that if you do all the technical things right then you will reach the inbox.
Getting the technical bits right is an important part of demonstrating you’re a legitimate sender but it’s not, on its own, sufficient to reach the inbox. All you need to do is look at some of the mail in your junk folder to see that even companies with full DMARC can sometimes reach the spam folder (the Uber example, again).
To put it another way, spammers regularly get all the technical bits right and implement best practices, often in better ways than actual companies. Their mail still goes to the spam folder because, well, it’s spam. They even do things like pass lists through data hygiene companies and sometimes even pay attention to engagement on some levels.
What really drives delivery, particularly at the consumer mailbox providers, is engagement.
The big drivers of engagement are having permission to send email and sending mail users want to receive and interact with.
Authentication is there so that the filtering engines know what mail is actually from you. It allows them to be really harsh on spam forging your domain or sent without your authority and still delivering your legitimate mail to the inbox. If your mail is fully authenticated and still going to the bulk folder, then the problem is related to your email. Something you’re doing, whether it’s a permission problem or an engagement problem or whatever, is making the filters think your mail isn’t wanted.
Fixing authentication isn’t going to fix delivery problems caused by authenticated email.
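To see what authentication does and does not tell you, it can help to read the results the receiving system recorded on a message that landed in bulk. Below is a minimal sketch that pulls the SPF, DKIM and DMARC verdicts out of an Authentication-Results header. The regular expression is a deliberate simplification of the real header grammar, and the sample message and function names are made up for illustration.

```python
import re
from email import message_from_string

# Simplified pattern: matches "spf=pass", "dkim=fail", "dmarc=none", etc.
METHOD_RESULT = re.compile(r"\b(spf|dkim|dmarc)=(\w+)", re.IGNORECASE)

# Invented sample message, using reserved example domains.
SAMPLE = (
    "Authentication-Results: mx.example.net;\n"
    " spf=pass smtp.mailfrom=news.example.com;\n"
    " dkim=pass header.d=example.com;\n"
    " dmarc=pass header.from=example.com\n"
    "From: News <news@example.com>\n"
    "Subject: Hello\n"
    "\n"
    "Body\n"
)


def auth_results(raw_message):
    """Return e.g. {'spf': 'pass', 'dkim': 'pass', 'dmarc': 'pass'}."""
    msg = message_from_string(raw_message)
    results = {}
    for header in msg.get_all("Authentication-Results", []):
        for method, result in METHOD_RESULT.findall(str(header)):
            # Keep the first verdict seen for each method.
            results.setdefault(method.lower(), result.lower())
    return results


def fully_authenticated(results):
    """True only if SPF, DKIM and DMARC all report 'pass'."""
    return all(results.get(m) == "pass" for m in ("spf", "dkim", "dmarc"))
```

If `fully_authenticated(auth_results(raw))` is true for a message sitting in the spam folder, the filter knows exactly who sent it. The placement decision came from something else.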
An email address was entered into our website.
An email address was associated with a purchase on our website.
We have a relationship with a 3rd party that shares email addresses with us.
We have a cookie on a web browser that visited our website and we sent an email to the address associated with that cookie.
We both went to the same conference and the attendee list was given to every exhibitor.
One of our employees has a connection with this person on LinkedIn.
They liked our Facebook page.
They commented on our Instagram feed.
They followed us on Twitter.
We have a legitimate interest under GDPR to send you email about our products.
The email address is published on a website as a contact point.
Have you heard about the Baader-Meinhof effect?
The Baader-Meinhof effect, also known as frequency illusion, is the illusion in which a word, a name, or other thing that has recently come to one’s attention suddenly seems to appear with improbable frequency shortly afterwards (not to be confused with the recency illusion or selection bias). Baader–Meinhof effect at Wikipedia
There has to be a corollary for email. For instance, over the last week or so I’ve gotten an influx of questions about how to fix delivery for one to one email. Some have been from clients: “Oh, while we’re at it… this happened.” Others have been from groups I’m associated with: “I sent this message and it ended up in spam.”
The challenge is, what we do to fix delivery of bulk mail doesn’t really apply to one to one mail. The underlying theory is the same: Send mail people expect to receive and if it gets delivered to the bulk folder have them go fish it out. But when we are sending bulk mail we have a whole population of recipients to work with. When we’re sending one to one mail we only have one person to work with.
Most people don’t know what their filters do under the covers. I know we have a fairly stock install of SpamAssassin and there are some Bayesian filters built into mail.app. It’s pretty easy to ID why something was filtered by SA; it tells you. But the built-in filters are a black box. All I know is that they learn from what mail I mark as spam.
I can see the results. For instance, almost every time I do a password reset the “here’s your temporary password” message ends up in my spam folder. Doesn’t really matter what provider it is or how regularly I get mail from the vendor or anything like that. If I’m doing the password reset dance then 90% of the time I have to go dig the message out of my spam folder.
It’s possible that I could reset the filters built into mail.app and have this mail come into the inbox. But that will also mean I have to go retrain my filters by manually sorting through the 40 to 50 spams that get through SpamAssassin every day. That’s tedious and not a lot of fun and there’s no guarantee that the filters won’t re-learn that password reset style messages are spam more often than not.
“Why did this actual, real, one-to-one message go to spam?” is a question we can almost never answer. Sure, sometimes a domain reputation is bad enough or the message is in the wrong language or there’s something blindingly obvious with the content that makes it clear why the message went to spam. But those cases are not as common as we may like. Sometimes the filters just decide this mail should be delivered there.
The point here is that a lot of what we do for deliverability works for bulk, because we’re managing populations and statistics and are sending enough mail that we can move the needle on machine learning. When we’re sending very small quantities of mail, we’re relying on individual users knowing why their mail is going to bulk. Almost no one does. It’s just gotten way too complex.
Which leaves us in a position where email is unreliable for some forms of communication. I don’t think this is a permanent status. I think we’re in a period of filtering changes where folks are trying lots of things to see what works. A decade ago it was whitelists and blacklists and FBLs and paid certification. Now, it’s machine learning and recipient behaviour and individualised inbox experiences.
Filters are continually adapting to spammers. Spammers are continually adapting to filters. This competition is driving rapid evolution on both sides. It’s like punctuated equilibrium for email.
Yesterday, Gmail announced they’re rolling out AMP support in their web client, with support for mobile coming soon.
AMP allows a more web-page-like experience in email. Things that would previously have required clicking through to a separate web page can now be done directly within the message.
According to the announcement, this feature is only available to senders who register with Google. The registration page has more information about requirements. Of particular interest is that DKIM must align in order to have AMP active in emails.
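For anyone unsure what “align” means here: the domain in the DKIM signature’s d= tag has to match the domain in the visible From: address. A rough sketch of that comparison follows. The function names are hypothetical, and the “relaxed” mode uses a naive last-two-labels guess at the organizational domain, which is wrong for suffixes like co.uk. A real check needs the Public Suffix List. What exactly Google verifies is spelled out in their registration requirements, not here.

```python
from email.utils import parseaddr


def naive_org_domain(domain):
    """Crude organizational-domain guess: keep the last two labels.

    Assumption for illustration only; real code should consult the
    Public Suffix List instead.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def dkim_aligned(from_header, dkim_d, strict=False):
    """Compare the From: domain with the DKIM d= domain."""
    _, address = parseaddr(from_header)
    from_domain = address.rpartition("@")[2].lower()
    dkim_d = dkim_d.lower()
    if not from_domain:
        return False
    if strict:
        # Strict alignment: the two domains must be identical.
        return from_domain == dkim_d
    # Relaxed alignment: the organizational domains must match.
    return naive_org_domain(from_domain) == naive_org_domain(dkim_d)
```

For example, mail from news@mail.example.com signed with d=example.com aligns in relaxed mode but not in strict mode, while the same mail signed only with an ESP’s own domain does not align at all.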
There is a belief among some folks that the way to get widespread DMARC support is to make it more appealing to marketers. That if marketing pushes for alignment then it’s much more likely to happen than if security pushes for it. In this case, alignment comes with significant and immediate benefits to marketers.
Both Microsoft and Verizon Media have announced they are also participating in AMP.
In the email system there are all sorts of different belief systems. One contingent will have you believe that IP reputation is the be all and end all of delivery. Get a decent IP reputation, and the clouds will part, angels will sing and your mail will reach the inbox. This group of folks often recommends every sender should have their own dedicated IP address. Anything less is just admitting your mail will never reach the inbox.
I’m not one of those people.
In the current environment there’s absolutely nothing wrong with sending off shared infrastructure. Filters, particularly those at Verizon Media, Gmail and Microsoft are good at sorting good mail from bad, even when that mail comes from the same IP address. This does, of course, assume the ESP commits to actively monitoring shared IPs and requiring customers to meet minimum standards. But any decent ESP is going to be doing that for dedicated IPs as well.
The B2B space is a little different and IP reputation may have more impact on delivery there, but I’m seeing that change, too.
If you’re going to be on a dedicated IP you need to be sending at least 50,000 emails a minimum of 3 times a week. And I think, these days, that’s low. I lean more towards dedicated IPs only for folks sending more than 1 million a day. Everyone else can sit on the shared IPs.
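As a rough illustration, that rule of thumb can be written down as a small check over one week of daily send volumes. The function name and the returned labels are hypothetical, and reading “more than 1 million a day” as a daily average is my interpretation, not a precise rule.

```python
def ip_recommendation(daily_volumes, min_per_send=50_000,
                      min_sends_per_week=3, preferred_daily=1_000_000):
    """Classify one week of daily volumes against the rule of thumb.

    daily_volumes: list of messages sent on each day of a single week.
    The thresholds default to the numbers quoted in the text.
    """
    if not daily_volumes:
        return "shared"
    # Days on which the send was large enough to count.
    qualifying_days = sum(1 for v in daily_volumes if v >= min_per_send)
    average_daily = sum(daily_volumes) / len(daily_volumes)
    if average_daily > preferred_daily:
        return "dedicated"
    if qualifying_days >= min_sends_per_week:
        return "meets the minimum"
    return "shared"
```

A sender doing three 60,000-message sends a week clears the traditional bar, but under the stricter preference would still be better off on shared IPs.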
Don’t believe me? SendGrid and Mailchimp each send millions of emails across shared IPs every day. These two successful companies don’t need to have every one of their customers on dedicated IPs, and still have great delivery.
This morning I got a rather suspicious message from a colleague on LinkedIn.
I asked around and it seems other folks got the same message and were equally confused. I didn’t click the link because that seemed risky. A few hours later one of the folks I had talked to mentioned that the person’s entire profile was gone. Likewise, the above message disappeared from my messages tab.
I’ll give LinkedIn credit. They acted quickly to remove the problematic content. I received the message at 9:30 this morning and by 1:30 the message and account were gone.
I often talk about how the email channel is very noisy and messy and ripe for abuse. But this incident shows that no channel is immune. And even authenticated channels can be subverted.
In the email space one attack vector is compromising individual user accounts and then using that account to send spam. Authentication is helpful, but as long as a third party can compromise an ESP account, we can’t rely solely on authentication to tell us what mail is good and what mail is bad.
In fact, there was a comment on mailop over the weekend where an individual said that spammers were much better at getting authentication right than legitimate mailers. And they authenticated at a much higher rate. I’m unsurprised, but glad to have someone actually say it.
It totally makes sense. Spammers deploy infrastructure for a few weeks or months, just long enough to get the spam out. They can automate the deployment with scripts – input the IP address, domains and the scripts create the website, DNS entries, set up the mail server and do all the things that need to be done.
Real companies don’t have it so simple. They’ve got to deal with legacy infrastructure, corporate policies, security, ESP functionality and a whole host of other things spammers don’t care about.