Address verification doesn’t fix any real problem

laura
Industry
February 17, 2020

Would you trust an address verification company that used twitter spam to advertise their product?

Would you trust an address verification company that sent email spam to advertise their product?

Would you trust an address verification company that sent email advertising their product to made up addresses?

Would you trust an address verification company that sent email advertising their product to spamtraps?

Would you trust an address verification company that sent email advertising their product to role accounts?

Would you trust an address verification company that uses a botnet as part of their business?

Would you trust an address verification company that sends spam to verify addresses?

Would you trust an address verification company that resells customer address data to other customers?

Would you trust an address verification company that lies about their relationship with blocklists like Spamhaus?

If you’re using an email address verification company, chances are they’re doing at least one of the above.

What is Address Verification?

Address verification is primarily a way to identify which addresses are deliverable. The simple way to verify an address is to start a normal mail transaction, tell the MX you are sending mail to a particular address and then wait for the MX to tell you if that email address is active or not. Once the MX has said the address exists or not, you can quit the email transaction and go on to the next MX.

This technique was pioneered by spammers back in the early 2000s. Known as dictionary attacks they were a way for spammers to identify which addresses might exist on a domain without having to spend any effort sending mail to non existent ones. This technique was also used by companies who sold addresses to ensure ‘high deliverability’ for their lists.

Dictionary attacks caused a number of problems for the ISPs, including tying up SMTP resources for hours or days at a time. ISPs quickly developed automated defences against these types of attacks. In fact, I suspect some of these defences are why spammers pivoted to selling address verification services.

But address verification isn’t a dictionary attack!

I can hear the protests now, just because they use the same underlying technology and intend to do the same thing, address verification isn’t the same as a dictionary attack. It’s not a problem! People aren’t doing it for the same reason. Well, except they are. The fundamental idea behind both processes is the same: to remove a non-existent address before sending mail to it. There is not even a real difference in intention here.

If ISPs wanted senders to verify addresses before sending to them, they would have never turned off VRFY. That’s right, there’s verification built right into the SMTP protocol. This was quickly abused and disabled at most ISPs. It’s not a far stretch to say ISPs don’t support this activity since they disabled VRFY and they implemented active measures to stop 3rd parties from probing their MX looking for active addresses.

Some ISPs, like Yahoo, have gone a step further. They moved all of their rejections, even ones for non-existent addresses, until after the entire email has been sent to them. It’s impossible for dictionary attacks to identify deliverable addresses at domains they manage.

So what’s the problem?

My biggest issue, though, is not that address verification is what spammers do, or that the ISPs don’t like it or that it’s borderline abusive when done in bulk. The reason I hate address verification is that is erases one of the most valuable pieces of data we have about how accurate our address collection process is.

Bounce rates give us an indication of how inaccurate our address collection process is. We know that every bounced address is an address that doesn’t belong to the person who entered it into the form. Whether it was a typo or a deliberately faked address it is a fact someone gave us an address that didn’t belong to them. Bad addresses are a way of monitoring data accuracy.

This doesn’t mean that every deliverable address is accurate. We have to assume that some of the deliverable addresses also don’t belong to the person who submitted it. Senders who use address verification are lulled into a false sense of security. They assume that a deliverable address is a correct address. This simply isn’t true.

An example of bad email verification. The is an email from the marketing manager at proofy.io to a spamtrap at wordtothewise advertising how accurate their verification process is.

To address the Yahoo problem and the deliverable problem, many address verification companies have moved on to new techniques. One of the things they’re doing now is collecting PII on recipients for resale. I’m not sure how honest they’re being with their customers about this. The words proprietary technology hide a multitude of sins and business practices.

Let’s be honest, though, no company actually has permission from the address owner to collect this data. Senders typically don’t have permission to share email addresses with 3rd parties for commercial gain. The third parties don’t have permission to collect data about the address. We all kind of hand wave around the issue, but the reality is, if consumers knew their activity was being tracked in this manner, many would object.

Data accuracy is vital

There’s a woman in Florida who thinks the gmail address I acquired in 2004 belongs to her. I get lots of things on her behalf: event tickets, forwards from her church groups, invites for Sunday brunch. A couple years ago an insurance company sent me all her documents. Do you know how much PII is in insurance documents? Address, date of birth, social security number, home value, next of kin… the list goes on and on.

The only thing typical address verification does is identify if an address is deliverable or not. It cannot say an address belongs to the person the sender thinks it does. If the insurance company did true data verification rather than simply identifying an address was deliverable, then K.S. who lives on the Loxahatchee River in Jupiter, Florida would have her insurance documents and I wouldn’t know anything about her.

Address verification doesn’t actually fix the real problem, it hides it. The real problem is data accuracy. With verification we can pretend our data is accurate, simply by removing the very visibly non-accurate data. But the much bigger issue is sending mail to addresses that don’t belong to our customers. The most problematic of those is where the email address belongs to a completely different person. Address verification does nothing to affect that, other than hide that our address collection process is broken.

Address verification has been aggressively sold as a way to make data more accurate, but it doesn’t. All standard verification does is hide very obvious data errors. So what can we do?

I know my anti-spammer friends are jumping to tell me that confirmed opt-in is the solution. They can just sit back down because COI has it’s own problems. As senders we can’t always trust that just because a link is clicked that we are connecting with the right person. One ESP tracked thousands of clicks on a confirmation link sent to a spamtrap. Various security appliances and filters follow links in emails. Even some spamtraps will follow links in messages, including confirmation links. COI is not the solution.

What next?

At some point, the email industry has conflated the issues of deliverability and data accuracy. They’ve assumed if their delivery is good their data must also be good. It’s never been true, though,

The first step of solving a problem is admitting there is one. Address verification lets senders of all sizes pretend there is no problem with their data collection. It hides the most visible examples of bad data, and lets companies ignore the true scope of the problem.

We need to measure data accuracy by the metric of data being accurate not by the metric of being deliverable. How we do that depends on how important or vital the data is. Banks, health care and insurance providers should incorporate much more verification into the data processes. But even receipts and tickets and ‘low value’ email gets misdirected all the time.

I wish I had a pre-packaged, easily saleable product that I could tell you would fix the issue. But there isn’t going to be one solution, nor is it going to apply to every situation. I certainly can’t sell it as a fix to deliverability problems. Let’s be honest, even perfectly accurate data is going to have delivery problems.

The challenge I pose to email marketers is to incorporate real data validation into your collection processes. Start paying attention to the accuracy of your data instead of focusing on deliverability. Stop designing address acquisition processes that focus on how to collect the most email addresses with the least amount of friction. Design processes starting with the question: how can we link the email address to the person associated with this transaction, account or inquiry? What are the ways to close the circle so we know our data is accurate?

It's not about the spamtraps

I’ve talked about spamtraps in the past but they keep coming up in so many different discussions I have with people about delivery that I feel the need to write another blog post about them.
Spamtraps are …
… addresses that did not or could not sign up to receive mail from a sender.
… often mistakenly entered into signup forms (typos or people who don’t know their email addresses).
… often found on older lists.
… sometimes scraped off websites and sold by list brokers.
… sometimes caused by terrible bounce management.
… only a symptom …

Data is the key to deliverability

Last week I had the pleasure of speaking to the Sendgrid Customer Advisory Board about email and deliverability. As usually happens when I give talks, I learned a bunch of new things that I’m now integrating into my mental model of email.
One thing that bubbled up to take over a lot of my thought processes is how important data collection and data maintenance is to deliverability. In fact, I’m reaching the conclusion that the vast majority of deliverability problems stem from data issues. How data is collected, how data is managed, how data is maintained all impact how well email is delivered.
Collecting Data
There are many pathways used to collect data for email: online purchases, in-store purchases, signups on websites, registration cards, trade shows, fishbowl drops, purchases, co-reg… the list goes on and on. In today’s world there is a big push to make data collection as frictionless as possible. Making collection processes frictionless (or low friction) often means limiting data checking and correction. In email this can result in mail going to people who never signed up. Filters are actually really good at identifying mail streams going to the wrong people.
The end result of poor data collection processes is poor delivery.
There are lots of way to collect data that incorporates some level of data checking and verifying the customer’s identity. There are ways to do this without adding any friction, even. About 8 years ago I was working with a major retailer that was dealing with a SBL listing due to bad addresses in their store signup program. What they ended up implementing was tagged coupons emailed to the user. When the user went to the store to redeem the coupons, the email address was confirmed as associated with the account. We took what the customers were doing anyway, and turned it into a way to do closed loop confirmation of their email address.
Managing Data
Data management is a major challenge for lots of senders. Data gets pulled out of the database of record and then put into silos for different marketing efforts. If the data flow isn’t managed well, the different streams can have different bounce or activity data. In a worst case scenario, bad addressees like spamtraps, can be reactivated and lead to blocking.
This isn’t theoretical. Last year I worked with a major political group that was dealing with a SBL issue directly related to poor data management. Multiple databases were used to store data and there was no central database. Because of this, unsubscribed and inactivated addresses were reactivated. This included a set of data that was inactivated to deal with a previous SBL listing. Eventually, spamtraps were mailed again and they were blocked. Working with the client data team, we clarified and improved the data flow so that inactive addresses could not get accidentally or unknowingly reactivated.
Maintaining Data
A dozen years ago few companies needed to think about any data maintenance processes other than “it bounces and we remove it.” Most mailbox accounts were tied into dialup or broadband accounts. Accounts lasted until the user stopped paying and then mail started bouncing. Additionally, mailbox accounts often had small limits on how much data they could hold. My first ISP account was limited to 10MB, and that included anything I published on my website. I would archive mail monthly to keep mail from bouncing due to a full mailbox.
But that’s not how email works today. Many people have migrated to free webmail providers for email. This means they can create (and abandon) addresses at any time. Free webmail providers have their own rules for bouncing mail, but generally accounts last for months or even years after the user has stopped logging into them. With the advent of multi gigabyte storage limits, accounts almost never fill up.
These days, companies need to address what they’re going to do with data if there’s no interaction with the recipient in a certain time period. Otherwise, bad data just keeps accumulating and lowering deliverability.
Deliverability is all about the data. Good data collection and good data management and good data maintenance results in good email delivery. Doing the wrong thing with data leads to delivery problems.

Spamtraps on the brain

I really dislike whomever it was that coined the term pristine spamtraps. I get what they were trying to do, explain the different kinds of spamtraps and how different traps get on your list in different ways. Except… any type of trap can end up on your list in any way.