Address verification doesn’t fix any real problem

A

Would you trust an address verification company that used twitter spam to advertise their product?

Would you trust an address verification company that sent email spam to advertise their product?

Would you trust an address verification company that sent email advertising their product to made up addresses?

Would you trust an address verification company that sent email advertising their product to spamtraps?

Would you trust an address verification company that sent email advertising their product to role accounts?

Would you trust an address verification company that uses a botnet as part of their business?

Would you trust an address verification company that sends spam to verify addresses?

Would you trust an address verification company that resells customer address data to other customers?

Would you trust an address verification company that lies about their relationship with blocklists like Spamhaus?

If you’re using an email address verification company, chances are they’re doing at least one of the above.

What is Address Verification?

Address verification is primarily a way to identify which addresses are deliverable. The simple way to verify an address is to start a normal mail transaction, tell the MX you are sending mail to a particular address and then wait for the MX to tell you if that email address is active or not. Once the MX has said the address exists or not, you can quit the email transaction and go on to the next MX.

This technique was pioneered by spammers back in the early 2000s. Known as dictionary attacks they were a way for spammers to identify which addresses might exist on a domain without having to spend any effort sending mail to non existent ones. This technique was also used by companies who sold addresses to ensure ‘high deliverability’ for their lists.

Dictionary attacks caused a number of problems for the ISPs, including tying up SMTP resources for hours or days at a time. ISPs quickly developed automated defences against these types of attacks. In fact, I suspect some of these defences are why spammers pivoted to selling address verification services.

But address verification isn’t a dictionary attack!

I can hear the protests now, just because they use the same underlying technology and intend to do the same thing, address verification isn’t the same as a dictionary attack. It’s not a problem! People aren’t doing it for the same reason. Well, except they are. The fundamental idea behind both processes is the same: to remove a non-existent address before sending mail to it. There is not even a real difference in intention here.

If ISPs wanted senders to verify addresses before sending to them, they would have never turned off VRFY. That’s right, there’s verification built right into the SMTP protocol. This was quickly abused and disabled at most ISPs. It’s not a far stretch to say ISPs don’t support this activity since they disabled VRFY and they implemented active measures to stop 3rd parties from probing their MX looking for active addresses.

Some ISPs, like Yahoo, have gone a step further. They moved all of their rejections, even ones for non-existent addresses, until after the entire email has been sent to them. It’s impossible for dictionary attacks to identify deliverable addresses at domains they manage.

So what’s the problem?

My biggest issue, though, is not that address verification is what spammers do, or that the ISPs don’t like it or that it’s borderline abusive when done in bulk. The reason I hate address verification is that is erases one of the most valuable pieces of data we have about how accurate our address collection process is.

Bounce rates give us an indication of how inaccurate our address collection process is. We know that every bounced address is an address that doesn’t belong to the person who entered it into the form. Whether it was a typo or a deliberately faked address it is a fact someone gave us an address that didn’t belong to them. Bad addresses are a way of monitoring data accuracy.

This doesn’t mean that every deliverable address is accurate. We have to assume that some of the deliverable addresses also don’t belong to the person who submitted it. Senders who use address verification are lulled into a false sense of security. They assume that a deliverable address is a correct address. This simply isn’t true.

An example of bad email verification. The is an email from the marketing manager at proofy.io to a spamtrap at wordtothewise advertising how accurate their verification process is.

To address the Yahoo problem and the deliverable problem, many address verification companies have moved on to new techniques. One of the things they’re doing now is collecting PII on recipients for resale. I’m not sure how honest they’re being with their customers about this. The words proprietary technology hide a multitude of sins and business practices.

Let’s be honest, though, no company actually has permission from the address owner to collect this data. Senders typically don’t have permission to share email addresses with 3rd parties for commercial gain. The third parties don’t have permission to collect data about the address. We all kind of hand wave around the issue, but the reality is, if consumers knew their activity was being tracked in this manner, many would object.

Data accuracy is vital

There’s a woman in Florida who thinks the gmail address I acquired in 2004 belongs to her. I get lots of things on her behalf: event tickets, forwards from her church groups, invites for Sunday brunch. A couple years ago an insurance company sent me all her documents. Do you know how much PII is in insurance documents? Address, date of birth, social security number, home value, next of kin… the list goes on and on.

The only thing typical address verification does is identify if an address is deliverable or not. It cannot say an address belongs to the person the sender thinks it does. If the insurance company did true data verification rather than simply identifying an address was deliverable, then K.S. who lives on the Loxahatchee River in Jupiter, Florida would have her insurance documents and I wouldn’t know anything about her.

Address verification doesn’t actually fix the real problem, it hides it. The real problem is data accuracy. With verification we can pretend our data is accurate, simply by removing the very visibly non-accurate data. But the much bigger issue is sending mail to addresses that don’t belong to our customers. The most problematic of those is where the email address belongs to a completely different person. Address verification does nothing to affect that, other than hide that our address collection process is broken.

Address verification has been aggressively sold as a way to make data more accurate, but it doesn’t. All standard verification does is hide very obvious data errors. So what can we do?

I know my anti-spammer friends are jumping to tell me that confirmed opt-in is the solution. They can just sit back down because COI has it’s own problems. As senders we can’t always trust that just because a link is clicked that we are connecting with the right person. One ESP tracked thousands of clicks on a confirmation link sent to a spamtrap. Various security appliances and filters follow links in emails. Even some spamtraps will follow links in messages, including confirmation links. COI is not the solution.

What next?

At some point, the email industry has conflated the issues of deliverability and data accuracy. They’ve assumed if their delivery is good their data must also be good. It’s never been true, though,

The first step of solving a problem is admitting there is one. Address verification lets senders of all sizes pretend there is no problem with their data collection. It hides the most visible examples of bad data, and lets companies ignore the true scope of the problem.

We need to measure data accuracy by the metric of data being accurate not by the metric of being deliverable. How we do that depends on how important or vital the data is. Banks, health care and insurance providers should incorporate much more verification into the data processes. But even receipts and tickets and ‘low value’ email gets misdirected all the time.

I wish I had a pre-packaged, easily saleable product that I could tell you would fix the issue. But there isn’t going to be one solution, nor is it going to apply to every situation. I certainly can’t sell it as a fix to deliverability problems. Let’s be honest, even perfectly accurate data is going to have delivery problems.

The challenge I pose to email marketers is to incorporate real data validation into your collection processes. Start paying attention to the accuracy of your data instead of focusing on deliverability. Stop designing address acquisition processes that focus on how to collect the most email addresses with the least amount of friction. Design processes starting with the question: how can we link the email address to the person associated with this transaction, account or inquiry? What are the ways to close the circle so we know our data is accurate?

About the author

2 comments

This site uses Akismet to reduce spam. Learn how your comment data is processed.

  • Completely agree, address “verification” means nothing at best and at worst, it’s a data privacy/reputational risk nightmare.

    But… I still recommend the limited use of (select) services in some situations for just the “data accuracy” component, which the good ones have baked in. By that I mean typo detection, i.e. catching the gmial.com or john.smithg@mail.com type errors, and more. And they are generally better than some standalone library for detecting anything more than basic typos.

    Outside of tech, there are many sectors where it’s still the norm to hand over a business card or write your email address on a form to interact with a supplier. That all gets OCRed at some point and running it through a good typo checker does catch those unintentional/software type errors. Not perfect but at volume does give the marketing team/intern less to manually review.

    Ken.

    P.S. I get that some of the typo stuff is based on a PII pattern database (“data footprint” in industry parlance) but it’s a lesser evil than address harvesting methods IMHO.

  • Standing up again and suggesting multi-factor authentication

    Won’t fix everything, but the chance of having the right person rises significantly

By laura

Recent Posts

Archives

Follow Us