I’ve been waiting for this to happen. An email verification vendor has left their database of 800 million email addresses, along with detailed individual data, unprotected on the internet. Bob Diachenko reported the discovery yesterday on his blog. Wired also ran an article (An Email Marketing Company Left 809 Million Records Exposed Online) based on his findings.
It’s not really a secret that I don’t have much time for the vast majority of email verification companies and their business models. The first iteration was to hammer on SMTP servers without sending mail. This was never terribly successful because the process looked like a dictionary attack, and ISPs had instituted protections against dictionary harvesting long before verification companies were a thing.
SMTP verification became even less useful when Yahoo, now Verizon Media, moved all delivery failures to the very end of the SMTP transaction. This requires verification companies to send actual mail in order to determine if an email address is valid. The verification companies don’t want to do this, so they can’t tell anything about @yahoo.com addresses.
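To make the difference concrete, here is a minimal sketch of the two behaviors. It is a toy model, not a real SMTP server and certainly not Yahoo’s actual implementation: the Session class, the probe helper, the VALID set and the choice of 550 as the rejection code are all assumptions made for illustration. It only shows why a probe that stops after RCPT TO learns nothing when rejections are deferred until after DATA.

```python
from dataclasses import dataclass, field

# Hypothetical set of deliverable mailboxes for this demo; not real data.
VALID = {"alice@example.com"}


@dataclass
class Session:
    """Toy model of one SMTP transaction (illustrative only)."""

    defer_rejections: bool  # True models rejecting only after DATA
    recipients: list = field(default_factory=list)

    def rcpt_to(self, address):
        """Return the reply code for the RCPT TO command."""
        if not self.defer_rejections and address not in VALID:
            return 550  # rejected immediately, so a probe learns the address is bad
        self.recipients.append(address)
        return 250  # accepted for now, whether or not the mailbox exists

    def end_of_data(self):
        """Return the reply code once the message body has been sent."""
        if any(r not in VALID for r in self.recipients):
            return 550  # the failure only shows up after real mail was sent
        return 250


def probe(address, defer_rejections):
    """Mimic a verifier: issue RCPT TO, then disconnect without sending mail."""
    return Session(defer_rejections).rcpt_to(address)


if __name__ == "__main__":
    for deferred in (False, True):
        print(deferred, probe("nobody@example.com", deferred))
```

In this model the probe sees a rejection for the bad address only when rejections happen at RCPT TO. With deferred rejections every address looks fine until a message is actually sent.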
What did the verification companies do? They pivoted to maintaining vast amounts of data about individual email addresses. In some cases, I believe they are even taking open and click data from their customers or other sources. Now, when you upload a list to verify, they don’t test the addresses; they just compare them with their current database.
Clearly there are issues here. One is that 30% of email addresses go bad over a year. The verification companies have to be doing something to keep their databases current. I don’t know what that is, but taking delivery data from their customers is one way to do it.
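A rough back-of-the-envelope sketch shows how quickly a static database goes stale. It assumes the 30% figure applies as a constant annual rate that compounds year over year, which is a simplification, and it treats the 809 million record count from the Wired headline as a starting list size purely for illustration. The still_valid function is a hypothetical helper, not anything a vendor is known to use.

```python
def still_valid(total, annual_decay, years):
    """Estimate addresses still good after `years`, assuming a constant
    annual decay rate that compounds (a simplifying assumption)."""
    return total * (1 - annual_decay) ** years


if __name__ == "__main__":
    for years in (1, 2, 3):
        remaining = still_valid(809_000_000, 0.30, years)
        print(f"after {years} year(s): about {remaining / 1e6:.0f} million")
```

Under those assumptions, roughly 566 million of the records would still be good after one year and fewer than half after two, which is why a vendor relying on stored data needs some ongoing source of fresh delivery information.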
The other issue is that there are data aggregators that collect personal data on us and our online activities and then sell that on. None of us has given permission for verifications.io or any of their competitors to collect and store our data. Yet they not only do that, many of them also sell the data to any company that wants it.
This breach, of course, wasn’t based on someone cracking into the vendor’s system. The vendor just left their entire database publicly accessible. But now that it’s clear just how much data about us verification vendors have, it’s not out of the question they’ll be a target moving forward.
I have a lot of objections to email verification in general. Verification companies don’t actually verify permission, or whether the mail is wanted, or even whether the recipient gave the address to the sender. When they use SMTP probing they are abusing resources belonging to third parties to support their business model. Now they’re aggregating and selling data on hundreds of millions of people.
There are a couple of companies in the space that are different. They’re not just “cleaning” data, but providing platforms so senders can actually collect true permission from their customers. And that’s the real crux of it. Bad data verification companies are all about helping senders get addresses that don’t look like spam by keeping bounces and spamtraps low. Good data verification companies are about helping senders curate lists of recipients who want their mail.
Data ownership and privacy are a big deal. Hundreds of billions of dollars have been made by companies collecting and selling PII. They know all sorts of things about consumers, but the consumers have no control over who has their data or what they do with it. Governments are starting to regulate PII. GDPR was the first major effort, but there are a lot of groups fighting any sort of privacy laws in the US. I think over the long term consumers are going to expect and require more transparency from data aggregators.
I don’t know where the verification industry is going. I do think it’s going to have to change significantly if it’s going to survive. There are significant filtering advantages to handling all rejections after DATA (as Verizon Media is doing). But there are a few companies in the space that are trying to change how the industry works and make it, overall, less abusive and more consumer friendly.