Filtering by gestalt

One of those $5.00 words I learned in the lab was gestalt. We were studying fetal alcohol syndrome (FAS) and, at the time, there were no consistent measurements or numbers that would drive a diagnosis of FAS. Diagnosis was by gestalt – that is by the patient looking like someone who had FAS.
It’s a funny word to say, it’s a funny word to hear. But it’s a useful term to describe the future of spam filtering. And I think we need to get used to thinking about filtering acting on more than just the individual parts of an email.

Filtering is not just IP reputation or domain reputation. It’s about the whole message. It’s mail from this IP with this authentication containing these URLs.  Earlier this year, I wrote an article about Gmail filtering. The quote demonstrates the sum of the parts, but I didn’t really call it out at the time.

Gmail uses a 10+ year old neural network that analyzes thousands of factors, related to email, IP, and web, integrated with all Google products, and with 99.9%+ accuracy for identifying certain types of messages, combined with an email-specific domain-based reputation system that combines IP reputation, content, read rates, reputation of other senders with similar content.

With filters, Gmail looks at the whole picture. They look at all the data and assess the whole.  Gmail filters by Gestalt. I think other companies are catching up and this is the filtering of the future.

So… what’s that mean?

That means that we’re not looking at warming up an IP or a domain. Instead we’re warming up a domain on an IP. Take the domain to another IP, and the reputation doesn’t carry. Change a domain on an IP and that needs to be warmed up as a domain/IP pair.
But even that is overly simplified from reality. It’s not a domain/IP pair, it’s this SPF domain, that d= domain, this IP, this DMARC alignment, these URLs, and on and on. A recent talk referred to warming up resources in relationship to each other, where resources were things like IPs, domains, and URLs.

Spamassassin with relative scores

I think most readers have a good feeling for how Spamassassin works. It has a bunch of rules, and assigns scores based to each rule. All the scores are added together and if they’re higher than a certain value the mail is filtered.
In more modern filtering, particularly at Gmail, scoring is dynamic. There are still rules and they still assign scores. But the scores themselves can be modified by other scores in the process. It’s not a simple sum of scores so changing anything can change the overall status of a message.
Take two identical messages and two IP addresses one with an arbitrary reputation of 5 and another with an arbitrary reputation of 10. By the score and sum method, the final email reputation scores would be message+5 and message+10. With relative scoring, though, the IP reputations might turn out to be 2 and 13.

Look at the whole picture

There’s a West Wing episode where Jeb is playing chess with multiple members of the White House staff while negotiating the international crisis of the week. Throughout the episode he tells staff to “look at the whole board.” This is really what we have to be doing in deliverability right now. We have to look at the whole board. We have to look at the whole face. We have to see the gestalt.
We can’t just look at the domains and URLs in a message, we have to consider them in context with the IP addresses. All mailstreams affect each other. No longer can we look at transactional messages as separate from marketing messages. The reputation of each affects the other.
This is actually good. It means that different mailstreams, even with the same URLs from the same IPs can develop independent reputations. It makes it easier to use shared IPs. Reputation isn’t reliant on keeping everything separate. It’s the whole picture that’s important.
Email is much more than the sum of its parts.
 
 

Related Posts

Domains need to be warmed, too

One thing that came out of the ISP session at M3AAWG is that domains need to be warmed up, too. I can’t remember exactly which ISP rep said it, but there was general nodding across the panel when this was said.
This isn’t just the domain in the reverse DNS of the sending IP, but also domains used in the Return Path (Envelope From) and visible from.
From the ISP’s perspective, this makes tons of sense. Some of the most prolific snowshoe spammers use new domains and new IPs for every send. They’re not trying to establish a reputation, rather they’re trying to avoid one. ISPs respond by distrusting any mail from a new IP with a new domain.

Read More

Thoughts on Gmail and the inbox

Over the last few months more and more marketers are finding their primary delivery challenge is the Gmail inbox. I’ve been thinking about why Gmail might be such a challenge for marketers. Certainly I have gotten a lot of calls from people struggling to figure out how to get into the Gmail inbox. I’ve also seen aggressive domain based filtering from Gmail, where any mention of a particular domain results in mail going to the bulk folder.
It’s one of those things that’s a challenge, because in most of these cases there isn’t one cause for bulk foldering. Instead there’s a whole host of things that are individually very small but taken together convince Gmail that the mail doesn’t need to be in the inbox.
A pattern that I’m starting to see is that Gmail is taking a more holistic look at all the mail from a sender. If the mail is connected to an organization, all that mail is measured as part of their delivery decision making. This is hurting some ESPs and bulk senders. I’ve had multiple ESPs contact me in the last 6 months looking for help because all their customer emails are going to bulk folder.
Gmail’s filtering is extremely aggressive. From my perspective it always has been. I did get an invite for a Gmail account way back in the day. I moved a couple mailing lists over to that account to test it with some volume and discussion lists. I gave up not long after because no matter what I did I couldn’t get gmail to put all the mail from that list into the tag I had set up for it. Inevitably some mail from some certain people would end up in my spam folder.
Gmail has gotten better, now they will let you override their filters but give you a big warning that the message would have been delivered to spam otherwise.
Gmail_NotSpam
What are mailers to do? Right now I don’t have a good answer. Sending mail people want is still good advice for individual senders. But I am not sure what can be done about this ESP wide filtering that I’m starting to see. It’s possible Gmail is monitoring all the mail from a particular sender or ESP and applying a “source network” score. Networks letting customers send mail Gmail doesn’t like (such as affiliate mail or payday mail, things they mentioned specifically at M3AAWG) are having all their customers affected.
I suspect this means that ESPs seeing problems across their customer base are going to have to work harder to police their customers and remove problematic mail streams completely. Hopefully, ESPs that can get on the Gmail FBL can identify the problem customers faster before those customers tank mail for all their senders.

Read More