Filter complexity

During the Q&A last week, I gave an example of one type of filter to demonstrate how complex filters are. There was some confusion about what I was saying, so I thought I’d write a blog post explaining it.

Background

This story came from another deliverability person; let’s call her ESPer. One of their customers (Customers) is using a third-party service that provides tracking links (Tracker). Tracker sent email to its customers saying that mail with more than 3 links was getting blocked.

It has come to our attention that Google has recently started flagging emails with multiple tracked links as suspicious or malicious. For example, if you have an email with more than 3 links (including any in your signature) and have Tracker link tracking turned on, recipients who use Gmail may see your message flagged with a warning. If your email contains 3 or fewer tracked links then you will be unaffected by this issue.

This prompted some Customers to call the ESP and ask whether Google was blocking mail with 3 or more links.

The Investigation

Multiple ESP folks checked their systems and found no correlation between multiple links in an email and bulk foldering at Gmail. I checked my Gmail account and a number of emails in my inbox have 4 or 5 or 6 links in them. None with the Tracker tracking cookie, though.
To test this a little more, I tried to sign up for a free account with Tracker. Tracker works through a Firefox add-on, but the add-on is unsigned, so I decided not to install it. It’s probably not malware, but if they can’t be bothered to sign their add-on, I’m not going to risk installing it on my machine, even for my readers.

What we know

  1. Gmail is blocking mail that has 3 or more links when one of them is a Tracker link.
  2. Remove the Tracker link and the mail goes to the inbox.
  3. Send with fewer than 3 links, including a Tracker link, and the mail goes to the inbox.

What we speculate

One of Tracker’s customers is sending spam with 3 or more links plus the tracking links. Google has identified this mail as a problem and is blocking mail that has the same characteristics.
Removing the Tracker link should get the mail into the inbox.
Removing links so there are fewer than 3 should get the mail into the inbox.

What this tells us

Filtering is complex. Like Really Really Complex. It’s not the presence of the tracking URL, it’s the presence of the tracking URL and 3 other URLs. Generally when we here at Word to the Wise try and test “what’s wrong” we’ll start removing URLs to see if one particular URL is causing a problem. In this case, that testing would have led us to an erroneous conclusion. We might find one URL “responsible” but only because we’d lowered the total number of URLs under 3.
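To make that testing trap concrete, here is a toy sketch in Python. It is not Gmail’s actual logic, which nobody outside Google knows. It only encodes the rule we speculated about above, and the domain names, the threshold constant and both function names are made up for illustration.

```python
from urllib.parse import urlparse

# Assumptions for illustration only: a placeholder tracker domain and the
# "3 or more links" threshold from the list above. Gmail's real rule is unknown.
TRACKER_HOST = "tracker.example"
LINK_THRESHOLD = 3


def is_blocked(urls):
    """Toy rule: block only if a Tracker link is present AND the mail has enough links."""
    has_tracker = any(urlparse(u).hostname == TRACKER_HOST for u in urls)
    return has_tracker and len(urls) >= LINK_THRESHOLD


def urls_that_seem_responsible(urls):
    """Remove one URL at a time and collect the URLs whose removal unblocks the mail."""
    return [u for i, u in enumerate(urls) if not is_blocked(urls[:i] + urls[i + 1:])]


three_links = [
    "https://tracker.example/c/abc",
    "https://shop.example/sale",
    "https://shop.example/unsubscribe",
]
five_links = three_links + [
    "https://shop.example/about",
    "https://shop.example/contact",
]

# With exactly three links, removing ANY link drops the count below the
# threshold, so every URL looks guilty.
print(urls_that_seem_responsible(three_links))

# With five links, only removing the Tracker link unblocks the mail.
print(urls_that_seem_responsible(five_links))
```

In the three-link message, one-at-a-time removal blames every URL, including the perfectly innocent ones. Only the five-link message points at the Tracker link, and even then the test says nothing about the link count being part of the rule.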
I’ve been telling people and clients that filters are complex. More than 3 URLs plus a specific URL is something that people wouldn’t normally identify as a filter criterion. But the neural net / machine learning / AI filters in use at Gmail noticed that mail with a particular number of links plus the Tracker link isn’t wanted by the recipients. The filters then started blocking mail selectively based on those criteria.
Filters aren’t magic, but sometimes the complexity makes them seem like it.
 
 
 

Related Posts

Delivering to Gmail

Gmail is a challenge for even the best senders these days.
With the recent Gmail changes there isn’t any clear fix to getting open rates or inbox delivery back up. Some of it depends on what is causing Gmail to filter the mail. Changing subject lines, from name, from address may get mail back to the inbox in the short term, but it only works until the filters catch up.
What I am seeing, across a number of clients, is that Gmail is doing a lot of content reputation and that content reputation gets spread across senders of that content.  That means you want to look at who is sending any mail on your behalf (mentioning your domain or pointing at your website) and their practices. If they have poor practices, then it can reflect badly on you and result in filtering.
From what I’ve seen, these are very deliberate filtering decisions by Google. And it’s making mail a lot harder for many, many senders. But I think it is, unfortunately, the new reality.

Why do ISPs do that?

One of the most common things I hear is “but why does the ISP do it that way?” The generic answer for that question is: because it works for them and meets their needs. Anyone designing a mail system has to implement some sort of spam filtering and will have to accept the potential for lost mail. Even those recipients who run no software filtering may lose mail. Their spamfilter is the delete key and sometimes they’ll delete a real mail.
Every mailserver admin, whether managing an MTA for a corporation, an ISP or themselves, inevitably looks at the question of false positives and false negatives. Some are more sensitive to false negatives and would rather block real mail than have to wade through a mailbox full of spam. Others are more sensitive to false positives and would rather deal with unfiltered spam than risk losing mail.
At the ISPs, many of these decisions aren’t made by one person, but the decisions are driven by the business philosophy, requirements and technology. The different consumer ISPs have different philosophies and these show in their spamfiltering.
Gmail, for instance, has a lot of faith in their ability to sort, classify and rank text. This is, after all, what Google does. Therefore, they accept most of the email delivered to Gmail users and then sort after the fact. This fits their technology, their available resources and their business philosophy. They leave as much filtering at the enduser level as they can.
Yahoo, on the other hand, chooses to filter mail at the MTA. While their spamfoldering algorithms are good, they don’t want to waste CPU and filtering effort on mail that they think may be spam. So, they choose to block heavily at the edge, going so far as to rate limit mail from senders that they don’t know. Endusers are protected from malicious mail and senders have the ability to retry mail until it is accepted.
The same types of entries could be written about Hotmail or AOL. They could even be written about the various spam filter vendors and blocklists. Every company has their own way of doing things and their way reflects their underlying business philosophy.

Thanks for the great session

I had a great time answering questions at the 2015 All About eMail Virtual Conference & Expo today. Thanks so much to everyone who participated and asked questions. They were great and I’m sorry we didn’t have more time.
I did get some questions on Twitter (@wise_laura) afterwards. One was about an example I gave to explain how filters are complex. There have been rumors going around recently that Gmail is filtering mail with more than 3 URLs in it. Let me just say right now: THIS IS NOT TRUE. Emails with more than 3 URLs in them are being delivered just fine to Gmail.
There is a situation involving the number (and type) of URLs that I think is a useful example of the filter complexity happening at some places, like Gmail. I started writing it up but don’t quite have time to finish it today. I will keep working on it, and it should go up in the next day or so.
Thanks again to everyone who joined the session. You asked some great questions and I had fun answering them.
 
