Google makes connections

One of the client projects I’m working on involves a lot of research on MXs, including some classification work. Part of that work is identifying the company running the MX. Much of the time this is obvious; mail.protection.outlook.com is Office 365, for instance.

There are other cases where the connection between the MX and the hosting company is not as obvious. That’s where Google comes into play. Take the domain canit.ca, which is the MX for quite a few domains in this data set. Step one is to visit the website, but there’s no website there. Step two is to drop the domain into Google, which tells me it’s Roaring Penguin Software.
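The obvious cases can be handled with a simple lookup before any searching is needed. Here is a minimal sketch of that step, assuming a hand-maintained table of MX suffixes; the table name and function name are hypothetical, and the only two entries are the ones mentioned above.

```python
# Hypothetical lookup table. Only these two mappings come from the post;
# a real table would be built up by hand as each MX is researched.
KNOWN_MX_SUFFIXES = {
    "mail.protection.outlook.com": "Office 365",
    "canit.ca": "Roaring Penguin Software",
}


def classify_mx(mx_host):
    """Return the operator for an MX hostname, or None if it is not in the table."""
    # Normalize: drop the trailing dot DNS tools often print, and lowercase.
    host = mx_host.rstrip(".").lower()
    labels = host.split(".")
    # Try the full hostname first, then each shorter parent domain.
    # Matching on whole labels keeps "notcanit.ca" from matching "canit.ca".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in KNOWN_MX_SUFFIXES:
            return KNOWN_MX_SUFFIXES[candidate]
    return None
```

Anything that comes back as None is a candidate for the manual steps: visit the website, then search for the domain.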
In some cases, though, the domain wasn’t as obvious as the Roaring Penguin link. In those cases, Google would present me with seemingly irrelevant hosting pages. It didn’t make sense until I started digging through hosting documentation. Inevitably, whenever Google gave me results that didn’t make sense, they were right. The links were often buried in knowledge base pages telling users how to configure their setup and mentioning the domain I was searching for.
The interesting piece was that often it was the top-level domain, not the support pages, that Google presented to me. I had to go find the actual pages. Based on that bit of research, it appears that Google has a comprehensive map of which domains are related to each other.
This is something we see in their handling of email as well. Gmail regularly makes connections between domains that senders don’t expect. I’ve been speaking for a while about how Gmail does this, based on observation of filtering behavior. Working through multiple searches looking at domain names was the first time I saw evidence of the connections I suspected. Gmail is able to connect seemingly disparate hostnames and relate them to one another.
For senders, it means that using different domains in an attempt to isolate different mail streams doesn’t work. Gmail understands that domainA in acquisition mail, domainB in opt-in mail and domainC in transactional mail all belong to the same sender. Companies can develop a reputation at Google which affects all email, not just a particular mail stream. This makes it harder for senders to compartmentalize their sends and requires compliance throughout the organization.
Acquisition programs do hurt all mail programs, at least at Gmail.
 

Related Posts

Google wiretapping case, what the judge ruled

Yesterday I reported that the judge had ruled on Google’s motion to dismiss. Today I’ll take a slightly deeper look at the case and the interesting things that were in the denial of the motion to dismiss.
Google is being sued for violations of federal wiretapping laws, the California Invasion of Privacy Act (CIPA) and wiretapping laws in Florida, Pennsylvania and Maryland. This lawsuit is awaiting class certification for the following groups.


Filtering by gestalt

One of those $5.00 words I learned in the lab was gestalt. We were studying fetal alcohol syndrome (FAS) and, at the time, there were no consistent measurements or numbers that would drive a diagnosis of FAS. Diagnosis was by gestalt – that is by the patient looking like someone who had FAS.
It’s a funny word to say, it’s a funny word to hear. But it’s a useful term to describe the future of spam filtering. And I think we need to get used to thinking about filtering acting on more than just the individual parts of an email.

Filtering is not just IP reputation or domain reputation. It’s about the whole message. It’s mail from this IP with this authentication containing these URLs.  Earlier this year, I wrote an article about Gmail filtering. The quote demonstrates the sum of the parts, but I didn’t really call it out at the time.


Google Postmaster Tools

Earlier this month Google announced a new set of tools for senders at their Postmaster Tools site. To get into the site you need to log in to Google, but they also have a handy support page that doesn’t require a login for folks who want to see what the page is about.
We did register, but don’t send enough mail to get any data back from Google. However, the nice folks at SendGrid were kind enough to share their experiences with me and show me what the site looked like with real data, when I spoke at their recent customer meeting.
Who can register?
Anyone can register for Google Postmaster Tools. All you need is the domain authenticated by DKIM (the d= value) or by SPF (the Return-Path value).
Who can see data?
Google is only sharing data with trusted domains and only if a minimum volume is sent from those domains. They don’t describe what a trusted domain is, but I expect the criteria include a domain with some history (no brand new domains) and a reasonable track record (some or all of the mail is good).
For ESPs who want to monitor all the mail they send, every mail needs to be signed with a common d= domain. Individual customers that want their own d= can do so. These customers can register for their own access to just their mail.
ESPs that want to do this need to sign with the common key first, and then with the customer’s more selective key.
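One way to check that a message carries both signatures is to list the d= value of every DKIM-Signature header. The following is a minimal sketch using only the Python standard library; the function name and the sample domains (customer.example, esp.example) are assumptions for illustration. It reads the headers only and does not verify the signatures.

```python
import re
from email import message_from_string

# Matches the d= tag at the start of the header value or after a semicolon,
# so tags such as bh= are not mistaken for it.
D_TAG = re.compile(r"(?:^|;)\s*d\s*=\s*([^;\s]+)")


def dkim_signing_domains(raw_message):
    """Return the d= domain of each DKIM-Signature header, in header order."""
    msg = message_from_string(raw_message)
    domains = []
    for header in msg.get_all("DKIM-Signature", []):
        match = D_TAG.search(header)
        if match:
            domains.append(match.group(1).lower())
    return domains


# Hypothetical double-signed message: one customer signature, one common ESP signature.
sample = (
    "DKIM-Signature: v=1; a=rsa-sha256; d=customer.example; s=s1;\n"
    " bh=abc; b=def\n"
    "DKIM-Signature: v=1; a=rsa-sha256; d=esp.example; s=s2; bh=abc; b=def\n"
    "From: sender@customer.example\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(dkim_signing_domains(sample))
```

If the common ESP domain is missing from the list, that message would not be counted under the ESP's shared d= domain.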
How does it work?
Google collects data from DKIM and/or SPF authenticated mail, aggregates it and presents it to a Google user who has authenticated the domain.
How do I authenticate?
