Trawling through the junk folder

As a break from writing unit tests this morning I took a few minutes to go through my Mail.app junk folder, looking for false positives for mail delivered over the past six weeks.

We don’t do any connection level rejection here, so any mail sent to me gets delivered somewhere. Anything that looks like malware gets dumped in one folder and never read, anything that scores a ridiculously high spamassassin score gets dumped in another folder and never read, mailing lists get handled specially and everything else gets delivered to Mail.app to deal with. That means that Mail.app sees less of the ridiculously obvious spam and is mostly left to do bayesian filtering, and whatever other magic Apple implemented.
There were about thirty false positives, and they were all B2C bulk advertising mail. I receive a lot of 1:1 mail, transactional mail and B2B marketing mail and there were no false positives at all for any of those.
All the false positives were authenticated with both SPF and DKIM. All of them were for marketing lists I’d signed up for while making a purchase. All of them were “greymail” – mail that I’d agreed to receive, and that was inoffensive but not compelling. While I easily spotted all of them as false positives via the from address and subject, none of them were content I’d particularly missed.
Almost all of the false positives were sent through ESPs I recognized the name of, and about 80% of them were sent through just two ESPs (though that wasn’t immediately obvious, as one of them not only uses random four character domain names, it uses several different ones – stop doing that).
If you’d asked me to name two large, legitimate ESPs from whom I recalled receiving blatant, blatant spam recently, it would be those same two ESPs. Is Mail.app is picking up on my opinions of the mail those ESPs are sending? It’s possible – details specific to a particular ESPs mail composition and delivery pipelines are details that a bayesian learning filter may well recognize as efficient tokens.

Related Posts

Email filtering: not going away.

VirusBlockI don’t do a whole lot of filtering of comments here. There are a couple people who are moderated, but generally if the comments contribute to a discussion they get to be posted. I do get the occasional angry or incoherent comment. And sometimes I get a comment that is triggers me to write an entire blog post pointing out the problems with the comment.
Today a comment from Joe King showed up for The Myth of the Low Complaint Rate.

Read More

Confusing the engineers

We went camping last weekend with a bunch of friends. Had a great time relaxing on the banks of the Tuolumne River, eating way too much and visiting.
On Saturday I was wearing a somewhat geeky t-shirt. It said 554: abort mission. (Thank you MessageSystems). At some point on Saturday every engineer came up to me, read my shirt and then looked at me and said “That’s not HTTP.”
That lead to various discussions about how their junior engineers don’t actually know SMTP at all. Why? Because the SMTP libraries just work. Apparently the HTTP libraries aren’t that great, so folks have to learn more about HTTP to troubleshoot and use them.
I’m sure there’s a joke in there somewhere: A Kindle engineer, an Android engineer and a robot engineer walk into a campsite…
EmailFilters_boxes_forblogIt did leave me thinking, though, about how it’s not that easy to run your own mail server these days. Gone are the days when running your own server was cost effective and easy. These days, there is just too much spam coming in. Crafting filters is a skilled job. It’s not that hard to run good filters. But to run good filters takes time to do well.
There are also a lot of challenges to sending mail. One of the discussions I had at the campsite was how hard it was to configure outbound mail. The engineer was helping a friend set up a website and trying to get the website to send notifications to the friend. But without setting up authentication the mail kept silently failing.
Of course, we do run our own mail server. But it’s our job and, in many ways, it keeps us honest. We don’t run many filters meaning we see what spammers are doing and can use our own experiences to better understand what commercial filters are dealing with.
For most people, though, I really think using a service is the right solution. Find one with filters that meet your needs and just pay them to deal with the headache.
 

Read More

Deliverability and IP addresses

Almost 2 years ago I wrote a blog post titled The Death of IP Based Reputation. These days I’m even more sure that IP based reputation is well and truly dead for legitimate senders.
There are a lot of reasons for this continued change. Deliverability is hard when some people like the same email other people think is spam

Read More