Content, trigger words and subject lines

There’s been quite a bit of traffic on twitter this afternoon about a recent blog post by Hubspot identifying trigger words senders should avoid in an email subject line. A number of email experts are assuring the world that content doesn’t matter and are arguing on twitter and in the post comments that no one will block an email because those words are in the subject line.
As usually, I think everyone else is a little bit right and a little bit wrong.
The words and phrases posted by Hubspot are pulled out of the Spamassassin rule set. Using those words or exact phrases will cause a spam score to go up, sometimes by a little (0.5 points) and sometimes by a lot (3+ points). Most spamassassin installations consider anything with more than 5 points to be spam so a 3 point score for a subject line may cause mail to be filtered.
The folks who are outraged at the blog post, though, don’t seem to have read the article very closely. Hubspot doesn’t actually say that using trigger words will get mail blocked. What they say is a lot more reasonable than that.

Trigger words are known to cause problems and increase the chances of your email getting caught in a SPAM trap. By avoiding these words in your email subject lines, you can dramatically increase your chances of getting beyond SPAM filters.

OK, so I’m not sure about the “dramatic” part, as some of the words they list as triggers in the subject lines will also trigger scores when used in the body of the message. But the gist of the Hubspot post is not wrong. If you use too many words and phrases used by spammers, then your mail is going to be difficult to distinguish from spam. I don’t think this is actually controversial (although I’ve been known to be wrong…)
But some of the comments on the post go too far in the other direction and totally misrepresent reality.

Content filtering hasn’t been a big component of spam filtering algorithms for nearly a decade.

This is blatantly and demonstrably untrue. Naive content filtering hasn’t been a big component for nearly a decade, but content filtering is where filtering is going. IP based filtering is good for some things but content filtering allows for much finer grained sorting and filtering. I think content filtering is where the industry is going. Too many spammers have created too many ways to avoid and subvert IP based filters for them to be the full solution to protecting users.
Content matters, don’t think it doesn’t. But don’t let word lists like the above frighten you off from crafting good subject lines.

Related Posts

Why do ISPs do that?

One of the most common things I hear is “but why does the ISP do it that way?” The generic answer for that question is: because it works for them and meets their needs. Anyone designing a mail system has to implement some sort of spam filtering and will have to accept the potential for lost mail. Even the those recipients who runs no software filtering may lose mail. Their spamfilter is the delete key and sometimes they’ll delete a real mail.
Every mailserver admin, whether managing a MTA for a corporation, an ISP or themselves inevitably looks at the question of false positives and false negatives. Some are more sensitive to false negatives and would rather block real mail than have to wade through a mailbox full of spam. Others are more sensitive to false positives and would rather deal with unfiltered spam than risk losing mail.
At the ISPs, many of these decisions aren’t made by one person, but the decisions are driven by the business philosophy, requirements and technology. The different consumer ISPs have different philosophies and these show in their spamfiltering.
Gmail, for instance, has a lot of faith in their ability to sort, classify and rank text. This is, after all, what Google does. Therefore, they accept most of the email delivered to Gmail users and then sort after the fact. This fits their technology, their available resources and their business philosophy. They leave as much filtering at the enduser level as they can.
Yahoo, on the other hand, chooses to filter mail at the MTA. While their spamfoldering algorithms are good, they don’t want to waste CPU and filtering effort on mail that they think may be spam. So, they choose to block heavily at the edge, going so far as to rate limit senders that they don’t know about the mail. Endusers are protected from malicious mail and senders have the ability to retry mail until it is accepted.
The same types of entries could be written about Hotmail or AOL. They could even be written about the various spam filter vendors and blocklists. Every company has their own way of doing things and their way reflects their underlying business philosophy.

Read More

A Disturbing Trend

Over the last year or so we’ve been hearing some concerns about some of the blacklisting policies and decisions at Trend Micro / MAPS.
One common thread is that the ESP customers being listed aren’t the sort of sender who you’d expect to be a significant source of abuse. Real companies, gathering addresses from signup forms on their website. Not spammers who buy lists, or who harvest addresses, or who are generating high levels of complaints – rather legitimate senders who are, at worst, being a bit sloppy with their data management. When Trend blacklist an IP address due to a spamtrap hit from one of these customers the actions they are demanding before delisting seem out of proportion to the actual level of abuse seen – often requiring that the ESP terminate the customer or have the customer reconfirm the entire list.
“Reconfirming” means sending an opt-in challenge to every existing subscriber, and dropping any subscriber who doesn’t click on the confirmation link. It’s a very blunt tool. It will annoy the existing recipients and will usually lead to a lot of otherwise happy, engaged subscribers being removed from the mailing list. While reconfirmation can be a useful tool in cleaning up senders who have serious data integrity problems, it’s an overreaction in the case of a sender who doesn’t have any serious problems. “Proportionate punishment” issues aside, it often won’t do anything to improve the state of the email ecosystem. Rather than staying with their current ESP and doing some data hygiene work to fix their real problems, if any, they’re more likely to just move elsewhere. The ESP loses a customer, the sender keeps sending the same email.
If this were all that was going on, it would just mean that the MAPS blacklists are likely to block mail from senders who are sending mostly wanted email.
It’s worse than that, though.
The other thread is that we’re being told that Trend/MAPS are blocking IP addresses that only send confirmed, closed-loop opt-in email, due to spamtrap hits – and they’re not doing so accidentally, as they’re not removing those listings when told that those addresses only emit COI email. That’s something it’s hard to believe a serious blacklist would do, so we decided to dig down and look at what’s going on.
Trend/MAPS have registered upwards of 5,000 domains for use as spamtraps. Some of them are the sort of “fake” domain that people enter into a web form when they want a fake email address (“fakeaddressforyourlist.com”, “nonofyourbussiness.com”, “noneatall.com”). Some of them are the sort of domains that people will accidentally typo when entering an email address (“netvigattor.com”, “lettterbox.com”, “ahoo.es”). Some of them look like they were created automatically by flaky software or were taken from people obfuscating their email addresses to avoid spam (“notmenetvigator.com”, “nofuckinspamhotmail.com”, “nospamsprintnet.com”). And some are real domains that were used for real websites and email in the past, then acquired by Trend/MAPS (“networkembroidery.com”, “omeganetworking.com”, “sheratonforms.com”). And some are just inscrutable (“5b727e6575b89c827e8c9756076e9163.com” – it’s probably an MD5 hash of something, and is exactly the sort of domain you’d use when you wanted to be able to prove ownership after the fact, by knowing what it’s an MD5 hash of).
Some of these are good traps for detecting mail sent to old lists, but many of them (typos, fake addresses) are good traps for detecting mail sent to email addresses entered into web forms – in other words, for the sort of mail typically sent by opt-in mailers.
How are they listing sources of pure COI email, though? That’s simple – Trend/MAPS are taking email sent to the trap domains they own, then they’re clicking on the confirmation links in the email.
Yes. Really.
So if someone typos their email address in your signup form (“steve@netvigattor.com” instead of “steve@netvigator.com”) you’ll send a confirmation email to that address. Trend/MAPS will get that misdirected email, and may click on the confirmation link, and then you’ll “know” that it’s a legitimate, confirmed signup – because Trend/MAPS did confirm they wanted the email. Then at some later date, you’ll end up being blacklisted for sending that 100% COI email to a “MAPS spamtrap”. Then Trend/MAPS require you to reconfirm your entire list to get removed from their blacklist – despite the fact that it’s already COI email, and risking that Trend/MAPS may click on the confirmation links in that reconfirmation run, and blacklist you again based on the same “spamtrap hit” in the future.

Read More

The sledgehammer of confirmed opt-in

We focused Monday on Trend/MAPS blocking fully confirmed opt-in (COI) mail, because that is the Gold Standard for opt-in. It is also Trend/MAPS stated policy that all mail should be COI. There are some problems with this approach. The biggest is that Trend/MAPS is confirming some of the email they receive and then listing COI senders.
The other problem is that typos happen by real people signing up for mail they want. Because MAPS is using typo domains to drive listings, they’re going to see a lot of mail from companies that are doing single opt-in. I realize that there are problems with single opt-in mail, but the problems depends on a lot of factors. Not all single opt-in lists are full of traps and spam and bad data.
In fact, one ESP has a customer with a list of more than 50 million single opt-in email addresses. This sender mails extremely heavily, and yet sees little to no blocking by public or private blocklists.
Trend/MAPS policy is singling out senders that are sending mail people signed up to receive. We know for sure that hard core spammers spend a lot of time and money to identify spamtraps. The typo traps that Trend/MAPS use are pretty easy to find and I have no doubt that the real, problematic spammers are pulling traps out of their lists. Legitimate senders, particularly the ESPs, aren’t going to do that. As one ESP rep commented on yesterday’s post:

Read More