Can a bad SPF record ruin your delivery, even though all your mail still passes SPF?
Yes, it can.
One of our clients had issues with poor delivery rates to the inbox at gmail and came to us with the theory that it was due to other people using their domain to send spam to gmail. This theory was based on ReturnPath instrumentation showing mail “from” their domain coming from other IP addresses, and a plausible looking correlation between that mail being sent and their problems at gmail.
Checking their bounce handler, we see a lot of bounces coming in suggesting that someone is sending poor quality mail using their bounce domain from quite a few IP addresses, including a suspicious number scattered in small blocks across 126.96.36.199/8.
Their question they had was whether they should publish DMARC records to fix the problem, and whether they should use a DMARC policy of p=reject or p=none. They’re a good candidate for DMARC – their domains are used purely for bulk or transactional mail, they have a tightly controlled mail infrastructure for their marketing domains, and they’re already publishing SPF records and signing all the mail they send with DKIM.
I was half way through writing up my normal answer about DMARC deployment for customers with this sort of mail infrastructure – “It won’t help with delivery problems directly, but publishing with p=none and analyzing the reports you get back will give you insight into your mail flows, and provide the data you need to decide whether using DMARC p=reject is appropriate for your business model and mail flows.” – when I realized that something just didn’t make sense.
Gmail, perhaps more than most other mailbox providers, base their delivery decisions on data they gather mechanically from all their mailboxes. And they really understand domain-based reputation and the difference between authenticated and non-authenticated email. Why on earth would non-authenticated email from an unrelated IP address be damaging the domain reputation, and hence the delivery of authenticated legitimate email? That makes no sense.
Meanwhile, over in our slack channel, Josh was double-checking their infrastructure…
Oops. They have a small block of 8 IP addresses from which they send most of their email. When setting up their SPF records they inadvertently used ip4:188.8.131.52/8 instead of ip4:184.108.40.206/29 for that block of addresses. A /8 isn’t eight IP addresses – it’s every one of the 16,777,216 IP addresses that begins with “69.”.
Suddenly everything makes sense.
The SPF thinko means that all mail claiming to be from the client domain that’s sent from any IP address beginning with “69.” passes SPF – including the deluge of spam coming from the snowshoe spammers in 69.64.*.
Gmail (and other ISPs) don’t see a difference between the legitimate email and the SPF authenticated spam – they’re just seeing a high volume of authenticated email from the client domain, a large fraction of which is spam. That’s damaged the reputation of the client domain, causing their legitimate email to end up in the spam folder.
(The reality of filtering is more than just domain reputation, of course, but a terrible domain reputation is definitely going to cause you problems.)
The immediate action to take is simple – fix the SPF record so only legitimate mail will be authenticated. That’ll take effect within a couple of hours, as the old SPF record has a short TTL, and ISPs will start seeing the correct SPF record and begin rejigging their reputation.
We’ll keep monitoring delivery rates, check how long ISPs take to notice reputation changes, potentially reach out to some ISPs to see if it’s appropriate for them to do a one-time reputation reset for the affected domains, but we’re hoping things will begin to improve in the next few days.
What can you do to avoid or mitigate this sort of problem?
- Check your SPF records are sane. We have an SPF checker that might be useful.
- Have a reasonable TTL on your SPF DNS records. Too low and you’ll cause too much DNS traffic and risk timeouts. Too high and it could take days for an SPF fix to propagate. An hour or two is reasonable.
- Monitor traffic to your bounce domain, even if they’re not valid bounces. Anyone spamming using your domain will be sending to at least some recipients who’ll cause asynchronous bounces, and a spike there can be a sign that someone is misusing your domain.
Something else to think about: did the spammer pick our client’s domain at random, or did they search for a domain with a good reputation and an SPF bug they could take advantage of?