A DKIM primer resurrected
I was looking for some references today back in old blog posts. This means I discover some old links are dead, blog posts are gone or moved, and information is lost.
In this case it’s a post by J.D. Falk on deliverability.com. The link is dead (it looks like the whole website is dead), but I found a copy of his post and am reproducing it here. I don’t have permission, because I can’t get permission from him, but the content is extremely useful and I don’t want it lost.
Originally posted at http://blog.deliverability.com/2009/12/the-final-word-on-dkim-and-deliverability.html in December 2009
Seems like every week, I see another industry colleague asking for a detailed list of how each DKIM option affects deliverability. Everyone who’s asked for this is a smart person, generally clueful, but this question stumps me. Perhaps it’s that while I learned about email technology as a way to get a message from one autonomous system to another, they learned about it in the frustrating context of trying to figure out why their mail was being blocked, so it has never before crossed their minds that a new email technology might be invented that won’t make delivery of their marketing messages more difficult.
See, DKIM isn’t some wacky new anti-spam method intended to reduce your ability to get mail delivered (that’s what that made-up word “deliverability” means, after all). It’s authentication, designed to make it easier to identify the good senders.
DKIM only answers two questions:
- Does the message have a valid signature?
- If it does, what domain signed it?
The signing domain, identified by the d= tag in the DKIM signature header, is the only part of the DKIM signature where the choices you make now will directly affect the continued deliverability of your messages. This is because d= is how you tell the receiving system who you are.
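To make the d= tag concrete, here is a minimal sketch of how a receiver might pull the tags out of a DKIM-Signature header value. The function name `parse_dkim_tags` and the sample header are made up for illustration, and the parsing is deliberately loose (it strips all whitespace from values rather than following the full tag-list grammar in the spec).

```python
import re

def parse_dkim_tags(header_value):
    """Split a DKIM-Signature header value into a {tag: value} dict.

    Simplified sketch: all whitespace inside a value is removed, which is
    fine for d=, s=, h=, bh= and b= but is not a full RFC parser.
    """
    tags = {}
    for part in header_value.split(";"):
        if "=" not in part:
            continue  # skip an empty segment after a trailing semicolon
        # Split on the first "=" only, so base64 padding in b= survives.
        name, _, value = part.partition("=")
        tags[name.strip()] = re.sub(r"\s+", "", value)
    return tags

# Hypothetical header value, folded across lines the way real ones are.
example = (
    "v=1; a=rsa-sha256; c=relaxed/simple; d=marketing.example.com;\r\n"
    "\ts=2009dec; h=from:to:subject:date;\r\n"
    "\tbh=AAAA; b=BBBB"
)

tags = parse_dkim_tags(example)
print("signing domain:", tags["d"])
```

Whatever else is in the header, the value under `"d"` is the identity a receiver would attach reputation to.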
With a valid signature on the message, if the receiving system has a domain-based whitelist, and your d= is on it, the message gets in. If they have a domain-based blacklist, and your d= is on it, the message will be rejected. Few mailbox providers have either of those today — but if they have a domain-based reputation system, which we know the big mailbox providers are working on, then delivery depends on reputation. It’s exactly the same as with IP addresses today.
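The decision described above can be sketched in a few lines. This is purely illustrative: the function name `disposition`, the list contents, the reputation scores, and the 0.5 threshold are all assumptions, and no real mailbox provider's logic is this simple.

```python
def disposition(signature_valid, signing_domain, whitelist, blacklist,
                reputation, threshold=0.5):
    """Return a hypothetical delivery decision keyed on the d= domain."""
    if not signature_valid:
        # Without a valid signature, d= tells the receiver nothing.
        return "unsigned"
    if signing_domain in whitelist:
        return "accept"
    if signing_domain in blacklist:
        return "reject"
    score = reputation.get(signing_domain)
    if score is None:
        return "unknown"  # no history yet for this domain
    return "accept" if score >= threshold else "reject"

# Made-up data for illustration only.
whitelist = {"corporate.example.com"}
blacklist = {"spammy.example.net"}
reputation = {"marketing.example.com": 0.8, "shady.example.org": 0.1}

print(disposition(True, "marketing.example.com", whitelist, blacklist, reputation))
```

The point of the sketch is that every branch after the signature check looks only at the signing domain, the same way IP-based systems look only at the sending address.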
And just as with IP addresses, consistency is critical. If you want to separate different mailstreams, then instead of sending from different IPs like you were before, you can now sign with different domains: shipping.example.com, marketing.example.com, corporate.example.com. Within the context of authentication, each of those is an entirely separate entity. Reputation assessment systems will quickly figure out that there’s a relationship between everything that’s part of example.com, though, so you can’t use this to escape the much-deserved bad reputation of a bad mail stream.
If you send through an ESP today, chances are they sign with their own domain. This means that if you switch to another ESP, you can’t take your reputation with you. However, it also means you can borrow the ESP’s reputation as long as you’re their customer. Work with your ESP to choose the configuration most appropriate for your situation.
So you can stop worrying, sign your mail, and get back to the important work of making sure your recipients are happy to receive the messages you send.
If you’re interested, here’s a rundown of all the other options in the base spec — RFC 4871 — and what effect they’re likely to have on delivery of signed messages. If you haven’t read the introduction and the terminology and definitions section yet, please do so now.
There’s currently only one acceptable value for the version (v=) tag. If yours isn’t 1, then the DKIM signature isn’t valid. Effect on deliverability: none if it’s 1, otherwise the message will be treated as if it wasn’t signed.
The algorithm (a=) is very important to cryptography geeks, but we’re not talking about ICBM launch codes here. Unless you remember why DLG2209TVX was replaced with CPE1704TKS, accept whichever algorithm and key size your mail software vendor or ESP recommends and be done with it. (Just watch, someone will comment that rsa-sha1 is insecure because someone could decrypt it in a matter of months — per message.) Effect on deliverability: none.
Canonicalization (c=) is a sneaky way to get around the fact that sometimes an intermediary mail server will make minor changes to a message, like capitalizing header field names or snipping empty lines at the end of a message. With the default “simple” algorithm, most changes of that kind would cause the signature verification to fail. With the “relaxed” algorithm, the signature can still verify after such changes. Effect on deliverability: none unless verification fails.
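A rough sketch of the difference, for header fields only: the "relaxed" header rules in RFC 4871 lowercase the field name, unfold continuation lines, collapse runs of whitespace, and drop whitespace around the colon, while "simple" leaves the field untouched. The function names and the sample Subject line below are made up for illustration.

```python
import re

def simple_header(name, value):
    """'simple' canonicalization: the header field is used exactly as-is."""
    return f"{name}:{value}\r\n"

def relaxed_header(name, value):
    """'relaxed' header canonicalization, following the steps in RFC 4871."""
    name = name.strip().lower()                   # lowercase the field name
    value = re.sub(r"\r\n(?=[ \t])", "", value)   # unfold continuation lines
    value = re.sub(r"[ \t]+", " ", value)         # collapse whitespace runs
    value = value.strip()                         # drop whitespace at the ends
    return f"{name}:{value}\r\n"

# What the sender signed, and what a hypothetical relay turned it into.
sent = ("Subject", " Your   order has shipped")
relayed = ("SUBJECT", " Your order has shipped ")

print(simple_header(*sent) == simple_header(*relayed))
print(relaxed_header(*sent) == relaxed_header(*relayed))
```

Under "simple" the two versions hash differently, so the signature breaks; under "relaxed" both reduce to the same bytes before hashing.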
You can choose to specify, in the h= tag, which header fields you’re signing. There’s a good description in the base spec of why you might or might not choose particular fields. If you use this, I’d go with the headers that users are likely to see in their mail client, plus anything you use for tracking. Effect on deliverability: none.
Similarly, you can copy all of the signed header fields into the signature with the z= tag. I’m not sure why you would, except for debugging. Effect on deliverability: none.
The selector (s=) is just a way to look up which key you’re using, allowing you to use multiple keys with the same domain. You might have different keys for different offices, or systems, or create a key that you can give to your ESP to sign on your behalf. The selector is also useful for changing keys periodically, in case the private key is no longer private — for example, you could change selectors every other month, removing old ones a few months after you’ve stopped using ’em. Effect on deliverability: none.
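For the curious, the lookup works by combining the selector with the signing domain: the public key lives in a DNS TXT record under the `_domainkey` subdomain. A minimal sketch of building that name follows; the helper name `dkim_key_query_name` and the selector values are hypothetical.

```python
def dkim_key_query_name(selector, domain):
    """Build the DNS name where the public key for s=/d= is published."""
    return f"{selector.strip('.')}._domainkey.{domain.strip('.')}"

# Two hypothetical selectors on the same domain, e.g. during a key rotation.
old_key = dkim_key_query_name("2009oct", "marketing.example.com")
new_key = dkim_key_query_name("2009dec", "marketing.example.com")

print(old_key)
print(new_key)
```

Because each selector maps to its own record, the old key can stay published while mail signed with it is still in flight, then be removed later.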
A somewhat controversial option is the body length limit, designated by the l= tag. This allows the signer to say “I signed this much of the message, but there might be more content after that — and if so I’m not responsible for it.” It’s a reaction to discussion list software which may automatically add an informational footer to the end of a message. Thing is, these lists invariably make other changes also — new headers, et cetera — so the signature would be broken anyway. And, if your focus is on keeping the recipient safe (as it is for all mailbox providers), why would you deliver a message where the top part is from a trusted sender and the bottom part could be malware? Effect on deliverability: could be bad. Don’t use this.
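To see why l= is risky, here is a small sketch of a body hash that only covers the first l bytes. It skips body canonicalization entirely and the helper name `body_hash` and the message text are made up, so treat it as an illustration of the idea rather than a verifier.

```python
import base64
import hashlib

def body_hash(body, length=None):
    """Base64 SHA-256 of the body, optionally limited to the first `length` bytes."""
    covered = body if length is None else body[:length]
    return base64.b64encode(hashlib.sha256(covered).digest()).decode("ascii")

signed_body = b"Your order has shipped.\r\n"
limit = len(signed_body)  # the value the signer would put in l=

# Someone in the middle appends content after the signed portion.
tampered = signed_body + b"Click here: http://malware.example/\r\n"

print(body_hash(signed_body, limit) == body_hash(tampered, limit))
print(body_hash(signed_body) == body_hash(tampered))
```

With the length limit, the hash of the tampered message matches the original, so the appended text rides along under a valid signature. Without it, the hashes differ and verification fails.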
The q= value is easy: it can only be “dns/txt”. Anything else is invalid. Effect on deliverability: none if it’s dns/txt, otherwise the message will be treated as if it wasn’t signed.
There are two optional tags referring to time: t= is the time the signature was created, while x= is when it expires. Both of these are designed to catch stupid criminals. If the signature was (allegedly) created after the message was received, it’s not valid. Or if the message is received after the signature expires, it’s not valid. While it’s not entirely clear what will happen in the wild, I’d recommend skipping both of these. Effect on deliverability: none if the times match up or the tags aren’t used; otherwise, the message will appear suspicious.
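The two time checks described above might look something like this on the receiving side. It is a sketch under assumptions: the function name `timestamps_ok` is made up, the tags are taken to be already parsed into a dict of strings, and real verifiers typically allow for some clock skew, which is left out here.

```python
def timestamps_ok(tags, received_at):
    """Check t= (creation) and x= (expiration) against the receive time.

    `tags` maps tag names to string values; times are seconds since the
    Unix epoch, as in the DKIM-Signature header.
    """
    created = tags.get("t")
    expires = tags.get("x")
    if created is not None and int(created) > received_at:
        return False  # signature claims to be newer than the delivery
    if expires is not None and int(expires) < received_at:
        return False  # signature had already expired on arrival
    return True       # tags absent, or the times line up

received = 1260000000  # hypothetical receive time, December 2009

print(timestamps_ok({"t": "1259990000", "x": "1260600000"}, received))
print(timestamps_ok({"t": "1260000500"}, received))
```

Leaving both tags out, as recommended above, always takes the last branch.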
A formerly controversial feature is the i= tag, which looks like an email address but probably isn’t. As I explained back in March, Cisco uses this to identify individual users: email@example.com, if Santa Claus worked for Cisco. And you know, he might. More commonly, I’d expect, senders will use i= to denote distinct mailstreams or internal divisions for their own tracking purposes: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org. Thing is, there’s simply no way for anyone on the receiving side to know whether email@example.com is a mailstream, a department, an individual email address, or simply a string of randomly generated characters. As such, reputation is more likely to accrue to the d= value. Effect on deliverability: probably none.
So unless you use l= or have unrealistic expectations about i= or s=, as we discussed above, d= is the only thing that matters. See? Nothing to worry about.