A DKIM primer resurrected

I was looking through old blog posts for some references today. Doing that means I discover that some old links are dead, blog posts are gone or moved, and information is lost.
In this case it’s a post by J.D. Falk on deliverability.com. The link is dead (it looks like the whole website is dead), but I found a copy of his post and am reproducing it here. I don’t have permission, because I can’t get permission from him, but the content is extremely useful and I don’t want it lost.

The Final Word on DKIM and Deliverability

Originally posted at http://blog.deliverability.com/2009/12/the-final-word-on-dkim-and-deliverability.html in December 2009

By: J.D. Falk of Return Path
Seems like every week, I see another industry colleague asking for a detailed list of how each DKIM option affects deliverability. Everyone who’s asked for this is a smart person, generally clueful, but this question stumps me. Perhaps it’s that while I learned about email technology as a way to get a message from one autonomous system to another, he learned about it in the frustrating context of trying to figure out why his mail was being blocked — so it has never before crossed his mind that new email technology might be invented that won’t make delivery of his marketing messages more difficult.
See, DKIM isn’t some wacky new anti-spam method intended to reduce your ability to get mail delivered (that’s what that made-up word “deliverability” means, after all.) It’s authentication, designed to make it easier to identify the good senders.
DKIM only answers two questions:

  1. Does the message have a valid signature?
  2. If it does, what domain signed it?

The signing domain, identified by the d= tag in the DKIM signature header, is the only part of the DKIM signature where the choices you make now will directly affect the continued deliverability of your messages. This is because d= is how you tell the receiving system who you are.
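To make the d= point concrete, here is a minimal Python sketch of pulling the signing domain out of a DKIM-Signature header value. The function name, the sample signature, and the placeholder bh=/b= values are made up for illustration, and the parsing is deliberately simplified (it ignores some corner cases of the tag-list grammar), so treat it as a sketch rather than a verifier.

```python
def parse_dkim_tags(header_value: str) -> dict:
    """Split a DKIM-Signature header value into a tag -> value mapping.

    Simplified: assumes tags are separated by ';' and drops all
    whitespace inside values, which is fine for d=, s=, a= and friends.
    """
    tags = {}
    for part in header_value.split(";"):
        if "=" not in part:
            continue  # skip empty segments, e.g. after a trailing ';'
        name, _, value = part.partition("=")  # split on the FIRST '=' only
        tags[name.strip()] = "".join(value.split())
    return tags


# Hypothetical signature; bh= and b= are placeholders, not real hashes.
example = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=marketing.example.com; "
    "s=2009dec; h=from:to:subject:date; bh=PLACEHOLDER; b=PLACEHOLDER"
)

tags = parse_dkim_tags(example)
print(tags["d"])  # the domain a receiver would attach reputation to
```

A receiver that keeps domain reputation would only look up `tags["d"]` after the signature has verified; an unverified d= tells it nothing.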
With a valid signature on the message, if the receiving system has a domain-based whitelist, and your d= is on it, the message gets in. If they have a domain-based blacklist, and your d= is on it, the message will be rejected. Few mailbox providers have either of those today — but if they have a domain-based reputation system, which we know the big mailbox providers are working on, then delivery depends on reputation. It’s exactly the same as with IP addresses today.
And just as with IP addresses, consistency is critical. If you want to separate different mailstreams, then instead of sending from different IPs like you were before, you can now sign with different domains: shipping.example.com, marketing.example.com, corporate.example.com. Within the context of authentication, each of those is an entirely separate entity. Reputation assessment systems will quickly figure out that there’s a relationship between everything that’s part of example.com, though, so you can’t use this to escape the much-deserved bad reputation of a bad mail stream.
If you send through an ESP today, chances are they sign with their own domain. This means that if you switch to another ESP, you can’t take your reputation with you. However, it also means you can borrow the ESP’s reputation as long as you’re their customer. Work with your ESP to choose the configuration most appropriate for your situation.
So you can stop worrying, sign your mail, and get back to the important work of making sure your recipients are happy to receive the messages you send.
If you’re interested, here’s a rundown of all the other options in the base spec — RFC 4871 — and what effect they’re likely to have on delivery of signed messages. If you haven’t read the introduction and the terminology and definitions section yet, please do so now.
There’s currently only one acceptable value for the version (v=) tag. If yours isn’t 1, then the DKIM signature isn’t valid. Effect on deliverability: none if it’s 1, otherwise the message will be treated as if it wasn’t signed.
The algorithm (a=) is very important to cryptography geeks, but we’re not talking about ICBM launch codes here. Unless you remember why DLG2209TVX was replaced with CPE1704TKS, accept whichever algorithm and key size your mail software vendor or ESP recommends and be done with it. (Just watch, someone will comment that rsa-sha1 is insecure because someone could decrypt it in a matter of months — per message.) Effect on deliverability: none.
Canonicalization (c=) is a sneaky way to get around the fact that sometimes an intermediary mail server will make minor changes to a message, like capitalizing header field names or snipping empty lines at the end of a message. With the default “simple” algorithm, those changes would cause the signature verification to fail. With the “relaxed” algorithm, those changes may pass. Effect on deliverability: none unless the message fails.
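As a rough illustration of the difference, the Python sketch below approximates "relaxed" header canonicalization (lowercase the field name, unfold, collapse whitespace, trim) and compares it with leaving the header untouched, as "simple" does. The function names are mine, and this covers headers only, not the body rules.

```python
import re


def simple_header(name: str, value: str) -> str:
    """'simple' leaves the header exactly as it was sent."""
    return f"{name}:{value}"


def relaxed_header(name: str, value: str) -> str:
    """Approximation of 'relaxed' header canonicalization."""
    value = re.sub(r"\r?\n(?=[ \t])", "", value)  # unfold continuation lines
    value = re.sub(r"[ \t]+", " ", value)         # collapse whitespace runs
    return f"{name.strip().lower()}:{value.strip()}"


# What the sender signed, and what a hypothetical relay turned it into.
sent = ("Subject", " Hello   World")
relayed = ("SUBJECT", "  Hello World ")

print(simple_header(*sent) == simple_header(*relayed))    # False
print(relaxed_header(*sent) == relaxed_header(*relayed))  # True
```

The hash is computed over the canonicalized form, so in this example the relay's cosmetic changes break a "simple" signature while a "relaxed" one survives.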
You can choose to specify, in the h= tag, which header fields you’re signing. There’s a good description in the base spec of why you might or might not choose particular fields. If you use this, I’d go with the headers that users are likely to see in their mail client, plus anything you use for tracking. Effect on deliverability: none.
Similarly, you can copy all of the signed header fields into the signature with the z= tag. I’m not sure why you would, except for debugging. Effect on deliverability: none.
The selector (s=) is just a way to look up which key you’re using, allowing you to use multiple keys with the same domain. You might have different keys for different offices, or systems, or create a key that you can give to your ESP to sign on your behalf. The selector is also useful for changing keys periodically, in case the private key is no longer private — for example, you could change selectors every other month, removing old ones a few months after you’ve stopped using ’em. Effect on deliverability: none.
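For reference, the selector and the signing domain combine into the DNS name where the public key is published: the selector, then the literal label `_domainkey`, then the d= domain. A tiny Python sketch, with a made-up helper name and made-up selectors:

```python
def dkim_key_record_name(selector: str, domain: str) -> str:
    """Build the DNS name where a verifier looks up the public key."""
    return f"{selector}._domainkey.{domain}"


# Hypothetical rotation: two selectors live side by side during changeover.
for selector in ("2009nov", "2009dec"):
    print(dkim_key_record_name(selector, "example.com"))
```

Because each selector is its own DNS record, the old key can stay published until mail signed with it has stopped arriving, and then be removed.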
A somewhat controversial option is the body length limit, designated by the l= tag. This allows the signer to say “I signed this much of the message, but there might be more content after that — and if so I’m not responsible for it.” It’s a reaction to discussion list software which may automatically add an informational footer to the end of a message. Thing is, these lists invariably make other changes also — new headers, et cetera — so the signature would be broken anyway. And, if your focus is on keeping the recipient safe (as it is for all mailbox providers), why would you deliver a message where the top part is from a trusted sender and the bottom part could be malware? Effect on deliverability: could be bad. Don’t use this.
The q= value is easy: it can only be “dns/txt”. Anything else is invalid. Effect on deliverability: none if it’s dns/txt, otherwise the message will be treated as if it wasn’t signed.
There are two optional tags referring to time: t= is the time the signature was created, while x= is when it expires. Both of these are designed to catch stupid criminals. If the signature was (allegedly) created after the message was received, it’s not valid. Or if the message is received after the signature expires, it’s not valid. While it’s not entirely clear what will happen in the wild, I’d recommend skipping both of these. Effect on deliverability: none if the times match up or the tags aren’t used; otherwise, the message will appear suspicious.
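The two time checks described above could be sketched like this in Python. The function name and return strings are hypothetical, the tag values are assumed to be seconds since the Unix epoch, and a real verifier would likely allow for some clock drift.

```python
import time


def check_signature_times(tags: dict, received_at: float = None) -> str:
    """Apply the t= (created) and x= (expires) sanity checks."""
    now = time.time() if received_at is None else received_at
    created = tags.get("t")
    expires = tags.get("x")
    if created is not None and int(created) > now:
        return "created in the future"  # signed "after" it was received
    if expires is not None and int(expires) < now:
        return "expired"
    return "ok"  # times line up, or the tags are simply absent


# Hypothetical signature valid between second 1000 and second 2000.
tags = {"t": "1000", "x": "2000"}
print(check_signature_times(tags, received_at=1500))
```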
A formerly controversial feature is the i= tag, which looks like an email address, but probably isn’t. As I explained back in March, Cisco uses this to identify individual users: i=santaclaus@cisco.com, if Santa Claus worked for Cisco. And you know, he might. More commonly, I’d expect, senders will use i= to denote distinct mailstreams or internal divisions for their own tracking purposes: i=transactional@example.com, i=marketing@example.com, i=nyc-office@example.com. Thing is, there’s simply no way for anyone on the receiving side to know whether marketing@example.com is a mailstream, a department, an individual email address, or simply a string of randomly generated characters. As such, reputation is more likely to accrue to the d= value. Effect on deliverability: probably none.
So unless you use l= or have unrealistic expectations about i= or s=, as we discussed above, d= is the only thing that matters. See? Nothing to worry about.

Related Posts

What to expect in 2016

I don’t always do predictions posts, even though they’re popular. Most years I skip them because I don’t see major changes in the email space. And I’m not the type to write a prediction post just to post a prediction.
This year, though, I do see changes for everyone in the email space. Most of them center on finally having to deal with the technical debt that’s been accumulating over the past few years. I see ISPs and ESPs spending a lot of development effort to cope with the ongoing evolution of authentication requirements.
When people started seriously looking at how to authenticate email, the first goal was getting organizations to implement the protocols. This was a practical concession; in order for a new protocol to be used it needs to be widely implemented. Phase one of authenticating email was simply about publishing protocols and getting organizations to use them.
During phase one, the organization that authenticated a mail wasn’t important. In fact, the SPF spec almost guarantees that the ESP domain is the authenticated domain. In DKIM, the spec says any domain could sign as long as they could publish a public key in that domain’s domainkeys record.
ESPs took full advantage of this and lowered their own development overhead by taking most of the authentication responsibility on themselves. Their domains were in the 5321.from and they published the SPF records. Domains they control were in the d= and they generated and published the DKIM keys. Mail was authenticated without ESP customers having to do much.
We’ve hit the end of phase one. Most of the major players in the email space are authenticating outbound email. Many of the major players are checking authentication on the inbound. Phase one was a success.
We’re now entering phase two, and that changes things. In phase two, SPF and DKIM are used as the foundation for user visible authentication. Neither SPF nor DKIM was designed to be a user visible protocol. To understand what they’re authenticating you have to understand SMTP and email. Even now there are days when I begin talking about one of them and have to take a step back and think hard about what is being authenticated. And I use these things every day!
DMARC is the first of these end user visible protocols built on SPF and DKIM. It uses the established and widespread authentication to validate the user visible from address. This authentication requires that the d= value or the 5321.from address belong to the same domain as the visible from address. While you can pick whether the alignment between the visible from and the authentication is “strict” or “relaxed”, alignment itself is not optional.
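Here is a rough Python sketch of what the alignment check amounts to. The function names are mine, and the "organizational domain" logic is a loud simplification: it just keeps the last two labels, whereas real implementations consult something like the Public Suffix List (so this would get domains under co.uk and similar suffixes wrong).

```python
def org_domain(domain: str) -> str:
    """ASSUMPTION: naive organizational domain = last two labels.

    Real DMARC implementations use a public suffix list instead.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_aligned(from_domain: str, auth_domain: str, mode: str = "relaxed") -> bool:
    """Compare the visible from domain with a d= or 5321.from domain."""
    visible = from_domain.lower().rstrip(".")
    authed = auth_domain.lower().rstrip(".")
    if mode == "strict":
        return visible == authed  # exact match required
    return org_domain(visible) == org_domain(authed)  # same organization


# Visible from is example.com; the mail is signed with a subdomain.
print(is_aligned("example.com", "marketing.example.com"))            # True
print(is_aligned("example.com", "marketing.example.com", "strict"))  # False
# Signed with a hypothetical ESP's own domain: not aligned in either mode.
print(is_aligned("example.com", "esp.example.net"))                  # False
```

The last case is the one that matters for the rest of this post: a signature from the ESP's own domain still verifies, but it does nothing for DMARC.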
Prior to DMARC no one really paid much attention to the domain doing the authentication. Authentication was a yes or a no question. If the answer was yes, then receivers could use the authenticated domain to build a reputation. But they weren’t really checking much in the way of who was doing the authentication.
In the push to deploy authentication, ESPs took the responsibility and did most of the work. For many or most customers, authentication was as simple as clicking a checkbox during deployment. Some ESPs do currently let customers authenticate the mail themselves, but there’s enough overhead in getting that deployed that they often charge extra to cover the costs.
DMARC is rapidly becoming an expectation or even a full-on requirement for inbox delivery. In order to authenticate with DMARC, the authenticating domain must be in the same domain space as the visible from. If senders want to use their own domain in the visible from, DNS records have to be present in that domain space. Whether it’s an SPF TXT record or a domainkeys record, the email sender customer needs to publish the correct information in DNS. Even now, if you try to authenticate with DKIM through Google Apps, they require you to publish DNS records.
ESPs aren’t in a situation where they can effectively manage authentication alignment for all their customers. Hosting companies are in even worse shape when it comes to letting customers authenticate email. Developers are facing the fact they need to go back and rework their authentication code. Businesses are facing the fact they need to change their processes so customers can authenticate with DMARC.
It’s not just the infrastructure providers that are facing challenges with authentication. Senders are going to discover they can no longer hand authentication off to their ESPs and not worry about it. They’re going to have to get DNS records published by their own staff.
Getting DNS updates through some big companies is sometimes more difficult than it should be. I had one client a few years ago where getting rDNS changed to something non-generic took over a month. From an IT standpoint, changing DNS should require approvals and proper channels. Marketers may find this new process challenging.
And, if organizations want to publish reject policies for their domains, then they will have to publish records for every outside provider they use. Some of those providers can’t support DMARC alignment right now.
In 2016 a lot of companies will discover their current infrastructure can’t cope with modern authentication requirements. A lot of effort, both in terms of product development and software development, will need to be spent to meet current needs. This means a lot of user visible features will be displaced while the technical debt is paid.
These changes will improve the security and safety of email for everyone. It won’t be very user visible, which will give the impression this was a slow year for email development. Don’t let that fool you: this will be a pivotal year in email.
