When trying to find out why Something Went Wrong during delivery of an email we sometimes want to look at the route by which it was delivered.
Did SPF break because of an unexpected forward? Did DKIM break because an intermediate mailserver modified the content of the message? Why did it take nine hours from the mail leaving our ESP to make it to the inbox? Did it really leave our ESP when they say it did? Did Microsoft internal handoffs break something again?
Received headers are the breadcrumbs that record the path an email took. Every time a mailserver receives an email it adds a Received header at the top of the mail (so the most recent Received headers come first). And it mustn’t modify any of the existing Received headers.
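If you want to read the hops in the order they happened, reverse the headers. Here is a minimal sketch using Python's standard `email` module. The first Received header in `RAW_MESSAGE` is the example used in this post; the second one (`app.example.invalid`, `HYPOTHETICAL1`) is invented purely for illustration, as is the function name `received_oldest_first`.

```python
from email import message_from_string

# The second Received header below is made up for illustration only.
RAW_MESSAGE = """\
Received: from mx15.a.outbound.createsend.com (mx15.a.outbound.createsend.com [203.55.21.15])
\tby mx.turscar.ie (Postfix) with ESMTPS id 7B8A8A24
\tfor <steve@blighty.com>; Wed, 27 Mar 2024 15:02:58 +0000 (UTC)
Received: from app.example.invalid (app.example.invalid [192.0.2.10])
\tby mx15.a.outbound.createsend.com with ESMTP id HYPOTHETICAL1;
\tWed, 27 Mar 2024 15:02:55 +0000
Subject: Hello

Body
"""


def received_oldest_first(raw):
    """Return the Received header values in the order the hops happened."""
    msg = message_from_string(raw)
    # get_all() returns the headers top to bottom, i.e. newest hop first.
    headers = msg.get_all("Received", [])
    # Collapse the folding whitespace so each header is a single line,
    # then reverse so the first hop comes first.
    return [" ".join(h.split()) for h in reversed(headers)]


hops = received_oldest_first(RAW_MESSAGE)
for number, hop in enumerate(hops, start=1):
    print(number, hop)
```

This only reorders what is in the message. It can't tell you whether a header lower down was forged by an earlier, untrusted server.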
A Received header looks complex, but it has quite a strict structure, and it’s fairly easy to read once you see that. Each Received header consists of a set of clauses, each with its own structure. If we put some line breaks in they’re easier to see.
Received:
from mx15.a.outbound.createsend.com (mx15.a.outbound.createsend.com [203.55.21.15])
by mx.turscar.ie (Postfix)
with ESMTPS
id 7B8A8A24
for <steve@blighty.com>
; Wed, 27 Mar 2024 15:02:58 +0000 (UTC)
The first clause consists of the literal word “from”, followed by the hostname that the sending server used in the HELO[1] at the beginning of the SMTP session. That’s optionally followed by some information about the connecting mailserver in parentheses: a literal IPv4 or IPv6 address, optionally preceded by the reverse DNS of that address.
The second clause starts with the literal word “by” followed by the hostname of the mailserver that’s adding this Received header. This can be optionally followed by the IP address of this server in parentheses, just the same as the “from” clause, but usually isn’t. In this example we have “Postfix” in parentheses, which is not part of the formal structure – it’s just a comment, about which … more later.
After the first two there are typically three or four more clauses, in no particular order.
A clause starting with the literal word “with” gives the protocol the mail was delivered over. The IETF maintain a full list of what these mean, but simply:
- If it begins with UTF8, the mail was sent as Unicode on the wire
- If the next bit is “SMTP” or “ESMTP”, it’s email; if it’s “LMTP” it’s email delivered locally
- If it ends in “S” or “SA” it was encrypted using TLS during delivery
- If it ends in “A” or “SA” the mail was authenticated, e.g. an end user used a username and password when sending it to their ISP’s smarthost
You sometimes see other terms here, such as HTTP or HTTPS to signify that the mail was submitted via a web API.
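Those rules are mechanical enough to sketch in a few lines of Python. This is only an illustration of the rules of thumb above, not a full implementation of the registered list. The function name `describe_with` and the keys of the dictionary it returns are my own invention.

```python
def describe_with(protocol):
    """Break a 'with' value such as 'ESMTPSA' into the properties it implies."""
    value = protocol.upper()
    result = {"base": None, "utf8": False, "tls": False,
              "authenticated": False, "local": False}
    # A leading UTF8 means the mail was sent as Unicode on the wire.
    if value.startswith("UTF8"):
        result["utf8"] = True
        value = value[len("UTF8"):]
    # ESMTP is checked before SMTP; either order works with startswith(),
    # but this keeps the longest name first.
    for base in ("ESMTP", "SMTP", "LMTP"):
        if value.startswith(base):
            suffix = value[len(base):]
            # Only the suffixes described above are accepted.
            if suffix in ("", "S", "A", "SA"):
                result["base"] = base
                result["tls"] = "S" in suffix
                result["authenticated"] = "A" in suffix
                result["local"] = base == "LMTP"
            break
    # base stays None for anything this sketch doesn't recognise (e.g. HTTPS).
    return result


print(describe_with("ESMTPS"))
print(describe_with("UTF8SMTPSA"))
```

Anything the sketch doesn't recognise, such as HTTP or HTTPS, comes back with `base` set to `None` rather than a guess.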
A clause starting with “id” gives the identifier that this machine used to track this mail as it was delivered. This is often critical to find more information about the delivery in this mailserver’s log files.
A clause starting with “for” gives the recipient email address (from the RCPT TO) of this email as it was delivered to this machine. If this changes it’s a sign that the email may have been forwarded, or sent via an exploder.
You’ll sometimes see a clause starting with “via” – this is a sign that the message came from a system not entirely SMTP, and this is the point at which it was converted to an SMTP mail.
After all the clauses there’ll be a semicolon, then the timestamp at which the mailserver received the mail and added the Received header. By comparing the timestamps (correcting for timezones) in each header you can see where a mail may have been delayed – either because it was sat in a queue waiting for delivery, or because its initial delivery was deferred with a 4xx response and it sat waiting for a redelivery attempt.
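Comparing timestamps by eye gets tedious when the servers are in different timezones, so here is a rough sketch that does the arithmetic with Python's standard library. It assumes every header has a semicolon followed by a parseable date with a numeric offset. The function name `hop_delays` is mine, and the second header in the example is hypothetical, written with a -0700 offset to show the timezone correction.

```python
import re
from email.utils import parsedate_to_datetime


def hop_delays(received_values):
    """Given Received values newest first (as they appear in the message),
    return the time spent between each pair of consecutive hops."""
    stamps = []
    for value in reversed(received_values):  # oldest hop first
        # The timestamp is whatever follows the last semicolon.
        date_text = value.rpartition(";")[2]
        # Drop trailing comments such as "(UTC)" before parsing.
        date_text = re.sub(r"\([^()]*\)", "", date_text).strip()
        stamps.append(parsedate_to_datetime(date_text))
    # Timezone-aware datetimes subtract correctly across different offsets.
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


headers = [
    "from mx15.a.outbound.createsend.com by mx.turscar.ie with ESMTPS "
    "id 7B8A8A24; Wed, 27 Mar 2024 15:02:58 +0000 (UTC)",
    # Hypothetical earlier hop, stamped in a different timezone.
    "from app.example.invalid by mx15.a.outbound.createsend.com with ESMTP "
    "id HYPOTHETICAL1; Wed, 27 Mar 2024 08:02:55 -0700",
]
delays = hop_delays(headers)
for delay in delays:
    print(delay.total_seconds(), "seconds")
```

Bear in mind that each timestamp comes from a different machine's clock, so a small or even negative difference usually means clock skew rather than time travel.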
The mailserver adding the Received header can also add human readable comments inside parentheses between any of the clauses. In the example I gave above it put a comment saying “Postfix” after the “by” clause. That doesn’t formally mean anything – it’s just an unstructured comment – but it’s very common for mailservers to include information identifying what software they’re running, or more details about the transaction (e.g. the return path at the point the mail was received) in comments. They’re usually fairly self-explanatory.
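Putting the structure together, a Received header can be pulled apart with very little code. What follows is a deliberately naive sketch, not a standards-complete parser: it throws away everything in parentheses (including the IP address after “from”), assumes the header has already been unfolded onto one line, and assumes none of the values is itself one of the clause keywords. The names `strip_comments` and `parse_received` are my own.

```python
import re

CLAUSE_KEYWORDS = {"from", "by", "via", "with", "id", "for"}


def strip_comments(text):
    """Remove parenthesised comments, innermost first, so nesting is handled."""
    previous = None
    while previous != text:
        previous = text
        text = re.sub(r"\([^()]*\)", " ", text)
    return text


def parse_received(value):
    """Split an unfolded Received value into a dict of clause -> value."""
    # Everything after the last semicolon is the timestamp.
    clauses_part, _, timestamp = value.rpartition(";")
    clauses = {}
    current = None
    for token in strip_comments(clauses_part).split():
        if token.lower() in CLAUSE_KEYWORDS:
            # A keyword starts a new clause.
            current = token.lower()
            clauses[current] = ""
        elif current is not None:
            # Anything else belongs to the clause we are in.
            clauses[current] = (clauses[current] + " " + token).strip()
    clauses["timestamp"] = strip_comments(timestamp).strip()
    return clauses


example = (
    "from mx15.a.outbound.createsend.com "
    "(mx15.a.outbound.createsend.com [203.55.21.15]) "
    "by mx.turscar.ie (Postfix) with ESMTPS id 7B8A8A24 "
    "for <steve@blighty.com> ; Wed, 27 Mar 2024 15:02:58 +0000 (UTC)"
)
parsed = parse_received(example)
for name, text in parsed.items():
    print(f"{name}: {text}")
```

If you need the connecting IP address or the software name, capture the parenthesised text before stripping it rather than discarding it as this sketch does.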
[1] Or more likely EHLO, but you know what I mean.