Technical

Setting up a smarthost

We run most of our own network services – inbound and outbound email, DNS and web presence. We run separate services for inbound and outbound email to give us more flexibility in how we set things up.

Don’t add your domain to the Public Suffix List

(At least, not if you ever intend to use it for email. It might break the domain for email, maybe forever.)

Diagnosing Hard Bounces

A very short post about diagnosing hard bounces, because I’ve had to give the same advice to a dozen folks over the past few months.

Stop using Entrust for your BIMI Certificates

In July I talked about how Entrust was mistrusted by, well, pretty much everyone due to a years-long series of security and trust violations.

How long does a tracking link need to be?

There’s text and then there’s text

If you want to send someone an email with some text in it there are quite a few different ways you can do it. The main differences are the ways the text is packaged up in MIME entities to be sent.

Your bounce classification is a bit rubbish

When a mailbox provider rejects or defers an email it sends back a message explaining why.

Do we care about SPF alignment?

SPF and DKIM are the two main ways we associate a domain name with a stream of email in an authenticated way. We can choose the DKIM signing domain fairly freely – we can choose any domain or subdomain we control and put it in the d= field of the DKIM signature. But our choice for the SPF domain is more constrained.

Comparing DKIM keys

Sometimes we have a client who has done something wrong when setting up authentication. Their DKIM signing fails due to something being wrong with the public key they’ve published.

DNS Failures

We use DNS a lot in email, particularly for authentication, so diagnosing why DNS isn’t returning what we expect it to is a pretty common challenge. And DNS responses aren’t exactly the clearest thing to understand.

Bears and Spam Filters

“Why is my inbox full of spam, while I still can’t get the mail I send into the inbox reliably?” — most email marketers at some point in their career.

Prefetches and Proxies

Jody asks “Are ‘prefetch opens’ and ‘proxy opens’ the same thing?”

Sendy and one-click unsubscribe

If you’re using sendy and you’ve found that RFC 8058 one click unsubscribe fails – or, worse, seems to work but doesn’t actually unsubscribe the user – you should take a look at James’ workaround.

Sending domains and hostnames

Lots of times I see someone asking a question and they talk about their sending domain. And it’s sometimes not 100% clear which domain they mean by that – and when we’re talking about alignment and reputation it can make a difference. So here’s a list of (some of?) the different places a mailserver uses a domain.

I always want to say “Emails, and Opens, and Clicks… Oh My!” when I’m talking about them.

SWAKS: Test your SMTP

We’ve mentioned SWAKS here a few times – but I always use it for delivering test mail directly to a recipient’s MX.

Anatomy of a Received header

When trying to find out why Something Went Wrong during delivery of an email we sometimes want to look at the route by which it was delivered.

TXT Records

DKIM public keys live in DNS TXT records. A DNS TXT record contains strings of text, and each string is limited to be no more than 255 characters long.

One-click unsubscribe

The worst thing about the yahoogle requirements has been their use of the term “one-click unsubscribe”. It’s an overloaded term that’s being used here to mean RFC 8058 in-app unsubscription. That’s a completely different thing to what one-click unsubscription has been used to mean for decades, often in the context of complying with legal requirements around unsubscription.

Don’t trust Gmail’s Show Original

It’s not always easy to know what the actual headers and body of an email as sent look like. For a long time accepted wisdom was that you could send a copy to your gmail account, and use the Show Original menu option to, well, see the original message as raw text.

About My Email

Happy 2024, everyone!

We’ve released a shiny new tool to let folks self-check a lot of common questions we see about email requirements.

Can you STARTTLS?

Email supports TLS (Transport Layer Security), what we used to call SSL.

Customer subdomain authentication

EDIT: Now with a production-ready implementation I talk about more here.

Wildcards and DKIM and DMARC, oh my!

If you’re an ESP with small customers you may have looked at the recent Google / Yahoo requirements around DMARC-style alignment for authentication and panicked a bit.

The trouble with CNAMEs

When you query DNS for something you ask your local DNS recursive resolver for all answers it has about a hostname of a certain type. If you’re going to a website your browser asks your resolver for all records for “google.com” of type “A”¹ and it will either return all the A records for google.com it has cached, or it will do the complex process of looking up the results from the authoritative servers, cache them for as long as the TTL field for the reply says it should, then return them to you.

How to Unsubscribe

Eventually our subscribers won’t want our email in their inbox any more.

The Case of the 500-mile Email

I stumbled across this story again this morning, and it’s such a lovely delivery yarn I thought I’d share it.

iOS17 filtering click tracking links

I’ve heard quite a bit of concern about what iOS 17’s automatic removal of click-tracking parameters means, but less discussion of what it actually does.

Trekkie Monster. He’s obsessed by social media and isn’t owned by Children’s Television Workshop.

What is a Cookie?

I’m not talking about biscuits, nor about web cookies, at least not exactly.

Unresolvable RFC.5321 domain at Yahoo

Seen this recently?

451 Message temporarily deferred due to unresolvable RFC.5321 from domain; see https://postmaster.yahooinc.com/error-codes

Is .edu a canary?

Several times recently I’ve heard about something unusual happening email delivery-wise at academic domains that was new, and wasn’t being seen at non-academic domains on the same lists.

“Friendly From” addresses

When we’re looking at the technical details of email addresses there are two quite different contexts we talk about.

Don’t break the (RFC) rules

It looks like Microsoft are getting pickier about email address syntax, rejecting mail that uses illegal address formats. That might be what’s causing that “550 5.6.0 CAT.InvalidContent.Exception: DataSourceOperationException, proxyAddress: prefix not supported – ; cannot handle content of message” rejection.

Life of an Email

I’m repeating the presentation I gave at M3AAWG in London for the Certified Senders Alliance.

It’s all about how to send an email by hand, and how knowing the mechanics of how an email is sent can help us diagnose email delivery issues.

We’re starting in about five hours from when I post this.

Register at https://register.gotowebinar.com/register/2268789893122531343

Sending email

I did a class at M3AAWG teaching the basic mechanics of sending an email, both really by hand using dig and netcat, and using SWAKS. No slides, but if you’re interested in the script I’ve posted a very rough copy of my working notes here.

Stop with the incorrect SPF advice

Another day, another ESP telling a client to publish a SPF include for the wrong domain. It shouldn’t annoy me, really. It’s mostly harmless and it’s just an extra DNS look up for most companies. Heck, we followed Mailchimp’s advice and added their include to our bare root domain and it’s not really a huge deal for companies with only a couple SaaS providers. Still, it’s an incorrect recommendation and it does cause problems for some senders who are using multiple SaaS providers and Google.

Command Line Tools

Tools that you run from the command line – i.e. from a terminal or shell window – are often more powerful and quicker to use than their GUI or web equivalents.

Apple MPP reporting and geolocation

A while back I wrote about Apple Mail Privacy Protection, what it does and how it works. Since MPP was first announced I’d assumed that it would be built on the same infrastructure as iCloud Private Relay, Apple’s VPN product, but hadn’t seen anything from Apple to explicitly connect the two and didn’t have access to enough data to confirm it independently.

Apple MPP

You’ve probably heard about Apple Mail Privacy Protection. Email marketing chat has been all a-twitter about it since it was announced in June.

The OSI Seven Layer Model

In the 1970s, while the early drafts of the Internet were being developed, a competing model for networking was being put together by the ISO (International Organization for Standardization).

Authentication

Some notes on some of the different protocols used for authentication and authentication-adjacent things in email. Some of this is oral history, and some of it may be contradicted by later or more public historical revision.

It’s not too difficult to build your own link redirector, perhaps a few hours work for a basic implementation

Link tracking redirectors

Almost every bulk mail sent includes some sort of instrumentation to track which users click on which links and when. That’s usually done by the ESP rewriting links in the content so they point at the ESP’s tracking server, and include information about the customer, campaign and recipient. The recipient clicks on the link in the email, their web browser fetches the link from the tracking server, the tracking server records the details of that click and tells the browser to immediately open the original destination page.

Gradual DMARC Rollout

Over on twitter Alwin de Bruin corrected me on an aspect of DMARC soft rollout I’d entirely forgotten about. It’s useful, so I thought I’d write a quick post about it.

Why do my URLs have two dots?

You take a turn, I take a turn

At the SMTP level email is very much a simple line-by-line text based protocol. The client sends a command on a single line, the server responds with one or more lines (the last one marked by having a space in the fourth column), and then the client sends another command.

Captchas

SPF and TXT records and Go

A few days ago Laura noticed a bug in one of our in-house tools – it was sometimes marking an email as SPF Neutral when it should have been a valid SPF pass. I got around to debugging it today and traced it back to a bug in the Go standard library.

DNS Flag Day

There are quite a lot of broken DNS servers out there. I’m sure that’s no surprise to you, but some of them might be yours. And you might not notice that until your domains stop working early next year.

Check your abuse addresses

Even if you have excellent policies and an effective, empowered enforcement team you can still have technical problems that can cause you to drop abuse mail, and so lose the opportunity to get a bad actor off your network before they damage your reputation further.

The Problem With Affiliates (2)

On Friday I mentioned spam coming from a BarkBox affiliate programme.

Reading RFCs

We mention RFCs quite a lot, both explicitly (RFC 6376 is the specification for DKIM) and implicitly (the 821.From aka bounce address aka return path).

Minimal DMARC

The intent of DMARC is to cause emails to silently vanish.
Ideally deploying DMARC would cause all malicious email that uses your domain in the From address, but which has absolutely nothing to do with you, to vanish, while still allowing all email you send, including mail that was sent through third parties or forwarded, to be delivered.
For some organizations you can get really close to that ideal. If you control (and know about) all the points from which email is sent, if your recipients are individuals with normal consumer or business mailboxes, their mailbox providers don’t do internal forwarding in a way that breaks DKIM before DMARC is checked and, most importantly, if your recipients are a demographic that doesn’t do anything unusual with their email – no vanity domain forwarding, no automated forwarding to other recipients, no alumni domain forwarding, no forwarding to their “real” mailbox on another provider – then DMARC may work well. As long as you follow all the best practices during the DMARC deployment process it’ll all be fine.
What, though, if you’re not in that situation? What if your recipients have been happily forwarding the mail you send to them to internal mailing lists and alternate accounts and so on for decades? And that forwarding is the sort that’s likely to break DKIM signatures as well as break SPF? And while everyone would advise you not to deploy DMARC p=reject, or at least to roll it out very slowly and carefully with a long monitoring period where you watch what happens with p=none, you have to deploy p=reject real soon now?
What can you do that’s least likely to break things, while still letting you say “We have deployed DMARC with p=reject” with a straight face?

List the world!

We often say that a blacklist has “listed the world” when it shuts down ungracefully. What exactly does that mean, and why does it happen?
Blacklists are queried by sending a DNS lookup for an A record, just the same as you’d find the address of a domain for opening a webpage there. The IP address or domain name that’s being queried is encoded in the hostname that’s looked up.
For example, if you wanted to see whether the IP address 82.165.36.226 was listed on the Spamhaus SBL you’d ask DNS for an A record for the hostname 226.36.165.82.sbl.spamhaus.org. If that returns an answer, the IP address is listed. If it doesn’t, it isn’t.
If a blacklist returns an answer for any IP address (or domain) you ask it about it’s “listing the world” or “listing the internet”, saying that everyone you ask about is listed.
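The encoding described above can be sketched in a few lines of Python. This is only an illustration of how the query hostname is built for an IPv4 address; the function name is made up for this example, and it doesn't do the DNS lookup itself.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname to look up for an IPv4 address in a DNS blacklist.

    The octets of the address are reversed and prepended to the
    blacklist's zone. (Hypothetical helper name, for illustration only.)
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError if malformed
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# The example from the text: is 82.165.36.226 on the Spamhaus SBL?
print(dnsbl_query_name("82.165.36.226", "sbl.spamhaus.org"))
# prints 226.36.165.82.sbl.spamhaus.org
```

An A record lookup for that hostname that returns any answer means "listed", which is exactly why a wildcard record on the zone lists everything.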
Sometimes this is done intentionally as an attempt to get people to stop using a blacklist. If it blocks all your mail, you’ll stop using it. Unfortunately, that never works. Most blacklists aren’t used to block mail, they’re used as part of a scoring based spam filter. And a blacklist that’s poorly run or unmaintained enough that it shuts down ungracefully probably wasn’t trusted much, so it added a very small spamminess value to a spam filter’s score … so nobody notices when it starts listing every address.
More often it’s done when a blacklist is abandoned, leaving its base domain name to expire.
When a domain expires it reverts to the control of the registrar and eventually is resold, typically to a domain squatter. (A domain squatter is someone who buys up domains when they become available and hopes to sell them on at vastly inflated prices).
Both the registrar and the squatter really want to resell the domain, for a lot of money. But while they control the domain they might as well make tiny amounts of money from it. The way they do that is to run advertising on the site, typically with low end banner or text ads (cheap to serve, low standards as to where they can be run) along with a link to “Buy This Domain For A Lot Of Money!”.
Every bit of traffic that went to websites in the expired domain is valuable to them – every misdirected open from someone looking for the expired content is now an advertising view. They don’t know what hostnames in the domain were actually in use. www.example.com and example.com are a safe bet, but there may also have been forums.example.com, webmail.example.com, chat.example.com and so on …
They don’t know, or care, what hostnames were in use. They just want as many page views as possible to inflate the tiny amount of money they’re getting from their text ads.
So they set up wildcard DNS for the domain, pointing it at a webserver that’s configured to show a domain-specific advertising page for any hostname pointed at it.
*.example.com -> 192.0.2.25
That means that forums.example.com will resolve to 192.0.2.25, as will www.example.com.
And so will 226.36.165.82.nfn.example.com – so anyone using nfn.example.com as a blacklist will get a valid A record response for any IP address they look up. It “listed the world”.

SpamCannibal is dead

The SpamCannibal blacklist – one that didn’t affect your email too much but which would panic users who found it on one of the “check all the blacklists!” websites – has gone away.
It was silently abandoned by the operator at some point in the past year and the domain registration has finally expired. It’s been picked up by domain squatters who, as usual, put a wildcard DNS record in for the domain causing it to list the entire internet.
Al has more details over at dnsbl.com.
If you run a blacklist, please don’t shut it down this way. Read up on the suggested practice in RFC 6471. If you just can’t cope with that consider asking people you know in the industry for help gracefully shutting it down.
Blacklist health checks
If you develop software that uses blacklists, include “health check” functionality. All relevant blacklists publish records that show they’re operating correctly. For IP based blacklists that means that they will always publish “127.0.0.2” as listed and “127.0.0.1” as not listed. You should regularly check those two IP addresses for each blacklist and, if 127.0.0.1 is listed or 127.0.0.2 isn’t listed, immediately disable use of that list (and notify whoever should know about it).
For IPv6 blacklists the always listed address is “::FFFF:7F00:2” and the never listed address is “::FFFF:7F00:1”. For domain-based blacklists the always listed hostname is “TEST” and the never listed hostname is “INVALID”. See RFC 5782 for more details. (And, obviously, check that the blacklists your software supports out of the box actually do implement this before turning it on).
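A minimal sketch of the IPv4 health check, assuming you already have some way to ask "does this hostname have an A record?". The function names here are invented for the example. `dns_is_listed` uses the system resolver via `socket.gethostbyname`, which can't tell "not listed" apart from a DNS timeout or server failure, so real software would want a proper DNS library that exposes the response code.

```python
import socket
from typing import Callable

ALWAYS_LISTED = "127.0.0.2"  # test address every IPv4 blacklist should list
NEVER_LISTED = "127.0.0.1"   # test address no IPv4 blacklist should list


def query_name(ip: str, zone: str) -> str:
    """Reverse the octets of an IPv4 address and append the blacklist zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def dns_is_listed(hostname: str) -> bool:
    """True if the system resolver returns an address for hostname.

    Caveat: any resolution failure is treated as "not listed".
    """
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False


def blacklist_is_healthy(zone: str,
                         is_listed: Callable[[str], bool] = dns_is_listed) -> bool:
    """Healthy means 127.0.0.2 is listed AND 127.0.0.1 is not."""
    test_listed = is_listed(query_name(ALWAYS_LISTED, zone))
    test_unlisted = is_listed(query_name(NEVER_LISTED, zone))
    return test_listed and not test_unlisted
```

A blacklist whose domain has been wildcarded answers for both test addresses, so it fails the check. One that has simply gone dark answers for neither, and fails too. Passing the lookup in as a parameter makes the logic easy to test without touching the network.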
If you use someone else’s blacklist code, ask them about their support for health checks. If your mail filter doesn’t use them you risk either suddenly having all your mail go missing (for naive blacklist based blocking) or having some fraction of wanted mail being delivered to your spam folder (for scoring based filters).

EFAIL PGP / S/MIME "flaw" ?

There’s going to be a lot of hype today about something the security researchers who found it are calling “EFAIL”. Interviews, commemorative T-Shirts, press tours, hype.
The technical details are interesting, but the un-hyped end-user advice would probably be “If you’re using a mail client that’s got bugs in its MIME handling, and you’ve configured it to load remote content automatically, and you’re using a less secure encryption tool or protocol, and you’ve configured it to decrypt things automatically, and security of your email is so important to you that you’re defending against skilled attackers who have already acquired the encrypted emails you’re concerned about (by compromising your ISP? Sniffing non-TLS traffic?) then you may have a problem.”
I can’t imagine anyone for whom email security is a critical issue would make all those mistakes, so this mostly merits a heads-up to the MUA developers (which has happened) and maybe a “Do people rely on S/MIME? Why?” retrospective. But as someone on twitter described it “The Vulnerability Hype Train has begin, choo choo.”

There are several different issues all mixed together by the efail folks. All of them require an attacker to already have access to (encrypted) sensitive emails, and to send copies of those to you wrapped up in another message and to have you decrypt that incoming mail.

Dodgy PDF handling at Gmail

We sent out some W-9s this week. For non-Americans and those lucky enough not to have to deal with IRS paperwork those are tax forms.
They’re simple single page forms with the company name, address and tax ID numbers on them. Because this is the 21st Century we don’t fill them in with typewriters and snail mail them out, we fill in a form online at the IRS website which gives us PDFs to download that we then send out via email.

We started to get replies from people we’d sent them to that we hadn’t included the tax ID number. Which was odd, because it was definitely there in the PDFs we’d sent.
The reports of missing numbers came from Google Apps users, so we sent a copy to one of our Gmail addresses to see. Sure enough, when you click on the attachment it’s mostly there, but some of the digits of the tax ID number are missing.

And all the spaces have been stripped from our address.

The rest of the form looked fine, but the information we’d entered was scrambled. Downloading the PDF from Gmail and displaying it – everything is there, and in the right place.
Weird. After a brief “Are gmail hiding things that look like social security numbers?” detour I realized that the IRS website was probably generating the customized forms using PDF annotations.
PDF is a very powerful, but very complex, file format. It’s not just an image, it’s a combination of different elements – images, lines, vector artwork, text, interactive forms, all sorts of things – bundled together into a single file. And you can add elements to an existing PDF file to, for example, overlay text on to it. These “annotations” are a common way to fill in a PDF form, by adding text in the right place over the top of an existing template PDF.
I cracked the PDF open with some forensics tools and sure enough, the IRS had generated the PDF form using annotations.

Meltdown & Spectre, Oh My

If you follow any infosec sources you’ve probably already heard a lot about Meltdown and Spectre, KAISER and KPTI. If not, you’ve probably seen headlines like Major flaw in millions of Intel chips revealed or Intel sells off for a second day as massive security exploit shakes the stock.

What is it?
These are all about a cluster of related security issues that exploit features shared by almost all modern, high performance processors. The technical details of how they work are fascinating if you have a background in CPU architecture but the impact is pretty simple: they allow programs to read from memory that they’re not supposed to be able to read.
That might mean that a program running as a normal user can read kernel memory, allowing a malicious program to steal passwords, authentication cookies or even the entire state of the kernel’s random number generator, potentially allowing it to compromise encryption.
Or it might mean a program running on a virtual machine being able to escape from the sandbox the virtual machine’s hypervisor keeps it in and reading memory of other virtual machines that are running on the same hardware. A malicious user could sign up for a cloud service, such as Amazon EC2, Google Compute Engine or Microsoft Azure, repeatedly create temporary virtual machines and grovel through all the other virtual machines running on the same hardware to steal login credentials or TLS private keys.
Or it might mean a malicious piece of javascript running in a browser from a hostile website or a malicious banner ad being able to steal secrets and credentials not just from your web browser, but from any other software running on your laptop.
It’s pretty bad.
Meltdown and Spectre
One variant has been given the snappy name Meltdown. It (mostly) affects Intel CPUs, and is trivial to exploit reliably by unskilled skript kiddies. It can be mitigated at the operating system level, and all major operating system vendors are doing so, but that mitigation will have significant impact on performance – perhaps 20% slower for common workloads.
The other variant has been named Spectre. It’s more subtle, relying on measuring how long it takes to run carefully crafted code. Whether the code is fast or slow tells the malicious actor whether a particular bit of forbidden memory is zero or one, allowing them to step through reading everything they want. This is likely to be harder to exploit reliably, but is also going to be much harder to mitigate reliably in software (I’ve seen some speculation that it might be impossible to mitigate – I’m pretty sure that’s not true, but it is going to be difficult to do so reliably and will probably have significant performance impact). It affects pretty much everything, including AMD processors (despite what their PR flacks would like you to believe).
What should you do

As a typical end user you should apply your security patches as normal to mitigate Meltdown. macOS was patched on December 6th, the Windows kernel has mitigation in place. The latest release candidate of the Linux kernel has mitigation patches in place, which’ll presumably trickle out to various distributions over the next few days.
You should also update your browser. One nasty vector Spectre can use is timing attacks from malicious javascript. Chrome and Firefox have partial mitigation in their mainline development, and Microsoft have announced fixes for IE11 and Edge.
Keep updating your ‘phones. At least some of the ARM chips in iPhone and Android are vulnerable, and the more constrained ‘phone environment may make targeted attacks more likely.
If you’re using any virtual machines or cloud hosted services then your provider has probably already done rolling reboots so they can patch their hypervisors to mitigate Meltdown. You’ll still need to update your kernel yourself, to protect against attacks within your machine, even though your provider has patched their hypervisors.
Performance (and Email)
The operating system level mitigation for Meltdown works by having the CPU throw away a bunch of information every time the thread of execution goes from the kernel back to the application. Most common applications will switch between kernel code and application code a lot so this has a significant performance impact.
Initial tests with PostgreSQL show slowdowns as bad as 23%, but more realistic workloads look to be maybe 5-15% slower, depending on the workload and the hardware features available.
I wondered whether there’d be much impact on network service performance, so I set up a test network with a couple of mailservers running latest release candidates of the Linux kernel. I sent mail from one to the other, using postfix, smtp-source and smtp-sink – smtp-source and -sink are test tools distributed with postfix that make it easy to send mail or to receive and discard mail.
I wasn’t really expecting to find any performance impact for something that was likely network limited, but ran some tests anyway, slinging a few million emails from one machine to the other and turning mitigation on and off on the sender and receiver. There wasn’t any performance impact that I could measure – if it’s there it was well below the noise floor.
So you’ll probably see slight performance degradation for some things, especially disk-heavy workloads, but nothing to worry too much about.

Authentication is about Identity, not Virtue

I just got some mail claiming to be from “Bank of America <secure@bofasecure.com>”.
It passes SPF:

Organizational Domain

We often want to know whether two hostnames are controlled by the same person, or not.

Mandatory TLS is coming

Well, not exactly mandatory but Chrome will start labeling any text or email form field on a non-TLS page as “NOT SECURE”.

Chrome 62 will be released as stable some time around October 24th. If you want to avoid the customer support overhead then, regardless of whether any of the information on a form is sensitive, you should probably make sure that all your forms are accessible via TLS and redirect any attempt to access them over plain http to https. You can do that globally for a whole site pretty easily, and there’s not really any downside to doing so.
I still have half a dozen sites I need to convert to supporting TLS – the cobbler’s children have no shoes – and I’m beginning to feel a little urgency about it.
There’s more information in Google’s announcement, their checklist of how to set up TLS, and some background at Kaspersky Labs.

Local-part Semantics

An email address has two main parts. The local-part is the bit before the @-sign and the domain is the bit after it. Loosely, the domain part tells SMTP how to get an email to the destination mailserver while the local part tells that server whose mailbox to put it in.
I’m just looking at the local part today, the “steve” in “steve@example.com”.
Talkin’ ‘Bout a Specification
The original specification for SMTP email delivery, RFC 821, specifies a few things about the local-part. It can’t be more than 64 ASCII characters long, and it must be wrapped in double quotes if it includes any punctuation. But that’s just syntax, nothing to do with what it means. It does mention that it’s case-sensitive: “steve@example.com” is not the same recipient as “sTeve@example.com”.
The specification for the structure of email messages, RFC 822, tells us a little more. It clarifies that the local-part is case-sensitive, with the sole exception of the “postmaster” account, which is required to be deliverable as “postmaster”, “POSTMASTER”, “POSTmasTER” or any other variant you like.
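As a rough illustration of those two rules, here's a sketch of a comparison that follows the letter of the specifications. The function name is made up for this example. Comparing the domain part case-insensitively is an assumption added here (DNS names aren't case-sensitive), and the sketch ignores quoted local-parts entirely.

```python
def same_recipient(a: str, b: str) -> bool:
    """Compare two addresses the way the specifications describe.

    - the domain is compared case-insensitively (assumption: DNS rules)
    - the local-part is case-sensitive...
    - ...except "postmaster", which matches in any mix of case
    """
    local_a, _, domain_a = a.rpartition("@")
    local_b, _, domain_b = b.rpartition("@")

    if domain_a.lower() != domain_b.lower():
        return False
    if local_a.lower() == "postmaster" and local_b.lower() == "postmaster":
        return True
    return local_a == local_b


print(same_recipient("steve@example.com", "sTeve@example.com"))
# prints False
print(same_recipient("postmaster@example.com", "POSTmasTER@example.com"))
# prints True
```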

TLS certificates and CAA records

Transport Layer Security (TLS) is what gives you the little padlock in your browser bar. Some people still call it SSL, but TLS has been around for 18 years – it’s time to move on.
TLS provides two things. One is encryption of traffic as it goes across the wire, the other is a cryptographic proof that you’re talking to the domain you think you’re talking to.
The second bit is important, as if you can’t prove you’re talking to, for example, your bank you could really be talking to a malicious third party who has convinced your browser to talk to their server instead of your bank which makes the encryption of the traffic much less useful. They could even act as a man-in-the-middle and pass your traffic through to your bank, so that you wouldn’t notice anything wrong.

When your browser connects to a website over TLS it, as part of setting up the connection, fetches a “TLS certificate” from the server. That certificate includes the hostname of the server, so the browser can be sure that it’s talking to the server it thinks it is.
How does the browser know to trust the certificate, though? There’s not really a great way to do that, yet. There’s a protocol called DANE that stores information in DNS to validate the certificate, much the same as we do with DKIM. It’s a promising approach, but not widely supported.
What we have today are “Certificate Authorities” (CAs). These are companies that will confirm that you own a domain and issue you a certificate for that domain in which they vouch for its authenticity (and usually charge you for the privilege). Anyone can set themselves up as a CA (really – it’s pretty trivial, and you can download scripts to do the hard stuff), but web browsers keep a list of “trusted” CAs, and only certificates from those authorities count. Checking my Mac, I see 169 trusted root certificate authorities in the pre-installed list. Many of those root certificates “cross-sign” with other certificate authorities, so the actual number of companies who are trusted to issue TLS Certificates is much, much higher.
If any of those trusted CAs issue a certificate for your domain name to someone, they can pretend to be you, secure connection padlock and all.
Some of those trusted CAs are trustworthy, honest and competent. Others aren’t. If a CA is persistently, provably dishonest enough then they may, eventually, be removed from the list of trusted Certificate Authorities, as StartCom and WoSign were last year. More often, they aren’t: Trustwave, MCS Holdings/CNNIC, ANSSI, National Informatics Center of India (who are currently operating a large spam operation, so …).
In 2011 attackers compromised a Dutch CA, DigiNotar, and issued themselves TLS Certificates for over 500 high-profile domains – Skype, Mozilla, Microsoft, Gmail, … – and used them as part of man-in-the-middle attacks to compromise hundreds of thousands of users in Iran. Coverage at the time blamed it on “DigiNotar’s shocking ineptness“.
In 2015 Symantec/Thawte issued 30,000 certificates without authorization of the domain owners, and even when they issued extended validation (“green bar”) certificates for “google.com” they weren’t removed from the trusted list.
So many CAs are incompetent, many are dishonest, and any of them can issue a certificate for your domain. Even if you choose to use a competent, reputable CA – something that’s not trivial in itself – that doesn’t stop an attacker getting certificates for your domain from somewhere else.
This is where CAA DNS records come in. They’re really simple and easy to explain, with no fancy crypto needed to set them up. If I publish this DNS record …

Are they using DKIM?

It’s easy to tell if a domain is using SPF – look up the TXT records for the domain and see if any of them begin with “v=spf1”. If one does, they’re using SPF. If none do, they’re not. (If more than one does? They’re publishing invalid SPF.)
AOL are publishing SPF. Geocities aren’t.
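As a rough sketch of that rule, here is the check applied to a list of TXT strings you have already fetched. The function name and the sample records are made up for illustration, and it assumes the version tag has to be followed by a space or the end of the record.

```python
def spf_status(txt_records):
    """Classify a domain's TXT records as 'none', 'spf' or 'invalid'.

    txt_records is a list of TXT record contents that were looked up
    elsewhere; this sketch does no DNS queries itself.
    """
    def is_spf(record):
        text = record.strip().lower()
        # "v=spf1" must be the whole version tag, so "v=spf10" doesn't count.
        return text == "v=spf1" or text.startswith("v=spf1 ")

    matches = [r for r in txt_records if is_spf(r)]
    if not matches:
        return "none"
    if len(matches) == 1:
        return "spf"
    return "invalid"  # more than one SPF record is invalid SPF


# Hypothetical sample data, not real lookups.
print(spf_status(["v=spf1 ip4:192.0.2.0/24 ~all", "site-verification=abc123"]))
print(spf_status(["site-verification=abc123"]))
```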
For DKIM it’s harder, as a DKIM key isn’t published at a well-known place in DNS. Instead, each signed email includes a “selector” and you look up a record by combining that selector with the fixed string “._domainkey.” and the domain.
If you have DKIM-signed mail from them then you can find the selector (s=) in the DKIM-Signature header and look up the key. For example, Amazon are using a selector of “taugkdi5ljtmsua4uibbmo5mda3r2q3v”, so I can look up TXT records for “taugkdi5ljtmsua4uibbmo5mda3r2q3v._domainkey.amazon.com“, see that there’s a TXT record returned and know there’s a DKIM key.
That’s a particularly obscure selector, probably one they’re using to track DKIM lookups to the user the mail was sent to, but even if a company is using a selector like “jun2016” you’re unlikely to be able to guess it.
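Building the name to look up is mechanical once you have a signed message. This is a minimal sketch that pulls the s= and d= tags out of a DKIM-Signature header value; the helper name is mine and the sample header is invented, apart from the selector and domain mentioned above.

```python
import re


def dkim_key_name(dkim_signature):
    """Return the DNS name holding the public key for a DKIM-Signature value."""
    tags = {}
    for part in dkim_signature.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            # Tag values may be folded across lines, so drop any whitespace.
            tags[name.strip()] = re.sub(r"\s+", "", value)
    return f"{tags['s']}._domainkey.{tags['d']}"


# Invented header value; only the s= and d= tags matter here.
header = (
    "v=1; a=rsa-sha256; c=relaxed/simple; "
    "s=taugkdi5ljtmsua4uibbmo5mda3r2q3v; d=amazon.com; "
    "h=From:To:Subject; bh=AAAA=; b=BBBB="
)
print(dkim_key_name(header))
```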
But there’s a detail in the DNS spec that says that if a hostname exists, meaning it’s in DNS, then all the hostnames “above” it in the DNS tree also exist (even if there are no DNS records for them). So if anything._domainkey.example.com exists in DNS, so does _domainkey.example.com. And, conversely, if _domainkey.example.com doesn’t exist, no subdomain of it exists either.
What does it mean for a hostname to exist in DNS? That’s defined by the two most common responses you get to a DNS query.
One is “NOERROR” – it means that the hostname you asked about exists, even if there are no resource records returned for the particular record type you asked about.
The other is “NXDOMAIN” – it means that the hostname you asked about doesn’t exist, for any record type.
So if you look up _domainkey.aol.com you’ll see a “NOERROR” response, and know that AOL have published DKIM public keys and so are probably using DKIM.
(This is where Steve tries to find a domain that isn’t publishing DKIM keys … Ah! Al’s blog!)
If you look up _domainkey.spamresource.com you’ll see an “NXDOMAIN” response, so you know Al isn’t publishing any DKIM public keys, so isn’t sending any DKIM signed mail using that domain.
This isn’t 100% reliable, unfortunately. Some nameservers will (wrongly) return an NXDOMAIN even if there are subdomains, so you might sometimes get an NXDOMAIN even for a domain that is publishing DKIM. shrug
Sometimes you’ll see an actual TXT record in response – e.g. Yahoo or EBay – that’s detritus left over from the days of DomainKeys, a DomainKeys policy record, and it means nothing today.
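The NOERROR / NXDOMAIN distinction lives in the low four bits of the flags word in the DNS response header. Here is a small sketch of reading it from a raw response and turning it into the “probably using DKIM” guess described above. The function names are mine, and the sample bytes are hand-built headers rather than captured responses.

```python
import struct

RCODE_NAMES = {0: "NOERROR", 3: "NXDOMAIN"}


def rcode_name(response):
    """Extract the response code from the 12-byte DNS header."""
    if len(response) < 12:
        raise ValueError("too short to be a DNS message")
    _query_id, flags = struct.unpack("!HH", response[:4])
    code = flags & 0x000F  # RCODE is the low four bits of the flags
    return RCODE_NAMES.get(code, f"RCODE{code}")


def publishes_dkim_keys(response):
    """True / False for NOERROR / NXDOMAIN, None if we can't tell."""
    name = rcode_name(response)
    if name == "NOERROR":
        return True   # _domainkey.<domain> exists, so keys probably do too
    if name == "NXDOMAIN":
        return False  # nothing exists at or below _domainkey.<domain>
    return None       # SERVFAIL, REFUSED and friends tell us nothing


# Hand-built headers: id, flags, then four zeroed section counts.
noerror = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 0, 0, 0)
nxdomain = struct.pack("!HHHHHH", 0x1234, 0x8183, 1, 0, 0, 0)
print(publishes_dkim_keys(noerror), publishes_dkim_keys(nxdomain))
```

Bear in mind the caveat above: a False here is only as reliable as the nameserver that answered.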

Protocol-relative URLs in email

When you link to an external resource – an image, a javascript file, some css style – from a web page you do so with a URL, usually something like “https://example.com/blahblah.css” or “http://example.com/blahblah.css”.
The world is beginning to go all https, all the time, but until recently good practice was to make a web page available via both http and https.
The problem is that if you try and load a resource from an http URL from a page that was loaded via https it’ll complain about it, and not load the resource.
And if your user’s web browser is loading the http version of your page because it’s Internet Explorer 6 and doesn’t speak modern SSL then it’ll be unable to load anything over https, including any of your resources.
So whether you choose https or http protocol for loading your page resources it’s going to break for someone.
One common trick to avoid the problem is to use a protocol-relative URL. That looks like “//example.com/blahblah.css”, and it’ll load the resource over the same protocol that the page was loaded over.
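Python’s standard URL resolver follows the same rule, which makes for a quick demonstration of what a browser does with that URL. The wrapper function is just for illustration.

```python
from urllib.parse import urljoin


def resolve(page_url, resource_url):
    """Resolve a resource URL against the page that references it."""
    return urljoin(page_url, resource_url)


resource = "//example.com/blahblah.css"
# The resource inherits whichever scheme the page was loaded over.
print(resolve("https://example.com/index.html", resource))
print(resolve("http://example.com/index.html", resource))
```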
While we can safely use “https://…” everywhere now, “//…” URLs are still a common idiom for things like loading CSS libraries from public content-distribution networks as well as your own resources.
I was reading a long-but-excellent writeup about Stack Overflow’s migration to TLS (hey, I read this sort of stuff for fun) and they point out something I hadn’t considered – mail clients don’t really have any sensible way to use protocol-relative URLs. The mail client loaded the “page” from a mailbox and so has no base document protocol to work from (even if there’s a <base> element in the content it’s not likely to affect a mail client). If the user is reading it via webmail or a mail client that’s using an embedded web browser to render HTML it might work, sometimes, but it’s not going to reliably load that resource in general.
So, if you’re copy-pasting content from your web collateral to reuse in an email, make sure you’re not loading anything external – including images – via a “//…” style URL. Rewrite them to use “https://…”.
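If you have a lot of content to convert, something like this can do the rewrite before the HTML goes anywhere near your mail pipeline. It’s a deliberately simple sketch with a made-up function name: it only looks at src and href attributes, so “//” URLs inside CSS url(...) values or srcset attributes would need separate handling.

```python
import re

# Matches src= or href= followed by a quote and a leading "//".
_PROTOCOL_RELATIVE = re.compile(r'''(\b(?:src|href)\s*=\s*["'])//''', re.IGNORECASE)


def force_https(html):
    """Rewrite protocol-relative src/href URLs to explicit https:// URLs."""
    return _PROTOCOL_RELATIVE.sub(r"\1https://", html)


snippet = '<img src="//example.com/logo.png"><a href="https://example.com/a//b">x</a>'
print(force_https(snippet))
```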

ARC: Authenticated Received Chain

On Friday I talked a little about DMARC being a negative assertion rather than an authentication method, and also about how and when it could be deployed without causing problems. Today, how DMARC went wrong and a partial fix for it that is coming down the standards pipeline.
What breaks?

DMARC (with p=reject) risks causing problems any time mail with the protected domain in the From: field is either sent from a mailserver that is not under the control of the protected domain, or forwarded by a mailserver not under the control of the protected domain (and modified, however trivially, as it’s forwarded). “Problems” meaning the email is silently discarded.
This table summarizes some of the mail forwarding situations and what they break – but only from the original sender’s perspective. (If forwarding mail from a user’s mailbox on provider-A to their mailbox on provider-Y breaks because of a DMARC policy on provider-A that’s the user’s problem, or maybe provider-A or provider-Y, but not the original sender’s.)

The philosophy of DMARC

We know that legitimate email sent with valid SPF and a DKIM signature often breaks in transit.
SPF will fail any time mail is forwarded – via a mailing list, a forwarding service used by the recipient, or just ad-hoc forwarding.
DKIM will fail any time the message is modified in transit. That can be obviously visible changes, such as a mailing list tagging a subject header or adding a footer to the body. It can also be less obvious changes, such as intermediate MTAs wrapping lines that are too long, reencoding content or repackaging the message altogether – perhaps when delivering from a mailserver that is 8BITMIME compliant to one that isn’t.

(This image has absolutely nothing to do with email authentication, but searching for stock photography about email or authentication or chains or, well, pretty much anything like that leads to horribly depressing corporate imagery. So, no. Have something colourful and optimistic instead.)
As SPF and DKIM are typically used, none of this is much of a problem. A message being authenticated provides a little extra information to the receiving mailserver, and the domain attached to the authentication can be used to look up a sender’s reputation, giving a potential boost to the chances of the mail being sent to the inbox. If the authentication is broken, though, the mail will still be judged on its merits – is it coming from an IP address that’s a source of good mail, does the content look legitimate, and all the other things a spam filter looks at.
The fact that authentication is a (potentially big) positive signal, while lack of authentication isn’t really any signal at all, is why SPF and DKIM being fragile wasn’t an issue. SPF and DKIM are positive assertions – “IF this mail IS authenticated THEN IT IS from me”.
That changed when DMARC became popular, though.
DMARC allows the owner of a domain to say “We send no mail that is not authenticated, and we promise that none of that authentication will be broken in transit”. DMARC is a negative assertion – “IF this mail IS NOT authenticated THEN it IS NOT from me”. It converts the absence of a positive assertion into a negative assertion.
This isn’t the first attempt to layer a “we authenticate everything” negative assertion on top of fragile email authentication. SPF did it, with the -all flag (which is universally ignored, leaving SPF purely as a positive assertion). DomainKeys did it, with DomainKeys policy records (which you occasionally still see published, but were never really used to reject mail). DKIM did it with ADSP – which didn’t see much use either.
The reason none of them were used much is that even when senders were telling the truth about “we send no email that is not authenticated” they were always lying, to varying degrees, about “none of the authentication will be broken in transit”.
If your domain is solely used for bulk email. If it’s never used for mail sent by human beings, not even customer support employees. If it’s a newly created domain with no legacy usage that only sends email from a very tightly controlled infrastructure. If you only send email that’s been created via a well implemented message composition pipeline that ensures the content of the message is not just RFC compliant but also “well formed”, with short lines, simple widely implemented encoding, vanilla MIME structure and so on. And it’s sent out via conservatively configured smarthosts that deliver directly to the end recipient’s MX. And if you know that the demographics of your recipients are such that the minority that are forwarding that mail elsewhere (e.g. from their Yahoo account to their Google account or via an alumni mail alias) is a small enough group that you don’t care about them…
If all of those things are true, then your domain is going to be able to deploy DMARC pretty easily and safely. If not, though, how can you tell?
That’s the place where DMARC improves over its predecessors. It allows you not only to publish a DMARC policy record in test mode, so it’s not actually used to filter your mail (well, mostly, but that’s a longer story) but also to ask recipients to notify you of mail that seems to be from you but which isn’t authenticated.
You can publish a “p=none” DMARC record with notification addresses in it and wait and see what happens. You’ll get notification of mail that has your domain in the From: field but which isn’t authenticated.
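For illustration, a test-mode record with a reporting address might be assembled like this. The helper and the example addresses are hypothetical, and it only covers the policy and the aggregate report (rua=) tags, nothing else DMARC supports.

```python
def dmarc_record(domain, policy, report_addresses):
    """Return the (DNS name, TXT value) pair for a simple DMARC record."""
    if policy not in ("none", "quarantine", "reject"):
        raise ValueError(f"unknown DMARC policy: {policy}")
    # rua= takes a comma-separated list of mailto: URIs for aggregate reports.
    rua = ",".join(f"mailto:{address}" for address in report_addresses)
    return f"_dmarc.{domain}", f"v=DMARC1; p={policy}; rua={rua}"


# example.com and the mailbox name are placeholders.
name, value = dmarc_record("example.com", "none", ["dmarc-reports@example.com"])
print(name, "TXT", value)
```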
As a first round of action that lets you see where you’re sending email from that you didn’t know about. Sysadmin notification email. That marketing splinter group in Saskatchewan. The outsourced survey company.
Once you’ve cleaned all that up, and made sure everyone is authenticating their mail then you can look at what’s left. The next step is likely to be mistakes you’re making in authentication or message composition that are causing some of your mail – typically depending on content, and source and recipient domain – to become unauthenticated. Clean that up, make sure all your message composition is squeaky clean, make sure employees aren’t sending mail using that domain in ways you don’t authorize (interacting with mailing lists, for example).
By that point you’ll have reduced the torrent of reports you’re getting to two types. One is mail that you send that has its authentication broken in transit through some process you have no control over. The other type is mail that has your domain in the “From” field but which you didn’t send. Some of that may be legitimate use of your domain by your employees, such as forward-to-a-friend services, signing up for document delivery via email, third-party notification services. By deploying DMARC you are declaring all that sort of usage to be illegitimate, and you’ll need to get all your employees to stop doing it (or, at least, know that it’s going to stop working). The rest of it is likely a mix of spam and phishing mail. The spam, that’s just using your domain in random from addresses, you probably don’t care about. The phishing you do.
You’ve finally cleaned up your mail infrastructure and policies enough to gather the data you need. How much of your legitimate email will have its authentication broken (and hence be silently thrown away by DMARC)? And how much hostile phishing mail is targeting your users (and using the exact domain you are)?
Then you have the information you need to make an informed decision as to how badly deploying DMARC will break your legitimate use of email (after you’ve done everything you can to minimize that) and some idea of whether it will provide you any benefit, at least in the shorter term.
That testing phase, where senders can use other people’s mail infrastructure to investigate their sending practices, gradually fix any problems and finally gather some metrics, is what gave the developers of the DMARC spec confidence that it wouldn’t break things, and made it much more deployable than previous approaches to negative assertion.
On Monday, how all that optimistic reasoning went to hell, what it broke and how we’re trying to fix it.

Tools!

I just added a DMARC validation tool over on tools.wordtothewise.com.

You can give it a domain – such as ebay.com – and it will fetch the DMARC record, then explain and validate it. Or you can paste the DMARC record you’re planning to publish into it, to validate it before you go live.
If you’ve not seen our tools page before, take a look. As well as DMARC we have a DKIM validator, SPF expander and optimizer, general DNS lookup tools, a bunch of RFCs covering all sorts of protocols, and base64 and quoted-printable decoders.
There’s also a widget that lets you add those little unicode pictures to your subject lines, whether you need a snowman ⛄, a forest 🌲🌳🌲🌳, or a pig getting closer 🐖🐷🐽.
The results pages all have easily copyable URLs so they’re pretty good for sharing with co-workers or customers if you need that sort of thing.
(And if you need a cidr calculator, whois, or easy access to abuse.net & Microsoft SNDS check out Al’s xnnd.com.)

The twilight of /8s

A “/8” is a block of 16,777,214 usable IP addresses. That’s a big fraction of the entire IPv4 address space – about 1/224, in fact. Each one is all the addresses that begin with a given number: 10.0.0.0/8 is all the IP addresses that begin with “10.”, “184.0.0.0/8” (or “184/8” for short) is all the IP addresses that begin with “184.” and so on.
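The arithmetic is easy to check with Python’s ipaddress module. One note on the fraction: measured against all 2^32 addresses a /8 is exactly 1/256. The “about 1/224” figure is closer if you leave out the 32 blocks from 224 to 255 that are set aside for multicast and reserved use, which is my assumption about where it comes from.

```python
import ipaddress
from fractions import Fraction


def describe_slash8(cidr):
    """Size of a block, and its share of the whole 32-bit address space."""
    block = ipaddress.ip_network(cidr)
    total = block.num_addresses
    return {
        "total": total,                     # every address in the block
        "usable": total - 2,                # minus network and broadcast
        "share_of_ipv4": Fraction(total, 2**32),
    }


info = describe_slash8("184.0.0.0/8")
print(info["total"], info["usable"], info["share_of_ipv4"])
print(ipaddress.ip_address("184.1.2.3") in ipaddress.ip_network("184.0.0.0/8"))
```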
How are they used? You can see in this map of the entire IPv4 Internet as of 2006.

In the early days of the Internet /8s were given out directly to large organizations. If you look near the middle-top of the map, just left of “MULTICAST” and above “DISA” you can see “MIT”.
The Massachusetts Institute of Technology got into the Internet game pretty early. This is the first map I have where they appear, in June 1970:

The Laboratory for Computer Science at MIT were assigned the 18.0.0.0/8 block sometime around 1977, according to RFC 739, though it looks like they may have been using it since at least 1976.
By 1983 (RFC 820) it belonged to the whole of MIT, rather than just the CS Lab, though you have to wonder how long term that was supposed to be, given the block was named “MIT-TEMP” by 1983 (RFC 870). According to @fanf (who you should follow) it was still described as temporary until at least the 1990s.
But no longer. MIT is upgrading much of their network to IPv6, and they’ve found that fourteen million of their sixteen million addresses haven’t been used, so they’re consolidating their use and selling off eight million of them, half of their /8. Thanks, MIT.
Who else is still sitting on /8s? The military, mostly US, have 13. US Tech companies have 5. Telcos have 4. Ford and Daimler have one each. The US Post Office, Prudential Securities, and Societe Internationale de Telecommunications Aeronautiques each have one too.
One is set aside for use by amateur radio.
And two belong to you.
10.0.0.0/8 is set aside by RFC 1918 for private use, so you can use it – along with 192.168.0.0/16 and 172.16.0.0/12 – on your home network or behind your corporate NAT.
And the whole of 127.0.0.0/8 is set aside for the local address of your computer. You might use 127.0.0.1 most of the time for that, but there are 16,777,213 other addresses you could use instead if you want some variety. Go on, treat yourself, they’re all assigned to you.
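If you want to check which bucket an address falls into, a few lines with the ipaddress module will do it. The classify() helper is just an illustration, using the three RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16) and the loopback /8.

```python
import ipaddress

RFC1918 = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def classify(address):
    """Return 'loopback', 'private' or 'other' for an IPv4 address."""
    ip = ipaddress.ip_address(address)
    if ip in LOOPBACK:
        return "loopback"
    if any(ip in block for block in RFC1918):
        return "private"
    return "other"


for address in ("127.42.42.42", "10.1.2.3", "172.31.255.255", "172.32.0.1"):
    print(address, classify(address))
```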

A due diligence story

due diligence
noun. research and analysis of a company or organization done in preparation for a business transaction

It’s a term that’s been around for five centuries or so. Originally it meant the effort that was necessary for something, but it evolved into a legal term for “the care that a reasonable person takes to avoid harm to other persons or their property“.
More recently it’s evolved to mean “the research that a company should perform before engaging in a financial transaction“.
One aspect of that is doing at least a bare minimum of research on a customer before you let them take advantage of your reputation.
I just got some SMS spam from a short code, advertising two domains – 29designx.us and customlogocoupon.us. It’s SMS spam, so there’s no hidden content, no affiliate tags, just the bare domains. One spam has both domains in it, the other has 29designx.us twice.
According to the company that operates the SMS gateway this is a dedicated short code, not a shared code. In ESP terms that’s kinda equivalent to a customer on a dedicated IP address rather than one sharing a pool. Except much more so – short codes are a scarcer resource than IP addresses, with the US having fewer short codes in total than some ESPs have IP addresses.
What would 60 seconds of due diligence have told the SMS provider about this customer?
Let’s start by looking at the two websites.
They’re clearly built from the same template. Same annoying animation, same fake sale countdown timers, same live chat window.
The live chat was answered by Harvey (who is a real person, one I managed to annoy by talking with him through multiple live chat windows on their different sites simultaneously). Different ‘phone numbers though – 1-866-212-2217 for the coupon site vs 1-619-942-5964.
Then let’s look at whois for the domains:
Domain Name: 29DESIGNX.US
Registrant Name: Mildred Smith
Registrant Organization: 29designs
Registrant Address1: 1854 Valley View Drive (that’s in Kansas)
Registrant City: Boston
Registrant State/Province: MA (not Boston, Massachusetts)
Registrant Postal Code: DN3 6GB (see note)
Registrant Country: UNITED KINGDOM (nor the United Kingdom)
Registrant Country Code: GB
Registrant Phone Number: +92.3233000306 (nor Pakistan)
Registrant Email: rhiannon.desir@gmail.com (gmail? rhiannon != Mildred)
Registrant Application Purpose: P1 (= business registration)
Registrant Nexus Category: C11
and
Domain Name: CUSTOMLOGOCOUPON.US
Registrant Name: Antonio R. Flores
Registrant Organization: Oranges Records & Tapes (see note)
Registrant Address1: 4243 Marie Street Annapolis (doesn’t exist)
Registrant City: MD
Registrant State/Province: MD
Registrant Postal Code: 21401
Registrant Country: United States
Registrant Country Code: US
Registrant Phone Number: +1.4108498868
Registrant Email: mj9729395@gmail.com (seven digit number, huh?)
Registrant Application Purpose: P3 (= personal website)
Registrant Nexus Category: C11
That’d make me suspicious enough to put the customer on hold and maybe do a little actual investigation of them before allowing them to send. That’s the due diligence an ESP or SMS provider should do.

Laura is in Las Vegas today, so I have a little spare time. Let’s do the next level of investigation to find a little more. Nothing fancy, just some creative use of Google.
“DN3 6GB” is an interesting UK postcode. Not because Doncaster – the South Yorkshire town that “DN3” would imply – is particularly interesting, nor because of the fact that DN3 6GB doesn’t exist, despite being syntactically correct.
No. It’s interesting because it is the first postcode in a test suite for validating UK postcodes via regular expression so it’s all over developer forums and FAQs when people are talking about valid UK postcodes. Not only a fake, but a manually created fake.
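To show what “syntactically correct” buys you, here is a deliberately simplified shape check. It is not the regular expression from that test suite, and the real postcode rules have more exceptions than this, but it makes the point that a pattern match says nothing about whether a postcode exists.

```python
import re

# Simplified shape: area letters, district, optional space, then digit + 2 letters.
POSTCODE_SHAPE = re.compile(r"^[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}$")


def looks_like_uk_postcode(text):
    """True if the text has the rough shape of a UK postcode."""
    return bool(POSTCODE_SHAPE.match(text.strip().upper()))


print(looks_like_uk_postcode("DN3 6GB"))  # right shape, still not a real postcode
print(looks_like_uk_postcode("21401"))
```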
“Orange’s Records and Tapes” is interesting too. It’s an odd looking business name to have attached to a logo design company. And the mention of “Tapes” looks rather dated. It seems to be a Chicago-based record store (or, possibly, small chain) that either went out of business or was bought out and the name abandoned quite some years ago. It’s still on some easily available lists of business names, though.
And it’s also in output from fakenamegenerator.com – a handy little site that generates fake names, email addresses, employer names, birth dates, credit card numbers and everything else you might want to have as test data. That makes me pretty sure that everything about customlogocoupon.us is fake.
Reverse whois search suggests that the same “Mildred Smith” also registered 29design.us, paperx.us, 99videos.us, 29designs.us and 99videoz.us. As well as the similarity in domain names, the sites that are up are using the same template as the first two sites and selling services in much the same style. And appear to use equally fake registration data.
We still have the ‘phone numbers published on the original sites…
The 866 number on customlogocoupon.us shows up in the contact information for logoventure.com and logoventure.net. They’re a small graphic design and flash animation company, consisting of Russell Bryant, Jessica Sandler, George Isaacson and Jason somebody. No Antonio R. Flores, and it’s a much more restrained site than the customlogocoupon.us hyperactivity.
The 619 number from 29designx.us shows up on animationsharks.com. Which is a little better designed, but still has the same live chat box manned by Harvey. (Hi, Harvey!). It’s been mentioned elsewhere in the SMS spam context too.
There’s no useful contact information on the site, and the domain registration data is falsified via Domains by Proxy (reasonable for a personal site, a bad sign on a business site).
My best guess is that animationsharks.com / 29designx.us / 29design.us / 29designns.com are the SMS spammers, while logoventure.com are a customer of theirs.
Hidden by CSS on the animationsharks.com site is a list of services, support and postal contact information that’s identical to that of a legitimate corporate animation studio based out of Boston. It’s possible that they just ripped off the site of another company, but it’s also possible it’s a side-job, something done by an ex-employee…
But that’s all I have time to look at now. Back to work.

Hi Laura,
Merry Xmas and wishing you a Happy New Year!
I recently looked at a popular ESP’s IPv4 space and I was astounded. How does an ESP get an IP allocation of 20,480 IPs? ARIN guidelines do not allow “MX/Mailing” IPs to count towards a valid justification especially in the case when each and every IP is being used for this purpose. That’s 80 /24’s…and at a time when we are out of IPv4 space….Would love to see a blog post with your insight about this issue….

Is your website up? Are you sure?

“What would you do for 25% more sales?”
It’s panicked gift-buying season, and I got mail this morning from Boutique Academia, part of their final push before Christmas.
They’re hoping for some Christmas sales in the next three days. They do make some lovely jewelry – ask Laura about her necklace some time – so I clicked on their mail.
(Screenshot: Safari showing a “Failed to open page” error.)
That’s not good. I like Boutique Academia, and fixing email and dns problems is What We Do, so I took a look.
Safari isn’t quite as bad with not-exactly-truthful error messages as Internet Explorer, but I still don’t really trust it. Perhaps the problem is with the click-tracking domain in the email, rather than with boutiqueacademia.com? So I open the base page at http://boutiqueacademia.com, get redirected immediately to https://www.boutiqueacademia.com – which fails to load.
OK, start with the basics. DNS.

One of these things is just like the other

Canonicalization is about comparing things to see if they’re the same. Sometimes you want to do a “fuzzy” comparison, to see if two things are interchangeable for your purposes, even if they’re not exactly identical.
As a concrete example, these two email addresses:

Traffic Light Protocol

If you’re sharing sensitive computer security information it’s important to know how sensitive a document is, and who you can share it with.
US-CERT and many other security organizations use Traffic Light Protocol as shorthand for how sensitive the information in a document is. It’s simple and easy to remember with just four colour categories: Red, Amber, Green and White. If you’re likely to come into contact with sensitive infosec data, or you just want to understand the severity of current leaks, it’s good to know that it exists.

Spam, campaign statistics and red flag URLs

It’s not often spammers send me their campaign statistics, but on Tuesday one did.
The spam came “from” news@udemy.com, used udemy.com in the HELO and message-ids and, sure enough, was advertising udemy.com:

More on ARC

ARC – Authenticated Received Chain – is a way for email forwarders to mitigate the problems caused by users sending mail from domains with DMARC p=reject.
It allows a forwarder to record the DKIM authentication as they receive a mail, then “tunnel” that authentication on to the final recipient. If the final recipient trusts the forwarder, then they can also trust the tunneled DKIM authentication, and allow the mail to be delivered despite the DMARC p=reject published by the sending domain.
The specification and interoperability testing are progressing nicely and it’s definitely going to be useful for discussion list operators and vanity forwarders soon. It’s not something that’s as likely to help ESPs targeting small organizations and individuals, so all y’all shouldn’t be holding your breath for that.
There’s more information about it at arc-spec.org and they’ve just published a great presentation with a technical overview of how it works:

Google drops obsolete crypto

Google is disabling support for email sent using version 3 of SSL or using the RC4 cypher.
They’re both very old – SSLv3 was obsoleted by TLS1.0 in 1999, and RC4 is nearly thirty years old and while it’s aged better than some cyphers there are multiple attacks against it and it’s been replaced with more recent cyphers almost everywhere.
Google has more to say about it on their security blog and if you’re developing software you should definitely pay attention to the requirements there: TLS1.2, SNI, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, DNS alternate names with wildcards.
For everyone else, make sure that you’ve applied any patches your vendor has available well before the cutoff date of June 16th.
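If you are writing the software rather than patching it, this is roughly what “no SSLv3, TLS 1.2 or better” looks like with Python’s ssl module. It’s a sketch with a made-up function name, not a statement of Google’s exact requirements; cipher selection is left to the library defaults.

```python
import ssl


def modern_client_context():
    """A client-side TLS context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()  # verifies certificates and hostnames
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = modern_client_context()
print(context.minimum_version, context.check_hostname)
```

SNI comes along when you pass server_hostname to context.wrap_socket(), or hand the context to something like smtplib’s starttls(context=...).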

DMARC p=reject

Mail.ru is switching to p=reject.
This means that you should special-case mail.ru wherever …
Actually, no. Time to change that script.
If you operate an ESP or develop mailing list software you should be automatically checking whether the email address used in the From: field of email you’re sending is in a domain that’s publishing p=reject (is a “rejective” email address). And you should probably do that in real time, whenever you need that piece of information, relying on DNS caching to reduce the network latency.
If you find you’re about to send an email From a rejective email address, you probably shouldn’t send it. Depending on how the recipients’ ISPs handle it, it might be discarded, put in the bulk folder or rejected – potentially leading to recipients being unsubscribed.
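A minimal sketch of that check, split into the two halves that don’t need a network: working out which DNS name to query, and deciding whether the TXT record you got back is rejective. The function names are mine. It ignores the organizational-domain fallback and the sp= tag, both of which a real implementation would need, and it is more lenient about case than the spec strictly is.

```python
def dmarc_name(from_address):
    """DNS name to query for the DMARC record of a From: address."""
    domain = from_address.rsplit("@", 1)[1].strip().lower()
    return f"_dmarc.{domain}"


def is_rejective(dmarc_txt):
    """True if a DMARC TXT record publishes p=reject."""
    tags = {}
    for part in dmarc_txt.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            tags[name.strip().lower()] = value.strip().lower()
    return tags.get("v") == "dmarc1" and tags.get("p") == "reject"


# Illustrative records, not live lookups.
print(dmarc_name("someone@mail.ru"))
print(is_rejective("v=DMARC1; p=reject; rua=mailto:reports@example.com"))
print(is_rejective("v=DMARC1; p=none"))
```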
If you’re writing mailing list software, ideally you should provide your users with several options for handling submissions from rejective email addresses, perhaps some from this list:

Foundation: A toolkit for designing responsive emails

Zurb announced today version 2 of “Foundation for Email”, a full stack for designing content for responsive email.
It looks rather nice, with features a modern web developer might look for when working on email content. It has many of the things you’d expect a web design stack to have. It supports SASS for styling, includes browser sync for previewing content as it’s edited, both on a local browser and on a device, and uses gulp to tie the workflow together.
But it also has some features useful for email that you’d be unlikely to find in a web design stack. It has an inliner, to convert separate SASS/CSS and HTML content into a single HTML document suitable for sending by mail. And it supports a slightly extended HTML format called “Inky”, which lets you use simple tags like <row> and <column> to develop grid-based content, then compile those into old-school HTML tables which mail clients will happily render.
And it comes with ten starter templates for different types of email.
You can find documentation, downloads and examples here.

Optimize your SPF records

I talked on Monday about the SPF rule of ten and how it made it difficult for companies to use multiple services that send email on their behalf.
Today I’m going to look at how to fix things, by shrinking bloated SPF records. This is mostly aimed at those services who send email on their customers’ behalf and ask their customers to include an SPF record as that’s the biggest pain point, but some of it is also useful to people publishing their own SPF records.

Get rid of costly SPF directives
First, rethink using the “mx” directive. It’s often used in example SPF records, because it makes them look simpler. But an MX directive always triggers a DNS lookup that counts against your limit of ten, and it’ll also trigger a DNS lookup for each hostname in your MX record – they don’t count against the SPF limit but may increase the latency of your delivery a little. Better than using “mx” is to use explicit “ip4” and “ip6” records to list the addresses your smarthost and MX send mail from. Even though this makes your SPF record look longer it’ll actually make it smaller, as measured by DNS queries, as a single “mx” directive costs more than 20 “ip4” directives.
Similarly, avoid the “a” directive. It’s much less commonly seen but again can usually be replaced with “ip4” or “ip6” directives.
Don’t use “ptr” directives. They’re deprecated by the current SPF RFC.
Check the address ranges
If you have many “ip4” and “ip6” directives, make sure they’re not redundant. Are there any address ranges that you’re not using any more? Are there any adjacent address ranges that can be merged? For example, “ip4:x.y.4.0/24” and “ip4:x.y.5.0/24” can be replaced with “ip4:x.y.4.0/23” (note that you can’t always replace adjacent address blocks of the same size – read up on CIDR notation).
If you’ve generated your CIDR blocks from address ranges you can sometimes have very inefficient representations. The address range 10.11.12.1-10.11.12.254 needs 14 “ip4” directives to represent precisely. Instead you can use the single directive “ip4:10.11.12.0/24”, even if you’re not sending any email from the .0 or .255 addresses.
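Both of those cleanups can be checked with Python’s ipaddress module rather than by hand. In this sketch the helper names are mine and the 192.0.x.0 blocks are arbitrary examples; the 10.11.12.x range is the one from the paragraph above.

```python
import ipaddress


def merge_blocks(cidrs):
    """Merge adjacent or overlapping CIDR blocks where that is possible."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


def range_to_blocks(first, last):
    """The exact list of CIDR blocks needed to cover an address range."""
    blocks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(net) for net in blocks]


# Two /24s that share a /23 merge; two that straddle a /23 boundary don't.
print(merge_blocks(["192.0.4.0/24", "192.0.5.0/24"]))
print(merge_blocks(["192.0.5.0/24", "192.0.6.0/24"]))

# Precise range versus rounding out to the whole /24.
print(len(range_to_blocks("10.11.12.1", "10.11.12.254")))
print(range_to_blocks("10.11.12.0", "10.11.12.255"))
```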
You don’t need a “~all” or “-all” at the end of a TXT record that is only included in another SPF record, not used directly. It won’t do any harm but it wastes a few characters.
Once you’ve got your list of SPF directives cleaned up the next thing to do is to pack them into one or more DNS TXT records.
Use as few TXT records as you can
Some SPF tutorials say that you can’t put more than 255 characters of SPF data into each TXT record. That’s not quite true, though.
A TXT record contains one or more strings of text and each string can contain no more than 255 characters. But an SPF checker will take all of the strings in a TXT record and concatenate them together in order before it starts looking at the content. So you can have more than 255 characters of SPF data in a TXT record by splitting it into more than one string. (Some low-end DNS management web front ends don’t really understand TXT records and won’t let you include multiple strings – you should check that your DNS management system does before relying on this).
How much more than 255? That’s where you have to get a little familiar with the DNS protocol, as the real limitation is that you don’t want your DNS packets to be more than 512 bytes long. (Why 512 bytes? That’s a long story of protocol changes and incompatibility, but 512 bytes are about as big as you can reliably use over UDP. Just trust me.)
The DNS overhead for a reply that contains a single TXT record with two strings is about 34 bytes, plus the length of the hostname that’s being queried (e.g. “spf.example.com” is 15 bytes). So to keep within the 512 byte limit you need to break your SPF into chunks of no more than 478 minus the length of the hostname. Then you need to break that SPF data into two strings (remembering that they’ll be concatenated with no white space added, so if you break it at a space you need to include the space at the end of the first string or the beginning of the second).
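The arithmetic and the splitting step can be sketched in a few lines of Python. This is only a sketch: the function name and parameters are made up for illustration, and the 34 byte overhead is the approximate figure quoted above, not something the code measures.

```python
from __future__ import annotations


def split_spf_for_txt(spf: str, hostname: str, overhead: int = 34,
                      packet_limit: int = 512, max_string: int = 255) -> list[str]:
    """Split SPF data into TXT strings that concatenate back to the original."""
    # Budget from the text above: 512 - 34 - len(hostname).
    budget = packet_limit - overhead - len(hostname)
    if len(spf) > budget:
        raise ValueError(f"SPF data is {len(spf)} bytes, budget is {budget}")
    chunks = []
    rest = spf
    while len(rest) > max_string:
        # Cut just after the last space that fits, so the space stays
        # at the end of the earlier string and nothing is lost on rejoin.
        cut = rest.rfind(" ", 0, max_string) + 1
        if cut == 0:
            cut = max_string  # no space available: fall back to a hard split
        chunks.append(rest[:cut])
        rest = rest[cut:]
    chunks.append(rest)
    return chunks


# Hypothetical record built from documentation addresses (192.0.2.0/24).
record = "v=spf1 " + " ".join(f"ip4:192.0.2.{i}" for i in range(26)) + " ~all"
strings = split_spf_for_txt(record, "spf.example.com")
print([len(s) for s in strings])
```

Each string stays within 255 characters, and joining the strings with no separator gives back the original record.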
That’ll give you a TXT record that looks something like this:

Some mechanisms and modifiers (collectively, “terms”) cause DNS queries at the time of evaluation, and some do not. The following terms cause DNS queries: the “include”, “a”, “mx”, “ptr”, and “exists” mechanisms, and the “redirect” modifier. SPF implementations MUST limit the total number of those terms to 10 during SPF evaluation, to avoid unreasonable load on the DNS. If this limit is exceeded, the implementation MUST return “permerror”.
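A quick way to get a feel for that limit is to count the DNS-querying terms in a record. The sketch below only counts terms in the one record you hand it; a real SPF evaluation also counts the terms inside every included or redirected record, so treat the result as a lower bound. The function name and sample record are hypothetical.

```python
import re

# Mechanisms that trigger a DNS query, per the text quoted above.
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}


def count_lookup_terms(spf: str) -> int:
    """Count terms in a single SPF record that cost a DNS lookup."""
    count = 0
    for term in spf.split()[1:]:  # skip the leading "v=spf1"
        term = term.lstrip("+-~?").lower()  # drop any qualifier
        if term.startswith("redirect="):
            count += 1
            continue
        # The mechanism name ends at ":" (domain) or "/" (CIDR length).
        name = re.split(r"[:/]", term, maxsplit=1)[0]
        if name in LOOKUP_MECHANISMS:
            count += 1
    return count


sample = "v=spf1 mx a:mail.example.com include:_spf.example.net ip4:192.0.2.0/24 ~all"
print(count_lookup_terms(sample))  # 3
```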

Mutt: Mailbox power tool

“All mail clients suck. This one just sucks less.”

Following the SMTP rules

An old blog post from 2013, that’s still relevant today.
“Blocked for Bot-like Behavior”
An ESP asked about this error message from Hotmail and what to do about it.
“Bot-like” behaviour usually means the sending server is doing something that bots also do. It’s not always that they’re spamming, often it’s a technical issue. But the technical problems make the sending server look like a bot, so the ISP is not taking any chances and they’re going to stop accepting mail from that server.
If you’re an ESP what should you look for when tracking down what the problem is?
First make sure your server isn’t infected with anything and that you’re not running an open relay or proxy. Second, make sure your customers aren’t compromised or have had their accounts hijacked.
Then start looking at your configuration.
HELO/EHLO values

Clickthrough forensics

When you click on a link in your mail, where does it go? Are you sure?
HTTP Redirects
In most bulk mail sent the links in the mail aren’t the same as the page the recipient’s browser ends up at when they click on it. Instead, the link in the mail goes to a “click tracker” run by the ESP that records that that recipient clicked on this link in this email, then redirects the recipient’s web browser to the link the mail’s author wanted. That’s how you get the reports on how many unique users clicked through on a campaign.
In the pay-per-click business that’s often still not the final destination, and the user’s browser may get redirected through several brokers before ending up at the final destination. I walked through some of this a few years ago, including how to follow link redirection by hand.
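Following a redirect chain one hop at a time is easy to script. This is a minimal sketch rather than the walkthrough referred to above: fetch_location and redirect_chain are names invented here, and the fetcher is pluggable so the chain-walking logic can be tried without touching the network.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_location(url):
    """Make one GET request without following redirects.

    Returns (status code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def redirect_chain(url, fetch=fetch_location, max_hops=10):
    """Return the list of URLs visited, starting with the original link."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        chain.append(urljoin(chain[-1], location))
    return chain
```

Calling redirect_chain on a tracking link returns every intermediate URL in order. It only sees HTTP-level redirects, so anything done in javascript or a meta refresh will not show up.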
HTTP Forensics
Evil spammers sometimes deploy countermeasures against that approach, though – having links that will only work once or twice, or redirects that must be followed within a certain time, or javascript within an intermediate page or any of a bunch of other evasions. For those you need something that behaves more like a web browser.
For serious forensics I might use something like wireshark to passively record all the traffic while I interact with a link from inside a sandboxed browser. That’s not terribly user-friendly to use or set up, though, and usually overkill. It’s simpler and usually good enough to use a proxy to record the web traffic from the browser. There are all sorts of web proxies, used for many different things. What they have in common is that you configure a web browser to talk to a proxy and it’ll send all requests to the proxy instead of to the actual website, allowing the proxy to make any changes it wants as it forwards the requests on and the results back.
For investigating what a browser is doing the most useful proxies are those aimed at either web developers debugging web apps or ~~crackers~~ penetration testers compromising web apps. Some examples are Fiddler (Windows), Cellist (OS X, commercial), mitmdump (OS X, linux, Windows with a little work), Charles (anything, commercial) or ZAP (anything).
I’m going to use mitmdump and Firefox. You don’t want to use your main browser for this, as the proxy will record everything you do in that browser while you have it configured – and I want to keep writing this post in Safari as I work.

Let’s Encrypt Everything

Using ~~SSL~~ TLS to protect data in transit and authenticate the servers you contact originally required specialized software, complex configuration and certificates that were expensive and complicated to acquire.
The need for specialized software is long since gone. Pretty much every web server and mail server will support SSL out of the box.
Basic server configuration is now pretty simple – give the server a couple of files, one containing the TLS certificate and the other the associated private key. (Configuring a server securely, avoiding a variety of attacks on weak parts of the SSL/TLS protocol, can be a bit more work, but there are a lot of tools and documentation to help with that).
But acquiring a certificate from a reputable provider is still expensive enough (one Certificate Authority’s list price for entry level certificates is $77/year, though the same certificates are available for <$10/year from resellers, which tells you a lot about the SSL certificate market) that you might not want to buy one for every endpoint.
And the process of buying an SSL certificate is horrible.
First you have to find a Certificate Authority or, more likely, a cheap reseller. Then it requires generating a “Certificate Signing Request” – something that can be done in dozens of different ways, depending on the device it’s being generated for. The user-friendliest way I’ve found to do it is to use the openssl commandline tools – and that’s really, really not very friendly.
Then you need to log in to the CA, upload the CSR, follow a bunch of directions to confirm that you are who you say you are and you’re entitled to a certificate for the domain you’re asking for. That can be as simple as uploading a file to a webserver that serves the domain you’re getting a certificate for. Or it can turn into an inquisition, where the CA requires your home address and personal phone number before they’ll even talk to you.
Then you need to use the same browser you submitted the CSR from to get your certificate – as the CA doesn’t do all the cryptography itself, it relies on your web browser for some of it. There are some security-related reasons why they chose to do that, but it makes for a fragile – and near impossible to automate – process. And the attempts to upsell you to worthless “security” products are never-ending.
There’s probably more horribleness, but it’s 9 months since I last bought a certificate and I may have forced myself to forget some of the horror.
There’s no need for it to be that painful. It’s easy to mechanically prove to a Certificate Authority that you control a domain in the same way you prove ownership to other companies – put a file the CA gives you onto the webserver, or add a special CNAME to your DNS.
And once you’ve proved you own a domain a certificate for that domain could be generated completely automatically. For the most common web setups you could even prove domain ownership, generate a new certificate and install that certificate into the webserver entirely automatically.
I’m sure there’s some reason other than “because we’re milking the SSL cash cow for all it’s worth” that dealing with most Certificate Authorities is so painful and expensive. But there’s no need for them to be that way.
This year there’ve been several groups who have stepped forward to escape the pain of dealing with legacy Certificate Authorities, at least for basic domain verified TLS certificates (as opposed to the “green bar” extended verification certificates you might want for, e.g. an eCommerce site).
Let’s Encrypt has been one of the higher profile new CAs driving this effort. With the help of ACME, an open protocol being developed for issuing certificates automatically, they’re providing zero-cost TLS certificates. They’re hoping that, to use their phrase, you’ll encrypt everything, using TLS to encrypt traffic everywhere it makes sense to do so.
So, how does it work?
It works wonderfully well.
Let’s Encrypt is still in public beta testing for a few more weeks, but I just got beta access. Installing their client tool on a reasonably recent Linux box just took copy-pasting two commands and waiting a minute.
The tool supports a variety of different ways to request a certificate – from entirely automated for a vanilla Apache installation through to the most complicated, entirely manual.
To see how difficult it was I decided to install a certificate for blighty.com, a domain on one of our very old webservers – one that was too old to install the letsencrypt tool itself. Fully manual! For a website on a different server!
I ran the letsencrypt tool, telling it I wanted a certificate for blighty.com. It gave me the contents and location of a file to put on the blighty.com webserver – creating that required copying and pasting two commands from one shell window to another. Then it created and gave me the new certificate and key. I copied those across to the webserver and reloaded its configuration. And I have TLS!
This is going to be very simple to completely automate, so you could easily build it in to existing automation to create TLS protected action and tracking links for branded domains. Supporting thousands of TLS protected hostnames wouldn’t be difficult.
And how about using the certificates for things other than webservers? Mail servers, say? You do need a webserver to be live while you’re generating or renewing a certificate – but that’s actually easier to do for a host that doesn’t usually serve web pages than one that does. Once the certificate is generated you can use it for any service, including mail servers.

SPF debugging

Someone mentioned on a mailing list that mail “from” intuit.com was being filed in the gmail spam folder, with the warning “Our systems couldn’t verify that this message was really sent by intuit.com“. That warning means that Gmail thinks it may be phishing mail. Given they’re a well-known financial services organization, I’m sure there is a lot of phishing mail claiming to be from them.
But I’d expect that a company the size of Intuit would be authenticating their mail, and that Gmail should be able to use that authentication to know that the mail wasn’t a phish.
Clearly something is broken somewhere. Let’s take a look.
Looking at the headers, the mail was being sent from Salesforce, and (despite Salesforce offering DKIM) it wasn’t DKIM signed by anyone. So … look at SPF.
SPF passes:

Trawling through the junk folder

As a break from writing unit tests this morning I took a few minutes to go through my Mail.app junk folder, looking for false positives for mail delivered over the past six weeks.
We don’t do any connection level rejection here, so any mail sent to me gets delivered somewhere. Anything that looks like malware gets dumped in one folder and never read, anything that scores a ridiculously high spamassassin score gets dumped in another folder and never read, mailing lists get handled specially and everything else gets delivered to Mail.app to deal with. That means that Mail.app sees less of the ridiculously obvious spam and is mostly left to do bayesian filtering, and whatever other magic Apple implemented.
There were about thirty false positives, and they were all B2C bulk advertising mail. I receive a lot of 1:1 mail, transactional mail and B2B marketing mail and there were no false positives at all for any of those.
All the false positives were authenticated with both SPF and DKIM. All of them were for marketing lists I’d signed up for while making a purchase. All of them were “greymail” – mail that I’d agreed to receive, and that was inoffensive but not compelling. While I easily spotted all of them as false positives via the from address and subject, none of them were content I’d particularly missed.
Almost all of the false positives were sent through ESPs I recognized the name of, and about 80% of them were sent through just two ESPs (though that wasn’t immediately obvious, as one of them not only uses random four character domain names, it uses several different ones – stop doing that).
If you’d asked me to name two large, legitimate ESPs from whom I recalled receiving blatant, blatant spam recently, it would be those same two ESPs. Is Mail.app picking up on my opinions of the mail those ESPs are sending? It’s possible – details specific to a particular ESP’s mail composition and delivery pipelines are details that a bayesian learning filter may well recognize as efficient tokens.

DMARC News – Gmail p=reject and ARC

DMARC.org announced this morning that Gmail will be moving to publishing a p=reject DMARC record in June of next year, much the same as Yahoo and AOL have.
Unlike Yahoo and AOL, Gmail are giving those who will be affected plenty of time to prepare for any issues, and have waited until there are some potential ways to mitigate problems in the development pipeline.
The ARC proposal, mentioned in the announcement, is one of the more promising mitigation approaches, and the specification for it can be found here:
Authenticated Received Chain (ARC) (draft-anderson-arc-00)
Recommended Usage of the Authenticated Received Chain (ARC) (draft-jones-arc-usage-00)
And some background on the issues it intends to mitigate can be found here:
Interoperability Issues Between DMARC and Indirect Email Flows (draft-ietf-dmarc-interoperability-07)

IPv6 and authentication

I just saw a post over on the mailop mailing list where someone had been bitten by some of the IPv6 email issues I discussed a couple of months ago.
They have dual-stack smarthosts – meaning that their smarthosts have both IPv4 and IPv6 addresses, and will choose one or the other to send mail over. Some domains they send to use Office 365 and opted-in to receiving mail over IPv6, so their smarthosts decided to send that mail preferentially over IPv6.
The mail wasn’t authenticated, so it started bouncing. This is probably going to happen more and more over the next year or so as domain owners increasingly accept mail over IPv6.
If your smarthosts are dual stack, make sure that your workflow authenticates all the mail you send to avoid this sort of delivery issue.
One mistake I’ve seen several companies make is to have solid SPF authentication for all the domains they send – but not for their IPv6 address space. Check that all your SPF records include your IPv6 ranges. While you’re doing that keep in mind that having too many DNS records for SPF can cause problems, and try not to bloat the SPF records you have your customers include.

TXTing

On Friday I talked a bit about the history behind TXT records, their uses and abuses.
But what’s in a TXT record? How is it used? When and where should you use them?
Here’s what you get if you query for the TXT records for exacttarget.com from a unix or OS X command line with dig exacttarget.com txt

A brief history of TXT Records

When the Domain Name System was designed thirty years ago the concept behind it was pretty simple. It’s mostly just a distributed database that lets you map hostname / query-type pairs to values.
If you want to know the IP address of cnn.com, you look up {cnn.com, A} and get back a couple of IP addresses. If you want to know where to send mail for aol.com users, you look up {aol.com, MX} and you get a set of four hostname / preference pairs back. If you want to know the hostname for the IP address 206.190.36.45 you look up {45.36.190.206.in-addr.arpa, PTR} and get a hostname back.
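The odd-looking name used for the PTR lookup is just the address with its octets reversed and in-addr.arpa appended. As a small illustration, Python’s ipaddress module will build that query name for you (the IPv6 address below is a documentation placeholder):

```python
import ipaddress

# The name to query for the PTR record of an IPv4 address.
ptr_name = ipaddress.ip_address("206.190.36.45").reverse_pointer
print(ptr_name)  # 45.36.190.206.in-addr.arpa

# IPv6 works the same way, nibble by nibble, under ip6.arpa.
ptr_name_v6 = ipaddress.ip_address("2001:db8::1").reverse_pointer
print(ptr_name_v6)
```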
There’s a well-defined meaning to each of those query types – A is for IP addresses, MX is for mailservers, PTR is for hostnames – and that was always the intent for how DNS should work.
When DNS was first standardized, though, there was one query type that didn’t really have any semantic meaning:

IPv6 Email is a little different

On Monday I talked about how big IPv6 address space is, and how many IPv6 addresses will be available to end users. We’re mostly an email blog, though, so what’s the relevance to sending email?
If the recipient you’re sending to has an IPv6 mailserver you can send mail to them over IPv6, if you choose to. If they only have an IPv6 mailserver, with no IPv4 mailserver at all then you have to send over IPv6 to reach them.
For a long time I was pretty sure that IPv6-only mailservers were unlikely to be an issue any time soon – as IPv6 rolls out end users will get IPv6 addresses, and that will free up a huge number of IPv4 addresses that can then be used where they’re more valuable, for webservers and mailservers. As I’ve watched IPv4 addresses run out and the rise of a secondary market I’m beginning to think that hoarding may make IPv4 addresses effectively unavailable or prohibitively expensive for small companies and individuals in some regions. If so, then a few IPv6-only mailservers will encourage others to support sending and receiving email over IPv6, which will in turn make IPv6-only servers more viable.
And you might want to use IPv6 even if the recipient has a dual-stack IPv4+IPv6 mailserver. As one example, Gmail accepts mail on IPv6 – and scuttlebutt is that right now their IPv6 servers are somewhat more forgiving for properly authenticated email, which is interesting. And if you’re running short of IPv4 addresses yourself, routing all your gmail recipients over IPv6 instead might free up some capacity and save you from having to go IPv4 address shopping.
But there are a few things to know before starting to send over IPv6.
IPv6 to IPv4 fallback
If you turn on IPv6 support on a mailserver it is likely to prefer IPv6 when sending mail to dual-stack recipients. That’s great if everything is set up perfectly, but if your IPv6 network configuration is flakey, or your authentication is not good enough for IPv6 mail, or the recipient has an IPv6-specific configuration problem then your delivery over IPv6 might fail. How to choose an MX for delivery – and how to fall back to an IPv4 MX – isn’t terribly well defined so there’s some risk of delivery of a message failing repeatedly. You should check how your smarthosts handle this sort of delivery failure.
There’s no legacy IPv6 to support
There are twenty year old servers sending email over IPv4, so attempts to enforce better authentication both of mailservers and messages have moved very slowly so as not to disrupt mail from those old servers.
IPv6 is a whole new world, though. Any mailserver set up to send via IPv6 has been set up relatively recently and it’s much more reasonable to expect its operators to follow best practices (PDF). If you want to send mail to Google over IPv6 you have to have “good” reverse DNS, and you have to authenticate the mail you send with at least one of SPF or DKIM. Google are much less tolerant of violations of those requirements for mail sent over IPv6, more likely to mark messages as spam or reject them altogether compared to delivery attempts over IPv4.
What does “good” reverse DNS mean? The IP address you’re sending from must have reverse DNS that resolves to a hostname, and that hostname must resolve back to the sending IP address. (You’ll sometimes see that described as “FCrDNS”.)
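That two-step check is simple enough to sketch with the standard socket module. This is an illustration of the idea, not how any particular mailbox provider implements it; is_fcrdns is a made-up name, and the two resolver arguments exist only so the logic can be exercised without live DNS.

```python
import ipaddress
import socket


def is_fcrdns(ip, reverse=socket.gethostbyaddr, forward=socket.getaddrinfo):
    """Return True if ip has a PTR hostname that resolves back to ip."""
    target = ipaddress.ip_address(ip)
    try:
        hostname = reverse(ip)[0]          # reverse lookup (PTR)
        answers = forward(hostname, None)  # forward lookup (A / AAAA)
    except OSError:
        # No PTR record, or the hostname does not resolve.
        return False
    for answer in answers:
        address = answer[4][0].split("%")[0]  # drop any IPv6 scope id
        if ipaddress.ip_address(address) == target:
            return True
    return False
```

Called as is_fcrdns("192.0.2.10") it uses the system resolver. Comparing parsed addresses rather than strings avoids false mismatches between different spellings of the same IPv6 address.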
One customer, one /64
As I mentioned on Monday a consumer end user should be allocated no less than a /64 of IPv6 space. If you’re in the IPv4 mindset of addresses being scarce and valuable you might decide that you don’t need to do that with your customers, maybe assigning each of them a /124 to send their mail through. 16 IP addresses is plenty, right?
A large hosting company did that recently, assigning each of their customers a small range of IPv6 addresses out of a single /64 – and they discovered why it’s a terrible idea. They had no more than the usual level of email delivery problems on IPv4, but all of their IPv6 mail was blocked at a lot of destinations. Because a /64 is the smallest recommended range to assign to a user it’s also the smallest quantum that reputation services and blacklists will block by. Bad behaviour by one of their customers got the /64 that customer was sending from blocked – along with all the other customers sending from other parts of that /64.
So don’t do that.
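To see why, here is a small sketch using documentation addresses (2001:db8::/32). The customer ranges and the reputation_block helper are hypothetical; the only assumption carried over from the text is that blocking happens at /64 granularity.

```python
import ipaddress


def reputation_block(address):
    """Return the /64 that a blocklist working at /64 granularity would list."""
    return ipaddress.ip_network(f"{address}/64", strict=False)


# Two hypothetical customers, each given a /124 out of the same /64.
customer_a = ipaddress.ip_network("2001:db8:0:1::10/124")
customer_b = ipaddress.ip_network("2001:db8:0:1::20/124")

print(customer_a.num_addresses)  # 16
print(reputation_block(customer_a.network_address))  # 2001:db8:0:1::/64
print(reputation_block(customer_a.network_address)
      == reputation_block(customer_b.network_address))  # True
```

Both customers land in the same /64, so a listing earned by one of them covers the other as well.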

IPv6 is big. Really big. You just won’t believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it’s a long way down the road to the chemist, but that’s just peanuts to IPv6.

Microsoft Send

Microsoft Send is a new mail client by Microsoft for iPhones and soon Windows Phone and Android phones. Send is designed to send quick, short messages to contacts. Instead of building a chat application built on a proprietary protocol, Send sends and receives its messages over email and uses your existing mailbox to handle the messages. What makes Send neat is that I can start a conversation within the app and when I get back to my computer, I can log into Outlook Web Access and continue the conversation.
MicrosoftSend

Messages to and from the Send app do not use subject lines.

I sent a message from my personal Office365 account to my Word to the Wise account, and the email looks like any other email I receive, except with #Send in the subject line.
Inbox

The message goes through the same outbound mail servers as if I sent it from Outlook or OWA, so emails pass SPF.

SPF

If you are signing with DKIM, the emails will be signed and authenticated.

(Office365 will sign emails with DKIM soon, it’s on the Office RoadMap.)
For an email to show up within the Send app, the subject must contain #Send.
Microsoft has taken a unique approach to building a messaging app that utilizes existing SMTP infrastructure. If you’re sending to a tech savvy list, take a look at your logs to see how many recipients are using Microsoft Send and consider reaching out to them specifically using #Send.

Image Blocking

I received this email earlier this week, an email that I wanted but this is how it arrived.
email example 1
The email contained a single image link, a text line of who the message was sent to, the senders name, address, and finally an unsubscribe link.
Good news, the email is CAN-SPAM compliant! Bad news, I have no idea what the content of the message is and it looks somewhat spammy. The email was sent to my Junk Folder and all images were blocked. As good netizens, we’re trained not to click links if we’re not sure what they are.
Here is another message I received around that same time and also had the images blocked. I immediately recognize the domain name, bowling.com and there is text that mentions bowling shoes, balls, and bags. Being an avid bowler, I wanted this message and I will be adding them to my safe senders list in Outlook.
email example 2
The good news for marketers who rely on image based emails is Gmail and many mobile mail clients will auto-load images but there are still many clients that will only display images if the user sets the sender as a trusted sender. If you are sending a Welcome Message, it’s best to include text along with your images so the recipient can recognize your email and will then add you as a trusted sender. You can also segment your list by users who are opening the images. The recipients who have not loaded the images would get a different version of the message that includes more text.

PTR Records

PTR records are easy to overlook, but a missing one has a significant impact on your ability to deliver mail. Some ISPs and mailbox providers will reject mail from IP addresses that do not have a PTR record. PTR records are a type of DNS record that resolves an IP address to a fully qualified domain name (FQDN). They are also called Reverse DNS records. If you are sending mail on a shared IP address, you’ll want to check that the PTR record is set up, although you most likely will not be able to change it. If you are on a dedicated IP address or using a hosting provider like Rackspace or Amazon AWS, you’ll want to create or change the PTR records to reflect your domain name.
We usually think about DNS records resolving a domain name such as www.wordtothewise.com to an IP address. A query for www.wordtothewise.com is sent to a DNS server and the server checks for a matching record and returns the IP address of 184.105.179.167. The A record for www is stored within the zone file for wordtothewise.com. PTR records are not stored within your domain zonefile, they are stored in a zonefile usually managed by your service provider or network provider.
Some service providers provide an interface where you can create the PTR record yourself, others require you to submit a support request to create or change the PTR record.
If you know what IP address you are sending mail from, use our web based DNS tool to check if you have a PTR record created.
http://tools.wordtothewise.com/dns
Checking for a PTR record for 184.105.179.167 returns
167.128-25.179.105.184.in-addr.arpa 3600 PTR webprod.wordtothewise.com.
If you received Response: NXDOMAIN (There is no record of any type for x.x.x.x.in-addr.arpa), this means you’re missing the PTR record and need to create one ASAP if you are sending mail from that IP address!

DMARC=BestGuessPass

Looking at the headers of mail received at my Office365 domain I see dmarc=bestguesspass. BestGuessPass? That’s a new one.
Authentication Results
A few days after seeing dmarc=bestguesspass, Terry Zink at Microsoft posted an explanation. Exchange Online Protection, the filtering system for Office365, is analyzing the authentication of incoming emails and if the domain is not publishing a DMARC record, EOP attempts to determine what the results would be if they did. If an email is received that is not authenticated with either SPF or DKIM, the dmarc= result shows none, just as it always has. DMARC=BestGuessPass will appear if the message is authenticated and the matching authenticated domain does not have a DMARC record.
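If you want to tally these results across a pile of test messages, pulling the dmarc= value out of the Authentication-Results header takes one regular expression. The header below is a made-up example in the general shape Office365 uses, not one copied from a real message, and dmarc_result is a name invented for this sketch.

```python
import re


def dmarc_result(auth_results):
    """Return the dmarc= value from an Authentication-Results header, or None."""
    match = re.search(r"\bdmarc=([A-Za-z]+)", auth_results)
    return match.group(1).lower() if match else None


# Illustrative header value only; example.com is a placeholder domain.
header = (
    "spf=pass (sender IP is 192.0.2.1) smtp.mailfrom=example.com; "
    "dkim=pass (signature was verified) header.d=example.com; "
    "dmarc=bestguesspass action=none header.from=example.com"
)
print(dmarc_result(header))  # bestguesspass
```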
Having this information is helpful to see what the results would be before setting up a DMARC record. If you are seeing dmarc=bestguesspass when your mail is sent to an Office365 address and you are considering DMARC, the next step would be to publish a p=none DMARC policy and begin to document where your mail is being sent from. P=none will not have an impact on your delivery and asks the receiving mail server to take no action if a DMARC check fails. Once you have set up SPF and DKIM for your mail, a p=none policy gives you the ability to begin receiving failure reports from receiving mail servers when unauthenticated mail is sent from your domain.

What is the Mail From field?

When emails are sent, there are two from fields, the Mail From and the Display From address. The Display From address (technically referred to as the RFC 5322 From address) is the from address that is displayed to the end user within their email client. The Mail From (technically referred to as the RFC 5321 Mail From address) is the email address to which bounce messages are delivered. The Mail From field is sometimes referred to as the Return Path address, Envelope From address, or Bounce address.
It may seem confusing to have an email with two from fields, but knowing the difference is important to properly set up your SPF records.
Taking a look at this email I received from GoPro, the Return-Path (5321.From) goes back to @bounce.email.gopro.com. If I were to reply to the email, the message would go to @email.gopro.com. The Display From (5322.From) address is gopro@email.gopro.com.

I would want to add the email address GoPro@email.gopro.com to my address book because that is the email address that is displayed in my email client. The reason why the Return-Path is different from the From address is because GoPro likely has an automated system that will process the bounce back messages (sent to @bounce.email.gopro.com) and automatically flag or unsubscribe those email addresses. This allows GoPro to set up automatic processing of the different mail streams sent to them, one stream being the bounce backs after a mailing and the second being an automated customer service system.
Where does SPF fit in?
SPF checks the Mail From (5321.From) address, not the Display From (5322.From) address. In the example above, there should be an SPF record for the subdomain of bounce.email.gopro.com. I can check the SPF record using our Authentication tools http://tools.wordtothewise.com/spf/check/bounce.email.gopro.com and I receive the following results:

Checking the headers shows that GoPro does have an SPF record set up and the message was authenticated with SPF.

For SPF records, make sure the SPF record matches the Mail From (5321.From)/Return-Path domain name. Have your recipients add the Display From (5322.From) email address to their address book so they will continue to receive your mailings.
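To check which domain SPF will be evaluated against for a message you have received, you can pull both addresses out of the headers with Python’s standard email package. The message below is a made-up example using placeholder domains, not the GoPro mail discussed above.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical delivered message; all addresses are placeholders.
raw = (
    "Return-Path: <bounces-123@bounce.email.example.com>\n"
    "From: Example Brand <news@email.example.com>\n"
    "To: reader@example.org\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)

message = message_from_string(raw)

# Mail From: recorded by the receiving server in Return-Path. SPF checks this.
mail_from = parseaddr(message["Return-Path"])[1]
spf_domain = mail_from.rpartition("@")[2]

# Display From: what the recipient sees and adds to their address book.
display_from = parseaddr(message["From"])[1]

print(spf_domain)    # bounce.email.example.com
print(display_from)  # news@email.example.com
```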

Office365/EOP and Outlook.com/Hotmail will converge

Terry Zink posted two informative blog posts recently, the first being the change to unauthenticated mail sent over IPv6 to EOP and the second post about EOP (Office365 and Exchange Hosting) and Outlook.com/Hotmail infrastructure converging.
Exchange Online Protection (EOP) is the filtering system in place for Office 365 and hosted Exchange customers. Outlook.com/Hotmail uses its own mail filtering system and provides SNDS/JMRP programs. EOP is set up for redundancy and failover, provides geo-region servers to serve customers, and has supported TLS for over a decade. Terry explains that Hotmail’s spam filtering technology is more advanced than EOP’s, but EOP’s backend platform is more advanced. The process to convert Outlook.com/Hotmail to use EOP’s filtering system started six months ago and is still a work in progress. Once completed, Outlook.com/Hotmail and Office365/EOP will share the same UX look and feel. The anti-spam technologies will be able to be shared between the two as they will share the same backend infrastructure.
Some of the challenges of merging the two systems include:

Office365/EOP IPv6 changes starting today

Terry Zink at Microsoft posted earlier this week that Office365/Exchange Online Protection will have a significant change this week. Office365 uses Exchange Online Protection (EOP) for spam filtering and email protection. One of the requirements to send to EOP over IPv6 is to have the email authenticated with either SPF or DKIM. If the mail sent to Office365/EOP over IPv6 is not authenticated with SPF or DKIM, EOP would reject the message with a 554 hard bounce message. Most mail servers accept the 554 status code and would not retry the message. After multiple 5xx hard bounces to an email address, many mail servers would unsubscribe the user from future email campaigns. The update starting today, April 24, will change the error status code for unauthenticated mail to EOP from a 554 hard bounce to a 450 soft bounce, and an RFC-compliant and properly configured mail server would then retry the message.
Prior to April 24, 2015, EOP responds to unauthenticated mail with a status code of: “554 5.7.26 Service Unavailable, message sent over IPv6 must pass either SPF or DKIM validation”.
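To make the difference between the two responses concrete, here is a minimal sketch of how a sending system might sort SMTP replies into hard and soft bounces using only the first digit of the reply code. The function name and the 450 reply text are placeholders of mine, not anything Microsoft publishes, and real bounce handling looks at far more than the first digit.

```python
def classify_smtp_reply(reply: str) -> str:
    """Classify an SMTP reply line by the first digit of its three-digit code."""
    code = reply.strip()[:3]
    if len(code) != 3 or not code.isdigit():
        return "unknown"
    if code[0] == "5":
        return "hard bounce"  # permanent failure: a sender should not retry
    if code[0] == "4":
        return "soft bounce"  # transient failure: a sender should queue and retry
    if code[0] == "2":
        return "accepted"
    return "other"


# The 554 text is the reply quoted above; the 450 text is a made-up placeholder.
old_reply = ("554 5.7.26 Service Unavailable, message sent over IPv6 "
             "must pass either SPF or DKIM validation")
new_reply = "450 (placeholder reply text)"

print(classify_smtp_reply(old_reply))
print(classify_smtp_reply(new_reply))
```

A sender that counts hard bounces toward unsubscribing an address would count the first reply but not the second.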

Authentication and Repudiation

Email Authentication lets you demonstrate that you sent a particular email.
Email Repudiation is a claim that you didn’t send a particular email.

SPF is only for email authentication¹
DKIM is only for email authentication
DMARC is only for email repudiation

¹ SPF was originally intended to provide repudiation, but it didn’t work reliably enough to be useful. Nobody uses it for that now.

Salesforce and DKIM

Last month I wrote about how Salesforce was implementing the ability to sign emails sent from Salesforce CRM with DKIM. The Spring ’15 update is now live, as is the ability to use an existing DKIM key or allow Salesforce to create a new one for you.
Setting up DKIM within Salesforce is straightforward. A Salesforce Administrator would go to Setup->Email Administration->DKIM Keys.
sf-dkim-step0
You can either allow Salesforce to create you a new DKIM key or you can import an existing key. For this example, I am going to create a new DKIM key for the domain wttwexample.com with a DKIM selector of 2015Q1.
Step 1 – Creating a new key within Salesforce, you enter the Selector for the key (2015Q1), the domain for the key (wttwexample.com), and the strictness of the key allowing either the exact domain only, subdomains of the domain only, or Exact domain and subdomains.
sf-dkim-step1
Step 2 – The next screen will display both the Public Key and the Private Key.
sf-dkim-step2
Step 3 – With the key created, we need to store the Public Key in our DNS for the domain by creating a TXT record with a hostname of 2015Q1._domainkey.
sf-dkim-step3
Using a DKIM check tool like ours (http://tools.wordtothewise.com/authentication), we can see whether the DKIM key is in the DNS and whether the key is valid.
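If you want a rough idea of what such a check involves, here is a sketch of the structural part: building the DNS name for the selector and sanity-checking the text of the TXT record. It does not do the DNS lookup and it does not prove the key is a usable public key; the sample record contains placeholder bytes, not a real key, and the function names are mine.

```python
import base64


def dkim_record_name(selector: str, domain: str) -> str:
    """DNS name where the public key for this selector and domain is published."""
    return f"{selector}._domainkey.{domain}"


def parse_tag_list(txt: str) -> dict:
    """Split a 'tag=value; tag=value' record into a dict, dropping whitespace in values."""
    tags = {}
    for part in txt.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip()] = "".join(value.split())
    return tags


def looks_like_key_record(txt: str) -> bool:
    """Structural check only: version tag (if present) is DKIM1 and p= is non-empty base64."""
    tags = parse_tag_list(txt)
    if tags.get("v", "DKIM1") != "DKIM1":
        return False
    key = tags.get("p", "")
    if not key:
        return False  # an empty p= tag means the key has been revoked
    try:
        base64.b64decode(key, validate=True)
    except ValueError:
        return False
    return True


# Placeholder record: the p= value is base64 of dummy bytes, not a real key.
fake_key = base64.b64encode(b"placeholder, not a real public key").decode()
sample_record = f"v=DKIM1; k=rsa; p={fake_key}"

print(dkim_record_name("2015Q1", "wttwexample.com"))
print(looks_like_key_record(sample_record))
```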
Step 4 – Once we have confirmed the key is valid and in DNS, we can go back to Salesforce and activate the key.
Step 5 – Emails sent from the Salesforce CRM Sales Cloud will now be signed with the new DKIM key and the emails will have a new header added called DKIM-Signature.
Signing with DKIM allows us to tell the recipient ISP that “yes, I sent this email” and this allows the ISP to track our reputation by the domain instead of just by the IP address. This means that some fraction of our good reputation will be associated with these emails that are sent from Salesforce CRM. If we have not established any reputation yet, signing with DKIM is a good key to enable services like feedback loops as it includes the proof that you’re sending the FBL reports to someone responsible, not a random third party.
If you have plans to consider utilizing DMARC, you need to have ALL of your sources of mail authenticated. DMARC looks for a passing SPF or DKIM validation during its evaluation of the message. Utilizing both SPF and DKIM for DMARC validation is recommended.
Having emails signed with DKIM, having a valid SPF, setting up sensible reverse DNS, having good hostnames all show that you are doing your part to send legitimate and valid mail. Signing with DKIM does not give you a free pass to send spammy emails, it just tells the receiving party who is taking responsibility for sending the message.

thirty.years.com

Thirty years ago this Sunday, symbolics.com was registered – the first .com domain. It was followed, within a few months, by bbn.com, think.com, mcc.com and dec.com.
Symbolics made lisp machines – symbolics.com is now owned by a domain speculator.
BBN is a technology R&D company who’ve worked on everything. If I had to pick one thing they were involved with it’d be the Interface Message Processor – the router used on the very first Internet nodes. They are still around, as a division of Raytheon.
Think.com made some amazing massively parallel computers. Their hardware group was bought out by Sun, who were bought out by Oracle and think.com now redirects to a broken error page at oracle.com.
Mcc.com were the first – and for a while, the largest – computing research and development consortium in the US. They did groundbreaking work on everything from silicon to AI. Their domain is now a generic parked page owned by a domain speculator.
Dec.com were Digital Equipment Corporation – creators of the PDP, VAX, Alpha and StrongARM processors, amongst many other things. They were a huge company when I worked for them designing Alpha CPUs in the mid 90s, then they were acquired by Compaq, then HP, then split up. Their domain is now a personal website.
It took nearly three years to reach 100 registered .com domains and nearly 10 years to reach 9,000.
As of this morning there are 116,621,517 domains registered in .com, from (64 zeros).com to (64 letter z).com, out of a possible total of more than two googol – so there’s still a domain there for you.
221,848 of those domains in .com mention “mail”.

Salesforce SPF and now DKIM support

Salesforce has published an SPF record for sending emails from Salesforce for years and with the Spring ’15 release, they will provide the option to sign with DKIM.
The SPF record is straightforward: include:_spf.salesforce.com, which includes _spf.google.com, _spfblock.salesforce.com, several IP address blocks, mx, and ends with a SoftFail ~all.
Salesforce Knowledge Article Number: 000006347 goes in-depth with information regarding their SPF Record.

Mailbox preview and HTML content

I just received a slightly confusing email.

Inbox__86332_messages__19_unread_

The From address and the Subject line are from Sony, but the content looks like it’s from email analytics firm Litmus. What’s going on here?
Opening the mail it looks like a fairly generic “Oops, we lost a class-action lawsuit, have $2 worth of worthless internet points!” email from Sony; no mention of Litmus at all. My first thought is that Mail.app has managed to scramble its summary database and it’s pulling summaries from the wrong email, as I am on a Litmus mailing list or two, but nothing else looks off.
Digging around inside the source of the mail I do find a bunch of tracking gifs from emltrk.com, which is a Litmus domain so there is a Litmus connection there somewhere. Curious.
Finally, about two pages in to the HTML part of the mail I find this:

AHBL Wildcards the Internet

AHBL (Abusive Host Blocking List) is a DNSBL (Domain Name Service Blacklist) that has been available since 2003 and is used by administrators to crowd-source spam sources, open proxies, and open relays. By collecting the data into a single list, an email system can check this blacklist to determine if a message should be accepted or rejected. AHBL is managed by The Summit Open Source Development Group and they have decided after 11 years they no longer wish to maintain the blacklist.
A DNSBL works like this: a mail server checks the sending IP address of every inbound email against a blacklist, and the blacklist responds with either “yes, that IP address is on the blacklist” or “no, I did not find that IP address on the list”. If an IP address is found on the list, the email administrator, based on the policies set up on their server, can take a number of actions such as rejecting the message, quarantining the message, or increasing the spam score of the email.
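In practice that check is a DNS lookup. A minimal sketch, assuming the usual IPv4 DNSBL convention of reversing the octets and appending the list's zone: an address record in 127.0.0.0/8 means "listed" and a failed lookup means "not listed". The zone dnsbl.example.org is a placeholder, and the resolver is passed in so the logic can be exercised without touching the network.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the list's zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """True if the list returns an address in 127.0.0.0/8 for this IP."""
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # no such name: the IP is not on the list
    return answer.startswith("127.")


print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

What the mail server does with a "listed" answer (reject, quarantine, add to a score) is local policy and not part of the lookup.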
The administrators of AHBL have chosen to list the world as their shutdown strategy. The DNSBL now answers ‘yes’ to every query. The theory behind this strategy is that users of the list will discover that their mail is all being blocked and stop querying the list causing this. In principle, this should work. But in practice it really does not because many people querying lists are not doing it as part of a pass/fail delivery system. Many lists are queried as part of a scoring system.
Maintaining a DNSBL is a lot of work, and after years of providing a valuable service you are thanked with the difficulty of decommissioning the list. Popular DNSBLs like the AHBL are used by thousands of administrators and it is a tough task to get them all to stop using the list. RFC 6471 has a number of recommendations, such as increasing the delay in how long it takes to respond to a query, but this does not stop people from using the list. You could change the list’s website to advise people the list is no longer valid, but unlike a person who surfs the web and comes across a 404 page, a computer does not mind checking the same 404 page over and over.
Many mailservers, particularly those only serving a small number of users, are running spam filters in fire-and-forget mode, unmaintained, unmonitored, and seldom upgraded until the hardware they are running on dies and is replaced. Unless they do proper liveness detection on the blacklists they are using (and they basically never do) they will keep querying a list forever, unless it breaks something so spectacularly that the admin notices it.
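For anyone wondering what liveness detection could look like, here is one possible sketch. It relies on the test entries described in RFC 5782: an IPv4 list is supposed to list 127.0.0.2 and never list 127.0.0.1. A list that answers "yes" for 127.0.0.1 is listing the world; a list that answers "no" for 127.0.0.2 is probably dead. The lookup is passed in as a function (a hypothetical `name_is_listed`), so how you resolve names is up to you.

```python
def dnsbl_health(zone: str, name_is_listed) -> str:
    """Check a DNSBL zone against its two well-known test entries.

    name_is_listed(dns_name) -> bool is supplied by the caller.
    """
    lists_never_entry = name_is_listed(f"1.0.0.127.{zone}")   # 127.0.0.1: must not be listed
    lists_always_entry = name_is_listed(f"2.0.0.127.{zone}")  # 127.0.0.2: must be listed
    if lists_never_entry:
        return "lists everything"
    if not lists_always_entry:
        return "not answering"
    return "ok"


# Simulated lists for illustration; dnsbl.example.org is a placeholder zone.
def wildcarded(name):
    return True


def healthy(name):
    return name.startswith("2.0.0.127.")


print(dnsbl_health("dnsbl.example.org", wildcarded))
print(dnsbl_health("dnsbl.example.org", healthy))
```

Run from cron, something like this would let a filter stop using a list as soon as it stops behaving like a list.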
So spread the word,

M3AAWG Recommends TLS

SSL, or Secure Sockets Layer, is a protocol designed to provide a secure way of transmitting information between computer systems. It was originally created by Netscape, released publicly as SSLv2 in 1995 and updated to SSLv3 in 1996. TLS, or Transport Layer Security, was created in 1999 as a replacement for SSLv3. TLS and SSL are most commonly used to create a secure (encrypted) connection between your web browser and websites so that you can transmit sensitive information like login credentials, passwords, and credit card numbers.
M³AAWG published an initial recommendation that urges the disabling of all versions of SSL. It has been a rough year for encryption security, first with the Heartbleed vulnerability in the OpenSSL library, and again with POODLE (“Padding Oracle on Downgraded Legacy Encryption”), which was discovered by Google security researchers in October of 2014. On December 8, 2014 it was reported that some TLS implementations are also vulnerable to the POODLE attack; however, unlike SSLv3, TLS can be patched, whereas SSL 3.0 has a fundamental issue with the protocol.

A code glitch in a new DBL sub-zone known as 'Abused-Legit' caused the new Abused-Legit zone to list ".net." for 60 minutes from 08:35 UTC.

Lorem Ipsum for PII

When you’re developing code to handle data it’s almost essential to have a decent sized set of test data, so you can build a test harness to check on functionality and performance as you go.
A common way of doing that is to take a snapshot of your production database and pull out an appropriate subset from there. That works pretty well in most cases, but it’s a really bad idea if the data you’re working with is personally identifiable information, such as email addresses, phone numbers, credit cards and so on.
Test data gets spread everywhere. It’s checked in to source control systems, copied to developers’ laptops, included in publicly visible bug reports, shared with mailing lists when asking questions and sent to that dodgy overseas outsourcing company your CTO is evaluating. And if the code you’re developing sends email or SMS messages then sooner or later you’re going to misconfigure your test platform and send test messages to the contacts in your test data. (I’ve only done that once, and it was a memorable experience.)
But test data needs to be similar to real data, and look plausible, or it’s hard for manual testers to identify problems using it.
Enter randomuser.me – a simple API for generating random user data – name, email address, birthdate, phone numbers, postal address, social security number, even photos.
Need something more configurable, that lets you create a fake API to test your code against? Try RandomAPI for a web API returning JSON, SQL, CSV or YAML.
Just need some test JSON files you can generate and paste in to your test suite? Try JSON Generator.
Need bulk data, to load into your test database? Look at Mockaroo, DummyData or GenerateData.
Just don’t use your production PII, even if you plan on anonymizing it before use. Really.

Friendly email addresses

Most of the time when we’re talking about email addresses, we’re talking about the actual user@domain format that’s used to send mail over the wire, but that’s not how we most often see them. When they’re used in a To: or From: header they’re usually associated with a display name – the “real name” of the user with the associated email address. In the From: field that’s often called the “friendly from”, but the syntax used in the To:, Cc: and Bcc: fields is identical.
The display name is important, as it’s shown more in mail clients than the actual email address is. Some mobile clients don’t display the email address at all, just the display name.
There are three ways you can put an email address in a header field.
The best way is to wrap the email address itself in angle brackets, and put the display name in front of it.
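If you're generating headers in code, it's usually safer to let a library build that form for you. A small sketch using Python's standard `email.utils`; the names and addresses are made up:

```python
from email.utils import formataddr, parseaddr

# Display name in front, address in angle brackets.
simple = formataddr(("Alice Example", "alice@example.com"))
print(simple)

# A display name containing a comma needs quoting; formataddr handles that.
with_comma = formataddr(("Example, Alice", "alice@example.com"))
print(with_comma)

# parseaddr goes the other way, splitting a header value into (name, address).
name, address = parseaddr(with_comma)
print(name, address)
```

Building the string by hand works until someone's display name contains a comma or a quote, which is why the library route is worth the extra import.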

STARTTLS and misplaced outrage

About a month ago someone posted a heavily elided screenshot that they claimed was evidence of their ISP, AT&T, sabotaging SMTP connections being sent over their network, meaning that anyone could sniff their passwords and traffic.
This is it:

Most email people looking at that saw the asterisks in the banner and went “Oh. That’s not the ISP tampering with the traffic, the person running the mailserver doesn’t know how to configure their PIX firewall.”
It’s a very, very, very, well known issue.
But some groups who should know better, such as Ars Technica and the EFF, don’t seem to understand – even when they know about PIX fixup – that this isn’t tampering by intermediate ISPs, it’s just the operator of the mailserver in question not knowing how to configure his firewall. And it’s not a general attempt by consumer ISPs to “tamper with email encryption”, it’s just the operator of one mailserver not knowing how to configure his firewall.
PIX is a simple NAT/firewall appliance from Cisco. It’s a reasonable firewall, but it has some quirks. One of them is its “MailGuard” or “SMTP fixup” feature. When that’s turned on, it intercepts SMTP traffic and “sanitizes” it, to protect the mailserver from hostile traffic. To do this, it does a couple of things. One is that it blocks any attempt at sending a command that’s not one of the bare basic SMTP commands, by intercepting them and rejecting them with the error “502 5.5.2 Error: command not recognized”. The other is that it hides the software that’s running on the mailserver, removing any mention of it from the banner string sent when you connect. In fact, it replaces any character other than “2” or “0” with an asterisk.
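To give a feel for what that banner rewriting produces, here is a toy imitation. It is not Cisco's code. I'm assuming the three-digit reply code and the separator after it pass through untouched, and applying the "everything except 2 and 0 becomes an asterisk" rule literally to the rest of the line. The banners are invented.

```python
def pix_style_banner(banner: str) -> str:
    """Imitate the banner masking described above.

    Assumption: the first four characters (reply code plus separator) are kept
    as-is; every other character that isn't '2' or '0' becomes '*'.
    """
    code, rest = banner[:4], banner[4:]
    masked = "".join(ch if ch in "20" else "*" for ch in rest)
    return code + masked


print(pix_style_banner("220 mail.example.com ESMTP Postfix"))
print(pix_style_banner("220 mx20.example.net"))
```

A banner that is mostly asterisks with the odd stray 2 or 0 is the giveaway that you are talking to a mailserver through one of these firewalls.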
I had an old PIX that I’ve not used in years, so I thought I’d set it up to show you. Here it is, being guarded by Freddy Chimpenheimer.

I set it up as though it was protecting our mailserver.
Here’s what happens when I connect to the mailserver with the PIX configured correctly:

And here’s what happens when I configure the PIX to use “fixup protocol smtp 25” and try and connect to the mailserver again:

Looks pretty similar to the “ISP tampering with the traffic” screenshot this all started with. I’m using an older PIX firmware image (I really didn’t want to spend the time and money to upgrade my PIX) so it errors out on EHLO, rather than just on STARTTLS. And because this old firmware doesn’t support EHLO, you also don’t see it using “XXX” to block out the string “STARTTLS” in the response to EHLO – the line in the original that says “250-XXXXXXXXA” said “250-STARTTLSA” before the PIX censored it.
Now I have those screenshots I’m going to disconnect my PIX and put it back in the pile of spare networking gear.
So the whole issue is just a mailserver operator who has a badly misconfigured firewall in front of his mailserver, nothing more.

SWAKS: the SMTP Swiss Army Knife

flash_m_laser_1200_900
SWAKS is a general purpose testing tool for SMTP. For basic SMTP testing it’s a more convenient, scriptable alternative to running a transaction by hand, but it also lets you test things that are difficult to do manually, such as authentication or TLS encryption.
It’s a perl script that installs fairly easily on OS X or any Linux/unix system (and can be installed on Windows, if you have perl installed there).
It’s pretty well documented, but it can be a bit overwhelming to start with. Here are some simple recipes:
Send a test email:

I can't click through if you don't exist

Recipients can’t click through if you don’t exist
A tale of misconfigured DNS wrecking someone’s campaign.
I got mail this morning from A Large Computer Supplier, asking me to fill in a survey about them. I had some feedback for them, mostly along the lines of “It’s been two decades since I bought anything other than rackmount servers from you, maybe I’m not a good advertising target for $200 consumer laptops?” so I clicked the link.

Failed_to_open_page

(I’ve replaced the real domain with survey.example.com in this post, to protect the innocent, but everything else is authentic).
That’s not good. The friendly error messages web browsers give sometimes hide the underlying problem, but that looks like a DNS problem. Did they do something stupid, like putting the wrong URL in the mail they sent?

DMARC and report size limits

I just saw an interesting observation on the dmarc-discuss mailing list. Apparently some of the larger providers who are implementing DMARC for inbound email may not be handling some of the grubbier corners of the spec perfectly. That’s not surprising at all – early adopters tend to deploy code that implements early versions of the draft specification – but I can see this particular issue tripping up people who are beginning to deploy DMARC for their outbound mail.
DMARC includes the feature of requesting feedback reports about authentication failures – you just include the email address you want them sent to as a mailto: URI in the rua= and ruf= fields:
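As a rough illustration, here is a sketch that pulls the report destinations out of a DMARC record, including the optional size limit that, as I read the DMARC spec, can be appended to a URI after an exclamation mark (for example `!10m`). The record and addresses below are invented for the example.

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into a dict of tag -> value."""
    tags = {}
    for part in record.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip().lower()] = value.strip()
    return tags


def report_destinations(record: str, tag: str = "rua") -> list:
    """Return (uri, size_limit) pairs for the rua= or ruf= tag; size_limit may be None."""
    destinations = []
    for uri in parse_dmarc(record).get(tag, "").split(","):
        uri = uri.strip()
        if not uri:
            continue
        target, _, limit = uri.partition("!")
        destinations.append((target, limit or None))
    return destinations


# Made-up record for illustration.
record = ("v=DMARC1; p=none; "
          "rua=mailto:dmarc-reports@example.com!10m; "
          "ruf=mailto:dmarc-forensics@example.com")

print(report_destinations(record, "rua"))
print(report_destinations(record, "ruf"))
```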

Alice and Bob and PGP Keys

Last week Alice and Bob showed how to cryptographically sign messages so that the recipient can be sure that the message came from the purported sender and hasn’t been forged by a third party. They can only do that if they can securely retrieve the sender’s public key – which means they need to retrieve it from the actual sender, rather than an impostor, and be sure it’s not tampered with en route. How does this work in practice?
If I want to send someone an encrypted email, or I want to verify that a signed email I received from them is valid, I need a copy of their public key (almost certainly their PGP key, in practice). Perhaps I retrieve it from their website, or from a copy they’ve sent me in the past, or even from a public keyserver. Depending on how I retrieved the key, and how confident I need to be about the key ownership, I might want to double check that the key belongs to who I think it does. I can check that using the fingerprint of the key.
A key fingerprint looks like this:

Alice and Bob Sign Messages

Alice and Bob can send messages privately via a nosy postman, but how does Bob know that a message he receives is really from Alice, rather than from the postman pretending to be Alice?
If they’re using symmetric-key encryption, and Bob is sure that he was talking to Alice when they exchanged keys, then he already knows that the mail is from Alice – as only he and Alice have the keys that are used to encrypt and decrypt messages, so if Bob can decrypt the message, he knows that either he or Alice encrypted it. But that’s not always possible, especially if Alice and Bob haven’t met.
Alice’s shopping list is longer for signing messages than for encrypting them (and the cryptography to real world metaphors more strained). She buys some identical keys, and matching padlocks, some glue and a camera. The camera isn’t a great camera – funhouse mirror lens, bad instagram filters, 1970s era polaroid film – so if you take a photo of a message you can’t read the message from the photo. Bob also buys an identical camera.
sign1
Alice takes a photo of the message.
sign2
Then Alice glues the photo to one of her padlocks.
sign3
Alice sends the message, and the padlock-glued-to-photo to Bob.
sign4
Bob sees that the message claims to come from Alice, so he asks Alice for her key.
sign5
(If you’re paying attention, you’ll see a problem with this step…)
Bob uses Alice’s key to open the padlock. It opens (and, to keep things simple, breaks).
Bob then takes a photo of the message with his camera, and compares it with the one glued to the padlock. It’s identical.
sign6
Because Alice’s key opens the padlock, Bob knows that the padlock came from Alice. Because the photo is attached to the padlock, he knows that the attached photo came from Alice. And because the photo Bob took of the message is identical to the attached photo, Bob knows that the message came from Alice.
This is how real world public-key authentication is often done.
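For the curious, the metaphor maps onto code fairly directly: the funhouse photo is a hash of the message, gluing it to the padlock is an operation with Alice's private key, and opening the padlock is the matching operation with her public key. Below is a toy sketch using textbook RSA with two small, well-known Mersenne primes. It is deliberately insecure (tiny key, no padding) and only there to show the shape of sign-then-verify; real systems use a vetted cryptography library.

```python
import hashlib

# Toy key material: two known Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent (Alice's "key" that Bob asks for)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (Python 3.8+)


def photo(message: bytes) -> int:
    """The 'photo': a hash of the message, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Alice glues the photo to her padlock: apply the private key to the hash."""
    return pow(photo(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Bob opens the padlock with Alice's public key and compares photos."""
    return pow(signature, E, N) == photo(message)


message = b"Meet me at the bar at eight."
signature = sign(message)
print(verify(message, signature))
print(verify(b"Meet me at the bar at nine.", signature))
```

The problem hinted at in the walkthrough is still there in the code: Bob has to get `N` and `E` from somewhere he trusts.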

Who's publishing DMARC?

DMARC is a way for a domain owner to say “If you see this domain in a From: header and it’s not been sent straight from us, please don’t deliver the mail”. If a domain is only used for bulk and transactional mail, it can mitigate a subset of phishing attacks without causing too many problems for legitimate email.
In other cases, it can cause significant problems. Some of those problems impact discussion lists, but others cause problems for ESPs servicing small companies and individuals. ESP customers use their email addresses in the From: field; if they’re a small customer using the email address provided by their ISP, and that ISP publishes a DMARC record with p=reject, a large chunk of the mail they’re sending will bounce. When that happens recipients will stop getting their email, they’ll be removed from the mailing list due to bounces, and there’s some risk of blocks being raised against the sending IP address.
Because of that, it’s good to be able to see what consumer ISPs are doing with DMARC.
I’ve created a tool at dmarc.wordtothewise.com that regularly checks a list of large consumer ISPs and webmail providers and sees what DMARC records they’re publishing.
There are two main variants of DMARC records.
One is policy “reject” – meaning that mail that isn’t authenticated (or for which authentication has been broken in transit) will likely be rejected.
The other is policy “none” – meaning that the ISP publishing the record doesn’t want recipients to change their delivery decisions, but is asking for feedback about its mailstream, and how much of it fails authentication. That can mean that the ISP is evaluating whether or not to publish p=REJECT, or is in the process of deploying p=REJECT. Or it can just mean that they’re using DMARC to monitor where mail using their domain in the From: address is being sent from. There’s no way to tell which is the case unless they’ve made an announcement about their plans.
Hopefully this will be a useful tool to monitor DMARC deployment by consumer ISPs, and to help diagnose delivery problems that may be caused by DMARC.

Cryptography with Alice and Bob

Untrusted Communication Channels
This is a story about Alice and Bob.
Alice wants to send a private message to Bob, and the only easy way they have to communicate is via postal mail.
closedletter
Unfortunately, Alice is pretty sure that the postman is reading the mail she sends.
openletter
That makes Alice sad, so she decides to find a way to send messages to Bob without anyone else being able to read them.
Symmetric-Key Encryption
Alice decides to put the message inside a lockbox, then mail the box to Bob. She buys a lockbox and two identical keys to open it. But then she realizes she can’t send the key to open the box to Bob via mail, as the mailman might open that package and take a copy of the key.
Instead, Alice arranges to meet Bob at a nearby bar to give him one of the keys. It’s inconvenient, but she only has to do it once.
lockstore
After Alice gets home she uses her key to lock her message into the lockbox.
shared1
Then she sends the lockbox to Bob. The mailman could look at the outside, or even throw the box away so Bob doesn’t get the message – but there’s no way he can read the message, as he has no way of opening the lockbox.
shared2
Bob can use his identical key to unlock the lockbox and read the message.
shared3
This works well, and now that Alice and Bob have identical keys Bob can use the same method to securely reply.
Meeting at a bar to exchange keys is inconvenient, though. It gets even more inconvenient when Alice and Bob are on opposite sides of an ocean.
Public-Key Encryption
This time, Alice and Bob don’t ever need to meet. First Bob buys a padlock and matching key.
public1
Then Bob mails the (unlocked) padlock to Alice, keeping the key safe.
public2
Alice buys a simple lockbox that closes with a padlock, and puts her message in it.
public3
Then she locks it with Bob’s padlock, and mails it to Bob.
public4
She knows that the mailman can’t read the message, as he has no way of opening the padlock. When Bob receives the lockbox he can open it with his key, and read the message.
public5
This only works to send messages in one direction, but Alice could buy a blue padlock and key and mail the padlock to Bob so that he can reply.
Or, instead of sending a message in the padlock-secured lockbox, Alice could send Bob one of a pair of identical keys.
publichared
Then Alice and Bob can send messages back and forth in their symmetric-key lockbox, as they did in the first example.
shared2
This is how real world public-key encryption is often done.

Cryptography and Email

A decade or so ago it was fairly rare for cryptography and email technology to intersect – there was S/MIME (which I’ve seen described as having “more implementations than users”) and PGP, which was mostly known for adding inscrutable blocks of text to mail and for some interesting political fallout, but not much else.

That’s changing, though. Authentication and privacy have been the focus of much of the development around email for the past few years, and cryptography, specifically public-key cryptography, is the tool of choice.
DKIM uses public-key cryptography to let the author (or their ESP, or anyone else) attach their identity to the message in a way that’s almost impossible to forge. That lets the recipient make informed decisions about whether to deliver the email or not.
DKIM relies on DNS to distribute its public keys, so if you can interfere with DNS, you can compromise DKIM. More than that, if you can compromise DNS you can break many security processes – interfering with DNS is an early part of many attacks. DNSSEC (Domain Name System Security Extensions) lets you be more confident that the results you get back from a DNS query are valid. It’s all based on public-key cryptography. It’s taken a long time to deploy, but is gaining steam.
TLS has escaped from the web, and is used in several places in email. For end users it protects their email (and their passwords) as they send mail via their smarthost or fetch it from their IMAP server. More recently, though, it’s begun to be used “opportunistically” to protect mail as it travels between servers – more than half of the mail gmail sees is protected in transit. Again, public-key cryptography. Perhaps you don’t care about the privacy of the mail you’re sending, but the recipient ISP may. Google already give better search ranking for web pages served over TLS – I wouldn’t be surprised if they started to give preferential treatment to email delivered via TLS.
The IETF is beginning to discuss end-to-end encryption of mail, to protect mail against interception and traffic analysis. I’m not sure exactly where it’s going to end up, but I’m sure the end product will be cobbled together using, yes, public-key cryptography. There are existing approaches that work, such as S/MIME and PGP, but they’re fairly user-hostile. Attempts to package them in a more user-friendly manner have mostly failed so far, sometimes spectacularly. (Hushmail sacrificed end-to-end security for user convenience, while Lavabit had similar problems and poor legal advice).
Not directly email-related, but after the flurry of ESP client account breaches a lot of people got very interested in two-factor authentication for their users. TOTP (Time-Based One-Time Passwords) – as implemented by SecurID and Google Authenticator, amongst many others – is the most commonly used method. Unlike the rest of this list it isn’t public-key cryptography: it’s built on a shared secret and an HMAC. (And it’s reasonably easy to integrate into services you offer).
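To show how little code that integration can take, here is a sketch of the TOTP calculation as specified in RFC 6238 (which builds on HOTP from RFC 4226), using only the Python standard library. TOTP as specified there is an HMAC over a time-step counter with a secret shared between the server and the user's device. The secret below is the ASCII test secret from the RFCs; a real deployment would generate a random secret per user and store it carefully.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over an 8-byte counter, dynamically truncated."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP with the counter set to the number of elapsed time steps."""
    return hotp(secret, int(unix_time) // step, digits)


rfc_secret = b"12345678901234567890"
print(totp(rfc_secret, 59, digits=8))  # 94287082, the RFC 6238 SHA-1 test vector
```

On the server you would typically also accept the code for the previous time step, to allow for clock drift on the user's device.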
Lots of the other internet infrastructure you’re relying on (BGP, syn cookies, VPNs, IPsec, https, anything where the manual mentions “certificate” or “key” …) rely on cryptography to work reliably. Knowing a little about how cryptography works can help you understand all of this infrastructure and avoid problems with it. If you’re already a cryptography ninja none of this will be a surprise – but if you’re not, I’m going to try and explain some of the concepts tomorrow.

Make Mail.app work for you

Mark Nottingham (@mnot) posted a good idea to twitter:

Highlight e-mails that your MTA receives with TLS. Make sure to include your mail server’s name in the value (here to the left of what’s shown)

The origins of network email

The history of long distance communication is a fascinating, and huge, subject. I’m going to focus just on the history of network email – otherwise I’m going to get distracted by AUTODIN and semaphore and facsimile and all sorts of other telegraphy.

Electronic messaging between users on the same timesharing computer was developed fairly soon after time-sharing computer systems were available, beginning around 1965 – including both instant messaging and mail. I’m interested in network mail, though, so we need to skip forward a few years.
You need a network. And a community.
Around 1968 the initial plans for “ARPANET”, a network to link the various ARPA-funded computers together were underway. Local mail between users on the same system was already a significant part of the nascent community.

Email History through RFCs

Many aspects of email are a lot older than you may think.
There were quite a few people in the early 1970s working out how to provide useful services using ARPANET, the network that evolved over the next 10 or 15 years into the modern Internet.

They used Requests for Comment (RFCs) to document protocol and research, much as is still done today. Here are some of the interesting milestones.
April 1971 RFC 114 A File Transfer Protocol. One of the earliest services that was deployed so as to be useful to people, rather than a required part of the network infrastructure, was a way to transfer files from one computer to another. In the earliest versions of the service I can find it could already append text to an existing file. This was soon used for sending short messages, initially to a remote printer from where it would be sent by internal mail, but soon also to a mailbox where they could be read online.
August 1971 RFC 221 A Mail Box Protocol, Version-2 had this prescient paragraph:

Asynchronous Bounces

There are three ways that an email can fail to be delivered:

Fun with new mailservers

I’m building a new set of mailservers for wordtothewise.com – our existing mailserver was “I’ll repurpose this test box for a week” about four years ago, so it’s long past time.
I tested our new smarthost by sending a test mail to gmail. This is the very first email this IP address has sent in at least three or four years, possibly forever:

DKIM Key Rotation

Several people have asked me about how to rotate DKIM keys in the past few days (as if you’re modifying anything to mitigate replay attacks, you need to invalidate the signatures of all the mail you sent before you made those changes).

DKIM and injected headers

If you look at the DKIM-Signature header in any piece of email signed with DKIM you’ll see that one of the fields it contains, the h= field, lists some email header names, for example:

DKIM replay attacks

Replay attacks on DKIM signed messages
When you receive an email validly signed with DKIM by example.com that might not mean that example.com sent the email to you, or that they even sent this email at all.
What it does tell you is that at some point in the past, example.com signed an email with exactly the same headers and body and sent it to someone. That’s often close enough to the same thing. But if that original recipient were to resend the email to you completely unchanged then the DKIM signature would still validate when you received it. That’s not a bug; it’s one of the design features of DKIM that it typically survives mail forwarding.
That original recipient could also forward the exact same email to a million of their closest friends, and the DKIM signature would validate at each of those million recipients’ ISPs. This is one form of a replay attack, and it isn’t something DKIM prevents.
DKIM doesn’t prevent replay, but does mitigate it
Completely eliminating replay attacks over SMTP is difficult – it’s inherently a store-and-forward protocol, so there’s no way to have the sender and recipient do any sort of handshake to ensure that a particular signature is only used once. It’s not unheard of for email to be delayed for days, and delays of hours aren’t unusual, so allowing a signature to be valid for only a few seconds after it’s sent won’t work. And the design requirement that a DKIM signature survive forwarding means that it has to survive the final recipient’s email address not being the same as the email address the mail was originally sent to, so you can’t include the envelope recipient in the signature.
So what does DKIM do to mitigate replay attacks? The answer to that is surprising – almost everything DKIM does is there to mitigate them. The DKIM signature depends on the body of the message, the subject line and the content of any other headers the sender chooses to include; changing any of that will invalidate the signature. That means that while anyone can grab a copy of an email sent by, for example, paypal and forward it on to someone else, if they modify the content at all it will no longer have paypal’s signature. So an attacker can’t just grab someone else’s signed email and replace it with modified content – and if they can’t do that, where’s the benefit to a spammer or phisher to replay a message?
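To make the “changing the content invalidates the signature” point concrete, here is a minimal sketch of the body hash (the bh= value) that a DKIM signature carries, using the “simple” body canonicalization from RFC 6376. It is an illustration only, not a DKIM implementation: it skips header hashing and the public-key signature entirely, and the function name and sample messages are made up for the example. Note that the envelope recipient is never an input, which is why an unchanged replay still validates.

```python
import base64
import hashlib


def simple_body_hash(body: bytes) -> str:
    """Return a DKIM-style bh= value for a CRLF-terminated message body."""
    # "simple" canonicalization: ignore empty lines at the end of the body...
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    # ...and make sure what is left ends with exactly one CRLF.
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    digest = hashlib.sha256(body).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical message bodies, purely for illustration.
original = b"Your trial account is ready.\r\n"
replayed = original  # forwarded, unchanged, to any number of other recipients
modified = b"Buy cheap pills at example.net.\r\n"

# An unchanged replay hashes identically; any edit to the body does not.
print(simple_body_hash(original) == simple_body_hash(replayed))
print(simple_body_hash(original) == simple_body_hash(modified))
```

The first comparison is true and the second is false: the attacker can resend the message as often as they like, but only with the content that was signed.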
But all that work is for naught if you allow the attacker to choose the content before you sign the message. There are several ways an attacker can do that, but one example that’s particularly relevant today is ESP trial accounts.
I’m stealing your reputation
If you allow anonymous signups for trial accounts that let a potential customer try out your system you’ll want to put very tight limits on how it can be used, so as to avoid spammers signing up and spamming through your servers. Maybe you’ll limit the number of email addresses the trial user can upload, or the number of emails they can send. At the most extreme you might even limit the trial account to sending mail solely to the trial user’s own (confirmed) email address.
But if an attacker can send even one piece of email they create through your trial account to themselves, and you sign that email, they can take it and send it to a million recipients – and it’ll still have your DKIM signature on it so it’ll use your reputation to avoid filters and end up in the recipient’s inbox. And then the recipients will report it as spam, and all that spam will be counted against the reputation associated with your DKIM identifier. If you share a DKIM identifier (“d=”) across all your customers that could cause all your customer mail to start being rejected or sent to the spam folder. (Even if you don’t it could still affect your delivery negatively, as spam filtering systems – both automated and human – sometimes aren’t entirely rational or predictable).
Spam that’s sent like this will be a little “off”, compared to legitimate email – the To: field won’t have the email address of the recipient, for instance, and there’ll be no personalization in the Subject or body of the message. It’s no worse than most spam, and it’s more than balanced out by being able to hijack someone else’s reputation.
So if you provide any way for unvetted non-customers to send email through your systems you should consider adding some DKIM limitations to the constraints you already have on that mail path. Not signing with DKIM at all avoids the problem altogether, but also means you can’t demonstrate your DKIM prowess to legitimate potential customers. You might want to sign with a DKIM d= domain that’s different to your production signatures, perhaps even a completely different top level domain to avoid any risk of confusion (but don’t try and hide that it’s your domain – that’s what spammers do).
Other operational mistakes
There are some grubby corners of the email and DKIM specs that sometimes interact to cause other holes that this sort of reputation hijacker can exploit. I’ll talk about header duplication tomorrow.

Emoji – older than you think

It might just be random 17th Century punctuation, but this poem from 1648 certainly seems to be using a smiley face emoji.
(OK, it’s probably not intentional, but it’s a lovely intersection of the emoji and the word.)

TLS and Encryption

Yesterday I talked about STARTTLS deployment, and how it was a good thing to support to help protect the privacy of your recipients.
STARTTLS is just one aspect of protecting email from eavesdropping; encrypting traffic as the mail is being sent or read and encrypting the message itself using PGP or S/MIME are others. This table shows what approaches protect messages at different stages of the message’s life:
Compromise point | SUBMIT + IMAPS | TLS | PGP / S/MIME
Sender’s computer as mail is sent | - | - | -
Sender’s computer later | - | - | ✓
Sender’s network | ✓ | - | ✓
Sender’s ISP | - | - | ✓
Global Internet (passive) | - | ✓ | ✓
Global Internet (active) | - | ? | ✓
Global Internet (later) | - | ? | ✓
3rd party mail services | - | - | ✓
Recipient’s ISP | - | - | ✓
Recipient’s network | ✓ | - | ✓
Recipient’s computer as mail is read | - | - | -
Recipient’s computer later | - | - | ✓
You can see that if you’re sending really sensitive data, you should be encrypting the entire message with PGP or S/MIME (or not sending the message via email at all). Doing so will protect the content of the mail against pretty much all sorts of attack, but is pretty intrusive for the sender and recipient so can’t really be used without prior agreement with the recipient.
The other approaches will make some sorts of passive surveillance much more difficult, though.
Encrypting the connection a user uses to send mail ([rfc 6409]using the SUBMIT protocol[/rfc]) and to read mail ([rfc 2595]using TLS to protect IMAP or POP3[/rfc]) will protect against passive sniffing when the user is on a possibly hostile network, such as public wifi or an employer’s network. That’s an easy place to try and sniff traffic, and if that traffic isn’t protected an attacker can not only read someone’s email, they can steal their credentials and cause all sorts of havoc. All general purpose mail clients and all ISPs support encryption here, so it’s almost universally used.
STARTTLS use with SMTP is all about protecting email traffic when it’s being sent between ISPs – both between the sender’s ISP and the recipient’s and also between any 3rd party mail services (outsourced spam filtering, mailing list providers, vanity domain forwarders, etc.).
I’ve listed three different sorts of attack on that inter-ISP traffic – passive, active and “later”.
A passive attack is where the attacker has the ability to listen to bytes as they go by, but isn’t able to modify or intercept them. While you might think of this as something a nation state would do, via secret agreements with backbone providers or high-tech fiber optic cable taps, there are ways a smaller attacker might be able to compromise an intermediate router and tap that traffic with little risk of detection. Deploying any sort of STARTTLS will protect against this, even if it’s misconfigured, using expired certificates or even just the default setup of a newly deployed mailserver. Facebook describe these weaker forms of STARTTLS as “opportunistic” in their survey – it’s not perfect, but it’s a lot better than nothing.
An active attack is one where the attacker has the ability to intercept and modify traffic between the two ISPs. This seems like it would be harder to do than a passive attack, but it’s often easier, though not as stealthy. Once that’s done, the attacker can pretend to be the recipient ISP and have full access to read, modify or discard messages. To protect against this sort of attack TLS needs to be used not just to encrypt the traffic in-flight, but also to allow the sender to validate that the mailserver they’re talking to really is who they think it is. This is what Facebook describe as “strict” – it requires that the mailserver have a valid certificate, issued by a legitimate certification authority for the domain that the mail is being sent to.
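The difference between the “opportunistic” and “strict” modes described above can be sketched with Python’s standard ssl and smtplib modules. This is a rough illustration under assumptions, not how any particular mailserver is configured: the function names are invented for the example, and a real MTA has its own settings for this.

```python
import smtplib
import ssl


def opportunistic_context() -> ssl.SSLContext:
    """Encrypt the session but accept any certificate (stops passive sniffing only)."""
    ctx = ssl.create_default_context()
    # Hostname checking has to be switched off before verification can be disabled.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def strict_context() -> ssl.SSLContext:
    """Encrypt and also require a valid, CA-issued certificate for the hostname."""
    return ssl.create_default_context()


def upgrade_to_tls(host: str, ctx: ssl.SSLContext) -> bool:
    """Connect to host on port 25 and try STARTTLS; True if the session was encrypted."""
    with smtplib.SMTP(host, 25, timeout=30) as smtp:
        smtp.ehlo()
        if not smtp.has_extn("starttls"):
            return False
        smtp.starttls(context=ctx)  # raises if strict verification fails
        smtp.ehlo()
        return True
```

With the strict context a forged or mismatched certificate makes starttls() fail rather than silently handing the session to whoever is in the middle.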
What about “later”? It’s easy to imagine a case where an attacker has been passively monitoring and recording encrypted traffic for a while, and then later they manage to acquire the encryption keys that were used (by, for example, issuing a subpoena to the recipient ISP, or using a compromise to rip them out of your server’s memory). With many forms of encryption once you have those private keys it’s possible to decrypt all the traffic you’ve already captured. There are a few algorithms, though, that have what’s known as perfect forward secrecy – knowing the private keys that were used at the time the mail was transferred doesn’t allow you to decrypt the captured traffic at a later time. If you’re concerned about the privacy of your messages, you should definitely read up on how to set that up.
All of these techniques are a great way to defend against ubiquitous or casual attempts to read your messages, but none of them are proof against a determined attacker. If all else fails, there’s always a wrench attack.

Protect your email with TLS

You probably use TLS hundreds of times a day. If you don’t recognize the term, you might know it better by its older name, SSL.
TLS is what protects your data in transit whenever you go to Google, or Yahoo or even this blog. The little padlock in your browser address bar tells you that your browser has used the TLS protocol to do two things. First, it’s decided that the server you’re connecting to really is operated by Google, or Yahoo or us – you’re (probably) not having your session intercepted by someone in the middle between you and the webserver, either to read your traffic or modify it en-route. Second, it is encrypting all the traffic between you and the webserver, so that it can’t be passively monitored while in transit. Because of concerns about ubiquitous surveillance many websites – including ours – are moving to use TLS for everything, not just for protecting a login page or a credit card number.
That’s great for the web, but how does it apply to email? One place it’s used is for connections between your mail client and your local mailserver – sending mail to the smarthost via [rfc 4409]SUBMIT[/rfc] and fetching mail using [rfc 2595]IMAP or POP3[/rfc] almost always use TLS. That protects the privacy of your messages between you and your ISP and also protects the username and password you use to authenticate with.
Mail traveling between ISPs didn’t use to be encrypted “on the wire”, but about 15 years ago [rfc 3207]an extension to SMTP was proposed[/rfc] that would allow ISPs to negotiate during each session whether they should encrypt it or not. This extension, often referred to as STARTTLS after the command it uses, allows gradual rollout of encryption of mail traffic between ISPs without requiring any sort of flag day. A mailserver that supports STARTTLS will tell everyone who connects to it “Hey! I support STARTTLS!”. When a smarthost that also supports it connects to that mailserver it will go “Great! I support STARTTLS too! Let’s do this!” and convert the plain text SMTP session into an encrypted session protected by TLS.
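On the wire, “Hey! I support STARTTLS!” is just one keyword in the server’s reply to EHLO. The sketch below checks a reply for it; the function name, hostname and keyword list are invented for the example, and it assumes it has been handed the complete multi-line 250 response as text.

```python
def advertises_starttls(ehlo_response: str) -> bool:
    """Return True if an EHLO reply lists the STARTTLS extension."""
    lines = ehlo_response.splitlines()
    # The first line is the server's greeting; extensions follow, one per line,
    # as "250-KEYWORD ..." with the final line written "250 KEYWORD ...".
    for line in lines[1:]:
        if not line.startswith("250"):
            continue
        keyword = line[4:].split(" ", 1)[0]
        if keyword.upper() == "STARTTLS":
            return True
    return False


# A made-up reply from a server that supports the extension.
sample_reply = (
    "250-mx.example.com\r\n"
    "250-PIPELINING\r\n"
    "250-STARTTLS\r\n"
    "250 8BITMIME\r\n"
)
print(advertises_starttls(sample_reply))
```

A sending server that sees the keyword issues the STARTTLS command and carries on over TLS; one that doesn’t see it sends in plain text as before, which is what makes the gradual rollout possible.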
Fifteen years seems like a long period in Internet time, but non-intrusive protocol changes can take a long time to deploy. Facebook Engineering have done the work to see how that deployment is going with their survey of the current state of SMTP STARTTLS deployment. The results are really quite positive – over three quarters of the mailservers they sent mail to supported STARTTLS, covering nearly 60% of their users. That’s definitely enough to make supporting STARTTLS worthwhile.
More about TLS and encryption tomorrow.

SMTP Level Rejections

While discussing a draft of a Deliverability BCP document the issue came up of what rejections at different phases of the email delivery transaction can mean. That’s quite a big subject, but here’s a quick cheat sheet.
At initial connection
Dropped or failed connection:

The anatomy of From:

Compared with some of the more complex pieces of the email protocol the From: header seems deceptively simple. But I’ve heard several people be confused about what it’s made up of over the past couple of months, so I thought I’d dig a bit deeper into how it’s defined and how it’s used in practice.
Here’s a simple example:

There are two interesting parts.
The first is what’s technically called the display-name, but more commonly known as the “friendly from” in the bulk email industry. It has no meaning within the email protocol, it’s just text that’s displayed to the recipient to describe who an email was sent by. Because it’s just text, you can put anything you like in there, but it’s usually either the name of the person who wrote the mail or the name of the company or brand that sent it.
The second is the actual email address, the thing with an at-sign in it. Surprisingly, this isn’t used at all during the actual delivery of the email; there’s a hidden field (called the return path or the 5321.MailFrom or the envelope sender or the bounce address) that’s used instead. For person-to-person email it’s usually the same address, but for bulk mail it’s often different.
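As a small illustration of those two parts, Python’s standard library will split a From: value into the display-name and the address. The header value here is a made-up example, not one taken from real mail, and split_from is just a name chosen for the sketch.

```python
from email.utils import parseaddr


def split_from(header_value: str):
    """Split a From: header value into (display name, address, domain)."""
    display_name, address = parseaddr(header_value)
    # The domain is everything after the last at-sign in the address.
    domain = address.rpartition("@")[2]
    return display_name, address, domain


# Hypothetical header value for illustration.
print(split_from('"Example Sender" <newsletter@example.com>'))
```

Nothing in this header says where bounces go; the return path travels separately in the SMTP transaction.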
So what does the actual email address, the 5322.From, mean? For that we go to the document that specifies what email headers mean – RFC 5322, “Internet Message Format”. (RFC 5322 is the updated replacement of the older RFC 822 – and that’s why the actual email address is often called the 822.From or 5322.From when people are being precise about exactly which email address they’re talking about).
RFC 5322 says “The From: field specifies the author of the message, that is, the mailbox of the person or system responsible for the writing of the message.” and “In all cases, the From: field SHOULD NOT contain any mailbox that does not belong to the author of the message”. It’s the email address of the author of the message.
(In some cases the email may have been written by the author, but then sent on their behalf by someone else. RFC 5322 says that in that situation the email address in the From field is still the author of the message. The person who sent the message gets their own field, “Sender:”).
What is the 5322.From used for? During the delivery process it’s used for some sorts of filtering and authentication. In particular, if you’re reading about DMARC you’ll see “identifier alignment” mentioned a lot – which basically means “the only domain we care about authenticating is the one in the 5322.From”. It’s also the usual field that’s used in user-visible mail filtering such as whitelisting email addresses that are in the user’s address book.
In the mail client itself the most obvious use of the 5322.From is that when you hit reply, that’s the email address your reply will go to by default. The author of the mail can override that by adding a Reply-To field, containing one or more email addresses if they want different behaviour. It’s also commonly used to filter email and to group mails by author.
What’s displayed to the end user? Originally the entire content of the From: header was shown in the recipient’s mailbox but it’s now fairly common to display just the friendly from, with no mention of the email address at all. That started in mobile clients, where space is at a premium and the friendly from is just, well, friendlier – but it’s spread to desktop and webmail clients too. In Yahoo webmail the 5322.From isn’t displayed anywhere at all unless you find the View Full Header menu option and dig through the raw headers, and my phone doesn’t display it anywhere obvious and only recently made it possible to see it at all.

If you have servers using SSL, read this

I was going to post about SSL certification and setup today, but the security world got ahead of me.
Recent versions of openssl – the library used by most applications to implement SSL – released over the past couple of years have a critical bug in them. This bug lets any attacker read any information from the process that’s running SSL, reliably, silently and without leaving any trace on the compromised server that it’s happened.
What’s so dangerous about that? As well as things like usernames, passwords, private email and so on, this lets the attacker take the private key for your SSL certificate. Once they have that private key, they can run a server that pretends to be you, even over SSL, opening up all sorts of shenanigans.
There are more details at heartbleed.com – but the short form is that you should check the version of openssl on all your servers – if it’s running openssl version 1.0.1 through 1.0.1f, it’s vulnerable.
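The version range above is easy to check mechanically. This is a sketch with some caveats: the function names are invented, ssl.OPENSSL_VERSION only reports the library that Python itself is linked against (not what your mailserver or webserver uses), and some distributions ship patched builds without changing the version string, so treat a match as a prompt to investigate rather than a verdict.

```python
import re
import ssl


def in_heartbleed_range(version: str) -> bool:
    """True for OpenSSL 1.0.1 through 1.0.1f, the range named above."""
    return re.fullmatch(r"1\.0\.1[a-f]?", version) is not None


def linked_openssl_version() -> str:
    """Version number of the TLS library this Python build is linked against."""
    # ssl.OPENSSL_VERSION is a string of the form "OpenSSL <version> <date>".
    return ssl.OPENSSL_VERSION.split()[1]


print(linked_openssl_version(), in_heartbleed_range(linked_openssl_version()))
```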
You should obviously upgrade to openssl 1.0.1g on vulnerable machines, but given the scope of the potential attack you might want to consider the information on them already compromised. If so, that’d mean replacing the SSL certificates and changing any passwords the affected services have access to (both user passwords and any service passwords, such as database credentials).

More denial of service attacks

There are quite a lot of NTP-amplified denial of service attacks going around at the moment targeting tech and ecommerce companies, including some in the email space.
What does NTP-amplified mean? NTP is “Network Time Protocol” – it allows computers to set their clocks based on an accurate source, and keep them accurate. It’s very widely used – OS X and Windows desktops typically use it by default, and most servers should have it running.
NTP is a UDP based service, like DNS, one that works by sending a packet to a server and the server sending a packet back rather than opening a persistent connection to the server as TCP based services (e.g. SMTP, HTTP, …) do. That simpler protocol means that it’s easy for me to send a request to an NTP server with a false source address, claiming I’m someone else – and the NTP server will send its reply back to that fake source address rather than to me. So if I want to DoS someone by flooding their network with packets I can send NTP requests to a public NTP server claiming to be the victim’s server. The NTP server will send the replies back to the victim – and it’ll be almost impossible for the victim, or the NTP server, to trace where I’m sending those request packets from.
As a malicious attacker that already sounds good – but it gets better. The size of a reply is often bigger than the size of a request, sometimes a lot bigger. If I choose the request I make carefully I can easily make sure the reply is at least an order of magnitude bigger than the request. So for every megabit of forged requests I send to NTP servers, the victim might see at least 10 times that hitting their servers. That’s amplification.
What can you do about it? If you’re running NTP servers that will respond to requests from the general Internet, you ideally need to lock those down so that they’ll only respond to requests from your own clients. You can use the instructions and information at the open NTP project to check to see if you’re running open NTP servers and use the templates provided by Project Cymru as a basis to secure your NTP servers and appliances.
What can you do to prepare for this sort of attack? Have monitoring in place, so that you’re notified if there are large volumes of unexpected traffic. Overprovision your bandwidth, if possible, to give you more time to react. Block “large” (>90 bytes for IPv4, >110 for IPv6) UDP packets with a source or destination port of 123 as far upstream as possible, and all UDP packets that have both a source port of 123 and a destination port of 80 or 25 – this shouldn’t affect legitimate use of NTP by your users. Consider having your production servers use NTP servers operated by you, rather than public NTP servers – that way, if they’re targeted you can block any traffic that looks like NTP to them without affecting their time synchronization. Research DoS mitigation providers – different providers have different strengths and cost structures, and they can be much more reasonably priced if you talk to them before an attack rather than during one.
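The two blocking rules in that paragraph can be written down as a simple predicate, which is handy for sanity-checking a firewall ruleset against sample traffic. This is only a model of the rules as stated above; the function and parameter names are invented, and it assumes the size your filter measures is the one the thresholds refer to.

```python
NTP_PORT = 123


def should_drop(packet_len: int, src_port: int, dst_port: int, ipv6: bool = False) -> bool:
    """Apply the two UDP filtering rules described above to one packet."""
    size_limit = 110 if ipv6 else 90
    # Rule 1: "large" packets to or from the NTP port.
    if (src_port == NTP_PORT or dst_port == NTP_PORT) and packet_len > size_limit:
        return True
    # Rule 2: anything claiming to come from NTP and aimed at web or mail ports.
    if src_port == NTP_PORT and dst_port in (80, 25):
        return True
    return False


print(should_drop(76, 123, 123))     # small packet, NTP port to NTP port
print(should_drop(468, 123, 40000))  # oversized reply from port 123
print(should_drop(76, 123, 80))      # NTP source port aimed at a web port
```

The first call returns False and the other two return True.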
What if you’re targeted by this sort of attack? If you’re not a sysadmin, stay out of your sysadmin’s way and make sure there’s coffee, food and a quiet place without interruptions available. If you are a sysadmin, talk to your upstream NOC. They’re in a much better place, in information, resources and knowledge, than you to help mitigate. Reach out to your peers who are also being attacked and offer to share information. Look at Cisco’s mitigation advice. The attack will probably target your publicly visible website. If so, consider moving that to another network (or behind a commercial DoS mitigation provider) so that your production servers and customer portal web presence aren’t impacted.
More information

Open relays

Spamhaus wrote about the return of open relays yesterday. What they’re seeing today matches what I see: there is fairly consistent abuse of open relays to send spam. As spam problems go it’s not as serious as compromised machines or abuse-tolerant ESPs / ISPs / freemail providers – either in terms of volume or user inbox experience – but it’s definitely part of the problem.
I’m not sure how much of a new problem it is, though.
Spammers scan the ‘net for mailservers and attempt to relay email through them back to email addresses they control. Any mail that’s delivered is a sign of an open relay. They typically put the IP address of the mailserver they connected to in the subject line of the email, making it easy for them to mechanically extract a list of open relays.
We run some honeypots that will accept and log any transaction, which looks just like an open relay to spammers other than not actually relaying any email. They let us see what’s going on. Here’s a fairly typical recent relay attempt:

Compromising a Mail Client

Your entire work life is in your work mail client.
All the people you communicate with – co-workers, friends, family, vendors, customers, colleagues.
Every email you send. Every email you receive. Any files you attach or receive.
If someone can compromise your mail client, they can see all that.
They can save copies of all your emails, data-mine them and use them for whatever purpose they like. They can build a view of your social network, based on who you exchange emails with, and a model of who you are, based on what you talk about.
That companies like Google do this for “free”, advertising supported webmail shouldn’t be much of a surprise by now – but your corporate email system and your work email is secure, right?
What if an attacker were to set up a man-in-the-middle attack on your employees? Install malware on their iPhone, such that all traffic were transparently routed through a proxy server controlled by the attacker?
Or they could use a more email-centric approach, configuring the compromised mail client to fetch mail from an IMAP server controlled by the attacker that took the employee’s credentials and passed them through to their real corporate IMAP server – that would let the attacker completely control what the compromised user saw in their inbox. As well as being able to read all mail sent to that user, they could silently filter mail, they could deliver new mail to the user’s inbox directly, bypassing any mail filters or security. They could even modify the contents of email on-the-fly – adding tracking links, redirection URLs or injecting entirely new content into the message.
Similarly, the attacker could route all outbound mail through a man-in-the-middle smarthost that copied the user’s credentials and used them to send mail on to their real corporate smarthost. As well as being able to read and modify all mail sent, the attacker could also use that access to send mail that masqueraded as coming from the user.
Sounds like the sort of thing you’d expect from criminal malware? Not quite. What I’ve just described is Intro, a new product from LinkedIn.
LinkedIn will be asking your users to click on a link to install a “security profile” to their iPhones. If they do, then LinkedIn will have total control over the phone, and will use that to inject their SMTP and IMAP proxies into your users’ mailstreams. The potential for abuse by LinkedIn themselves is bad enough – I’ve no doubt that they’ll be injecting adverts for themselves into the mailstream, and their whole business is based on monetizing information they acquire about employees and their employers. But LinkedIn have also been compromised in the past, with attackers stealing millions of LinkedIn user credentials – if they can’t protect their own users’ credentials, I wouldn’t trust them with your employees’ credentials.
You might want to monitor where your employees are logging in to your servers from – and suspend any accounts that log in from LinkedIn network space.
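One way to do that monitoring is to compare the client IP address in each login record against a list of network ranges. The sketch below shows the comparison only. The ranges are documentation placeholders, not anyone’s real address space, so you would need to substitute the ranges you actually care about, and the function name is invented for the example.

```python
import ipaddress

# Placeholder ranges (reserved for documentation); replace with real ones.
WATCHED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def from_watched_network(client_ip: str) -> bool:
    """True if a login's client address falls inside any watched range."""
    addr = ipaddress.ip_address(client_ip)
    # Comparing an IPv4 address with an IPv6 network (or vice versa) is simply False.
    return any(addr in network for network in WATCHED_NETWORKS)


print(from_watched_network("192.0.2.55"))
print(from_watched_network("198.51.100.1"))
```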
Edit: Bishop Fox has looked at Intro too, and come to similar conclusions. TechCrunch too.

Ad-hoc analysis

I often pull emails into a database to analyze them, but sometimes I want something simpler. Emails are typically stored in one of two ways: mbox format, where an entire mailbox is stored in a single file, and maildir format, where a mailbox is a directory with one file in it for each email.
My desktop mail application is Mail.app on OS X, and it stores messages in a maildir-ish format, so I’m going to work with that here. If you’re using mbox format mailboxes it’s a little trickier (but you can use a tool called formail to split an mbox style format into a maildir directory and go from there).
I want to gather some statistics on mail I’ve sent to abuse desks, so the first thing I do is open up a terminal window and change directory to where my “Sent Messages” mailbox is:
cd "Library/Mail/V2/IMAP-steve@misc.wordtothewise.com/Sent Messages.mbox"
(Tab completion is really useful for navigating through the mailbox hierarchy.)
Then I need to go through every email (file) in that directory, for each file find the “To:” header and check to see if it was sent to an abuse desk. If it was sent to an abuse desk I want to find the email address for each one, count how many times I see that email address and find the top twenty or so abuse desks I send reports to. I can do all that with a single command line:
find . -type f -exec egrep -m1 '^To:' {} \; | egrep -o 'abuse@[a-zA-Z0-9._-]+' | sort | uniq -c | sort -nr | head -20
(Enter that all as a single line, even though it’s wrapped into two here).
That’s a bit much to understand all at once, so let’s redo that in several stages, with an intermediate file so we can see what’s going on.
find . -type f -exec egrep -m1 '^To:' {} \; >tolines.txt
The find command finds all the files in a directory and does something with them. In this case we start looking in the current directory (“.”), look just for files (“-type f”) and for each file we find we run that file through another command (“-exec egrep -m1 '^To:' {} \;”) and write the result of that command to a file (“>tolines.txt”). The backslash before the semicolon stops the shell from treating it as the end of the command, so that find sees it as the end of the -exec clause. The egrep command we run for each file goes through the file and prints out the first (“-m1”) line it finds that begins with “To:” (“'^To:'”). If you run that and take a look at the file it creates you can see one line for each message, containing the “To:” header (or at least the first line of it).
The next thing to do is to go through that and pull out just the email addresses – and just the ones that are sent to abuse desks:
egrep -o 'abuse@[a-zA-Z0-9._-]+' tolines.txt
This uses egrep a second time, this time to look for text that looks like an abuse desk email address (“'abuse@[a-zA-Z0-9._-]+'”) and when it finds some print out just the part of the line that matched the pattern (“-o”).
Running that gives us one line of output for each email we’re interested in, containing the address it was sent to. Next we want to count how many times we see each one. There’s a command line idiom for that:
egrep -o 'abuse@[a-zA-Z0-9._-]+' tolines.txt | sort | uniq -c
This takes all the lines and sorts (“sort”, reasonably enough) them – so that identical lines will be next to each other – then counts runs of identical lines (“uniq -c”). We’re nearly there – the result of this is a count and an email address on each line. We just need to find the top 20:
egrep -o 'abuse@[a-zA-Z0-9._-]+' tolines.txt | sort | uniq -c | sort -nr | head -20
Each line begins with the count, so we can use sort again, this time telling it to sort by number, high to low (“sort -nr”). Finally, “head -20” will print just the first 20 lines of the result.
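If you’d rather do the counting stage somewhere other than the shell, the same extract, count and take-the-top-twenty steps can be sketched in Python. This mirrors the pipeline above rather than replacing it; the function name is made up, and it assumes you have the “To:” lines available as an iterable of strings (for example, the lines of tolines.txt).

```python
import re
from collections import Counter

# Same pattern as the egrep -o stage.
ABUSE_ADDRESS = re.compile(r"abuse@[a-zA-Z0-9._-]+")


def top_abuse_desks(to_lines, limit=20):
    """Count abuse@ addresses in To: lines and return the most frequent ones."""
    counts = Counter()
    for line in to_lines:
        # findall returns every match on the line, like egrep -o.
        counts.update(ABUSE_ADDRESS.findall(line))
    # most_common does the job of "sort | uniq -c | sort -nr | head".
    return counts.most_common(limit)


# Made-up sample input for illustration.
sample_lines = [
    "To: abuse@example.com",
    "To: abuse@example.net, abuse@example.com",
    "To: friend@example.org",
]
print(top_abuse_desks(sample_lines))
```

For the real data you would pass the open file instead, e.g. top_abuse_desks(open("tolines.txt")).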
The final result is this:

SPF Fail: too many DNS lookups

I’ve had a couple folks come to me recently for help troubleshooting SPF failures. The error messages said the SPF record was invalid, but by all checks it was valid.
Eventually, we tracked the issue down to how many include files were in the SPF record.
The SPF specification specifically limits the number of lookups that can happen during a SPF check.
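The limit in RFC 7208 is ten DNS-querying terms per SPF evaluation, where the terms that count are include, a, mx, ptr, exists and the redirect modifier. As a first-pass check you can count those terms in a record, as in the sketch below. It is deliberately incomplete: it looks at a single record only, whereas a real evaluation also counts the lookups inside every record pulled in by include or redirect, which is exactly where records with many includes go over. The function name and the sample record are invented for the example.

```python
import re

# Terms that cause a DNS query during SPF evaluation.
LOOKUP_TERMS = {"include", "a", "mx", "ptr", "exists", "redirect"}
LOOKUP_LIMIT = 10


def count_spf_lookups(record: str) -> int:
    """Count DNS-querying terms in one SPF record (not following includes)."""
    count = 0
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        # Drop any qualifier, then cut the name off at ":", "/" or "=".
        name = re.split(r"[:/=]", term.lstrip("+-~?"), maxsplit=1)[0].lower()
        if name in LOOKUP_TERMS:
            count += 1
    return count


# Hypothetical record for illustration.
sample = "v=spf1 include:_spf.example.com include:mail.example.net a mx ip4:192.0.2.0/24 ~all"
print(count_spf_lookups(sample), "of", LOOKUP_LIMIT)
```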

New top level domains

ICANN have signed agreements for four new top level domains, all internationalized domains from the 2012 applications for new TLDs.
They are شبكة (“network” or maybe “web” in Arabic), 游戏 (“game” in Chinese), онлайн and сайт (“online” and “website” in Russian).
It’ll take a while for the registries to ramp up their infrastructure, but you might start seeing domains registered in these TLDs as soon as Q1 of next year. Email in internationalized domains is still not really viable, but web pages with fully internationalized URLs certainly are, and they’re likely to get much more popular with these new TLDs.
Can your message composition and reputation monitoring infrastructure handle non-ASCII URLs correctly? If not, you’ve got six months or so to start getting that in place.
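A quick way to see whether a piece of code copes is to feed it a URL with an internationalized hostname and convert the host to its ASCII (“xn--”) form, which is what actually gets looked up in DNS. The sketch below uses Python’s built-in idna codec, which implements the older IDNA 2003 rules, so treat it as a starting point rather than a complete answer. The function name and the test hostname are made up.

```python
from urllib.parse import urlsplit


def ascii_hostname(url: str) -> str:
    """Return the hostname of a URL in its ASCII-compatible (punycode) form."""
    host = urlsplit(url).hostname or ""
    # Each non-ASCII label becomes an "xn--..." label; ASCII labels pass through.
    return host.encode("idna").decode("ascii")


# A made-up hostname under one of the new TLDs.
print(ascii_hostname("http://пример.сайт/landing"))
print(ascii_hostname("http://www.example.com/landing"))
```

Anything that extracts domains from links for reputation lookups needs to handle both forms and treat them as the same domain.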

Weird mail problems today? Clear your DNS cache!

A number of sources are reporting this morning that there was a problem with some domains in the .com zone yesterday. These problems caused the DNS records of these domains to become corrupted. The records are now fixed. Some of the domains, however, had long TTLs. If a recursive resolver pulled the corrupted records, it could take up to 2 days for the new records to naturally age out.
Folks can fix this by flushing their DNS cache, thus forcing the recursive resolver to pull the uncorrupted records.
EDIT: Cisco has published some more information about the problem. ‘Hijacking’ of DNS Records from Network Solutions

What is a dot-zero listing?

Some email blacklists focus solely on allowing their users to block mail from problematic sources. Others aim to reduce the amount of bad mail sent and prefer senders clean up their practices, rather than just blocking them wholesale. The Spamhaus SBL is one of the second type, using listings both to block mail permanently from irredeemable spammers and as short term encouragement for a sender to fix their practices.
All of a blacklist’s infrastructure – and the infrastructure of related companies, such as reputation monitoring services – is based on identifying senders by their IP addresses and recording their misbehaviour as records associated with those IP addresses. For example, one test entry for the SBL is the IP address 192.203.178.107, and the associated record is SBL230. Because of that they tend not to have a good way to deal with entities that aren’t associated with an IP address range.
Sometimes a blacklist operator would like to put a sender on notice that the mail they’re emitting is a problem, and that they should take steps to fix that, but they don’t want to actually block that sender’s mail immediately. How to do that, within the constraints of the IP address based blacklist infrastructure?
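To make the "everything is keyed on an IP address" point concrete: DNS-based blacklists are conventionally queried by reversing the octets of the IP address and appending the blacklist's zone. Here is a minimal Python sketch of building that query name. The zone dnsbl.example is a placeholder, not a real blacklist, and the function name is made up for illustration.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the hostname to look up for an IPv4 address in a DNSBL zone."""
    # Validate the address, then reverse its octets.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


# "dnsbl.example" is a placeholder zone, not a real blacklist.
print(dnsbl_query_name("192.203.178.107", "dnsbl.example"))
```

There is simply nowhere in a name like that to put anything other than an IP address.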
IP addresses are assigned to users in contiguous blocks and there’s always a few wasted, as you can’t use the first or last addresses in that range (for technical / historical reasons). Our main network consists of 128 IP addresses, 184.105.179.128 to 184.105.179.255, but we can’t put servers on 184.105.179.128 (as it’s our router) or 184.105.179.255 (as it’s the “broadcast address” for our subnet).
So if Spamhaus wanted to warn us that we were in danger of having our mail blocked, they could fire a shot across our bow without risk of blocking any mail right now by listing the first address in our subnet – 184.105.179.128 – knowing that we don’t have a server running on that address.
For any organization with more than 128 IP addresses – which includes pretty much all ISPs and ESPs – IP addresses are assigned such that the first IP address in the range ends in a zero, so that warning listing will be for an address “x.y.z.0” – it’s a dot-zero listing.
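The first and last addresses of a block are easy to compute with Python's standard ipaddress module. This is a minimal sketch: the /25 prefix is just how 128 addresses starting at 184.105.179.128 are written in CIDR notation, and 192.0.2.0/24 is a documentation range standing in for a larger allocation.

```python
import ipaddress


def unusable_addresses(cidr: str):
    """Return the first (network) and last (broadcast) address of a block."""
    network = ipaddress.ip_network(cidr)
    return network.network_address, network.broadcast_address


# The 128-address block described above.
first, last = unusable_addresses("184.105.179.128/25")
print(first, last)

# A larger block: the first address ends in zero, hence "dot-zero".
first, last = unusable_addresses("192.0.2.0/24")
print(first, last)
```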

DKIM and DomainKeys, Spam and Ham

I’ve been preaching “DKIM is great! DomainKeys is obsolete, get rid of it!” for several years now. I thought I’d take a look at my mailbox and see who was using authentication.
I’ve divided this into “Ham” and “Spam”. Spam is, well, all the spam I’ve received over the past couple of years. Ham is the non-spam mail in my inbox, whether personal, business, bulk or transactional. I’ve excluded most of the discussion mailing lists I’m on (not least because many of them consist of people in the email industry or are email standards development mailing lists, so have email authentication levels that are way outside the norm).

DNS, SERVFAIL, firewalls and Microsoft

When you look up a host name, a mailserver or anything else there are three types of reply you can get. The way they’re described varies from tool to tool, but they’re most commonly referred to using the messages dig returns – NXDOMAIN, NOERROR and SERVFAIL.
NXDOMAIN is the simplest – it means that there’s no DNS record that matches your query (or any other query for the same host name).
NOERROR is usually what you’re hoping for – it means that there is a DNS record with the host name you asked about. There might be an exact match for your query, or there might not, you’ll need to look at the answer section of the response to see. For example, if you do “dig www.google.com MX” you’ll get a NOERROR response – because there is an A record for that hostname, but no answers because there’s no MX record for it.
SERVFAIL is the all purpose “something went wrong” response. By far the most common cause for it is that there’s something broken or misconfigured with the authoritative DNS for the domain you’re querying so that your local DNS server sends out questions and never gets any answers back. After a few seconds of no responses it’ll give up and return this error.
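Those three names are just labels for a four bit response code (RCODE) in the header of every DNS reply. Here is a minimal Python sketch that pulls the status and the answer count out of a raw response packet, assuming you already have the bytes from somewhere. The function name is made up for illustration, and only the three codes discussed here are given names.

```python
import struct

# The three response codes discussed above (values from RFC 1035).
RCODE_NAMES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}


def response_status(packet: bytes):
    """Return (status name, number of answers) for a raw DNS response."""
    if len(packet) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    _id, flags, _questions, answers, _authority, _additional = struct.unpack(
        "!HHHHHH", packet[:12]
    )
    rcode = flags & 0x000F  # the low four bits of the flags field
    return RCODE_NAMES.get(rcode, f"RCODE {rcode}"), answers


# A hand-built header: a response with RCODE 0 and no answers, which is
# what the "dig www.google.com MX" example above would look like.
header = struct.pack("!HHHHHH", 1, 0x8180, 1, 0, 0, 0)
print(response_status(header))
```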
Microsoft
Over the past few weeks we’ve heard from a few people about significant amounts of delivery failures to domains hosted by Microsoft’s live.com / outlook.com, due to SERVFAIL DNS errors. But other people saw no issues – and even the senders whose mail was bouncing could resolve the domains when they queried Microsoft’s nameservers directly rather than via their local DNS resolvers. What’s going on?
A common cause for DNS failures is inconsistent data in the DNS resolution tree for the target domain. There are tools that can mechanically check for that, though, and they showed no issues with the problematic domains. So it’s not that.
Source ports and destination ports
If you’re even slightly familiar with the Internet you’ve heard of ports – they’re the numbered slots that servers listen on to provide services. Webservers listen on port 80, mailservers on port 25, DNS servers on port 53 and so on. But those are just the destination ports – each connection comes from a source port too (it’s the combination of source port and destination port that lets two communicating computers keep track of what data should go where).
Source ports are usually assigned to each connection pretty much randomly, and you don’t need to worry about them. But DNS has a history of the source port being relevant (it used to always use source port 53, but most servers have switched to using random source ports for security reasons). And there’s been an increasing amount of publicity about using DNS servers as packet amplifiers recently, with people being encouraged to lock them down. Did somebody tweak a firewall and break something?
Both source and destination ports range between 1 and 65535. There’s no technical distinction between them, just a common understanding that certain ports are expected to be used for particular services. Historically they’ve been divided into three ranges – 1 to 1023 are the “low ports” or “well known ports”, 1024-49151 are “registered ports” and 49152 and up are “ephemeral ports”. On some operating systems normal users are prevented from using ports less than 1024, so they’re sometimes treated differently by firewall configurations.
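For reference, those three ranges as a minimal Python sketch. The function name is made up for illustration and the boundaries are the ones given above.

```python
def port_range(port: int) -> str:
    """Name the traditional range a TCP or UDP port number falls in."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    if port <= 1023:
        return "well known"
    if port <= 49151:
        return "registered"
    return "ephemeral"


for port in (53, 1023, 1024, 49152):
    print(port, port_range(port))
```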
While source ports are usually generated randomly, some tools let you assign them by hand, including dig. Adding the flag -b "0.0.0.0#1337" to dig will make it send queries from source port 1337. For ports below 1024 you need to run dig as root, but that’s easy enough to do.
A (slightly) broken firewall
“sudo dig -b "0.0.0.0#1024" live.com @ns2.msft.net” queries one of Microsoft’s nameservers for their live.com domain, and returns a good answer.
“sudo dig -b "0.0.0.0#1023" live.com @ns2.msft.net” times out. Trying other ports above and below 1024 at random gives similar results. So there’s a firewall or other packet filter somewhere that’s discarding either the queries coming from low ports or the replies going back to those low ports.
Older DNS servers always use port 53 as their source port – blocking that would have caused a lot of complaints.
But “sudo dig -b "0.0.0.0#53" live.com @ns2.msft.net” works perfectly. So the firewall, wherever it is, seems to block DNS queries from all low ports, except port 53. It’s definitely a DNS aware configuration.
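The same probes can be scripted if you want to test a lot of ports. Here is a minimal Python sketch that builds a bare-bones A record query and sends it over UDP from a chosen source port. The function names are made up for illustration. As with dig, binding to a port below 1024 generally needs root, and a return value of None just means no reply arrived before the timeout.

```python
import socket
import struct


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet for the A record of hostname."""
    # Header: id, flags (recursion desired), one question, nothing else.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ending in a zero.
    labels = hostname.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels)
    # Question type 1 (A), class 1 (IN).
    return header + qname + b"\x00" + struct.pack("!HH", 1, 1)


def query_from_port(server: str, hostname: str, source_port: int,
                    timeout: float = 5.0):
    """Send a query from source_port; return the raw reply, or None."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("0.0.0.0", source_port))
        sock.sendto(build_query(hostname), (server, 53))
        try:
            reply, _sender = sock.recvfrom(4096)
        except socket.timeout:
            return None
        return reply
```

Calling query_from_port("ns2.msft.net", "live.com", 1023) and then again with 1024 would repeat the comparison above.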
DNS packets go through a lot of servers and routers and firewalls between me and Microsoft, though, so it’s possible it could be some sort of problem with my packet filters or firewall. Better to check.
“sudo dig -b "0.0.0.0#1000" google.com @ns1.google.com” works perfectly.
So does “sudo dig -b "0.0.0.0#1000" amazon.com @pdns1.ultradns.net”.
And “sudo dig -b "0.0.0.0#1000" yahoo.com @ns1.yahoo.com”.
The problem isn’t at my end of the connection, it’s near Microsoft.
Is this a firewall misconfiguration at Microsoft? Or should DNS queries not be coming from low ports (other than 53)? My take on it is that it’s the former – DNS servers are well within spec to use randomly assigned source ports, including ports below 1024, and discarding those queries is broken behaviour.
But using low source ports (other than 53) isn’t something most DNS servers will tend to do, as they’re hosted on unix and using those low ports on unix requires jumping through many more programming hoops and involves more security concerns than just limiting yourself to ports above 1023. There’s no real standard for DNS source port randomization, which is something that was added to many servers in a bit of a hurry in response to a vulnerability that was heavily publicized in 2008. BIND running on Windows seems to use low ports in some configurations. And even unix hosted nameservers behind a NAT might have their queries rewritten to use low source ports. So discarding DNS queries from low ports is one of the more annoying sorts of network bugs – one that won’t affect most people at all, but those it does affect will see it much of the time.
If you’re seeing DNS issues resolving Microsoft hosted domains, or you’re seeing patterns of unexpected SERVFAILs from other nameservers, check to see if they’re blocking queries from low ports. If they are, take a look and see what ranges of source ports your recursive DNS resolvers are configured to use.
(There’s been some discussion of this recently on the [mailop] mailing list.)

CBL website and email back on line

The CBL website is back on line.
It’s possible that your local DNS resolver has old values for it cached. If so, and if you can’t flush your local DNS cache, and you really can’t wait until DNS has been updated then you may be able to put a temporary entry in your hosts file to point to cbl.abuseat.org.
You can get the IP address you need to add by querying the nameserver at ns-2038.awsdns-62.co.uk for cbl.abuseat.org. No, I’m not going to tell you the IP address – if you can’t do a basic DNS query, you shouldn’t be modifying your hosts file and you can just wait a day.

Mail that looks good on desktop and mobile

Over the weekend I noticed a new CSS framework aimed at email rather than web development, “Antwort”.
This isn’t the first or only framework for email content, but this one looks simple and robust, and it allows for content that doesn’t just adapt for different sized displays but looks good on all of them. The idea behind it is to divide your content into columns, magazine style, then display the columns side-by-side on desktop clients and top to bottom on mobile clients. That opens up much more interesting designs than the more common single fluid column approach.

It looks nice, it supports pretty much every interesting email client, but it also comes with some directions based on real world experience.

Images in the subject line

I’ve seen this trick used by a few senders recently, with varying effectiveness.

Where do they get these pictures?
While you can scatter any images you like across the body of your message, the subject line is limited to just text. But “text” is more than just “a, b, c” – using RFC 2047 encoding you can use any character you like, including many tiny pictures.
⛄ 💰 🐘 ✈ 🎁 ☂
☀|||||||☀
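If your composition tool won’t do the RFC 2047 encoding for you, most languages have a library that will. Here is a minimal Python sketch using the standard email.header module. The function names and the subject text are made up for illustration, and whether the library picks base64 or quoted-printable for a given subject is up to the library.

```python
from email.header import Header, decode_header, make_header


def encode_subject(text: str) -> str:
    """Encode a Unicode subject line as ASCII-only RFC 2047 text."""
    return Header(text, "utf-8").encode()


def decode_subject(raw: str) -> str:
    """Decode an RFC 2047 encoded subject line back to Unicode."""
    return str(make_header(decode_header(raw)))


encoded = encode_subject("⛄ Snow day sale")
print(encoded)                  # ASCII only, safe to put in a header
print(decode_subject(encoded))  # what the recipient's client displays
```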
Experian, Vertical Response and Bronto all have some interesting things to say about the effectiveness of using these.
Finding the right glyph can be tricky. Macs have a fairly decent glyph search engine (under Edit > Special Characters… in most applications) while Windows has a fairly mediocre one (Start > All Programs > Accessories > System Tools > Character Map > Advanced View). Both are missing some useful features, though, so I put together something better.
emailstuff.org/glyph lets you search for glyphs by name. It’ll tell you about related glyphs (“helicopter” and “airplane”, or “package” and “wrapped present”) which can help you find the right image when you don’t know its name. And, once you’ve chosen a glyph, it shows how to use it in various encodings (if you’re using a GUI tool or a web form to compose your emails you can probably just copy and paste, but it’s handy for manually editing messages when your composition tool isn’t unicode-friendly).
Will all your recipients be able to see these glyphs? All mail clients support utf-8 text and this sort of encoding so the only issue is whether the recipient has a font installed with the glyph in it. That’s operating system specific, rather than depending on the web browser or mail client, so if you want to test – and you probably should – you can get away with just Windows and OS X for desktop, iOS and Android for mobile.
Have fun! But don’t overdo it.

DKIM and Gmail

After they were a little embarrassed by their own DKIM keys being poorly managed a few months ago, Google seem to have been going through their inbound DKIM handling and tightening up on their validation so that badly signed mail that really shouldn’t be treated as DKIM signed won’t be treated as signed by Gmail.
This is a good thing, especially as things like DMARC start to be layered on top of DKIM, but it does mean that you really need to check your signing configuration and make sure you’re not doing anything silly.

Setting up DNS for sending email

Email – and email filtering – makes a lot of use of DNS, and it’s fairly easy to miss something. Here are a few checklists to help:

Want to learn about Networking and the Internet?

You can trust the “experts” that populate Facebook.

Or you can take An Introduction to Computer Networks from Stanford University.

DMARC Interoperability

Facebook hosted a DMARC interoperability event earlier this week. In terms of protocol development, interoperability events are a sign that the protocol is ready for more widespread use.

More awesome than email

This morning was the final flight of the Space Shuttle Endeavour. In fact, it was the last flight of any shuttle ever, anywhere. We were lucky enough to get passes to NASA Ames Research Center at Moffett Field to watch the flyover.

Open Relays and Mail Sinks

Email is a “store and forward” protocol. The sender doesn’t connect directly to the recipient to send the mail with just one network hop, rather the sender connects to a mailserver (usually referred to as an “MTA”, short for Mail Transfer Agent) and sends the message there. Once that MTA has received the message it sends it on to another MTA, and so on until it reaches the recipient.
Mail clients typically don’t have any intelligence built in to them to decide which MTA to send an email to. Instead they’re configured to blindly send every message to one particular local MTA, the smarthost, which then does all the proper SMTP work to decide where to send it on to.

More than just PGP

Cryptography is the science of securing communication from adversaries. In the email world its most obvious use is tools like PGP or S/MIME that are used to encrypt a message so that it can only be read by the intended recipient, or to sign a message so that the recipient can be sure of who it came from. There are quite a few other aspects of sending email where a little cryptography is useful or essential, though – bounce management, suppression lists, unsubscription handling, DKIM and DMARC, amongst others.

Things Spammers Do

Much like every other day, I got some spam today. Here’s a lightly edited copy of it.
Let’s go through it and see what they did that makes it clear that it’s spam, which companies helped them out, and what you should avoid doing to avoid looking like these spammers…

The 500 mile email

This is a great story from Trey Harris about a real email delivery issue from the mid 1990s.
Here’s a problem that sounded impossible… I almost regret posting the story to a wide audience, because it makes a great tale over drinks at a conference. 🙂 The story is slightly altered in order to protect the guilty, elide over irrelevant and boring details, and generally make the whole thing more entertaining.
I was working in a job running the campus email system some years ago when I got a call from the chairman of the statistics department.
“We’re having a problem sending email out of the department.”
“What’s the problem?” I asked.
“We can’t send mail more than 500 miles,” the chairman explained.
I choked on my latte. “Come again?”
“We can’t send mail farther than 500 miles from here,” he repeated. “A little bit more, actually. Call it 520 miles. But no farther.”
“Um… Email really doesn’t work that way, generally,” I said, trying to keep panic out of my voice. One doesn’t display panic when speaking to a department chairman, even of a relatively impoverished department like statistics. “What makes you think you can’t send mail more than 500 miles?”
“It’s not what I think,” the chairman replied testily. “You see, when we first noticed this happening, a few days ago–”
“You waited a few DAYS?” I interrupted, a tremor tinging my voice. “And you couldn’t send email this whole time?”
“We could send email. Just not more than–”
“–500 miles, yes,” I finished for him, “I got that. But why didn’t you call earlier?”
“Well, we hadn’t collected enough data to be sure of what was going on until just now.” Right. This is the chairman of *statistics*. “Anyway, I asked one of the geostatisticians to look into it–”
“Geostatisticians…”
“–yes, and she’s produced a map showing the radius within which we can send email to be slightly more than 500 miles. There are a number of destinations within that radius that we can’t reach, either, or reach sporadically, but we can never email farther than this radius.”
“I see,” I said, and put my head in my hands. “When did this start? A few days ago, you said, but did anything change in your systems at that time?”
“Well, the consultant came in and patched our server and rebooted it. But I called him, and he said he didn’t touch the mail system.”
“Okay, let me take a look, and I’ll call you back,” I said, scarcely believing that I was playing along. It wasn’t April Fool’s Day. I tried to remember if someone owed me a practical joke.
I logged into their department’s server, and sent a few test mails. This was in the Research Triangle of North Carolina, and a test mail to my own account was delivered without a hitch. Ditto for one sent to Richmond, and Atlanta, and Washington. Another to Princeton (400 miles) worked.
But then I tried to send an email to Memphis (600 miles). It failed. Boston, failed. Detroit, failed. I got out my address book and started trying to narrow this down. New York (420 miles) worked, but Providence (580 miles) failed.
I was beginning to wonder if I had lost my sanity. I tried emailing a friend who lived in North Carolina, but whose ISP was in Seattle. Thankfully, it failed. If the problem had had to do with the geography of the human recipient and not his mail server, I think I would have broken down in tears.
Having established that–unbelievably–the problem as reported was true, and repeatable, I took a look at the sendmail.cf file. It looked fairly normal. In fact, it looked familiar.
I diffed it against the sendmail.cf in my home directory. It hadn’t been altered–it was a sendmail.cf I had written. And I was fairly certain I hadn’t enabled the “FAIL_MAIL_OVER_500_MILES” option. At a loss, I telnetted into the SMTP port. The server happily responded with a SunOS sendmail banner.
Wait a minute… a SunOS sendmail banner? At the time, Sun was still shipping Sendmail 5 with its operating system, even though Sendmail 8 was fairly mature. Being a good system administrator, I had standardized on Sendmail 8. And also being a good system administrator, I had written a sendmail.cf that used the nice long self-documenting option and variable names available in Sendmail 8 rather than the cryptic punctuation-mark codes that had been used in Sendmail 5.
The pieces fell into place, all at once, and I again choked on the dregs of my now-cold latte. When the consultant had “patched the server,” he had apparently upgraded the version of SunOS, and in so doing downgraded Sendmail. The upgrade helpfully left the sendmail.cf alone, even though it was now the wrong version.
It so happens that Sendmail 5–at least, the version that Sun shipped, which had some tweaks–could deal with the Sendmail 8 sendmail.cf, as most of the rules had at that point remained unaltered. But the new long configuration options–those it saw as junk, and skipped. And the sendmail binary had no defaults compiled in for most of these, so, finding no suitable settings in the sendmail.cf file, they were set to zero.
One of the settings that was set to zero was the timeout to connect to the remote SMTP server. Some experimentation established that on this particular machine with its typical load, a zero timeout would abort a connect call in slightly over three milliseconds.
An odd feature of our campus network at the time was that it was 100% switched. An outgoing packet wouldn’t incur a router delay until hitting the POP and reaching a router on the far side. So time to connect to a lightly-loaded remote host on a nearby network would actually largely be governed by the speed of light distance to the destination rather than by incidental router delays.
Feeling slightly giddy, I typed into my shell:

The Physics of the Email Universe

We talk a lot about rules and best practices in email, but we’re mostly talking about “squishy” rules-of-thumb that are based on simplified models of how mail systems, spam filters, recipients, postmasters and blacklist operators behave. They’re the biology, ecology and sociology of the email ecosystem.
There’s another set of rules we tend to only mention in passing, if at all, though. They’re the steely, sharp-edged laws that control the email universe. They’re the RFCs that define how email works and make sure that mail systems written by hundreds of different people across the globe all work and all interoperate with each other.
Building a message from Zeros and Ones
RFC 5322 – Internet Message Format
This tells you everything you need to know about crafting a simple email, with a subject line, a sender, some recipients and a simple plain-text message. It’s also the foundation of all fancier emails. If you’re creating emails, this is where to start.
A little more than plain ASCII
RFC 2047 – MIME Part 3: Message Header Extensions for Non-ASCII Text
RFC 2047 is one small part of the MIME (Multipurpose Internet Mail Extensions) suite of protocols that allow you to include pictures and attachments and prettily formatted text and comic sans in your email. This part defines how you can put things other than the plainest of plain text in your subject lines or in the “friendly from” of your message. It’s what allows you to put Hiragana, or Cyrillic, or umlauts, or cedillas, or properly matched double quotes in your subject line. It also lets you put hearts or smiley faces or other little pictograms there – but nothing this useful is going to be perfect.
RFC 2045 – MIME Part 1: Format of Internet Message Bodies
This shows how to send an image, or a plain text mail in a different character set, or an HTML mail. It doesn’t tell you how to send plain text and HTML, or to send HTML with embedded images, or a message with an attached document. For that you need…
Finally, Modern Email
RFC 2046 – MIME Part 2: Media Types
This builds on RFC 2045 to allow you to have many different chunks in a message – this is what you need if you want to send “proper” HTML mail with a plain text alternative, or if you want embedded images or attachments.
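In practice you rarely build those chunks by hand. Here is a minimal Python sketch of an HTML mail with a plain text alternative, using the standard email.message module. The addresses and content are placeholders and the function name is made up for illustration.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, text, html):
    """Build a multipart/alternative message: plain text plus HTML."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(text)                      # the text/plain part
    message.add_alternative(html, subtype="html")  # the text/html part
    return message


# Placeholder addresses and content, for illustration only.
message = build_message(
    "sender@example.com",
    "recipient@example.net",
    "Hello",
    "Hello, in plain text.",
    "<p>Hello, in <b>HTML</b>.</p>",
)
print(message.get_content_type())
print(message.as_string())
```

Printing the message shows the boundary lines and per-part headers that RFC 2046 describes.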
Getting From A To B
RFC 5321 – Simple Mail Transfer Protocol
A message isn’t much use unless you send it somewhere. RFC 5321 explains the mysteries of actually sending that message over the wire to the recipient. If you need to know about the different phases of a message delivery, what “4xx” and “5xx” actually mean, why there’s not really any such thing as a hard or soft bounce defined, just temporary or permanent failures, or anything else about actually sending mail or diagnosing mail delivery, this is your starting point.
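The first digit of an SMTP reply code is what carries that temporary or permanent distinction. Here is a minimal Python sketch. The function name and the labels are mine, while the meaning of each first digit follows RFC 5321.

```python
def classify_reply(code: int) -> str:
    """Classify an SMTP reply code by its first digit."""
    classes = {
        2: "success",
        3: "intermediate",       # e.g. 354, "go ahead and send the data"
        4: "temporary failure",  # try again later
        5: "permanent failure",  # don't retry unchanged
    }
    if not 200 <= code <= 599:
        raise ValueError("not a valid SMTP reply code")
    return classes[code // 100]


for code in (250, 354, 451, 550):
    print(code, classify_reply(code))
```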
The Rest Of The Iceberg
I’ve only touched on the very smallest tip of the email iceberg here. There’s much, much more – both in RFCs and ad-hoc non-RFC standards. If you’re interested in more, this is a decent place to start.

So you want to start a company? (part 4)

You’re setting up a company (or a new division or maybe even a new brand) and you’d like to use email to communicate with your customers. In this series of posts I’m going to touch on some of the things you can do today to make email life easier for you in the future. Today’s final post is on DNS hosting and setup.

So you want to start a company? (part 3)

So you want to start a company? (part 2)

So you want to start a company? (part 1)

I know your customers' passwords

Go to your ESP customer login page and use “View Source” to look at the HTML (under “Page” on Internet Explorer, “Tools->Web Developer” on Firefox, and “View” on Safari).
Go on, I’ll wait.
Search for the word autocomplete. If it says something like autocomplete="off" then your web developers have already thought about this security issue. If it doesn’t, then you might have a serious security problem.
What’s going on here? You’ve probably noticed that when you’re filling in a web form your browser will often offer to fill in data for you once you start typing. This feature is supported by most modern browsers and it’s very convenient for users – but it works by recording the contents of the form in the browser, including the username and password.
As a bad guy that’s very interesting data. I can take some off-the-shelf malware and configure it with the URLs of a bunch of ESP login pages. Then I just need to get that malware installed on your customers’ desktops somehow. A targeted web drive-by malware attack, maybe based on targeted hostile banner ads, is one approach, but sending email to people likely to be ESP customers is probably more effective. Maybe I’ll use hostile email that infects the machine automatically, or – most likely – I’ll use a phishing attack, sending a plausible looking email with an attachment I’m hoping recipients will open.
Once the malware is installed it can rummage through the user’s browser files, looking for any data that matches the list of login pages I gave it. I just need to sit back and wait for the malware to phone home and give me a nicely packaged list of ESPs, usernames and passwords. Then I can steal that customer’s email lists and send my next phishing run through that ESP.
This isn’t a new issue – it’s been discussed since browsers started implementing autocompletion over a decade ago, and it’s been a best practice to include autocomplete="off" for password fields or login forms for years.
How serious a risk is this for ESPs? Well, I looked at the customer login pages at several ESPs that have a history of being compromised and none of them are using autocomplete="off". I looked at several that haven’t been compromised that I know of, and they’re all using either autocomplete="off" or a complex (and reasonably secure-looking) javascript approach to login. Correlation isn’t causation, but it’s fairly strong circumstantial evidence.
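If you have more than a couple of login pages to check, the "View Source and search" step can be automated. Here is a minimal Python sketch using the standard html.parser module. It reports password fields where neither the field nor its enclosing form sets autocomplete to off. The class and function names are made up for illustration, and it only looks at static HTML, so it will miss forms built by javascript.

```python
from html.parser import HTMLParser


class PasswordFieldAudit(HTMLParser):
    """Collect password inputs that are not covered by autocomplete=off."""

    def __init__(self):
        super().__init__()
        self.form_autocomplete_off = False
        self.unprotected = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_off = (attributes.get("autocomplete") or "").lower() == "off"
        if tag == "form":
            self.form_autocomplete_off = is_off
        elif tag == "input":
            input_type = (attributes.get("type") or "").lower()
            if input_type == "password":
                if not (is_off or self.form_autocomplete_off):
                    name = attributes.get("name") or "(unnamed)"
                    self.unprotected.append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self.form_autocomplete_off = False


def unprotected_password_fields(html: str):
    """Return the names of password fields lacking autocomplete=off."""
    audit = PasswordFieldAudit()
    audit.feed(html)
    audit.close()
    return audit.unprotected


# A made-up login form, for illustration only.
page = '<form><input type="text" name="user"><input type="password" name="pw"></form>'
print(unprotected_password_fields(page))
```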
ESPs should fix this hole if they haven’t already. If any customers are upset about having to actually type in their password (really?) they can take a look at secure password management tools (e.g. 1Password, LastPass or KeePass).
Thanks to Tim at Silverpop for reminding me that this is a serious security hole that many ESPs haven’t plugged yet and pointing me at some of these resources.
More on passwords and application security tomorrow.

DKIM is Done

This was posted to the IETF DKIM Working Group mailing list this morning:

Authentication Cheat Sheet

There are several approaches to authenticating email, and the different authentication methods have a lot of different settings to choose from (sometimes because they’re useful, other times just because they were designed by committee). It’s nice that they have that flexibility for the complex situations that might benefit from them, but almost all the time you just want to choose a good, default authentication approach.
So here’s some short prescriptive advice in no particular order for “how to do email authentication at an ESP well” without the long discussions of alternative approaches and justification of each piece of advice.

Who leaked my address, and when?

Providing tagged email addresses to vendors is fascinating, and at the same time disturbing. It lets me track what a particular email address is used for, but also to see where and when they’ve leaked to spammers.
I’d really like to know who leaked an email address, and when.
All my inbound mail is sorted into “spam” and “not-spam” by a combination of SpamAssassin, some static sieve rules and a learning spam filter in my mail client. That makes it fairly easy for me to look at my “recent spam”. That’s a huge amount of data, though, something like 40,000 pieces of spam a month.
Finding the needle of interesting data in that haystack is going to take some automation. As I’ve mentioned before you can do quite a lot of useful work with a mix of some little perl scripts and some commandline tools.
I’m interested in the first time a tagged address started receiving spam, so I start off with a perl script that will take a directory full of emails, one per file, find the ones that were sent to a tagged address and print out that address and the time I received the email. I can’t rely on the Date: header, as that’s under the control of the spammer, and often bogus. But I can rely on the timestamp my server adds when it receives the email – and it records that in the first Received: header in the message.
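The timestamp extraction is the fiddly part, so here is a minimal sketch of just that step. It is in Python rather than the perl described above and is not the script from the post. It assumes the first Received: header ends with a semicolon followed by the date, which is the usual format, and the function name and sample message are made up for illustration.

```python
import email
from email.utils import parsedate_to_datetime


def received_time(raw_message: str):
    """Return the timestamp from the first Received: header, or None."""
    message = email.message_from_string(raw_message)
    received = message.get_all("Received")
    if not received:
        return None
    # The timestamp is whatever follows the last semicolon.
    _before, separator, stamp = received[0].rpartition(";")
    if not separator:
        return None
    return parsedate_to_datetime(stamp.strip())


# A made-up message: note the bogus Date: header, which is ignored.
sample = (
    "Received: from mail.example.net by mx.example.org with ESMTP;\n"
    "\tMon, 4 Apr 2011 10:15:00 -0700\n"
    "Date: Fri, 1 Jan 1999 00:00:00 +0000\n"
    "To: tagged-address@example.org\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
print(received_time(sample))
```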

Analysing a data breach – CheetahMail

I often find myself having to analyze volumes of email, looking for common factors, source addresses, URLs and so on as part of some “forensics” work, analyzing leaked emails or received spam for use as evidence in a case.
For large volumes of mail where I might want to dig down in a lot of detail or generate graphical or statistical reports I tend to use Abacus to slurp in and analyze all the emails, store them in a SQL database in an easy to handle format and then do the ad-hoc work from a SQL commandline. For smaller work, though, you can get a long way with unix commandline tools and some basic perl scripting.
This morning I received Ukrainian bride spam to a tagged address that I’d only given to one vendor, RedEnvelope, so that address has leaked to criminal spammers from somewhere. Looking at a couple of RedEnvelope’s emails I see they’re sending from a number of sources, so I decided to dig a little deeper.
I started by searching for all emails to that tagged address in my mail client, then copied all the matching emails to a newly created folder. Then I took a copy of that folder and split it into one file per email using a shell one-liner: