After Epsilon lost a bunch of customer lists last week, I’ve been keeping an eye open to see if any of the vendors I work with had any of my email addresses stolen – not least because it’ll be interesting to see where this data ends up.
Yesterday I got mail from Marriott, telling me that “unauthorized third party gained access to a number of Epsilon’s accounts including Marriott’s email list.” Great! Let’s start looking for spam to my Marriott-tagged address, or for phishing targeted at Marriott customers.
I hit what looks like paydirt this morning. Plausible looking mail with Marriott branding, nothing specific to me other than name and (tagged) email address.
It’s time to play Real. Or. Phish?
1. Branding and spelling are all good. It’s using decent stock photos, and what looks like a real Marriott logo.
All very easy to fake, but if it’s a phish it’s pretty well done. Then again, phishes often steal real content and just change out the links.
Conclusion? Real. Maybe.
2. The mail wasn’t sent from marriott.com, or any domain related to it. Instead, it came from “Marriott@marriott-email.com”.
This is classic phish behaviour – using a lookalike domain such as “paypal-billing.com” or “aolsecurity.com” so as to look as though you’re associated with a company, yet to be able to use a domain name you have full control of, so as to be able to host websites, receive email, sign with DKIM, all that sort of thing.
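A minimal Python sketch of that lookalike-domain check follows. The helper names `domain_of` and `belongs_to` are made up for this illustration, and the plain suffix comparison is a simplification: a more careful check would use the Public Suffix List to work out the organizational domain.

```python
def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].strip().lower().rstrip(".")


def belongs_to(address: str, brand_domain: str) -> bool:
    """True if the address is at brand_domain or one of its subdomains.

    Hypothetical helper: a naive suffix match, not a Public Suffix List lookup.
    """
    sender = domain_of(address)
    brand = brand_domain.lower().rstrip(".")
    # Require a leading dot so "notmarriott.com" does not match "marriott.com".
    return sender == brand or sender.endswith("." + brand)


print(belongs_to("Marriott@marriott-email.com", "marriott.com"))
```

For the address in this mail the check comes back false, which is exactly why the From address looks suspicious at first glance.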
3. SPF pass
Given that the mail was sent “from” marriott-email.com, and not from marriott.com, this is pretty meaningless. But it did pass an SPF check.
4. DKIM fail
Authentication-Results: m.wordtothewise.com; dkim=fail (verification failed; insecure key) email@example.com;
As the mail was sent “from” marriott-email.com it should have been possible for the owner of that domain (presumably the phisher) to sign it with DKIM. That they didn’t isn’t a good sign at all.
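If you want to pull the verdicts out of an Authentication-Results header programmatically, something like the rough sketch below would do. The function name `auth_results` is hypothetical, and the regular expression only looks for the `spf`, `dkim` and `dmarc` methods; it is not a full parser for the header syntax.

```python
import re

# Only the methods discussed here; real headers can carry others.
_RESULT = re.compile(r"\b(spf|dkim|dmarc)\s*=\s*([a-z]+)", re.IGNORECASE)


def auth_results(header_value: str) -> dict:
    """Map each authentication method found in the header to its result.

    If a method appears more than once, the last result wins.
    """
    return {
        match.group(1).lower(): match.group(2).lower()
        for match in _RESULT.finditer(header_value)
    }


header = ("m.wordtothewise.com; dkim=fail "
          "(verification failed; insecure key) email@example.com;")
print(auth_results(header))
```

Matching on `method=result` rather than splitting on semicolons matters here, because the parenthesised comment in the header contains a semicolon of its own.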
5. Badly obfuscated headers
From: =?iso-8859-1?B?TWFycmlvdHQgUmV3YXJkcw==?= <Marriott@marriott-email.com>
Base64 encoding of headers is an old spammer trick used to make them more difficult for naive spam filters to handle. That doesn’t work well with more modern spam filters, but spammers and phishers still tend to do it so as to make it harder for abuse desks to read the content of phishes forwarded to them with complaints. There’s no legitimate reason to encode plain ascii fields in this way. SpamAssassin didn’t like the message because of this.
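To see what is hiding in a header like that, Python’s standard `email.header.decode_header` will decode it. The sketch below flags any encoded-word that turns out to be plain ASCII underneath; the function name `needlessly_encoded` is my own invention for this example.

```python
from email.header import decode_header


def needlessly_encoded(header_value: str) -> bool:
    """True if any RFC 2047 encoded-word in the header decodes to pure ASCII."""
    for chunk, charset in decode_header(header_value):
        # Unencoded parts come back with charset None; they are not of interest.
        if charset is None:
            continue
        # Encoded parts come back as bytes in the declared charset.
        if chunk.isascii():
            return True
    return False


from_header = ("=?iso-8859-1?B?TWFycmlvdHQgUmV3YXJkcw==?= "
               "<Marriott@marriott-email.com>")
print(decode_header(from_header)[0])
print(needlessly_encoded(from_header))
```

The encoded part of the From header above decodes to “Marriott Rewards”, which contains nothing that needed encoding in the first place.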
6. Well-crafted multipart/alternative mail, with valid, well-encoded (quoted-printable) plain text and html parts
Just like the branding and spelling, this is very well done for a phish. But again, it’s commonly something that’s stolen from legitimate email and modified slightly.
Conclusion? Real, probably.
7. Typical content links in the email
Most of the content links in the email are to things like “http://marriott-email.com/16433acf1layfousiaey2oniaaaaaalfqkc4qmz76deyaaaaa”, which is consistent with the from address, at least. This isn’t the sort of URL a real company website tends to use, but it’s not that unusual for click tracking software to do something like this.
8. Atypical content links in the email
We also have other links:
- http://ad.doubleclick.net/activity;src=3286198;type=mari1;cat=rwdemls;ord=1; num=[Random Number]?
- http://action.mathtag.com/mm//MARI//red?nm=rwdemls&s0=&s1=& s2=&v0=&v1=&v2=&ri=[Random Number]
(Those “[Random Number]” bits aren’t me hiding things. That’s literally what is in the email.)
That’s an awful lot of other servers this mail is going to try and contact when you read it. I’m pretty sure that most of those are tracking links (but how many legitimate emails that advertise a single company and which are sent directly by that company, need to use half a dozen independent affiliate tracking links?).
Conclusion? Doesn’t look terribly honest. Maybe some sort of affiliate scam rather than a phish, though.
9. Most of the links in the email go to marriott-email.com, but then immediately redirect to marriott.com.
This shows someone is tracking clicks, which is pretty common for mail sent via ESPs, so as to make click tracking information available to the client without the client having to do any work to capture data on their website.
10. The unsubscription link goes to a terrible page with a set of checkboxes, rather than providing a simple unsubscription button.
Conclusion? Sadly, that’s a sign that it’s real.
11. Sending network configuration
It was sent from a machine with reverse DNS of dmailer0112.dmx1.bfi0.com, but which claimed to be called dmx1.bfi0.com, not a valid hostname for the IP address it came from.
This is a pretty common misconfiguration of the network that happens at larger ESPs with complex outbound smarthost farms. I’d expect a phisher not to have that sort of mistake if they were sending from their own machine or through a botnet. And while “dmx1.bfi0.com” could be an obscure end-user DSL, the reverse DNS of dmailer0112 looks like it’s a system intended to send email, not a botnet.
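The comparison between the name a server claims and its reverse DNS can be sketched as below. This is only a string heuristic under my own assumptions: the function name and the three labels (“match”, “related”, “unrelated”) are hypothetical, and no DNS lookups are performed.

```python
def compare_helo_to_rdns(helo: str, rdns: str) -> str:
    """Classify how the claimed (HELO) name relates to the reverse DNS name."""
    helo = helo.lower().rstrip(".")
    rdns = rdns.lower().rstrip(".")
    if helo == rdns:
        return "match"
    # One name being a parent domain of the other suggests the same operator,
    # as with a smarthost farm announcing a shared cluster name.
    if rdns.endswith("." + helo) or helo.endswith("." + rdns):
        return "related"
    return "unrelated"


print(compare_helo_to_rdns("dmx1.bfi0.com", "dmailer0112.dmx1.bfi0.com"))
```

For this mail the two names come out as “related”: not identical, but clearly part of the same network, which fits the sloppy-ESP explanation better than the botnet one.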
You’ve probably guessed by now. It’s real email, sent on behalf of Marriott Rewards through one of their ESPs. But if it takes me several minutes of groveling through the mail before I convince myself it’s real, what chance does a typical consumer have of telling the difference between a well targeted phishing email and a typical piece of commercial email?
DNS of source IP is bfi0.com, that would be the dead giveaway for me – that’s Epsilon. It’s called bfi0.com because of their history as Bigfoot Interactive.
This post depresses me.
While I agree with pgl and your post in general, I would like to disagree with your characterization of base64 encoded headers (re 5. Badly obfuscated headers).
“There’s no legitimate reason to encode plain ascii fields in this way”
Is totally false for those of us in France (or any number of other countries whose languages employ non-ascii characters).
It’s a bit more subtle than that, Justin. If the fields are “plain ASCII”, as these are, then there’s no legitimate reason at all to encode those fields at all (and even if you wanted to encode them to simplify your tool chain, you’d use Q-encoding, not B-encoding).
If you’re using a character set that’s a superset of ASCII, as many mainland European languages use – and where a significant fraction of the characters in the field are plain ASCII – then the appropriate encoding to use is Q-encoding. The whole point of Q-encoding is so that mostly-ascii character sets are encoded in a way that’s efficient and moderately readable when encoded.
That is to say that you should use “Subject: =?iso-8859-1?Q?Notre_Publicit=E9?=” instead of “Subject: =?iso-8859-1?B?Tm90cmUgUHVibGljaXTp?=”. The first is pretty much readable in its encoded form, and shorter.
It’s just the same issue as quoted-printable vs base64 encoding for body text – if your content is an ascii superset then you should be using quoted-printable encoding for the body and Q encoding for headers that need it, and a character encoding that embeds ascii, such as ISO-8859-* or UTF-8.
Base64 encoding and B encoding are more suitable for primarily non-ascii content. Spammers love using them even for primarily ascii content, for the reasons I mention.
If you’re sending primarily ascii content – which if you’re sending French language content, you are – and you’re using a sane encoding such as ISO-8859-1 or UTF-8 then you should be using Q-encoding. That ‘should’ is strong enough that many people and spam filters will assume malice or extreme incompetence if you use B-encoding.
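The Q versus B difference is easy to reproduce. The sketch below uses the header encoders that ship in Python’s `email` package (`email.quoprimime.header_encode` and `email.base64mime.header_encode`); it is just one way to generate the two forms for comparison, not a claim about how any particular ESP builds its headers.

```python
from email import base64mime, quoprimime
from email.header import decode_header

text = "Notre Publicit\u00e9"        # "Notre Publicité"
raw = text.encode("iso-8859-1")      # mostly ASCII, one non-ASCII byte

# Build the same header value as a Q-encoded and a B-encoded word.
q_word = quoprimime.header_encode(raw, charset="iso-8859-1")
b_word = base64mime.header_encode(raw, charset="iso-8859-1")

print(q_word, len(q_word))
print(b_word, len(b_word))

# Both forms decode back to the same text.
for word in (q_word, b_word):
    decoded, charset = decode_header(word)[0]
    print(decoded.decode(charset))
```

The Q form keeps “Notre_Publicit=E9” legible to a human reading raw headers, and for mostly-ASCII text like this it is also the shorter of the two.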
While these are all legitimate ways to determine the authenticity of a mail, and as your post pointed out, at least provide some insight and clues – there’s one thing that is missing.
What was the purpose of the email? When I look out for phishing, I’m concerned about links that take me to login pages or other types of pages that try to gather my personal information.
If such links are not present, I can write that mail off as legitimate in most cases. (And if it’s still not real, I’m not really concerned about it)
Regarding the base64 encoded headers, you are correct in saying that encoding is only necessary when non 7bit ASCII characters are used. This is a problem worldwide when sending in any language other than English. Spanish in the US for instance.
The correct procedure would be to only encode WHEN those characters appear, but as is often the case, encoding is done “just in case”.
Many of the “good” phishes I see work by taking a legitimate email and simply replacing the “login to our website” link with a link to the phishers site instead. That approach means that it’s hard to judge based on appearance, other than (as you say) being suspicious of any email that has a link to “your account” or any link to a website that challenges you to log in. The Marriott mail had several links of that sort, all going to marriott-email.com. (That was one of the half-dozen other points I could have mentioned, but I didn’t want to make the post even longer than it already was 🙂 )
As for the encoding – if the primary content you send is western european languages then your toolchain should be preferring Q-encoding and quoted-printable. A smart toolchain that only encodes as needed is certainly nice, but I would be fairly happy with one that Q-encodes everything.
But B-encoding or Base64 encoding of all content, even plain text or primarily plain text encoding, in that situation is a sign of incompetence or malice, though.
Key indicators of legitimate vs phish messages: The goal of the message is to try to extract your password, financial, or other personal information
Hey Steve, thanks for the clarification. For some real world data, we send email with three systems: two large(ish for the French market) ESPs and our own internal system.
Our internal system auto detects high ascii characters and does indeed use Q encoding when necessary.
The two ESPs however use B encoding for everything (even subjects with no high ascii characters).
At my previous employer we auto detected and then, if I recall correctly, base64 encoded (I should remember because I wrote it, but hey, it was a while ago). I am going to go out on a limb on this one and say that the reason this was done was because we were dealing with European, Asian and Arabic languages using UTF8 and we were disinclined to make the Q vs B vs nothing decision, when the B vs nothing seemed acceptable. As I already qualified, this decision was taken quite some time ago and my memory is foggy, but it seems reasonable.
I am guessing the two French ESPs we use don’t get a lot of asian language content, but they could well have a decent amount of Arabic.
All of this is to say that I think your final statement here:
“But B-encoding or Base64 encoding of all content, even plain text or primarily plain text encoding, in that situation is a sign of incompetence or malice, though.”
Is a bit harsh. The RFC (http://tools.ietf.org/html/rfc2047#section-4) permits both and while it recommends Q encoding for mostly ascii text, it does not say that one is incompetent if one chooses B.