Recipients can’t click through if you don’t exist
A tale of misconfigured DNS wrecking someone’s campaign.
I got mail this morning from A Large Computer Supplier, asking me to fill in a survey about them. I had some feedback for them, mostly along the lines of “It’s been two decades since I bought anything other than rackmount servers from you, maybe I’m not a good advertising target for $200 consumer laptops?” so I clicked the link.
(I’ve replaced the real domain with survey.example.com in this post, to protect the innocent, but everything else is authentic.)
That’s not good. The friendly error messages web browsers give sometimes hide the underlying problem, but that looks like a DNS problem. Did they do something stupid, like putting the wrong URL in the mail they sent?
    ~ ∙ host survey.example.com
    Host survey.example.com not found: 3(NXDOMAIN)
“NXDOMAIN”. That means that there are no records in DNS for the hostname I looked up. From my part of the Internet, at least, that hostname doesn’t exist. I used to build DNS software, so I find the variety of ways in which people break their DNS interesting. Time to dig a little deeper.
    ~ ∙ host -t ns example.com
    example.com name server ns2.dreamhost.com.
    example.com name server ns1.dreamhost.com.
    example.com name server ns3.dreamhost.com.
    ~ ∙ host survey.example.com ns1.dreamhost.com
    Using domain server:
    Name: ns1.dreamhost.com
    Address: 126.96.36.199#53
    Aliases:

    Host survey.example.com not found: 3(NXDOMAIN)
Here I look up the authoritative servers for the domain, and find it’s hosted by DreamHost. Then I check the records at one of the authoritative servers, ns1.dreamhost.com, and it’s returning NXDOMAIN too. survey.example.com doesn’t exist. Oops.
I told a little fib
Except … I’ve not been entirely truthful about how I investigate DNS issues. “host” is a user-friendly tool, and it provides nice, brief output for normal queries, so it’s the tool I use when I’m showing queries to clients or putting them on the blog. But I’m a DNS geek, so the tool I actually use is “dig”. Dig is anything but user-friendly. The results it gives you aren’t really interpreted at all, just a human-readable representation of the raw DNS packets – verbose, with lots of output that doesn’t necessarily make sense unless you’re familiar with how DNS works under the covers (RFC 1034 and RFC 1035 if you really want to know).
This is what that last query looks like using dig:
    ~ ∙ dig @ns1.dreamhost.com survey.example.com

    ; <<>> DiG 9.8.3-P1 <<>> @ns1.dreamhost.com survey.example.com
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 54756
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 0
    ;; WARNING: recursion requested but not available

    ;; QUESTION SECTION:
    ;survey.example.com. IN A

    ;; ANSWER SECTION:
    survey.example.com. 14400 IN CNAME example-com.surveygizmo.com.

    ;; AUTHORITY SECTION:
    surveygizmo.com. 14400 IN SOA ns1.dreamhost.com. hostmaster.dreamhost.com. 2011072000 14618 1800 1814400 14400

    ;; Query time: 54 msec
    ;; SERVER: 188.8.131.52#53(184.108.40.206)
    ;; WHEN: Fri Oct 10 10:25:13 2014
    ;; MSG SIZE rcvd: 147
Well … now I’m interested. With dig we can see exactly what the response from the authoritative server is – and it’s very broken. It’s returning an NXDOMAIN response, saying definitively that there are no records of any type for survey.example.com. But it’s also returning an answer record for survey.example.com – a CNAME that redirects to the survey vendor. That’s really not allowed.
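The two header lines that matter in that dig output (the status and the ANSWER count) come straight from the fixed 12-byte header at the start of every DNS message. Here is a minimal Python sketch of decoding that header and flagging the combination seen above. The function names and `SAMPLE_HEADER` are my own illustration; the sample bytes are rebuilt from the values dig printed, not captured from the wire. One caveat I'm assuming matters: the check only flags a response for a closer look. RFC 2308 describes NXDOMAIN responses that carry a CNAME in the answer section, where the name error applies to the CNAME target, so the flag alone doesn't prove a zone is broken.

```python
import struct

# RCODE values defined in RFC 1035, section 4.1.1.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}

# Single-bit header flags, using the names dig prints.
FLAG_BITS = [("qr", 0x8000), ("aa", 0x0400), ("tc", 0x0200),
             ("rd", 0x0100), ("ra", 0x0080)]


def parse_header(message: bytes) -> dict:
    """Decode the fixed 12-byte header of a raw DNS message."""
    if len(message) < 12:
        raise ValueError("DNS message is shorter than the 12-byte header")
    msg_id, flags, qd, an, ns, ar = struct.unpack("!6H", message[:12])
    rcode = flags & 0x000F
    return {
        "id": msg_id,
        "status": RCODES.get(rcode, str(rcode)),
        "flags": [name for name, bit in FLAG_BITS if flags & bit],
        "query": qd,
        "answer": an,
        "authority": ns,
        "additional": ar,
    }


def nxdomain_with_answers(header: dict) -> bool:
    """True when a response says NXDOMAIN but still carries answer records."""
    return header["status"] == "NXDOMAIN" and header["answer"] > 0


# Header rebuilt from the dig output above: id 54756, flags qr aa rd,
# status NXDOMAIN, QUERY 1, ANSWER 1, AUTHORITY 1, ADDITIONAL 0.
SAMPLE_HEADER = struct.pack("!6H", 54756, 0x8503, 1, 1, 1, 0)

if __name__ == "__main__":
    header = parse_header(SAMPLE_HEADER)
    print(header)
    print("needs a closer look:", nxdomain_with_answers(header))
```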
I contacted the firm running the survey and gave them a heads-up that their DNS was broken – and they replied telling me that it was working fine for them. I wonder how that could be.
I have three different DNS resolvers on my network: PowerDNS, a very solid and standards-compliant resolver. BIND, the oldest resolver, often installed by default and full of both features and bugs. And also whatever embedded resolver Mikrotik appliances use, likely similar to the embedded resolvers used in consumer routers.
Let’s see what that record looks like through different resolvers:
    ~ ∙ dig @192.168.80.100 survey.example.com

    ; <<>> DiG 9.8.3-P1 <<>> @192.168.80.100 survey.example.com
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 21305
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;survey.example.com. IN A

    ;; ANSWER SECTION:
    survey.example.com. 14400 IN CNAME example-com.surveygizmo.com.

    ;; Query time: 194 msec
    ;; SERVER: 192.168.80.100#53(192.168.80.100)
    ;; WHEN: Fri Oct 10 14:18:41 2014
    ;; MSG SIZE rcvd: 93
PowerDNS returns what it received from the authoritative server – an NXDOMAIN and an answer. Most applications are going to see the NXDOMAIN and stop there, unable to resolve the hostname.
    ~ ∙ dig @192.168.80.1 survey.example.com

    ; <<>> DiG 9.8.3-P1 <<>> @192.168.80.1 survey.example.com
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 37002
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;survey.example.com. IN A

    ;; Query time: 114 msec
    ;; SERVER: 192.168.80.1#53(192.168.80.1)
    ;; WHEN: Fri Oct 10 14:18:01 2014
    ;; MSG SIZE rcvd: 36
The embedded resolver Mikrotik uses sees the NXDOMAIN response and provides just that, without the answer record.
And finally, BIND:
    steve@scratch:~$ dig @127.0.0.1 survey.example.com

    ; <<>> DiG 9.9.5-3-Ubuntu <<>> @127.0.0.1 survey.example.com
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1911
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 5

    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 4096
    ;; QUESTION SECTION:
    ;survey.example.com. IN A

    ;; ANSWER SECTION:
    survey.example.com. 14400 IN CNAME example-com.surveygizmo.com.
    example-com.surveygizmo.com. 300 IN A 220.127.116.11

    ;; AUTHORITY SECTION:
    surveygizmo.com. 172800 IN NS jack.ns.cloudflare.com.
    surveygizmo.com. 172800 IN NS dina.ns.cloudflare.com.

    ;; ADDITIONAL SECTION:
    dina.ns.cloudflare.com. 172800 IN A 18.104.22.168
    dina.ns.cloudflare.com. 172800 IN AAAA 2400:cb00:2049:1::adf5:3a6b
    jack.ns.cloudflare.com. 172800 IN A 22.214.171.124
    jack.ns.cloudflare.com. 172800 IN AAAA 2400:cb00:2049:1::adf5:3b79

    ;; Query time: 868 msec
    ;; SERVER: 127.0.0.1#53(127.0.0.1)
    ;; WHEN: Fri Oct 10 07:58:13 PDT 2014
    ;; MSG SIZE rcvd: 253
BIND handles it differently (and, I think, wrongly). It sees that there’s an answer, so it returns an answer, along with a lot of other related records. And it returns a NOERROR response, instead of the NXDOMAIN it received. Any client application, such as a web browser, will see that as a perfectly reasonable response, and clicking on the link will work, ending up at surveygizmo.
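To make the difference concrete, here is a toy Python model of the three behaviours I observed. It is only a sketch of what each resolver handed back in the transcripts above, not how PowerDNS, Mikrotik or BIND are implemented. The strategy names, the `Response` type and the `TARGET_RECORDS` table are all hypothetical, and 192.0.2.10 is a documentation address standing in for the real A record.

```python
from typing import NamedTuple


class Response(NamedTuple):
    rcode: str
    answers: tuple  # tuple of (name, record type, value)


# What the authoritative server sent: NXDOMAIN plus a CNAME answer.
UPSTREAM = Response(
    "NXDOMAIN",
    (("survey.example.com.", "CNAME", "example-com.surveygizmo.com."),),
)

# Hypothetical stand-in for a fresh lookup of the CNAME target.
TARGET_RECORDS = {"example-com.surveygizmo.com.": "192.0.2.10"}


def pass_through(upstream: Response) -> Response:
    """Relay rcode and answers unchanged (what I saw from PowerDNS)."""
    return upstream


def rcode_only(upstream: Response) -> Response:
    """On NXDOMAIN, drop the answer records (what I saw from Mikrotik)."""
    if upstream.rcode == "NXDOMAIN":
        return Response(upstream.rcode, ())
    return upstream


def chase_cname(upstream: Response) -> Response:
    """Look up the CNAME target separately and report that result
    (what I saw from BIND)."""
    for _name, rtype, value in upstream.answers:
        if rtype == "CNAME" and value in TARGET_RECORDS:
            extra = (value, "A", TARGET_RECORDS[value])
            return Response("NOERROR", upstream.answers + (extra,))
    return upstream


def client_can_connect(response: Response) -> bool:
    """A typical client needs NOERROR and at least one address record."""
    return response.rcode == "NOERROR" and any(
        rtype == "A" for _name, rtype, _value in response.answers
    )


STRATEGIES = {
    "pass_through": pass_through,
    "rcode_only": rcode_only,
    "chase_cname": chase_cname,
}


def summarize() -> dict:
    """Map each strategy name to whether a client behind it gets an address."""
    return {name: client_can_connect(fn(UPSTREAM))
            for name, fn in STRATEGIES.items()}


if __name__ == "__main__":
    for name, works in summarize().items():
        print(f"{name}: link {'works' if works else 'fails'}")
```

Same upstream response, three different outcomes for the person clicking the link, depending only on which resolver sits in between.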
And the moral of this story is…
Steve gets overly excited by obscure DNS bugs, mostly.
But also, it’s possible to mess up your DNS records such that they work perfectly for you and some fraction of your recipients, while being broken for the rest of your recipients (anyone at an ISP not using BIND in this example). So if you get reports that your links aren’t working (or your SPF records or DKIM records are bad), don’t assume that it can’t be a DNS problem just because everything works correctly when you check it yourself.
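If you want to check this for your own hostnames, one approach is to ask several resolvers the same question and compare the status and answer count each one returns. Below is a rough Python sketch using only the standard library. The function names are mine, it only asks for A records over UDP, and it doesn't verify that the reply ID matches the query, so treat it as a starting point rather than a monitoring tool.

```python
import socket
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record, with recursion desired."""
    header = struct.pack("!6H", query_id, 0x0100, 1, 0, 0, 0)
    labels = hostname.rstrip(".").encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack("!2H", 1, 1)  # QTYPE=A, QCLASS=IN


def summarize_reply(reply: bytes) -> tuple:
    """Return (status, answer count) from a raw DNS reply."""
    if len(reply) < 12:
        raise ValueError("reply is shorter than the 12-byte DNS header")
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", reply[:12])
    rcode = flags & 0x000F
    return (RCODES.get(rcode, str(rcode)), ancount)


def ask(resolver_ip: str, hostname: str, timeout: float = 3.0) -> tuple:
    """Send one UDP query to a resolver and summarize its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (resolver_ip, 53))
        reply, _addr = sock.recvfrom(4096)
    return summarize_reply(reply)


def compare(hostname: str, resolvers: list) -> dict:
    """Ask every resolver for the same hostname; record errors, don't raise."""
    results = {}
    for ip in resolvers:
        try:
            results[ip] = ask(ip, hostname)
        except (OSError, ValueError) as exc:
            results[ip] = ("error", str(exc))
    return results


def disagree(results: dict) -> bool:
    """True when the resolvers did not all return the same summary."""
    return len(set(results.values())) > 1
```

Calling `compare("survey.example.com", [...])` with the addresses of a few resolvers you have access to (ideally running different software) and passing the result to `disagree` would have caught the problem in this post: two resolvers reporting NXDOMAIN and one reporting NOERROR.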