Why do my URLs have two dots?

You take a turn, I take a turn

At the SMTP level, email is very much a simple, line-by-line, text-based protocol. The client sends a command on a single line, the server responds with one or more lines (the last one marked by having a space in the fourth column, right after the three-digit reply code), and then the client sends another command.
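To make the "space in the fourth column" rule concrete, here is a minimal sketch of how a client might collect one multi-line reply. The function names and the sample server responses are made up for illustration, not taken from any particular mail library.

```python
def is_final_reply_line(line):
    """A reply line is the last one if the character after the
    three-digit code is a space (or if there is nothing after the code)."""
    return len(line) <= 3 or line[3] == " "


def read_reply(lines):
    """Consume lines from an iterator until one complete reply has been read."""
    reply = []
    for line in lines:
        reply.append(line)
        if is_final_reply_line(line):
            break
    return reply


# Hypothetical server output: a three-line reply followed by a one-line reply.
session = iter([
    "250-mail.example.com greets you",
    "250-SIZE 35882577",
    "250 HELP",
    "354 End data with <CR><LF>.<CR><LF>",
])
print(read_reply(session))  # the three 250 lines
print(read_reply(session))  # the single 354 line
```

Continuation lines have a hyphen after the code, so checking one character is enough to know when it's the client's turn again.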

The main exception to that is when the client sends the payload of the email. Once the server has said it’s ready to receive the email, the client sends the whole thing, then tells the server it’s finished by sending a line consisting of just a single period, “.”.

But what if you have a line in your email that’s just a single period? The developers of SMTP thought of that! Whenever the SMTP client sees a line that begins with a period, it has to add another one. And whenever the SMTP server sees a line that begins with a period, it has to remove it.

This is called “dot-stuffing”. If an email has the line “.”, it’ll be sent on the wire as “..”. Email with the line “.yakko.wakko.dot.” will be sent as “..yakko.wakko.dot.”.
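As a rough sketch, the two halves of dot-stuffing can be written as a pair of functions over a list of lines. The names `dot_stuff` and `dot_unstuff` are mine, and a real implementation would work on a CRLF-delimited byte stream rather than a Python list.

```python
def dot_stuff(lines):
    """Client side: add an extra '.' to any line that begins with '.'."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines):
    """Server side: remove the first '.' from any line that begins with '.'."""
    return [line[1:] if line.startswith(".") else line for line in lines]


body = ["Hello", ".", ".yakko.wakko.dot.", "Goodbye"]
wire = dot_stuff(body)
print(wire)               # what travels over the connection
print(dot_unstuff(wire))  # what the server stores
```

Stuffing once and un-stuffing once gets you back the original lines. The trouble starts when the two counts don't match.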

If everything works properly, nobody will ever notice the dot-stuffing. Often something goes wrong, though. One way to break things is for a sender to build a message with a composition API that produces mail that’s already dot-stuffed and ready to send, then send it via an API that expects non-dot-stuffed content and stuffs it again. Another is for the receiving server, or an intermediate server, to not do un-stuffing properly.

Understuffed, overstuffed, wombling free

The symptoms vary, depending on whether the mail is over-stuffed or under-stuffed. If it’s not stuffed enough, and there’s a line in the email which consists of a single dot, then the receiving server will see that line as the end of the message and accept the (truncated) message for delivery at that point. It will then throw an error for every line the client sends after that, as it’s expecting SMTP commands, not a message body.
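Here is a small simulation of the under-stuffed case, assuming a toy server that reads lines until it sees a lone dot. `receive_data` is a hypothetical helper written for this illustration, not real server code.

```python
def receive_data(wire_lines):
    """Read message lines up to the first lone '.', un-stuffing as we go.

    Returns (message_lines, leftover_lines). The leftover lines are what
    the server would then try to interpret as SMTP commands.
    """
    message = []
    for i, line in enumerate(wire_lines):
        if line == ".":
            return message, wire_lines[i + 1:]
        message.append(line[1:] if line.startswith(".") else line)
    raise ValueError("no end-of-data marker seen")


body = ["first paragraph", ".", "second paragraph"]

# Bug: the client sends the body without dot-stuffing it first,
# then adds the real end-of-data marker.
understuffed_wire = body + ["."]

message, leftover = receive_data(understuffed_wire)
print(message)   # truncated at the body's own '.' line
print(leftover)  # the rest, which the server will reject as bad commands
```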

If it’s over-stuffed then the problem is a bit less obvious. Any line that begins with a single dot when it’s sent will begin with two dots when it’s received.

You might think that it’d be pretty rare for a line to begin with a dot, and even rarer for anyone to notice a problem if that dot is doubled. But there’s one case where it’s quite likely.

Quoted-Printable encoded HTML

HTML sent quoted-printable encoded is the normal format for commercial emails. The HTML gives us all the rich content we like, while the quoted-printable encoding lets us not have to worry so much about sending non-ASCII characters or violating SMTP specs by accidentally sending lines longer than 998 characters.

Quoted-printable encoding is pretty simple. Some characters are encoded as three ASCII characters: an equals sign, followed by two hex digits representing the character. So “=” is quoted-printable encoded as “=3D”. And long lines are wrapped so that no encoded line is longer than 76 characters, with an = sign at the end of the line to mark the wrapping. There are a few other rules, but that’s about it.
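If you want to see those two rules in action, Python's standard `quopri` module will do the encoding. This is just a quick illustration with made-up sample text; exactly where the encoder chooses to wrap a long line is up to the implementation.

```python
import quopri

# "=" becomes "=3D".
print(quopri.encodestring(b"a=b"))

# A long line gets wrapped with a trailing "=" (a "soft line break").
original = b'<a href="http://whatever.example.com/' + b"x" * 80 + b'">click here</a>'
encoded = quopri.encodestring(original)
print(encoded.decode("ascii"))

# Decoding removes the soft line breaks and restores the original bytes.
decoded = quopri.decodestring(encoded)
print(decoded == original)
```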

(We have a tool that’ll decode quoted-printable, if you ever need that, on tools.wordtothewise.com.)

If you have a line that looks like …

about forty-five characters of text <a href="http://whatever.example.com">click here</a>

… it will quoted-printable encode to look like …

about forty-five characters of text <a href=3D"http://whatever.example=
.com">click here</a>

… then overstuffing will mean it ends up at the recipient as …

about forty-five characters of text <a href=3D"http://whatever.example=
..com">click here</a>

… and the mail client will quoted-printable decode it to be …

about forty-five characters of text <a href="http://whatever.example..com">click here</a>

That looks almost identical to the original, but the link won’t work because of the doubled dot in the hostname.
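The whole chain can be reproduced in a few lines. This is a sketch under simple assumptions: the message is a list of lines, the sender's pipeline stuffs twice by mistake, and the receiver un-stuffs once, as it should. The helper names are mine.

```python
import quopri


def dot_stuff(lines):
    """Add an extra '.' to any line that begins with '.'."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines):
    """Remove the first '.' from any line that begins with '.'."""
    return [line[1:] if line.startswith(".") else line for line in lines]


# The quoted-printable encoded lines from the example above.
encoded_lines = [
    'about forty-five characters of text <a href=3D"http://whatever.example=',
    '.com">click here</a>',
]

overstuffed = dot_stuff(dot_stuff(encoded_lines))  # bug: stuffed twice
received = dot_unstuff(overstuffed)                # server un-stuffs once

# The mail client joins the lines and quoted-printable decodes them.
decoded = quopri.decodestring("\n".join(received).encode("ascii")).decode("ascii")
print(decoded)
```

The only line affected is the one that happened to wrap just before a dot, which is why the breakage looks so random from the outside.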

The symptom is that some links in the mail you send won’t work when it’s received. Which links break depends on the content and layout of your email, because that decides where the quoted-printable line wraps fall. If the stuffing bug is at your end, those links will be broken for every recipient. If the bug is at the recipient’s mailserver or a forwarding middlebox, they’ll be broken only for the recipients behind it.

Mitigation

If your mail generation and delivery pipeline isn’t handling dot-stuffing correctly you should fix that. You can test it by sending yourself a plain text email with a line that starts with a dot – if it’s doubled, there’s a problem.

If the issue is at the receiver it’s harder to identify that it’s happening, let alone get it fixed. One way to mitigate in that case would be to configure your quoted-printable encoder to encode “.” as “=2E”. That way no raw “.” will appear in your messages, so there’s no way it can get overstuffed.
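Not every quoted-printable library has a setting for this, so here is a minimal sketch of what such an encoder could look like. `qp_encode_no_dots` is a hypothetical function written for this post; it assumes UTF-8 text with "\n" line endings and ignores some corner cases a production encoder would handle.

```python
import quopri


def qp_encode_no_dots(text, max_len=76):
    """Quoted-printable encode text, also encoding every '.' as '=2E'."""
    out_lines = []
    for line in text.split("\n"):
        data = line.encode("utf-8")
        tokens = []
        for i, byte in enumerate(data):
            ch = chr(byte)
            is_last = i == len(data) - 1
            needs_encoding = (
                (byte < 32 and byte != 9)        # control characters except tab
                or byte > 126                    # DEL and non-ASCII bytes
                or ch in "=."                    # '=' always, '.' by our choice
                or (ch in " \t" and is_last)     # trailing whitespace
            )
            tokens.append("=%02X" % byte if needs_encoding else ch)
        # Wrap with soft line breaks, never splitting an "=XX" token.
        current = ""
        for tok in tokens:
            if len(current) + len(tok) > max_len - 1:
                out_lines.append(current + "=")
                current = ""
            current += tok
        out_lines.append(current)
    return "\n".join(out_lines)


html = 'about forty-five characters of text <a href="http://whatever.example.com">click here</a>'
encoded = qp_encode_no_dots(html)
print(encoded)

# A standard decoder still recovers the original text.
print(quopri.decodestring(encoded.encode("ascii")).decode("utf-8") == html)
```

Encoding every dot makes the message a little larger. Encoding only dots at the start of a line would be enough to dodge the stuffing bug, but doing them all is simpler to reason about.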

1 comment

  • My problem is with understuffing somewhere. Some quoted-printable HTML, which has many dots scattered in URLs, occasionally shows up with broken URLs where a dot is missing from the beginning of a wrapped portion of text. A recent example was in the !DOCTYPE header, where the http://www.w3.org name was not conveyed.

    Question: how does a user begin to identify the problem site(s) and notify them?

By steve
