What really is "spam" anyway?

laura
February 7, 2008
Best practices

A few days ago I was reading the attempt by e360 and Dave Linhardt to force Comcast to accept his mail and to stop people posting in the newsgroup news.admin.net-abuse.email from claiming he is a spammer. The bit that pops out at me in this complaint of his, is the fact that he believes that by complying with the minimal standards of the CAN-SPAM act, he is not spamming.
The problem with this claim is that CAN SPAM lists the minimal standards an email must meet in order to avoid prosecution. CAN SPAM does not define what is spam, it only defines the things senders must do in order to not be violating the act. There is no legal definition of spam or of what is not spam.
To add to the confusion there are a number of confusing and contradictory definitions of spam. Definitions people have used over the years include:

unsolicited bulk email
unsolicited commercial email
mail I don’t want
mail I don’t think my customers want
mail that is identical/similar to mail that hit my spamtrap
mail that was sent to a non-existent address at my domain
mail that contains HTML
unsolicited email
mail that advertises Viagra or porn sites or similar
mail that other people send

I rarely use the word spam. There are so many different definitions of spam, I have no way to know if my clients understand what I am saying, so I avoid the term completely. I do think it is important for senders to understand the definitions of spam as used by entities responsible for filtering large amounts of incoming email.
Spamhaus and some other blocking lists use “unsolicited bulk email” as their definition. Generally, they have addresses that have never been used to sign up for email, and if a mailer sends mail to them, the mailer is sending unsolicited bulk email and is eligible for listing on the blocklist. The lists believe that if a mailer is sending one piece of email to a user who did not request it, then they are likely mailing many other users who did not request any mail. This definition centers around permission, and only sending email when you have the permission of the recipient.
Many of the large ISPs use “mail our users complain about” as their definition. With this definition, they do not have to argue permission status with a sender. The data shows that their customers complain about mail from that sender or with that URL. The ISPs are going to block, or deliver to the bulk folder, email that their users do not want.
Filters and some blocking lists use “mail that has characteristics of mail we know is unsolicited bulk mail” as their definition. These characteristics can be things like an invalid HELO string, or lack of reverse DNS on the connecting IP address, or badly formatted HTML. Mail that looks like spam, in the technical sense, is often treated like spam.
Resolving a block or listing requires first understanding the definition that entity is using. For blocklists senders usually must make changes to eliminate any possibility an address will get on the list without permission of the owner of that address. For ISPs, senders must decrease the complaints from users, usually accomplished by improving the signup process, getting a FBL from the ISP and and sending more relevant email. For filters, fixing the technical issues, cleaning up HTML and sending mail that does not look like spam will resolve many of the issues.
Complying with the law is not sufficient to meet the standards of recipients. If e360 is sending mail users are complaining about, then the recipient ISPs are going to treat the mail as spam and filter or block it. If e360 is sending mail to people who have not requested it, then posters in NANAE are going to claim e360 is spamming. Is e360 sending mail that complies with CAN SPAM? I expect that they are. Does this mean they are not spamming as defined by some people? Of course not.

Updating SenderID records for Microsoft

CAN SPAM compliance.

Why do ISPs limit emails per connection?

steve
Jan 17, 2008

Best Practices

A few years ago it was “common knowledge” that if you were sending large amounts of email to an ISP the most polite way to do that, the way that would put the least load on the receiving mailserver, was to open a single SMTP session to the mailserver and then to send all the mail for that ISP down that single connection.
That’s because the receiving mailserver is concerned about two main resources when handling inbound email – the pool of “slots” assigned one per inbound SMTP session, and the bandwidth (network and disk, and related resouces such as memory and CPU) consumed by the inbound mail – and this approach means the sender only uses one slot, and it allows the receiving mailserver to control the bandwidth used simply by accepting data on that one connection at a given rate. It also amortizes all the connection setup costs over multiple emails. It’s a beautiful thing – it just doesn’t get any more efficient than that.
That seems perfect for the receiving ISP – but ISPs don’t encourage bulk senders to do this. Instead many of them have been moving from “one connection, lots of mail through it” to “multiple connections, a few messages through each”. They’re even limiting the number of deliveries permitted over a single connection. Why would that be?
The reason for this is driven by three things. One is that the number of simultaneous inbound SMTP sessions that a mailserver can handle is quite tightly limited by the architecture of most mailservers. Another is that the amount of mail that’s being sent to large ISP mailservers keeps going up and up – so there are sometimes more inbound SMTP sessions asking for access than the mailserver can handle. The third is that ISPs know that there are different categories of email being sent to their users – 1:1 mail from their friends that they want to see as soon as possible, wanted bulk mail that their users want to see when it arrives and spam; lots and lots of spam.
So ISPs want to be able to do things like accept 1:1 mail all the time, while deferring bulk mail and spam to allow them to shed traffic at times of peak load. But they can only make decisions about whether to accept or defer delivery in an efficient way at SMTP connection time – they pick and choose amongst the horde of inbound connection attempts to prioritize some and defer others, letting them keep within the number of inbound sessions that they can handle simultaneously.
But once the ISP lets a bulk mailer connect to deliver their mail, they lose most of the ability to further control that delivery as the sender might send thousands of emails down that connection. (Even if the ISP has the ability to throttle bandwidth – as some do to control obvious spam – that just means that the sender would tie up an expensive inbound delivery slot for longer).
So, in order to allow them to prioritize inbound connections effectively the ISP needs to terminate the session after a few deliveries, and then make that sender start competing with other senders for a connection again.
So ISPs aren’t limiting the number of deliveries per SMTP connection to make things difficult for senders, or because they don’t understand how mail works. They’re doing it because that lets them prioritize wanted email to their users. The same is true when they defer your mail with a 4xx response.
It might be annoying to have to deal with these limits on delivery, but for legitimate bulk mail senders all this throttling and prioritization is a good thing. Your mail may be given less priority than 1:1 mail – but, if you maintain a good reputation, you’re given higher priority than all the spam, higher priority than all the email borne viruses, higher priority than all the junk email, higher priority than the 419 spams. And higher priority than mail from those of your competitors who have a worse reputation than yours.

laura
Nov 9, 2007

Industry

Over the last couple days multiple people have asserted to me that Yahoo is greylisting mail. The fact that Yahoo itself asserts it is not using greylisting as a technique to control mail seems to have no effect on the number of people who believe that Yahoo is greylisting.
Deeply held beliefs by many senders aside, Yahoo is not greylisting. Yahoo is using temporary failures (4xx) as a way to defer and control mail coming into their servers and their users.
I think much of the problem is that the definition of greylisting is not well understood by the people using the term. Greylisting generally refers to a process of refusing email with a 4xx response the first time delivery is attempted and accepting the email at the second delivery attempt. There are a number of ways to greylist, per message, per IP or per from address. The defining feature of greylisting is that the receiving MTA keeps track of the messages (IP or addresss) that it has rejected and allows the mail through the second time the mail is sent.
This technique for handling email is a direct response to some spamming software, particularly software that uses infected Windows machines to send email. The spam software will drop any email in response to a 4xx or 5xx response. Well designed software will retry any email receiving a 4xx response. By rejecting anything on the first attempt with a 4xx, the receiving ISPs can trivially block mail from spambots.
Where does this fit in with what Yahoo is doing? Yahoo is not keeping track of the mail it rejects and is not reliably allowing email through on the second attempt. There are a couple reasons why Yahoo is deferring mail.

laura
Oct 31, 2007

Best Practices

Seth Godin links to a post up over on The Long Tail about spammers who send PR mail to Chris Anderson, an editor at wired. Apparently lots of people send automated email to the editor of Wired hawking their latest and greatest product, service or photos.
In response to this overwhelming amount of mail, Chris has instituted a new email acceptance policy. He says

What really is "spam" anyway?

Related Posts

Why do ISPs limit emails per connection?

Greylisting: that which Yahoo does not do

Wired editor has enough spam!

What really is "spam" anyway?

Share :

Related Posts

Why do ISPs limit emails per connection?

Greylisting: that which Yahoo does not do

Wired editor has enough spam!