GFI/SORBS – a DDoS Intermezzo

I’ve been stage-managing for a production of The Nutcracker this week, so musical terminology is on my mind. In opera, the intermezzo is a comedic interlude between acts of an opera seria.
This comedic interlude is about the “DDoS” – a distributed denial of service attack. What is a denial of service attack?

… an attempt to make a computer resource unavailable to its intended users.
One common method of attack involves saturating the target machine with external communications requests, such that it cannot respond to legitimate traffic, or responds so slowly as to be rendered effectively unavailable.
– Wikipedia on DoS attacks

That’s pretty much what we’re discussing here. There are a variety of ways to mount a DoS attack, but by far the most common – the sort that’s characterized by descriptions of “gigabits per second” or “packets per second” – is simply sending high volumes of network traffic at a webserver, or at the network routers the webserver relies on. The traffic might be pure garbage, or it might be valid web requests, but the attacker’s goal is to overwhelm either the server itself or, more commonly, the network pipe connecting the server to the internet, such that it can’t provide its service to the public. At its most basic level the symptom of a DoS attack is that you can’t reach any web page on the server that’s under attack.
(What’s a distributed denial of service attack? It’s an implementation detail – the attacker uses multiple machines to attack simultaneously, enabling them to provide more attack traffic and to make that traffic a little harder to block.)
So that’s what a DoS attack can do – make a service unavailable. What can’t a DoS attack do? It can’t make any changes to the server(s) under attack. It can’t deface a web page. It can’t cause a web service to give wrong answers. It can’t corrupt information stored in a database. It can’t cause a blacklist to add false listings. That last point is fairly important – no DDoS against any blacklist infrastructure can cause it to add false listings.

How does a DDoS insert bad records into your blacklist? That is the real issue. If that would stop happening, then a DDoS would have no impact on removal of the bad records.
– insightful comment on Friday’s post

And yet, GFI/SORBS has blamed its operational problems – false positives, publication of stale data, refusal to delist addresses and so on – on “DDoS attacks” sufficiently often that it’s a running joke in the email industry.

Once again SORBS has reactivated old DUL-Listings, e.g. for 85.25.230.x. This happened back in October as well, and that time you also claimed DDoS.
– From a comment on Friday’s post

So let’s look at the evolution of the latest SORBS incident. I don’t usually pay much attention, as SORBS listings don’t affect me or my customers at all, but this time around I was researching this series of posts, so I watched what was going on reasonably carefully.
On Thursday the www.sorbs.net webserver was reasonably responsive: static pages and files were returned fairly quickly, but pages that needed to access the backend database (for account creation, authentication etc.) had some problems. They were either throwing database errors (Perl or PHP, apparently, and the errors looked like race conditions or invariant violations caused by page reloads) or the SCGI web application process was taking too long, causing the webserver to return a “500 Internal Error” page. Reloading would (eventually) get the page. All of this behaviour will be very familiar to any web developer who’s messed up their database design to the extent that the queries needed to render a web page take too long. There was no sign of any network or webserver level problems – certainly no obvious symptoms that looked like any sort of DoS – just a database that wasn’t returning data quickly enough. If GFI were doing batch database operations against the production database, as part of trying to fix whatever was going on, that could cause the database to be sluggish; but so could a huge number of people trying to log in (to try and get their false listings removed), or simply poor database design.

I’ve been trying to create an account on Sorbs to request a delisting (from DUHL), but I keep getting an error, with perl error code echoed to screen, after waiting for minutes for the register an account to process.
– another satisfied commenter

On Saturday the situation changed entirely. I was unable to access the www.sorbs.net website at all (though the corporate gfi.com website was fine). That change in “DDoS” behaviour seemed very strange, and the timing (relative to GFI employees noticing that I was using the sorbs website to investigate the false listings) seemed rather convenient, so I looked in a bit more detail.
The SORBS webserver is hosted on five separate IP addresses (which one you end up at is picked semi-randomly). That’s quite a lot – most websites, including the GFI corporate website, are hosted on a single address, and even facebook.com only needs three.
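(For anyone who wants to check this sort of thing themselves, here’s a minimal Python sketch that lists every A record the resolver hands back for a name – the hostname is just the one under discussion, and the results you’d get today will certainly differ.)

```python
import socket

# List every IPv4 address the resolver returns for a hostname - the
# quick way to see how many A records a site publishes. Results for
# www.sorbs.net today will certainly differ from what's described here.
def resolve_all(hostname):
    infos = socket.getaddrinfo(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({sockaddr[0] for *_, sockaddr in infos})

for address in resolve_all("www.sorbs.net"):
    print(address)
```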
None of those five addresses are in the address space allocated to GFI. Rather, three of them are in 111.125.160.128/26 – address space allocated to Matthew Sullivan personally – while the other two are in 208.43.0.0/16 – address space assigned to SoftLayer Technologies, most likely colocated servers or virtual servers being rented by GFI. (Note: this Matthew Sullivan and the Michelle Sullivan who commented on Friday’s post are the same person, the GFI employee who founded SORBS originally and, to the best of my knowledge, still operates it.)
If the web server cluster were under a sustained DDoS I’d expect the five addresses to be attacked in the same way. Yet the symptoms were quite different. The three servers directly controlled by Matthew Sullivan (111.*) were completely unresponsive – no packets were being returned, 100% packet loss. The two SoftLayer hosted servers (208.*) were responsive at a packet level, accepting connections immediately on port 80 but not returning any results at the HTTP level.
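A crude way to see that difference from any network vantage point is a timed TCP probe – something like the following sketch, where the addresses are placeholders rather than the actual servers, and the verdicts are symptoms rather than proof of anything:

```python
import socket

# Rough probe distinguishing the two failure modes described above: a
# host that silently drops packets (the connect times out) versus one
# that accepts the TCP connection but never answers at the HTTP level.
def probe(addr, port=80, timeout=5):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        try:
            s.connect((addr, port))
        except socket.timeout:
            return "no response at the packet level (connect timed out)"
        except OSError as e:
            return f"connection refused or network error: {e}"
        s.sendall(b"GET / HTTP/1.0\r\nHost: www.sorbs.net\r\n\r\n")
        try:
            data = s.recv(4096)
        except socket.timeout:
            return "connected, but no HTTP response before the timeout"
        return "HTTP response received" if data else "connection closed, no data"
    finally:
        s.close()

for ip in ("111.125.160.1", "208.43.0.1"):   # placeholders, not the real servers
    print(ip, "->", probe(ip))
```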
The behaviour of the SoftLayer servers is hard to explain in a DDoS related way, particularly as it was repeatable from several different networks. If all the Apache sessions were occupied but there was still space available in the kernel-level accept queue, that would explain the symptoms – but that would require an implausibly careful DDoS.
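It’s worth noting how easy that symptom is to reproduce deliberately on a test box, though – a server whose application never services its accept queue looks exactly like this from the outside. A sketch (port number arbitrary):

```python
import socket
import time

# The kernel completes TCP handshakes for connections sitting in the
# listen backlog even though the application never calls accept(), so
# clients connect instantly on port 8080 and then hear nothing at all.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 8080))
srv.listen(128)          # handshakes complete; connections queue here

while True:
    time.sleep(3600)     # the application never services the queue
```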
If the SORBS web application were broken in such a way that it hung or crashed even on trivial static page requests, then I’d expect the webserver to time out the app and return a 500 “I’m Broken” error, which it didn’t seem to do. That also wouldn’t explain the different behaviour of the 111.* servers and the 208.* servers.
If the 208.* servers were pure proxy servers that just tunneled all web requests through to the 111.* servers, that would explain what’s seen – a request to a 208.* server is accepted, then forwarded to the 111.* servers, which just hang. This would be an implausibly badly designed network architecture (it adds two servers which do nothing but add bandwidth costs, increase latency and reduce system reliability) but it’s just barely plausible. It would also require that the anonymous DDoSers not bother attacking the 208.* servers – knowing they’re “fake” front-ends that needn’t be attacked to take www.sorbs.net off the air – which would explain why there are no obvious packet level issues connecting to them.
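To be concrete about what that hypothetical architecture would look like, here’s a minimal sketch of such a pass-through proxy – the backend address is a placeholder, and I have no evidence this is what was actually deployed. If the backend silently drops packets, a client’s connection to the proxy is accepted immediately but no HTTP response ever arrives, which matches the 208.* symptoms:

```python
import socket
import threading

BACKEND = ("192.0.2.1", 80)   # placeholder backend, not a real SORBS address

def pump(src, dst):
    # Copy bytes one way until either side closes or errors out.
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass

def handle(client):
    # The client-side accept has already succeeded; if the backend is
    # dropping packets this connect just hangs, and the client sees an
    # open connection that never produces an HTTP response.
    try:
        upstream = socket.create_connection(BACKEND, timeout=300)
    except OSError:
        client.close()
        return
    threading.Thread(target=pump, args=(client, upstream), daemon=True).start()
    pump(upstream, client)
    client.close()
    upstream.close()

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 8080))
srv.listen(16)
while True:
    conn, _ = srv.accept()
    threading.Thread(target=handle, args=(conn,), daemon=True).start()
```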

Shouldn’t any competently-run DNSBL have plans in place for handling DDOS attacks? Why is Spamhaus stable but SORBS down relatively often?
– @delivery_kitty

So what else could explain the differences in behavior between the two sets of servers? The 111.* servers are in network space assigned directly to Matthew Sullivan, and he presumably has full, network level control over them, while the 208.* servers are hosted, so GFI may have a different level of access, perhaps mostly at an application or control panel level.
There is one explanation that accounts for the odd behaviour seen, and also for the “elephant in the room” – that the degraded website behaviour tends to appear soon after there’s been a rash of false data added to the database.

Or – excuse my wild speculation here – but maybe it’s not actually a DDOS attack against SORBS, but mistakes on the part of the operators?
– @delivery_kitty

If I were to fake up the symptoms of a DDoS where I had complete control over the network, I’d pretend that I was being “packeted to death” by gigabits of traffic, and configure my router to drop all inbound packets. That would simulate reasonably accurately the effects of a massive DDoS, and it’s also the defensive approach you’d put in place against a real DDoS. It’s exactly the behaviour I see from the natively hosted SORBS web servers in 111.*.
If I only had access at the web application level (either because I was running on a hosted server, or because I didn’t have enough private control over the server to set up packet filtering without notice), the best I could do would be to make the web application hang, and possibly configure the webserver with an extremely long timeout. That wouldn’t simulate a DDoS particularly well, but it would be good enough to convince anyone who was just using a web browser rather than looking at the lower level traffic. It’s exactly the behaviour I see from the SoftLayer hosted SORBS web servers in 208.*.
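For illustration only – this is a sketch of the technique, not a claim about what’s running on those machines – faking that at the application level takes just a few lines:

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# A toy server demonstrating the trick: requests are read normally,
# but no response is ever written. Through a browser this looks like
# a site flattened by a DDoS; a packet-level look shows a perfectly
# healthy TCP service. Port number is arbitrary.
class HangingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(86400)   # "process" the request for a day

HTTPServer(("0.0.0.0", 8080), HangingHandler).serve_forever()
```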

Every time when SORBS makes just another mistake, you claim DDoS for either the problem or your inability to fix the mistake. And because you claim this every time, nobody believes you any longer.
– Hans

I’d be loath to suggest such a theory – even though it’s the most plausible explanation of the symptoms I’ve seen – if it weren’t for the reputation SORBS has of having “convenient” DDoS attacks to explain false positive listings (which, as we explained earlier, cannot possibly be caused by any sort of DoS attack). Additionally, even though GFI had been claiming to be under a DDoS since early last week, when they loaded millions of false positives into their database, the SORBS webserver had been up and basically functional, if slow. Shortly after a GFI employee – the same employee who has direct control over the 111.* webservers – commented on my Friday blog post explaining that I was looking into data inaccuracy, the “DDoS” symptoms changed to something entirely different, something that prevented me from looking at further SORBS data. Given SORBS’ history with respect to “DDoS attacks” I’m suspicious of both the timing and the details of the symptoms.
Fortunately, I’d already gathered most of the data I needed for tomorrow’s post by Friday, so the “DDoS attack” didn’t really inconvenience me anywhere near as much as it did all the postmasters trying to investigate and resolve SORBS false listings.

the last response I got back … was that the entire /24 block was ‘inelligible’ for de-listing. The parent company sites a DDoS attack as well and says their management team is aware of the issue and working to resolve it ASAP. We’ll see what happens…
– anonymous commenter

GFI would benefit from some transparency about their processes, about SORBS’ day-to-day operations, and about the mistakes they’re making, how they’re fixing them and how they’re ensuring they’re not repeated again and again. Some explicit details about exactly what sort of “DDoS” they’re seeing might also help them gain some credibility. This level of communication isn’t helping with that.
More tomorrow.
