GFI/SORBS considered harmful, part 3

Act 1 | Act 2 | Intermezzo | Act 3 | Act 4 | Act 5
Management Summary, Redistributable Documents and Links
Over the last few days we’ve talked about GFI’s lack of responsiveness, the poor quality of their reputation and blacklist data, and the interesting details of their DDoS claims. Today we’re going to look at some of the fundamental problems with GFI’s procedures and infrastructure that cause those issues. Some of the issues I’ve chosen to highlight are minor, some are major, but together they show a pattern of poor decisions.
SSL Certificates
When you use SSL on a web connection it brings you two benefits. The first is that it encrypts the connection between your browser and the webserver, so that it’s very difficult for anyone to watch or tamper with your interaction with that webserver. The second, more important, benefit is that it makes sure you’re talking to the webserver you think you’re talking to, avoiding man-in-the-middle attacks.
This security relies on you trusting the certification authority that issues the SSL certificate that the website uses. A website providing services to the public should always use an SSL certificate created by one of a small number of reputable certification authorities that are pre-loaded into all web browsers as “trusted”. These SSL certificates need to be purchased, but they’re very inexpensive – less than ten dollars a year.
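To make that concrete, here’s a minimal sketch (not from the original post) of what every TLS client does implicitly: verify the server’s certificate chain against the pre-loaded trusted CAs before exchanging any data. The hostname is an arbitrary example.

```python
import socket
import ssl

# The default context loads the system's pre-installed trusted CAs and
# verifies both the certificate chain and that it matches the hostname.
context = ssl.create_default_context()

with socket.create_connection(("www.example.com", 443)) as sock:
    with context.wrap_socket(sock, server_hostname="www.example.com") as tls:
        print(tls.getpeercert()["subject"])

# Against a server presenting a self-signed certificate, the same
# handshake raises ssl.SSLCertVerificationError instead of connecting.
```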

One more amazing thing is the self-signed SSL certificate used by SORBS! It looks like a poor homeless website! A Verisign certificate can cost 500 bucks for one year, but I’ve got an inexpensive one from Trustico for 16/year and there is no trouble with any of the current browsers and operating systems. It looks like SORBS was running an old 386 server in a garage, not a worldwide operating service. This is not serious at all!
Laurent Marandet

For completely private webservers used by a well-defined group of people – an enterprise web service used solely by company employees, for instance – you might instead have your IT group set up a private (“self-signed”) certification authority and have your employees configure their web browsers to use that. Doing that is a very bad idea for public webservers for three major reasons:

  1. It requires the user to perform a complex setup to load the information into their browser, rather than just visiting the website and being secure.
  2. It provides no real authentication that you’re visiting the site you think you are, rather than some man-in-the-middle imposter. Why? Because you’re downloading the certificate that provides that security from what you think is the same website it’s protecting. You just need to be tricked into visiting the imposter website once, or have your web session hijacked in some way, and the imposter can have you download *their* private certification authority certificate instead – and then you’ll think you’re accessing the real website, protected by SSL, when you’re really accessing the imposter.
  3. Worst of all, if a user can be persuaded to load a private certification authority certificate into their browser then the people who run that private certification authority can create “fake” SSL identities for any website they like.

That last point is very scary when you think about it. If you as an end-user can be tricked into loading a certification authority into your browser then the owner of that private authority can do almost undetectable phishing or man-in-the-middle attacks against you. They can create a fake citibank.com or microsoft.com website, and fake SSL certificates to match, meaning your browser will tell you that you’re securely talking to your bank.
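To illustrate the mechanics (a hedged sketch, not anything SORBS actually publishes): in Python, trusting one additional certification authority is a single call, and from then on that CA can vouch for any hostname at all. The filename is hypothetical.

```python
import ssl

# Start from the system's trusted CAs.
context = ssl.create_default_context()

# What a user is being asked to do, in effect: trust one more CA.
# From this point on, ANY certificate issued by private-ca.pem --
# including one claiming to be citibank.com or microsoft.com --
# will pass this client's verification.
context.load_verify_locations(cafile="private-ca.pem")
```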
SORBS tries to get people to install their private certification authority certificate into their web browsers. There are all sorts of reasons a malicious website might do that, while the only reason a legitimate website would do so is to save the $8.95 a year a real SSL certificate would cost. Given that GFI uses very high-quality, expensive ($488/year), “green bar” SSL certificates on their other web properties, that doesn’t seem likely.
So why is GFI/SORBS trying to get listees to install an untrusted certification authority into their browsers? The only explanation that doesn’t imply malicious intent is a deep ignorance of basic security engineering – which doesn’t inspire confidence in the safety and security of their data. (Maybe there really are hackers breaking into GFI machines and adding random false positive SORBS listings?)
Remote Hands / Remote Access
GFI regularly, and wrongly, add huge swathes of address space to their SORBS blacklists – many millions of addresses, a noticeable fraction of the entire internet. Even when they acknowledge that a listing is bad they’re often not able to fix it for weeks or months, due to “database problems” or “DDoS attacks”.

some people are morons “anyone can blame a mistake on a DDoS” … the problem is the mistake was corrected but the DDoS is preventing the correction from getting to the real world.
GFI statement on failure to resolve false listings

During the periods when the SORBS website is unavailable, the DNS servers that actually publish the GFI/SORBS blacklists seem to keep running with no problem.

SORBs DNS servers were responding to RBL requests during the whole time, albeit somewhat slowly. IT WOULD HAVE BEEN BETTER IF THEY HAD JUST TIMED OUT!!! Then we wouldn’t be here writing about it. But they did not time out, and worse they answered RBL requests with bad replies.
Skyhawk

I’ve seen other blacklists have data problems that have caused serious false positives, and the first thing they’ve done is to remove those false positives from the data they’re publishing, even if they had to do so in a crude, broad-brush manner – in extreme cases I’ve seen them completely empty the published DNS zones until the problem was fixed. GFI isn’t doing this – and they say the reason they’re not fixing the data is that they can’t due to the DDoS or database problem.
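For readers who haven’t looked at the mechanics: a mail server checks a DNSBL by reversing the octets of the connecting IP address and looking up an A record under the list’s zone. Here’s a minimal sketch (the zone name comes from the post; the rest is illustrative). It also shows why Skyhawk’s point matters – a resolver timeout typically fails open at the MTA, while a zone that keeps answering with stale data keeps rejecting mail.

```python
import socket

def dnsbl_listed(ip: str, zone: str = "dul.dnsbl.sorbs.net") -> bool:
    """Check an IPv4 address against a DNSBL the way an MTA does:
    reverse the octets and query for an A record under the zone."""
    query = ".".join(reversed(ip.split("."))) + "." + zone
    try:
        socket.gethostbyname(query)  # any answer at all means "listed"
        return True
    except socket.gaierror:          # NXDOMAIN means "not listed"
        return False

# Example: dnsbl_listed("192.0.2.1") queries 1.2.0.192.dul.dnsbl.sorbs.net
```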
The only way that can be true is if they have no out-of-band or remote-hands access to their name servers – which is not a safe way of running mission-critical internet services. Word to the Wise runs a number of servers at several locations and, while none of them are “mission critical” either to us or to our customers, they are “business critical”, such that their being unavailable for more than a few hours would cause a business problem. Off the top of my head, we can access most of them in any of the following ways, even if our main office network is off the air due to a T1 cut, a power outage or even a DDoS:

  1. Connect to the publicly visible IP address from anywhere there’s network
  2. Ssh into bastion host, then ssh to production server over private maintenance network
  3. Ssh into terminal concentrator, connect to server via serial port
  4. VPN into production location from laptop at a coffee shop, use virtual machine management software to log in to production virtual servers (or reboot them, or even rebuild them from scratch)
  5. VPN into production location from my iPhone, do any of the above
  6. Ssh in to remote power strip and kill the power to one or more of the production servers, either to reboot them or take them off air
  7. Dial in to terminal concentrator, do any of the above
  8. Dial in to remote power strip, reboot or power off machines
  9. ‘Phone staff at the colocation facility and have them power-off or power-cycle machines, or type commands into their keyboards
  10. ‘Phone staff at the NOC, have them drop network connectivity to one of my IP addresses, to disable that server
  11. Modify my DNS configuration, to point services to another machine running elsewhere
  12. Modify my DNS configuration, to take a service off the air
  13. Drive to the colocation facility and do whatever is needed in person (there’s someone trustworthy within less than an hour’s drive of each of our facilities)

If GFI had even one piece of that infrastructure in place, they’d be able to fix problems in the data they were publishing via DNS even if their main location were DDoSed into oblivion. If we trust their assertion that they cannot do that, then they do not have enough basic infrastructure in place to run any sort of public facing internet service, let alone one that’s providing a mission-critical service to external users.
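As one concrete instance of items 11 and 12 in the list above – a sketch under assumptions: a BIND-style name server with dynamic updates enabled, the dnspython library, and made-up key, zone and server names – repointing or removing a service record remotely takes only a handful of lines:

```python
import dns.query
import dns.tsigkeyring
import dns.update

# TSIG key authorizing emergency updates (name and secret are made up).
keyring = dns.tsigkeyring.from_text({"emergency-key": "c2VjcmV0c2VjcmV0"})

update = dns.update.Update("example.com", keyring=keyring)
# Point the service at a standby machine elsewhere (item 11)...
update.replace("dnsbl", 300, "A", "192.0.2.10")
# ...or delete the record entirely to take the service off the air (item 12):
# update.delete("dnsbl", "A")

dns.query.tcp(update, "198.51.100.53")  # the authoritative name server
```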

If you can’t fix your data, you need to turn off your dnsbl name servers. You are currently harming mail administrators around the world in a big way.
Hans

Lack of Safeguards
The mistakes GFI make with the SORBS blacklist vary somewhat from incident to incident, but there’s a lot of repetition. Loading ancient, years-old database dumps into the system, then publishing them without any checks, seems to be one of the common failure modes.

My company is directly affected by this, for the second time in two months.
anonymous commenter

People will often forgive a single mistake, but if you make the same one over and over again they lose all patience. It would be easy enough to put some automated checks into place in the step between a new set of data being committed to the database and that data being published to the production DNS servers.
Even some of the most trivial checks, ones that could be implemented in a couple of hours in perl, would have prevented the (repeated) high-profile dul.dnsbl.sorbs.net errors. GFI clearly haven’t added even those most basic checks, let alone anything more sophisticated such as separate staging areas, backup databases, or sanity checking against internal or external whitelists.
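The post doesn’t spell out what those trivial checks would look like, so here’s a minimal sketch – in Python rather than perl – of the kind of pre-publication gate it’s describing. The thresholds and the whitelist entry are illustrative assumptions, not anything GFI or SORBS documents.

```python
import ipaddress

# Illustrative whitelist: ranges that must never appear in the published
# zone (e.g. major ISP mail servers, the operator's own infrastructure).
WHITELIST = [ipaddress.ip_network("198.51.100.0/24")]

MIN_PREFIXLEN = 16   # refuse to publish anything wider than a /16
MAX_GROWTH = 1.5     # refuse if the zone grew >50% in a single update

def safe_to_publish(new_entries, old_entries):
    """Trivial checks run between 'committed to the database' and
    'published to the production DNS servers'."""
    if len(new_entries) > MAX_GROWTH * max(len(old_entries), 1):
        return False                       # suspiciously large jump
    for cidr in new_entries:
        net = ipaddress.ip_network(cidr)
        if net.prefixlen < MIN_PREFIXLEN:
            return False                   # a /10 should never go out
        if any(net.overlaps(w) for w in WHITELIST):
            return False                   # hits a known-good range
    return True
```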
That either means that GFI have chosen not to invest any development resources into data and system integrity – or it means they have, but the systems engineer responsible is not competent to deliver it. Either way, it casts serious doubt on their ability to maintain any level of data integrity, either in their external DNSBL blacklists or in their internal reputation system (which is what they bought the SORBS assets to implement).
Naivete about how Internet email works
Poor policy is just as much of a problem as implementation mistakes.
I’ve concentrated mostly on the “dul” (dynamically assigned) blacklist, as it’s easier to demonstrate its failures. But there are similar problems in other zones.
If you’re running a mailing list, one recommended practice for signing up new users is confirmed opt-in. This is about as solid a best practice for mailing list signups as you can get, and is evangelized by legitimate anti-spam organizations such as Spamhaus and MAAWG as one of the best ways to run a mailing list.

This is the standard Best Practice for all responsible Internet mailing firms. COI ensures users are properly subscribed, from a working address, and with the address owner’s consent. […] This simple protection means that the Bulk Email sender can not be legitimately listed on any ‘spam’ blocklist
Spamhaus on COI

Some years back Laura was working with a mailing list operator who’d been listed on the SORBS spam blacklist. That seemed strange, because they ran a 100% closed-loop opt-in mailing list, and so there was no way they should have been listed on any ‘spam’ blocklist. On investigation it turned out that a user, something@cox.net, had tried to sign up for the mailing list, but had typoed their email address as something@cix.net – just one letter off on the keyboard. The mailing list manager sent the opt-in confirmation request to something@cix.net, nobody responded to it, so something@cix.net wasn’t subscribed to the list or sent any mail from the list. A couple of hours later the user noticed, signed up again with their correct something@cox.net address, and was subscribed successfully. This is all exactly how closed-loop opt-in is supposed to work.
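For anyone unfamiliar with the mechanics, the flow just described reduces to a two-step state machine. Here’s a hedged sketch, with all names illustrative:

```python
import secrets

pending = {}         # confirmation token -> address awaiting confirmation
subscribers = set()  # confirmed addresses only

def signup(address: str) -> str:
    """Step 1: record the request and send a confirmation mail.
    Nothing is subscribed yet -- a typoed address like something@cix.net
    receives exactly one confirmation request and then nothing more."""
    token = secrets.token_urlsafe(16)
    pending[token] = address
    return token     # in real life: emailed to the address as a link

def confirm(token: str) -> bool:
    """Step 2: only an address whose owner acts on the link is added."""
    address = pending.pop(token, None)
    if address is None:
        return False
    subscribers.add(address)
    return True
```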
However, the owner of cix.net had given control over the domain to SORBS for use as a “spam-trap” domain. And SORBS used a single confirmed opt-in request to an address in that domain to blacklist the mailing list operator. That’s a very poorly thought-out way of using a spam-trap domain, and leads to serious data integrity and false-positive problems with any blacklist based on such a naive spamtrap-driven approach.
That was a few years ago, but there’s more recent behaviour that’s pretty much the same.

I’ve got some really nice data showing how SORBS lists you for hitting a seed address of theirs ONCE 24 hours after they purchased the domain which expired. The user who previously owned the address in question COI (confirmed opt-in) into a list 8 months prior and had shown opens and click throughs within a couple months of the domain transfer. This is just an example of how one of their other zones uses terrible data as the basis for a listing.
mailing list operator

I’ve seen a similar thing. In November 2009 someone signed up for a mailing list – full closed-loop opt-in, everything. Mail was sent out to the mailing list regularly, and we know the recipient was reading the mail – they were clicking on links in the messages, interacting with the mail, all the sorts of things a typical happy recipient will do. Judging from the website recorded at archive.org it was just a personal domain owned by the recipient, though, and at some point they let the domain registration lapse.
Two days after the domain registration lapsed, another email was sent out to the mailing list. By then GFI had acquired access to the domain, and were using it as a spamtrap domain. The mailing list was blacklisted on the GFI/SORBS spam list due to that single, non-spam email. Even the most aggressive blacklist experts agree that you need to leave any domain that’s been previously used for legitimate email in a state where it bounces email for at least six months to a year before you can really pull much useful data out of deliveries to it.
Industry best practice is not to remove an address from a mailing list until it’s bounced at least three times, over at least a 15-day period. There’s no way at all that any mailing list (or ISP smarthost, or any other source of email) could avoid being listed by the SORBS spam list in this way.
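The three-bounces-over-fifteen-days thresholds come straight from the practice described above; the rest of this sketch is illustrative. It makes the timing problem obvious: a well-run list following this rule cannot possibly stop mailing an address within two days of a silent domain transfer.

```python
from datetime import datetime, timedelta

MIN_BOUNCES = 3                  # bounces required before removal
MIN_SPAN = timedelta(days=15)    # spread over at least this long

bounce_log = {}                  # address -> list of bounce timestamps

def record_bounce(address: str, when: datetime) -> bool:
    """Record a bounce; return True once the address qualifies for
    removal under the three-bounces-over-15-days rule."""
    times = bounce_log.setdefault(address, [])
    times.append(when)
    return (len(times) >= MIN_BOUNCES
            and max(times) - min(times) >= MIN_SPAN)
```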
Given the way the data is acquired, the quality of data in the GFI/SORBS blacklist is likely to remain poor, and to be full of false-positive listings of this type.
Database Design, Audit Trails and Broken Data
I pulled a copy of the full record for the false dul.dnsbl.sorbs.net listing of 67.194.0.0/15 on Thursday. I also have the equivalent data for a (valid) listing in the SpamHaus PBL of a Comcast cable modem pool to compare.
Spamhaus PBL listing of 98.224.0.0/12 – this is a good example of the data that needs to be tracked. It shows that the /12 is listed because Comcast, the owners of that address range, don’t want their users to send email from there directly.
Compare that with the GFI/SORBS listing of 67.194.0.0/15 (a twenty-page PDF file saved from the GFI/SORBS database lookup page).
The GFI/SORBS data is… I don’t really have words for it. I hate to think how badly designed a database must be to even contain data of this structure.
For those who’ve not looked at the PDF, it contains entries for six address ranges – 67.192.0.0/10, 67.192.0.0/11, 67.192.0.0/12, 67.192.0.0/13, 67.192.0.0/14 and 67.194.0.0/15. The first five of those are listed as “Unknown Status”.
The last, 67.194.0.0/15, is described as “101803411-16-IP/Netblock delisted” – suggesting that the address range has been delisted. But it’s also described as “Currently active and flagged to be published in DNS”, meaning that it’s still being published in the blacklist.
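To put the scale of those records in perspective (a quick illustrative check using Python’s standard ipaddress module): the six blocks nest inside one another, and the widest covers over four million addresses.

```python
import ipaddress

ranges = ["67.192.0.0/10", "67.192.0.0/11", "67.192.0.0/12",
          "67.192.0.0/13", "67.192.0.0/14", "67.194.0.0/15"]
nets = [ipaddress.ip_network(r) for r in ranges]

for net in nets:
    print(net, f"{net.num_addresses:,} addresses")  # /10 -> 4,194,304

# Every block in the dump is contained within the first, widest one.
assert all(net.subnet_of(nets[0]) for net in nets)
```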
There’s also an audit trail, of sorts, showing what’s been done to these records. There are something over 220 entries, all identical apart from the timestamp:
“Delisted by 10 – Comment made by [10] Michelle Sullivan at <timestamp>”
It appears Michelle has removed this data from the dul.dnsbl.sorbs.net database over 200 times, and yet it’s still “active and flagged to be published in DNS”.
I’d like to grow eloquent on all the horrible database design and data integrity implications this data dump has, but I really don’t know where to start. So I’ll stick with saying that if you intend to use any reputation data from GFI/SORBS for any reason, you should take a look at the PDF, show it to a friendly DBA, and between you try to work out what sort of system and database setup would lead to that sort of data.
More tomorrow.
