GFI/SORBS considered harmful, part 3

Act 1 | Act 2 | Intermezzo | Act 3 | Act 4 | Act 5
Management Summary, Redistributable Documents and Links
Over the last few days we’ve talked about GFI’s lack of responsiveness, the poor quality of their reputation and blacklist data, and the interesting details of their DDoS claims. Today we’re going to look at some of the fundamental problems with GFI’s procedures and infrastructure that cause those issues. Some of the issues I’ve chosen to highlight are minor, some are major, but together they show a pattern of poor decisions.
SSL Certificates
When you use SSL on a web connection it brings you two benefits. The first is that it encrypts the connection between your browser and the webserver, so that it’s very difficult for anyone to watch or tamper with your interaction with that webserver. The second, more important, benefit is that it makes sure you’re talking to the webserver you think you’re talking to, avoiding man-in-the-middle attacks.
This security relies on you trusting the certification authority that issues the SSL certificate that the website uses. A website providing services to the public should always use an SSL certificate created by one of a small number of reputable certification authorities that are pre-loaded into all web browsers as “trusted”. These SSL certificates need to be purchased, but they’re very inexpensive – less than ten dollars a year.
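To make that concrete, here’s a minimal sketch (not from the original post) of what every TLS client does implicitly: verify the server’s certificate chain against the pre-loaded trusted CAs before exchanging any data. The hostname is an arbitrary example.

```python
import socket
import ssl

# The default context loads the system's pre-installed trusted CAs and
# verifies both the certificate chain and that it matches the hostname.
context = ssl.create_default_context()

with socket.create_connection(("www.example.com", 443)) as sock:
    with context.wrap_socket(sock, server_hostname="www.example.com") as tls:
        print(tls.getpeercert()["subject"])

# Against a server presenting a self-signed certificate, the same
# handshake raises ssl.SSLCertVerificationError instead of connecting.
```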

One more amazing thing is the self-signed SSL certificate used by SORBS! It looks like a poor homeless website! A Verisign certificate can cost 500 bucks for one year, but I’ve got an inexpensive one from Trustico for 16/year and there is no trouble with any of the current browsers and operating systems. It looks like SORBS was running an old 386 server in a garage, not a worldwide operating service. This is not serious at all!
Laurent Marandet

For completely private webservers used by a well-defined group of people – an enterprise web service used solely by company employees, for instance – you might instead have your IT group set up a private (“self-signed”) certification authority and have your employees configure their web browsers to use that. Doing that is a very bad idea for public webservers for three major reasons:

  1. It requires the user to perform a complex setup to load the information into their browser, rather than just visiting the website and being secure.
  2. It provides no real authentication that you’re visiting the site you think you are, rather than some man-in-the-middle imposter. Why? Because you’re downloading the certificate that provides that security from what you think is the same website it’s protecting. You just need to be tricked into visiting the imposter website once, or have your web session hijacked in some way, and the imposter can have you download *their* private certification authority certificate instead – and then you’ll think you’re accessing the real website, protected by SSL, when you’re really accessing the imposter.
  3. Worst of all, if a user can be persuaded to load a private certification authority certificate into their browser then the people who run that private certification authority can create “fake” SSL identities for any website they like.

That last point is very scary when you think about it. If you as an end-user can be tricked into loading a certification authority into your browser then the owner of that private authority can do almost undetectable phishing or man-in-the-middle attacks against you. They can create a fake citibank.com or microsoft.com website, and fake SSL certificates to match, meaning your browser will tell you that you’re securely talking to your bank.
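To illustrate the mechanics (a hedged sketch, not anything SORBS actually publishes): in Python, trusting one additional certification authority is a single call, and from then on that CA can vouch for any hostname at all. The filename is hypothetical.

```python
import ssl

# Start from the system's trusted CAs.
context = ssl.create_default_context()

# What a user is being asked to do, in effect: trust one more CA.
# From this point on, ANY certificate issued by private-ca.pem --
# including one claiming to be citibank.com or microsoft.com --
# will pass this client's verification.
context.load_verify_locations(cafile="private-ca.pem")
```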
SORBS tries to get people to install their private certification authority certificate into their web browsers. There are all sorts of reasons a malicious website might do that, while the only reason a legitimate website would do so is to save the $8.95 a year a real SSL certificate would cost. Given that GFI uses very high-quality, expensive ($488/year), “green bar” SSL certificates on their other web properties, that doesn’t seem likely.
So why is GFI/SORBS trying to get listees to install an untrusted certification authority into their browsers? The only explanation that doesn’t imply malicious intent is a deep ignorance of basic security engineering – which doesn’t inspire confidence in the safety and security of their data. (Maybe there really are hackers breaking into GFI machines and adding random false positive SORBS listings?)
Remote Hands / Remote Access
GFI regularly, and wrongly, add huge swathes of address space to their SORBS blacklists – many millions of addresses, a noticeable fraction of the entire internet. Even when they acknowledge that a listing is bad they’re often not able to fix it for weeks or months, due to “database problems” or “DDoS attacks”.

some people are morons “anyone can blame a mistake on a DDoS” … the problem is the mistake was corrected but the DDoS is preventing the correction from getting to the real world.
GFI statement on failure to resolve false listings

During the periods when the SORBS website is unavailable, the DNS servers that actually publish the GFI/SORBS blacklists seem to keep running with no problem.

SORBs DNS servers were responding to RBL requests during the whole time, albeit somewhat slowly. IT WOULD HAVE BEEN BETTER IF THEY HAD JUST TIMED OUT!!! Then we wouldn’t be here writing about it. But they did not time out, and worse they answered RBL requests with bad replies.
Skyhawk

I’ve seen other blacklists have data problems that have caused serious false positives, and the first thing they’ve done is to remove those false positives from the data they’re publishing, even if they had to do so in a crude, broad-brush manner – in extreme cases I’ve seen them completely empty the published DNS zones until the problem was fixed. GFI isn’t doing this – and they say the reason they’re not fixing the data is that they can’t due to the DDoS or database problem.
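For readers who haven’t looked at the mechanics: a mail server checks a DNSBL by reversing the octets of the connecting IP address and looking up an A record under the list’s zone. Here’s a minimal sketch (the zone name comes from the post; the rest is illustrative). It also shows why Skyhawk’s point matters – a resolver timeout typically fails open at the MTA, while a zone that keeps answering with stale data keeps rejecting mail.

```python
import socket

def dnsbl_listed(ip: str, zone: str = "dul.dnsbl.sorbs.net") -> bool:
    """Check an IPv4 address against a DNSBL the way an MTA does:
    reverse the octets and query for an A record under the zone."""
    query = ".".join(reversed(ip.split("."))) + "." + zone
    try:
        socket.gethostbyname(query)  # any answer at all means "listed"
        return True
    except socket.gaierror:          # NXDOMAIN means "not listed"
        return False

# Example: dnsbl_listed("192.0.2.1") queries 1.2.0.192.dul.dnsbl.sorbs.net
```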
The only way that can be true is if they have no out-of-band or remote-hands access to their name servers – which is not a safe way of running mission-critical internet services. Word to the Wise runs a number of servers at several locations and, while none of them are “mission critical” either to us or to our customers, they are “business critical”, such that their being unavailable for more than a few hours would cause a business problem. Off the top of my head, we can access most of them in any of the following ways, even if our main office network is off the air due to a T1 cut, a power outage or even a DDoS:

  1. Connect to the publicly visible IP address from anywhere there’s network
  2. Ssh into bastion host, then ssh to production server over private maintenance network
  3. Ssh into terminal concentrator, connect to server via serial port
  4. VPN into production location from laptop at a coffee shop, use virtual machine management software to log in to production virtual servers (or reboot them, or even rebuild them from scratch)
  5. VPN into production location from my iPhone, do any of the above
  6. Ssh in to remote power strip and kill the power to one or more of the production servers, either to reboot them or take them off air
  7. Dial in to terminal concentrator, do any of the above
  8. Dial in to remote power strip, reboot or power off machines
  9. ‘Phone staff at the colocation facility and have them power-off or power-cycle machines, or type commands into their keyboards
  10. ‘Phone staff at the NOC, have them drop network connectivity to one of my IP addresses, to disable that server
  11. Modify my DNS configuration, to point services to another machine running elsewhere
  12. Modify my DNS configuration, to take a service off the air
  13. Drive to the colocation facility and do whatever is needed in person (there’s someone trustworthy within less than an hour’s drive of each of our facilities)

If GFI had even one piece of that infrastructure in place, they’d be able to fix problems in the data they were publishing via DNS even if their main location were DDoSed into oblivion. If we trust their assertion that they cannot do that, then they do not have enough basic infrastructure in place to run any sort of public facing internet service, let alone one that’s providing a mission-critical service to external users.
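As one concrete instance of items 11 and 12 in the list above – a sketch under assumptions: a BIND-style name server with dynamic updates enabled, the dnspython library, and made-up key, zone and server names – repointing or removing a service record remotely takes only a handful of lines:

```python
import dns.query
import dns.tsigkeyring
import dns.update

# TSIG key authorizing emergency updates (name and secret are made up).
keyring = dns.tsigkeyring.from_text({"emergency-key": "c2VjcmV0c2VjcmV0"})

update = dns.update.Update("example.com", keyring=keyring)
# Point the service at a standby machine elsewhere (item 11)...
update.replace("dnsbl", 300, "A", "192.0.2.10")
# ...or delete the record entirely to take the service off the air (item 12):
# update.delete("dnsbl", "A")

dns.query.tcp(update, "198.51.100.53")  # the authoritative name server
```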

If you can’t fix your data, you need to turn off your dnsbl name servers. You are currently harming mail administrators around the world in a big way.
Hans

Lack of Safeguards
The mistakes GFI make with the SORBS blacklist vary somewhat from incident to incident, but there’s a lot of repetition. Loading ancient, years-old database dumps into the system, then publishing them without any checks, seems to be one of the common failure modes.

My company is directly affected by this, for the second time in two months.
anonymous commenter

People will often forgive a single mistake, but if you make the same one over and over again they lose all patience. It would be easy enough to put some automated checks into place in the step between a new set of data being committed to the database and that data being published to the production DNS servers.
Even some of the most trivial checks, ones that could be implemented in a couple of hours in perl, would have prevented the (repeated) high-profile dul.dnsbl.sorbs.net errors. GFI clearly haven’t added even those most basic checks, let alone anything more sophisticated such as separate staging areas, backup databases, or sanity checking against internal or external whitelists.
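The post doesn’t spell out what those trivial checks would look like, so here’s a minimal sketch – in Python rather than perl – of the kind of pre-publication gate it’s describing. The thresholds and the whitelist entry are illustrative assumptions, not anything GFI or SORBS documents.

```python
import ipaddress

# Illustrative whitelist: ranges that must never appear in the published
# zone (e.g. major ISP mail servers, the operator's own infrastructure).
WHITELIST = [ipaddress.ip_network("198.51.100.0/24")]

MIN_PREFIXLEN = 16   # refuse to publish anything wider than a /16
MAX_GROWTH = 1.5     # refuse if the zone grew >50% in a single update

def safe_to_publish(new_entries, old_entries):
    """Trivial checks run between 'committed to the database' and
    'published to the production DNS servers'."""
    if len(new_entries) > MAX_GROWTH * max(len(old_entries), 1):
        return False                       # suspiciously large jump
    for cidr in new_entries:
        net = ipaddress.ip_network(cidr)
        if net.prefixlen < MIN_PREFIXLEN:
            return False                   # a /10 should never go out
        if any(net.overlaps(w) for w in WHITELIST):
            return False                   # hits a known-good range
    return True
```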
That either means that GFI have chosen not to invest any development resources into data and system integrity – or it means they have, but the systems engineer responsible is not competent to deliver it. Either way, it casts serious doubt on their ability to maintain any level of data integrity, either in their external DNSBL blacklists or in their internal reputation system (which is what they bought the SORBS assets to implement).
Naivete about how Internet email works
Poor policy is just as much of a problem as implementation mistakes.
I’ve concentrated mostly on the “dul” (dynamically assigned) blacklist, as it’s easier to demonstrate its failures. But there are similar problems in other zones.
If you’re running a mailing list, one recommended practice for signing up new users is confirmed opt-in. This is about as solid a best practice for mailing list signups as you can get, and is evangelized by legitimate anti-spam organizations such as Spamhaus and MAAWG as one of the best ways to run a mailing list.

This is the standard Best Practice for all responsible Internet mailing firms. COI ensures users are properly subscribed, from a working address, and with the address owner’s consent. […] This simple protection means that the Bulk Email sender can not be legitimately listed on any ‘spam’ blocklist
Spamhaus on COI

Some years back Laura was working with a mailing list operator who’d been listed on the SORBS spam blacklist. That seemed strange, because they ran a 100% closed-loop opt-in mailing list, and so there was no way they should have been listed on any ‘spam’ blocklist. On investigation it turned out that a user, something@cox.net, had tried to sign up for the mailing list, but had typoed their email address as something@cix.net – just one letter off on the keyboard. The mailing list manager sent the opt-in confirmation request to something@cix.net, nobody responded to it, so something@cix.net wasn’t subscribed to the list or sent any mail from the list. A couple of hours later the user noticed, signed up again with their correct something@cox.net address, and was subscribed successfully. This is all exactly how closed-loop opt-in is supposed to work.
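For anyone unfamiliar with the mechanics, the flow just described reduces to a two-step state machine. Here’s a hedged sketch, with all names illustrative:

```python
import secrets

pending = {}         # confirmation token -> address awaiting confirmation
subscribers = set()  # confirmed addresses only

def signup(address: str) -> str:
    """Step 1: record the request and send a confirmation mail.
    Nothing is subscribed yet -- a typoed address like something@cix.net
    receives exactly one confirmation request and then nothing more."""
    token = secrets.token_urlsafe(16)
    pending[token] = address
    return token     # in real life: emailed to the address as a link

def confirm(token: str) -> bool:
    """Step 2: only an address whose owner acts on the link is added."""
    address = pending.pop(token, None)
    if address is None:
        return False
    subscribers.add(address)
    return True
```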
However, the owner of cix.net had given control over the domain to SORBS for use as a “spam-trap” domain. And SORBS used a single confirmed opt-in request to an address in that domain to blacklist the mailing list operator. That’s a very poorly thought-out way of using a spam-trap domain, and leads to serious data integrity and false-positive problems with any blacklist based on such a naive spamtrap-driven approach.
That was a few years ago, but there’s more recent behaviour that’s pretty much the same.

I’ve got some really nice data showing how SORBS lists you for hitting a seed address of theirs ONCE 24 hours after they purchased the domain which expired. The user who previously owned the address in question COI (confirmed opt-in) into a list 8 months prior and had shown opens and click throughs within a couple months of the domain transfer. This is just an example of how one of their other zones uses terrible data as the basis for a listing.
mailing list operator

I’ve seen a similar thing. In November 2009 someone signed up for a mailing list – full closed-loop opt-in, everything. Mail was sent out to the mailing list regularly, and we know the recipient was reading the mail – they were clicking on links in the messages, interacting with the mail, all the sorts of things a typical happy recipient will do. Judging from the website recorded at archive.org it was just a personal domain owned by the recipient, though, and at some point they let the domain registration lapse.
Two days after the domain registration lapsed, another email was sent out to the mailing list. By then GFI had acquired access to the domain, and were using it as a spamtrap domain. The mailing list was blacklisted on the GFI/SORBS spam list due to that single, non-spam email. Even the most aggressive blacklist experts agree that you need to leave any domain that’s been previously used for legitimate email in a state where it bounces email for at least six months to a year before you can really pull much useful data out of deliveries to it.
Industry best practice is not to remove an address from a mailing list until it’s bounced at least three times, over at least a 15-day period. There’s no way at all that any mailing list (or ISP smarthost, or any other source of email) could avoid being listed by the SORBS spam list in this way.
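The three-bounces-over-fifteen-days thresholds come straight from the practice described above; the rest of this sketch is illustrative. It makes the timing problem obvious: a well-run list following this rule cannot possibly stop mailing an address within two days of a silent domain transfer.

```python
from datetime import datetime, timedelta

MIN_BOUNCES = 3                  # bounces required before removal
MIN_SPAN = timedelta(days=15)    # spread over at least this long

bounce_log = {}                  # address -> list of bounce timestamps

def record_bounce(address: str, when: datetime) -> bool:
    """Record a bounce; return True once the address qualifies for
    removal under the three-bounces-over-15-days rule."""
    times = bounce_log.setdefault(address, [])
    times.append(when)
    return (len(times) >= MIN_BOUNCES
            and max(times) - min(times) >= MIN_SPAN)
```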
Given the way the data is acquired, the quality of data in the GFI/SORBS blacklist is likely to remain poor, and to be full of false-positive listings of this type.
Database Design, Audit Trails and Broken Data
I pulled a copy of the full record for the false dul.dnsbl.sorbs.net listing of 67.194.0.0/15 on Thursday. I also have the equivalent data for a (valid) listing in the SpamHaus PBL of a Comcast cable modem pool to compare.
Spamhaus PBL listing of 98.224.0.0/12 – this is a good example of the data that needs to be tracked. It shows that the /12 is listed because Comcast, the owners of that address range, don’t want their users to send email from there directly.
Compare that with the GFI/SORBS listing of 67.194.0.0/15 (a twenty-page PDF file saved from the GFI/SORBS database lookup page).
The GFI/SORBS data is… I don’t really have words for it. I hate to think how badly designed a database must be to even contain data of this structure.
For those who’ve not looked at the PDF, it contains entries for six address ranges – 67.192.0.0/10, 67.192.0.0/11, 67.192.0.0/12, 67.192.0.0/13, 67.192.0.0/14 and 67.194.0.0/15. The first five of those are listed as “Unknown Status”.
The last, 67.194.0.0/15, is described as “101803411-16-IP/Netblock delisted” – suggesting that the address range has been delisted. But it’s also described as “Currently active and flagged to be published in DNS”, meaning that it’s still being published in the blacklist.
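To put the scale of those records in perspective (a quick illustrative check using Python’s standard ipaddress module): the six blocks nest inside one another, and the widest covers over four million addresses.

```python
import ipaddress

ranges = ["67.192.0.0/10", "67.192.0.0/11", "67.192.0.0/12",
          "67.192.0.0/13", "67.192.0.0/14", "67.194.0.0/15"]
nets = [ipaddress.ip_network(r) for r in ranges]

for net in nets:
    print(net, f"{net.num_addresses:,} addresses")  # /10 -> 4,194,304

# Every block in the dump is contained within the first, widest one.
assert all(net.subnet_of(nets[0]) for net in nets)
```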
There’s also an audit trail, of sorts, showing what’s been done to these records. There are something over 220 entries, all identical apart from the timestamp:
“Delisted by 10 – Comment made by [10] Michelle Sullivan at <timestamp>”
It appears Michelle has removed this data from the dul.dnsbl.sorbs.net database over 200 times, and yet it’s still “active and flagged to be published in DNS”.
I’d like to grow eloquent on all the horrible database design and data integrity implications this data dump has, but I really don’t know where to start. So I’ll stick with saying that if you intend to use any reputation data from GFI/SORBS for any reason, you should take a look at the PDF, show it to a friendly DBA, and between you try to work out what sort of system and database setup would lead to that sort of data.
More tomorrow.
