Network

Network glitches and corrupted VMs

I had a bit of an interesting Friday. I was so glad it was finally the weekend. Saturday we did a bunch of errands, including going to visit our servers. See, we’ve been upgrading infrastructure to implement a second type of backup system. Saturday we were doing the last set of upgrades so we could install over the weekend.
Yes, we do all our own networking and racking.
Saturday evening Steve is installing the new backup software. This is awesome backup software. It backs up the entire virtual machine. If we lose a virtual machine, we can just reload the entire thing and it will be back again.
Except while installing the software, there is a weird network glitch. Said network glitch causes the system to crash. The system crashes hard. The system crash corrupts some of the data on disk. The data on disk is our virtual machine files. Files are in read-only mode and won’t fsck automatically.
We lose most of our production virtual machines. We’re off the air.
IronyBlog
Possibly this was tragic, not ironic. I dunno, it’s been a long weekend.
We lost a bunch of production virtual machines to the disk corruption. We haven’t lost any data, but it’s taking some time to rebuild the machines, pull data from the other backup system and get it installed.
That means some of our websites and services, like tools.wordtothewise.com, are down. It may mean you saw some bounces if you sent us mail over the weekend. Mail is back and we are communicating with the outside world again.
Steve’s working through our other services as fast as possible to get them back up and running.
(If massive server issues weren’t enough, one of the cats got a UTI so we’re having to pill her twice a day. Then last night she managed to puke so hard she passed out briefly. Poor thing. She’s doing better this morning.)


Your system; your rules

In the late 90s I was reasonably active in the anti-spam community and in trying to protect mailboxes. There were a couple of catchphrases that developed as a bit of shorthand for discussions. One of them was “my server, my rules.” The underlying idea was that someone owned the different systems on the internet, and as owners of those systems they had the right to make usage rules for them. These rules can be about what system users can do (AUPs and terms of service) or about what other people can do (web surfers or email senders).
I think this is still a decent guiding principle in “my network, my rules”. I do believe that network owners can choose what traffic and behavior they will allow on their network. But these days it’s a little different than it was when my dialup was actually a PPP shell account and seeing a URL on a television ad was a major surprise.
But ISPs are not what they once were. They are publicly owned, global companies with billion-dollar market caps. The internet isn’t just the playground of college students and researchers; just about anyone in the US can get online. Even if they don’t own a computer, there is public internet access in many areas. Some of us have access to the internet in our pockets.
They still own the systems. They still make the rules. But the rules have to balance different constituencies, including users and stockholders. Budgets are bigger, but there’s still a limited amount of money to go around. Decisions have to be made. These decisions translate into what traffic the ISP allows on the network. Those decisions are implemented by the employees. Sometimes they screw up. Sometimes they overstep. Sometimes they do the wrong thing. Implementation is hard, and it’s one of the things I really push with my clients: make sure processes do what you think they do.
This is a long way of dancing around the idea that individual people can make policy decisions we disagree with on their networks, and third parties have no say in them. But those policy decisions need to be made in accordance with internal policies and processes. People can’t just randomly block things without consequences if they violate policies or block things that shouldn’t be blocked.
Ironically, today one of the major telcos managed to accidentally splash their 8xx number database. 8xx numbers are out all over the country while they search for backups to restore the database. This is business critical for thousands of companies, and is probably costing companies money right and left. Accidents can result in bigger problems than malice.
 
