Network glitches and corrupted VMs
I had a bit of an interesting Friday. I was so glad it was finally the weekend. Saturday we did a bunch of errands, including going to visit our servers. See, we’ve been upgrading infrastructure to implement a second type of backup system. Saturday we were doing the last set of upgrades so we could install over the weekend.
Yes, we do all our own networking and racking.
Saturday evening Steve is installing the new backup software. This is awesome backup software. It backs up the entire virtual machine. If we lose a virtual machine, we can just reload the entire thing and it will be back again.
Except while installing the software, there is a weird network glitch. Said network glitch causes the system to crash. The system crashes hard. The system crash corrupts some of the data on disk. The data on disk is our virtual machine files. Files are in read-only mode and won’t fsck automatically.
We lose most of our production virtual machines. We’re off the air.
Possibly this was tragic, not ironic. I dunno, it’s been a long weekend.
We lost a bunch of production virtual machines to the disk corruption. We haven’t lost any data, but it’s taking some time to rebuild the machines, pull data from the other backup system, and get it installed.
That means some of our websites and services, like tools.wordtothewise.com, are down. It may also mean you saw some bounces if you sent us mail over the weekend. Mail is back and we are communicating with the outside world again.
Steve’s working through our other services as fast as possible to get them back up and running.
(If massive server issues weren’t enough, one of the cats got a UTI so we’re having to pill her twice a day. Then last night she managed to puke so hard she passed out briefly. Poor thing. She’s doing better this morning.)