Not Quite Prepared Enough?

You've probably heard that old expression about what happens when you assume—it makes an ass out of u and me. We had made some basic assumptions that didn't pan out well for us:

  • Since all of our data was backed up on a RAID server, we assumed that it was safe and inviolable. [1] If something happened to one disk drive, the other, redundant drive would take over.
  • We assumed that if something happened to a power supply or logic board, we could simply pop out the redundant disk and drop it into a spare computer to resurrect the data.
  • We assumed that surge protectors do exactly what their name implies.
  • We underestimated the cost of resurrecting even relatively small quantities of data in a worst-case scenario.
  • We underestimated how long data-resurrection takes, and the impact it has on even a relatively small office.

For us, the disaster began unfolding at about 8 p.m. on Saturday, when the power came back on. Until then, we were warm and happy on backup power. When we went back to commercial power after about 28 hours on the generator, the power company (as is often the case) was still attempting to restore steady power. The power flickered on and off two or three more times on Saturday night.

During one of those times, there was a surge or spike in the commercial power.

Here's what happened in each vital service area of the company:

  • Our phones stayed up. In case phone service failed, we had subscribed to the Telecom Recovery voice-recovery system, a versatile and affordable service. If the phone company or our telephone system had failed, our calls would have been redirected instantly to our cell phones. So we were well prepared on voice communications.
  • Our Internet access stayed up. We use two separate wireless Internet service providers (WISPs) and a dual wide-area network (WAN) router. One of the two Internet feeds did fail, since the tower that provided it was affected by the same power outage. However, the other, diverse link hummed right along, and we didn't even notice.
  • Most of the workstations in the office did fine, since they were on UPS for the brief period after the power failed and before we started the generator. (Starting the generator is a manual process, but you can't have everything in a small office!)

So what's the problem? We didn't realize that, when the power came back on, the spike killed the RAID server—zapping four years of our data!

Go ahead and lecture us now on the importance of backup copies. But golly, isn't that why you buy a RAID server in the first place? The RAID server by its nature is a backup copy. Or so we assumed. (Remember what I said about what happens when you assume?)
