Why Oh Why?
(Header image by Rita Kravchuk, Creative Commons)
I just attempted to log into the digital photography review website forums at dpreview.com.
I was greeted with the following error:
Why does this happen? Of the 5 RAID failures I've had or witnessed in my career, 3 involved more than one drive failing at once, which made the whole RAID setup useless. You spend $800 - $1500 on a RAID controller card, and still more than one drive dies.
In the case of multiple simultaneous hard drive failures, I believe it boils down to one of two things:
1) The RAID controller itself sucks, and there really aren't two bad hard drives. I had this happen on a Promise card once. After losing all my data, I was able to determine that one of the 4 drives really was bad, but the other drive the controller reported as bad tested just fine. The very next thing I did was throw that RAID controller in the garbage.
2) The power supply in the computer is flaky, and spiked the drives to death. This is a likely culprit. An old or cheap power supply can't be trusted with your data, and few people really think about how much damage a power supply can cause. If you think about it, a bad power supply could potentially damage any or all of the hardware in the computer. I've taken to spending NO LESS than $85 on a power supply, even for a desktop system, if there's going to be any critical information on that machine. The only time I'm lax on that is when I build web thin clients.
I believe that many people are using RAID-5 instead of regular backups, and that is a huge mistake. I'm currently running RAID-5 on one server, but I not only have an up-to-the-minute snapshot of that server's data in a pre-processed form elsewhere, I also have a nightly full backup of the processed data from that server. If it goes down, at worst I lose a day's worth of work, plus another day to rebuild and reinstall, during which time I won't lose additional work because other staff could be put on other tasks.
It's important to choose the right solution for your data storage needs, and not just say 'Oh, RAID-5 for safety.' On a web server, perhaps a striping approach with no redundancy and a 4-hour database backup cycle is what's needed to improve speed. For a file server, perhaps RAID-5 plus a daily backup is what's needed.
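One rough way to compare setups like these is to write down, for each one, how much work you could lose and how long you'd be down if the whole array died right before the next backup. Here's a minimal Python sketch of that arithmetic. The `StoragePlan` class and its field names are made up for illustration, and the restore times are assumptions: the 24 hours for the file server echoes my "another day to rebuild" estimate, while the 8 hours for the web server is just a placeholder, not a measured figure.

```python
from dataclasses import dataclass


@dataclass
class StoragePlan:
    """Hypothetical description of one server's storage and backup setup."""
    name: str
    backup_interval_hours: float  # time between backups
    restore_hours: float          # assumed time to rebuild and restore

    def worst_case_loss_hours(self) -> float:
        # If the array dies just before the next backup runs,
        # everything written since the last backup is gone.
        return self.backup_interval_hours

    def worst_case_impact_hours(self) -> float:
        # Lost work plus the time spent rebuilding the machine.
        return self.backup_interval_hours + self.restore_hours


plans = [
    # restore_hours=8 is a placeholder assumption, not a real measurement
    StoragePlan("web server: striping + 4-hour DB backup", 4, 8),
    # restore_hours=24 assumes about a day to rebuild and reinstall
    StoragePlan("file server: RAID-5 + daily backup", 24, 24),
]

for plan in plans:
    print(
        f"{plan.name}: lose up to {plan.worst_case_loss_hours():g}h of work, "
        f"up to {plan.worst_case_impact_hours():g}h total impact"
    )
```

Note that the RAID level doesn't appear in the math at all. When the whole array is gone, only the backup interval and the restore time matter, which is the point.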
Either way, don't make the mistake of thinking that just having an 'online' backup is good enough. Back in the 'lovebug' days, I knew someone who backed up all of their scanned images to another machine on the network. When the lovebug virus hit their network, it ate the live data, and happily moved across the network and ate the backup as well.
I have three primary servers at the office, and each one uses a different backup methodology because each has different data storage needs. You should consider each situation separately, and think about how downtime and lost data from a catastrophic failure would impact your business.