I was just setting up a new install of two Ubuntu servers with a TB mirrored between them in realtime using DRBD. It occurred to me while I was configuring DRBD that the default settings are way too slow for current hardware.
For instance, if you're going to set up a high-availability cluster, no doubt you'll have at least a Gigabit network connection between the servers and SATA 300 hard drives at a minimum - probably in a RAID array to get even more throughput.
The default sync speed in DRBD is only 10 Megabytes / second. It's in your best interests, especially on the initial sync, to increase this considerably. At initial setup time you can safely configure this to be as high as your hardware will allow. Check out this article that describes how to go about calculating it.
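As a rough sketch of that kind of calculation (not necessarily the method from the linked article): the ceiling is set by whichever is slower, the replication link or the disks. The function name and the 80 MB / second disk figure below are assumptions for illustration only. The 0.3 factor reflects the rule of thumb in the DRBD documentation of using roughly 30% of available bandwidth for later resyncs, so that normal replication isn't starved.

```python
def sync_rate_ceiling(link_mbit_per_s, disk_mb_per_s, fraction=1.0):
    """Return a sync rate in MB/s bounded by the slower of network and disk.

    link_mbit_per_s: raw link speed in megabits per second (1000 for Gigabit)
    disk_mb_per_s:   sustained disk throughput in MB/s (measure your own drives)
    fraction:        share of the bottleneck to hand to the resync process
    """
    link_mb_per_s = link_mbit_per_s / 8  # bits to bytes; ignores protocol overhead
    return min(link_mb_per_s, disk_mb_per_s) * fraction


# Hypothetical hardware: Gigabit link, disk sustaining 80 MB/s (assumed value).
initial = sync_rate_ceiling(1000, 80)                # initial sync: use everything
ongoing = sync_rate_ceiling(1000, 80, fraction=0.3)  # later resyncs: leave headroom
print(f"initial sync ceiling: {initial:.0f} MB/s")
print(f"ongoing resync rate:  {ongoing:.0f} MB/s")
```

Real throughput will come in under these numbers, so treat the result as a starting point and measure.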
For instance, my setup initially used 22 MB / second as the sync speed, but at that rate the initial sync of 1 TB across a Gigabit crossover using SATA 300 drives was going to take almost 10 hours to complete. My hardware config actually lets me push this as high as 68 MB / second, reducing my initial sync time to about 3 3/4 hours, and that's on two systems with no RAID - simply a 1 TB hard drive in each, synced over a crossover cable.
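To get a feel for how much the rate matters, here is a back-of-the-envelope estimate of full sync time at the rates mentioned above. It assumes 1 TB means 1,000,000 MB and that the configured rate is sustained for the whole sync. The function name is made up for illustration. The results are idealized and won't line up exactly with the times quoted above, since the exact device size and unit conventions aren't modeled.

```python
def sync_hours(size_mb, rate_mb_per_s):
    """Hours needed to copy size_mb megabytes at a constant rate_mb_per_s."""
    return size_mb / rate_mb_per_s / 3600


size_mb = 1_000_000  # 1 TB taken as 10^6 MB (assumption)

# Compare the default rate with the two rates tried on this setup.
for rate in (10, 22, 68):
    print(f"{rate} MB/s: {sync_hours(size_mb, rate):.1f} hours")
```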
I stopped relying on RAID-5 long ago, after I had two separate installs in which the controller card corrupted more than one disk in the array at the same time, causing total data loss.
When you're dealing with more than a terabyte of data, restoring from any sort of backup medium becomes a painful process. All of my data is backed up on DVD (yeah yeah, I've heard the complaints before, but you don't know what I know about DVD backups), but restoring a terabyte of data from DVDs can take a week or so.
Enter DRBD. DRBD stands for Distributed Replicated Block Device. Essentially it's RAID-1 that works over Ethernet. DRBD rides on top of whatever physical storage medium and network you have, but below the file system level. You run it on multiple machines, and set up an identical hard drive configuration on each machine. The DRBD partition is automatically replicated from the primary server to the secondary. Using tools like "heartbeat" you can even monitor this system automatically and promote the secondary server to primary in the event of a failure.