RAID is an acronym for redundant array of independent (or inexpensive) disks and was designed to improve the fault tolerance and performance of computer storage systems. RAID was developed at the University of California at Berkeley in 1987 and was designed so that a group of smaller, less expensive drives could be interconnected with special hardware and software to make them appear as a single larger drive to the system. By using multiple drives to act as one drive, increases in fault tolerance and performance could be realized.
Initially, RAID was conceived to simply enable all the individual drives in the array to work together as a single, larger drive with the combined storage space of all the individual drives, which is called a JBOD (Just a Bunch of Disks) configuration. Unfortunately, if you had four drives connected in a JBOD array acting as one drive, you would be four times more likely to experience a drive failure than if you used just a single larger drive. And because JBOD does not use striping, performance would be no better than a single drive either. To improve both reliability and performance, the Berkeley scientists proposed six levels (corresponding to different methods) of RAID. These levels provide varying emphasis on fault tolerance (reliability), storage capacity, performance, or a combination of the three.
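The "four times more likely" figure can be checked with a little arithmetic. The sketch below is illustrative only: it assumes every drive fails independently with the same probability over some period, and the 2% figure and the function name are hypothetical, not taken from the text.

```python
def array_failure_probability(drive_failure_probability: float, drive_count: int) -> float:
    """Probability that at least one drive in a non-redundant array fails.

    Assumes each drive fails independently with the same probability
    over the period of interest (a simplification for illustration).
    """
    if not 0.0 <= drive_failure_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    # The array survives only if every drive survives.
    return 1.0 - (1.0 - drive_failure_probability) ** drive_count


if __name__ == "__main__":
    p = 0.02  # hypothetical chance that a single drive fails in the period
    for n in (1, 2, 4):
        print(n, "drive(s):", round(array_failure_probability(p, n), 4))
```

For small per-drive failure probabilities, the result for four drives is close to (slightly under) four times the single-drive figure, which is where the rule of thumb comes from.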
Although it no longer exists, an organization called the RAID Advisory Board (RAB) was formed in July 1992 to standardize, classify, and educate on the subject of RAID. The RAB developed specifications for RAID, a conformance program for the various RAID levels, and a classification program for RAID hardware.
The RAID Advisory Board defined seven standard RAID levels, called RAID 0–6. Most RAID controllers also implement a RAID 1+0 combination, which is usually called RAID 10. The levels are as follows:
- RAID Level 0: Striping. File data is written simultaneously to multiple drives in the array, which act as a single larger drive. This offers high read/write performance but low reliability. Requires a minimum of two drives to implement.
- RAID Level 1: Mirroring. Data written to one drive is duplicated on another, providing excellent fault tolerance (if one drive fails, the other is used and no data is lost) but no real increase in performance as compared to a single drive. Requires a minimum of two drives to implement (same capacity as one drive).
- RAID Level 2: Bit-level ECC. Data is split one bit at a time across multiple drives, and error correction codes (ECCs) are written to other drives. This is intended for storage devices that do not incorporate ECC internally. (All SCSI and ATA drives have internal ECC.) It’s a standard that theoretically provides high data rates with good fault tolerance, but seven or more drives are required for greater than 50% efficiency, and no commercial RAID 2 controllers or drives without ECC are available.
- RAID Level 3: Striped with parity. Combines RAID Level 0 striping with an additional drive used for parity information. This RAID level is really an adaptation of RAID Level 0 that sacrifices some capacity for the same number of drives. However, it also achieves a high level of data integrity or fault tolerance because data usually can be rebuilt if one drive fails. Requires a minimum of three drives to implement (two or more for data and one for parity).
- RAID Level 4: Blocked data with parity. Similar to RAID 3 except data is written in larger blocks to the independent drives, offering faster read performance with larger files. Requires a minimum of three drives to implement (two or more for data and one for parity).
- RAID Level 5: Blocked data with distributed parity. Similar to RAID 4 but offers improved performance by distributing the parity stripes over all the drives in the array rather than a single dedicated parity drive. Requires a minimum of three drives to implement (the capacity of two or more for data and the equivalent of one for parity).
- RAID Level 6: Blocked data with double distributed parity. Similar to RAID 5 except parity information is written twice using two parity schemes to provide even better fault tolerance in case of multiple drive failures. Requires a minimum of four drives to implement (the capacity of two or more for data and the equivalent of two for parity).
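To make the idea of rebuilding data from parity concrete, here is a minimal sketch of single-parity recovery as used conceptually by RAID 3, 4, and 5. It assumes the parity block is the byte-wise XOR of the data blocks in a stripe, which is the usual scheme but is not spelled out in the list above. The function names and sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing_block(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(surviving_blocks + [parity])


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC"]  # one stripe across three data drives
    parity = xor_blocks(data)  # what the parity drive would hold
    # Simulate losing the second drive and rebuilding its block.
    recovered = rebuild_missing_block([data[0], data[2]], parity)
    print(recovered == data[1])
```

Because XOR can cancel out only one unknown, this scheme recovers from a single missing block per stripe. That is why RAID 6 needs a second, different parity calculation to survive two failures.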
There are also nested RAID levels created by combining several forms of RAID. The most common are as follows:
- RAID Level 01: Mirrored stripes—Drives are first combined in striped RAID 0 sets; then the RAID 0 sets are mirrored in a RAID 1 configuration. A minimum of four drives is required, and the total number of drives must be an even number. Most PC implementations allow four drives only. The total usable storage capacity is equal to half of the number of drives in the array times the size of the lowest capacity drive. RAID 01 arrays can tolerate a single drive failure and some (but not all) combinations of multiple drive failures. This is not generally recommended because RAID 10 offers more redundancy and performance.
- RAID Level 10: Striped mirrors—Drives are first combined in mirrored RAID 1 sets; then the RAID 1 sets are striped in a RAID 0 configuration. A minimum of four drives is required, and the total number of drives must be an even number. Most PC implementations allow four drives only. The total usable storage capacity is equal to half of the number of drives in the array times the size of the lowest capacity drive. RAID 10 arrays can tolerate a single drive failure and many (but not all) combinations of multiple drive failures. This is similar to RAID 01, except with somewhat increased reliability because more combinations of multiple drive failures can be tolerated, and rebuilding an array after a failed drive is replaced is much faster and more efficient.
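The difference in fault tolerance between the two nested levels can be seen by enumerating failure combinations. The sketch below models a hypothetical four-drive array in which drives 0 and 1 form one group and drives 2 and 3 form the other. The layout and function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical four-drive layout: two groups of two drives each.
GROUPS = [(0, 1), (2, 3)]


def raid10_survives(failed: set[int]) -> bool:
    """RAID 10: each group is a mirror, so every mirror needs one working drive."""
    return all(any(drive not in failed for drive in group) for group in GROUPS)


def raid01_survives(failed: set[int]) -> bool:
    """RAID 01: each group is a stripe set, so one whole stripe set must be intact."""
    return any(all(drive not in failed for drive in group) for group in GROUPS)


def count_survivable(survives, failures: int, drive_count: int = 4) -> int:
    """Count the failure combinations of a given size that the array survives."""
    return sum(
        1
        for combo in combinations(range(drive_count), failures)
        if survives(set(combo))
    )


if __name__ == "__main__":
    print("RAID 10, two failures:", count_survivable(raid10_survives, 2), "of 6")
    print("RAID 01, two failures:", count_survivable(raid01_survives, 2), "of 6")
```

In this model both layouts survive any single failure, but RAID 10 survives four of the six possible two-drive failures while RAID 01 survives only two of them.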
Additional custom or proprietary RAID levels exist that were not originally supported by the RAID Advisory Board. For example, from 1993 through 2004, “RAID 7” was a trademarked marketing term used to describe a proprietary RAID implementation released by the (now defunct) Storage Computer Corp.
When set up for maximum performance, arrays typically run RAID Level 0, which incorporates data striping. Unfortunately, RAID 0 also sacrifices reliability such that if any one drive fails, all data in the array is lost. The advantage is in extreme performance. With RAID 0, performance generally scales up with the number of drives you add in the array. For example, with four drives you won’t necessarily have four times the performance of a single drive, but many controllers can come close to that for sustained transfers. Some overhead is still involved in the controller performing the striping, and issues still exist with latency—that is, how long it takes to find the data—but performance will be higher than any single drive can normally achieve.
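As a rough illustration of why striping spreads work across drives, the following sketch maps a logical block number to a drive and a position on that drive. It assumes a simple round-robin layout with a fixed stripe unit size. Real controllers differ in their details, and the function and parameter names here are hypothetical.

```python
def locate_block(
    logical_block: int, drive_count: int, blocks_per_stripe_unit: int = 1
) -> tuple[int, int]:
    """Map a logical block to (drive index, block offset on that drive).

    Assumes stripe units are dealt out to the drives in round-robin order.
    """
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    if drive_count < 2:
        raise ValueError("striping needs at least two drives")
    if blocks_per_stripe_unit < 1:
        raise ValueError("blocks_per_stripe_unit must be at least 1")
    stripe_unit = logical_block // blocks_per_stripe_unit  # which chunk overall
    offset_in_unit = logical_block % blocks_per_stripe_unit
    drive = stripe_unit % drive_count  # round-robin drive choice
    stripe_row = stripe_unit // drive_count  # how many full rows precede it
    return drive, stripe_row * blocks_per_stripe_unit + offset_in_unit


if __name__ == "__main__":
    # Eight consecutive logical blocks on a four-drive array.
    for block in range(8):
        print(block, locate_block(block, drive_count=4))
```

Consecutive stripe units land on different drives, so a long sequential transfer keeps all the drives busy at once. If any one drive is lost, every file larger than a stripe unit loses pieces, which is why a single failure destroys the whole array.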
When set up for reliability, arrays generally run RAID Level 1, which is simple drive mirroring. All data written to one drive is written to the other. If one drive fails, the system can continue to work on the other drive. Unfortunately, this does not increase performance, and it also means you get to use only half of the available drive capacity. In other words, you must install two drives, but you get to use only one. (The other is the mirror.) However, in an era of high capacities and low drive prices, this is not a significant issue.
Combining performance with fault tolerance requires using one of the other RAID levels, such as RAID 5 or 10. For example, virtually all professional RAID controllers used in network file servers are designed to use RAID Level 5. Controllers that implement RAID Level 5 used to be very expensive, and RAID 5 requires at least three drives to be connected, whereas RAID 10 requires four drives.
With four 500GB drives in a RAID 5 configuration, you would have 1.5TB of total storage, and you could withstand the failure of any single drive. After a drive failure, data could still be read from and written to the array. However, read/write performance would be exceptionally slow, and it would remain so until the drive was replaced and the array was rebuilt. The rebuild process could take a relatively long time, so if another drive failed before the rebuild completed, all data would be lost.
With four 500GB drives in a RAID 10 configuration, you would have only 1TB of total storage. However, you could withstand many cases of multiple drive failures. After a drive failure, data could still be read from and written to the array at full speed, with no noticeable loss in performance. Once the failed drive is replaced, the rebuild process would also go relatively quickly as compared to rebuilding a RAID 5 array. Because of these advantages, RAID 10 is often recommended as an alternative to RAID 5 where maximum redundancy and performance are required.
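The capacity figures in the RAID 5 and RAID 10 examples follow from simple rules, sketched below. The sketch assumes that every drive contributes only as much space as the smallest drive in the array and that RAID 1 is treated as a single mirror set. The function name and the level labels are hypothetical.

```python
def usable_capacity_gb(level: str, drive_sizes_gb: list[int]) -> int:
    """Estimate usable capacity for a few common RAID levels.

    Assumes each drive contributes only the capacity of the smallest drive.
    """
    minimum_drives = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    count = len(drive_sizes_gb)
    if count < minimum_drives[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_drives[level]} drives")
    if level == "10" and count % 2:
        raise ValueError("RAID 10 needs an even number of drives")
    smallest = min(drive_sizes_gb)
    if level == "0":
        return count * smallest  # no redundancy, all space is usable
    if level == "1":
        return smallest  # every drive holds the same data
    if level == "5":
        return (count - 1) * smallest  # one drive's worth of parity
    if level == "6":
        return (count - 2) * smallest  # two drives' worth of parity
    return (count // 2) * smallest  # RAID 10: half the drives are mirrors


if __name__ == "__main__":
    drives = [500, 500, 500, 500]
    for level in ("0", "5", "6", "10"):
        print("RAID", level, usable_capacity_gb(level, drives), "GB")
```

With four 500GB drives this gives 1,500GB for RAID 5 and 1,000GB for RAID 10, matching the figures in the text.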
Many motherboards include SATA RAID capability as a built-in feature. For those that don’t, or where a higher performance or more capable SATA RAID solution is desired, you can install a SATA RAID host adapter in a PCIe slot in the system. A typical PCIe SATA RAID controller enables up to four, six, or eight drives to be attached, and you can run them in RAID Level 0, 1, 5, or 10 mode. Most PCIe SATA RAID cards use a separate SATA data channel (cable) for each drive, allowing maximum performance. Motherboard-based RAID controllers almost exclusively use SATA drives.
If you are considering a SATA RAID controller (or a motherboard with an integrated SATA RAID controller), here are some things to look for:
- RAID levels supported. (Most support 0, 1, 5, and 10. A lack of RAID 5/6 or RAID 10 support indicates a very low-end product.)
- Support for four, six, or eight drives.
- Support for 6Gbps SATA transfer rates.
- PCIe card with onboard controller (provides best performance and future compatibility; note that low-cost PCIe cards are host-based and rely on the CPU).
Some operating systems include software-based RAID capability; in fact, limited RAID 0, 1, and even RAID 5 functionality has been built into some versions of Windows since Windows 2000. When Microsoft released Windows Home Server in 2007, it greatly enhanced this capability with a feature called Drive Extender, which allowed for the creation and arbitrary expansion of an array using virtually any type of drive (SATA, PATA, USB, FireWire, etc.) in any capacity. Drive Extender created a virtual drive that was a combination of the assigned physical drives. Redundancy was limited: by default, each file saved on a Drive Extender volume was automatically stored on two different drives, so that if one drive failed it could theoretically be replaced without losing any data. If more than one drive failed, data would be lost. Unfortunately, problems with Drive Extender caused Microsoft to remove the feature from Windows Home Server 2011.
Microsoft included a newer and better replacement for Drive Extender, called Storage Spaces, in Windows 8. Just like Drive Extender, it allows you to build a virtual drive using an array of drives of just about any type or capacity. One area where Storage Spaces differs from Drive Extender is in the redundancy options. In addition to two-way redundancy, where data is saved on two drives, Storage Spaces allows for three-way redundancy, meaning that data is saved on three drives. This also means that up to two drives can fail in the array without losing data. Although redundancy and reliability have been improved, performance, as with most software-based RAID, falls dramatically compared to either a single physical drive or hardware-based RAID, especially for writes.
While the Storage Spaces feature in Windows 8 looks like an excellent option for a home server with multiple data drives, just like any other RAID array, it doesn’t replace the need for backup, meaning you would need somewhere else to back up all of the data on the Storage Spaces virtual drive.
Normally, if you want both performance and reliability, you should look for hardware-based SATA RAID controllers that support RAID Level 5 or 10, or an external storage device with built-in RAID capability. You can install a PCIe-based RAID controller; however, many motherboards have RAID capability built in via the motherboard chipset. Another option is external storage devices like the Drobo (www.drobo.com), which can create and manage virtual drives using the various physical drives mounted in the enclosure. Because they rely on dedicated management hardware, they can offer better performance and reliability than even some hardware-based RAID setups.