RAID to the Rescue!
RAID = A Redundant Array of Independent Disks.
Hard drive failure will happen. Lost data isn’t just frustrating; it can be extremely expensive. We’ve had customers who have lost several days of shooting because of hard drive failure with no backup.
There are ways of mitigating the risk of hard drive failure. The simplest is to copy your data to another hard drive. You can do this manually or with an automated program like Apple’s Time Machine software, part of the OS X operating system. The problem with this approach is that your backed-up data is only as fresh as the last time it was copied. It’s periodic -- there’s no continuous backup of your information. Haven’t plugged in your Time Machine drive in the last 10 days? You’re setting yourself up for 10 days of lost work.
RAID systems offer a solution to this problem with the added benefit of increasing data throughput depending on the type of RAID you select. By combining multiple drives into a single system, RAID allows for vastly improved write speeds, complete mirroring, aggregation of multiple drives, and rebuild capability. There are various levels of RAID systems, each offering a different function.
RAID 0 provides pure speed and performance by harnessing a computer’s ability to deliver data faster than a single drive can write it. The array uses a technique known as “striping,” where data is written across two or more drives, depending on the size of the array, in a linear fashion. Each write is spread across all disks, creating a stripe that spans every disk in the array. At best, the transfer speed of the entire array approaches the combined transfer speeds of the drives composing it. Despite these gains in performance, RAID 0 has no provision for redundancy or backup.
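To make striping concrete, here is a minimal sketch in Python. It assumes the simplest possible layout, where logical blocks are dealt out to the disks in round-robin order. The function names `stripe_location` and `stripe` are made up for illustration and do not come from any real RAID tool.

```python
def stripe_location(block_index, num_disks):
    """Return (disk, row) for a logical block in a simple RAID 0 layout.

    Assumption: blocks are assigned round-robin, one block per disk per row.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    return block_index % num_disks, block_index // num_disks


def stripe(blocks, num_disks):
    """Distribute a list of blocks across num_disks lists, one per disk."""
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        disk, _row = stripe_location(index, num_disks)
        disks[disk].append(block)
    return disks


layout = stripe(["A", "B", "C", "D", "E", "F"], num_disks=2)
print(layout)  # [['A', 'C', 'E'], ['B', 'D', 'F']]
```

Notice that neither disk holds a complete copy of the data, which is why the speed comes with no redundancy.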
The speed with which data can be accessed depends on the stripe size relative to the size of each request. A request that fits inside a single stripe unit lands on one drive and sees no speed gain by itself, although, as is the case with database access, the drives can then serve separate small requests simultaneously. A request that spans several stripe units is split across the drives in the array, allowing each drive to read its share of the data at the same time.
While RAID 0 does improve speed, it has a higher rate of failure than a single drive. Because the data is spread across multiple drives, the failure rates of the individual drives compound. If one drive in a RAID 0 array fails, the whole array fails with it. Every file is split across the drives, so coherency is lost when one of them fails, and the array cannot cope with the lack of consistent data.
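As a rough illustration of how the risk compounds, the sketch below assumes each drive fails independently with the same probability over a given period. The 5% figure is only an example input, and `raid0_failure_probability` is a hypothetical helper name.

```python
def raid0_failure_probability(per_drive, num_drives):
    """Chance that at least one drive fails, which takes down the whole stripe set.

    Assumption: drives fail independently with identical probability.
    """
    return 1 - (1 - per_drive) ** num_drives


for drives in (1, 2, 4):
    chance = raid0_failure_probability(0.05, drives)
    print(f"{drives} drive(s): {chance:.2%}")
```

With these assumptions, a two-drive stripe is nearly twice as likely to lose data as a single drive, and the risk keeps climbing as drives are added.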
RAID 1 is used to create a complete mirrored set of disks. Usually employing two disks, RAID 1 writes data as a normal hard drive would but keeps an exact copy of the primary drive on the second drive in the array. This mirrored copy is constantly updated as data on the primary drive changes. By keeping this mirrored backup, the array decreases the chance of data loss from 5% over three years for a single drive to roughly 0.25% for the pair, assuming the two drives fail independently of each other. Should a drive fail, replace it as soon as possible to restore the mirror.
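The 0.25% figure follows from simple arithmetic: a mirror only loses data if every copy fails. A minimal sketch, assuming independent failures and no drive replacement during the period (the helper name `mirror_failure_probability` is made up for this example):

```python
def mirror_failure_probability(per_drive, num_copies=2):
    """Chance that every drive in a mirror fails within the same period.

    Assumptions: independent failures, and a failed drive is never replaced.
    """
    return per_drive ** num_copies


chance = mirror_failure_probability(0.05)
print(f"{chance:.2%}")  # 0.25%
```

In practice, drives bought together and run side by side may not fail independently, so treat this as a best-case estimate.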
RAID 5 is the most stable of the more advanced RAID levels and offers redundancy, speed, and the ability to rebuild a failed drive. RAID 5 uses the same block-level striping as other RAID levels but adds another level of data protection by creating what is known as a “parity block”. These blocks are stored alongside the other blocks in the array in a staggered pattern and are used to verify, and if necessary reconstruct, the data written to the array.
Should an error or failure occur, the parity blocks are combined with the information stored on the surviving member disks to rebuild the data correctly. This can be done on-the-fly without interruptions to applications or other programs running on the computer. The computer will notify the user of the failed drive, but will continue to operate normally. This state of operation is known as Interim Data Recovery Mode. Performance may suffer while the disk is being rebuilt, but operation should continue. The failed drive should be replaced as soon as possible.
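Parity is commonly computed with a bitwise XOR, and the sketch below shows why that makes a rebuild possible. It covers a single stripe with three made-up data blocks and ignores the staggered parity placement a real RAID 5 array uses. The name `xor_blocks` is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block computed from them.
data = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data)

# Pretend the disk holding the second block has failed.
survivors = [data[0], data[2], parity]

# XOR-ing everything that survived yields the missing block.
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'five'
```

The same trick works no matter which single block is lost, including the parity block itself, but it cannot recover from two failures in the same stripe.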
There are other RAID options that are less common or still under development. Nesting is a term used to describe how multiple RAID levels and functionalities can be combined in a single array. RAID 10, for example, combines the speed of RAID 0 with the redundancy of RAID 1. It’s often referred to as mirrored striping. The data is striped as it would be in a RAID 0 array but is also mirrored for redundancy as if it were operating in a RAID 1 array. Similarly, RAID 50 combines the striping of RAID 0 with the parity blocks used in RAID 5.
Should the selection of a RAID format be overwhelming, you might opt for a Drobo. Simply insert unused SATA drives of any size in the Drobo array and it will choose the RAID format most appropriate for the given situation. Drobo can mirror and rebuild data without interruption should one of the SATA drives fail and the user can replace the drive without fear of further failure or computing interruptions. TapeOnline has the components you need to Build Your Own Raid.
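All RAID levels are limited in space by the smallest disk in the array, because each drive can contribute only as much space as the smallest one offers. If a 250 GB drive and a 200 GB drive are working together in a RAID 0 array, the total capacity of the system will be 400 GB; mirrored in a RAID 1 array, the same pair would offer 200 GB.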
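The capacity arithmetic can be sketched as follows. This assumes the textbook formulas for RAID 0, 1, and 5, where every drive contributes only as much space as the smallest one; real products may reserve extra space. The function name `usable_capacity_gb` is invented for illustration.

```python
def usable_capacity_gb(level, drive_sizes_gb):
    """Estimate usable capacity for a simple RAID 0, 1, or 5 array."""
    count = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)
    if level == 0:
        # Striping: every drive contributes the size of the smallest drive.
        return count * smallest
    if level == 1:
        # Mirroring: all drives hold the same data.
        return smallest
    if level == 5:
        if count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space goes to parity.
        return (count - 1) * smallest
    raise ValueError("only RAID 0, 1, and 5 are covered in this sketch")


print(usable_capacity_gb(0, [250, 200]))       # 400
print(usable_capacity_gb(1, [250, 200]))       # 200
print(usable_capacity_gb(5, [250, 200, 200]))  # 400
```

In each case the extra 50 GB on the larger drive goes unused, which is why matching drive sizes is the economical choice.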
Perhaps there are a lot of unused drives lying around or you have purchased a RAID drive that supports JBOD. For the former, a process called concatenation can be used to create a SPAN or BIG disk. Concatenation can be thought of as the inverse of partitioning, as the various disks are made to appear as one large disk with the combined storage capacity of the smaller drives. A 250 GB drive and a 500 GB drive in a SPAN array will have a total capacity of 750 GB. Just a Bunch of Disks, or JBOD, can also be used to house two or more drives in a single array but still allow each drive to retain its independence. For example, say your RAID drive contains two hard drives and you have no need for RAID functionality at the moment. JBOD allows the two drives within the array to be addressed individually rather than as a large unit, so you can use the two drives as separate storage vessels.
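To picture how concatenation makes several drives look like one, the sketch below maps a position on the combined disk back to the physical drive that holds it. It assumes the drives are simply laid end to end in order and works in whole gigabytes for readability. The name `locate` is hypothetical.

```python
def locate(offset_gb, drive_sizes_gb):
    """Map an offset on a spanned volume to (drive_index, offset_on_that_drive).

    Assumption: drives are joined end to end in the order given.
    """
    if offset_gb < 0:
        raise ValueError("offset cannot be negative")
    start = 0
    for drive, size in enumerate(drive_sizes_gb):
        if offset_gb < start + size:
            return drive, offset_gb - start
        start += size
    raise ValueError("offset is beyond the end of the spanned volume")


drives = [250, 500]
print(sum(drives))          # 750
print(locate(100, drives))  # (0, 100)
print(locate(250, drives))  # (1, 0)
```

Unlike striping, the first drive fills up completely before the second is touched, so a SPAN offers extra capacity but no speed gain.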
The goal of any RAID system is to increase reliability and input/output performance. Each level has a different functional balance to achieve this goal.
It is important to note that RAID is not perfect and is still subject to failure, but the odds of a catastrophic failure are vastly lower than with traditional single-drive systems. The mirroring and parity systems used for redundancy offer no protection against viruses or other malevolent programs that could embed themselves into a drive. Should a virus find its way in, it's quite likely that it will be written across all the disks in the array. To ensure near-complete protection from failures and viruses, it is always best to use RAID in conjunction with standard backup methods and virus protection software.
- RAID - Redundant Array of Independent Disks
- Striping - The process of writing data linearly across two or more hard drives; a stripe is divided into logical units of data
- JBOD - Just a Bunch Of Disks
- Parity Block - A block of data used to check for errors in written data on a drive
- Concatenation - The process of aggregating multiple individual drives into a larger single drive
- Nesting - Combining multiple RAID levels and their functionalities in a single array
Comments? Questions? Call us at 877-893-8273. We love feedback.