
RAID - Redundant Array of Inexpensive Disks

posted 12 Jun 2011, 07:27 by Tristan Self   [ updated 12 Jun 2011, 07:39 ]
RAID (Redundant Array of Inexpensive Disks) is a way of organising hard disks to increase performance, to provide fault tolerance, or to do both at the same time.

A RAID array is a set of two or more ordinary hard disks together with a specialized disk controller (or software) that provides the RAID functionality. Developed initially for servers and stand-alone disk storage systems, RAID is increasingly popular in desktop PCs.

RAID can be implemented in software or in hardware. Both do essentially the same thing, except that hardware is generally faster, especially when rebuilding after a failure.

RAID Levels

RAID can be set up in a plethora of different ways, each of which is called a level. The levels covered here range from Level 0 to Level 53. The characteristics, applications, advantages and disadvantages of each are discussed below.

Additional RAID Information

Additional information about RAID and its implementations is discussed below, including hardware and software implementations and how to set up software RAID in Windows 2000 Server.


RAID Level 0 - Striping

RAID 0 is a disk array where the data is broken down into blocks and each block is written to a different drive. Normally a striped disk array has 2 disks: the first part of the data is written to drive 1, the next part to drive 2, and so on in turn. So a file which is split into blocks A, B, C and D would be written to the drives like this: Drive 1 = A and C, Drive 2 = B and D.
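The A, B, C, D example can be sketched in a few lines of Python. This is only an illustration of round-robin striping; the function names `stripe` and `unstripe` are made up for this example and each "drive" is just a list.

```python
def stripe(blocks, num_drives=2):
    """Distribute blocks round-robin across num_drives drives."""
    drives = [[] for _ in range(num_drives)]
    for i, block in enumerate(blocks):
        drives[i % num_drives].append(block)
    return drives


def unstripe(drives):
    """Rebuild the original block order by reading the drives in rotation."""
    total = sum(len(drive) for drive in drives)
    n = len(drives)
    return [drives[i % n][i // n] for i in range(total)]


drives = stripe(["A", "B", "C", "D"])
print(drives)            # [['A', 'C'], ['B', 'D']]
print(unstripe(drives))  # ['A', 'B', 'C', 'D']
```

Note that `unstripe` needs every drive: take one list away and the file cannot be put back together, which is exactly the lack of fault tolerance described under Disadvantages.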



Advantages

  • Input and output performance is greatly improved, as the computer can read from and write to the two drives simultaneously and so spread the load across both.
  • No parity calculation overhead is involved.
  • Very simple design.
  • Easy to implement.

Disadvantages

  • No fault tolerance: if one of the disks breaks, all the data is lost. Even though the other disk might be okay, both disks are needed to get any data back.
  • Should never be used in mission critical applications.

Applications

Striping is useful for applications that need maximum input and output, which normally means video and audio applications.

  • Video Production and Editing
  • Image Editing
  • Pre-Press Applications
  • Any application requiring high bandwidth

RAID Level 1 - Mirroring

Mirroring is where the data is duplicated in its entirety and copied to one or more additional disks, so the data is "mirrored" across 2 or more disks. This gives 100% redundancy of data, but it also carries a large overhead, as at least half of the raw disk space is used for copies. This is one of the best forms of RAID as far as redundancy and speed of disaster recovery are concerned. In the diagram below the data "A" is duplicated on both disk 1 and disk 2.
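As a rough sketch of the idea, the toy `Mirror` class below writes every block to all healthy disks and reads from the first healthy one. The class and its method names are invented for this illustration and do not correspond to any real RAID API.

```python
class Mirror:
    """Toy RAID 1: every write goes to all healthy disks."""

    def __init__(self, num_disks=2):
        self.disks = [{} for _ in range(num_disks)]  # block number -> data
        self.failed = set()

    def write(self, block_no, data):
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block_no] = data

    def read(self, block_no):
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[block_no]
        raise IOError("all mirrors have failed")

    def fail(self, disk_no):
        """Simulate a dead disk: its contents are gone."""
        self.failed.add(disk_no)
        self.disks[disk_no] = {}

    def replace(self, disk_no):
        """Fit a new disk and copy everything from a surviving mirror."""
        source = next(d for i, d in enumerate(self.disks) if i not in self.failed)
        self.disks[disk_no] = dict(source)
        self.failed.discard(disk_no)


mirror = Mirror()
mirror.write(0, "A")
mirror.fail(0)
print(mirror.read(0))   # A (served by disk 1)
mirror.replace(0)
print(mirror.disks[0])  # {0: 'A'}
```

Recovery here is a plain copy from the surviving disk, with no parity to recalculate, which is why recovery from a mirror is comparatively simple.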

Advantages

  • Increased read throughput, as each mirrored pair can service two reads at once (but only one write).
  • 100% data redundancy means no rebuild is necessary in case of disk failure; the contents can just be copied to the new disk.
  • In certain circumstances RAID 1 can sustain multiple drive failures. (Very Cool!)
  • Simplest RAID storage subsystem design.

Disadvantages

  • Highest disk overhead of all RAID types: 100% redundancy means a higher cost for the same usable disk space.
  • RAID 1 can be done in software, however this can cause slowdown on heavily loaded servers. Hardware is recommended, which adds cost.
  • "Hot Swap" of a failed disk may not be implemented in software RAID.

Applications

Anything where you have to have the data with very high availability, such as the data that forms the basis of a person's or a company's operations. (This is what I use on my fileserver.)

  • Accounting
  • Payroll
  • Financial
  • All other critical data storage applications.

RAID Level 2 - Hamming Code ECC

The Hamming Code ECC RAID system is quite complicated. It requires many disks, some used as storage and the rest used to store ECC (Error Correction Codes) for the data stored on the other disks. Using the example below, the idea is that each bit of data is written to one of the data disks (4 of them, labeled 0 to 3). Each data word written to the disks also has its Hamming Code ECC recorded on the ECC disks. When the data is read back, the ECC verifies the data or corrects single disk errors.
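To make the "verify or correct" step concrete, here is a small sketch. The text does not say which Hamming code is used, so this assumes Hamming(7,4): the 4 data bits (one per data disk) get 3 check bits (one per ECC disk). The function names are hypothetical.

```python
def hamming_encode(data_bits):
    """Turn 4 data bits into a 7-bit word laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming_correct(word):
    """Return (data bits, error position); position 0 means no error found."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s3  # 1-based index of the bad bit
    if position:
        w[position - 1] ^= 1         # flip the bad bit back
    return [w[2], w[4], w[5], w[6]], position


word = hamming_encode([1, 0, 1, 1])
print(word)                   # [0, 1, 1, 0, 0, 1, 1]
word[4] ^= 1                  # one "disk" returns the wrong bit
print(hamming_correct(word))  # ([1, 0, 1, 1], 5)
```

With only 4 data bits per word, 3 of the 7 disks hold ECC, which illustrates the high ratio of ECC disks to data disks at small word sizes.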

Advantages

  • "On the fly" data error correction.

  • Very high data transfer rates possible.

  • Relatively simple controller design compared to RAID levels 3, 4 and 5.

Disadvantages

  • Very high ratio of ECC disks to data disks when smaller word sizes are used.

  • Very expensive, although when you've got to have the highest possible transfer rates you can probably justify it.

  • No commercial implementations exist and it's not very commercially viable due to the cost!

Applications

As there are no commercial implementations there are no real applications. However, work such as video editing, digital movie production or "Toy Story" type CGI (Computer Generated Images) movies might find this useful.


RAID Level 3 - Bit Interleaved Parity Organisation

RAID 3 is a commonly used RAID level. It is of moderate cost to implement but has very high read and write transfer rates. It also handles a drive failure well, and such a failure has no real impact on the read and write speeds (throughput) of the array. The example below uses 3 disks, which is the minimum number of drives required to implement RAID 3; other flavors of RAID 3 can have 4 or 5 drives. The basic idea is that the data is striped across the data disks and, at the same time, the parity information is written to a dedicated parity disk. Any one of the disks can fail without loss of data. The dedicated parity disk can be a performance bottleneck, as it must be accessed every time something accesses the array.

The stripe parity is generated on writes and recorded on the parity disk, and then checked on the reads.
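A minimal sketch of how parity allows one lost disk to be rebuilt, assuming plain XOR parity. The helper name `xor_parity` and the sample data are made up for this example.

```python
from functools import reduce


def xor_parity(chunks):
    """XOR equal-length byte strings together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# Two data disks and one dedicated parity disk (the 3-disk minimum).
data_disks = [b"RAID", b"demo"]
parity_disk = xor_parity(data_disks)

# Data disk 0 fails: XOR the survivors with the parity to get it back.
rebuilt = xor_parity([data_disks[1], parity_disk])
print(rebuilt)  # b'RAID'
```

The same XOR that generates the parity also reconstructs the missing disk, because XOR-ing a value in twice cancels it out.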

Advantages

  • Very high read data transfer rate.
  • Very high write data transfer rate.
  • Disk failure has an insignificant impact on throughput.
  • Low ratio of ECC (Parity) disks to data disks means high efficiency.

Disadvantages

  • Transaction rate equal to that of a single drive at best (if spindles are synchronised.)
  • Controller design is fairly complex.
  • Very difficult and resource intensive to do as a "software" RAID.

Applications

  • Video Production and live streaming
  • Image Editing
  • Video Editing
  • Any application requiring high throughput.

RAID Level 4 - Block Interleaved Parity Organisation

RAID 4 is like RAID 3, except that entire blocks of data are written to each data disk (striped by blocks) rather than striping bytes as in RAID 3. This improves random access performance compared to RAID 3, but the dedicated parity disk still remains a bottleneck.

The system works in the same way: parity is generated on writes and recorded on the parity disk, and it is checked when data is read from the array.
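The bottleneck can be seen in a small sketch of a single-block write, assuming XOR parity. The names `small_write` and `full_parity`, and the sample blocks, are hypothetical. Only one data disk is touched per write, but the parity disk is touched every time.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Parity of one stripe: XOR of the block from every data disk."""
    return reduce(xor_bytes, blocks)


def small_write(data_disks, parity_disk, disk_no, stripe_no, new_block):
    """Overwrite one block, updating parity without reading the other data disks."""
    old_block = data_disks[disk_no][stripe_no]
    old_parity = parity_disk[stripe_no]
    # new parity = old parity XOR old data XOR new data
    parity_disk[stripe_no] = xor_bytes(xor_bytes(old_parity, old_block), new_block)
    data_disks[disk_no][stripe_no] = new_block


# Three data disks holding two stripes each, plus one parity disk.
data_disks = [[b"aaaa", b"bbbb"], [b"cccc", b"dddd"], [b"eeee", b"ffff"]]
parity_disk = [full_parity([disk[s] for disk in data_disks]) for s in range(2)]

# Two writes to different data disks: both have to update the parity disk.
small_write(data_disks, parity_disk, disk_no=1, stripe_no=0, new_block=b"ZZZZ")
small_write(data_disks, parity_disk, disk_no=2, stripe_no=1, new_block=b"YYYY")

for s in range(2):
    assert parity_disk[s] == full_parity([disk[s] for disk in data_disks])
```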

Advantages

  • Very high read data transfer rate.
  • High aggregate read transfer rate.
  • Low ratio of ECC (Parity) disks to data disks means high efficiency.
  • Better random access performance than RAID 3 because of storing data as blocks rather than as bits.

Disadvantages

  • Worst write transaction rate and write aggregate transfer rate.
  • Controller design is fairly complex.
  • Difficult and inefficient data rebuild in the event of disk failure.

Applications

  • Video Production and live streaming
  • Image Editing
  • Video Editing
  • Any application requiring high throughput, but with faster random access times.

RAID Level 5 - Block Interleaved Distributed Parity

The basic idea is the same as RAID 3 and 4: when data is written to disk, parity is generated and then stored on another disk. The difference is that there is no dedicated parity disk; the parity is kept in a distributed location. On reads the data is read back and its parity is checked. As you can see from the diagram, the parity from drive 1 is stored on drive 2, the parity for drive 2 is stored on drive 3, and so on. In reality the blocks are written in a more complex way, so that parity for all the drives is spread across all the drives too.
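One way to picture distributed parity is to print which disk holds the parity block for each stripe. The rotation used below (parity starts on the last disk and moves one disk to the left per stripe) is an assumption for illustration; real controllers use several different rotation schemes. The function name `raid5_layout` is made up.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row per stripe; D<n> is a data block, P<n> that stripe's parity."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P%d" % stripe)
            else:
                row.append("D%d" % block)
                block += 1
        rows.append(row)
    return rows


for row in raid5_layout(num_disks=3, num_stripes=3):
    print(row)
# ['D0', 'D1', 'P0']
# ['D2', 'P1', 'D3']
# ['P2', 'D4', 'D5']
```

Every disk ends up holding some parity, so no single disk has to be accessed on every write the way the dedicated parity disk in RAID 3 and 4 does.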

Advantages

  • Highest read data transaction rate.
  • Medium write data transaction rate.
  • Low ratio of ECC (Parity) disks to data disks means high efficiency.
  • Good aggregate transfer rate.

Disadvantages

  • A disk failure has a moderate impact on throughput.
  • Most complex controller design, which means highest cost.
  • Difficult to rebuild the data in the event of a disk failure, compared to RAID 1.
  • Individual block data transfer rate the same as a single disk.

Applications

  • Servers of any description really: File and Application Servers.
  • Database Servers
  • Web Servers, E-Mail and News Servers
  • Intranet Servers

RAID Level 6 - P+Q Redundancy Scheme

RAID 6 is like RAID 5 except that it has two distributed parity schemes, which increases the fault tolerance of the array. Data is striped at block level as in RAID 5, and when data is written a second set of parity is generated and written across all the drives, like RAID 5 but twice. This gives very high fault tolerance: the array can sustain up to two simultaneous drive failures, pretty neat!

This is the ideal solution for a mission critical application, such as storage of a company's files, where without the data there would be no company.
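The sketch below shows how two independent parity values (P and Q) make it possible to rebuild two lost data disks. It assumes P is plain XOR and Q is a Reed-Solomon style sum over GF(2^8) with polynomial 0x11d, which is one common way of doing it and not necessarily the scheme described here. All function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def pq_parity(data):
    """P is the XOR of all data disks; Q weights disk i by 2**i in GF(2^8)."""
    p = bytearray(len(data[0]))
    q = bytearray(len(data[0]))
    for i, chunk in enumerate(data):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(chunk):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(survivors, x, y, p, q):
    """Rebuild lost data disks x and y from the surviving disks, P and Q."""
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inverse = gf_pow(gx ^ gy, 254)  # a**254 is the inverse of a in GF(2^8)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for j in range(len(p)):
        pxy, qxy = p[j], q[j]
        for i, chunk in survivors.items():
            pxy ^= chunk[j]                        # leaves Dx ^ Dy
            qxy ^= gf_mul(gf_pow(2, i), chunk[j])  # leaves gx*Dx ^ gy*Dy
        dx[j] = gf_mul(qxy ^ gf_mul(gy, pxy), inverse)
        dy[j] = pxy ^ dx[j]
    return bytes(dx), bytes(dy)


data = [b"AB", b"CD", b"EF", b"GH"]
p, q = pq_parity(data)
survivors = {0: data[0], 2: data[2]}  # data disks 1 and 3 have both failed
assert recover_two(survivors, 1, 3, p, q) == (data[1], data[3])
```

The extra field arithmetic needed for Q gives a feel for why the controller overhead and write cost are higher than for RAID 5.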

Advantages

  • Some of the advantages of RAID 5
  • Very high fault tolerance that allows for multiple simultaneous drive failures.

Disadvantages

  • More complex controller design.
  • Controller overhead to compute parity address is very high because of the 2-Dimensional parity scheme.
  • Write performance not as good as RAID 5.
  • Requires N+2 drives to implement because of the 2-Dimensional array.

Applications

  • Servers of any description really: File and Application Servers.
  • Database Servers
  • Web Servers, E-Mail and News Servers
  • Intranet Servers
  • Mission Critical Storage Applications

RAID Level 7 - Optimised Asynchrony

This is an uncommon, custom-built RAID array, but it has very high I/O rates as well as very high data transfer rates. All the data on the disks is cached and parity is calculated; the cache helps ensure that the array works with very high I/O and data rates. A real-time operating system controls the array and its processors.

Advantages

  • Write performance is between 25% and 90% better than single spindle performance and is between 1.5 and 6 times better than other array levels.
  • The host interfaces are scalable for more systems to be connected or for increased bandwidth to each host.
  • Small reads in a multi-user environment can have a very high cache hit rate, giving very short access times.
  • The more drives in the array the better the write performance.
  • Access times decrease as the number of actuators in the array increases.
  • No extra data transfers required when manipulating the parity data.

Disadvantages

  • Only one vendor supplies this RAID solution.
  • Extremely high cost per MB.
  • Very short warranty.
  • Not user serviceable.
  • Must have a UPS to protect cache data.

Applications

  • It's really fast, and when a very reliable, very high speed, multi-user solution is needed this is the one to choose.

RAID 7 is a registered trademark of Storage Computer Corporation.


RAID Level 10 - Mirroring and Striping

Mirroring has good redundancy but poor performance; striping has good performance but no redundancy. Wouldn't it be great to have the best of both? Well you can: RAID level 10 combines mirroring with striping, so you get the redundancy of mirroring with the performance of striping.

RAID 10 works by having a striped array whose segments are RAID 1 arrays. So if data was written to the array, it would be mirrored across the mirrored pair of disks as well as being striped across the striped pair.
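A rough sketch of where the blocks end up in a 4-disk RAID 10 array, with disks 0 and 1 forming one mirrored pair and disks 2 and 3 the other. The function name `raid10_write` and the disk numbering are assumptions for this example.

```python
def raid10_write(blocks, num_pairs=2):
    """Stripe blocks across mirrored pairs; both disks of a pair get each block."""
    disks = [[] for _ in range(num_pairs * 2)]
    for i, block in enumerate(blocks):
        pair = i % num_pairs               # striping: pick the pair in rotation
        disks[2 * pair].append(block)      # mirroring: first disk of the pair
        disks[2 * pair + 1].append(block)  # mirroring: second disk of the pair
    return disks


for number, contents in enumerate(raid10_write(["A", "B", "C", "D"])):
    print("Disk", number, contents)
# Disk 0 ['A', 'C']
# Disk 1 ['A', 'C']
# Disk 2 ['B', 'D']
# Disk 3 ['B', 'D']
```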


Advantages

  • RAID 10 has the same fault tolerance as RAID 1.
  • RAID 10 has the same overhead as RAID 1.
  • Good solution for a need that requires redundancy but with better performance.
  • Higher read/write is achieved by striping RAID 1 segments.

Disadvantages

  • Because a minimum of 4 disks is required to implement RAID 10, it's expensive and has high overhead.
  • All drives must be made to move in parallel to ensure performance.

Applications

  • Any application RAID 1 might be used for, but an application that needs better performance than what RAID 1 can provide.

RAID Level 53 - RAID 3 with Striping

RAID 53 is a combination of RAID 3 and RAID 0. It basically works by taking normal RAID 3 arrays (data disks and a parity disk) and striping data across them as well, which gives it the higher I/O rates. Each part of the striped array is itself a RAID 3 array with parity. For example, the data part "A" exists in the striped layer; this data is also spread across the 2 RAID 3 data disks and its parity is stored on the parity disk of that RAID 3 array.


Advantages

  • RAID 53 has the same fault tolerance as RAID 3.

  • High data transfer rates because of the striping of data within the RAID 3 arrays.

  • Small requests achieve High IO Rates because of the RAID 0 striping.

  • Like a faster version of RAID 3.

Disadvantages

  • Very Expensive to implement.

  • All the disks must have their spindles synchronized, which is difficult and requires specific hardware.

  • If striping at byte level is used, the drives' formatted capacity is poorly utilised.

Applications

  • Anything that would use RAID 3 but needs a bit more oomph!
  • Video Production and live streaming
  • Image Editing
  • Video Editing
  • Any application requiring high throughput.

RAID Level 0 + 1 - Mirroring and Striping

RAID Level 0 + 1 looks like RAID 10, but in fact it is different. RAID 10 is a striped array whose segments are RAID 1 mirrors, whereas RAID 0 + 1 uses two RAID 0 arrays that are mirrored to form a RAID 1 array. Basically, it is a mirrored array whose segments are RAID 0 arrays.


Advantages

  • RAID 0 + 1 has the same fault tolerance as RAID 5.

  • RAID 0 + 1 has the same overhead for fault tolerance as mirroring alone.

  • High I/O rates are achieved as the disks are striped in RAID 0.

  • Good solution for high performance but not maximum reliability.

Disadvantages

  • RAID 0 + 1 is not like RAID 10, as a single drive failure will in effect turn the whole array into a RAID 0 array with no redundancy left.

  • Expensive because of the high overheads.

  • All drives must move in parallel to ensure performance.

  • Limited scalability at a very high cost.
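The difference from RAID 10 shows up when a second disk fails. The sketch below counts how many of the possible two-disk failures a 4-disk array survives under a simple model: RAID 10 dies only when both disks of one mirrored pair fail, while RAID 0 + 1 needs at least one of its two stripe sets completely intact. The disk numbering and function names are assumptions made for this example.

```python
from itertools import combinations


def raid10_survives(failed, num_pairs=2):
    """Mirrored pairs are (0,1), (2,3), ...; fatal only if a whole pair is lost."""
    return all(not {2 * p, 2 * p + 1} <= failed for p in range(num_pairs))


def raid01_survives(failed, disks_per_stripe=2):
    """Stripe set A is disks 0..n-1, set B is n..2n-1; one set must be untouched."""
    set_a = set(range(disks_per_stripe))
    set_b = set(range(disks_per_stripe, 2 * disks_per_stripe))
    return not (failed & set_a) or not (failed & set_b)


two_disk_failures = [set(pair) for pair in combinations(range(4), 2)]
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)
print(raid10_ok, "of", len(two_disk_failures))  # 4 of 6
print(raid01_ok, "of", len(two_disk_failures))  # 2 of 6
```

Both layouts survive any single failure, but under this model RAID 10 is twice as likely to survive the second one.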

Applications

  • Audio/Video requiring high I/O rates but with redundancy.
  • High Performance Fileserver.

Hardware RAID

Hardware RAID uses dedicated hardware to implement the array as opposed to using software to control the array. The RAID controller cards have powerful microprocessors on board to reduce the amount of processing required by the CPU.

Now for the rundown of the advantages and disadvantages of hardware RAID:

Advantages

  • Performance - It's quicker and more powerful than software RAID.
  • Boot Volume included in the RAID - As hardware RAID sits below all OS software levels, you can include the boot volume in the array.
  • Level Support - Hardware supports all the RAID levels.
  • Hot Swappable - Most hardware implementations allow for hot swapping of disks.
  • No/Very Few Operating System Compatibility Problems.
  • No/Very Few Software Compatibility Issues.
  • Reliability - It's more reliable as it's implemented in hardware, so there is far less chance of bugs.

Disadvantages

  • Cost - It costs more to implement in hardware so the system will cost more.
  • Not Simple - Depending on the complexity of the controller the RAID configuration and installation can be more complicated than that of software RAID.

Software RAID

Software RAID works like hardware RAID but is implemented without dedicated hardware; instead the OS (the operating system or its drivers) implements the RAID array. Most flavors of UNIX, as well as Windows NT and 2000, have some sort of software implementation of RAID.

Now for the rundown of the advantages and disadvantages of software RAID.

Advantages

  • Cost - If your OS already supports RAID, no additional costs for hardware needed.
  • Simplicity - You don't need to install and configure a RAID hardware controller.

Disadvantages (There's a lot!)

  • Performance - Software RAID is not as quick and gives worse system performance than hardware implementations.
  • Boot Volume Limitations - Software RAID can only be implemented on data partitions, not the boot partition, meaning separate partitions are needed for OS booting.
  • Level Support - Normally only levels 0, 1 and 5 can be implemented in software; sometimes you want something more than this!
  • No Hot-Swap or Drive Spares - Software doesn't normally allow these.
  • Portability - Software RAID arrays are generally not compatible between different operating systems in multi-OS environments.
  • Compatibility - Some software, such as low-level utilities, may conflict with software RAID implementations.
  • Reliability - Software does have bugs. Hardware might too, but this is much less likely. The equation explains the problem: Software RAID + Bugs = No Data!

Although software RAID seems to have nothing going for it, the advantages it does have are possibly the most important ones for a business, especially a small business: money and time (simplicity). I run software RAID with no real problems, so it is okay and cheap, but if you're serious, go for hardware.


How to Set Up RAID 0 (Striping) in Microsoft Windows 2000 Server

To set up a software RAID 0 stripe set in Microsoft Windows 2000 Server, follow the steps below. Note that striping provides no redundancy.

Note! You can't use RAID on the system or boot volume; you need two (or more) separate data disks!

Step 1 - Ensure that you have 2 hard disks installed in your computer and ready to be used for RAID. Ideally they should be the same make, model and size. This is not essential, but RAID works best when both disks are the same size.

Step 2 - Click "Start" -> "Settings" -> "Control Panel". Then in "Control Panel" select "Administrative Tools", and once you have done this select "Computer Management".

Step 3 - Now select "Disk Management" from the left hand side. The Disk Manager will load, which may take some time. You will see the list of volumes in the top right pane, and a more detailed view below.

Step 4 - Now right-click on the unallocated space on the first of the two disks you want to use, and select "Create Volume".

Step 5 - Click "Next" in the Create Volume Wizard.

Step 6 - Click "Striped Volume" under "Volume Type", and then click "Next."

Step 7 - In the left pane under "Select Two or More Disks", a list of all disks that have enough free unallocated space to be used is shown.

Step 8 - In the right pane under "Selected Dynamic Disks", the disk that you right-clicked in Step 4 is displayed.

Step 9 - In the left pane under "All Available Dynamic Disks", click the other disk you want to use, and then click "Add."

Step 10 - Click "Next", then select "Assign Drive Letter" and give the volume a drive letter.

Step 11 - Now to format the volume: click "Format this partition with the following settings", choose "NTFS", leave the default "Allocation Unit Size", then give the drive a "Volume Label". Click "Next" and then click "Finish". Once complete, reboot the computer and the striped RAID array is ready to go.


How to Set Up RAID 1 (Mirroring) in Microsoft Windows 2000 Server

To set up a software RAID 1 mirrored set in Microsoft Windows 2000 Server, follow the steps below. Note that mirroring provides redundancy at the cost of half the total disk capacity.

Note! You can't use RAID on the system or boot volume; you need two (or more) separate data disks!

Step 1 - Ensure that you have 2 hard disks installed in your computer and ready to be used for RAID. Ideally they should be the same make, model and size. This is not essential, but RAID works best when both disks are the same size.

Step 2 - Click "Start" -> "Settings" -> "Control Panel". Then in "Control Panel" select "Administrative Tools", and once you have done this select "Computer Management".

Step 3 - Now select "Disk Management" from the left hand side. The Disk Manager will load, which may take some time. You will see the list of volumes in the top right pane, and a more detailed view below.

Step 4 - Now right-click on the unallocated space on the first of the two disks you want to use, and select "Create Volume".

Step 5 - Click "Next" in the Create Volume Wizard.

Step 6 - Click "Mirrored Volume" under "Volume Type", and then click "Next."

Step 7 - In the left pane under "Select Two or More Disks", a list of all disks that have enough free unallocated space to be used is shown.

Step 8 - In the right pane under "Selected Dynamic Disks", the disk that you right-clicked in Step 4 is displayed.

Step 9 - In the left pane under "All Available Dynamic Disks", click the other disk you want to use, and then click "Add."

Step 10 - Click "Next", then select "Assign Drive Letter" and give the volume a drive letter.

Step 11 - Now to format the volume: click "Format this partition with the following settings", choose "NTFS", leave the default "Allocation Unit Size", then give the drive a "Volume Label". Click "Next" and then click "Finish". Once complete, reboot the computer and the mirrored RAID array is ready to go.


How to Set Up RAID 5 (Striping with Parity) in Microsoft Windows 2000 Server

If you are going to use RAID 5 you are better off using hardware, as RAID 5 can get very slow on a software implementation. I suggest a RAID controller card if you are thinking about RAID 5.


Why Should I Use RAID?

Well, the answer could be yes. It depends on what your needs are, how much money you have and how much data you are willing to lose. The general guideline is that if your business relies on your data then the answer is yes. If you are a home user the answer could also be yes, depending on how valuable your data is and whether you have a way of backing it up.

Many home users don't back up anything at all, mostly because they don't realise that hardware can break and hard disks do go wrong, or because they do not have the technical know-how or money to implement some sort of backup.

Many home PCs have hard disks of 80GB or more, so how are you supposed to back this up? Without using expensive tape drives, the only way is to use RAID and/or hot swap backup drives.

But overall the question "Why should I use RAID?" can be answered with another question, "How much data can you afford to lose?"


Mirroring Vs. Parity

The technique (or techniques) used to provide redundancy in a RAID array is a primary differentiator between levels. Redundancy is provided in most RAID levels through the use of mirroring or parity (which is implemented with striping):

  • Mirroring: Single RAID level 1, and multiple RAID levels 0+1 and 1+0 ("RAID 10"), employ mirroring for redundancy. One variant of RAID 1 includes mirroring of the hard disk controller as well as the disk, called duplexing.
  • Striping with Parity: Single RAID levels 2 through 7, and multiple RAID levels 0+3 (aka "53"), 3+0, 0+5 and 5+0, use parity with striping for data redundancy.
  • Neither Mirroring nor Parity: RAID level 0 is striping without parity; it provides no redundancy.
  • Both Mirroring and Striping with Parity: Multiple RAID levels 1+5 and 5+1 have the "best of both worlds", both forms of redundancy protection.
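The choice between mirroring and parity largely comes down to how much disk space is given up for redundancy. As a rough guide, the usable capacity of the main levels can be worked out as in the sketch below. It assumes identical disks and the usual overheads (one disk's worth of parity for levels 3, 4 and 5, two for level 6, half the disks for mirrored layouts); the function name `usable_capacity` is made up for this example.

```python
def usable_capacity(level, num_disks, disk_size_gb):
    """Usable space in GB for an array of identical disks."""
    if level == "0":
        return num_disks * disk_size_gb         # no redundancy at all
    if level == "1":
        return disk_size_gb                     # every disk holds the same data
    if level in ("3", "4", "5"):
        return (num_disks - 1) * disk_size_gb   # one disk's worth of parity
    if level == "6":
        return (num_disks - 2) * disk_size_gb   # two disks' worth of parity
    if level in ("10", "0+1"):
        return (num_disks // 2) * disk_size_gb  # half the disks are mirrors
    raise ValueError("unsupported RAID level: %s" % level)


for level in ("0", "5", "6", "10"):
    print("RAID", level, usable_capacity(level, num_disks=4, disk_size_gb=80), "GB")
# RAID 0 320 GB
# RAID 5 240 GB
# RAID 6 160 GB
# RAID 10 160 GB
```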

Related Links

http://www.acnc.com/04_01_00.html - Excellent simple RAID explanations.

http://www.pcguide.com/ref/hdd/perf/raid/levels/index.htm - A more detailed explanation of RAID.

http://www.prepressure.com/techno/raid.htm - Explanation with nice diagrams.


Resources Used

"Operating System Concepts (Sixth Edition)" - Silberschatz/Galvin/Gagne, 2002, John Wiley & Sons Inc. ISBN: 0-471-41743-2
