RAID (Redundant Array of Inexpensive Disks) is a way of organising and using
hard disks to increase performance or provide fault tolerance. RAID can also be
set up to provide both functions at the same time.
RAID is a set of two or more ordinary hard disks and a specialized disk
controller that contains the RAID functionality. Developed initially for servers
and stand-alone disk storage systems, RAID is increasingly popular in desktop
PCs.
RAID can be implemented in software or in hardware. Both do essentially the
same thing, except that hardware is generally faster, especially when
rebuilding after a failure.
RAID Levels
RAID can be set up in a plethora of different ways, each of which is called a
level. The levels covered here range from Level 0 to Level 53. The
characteristics, applications, advantages and disadvantages of each are
discussed below.
Additional RAID Information
Additional information regarding RAID and its implementations is discussed
below, including implementation in hardware or software and how software RAID
can be set up in Windows 2000 Server.
RAID Level 0 - Striping
RAID 0 is a disk array where the data is broken down into blocks and each
block is written to a different drive. Normally a striped disk array has 2
disks, with one part of the data written to drive 1 and the next part to drive
2. So a file which is split into blocks A, B, C and D would be written to the
drives like this: Drive 1 = A and C, Drive 2 = B and D.
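The A, B, C, D example above can be sketched in a few lines of Python. This is only a toy model: the function names `stripe` and `unstripe` are made up for illustration, and a real array stripes fixed-size blocks on physical disks rather than letters in lists.

```python
def stripe(blocks, num_drives):
    """Distribute blocks round-robin across num_drives drives."""
    drives = [[] for _ in range(num_drives)]
    for index, block in enumerate(blocks):
        drives[index % num_drives].append(block)
    return drives


def unstripe(drives):
    """Rebuild the original block order by reading the drives in turn."""
    total = sum(len(drive) for drive in drives)
    count = len(drives)
    return [drives[i % count][i // count] for i in range(total)]


layout = stripe(["A", "B", "C", "D"], 2)
print(layout)            # [['A', 'C'], ['B', 'D']]
print(unstripe(layout))  # ['A', 'B', 'C', 'D']
```

Note that `unstripe` needs every drive to be present, which is the lack of fault tolerance in miniature.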
Advantages
- Input and output performance is greatly improved, as the computer can read
from the two drives simultaneously and thus spread the load across both.
- No parity calculation overhead is involved.
- Very Simple Design
- Easy to Implement.
Disadvantages
- No fault tolerance, i.e. if one of the disks breaks all the data is lost.
Even though the other disk might be okay, both disks are needed to get any
data back.
- Should never be used in mission critical applications.
Applications
Striping is useful for applications where you need maximum input and output
performance; these are normally video and audio applications.
- Video Production and Editing
- Image Editing
- Pre-Press Applications
- Any application requiring high bandwidth
RAID Level 1 - Mirroring
Mirroring is where the data is duplicated in its entirety and copied to one
or more additional disks, so the data is "mirrored" across 2 or more disks.
This gives 100% redundancy of data, but it also carries a large overhead in
disk space. This is one of the best forms of RAID as far as redundancy and
speed of disaster recovery are concerned. In the diagram below the data "A" is
duplicated on both disk 1 and disk 2.
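As a rough sketch of the idea, the toy class below writes every block to all disks and reads from the first disk that is still alive. The class name `MirroredVolume` and its methods are invented for this illustration and do not correspond to any real RAID API.

```python
class MirroredVolume:
    """Toy RAID 1 volume: every write goes to all working disks."""

    def __init__(self, num_disks=2):
        # Each disk is modelled as a dict: block number -> data.
        self.disks = [{} for _ in range(num_disks)]
        self.failed = set()

    def write(self, block_no, data):
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block_no] = data

    def fail(self, disk_no):
        """Simulate a dead disk: its contents are gone."""
        self.failed.add(disk_no)
        self.disks[disk_no] = {}

    def read(self, block_no):
        # Any surviving mirror can serve the read.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[block_no]
        raise IOError("all mirrors have failed")


volume = MirroredVolume()
volume.write(0, "A")
volume.fail(0)
print(volume.read(0))  # A
```

The data only becomes unreadable once every disk in the mirror has failed.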
Advantages
- Increased read throughput, as one write or two reads are possible per
mirrored pair.
- 100% data redundancy means no rebuild is necessary in case of disk
failure; the contents can just be copied to the new disk.
- In certain circumstances RAID 1 can sustain multiple drive failures. (Very
Cool!)
- Simplest RAID storage subsystem design.
Disadvantages
- Highest disk overhead of all RAID types: 100% redundancy means higher cost
for the extra disk space, and inefficiency.
- RAID 1 can be done in software, however this can cause slowdown on high
load servers; hardware is recommended, at additional cost.
- "Hot Swap" of a failed disk may not be implemented in software RAID.
Applications
Anything where you have to have the data with very high availability, such as
the data that forms the basis of a person's or a company's operations. (This is
what I use on my fileserver.)
- Accounting
- Payroll
- Financial
- All other critical data storage applications.
RAID Level 2 - Hamming Code ECC
The Hamming Code ECC RAID system is quite complicated. It requires many
disks, some used as storage and the rest used to store ECC (Error Correction
Codes) for the data stored on the other disks. Using the example below, the
idea is that each bit of a data word is written to one of the data disk drives
(4 of them, labeled 0 to 3). Each data word written to the disks also has its
Hamming Code ECC recorded on the ECC disks. When the data is read back, the ECC
code verifies the data or corrects single disk errors.
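To make the error correction concrete, here is a small sketch using the classic Hamming(7,4) code: 4 data bits (one per data disk) plus 3 check bits (one per ECC disk). The choice of Hamming(7,4) and the function names are assumptions made for this illustration; the text above does not say which code size a real RAID 2 array would use.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit word (positions 1..7).

    Layout by position: p1 p2 d1 p4 d2 d3 d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(word):
    """Fix at most one flipped bit and return the 4 data bits."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error detected
    if position:
        w[position - 1] ^= 1          # flip the faulty bit back
    return [w[2], w[4], w[5], w[6]]


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                       # one "disk" returns a wrong bit
print(hamming74_correct(damaged))     # [1, 0, 1, 1]
```

With this particular code, 3 of every 7 disks hold ECC, which illustrates the high ratio of ECC disks to data disks for small word sizes.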
Advantages
- "On the fly" data error correction.
- Very high data transfer rates possible.
- Relatively simple controller design compared to RAID levels 3, 4 & 5.
Disadvantages
- Very high ratio of ECC disks to data disks when smaller word sizes are
used.
- Very expensive; when you've got to have the highest possible transfer
rates you can probably justify it.
- No commercial implementations exist and it's not very commercially viable
due to the cost!
Applications
As there are no commercial implementations there are no real applications,
however things like video editing, digital movie production or "Toy Story" type
CGI (Computer Generated Imagery) movies might find this useful.
RAID Level 3 - Bit Interleaved Parity Organisation
RAID 3 is a commonly used RAID level. It is common because it is of moderate
cost to implement but has very high read and write transfer rates. It also
handles a drive failure well, and such a failure has no real impact on the read
and write speeds (throughput) of the array. The example below uses 3 disks to
implement RAID 3, which is the minimum number of drives required; other flavors
of RAID 3 can have 4 or 5 drives. The basic idea is that the data is striped
across multiple disks. At the same time the parity information is sent to a
dedicated parity disk. Any one of the disks can fail without loss of data. The
dedicated parity disk can be a performance bottleneck as it must be accessed
every time something accesses the array.
The stripe parity is generated on writes and recorded on the parity disk, and
then checked on the reads.
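Parity here is usually a simple XOR across the data disks. The sketch below models the 3-disk case (two data disks and one parity disk) with byte strings; the helper name `xor_blocks` and the sample contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data disks and one dedicated parity disk.
data_disks = [b"RAID", b"demo"]
parity_disk = xor_blocks(data_disks)

# Data disk 0 fails: rebuild it from the surviving disk and the parity.
rebuilt = xor_blocks([data_disks[1], parity_disk])
print(rebuilt)  # b'RAID'
```

The same XOR that generates the parity also rebuilds a lost disk, which is why any one disk can fail without loss of data.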
Advantages
- Very high read data transfer rate.
- Very high write data transfer rate.
- Disk failure has an insignificant impact on throughput.
- Low ratio of ECC (Parity) disks to data disks means high efficiency.
Disadvantages
- Transaction rate equal to that of a single drive at best (if spindles are
synchronised.)
- Controller design is fairly complex.
- Very difficult and resource intensive to do as a "software"
RAID.
Applications
- Video Production and live streaming
- Image Editing
- Video Editing
- Any application requiring high throughput.
RAID Level 4 - Block Interleaved Parity Organisation
RAID 4 is like RAID 3, except that entire blocks of data are written onto
each data disk (striped by blocks) rather than striping bytes as in RAID 3. By
doing this, random access performance is increased compared to RAID 3, but the
dedicated parity disk still remains a bottleneck.
The system works in the same way, generating parity on writes and recording
it to the parity disk; this is checked when data is read from the array.
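One way to see why the parity disk is a bottleneck: with XOR parity, a write to a single block can update the parity from the old data, the new data and the old parity, without reading the other data disks, but it always has to touch the parity disk. The sketch below assumes XOR parity; the function names are invented for this illustration.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data, new_data, old_parity):
    """new parity = old parity XOR old data XOR new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# Three data disks, one block each, plus the dedicated parity disk.
blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])

# Overwrite the block on data disk 1: only that disk and the
# parity disk are read and written.
new_block = b"ZZZZ"
parity = updated_parity(blocks[1], new_block, parity)
blocks[1] = new_block

# The shortcut gives the same result as recomputing from scratch.
full = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])
print(parity == full)  # True
```

Every block write, whichever data disk it lands on, ends with a write to the same parity disk.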
Advantages
- Very high read data transfer rate.
- High aggregate read transfer rate.
- Low ratio of ECC (Parity) disks to data disks means high efficiency.
- Better random access performance than RAID 3 because of storing data as
blocks rather than as bits.
Disadvantages
- Worst Write transaction rate and Write aggregate transfer rate.
- Controller design is fairly complex.
- Difficult and inefficient data rebuild in the event of disk failure.
Applications
- Video Production and live streaming
- Image Editing
- Video Editing
- Any application requiring high throughput, but with faster random access
times.
RAID Level 5 - Block Interleaved Distributed Parity
The basic idea is the same as RAID 3 and 4: when data is written to disk,
parity is generated and then stored on another disk (i.e. in a distributed
location). On reads the data is read back and its parity is checked. As you can
see from the diagram, the parity for drive 1 is stored on drive 2, the parity
for drive 2 is stored on drive 3, and so on. In reality the blocks are written
in a more complex way, so that parity is spread across all the drives.
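The sketch below prints one possible rotating layout, in which the parity block moves one disk to the left on each stripe. The exact rotation differs between controllers and software implementations, so the formula used here is an assumption for illustration, as is the function name `raid5_layout`.

```python
def raid5_layout(num_disks, num_stripes):
    """Return rows of block labels, one row per stripe.

    "Dn" is data block n, "Pn" is the parity block of stripe n.
    """
    rows = []
    data_no = 0
    for stripe in range(num_stripes):
        # Assumed rotation: parity starts on the last disk and
        # moves one disk to the left with every stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P%d" % stripe)
            else:
                row.append("D%d" % data_no)
                data_no += 1
        rows.append(row)
    return rows


for row in raid5_layout(3, 3):
    print(" ".join(row))
# D0 D1 P0
# D2 P1 D3
# P2 D4 D5
```

Because no single disk holds all the parity, writes are not funnelled through one dedicated parity disk as they are in RAID 3 and 4.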
Advantages
- Highest read data transaction rate.
- Medium write data transaction rate.
- Low ratio of ECC (Parity) disks to data disks means high efficiency.
- Good aggregate transfer rate.
Disadvantages
- A failure of a disk has an average impact on throughput.
- Most complex controller design, means highest cost.
- Difficult to rebuild the data in the event of a disk failure, compared to
RAID 1.
- Individual block data transfer rate the same as a single disk.
Applications
- Servers of any description really: File and Application Servers.
- Database Servers
- Web Servers, E-Mail and News Servers
- Intranet Servers
RAID Level 6 - P+Q Redundancy Scheme
RAID 6 is like RAID 5 except that it has two distributed parity schemes,
which increases the fault tolerance of the array. Data is striped at block
level like in RAID 5, and when data is written a second set of parity is
generated and written across all the drives, like with RAID 5 but twice. This
gives very high fault tolerance, and the array can sustain two simultaneous
drive failures, pretty neat!
This is the ideal solution for a mission critical application, such as
storage of a company's files, where without the data there would be no company.
Advantages
- Some of the advantages of RAID 5
- Very high fault tolerance that allows for multiple simultaneous drive
failures.
Disadvantages
- More complex controller design.
- Controller overhead to compute parity address is very high because of the
2-Dimensional parity scheme.
- Write performance not as good as RAID 5.
- Requires N+2 drives to implement because of the 2-Dimensional array.
Applications
- Servers of any description really: File and Application Servers.
- Database Servers
- Web Servers, E-Mail and News Servers
- Intranet Servers
- Mission Critical Storage Applications
RAID Level 7 - Optimised Asynchrony
This is an uncommon, custom built RAID array, but it has very high I/O rates
as well as very high data transfer rates. All the data on the disks is cached
and parity calculated; the cache helps ensure that the array works with very
high I/O and data rates. A real time operating system controls the array and
its processors.
Advantages
- Write performance is between 25% and 90% better than single spindle
performance and is between 1.5 and 6 times better than other array levels.
- The host interfaces are scalable for more systems to be connected or for
increased bandwidth to each host.
- Small reads in this multi user environment can have very high cache hit
rate giving very short access times.
- The more drives in the array the better the write performance.
- Access times decrease as the number of actuators in the array increases.
- No extra data transfers required when manipulating the parity data.
Disadvantages
- Only one vendor supplies this RAID solution.
- Extremely high cost per MB.
- Very short warranty.
- Not user serviceable.
- Must have a UPS to protect cache data.
Applications
- It's really fast, and when a very reliable and very high speed multi-user
solution is needed this is the one to choose.
RAID 7 is a registered trademark of Storage Computer
Corporation.
RAID Level 10 - Mirroring and Striping
Mirroring has good redundancy but poor performance, while Striping has good
performance but no redundancy. Wouldn't it be great to have the best of both?
Well you can: using RAID level 10 you combine Mirroring with Striping, so you
get the redundancy of Mirroring with the performance of Striping.
RAID 10 works by having a striped array whose segments are RAID 1 arrays. So
when data is written to the array, it is striped across the mirrored pairs, and
each stripe is mirrored onto both disks of its pair.
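As a minimal sketch of that mapping, the function below takes a logical block number and returns the two physical disks that hold it. The disk numbering (pair 0 is disks 0 and 1, pair 1 is disks 2 and 3) and the function name are assumptions for illustration only.

```python
def raid10_disks(block_no, num_pairs):
    """Return the two physical disks that store a logical block."""
    pair = block_no % num_pairs        # RAID 0 part: pick a mirrored pair
    return (2 * pair, 2 * pair + 1)    # RAID 1 part: both disks of the pair


# Four disks arranged as two mirrored pairs.
for block in range(4):
    print(block, raid10_disks(block, 2))
# 0 (0, 1)
# 1 (2, 3)
# 2 (0, 1)
# 3 (2, 3)
```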
Advantages
- RAID 10 has the same fault tolerance of RAID 1.
- RAID 10 has the same overhead as RAID 1.
- Good solution for a need that requires redundancy but with better
performance.
- Higher read/write is achieved by striping RAID 1 segments.
Disadvantages
- Because a minimum of 4 disks is required to implement RAID 10, it's
expensive and has high overhead.
- All drives must be made to move in parallel to ensure performance.
Applications
- Any application RAID 1 might be used for, but an application that needs
better performance than what RAID 1 can provide.
RAID Level 53 - RAID 3 with Striping
RAID 53 is a combination of RAID 3 and RAID 0. It basically works by having a
striped (RAID 0) array whose segments are each a normal RAID 3 array (data
disks and a parity disk). The striping gives it the higher IO rates. For
example the data part "A" exists in the striped layer; this data is also spread
across the 2 data disks of a RAID 3 array and its parity is stored on the
parity disk of that RAID 3 array.
Advantages
- RAID 53 has the same fault tolerance as RAID 3.
- High data transfer rates because of the striping of data from the RAID 3
array.
- Small requests achieve high IO rates because of the RAID 0 striping.
- Like a faster version of RAID 3.
Disadvantages
- Very expensive to implement.
- All the disks must have their spindles synchronized; this is difficult and
requires specific hardware.
- If striping at byte level is used, the drive's formatted capacity is not
fully used.
Applications
- Anything that would use RAID 3 but needs a bit more oomph!
- Video Production and live streaming
- Image Editing
- Video Editing
- Any application requiring high throughput.
RAID Level 0 + 1 - Mirroring and Striping
RAID Level 0 + 1 looks like RAID 10, but in fact it is different. Rather than
striping across RAID 1 mirrored pairs as RAID 10 does, RAID 0 + 1 uses two RAID
0 arrays that are mirrored to form a RAID 1 array. Basically it is a mirrored
array whose segments are RAID 0 arrays.
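The practical difference shows up when two disks fail. The sketch below enumerates every possible two-disk failure on a 4-disk array and counts how many each arrangement survives. The disk numbering is an assumption for illustration: disks 0 and 1 form one group and disks 2 and 3 the other (a mirrored pair in RAID 10, a stripe set in RAID 0 + 1).

```python
from itertools import combinations

GROUPS = [(0, 1), (2, 3)]


def raid10_survives(failed):
    """RAID 10: each mirrored pair needs at least one working disk."""
    return all(not set(pair) <= set(failed) for pair in GROUPS)


def raid01_survives(failed):
    """RAID 0 + 1: at least one stripe set must be completely intact."""
    return any(not (set(stripe_set) & set(failed)) for stripe_set in GROUPS)


failures = list(combinations(range(4), 2))
print(sum(raid10_survives(f) for f in failures), "of", len(failures))  # 4 of 6
print(sum(raid01_survives(f) for f in failures), "of", len(failures))  # 2 of 6
```

In this model RAID 10 survives more of the two-disk failures, because losing one disk in RAID 0 + 1 already takes out a whole stripe set.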
Advantages
- RAID 0 + 1 has the same fault tolerance as RAID 5.
- RAID 0 + 1 has the same overhead for fault tolerance as mirroring alone.
- High I/O rates are achieved as the disks are striped in RAID 0.
- Good solution for high performance but not maximum reliability.
Disadvantages
- RAID 0 + 1 is not like RAID 10, as a single drive failure will make the
whole array become a RAID 0 array.
- Expensive because of the high overheads.
- All drives must move in parallel to ensure performance.
- Limited scalability at a very high cost.
Applications
- Audio/Video requiring high I/O rates but with redundancy.
- High Performance Fileserver.
Hardware RAID
Hardware RAID uses dedicated hardware to implement the array as opposed to
using software to control the array. The RAID controller cards have powerful
microprocessors on board to reduce the amount of processing required by the CPU.
Now for the rundown of the advantages and disadvantages of hardware RAID:
Advantages
- Performance - It's quicker and more powerful than software RAID.
- Boot Volume included in the RAID - As hardware is below all OS software
levels you can include in the array the boot volume.
- Level Support - Hardware supports all the RAID levels.
- Hot Swappable - Most hardware implementations allow for hot swapping of
disks.
- No/Very Few Operating System Compatibility Problems.
- No/Very Few Software Compatibility Issues.
- Reliability - It's more reliable as it's implemented in hardware; there is
far less chance of bugs.
Disadvantages
- Cost - It costs more to implement in hardware so the system will cost
more.
- Not Simple - Depending on the complexity of the controller the RAID
configuration and installation can be more complicated than that of software
RAID.
Software RAID
Software RAID works like hardware RAID but is implemented without dedicated
hardware; instead the OS (operating system or drivers) implements the RAID
array. Most flavors of UNIX, as well as Windows NT and 2000, have some sort of
software implementation of RAID.
Now for the rundown of the advantages and disadvantages of software RAID.
Advantages
- Cost - If your OS already supports RAID, no additional costs for hardware
needed.
- Simplicity - You don't need to install or configure a RAID hardware
controller.
Disadvantages (There's a lot!)
- Performance - Software RAID is not as quick and gives worse system
performance than hardware implementations.
- Boot Volume Limitations - Software RAID can only be implemented on Data
partitions not the boot partition, meaning separate partitions are needed
for OS booting.
- Level Support - Normally only 0, 1 and 5 are implementable in software,
sometimes you want something more than this!
- No Hot-Swap or Drive Spares - Software doesn't normally allow these.
- Portability - A software array created under one OS is not normally
compatible with the software RAID of another OS, which is a problem in
multiple OS environments.
- Compatibility - Some software, especially low-level utilities, may conflict
with software RAID implementations.
- Reliability - Software does have bugs; hardware might too but this is much
more unlikely. The equation explains the problem: Software RAID + Bugs =
No Data!
Although software RAID seems to have nothing going for it, the advantages it
does have are possibly the most important for a business, especially a small
business: money and time (simplicity). I run software RAID with no real
problems, so it is okay and cheap, but if you're serious go for hardware.
How to Setup RAID 0 (Striping) in Microsoft Windows 2000 Server
To set up a software RAID 0 stripe set in Microsoft Windows 2000 Server, follow the steps below. Note that striping provides no redundancy.
Note! You can't use RAID on the system or
boot volume, you need two (or more) separate data disks!
Step 1 - Ensure that you have 2 hard disks installed in your computer and
ready to be used for RAID. Ideally they should be the same make, model and
size. Although this is not the most important thing, RAID works best when both
are the same size.
Step 2 - Click "Start" -> "Settings" -> "Control Panel". Then in "Control
Panel" select "Administrative Tools", and once you have done this select
"Computer Management".
Step 3 - Now select "Disk Management" from the left hand side. The Disk
Manager will load; this may take some time. You will see the list of volumes in
the top right pane, and a more detailed view below.
Step 4 - Now "right click" on the unallocated space on the first of the two
disks you want to use, and select "Create Volume".
Step 5 - Click "Next" in the Create Volume Wizard.
Step 6 - Click "Striped Volume" under "Volume Type", and then click "Next."
Step 7 - In the left pane under "Select Two or More Disks", a list of all
disks that have enough unallocated space to be used is shown.
Step 8 - In the right pane under "Selected Dynamic Disks", the disk that you
right-clicked in Step 4 is displayed.
Step 9 - In the left pane under "All Available Dynamic Disks", click the disk
you want to add, and then click "Add."
Step 10 - Click "Next", select "Assign Drive Letter" and give the volume a
drive letter.
Step 11 - Now to format the volume, click "Format this partition with the
following settings", choose "NTFS", leave the default "Allocation Unit Size",
then give the volume a "Volume Label". Click "Next" and then click "Finish".
Once complete, reboot the computer and the striped RAID array is ready to go.
How to Setup RAID 1 (Mirroring) in Microsoft Windows 2000 Server
To set up a software RAID 1 mirrored set in Microsoft Windows 2000 Server,
follow the steps below. Note that mirroring stores every block twice, so only
half of the total disk space is usable.
Note! You can't use RAID on the system or
boot volume, you need two (or more) separate data disks!
Step 1 - Ensure that you have 2 hard disks installed in your computer and
ready to be used for RAID. Ideally they should be the same make, model and
size. Although this is not the most important thing, RAID works best when both
are the same size.
Step 2 - Click "Start" -> "Settings" -> "Control Panel". Then in "Control
Panel" select "Administrative Tools", and once you have done this select
"Computer Management".
Step 3 - Now select "Disk Management" from the left hand side. The Disk
Manager will load; this may take some time. You will see the list of volumes in
the top right pane, and a more detailed view below.
Step 4 - Now "right click" on the unallocated space on the first of the two
disks you want to use, and select "Create Volume".
Step 5 - Click "Next" in the Create Volume Wizard.
Step 6 - Click "Mirrored Volume" under "Volume Type", and then click "Next."
Step 7 - In the left pane under "Select Two or More Disks", a list of all
disks that have enough unallocated space to be used is shown.
Step 8 - In the right pane under "Selected Dynamic Disks", the disk that you
right-clicked in Step 4 is displayed.
Step 9 - In the left pane under "All Available Dynamic Disks", click the disk
you want to add, and then click "Add."
Step 10 - Click "Next", select "Assign Drive Letter" and give the volume a
drive letter.
Step 11 - Now to format the volume, click "Format this partition with the
following settings", choose "NTFS", leave the default "Allocation Unit Size",
then give the volume a "Volume Label". Click "Next" and then click "Finish".
Once complete, reboot the computer and the mirrored RAID array is ready to go.
How to Setup RAID 5 (Striping with Parity) in Microsoft Windows 2000 Server
If you are going to use RAID 5 you are better off using hardware, as RAID 5
can get very slow on a software implementation. I suggest a RAID controller card
if you are thinking about RAID 5.
Why Should I Use RAID?
Well, if you have read the above the answer could be yes. It depends on what
your needs are, how much money you have and how much data you are willing to
lose. The general guideline is that if your business relies on your data, then
the answer is yes. If you are a home user the answer could be yes, depending on
how valuable your data is and whether you have a way of backing data up.
Many home users don't back up anything at all, mostly because they don't
realise that hardware can break and hard disks do go wrong, or because they do
not have the technical know-how or money to implement some sort of backup.
Many home PCs have 80GB or larger hard disks, so how are you supposed to back
this up? Without using expensive tape drives, the only way is to use RAID
and/or hot swap backup drives.
But overall the question "Why should I use RAID?" can be answered
with another question, "How much data can you afford to lose?"
Mirroring Vs. Parity
The technique (or techniques) used to provide redundancy in a RAID array is a
primary differentiator between levels. Redundancy is provided in most RAID
levels through the use of mirroring or parity (which is implemented with
striping):
- Mirroring: Single RAID level 1, and multiple RAID levels
0+1 and 1+0 ("RAID 10"), employ mirroring for redundancy. One
variant of RAID 1 includes mirroring of the hard disk controller as well as
the disk, called duplexing.
- Striping with Parity: Single RAID levels 2 through 7, and
multiple RAID levels 0+3 (aka "53"), 3+0, 0+5 and 5+0, use parity
with striping for data redundancy.
- Neither Mirroring nor Parity: RAID level 0 is striping
without parity; it provides no redundancy.
- Both Mirroring and Striping with Parity:
Multiple RAID levels 1+5 and 5+1 have the "best of both worlds",
both forms of redundancy protection.
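The disk overhead of mirroring versus parity can be compared with a little arithmetic. The sketch below assumes identical disks, treats RAID 1 as a single mirror set (so one disk's worth of space is usable however many copies there are), and assumes two-way mirroring for RAID 10 and 0+1. The function name `usable_capacity` is made up for illustration.

```python
def usable_capacity(level, num_disks, disk_gb):
    """Usable space in GB for an array of identical disks."""
    if level == "0":
        data_disks = num_disks          # no redundancy at all
    elif level == "1":
        data_disks = 1                  # every disk holds the same data
    elif level in ("3", "4", "5"):
        data_disks = num_disks - 1      # one disk's worth of parity
    elif level == "6":
        data_disks = num_disks - 2      # two disks' worth of parity
    elif level in ("10", "0+1"):
        data_disks = num_disks // 2     # assumes two-way mirroring
    else:
        raise ValueError("level not covered by this sketch: " + level)
    return data_disks * disk_gb


# Four 80GB disks in each arrangement.
for level in ("0", "1", "5", "6", "10"):
    print("RAID", level, usable_capacity(level, 4, 80), "GB")
# RAID 0 320 GB
# RAID 1 80 GB
# RAID 5 240 GB
# RAID 6 160 GB
# RAID 10 160 GB
```

Parity levels give up a fixed number of disks regardless of array size, while mirroring always gives up at least half.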
Related Links
http://www.acnc.com/04_01_00.html
- Excellent simple RAID explanations.
http://www.pcguide.com/ref/hdd/perf/raid/levels/index.htm
- A more detailed explanation of RAID.
http://www.prepressure.com/techno/raid.htm
- Explanation with nice diagrams.
Resources Used
"Operating System Concepts (Sixth Edition)" - Silberschatz/Galvin/Gagne,
2002, John Wiley & Sons Inc. ISBN: 0-471-41743-2