Nested-RAID: RAID-5 and RAID-6 Based Configurations

A Nested RAID configuration is built on top of standard single-level RAID configurations in order to address their performance and data redundancy limitations. Digging deeper into Nested RAID, we check out the configurations based on RAID-5 and RAID-6, which have some truly amazing features (if you have enough drives).

In some cases there are several ways to take a pool of drives and create a RAID-05 array. For example, if you had 12 drives, you could create three RAID-0 groups of four drives each and combine them with RAID-5, or you could create four RAID-0 groups of three drives each and combine them with RAID-5 (as in Figure 1). The one rule you have to stick to is that a RAID-05 array needs at least six drives and the total number of drives cannot be prime. This means that you can create a RAID-05 array from nine drives but not from eleven or thirteen.
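
If it helps to see the rule spelled out, here is a minimal Python sketch (the function name and the example drive counts are mine, not from the article) that enumerates the possible ways to split a pool of drives into RAID-0 groups for a RAID-05 array. The upper RAID-5 level needs at least three RAID-0 groups, and each group needs at least two drives, which is why six drives is the minimum and a prime drive count never works.

def raid05_layouts(total_drives):
    # Enumerate (raid0_groups, drives_per_group) splits for a RAID-05 array.
    # The upper RAID-5 level needs at least three RAID-0 "drives" (groups),
    # and each RAID-0 group needs at least two physical drives.
    layouts = []
    for groups in range(3, total_drives // 2 + 1):
        if total_drives % groups == 0:    # the groups must divide the pool evenly
            layouts.append((groups, total_drives // groups))
    return layouts

for n in (6, 9, 11, 12, 13):
    print(n, "drives:", raid05_layouts(n) or "no valid RAID-05 layout")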

The fault tolerance of RAID-05 is really limited to one drive. If you lose one or more drives in a RAID-0 group, then you lose that entire RAID-0 group. This means that the upper-level RAID-5 thinks it has lost one “drive,” which is the maximum it can tolerate. You could lose another drive in the same RAID-0 group that has already lost a drive without losing data, but that depends on which drives fail in what order, which is not something you can control.

If you lose a drive in a RAID-05 configuration, then every drive in the configuration is involved in the rebuilding process (either written to or read from). For example, in Figure 1, if we lose disk 4 (fourth drive from the left), the entire RAID-0 group it belongs to (disks 4-6) has failed. Once a replacement drive is inserted into the configuration, disks 1-3 and disks 7-12 must be read to reconstruct the data for the failed RAID-0 group, and disks 4-6 must be written to as part of the rebuilding process. As with RAID-01, putting the most redundant RAID level at the highest point in the Nested RAID array causes all of the drives to be involved in rebuilding the array in the event of a drive failure.

Table 1 below is a quick summary of RAID-05 with a few highlights.

Table 1 – RAID-05 Highlights

Raid Level: RAID-05

Pros:
  • Excellent read performance because of both the striping (RAID-0) and RAID-5.
  • Very good write performance because of the striping (RAID-0). But RAID-5 reduces the performance relative to a RAID-01 or RAID-10 configuration.
  • Reasonable data redundancy (can tolerate the loss of any one disk)
  • Reasonable storage efficiency and capacity

Cons:
  • You have to use at least 6 drives (large number of drives)
  • All disks have to be used in a rebuild

Storage Efficiency: (Number of RAID-0 groups – 1) / (Number of RAID-0 groups)

Minimum Number of Disks: 6 (Note: you have to use a non-prime number of disks)
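
As a quick check of the Table 1 entries, the following sketch (my own helper names, assuming identical drives) computes the storage efficiency and the usable capacity for the Figure 1 layout of four three-drive RAID-0 groups.

def raid05_efficiency(raid0_groups):
    # The upper RAID-5 level gives up one RAID-0 group's worth of capacity.
    return (raid0_groups - 1) / raid0_groups

def raid05_capacity(disk_size, raid0_groups, drives_per_group):
    # All groups but one hold data; each group holds drives_per_group disks.
    return disk_size * drives_per_group * (raid0_groups - 1)

print(raid05_efficiency(4))           # 0.75 (75%) for Figure 1
print(raid05_capacity(1000, 4, 3))    # 9000 for twelve hypothetical 1000 GB drives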

RAID 5+0 (RAID-50)

The “companion” to RAID-05 is RAID-50. In this case, we use RAID-5 groups at the lowest level and use RAID-0 to combine them at the highest level. Figure 2 below is an example showing the data layout for 12 identical drives.

Figure 2: RAID-50 layout with Twelve Drives

In this example, there are three drives in each of the four RAID-5 groups that are combined using RAID-0 at the highest level. The parity block for each stripe is noted as [x-y], where x-y are the chunks of data on that particular RAID-5 group. So Ap[3-4] refers to the parity for data chunks A3 and A4.

The capacity of the RAID-50 configuration is computed in much the same way as RAID-05, but the formula is slightly different:

Capacity = min(disk sizes) * (Number of drives in each RAID-5 group - 1) * (Number of RAID-5 groups)

For Figure 2, this means,

Capacity = min(disk size) * (3 -1) * 4
Capacity = min(disk size) * 8

This is slightly different from Figure 1, but then the two configurations are not really “mirror” images of one another.

The storage efficiency for RAID-50 is also fairly easy to compute:

Storage Efficiency = (Number of drives in each RAID-5 group - 1) / (Number of drives in each RAID-5 group)

For Figure 2, the resulting storage efficiency is,

Storage Efficiency = (3 - 1) / (3)
Storage Efficiency = 0.67 (67%)
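
To keep the RAID-50 formulas straight, here is a similar minimal sketch (again, my own helper names and example disk size) that computes the capacity and storage efficiency from the number of RAID-5 groups and the drives per group, using the Figure 2 layout.

def raid50_capacity(disk_size, drives_per_raid5_group, raid5_groups):
    # Each RAID-5 group gives up one drive to parity; RAID-0 adds the groups together.
    return disk_size * (drives_per_raid5_group - 1) * raid5_groups

def raid50_efficiency(drives_per_raid5_group):
    # The efficiency depends only on the size of each RAID-5 group.
    return (drives_per_raid5_group - 1) / drives_per_raid5_group

print(raid50_capacity(1000, 3, 4))    # 8000 for twelve hypothetical 1000 GB drives (Figure 2)
print(raid50_efficiency(3))           # 0.666... (67%)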

Please note that Figure 2 is not the “mirror” of Figure 1 so the storage efficiency of Figure 2 is not the same as Figure 1. However, we could have easily changed Figure 2 to use four drives in each RAID-5 group and then combine three of them with RAID-0 and achieved the same storage efficiency as RAID-05 in Figure 1. Moreover, the performance of RAID-50 is about the same as RAID-05 with any differences usually being attributed to implementation details.

As with RAID-05, RAID-50 has to use a minimum of six disks and the total number of drives has to be non-prime. However, also notice that Figure 1 and Figure 2, while using the same number of drives, have different capabilities. Figure 1 has better storage efficiency than Figure 2 while Figure 2 should have better performance than Figure 1 (more “drives” in the RAID-0 striping). Again, this illustrates the fact that Nested RAID configurations have a great deal of flexibility in design so that you can meet your specific needs.

The fault tolerance of RAID-50 is the same as RAID-05: one disk. If you lose one drive, then that specific RAID-5 group runs in degraded mode (i.e., it can’t tolerate the loss of another drive). If we lose another drive in that same RAID-5 group, the group goes down and we lose the entire top-level RAID-0 array as well. However, you can lose more than one disk without losing access to the data, but as with RAID-05, the failures must happen in a specific sequence, which is unlikely.

The big difference between RAID-05 and RAID-50 is the number of drives involved in a rebuild. In the case of RAID-50 only the surviving drives in the specific RAID-5 group that has a failed drive have to be read and only the single replacement drive in the RAID-5 group is written to. For example, in Figure 2, if we lose disk 4 (fourth disk from the left), then only disks 5 and 6 are read from and the replacement disk for disk 4 is the only one that has to be written to. The number of drives used in the RAID-50 rebuild is much smaller than RAID-05 and is a definite advantage because of a shorter rebuild time.
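
To put numbers on that difference, this illustrative sketch (my own functions, assuming a single failed drive and equal-sized groups) counts the drives read and written during a rebuild for each layout.

def raid05_rebuild_io(total_drives, drives_per_raid0_group):
    # Every drive outside the failed RAID-0 group is read; the whole group is rewritten.
    reads = total_drives - drives_per_raid0_group
    writes = drives_per_raid0_group
    return reads, writes

def raid50_rebuild_io(drives_per_raid5_group):
    # Only the surviving drives in the failed drive's RAID-5 group are read,
    # and only the replacement drive is written.
    reads = drives_per_raid5_group - 1
    writes = 1
    return reads, writes

print(raid05_rebuild_io(12, 3))    # (9, 3) -- Figure 1: all twelve drives are involved
print(raid50_rebuild_io(3))        # (2, 1) -- Figure 2: only three drives are involved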

Table 2 below is a quick summary of RAID-50 with a few highlights.

Table 2 – RAID-50 Highlights

Raid Level: RAID-50

Pros:
  • Excellent read performance because of both the striping (RAID-0) and RAID-5.
  • Very good write performance because of the striping (RAID-0). But RAID-5 reduces the performance relative to a RAID-01 or RAID-10 configuration.
  • Reasonable data redundancy (can tolerate the loss of any one disk)
  • Reasonable storage efficiency and capacity (same as RAID-05)
  • Only have to use the drives in the RAID-5 group of the failed drive during a rebuild (shorter rebuild times).

Cons:
  • You have to use at least 6 drives (large number of drives)

Storage Efficiency: (Number of drives in each RAID-5 group – 1) / (Number of drives in each RAID-5 group)

Minimum Number of Disks: 6 (Note: you have to use a non-prime number of disks)

RAID 1+5 (RAID-15)

Remember that Nested RAID configurations can be combinations of any of the standard RAID levels. So if you are interested in improving the performance of a RAID-1 configuration, you can combine it with RAID-5, which adds some performance but also adds even more data redundancy (“a belt and a pair of suspenders”). RAID-15 combines two types of data protection, mirroring (RAID-1) and parity (RAID-5), to create a very well protected RAID array, but also one with better performance than RAID-1. An interesting question is how the two data protection schemes interact in the final Nested RAID configuration.

Figure 3 below is an example showing the data layout for eight identical drives in a RAID-15 configuration.

Figure 3: RAID-15 layout with Eight Drives

In this Nested RAID configuration there are pairs of drives in RAID-1 at the lowest level and they are combined using RAID-5 at the upper level (so that means you need at least three RAID-1 pairs at the lowest level). In this particular case, there are four pairs of RAID-1 drives that are combined in a RAID-5 configuration using a total of eight drives.

As you can see in the illustration, the data chunks are first passed to RAID-5, which computes the parity and passes it, along with the data chunks, to the RAID-1 pairs. Each RAID-1 pair then mirrors its chunks across its two drives. In essence, RAID-15 uses both data protection schemes, parity (RAID-5 or RAID-6) and mirroring (RAID-1), but it does gain back some storage efficiency and performance from RAID-5.

The capacity of the RAID-15 configuration is fairly easy to compute:

Capacity = min(disk sizes) * (Number of RAID-1 groups - 1)

For Figure 3, this means,

Capacity = min(disk size) * (4 - 1)
Capacity = min(disk size) * 3

The resulting storage efficiency is also fairly easy to compute:

Storage Efficiency = (Number of RAID-1 groups - 1) / (Number of RAID-1 groups * Number of drives in RAID-1 group)

For Figure 3, the resulting storage efficiency is,

Storage Efficiency = (4 - 1) / (4*2)
Storage Efficiency = 3/8
Storage Efficiency = 0.375 (37.5%)
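
Here is the same kind of minimal sketch (my own names, identical drives assumed) for the two RAID-15 formulas, using the eight-drive layout of Figure 3.

def raid15_capacity(disk_size, raid1_groups):
    # Each RAID-1 pair exposes one drive's capacity, and the upper RAID-5
    # level gives up one pair's worth of capacity to parity.
    return disk_size * (raid1_groups - 1)

def raid15_efficiency(raid1_groups, drives_per_raid1_group=2):
    return (raid1_groups - 1) / (raid1_groups * drives_per_raid1_group)

print(raid15_capacity(1000, 4))    # 3000 for eight hypothetical 1000 GB drives (Figure 3)
print(raid15_efficiency(4))        # 0.375 (37.5%)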

The added data paranoia has forced the storage efficiency down by quite a bit so that it is lower than even RAID-1. However, as you increase the number of drives to a very large number, the storage efficiency will approach, but never reach, 50% (RAID-1).
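
You can see that limit with a quick loop (the pair counts below are just illustrative choices):

for pairs in (4, 8, 16, 64, 256):
    efficiency = (pairs - 1) / (pairs * 2)    # RAID-15 with two-drive mirrors
    print(pairs, "pairs:", round(efficiency, 4))
# Prints 0.375, 0.4375, 0.4688, 0.4922, 0.498 -- creeping toward, but never
# reaching, the 50% storage efficiency of plain RAID-1.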

Notice that the minimum number of drives in a RAID-15 set is six: you need at least three RAID-1 groups for the upper RAID-5 level and at least two drives in each RAID-1 pair. However, in some cases you can construct different arrangements if you use more than two drives in each RAID-1 group (some RAID implementations allow this). But note that if you can only have two drives in the RAID-1 groups, you will need an even number of drives.

The fault tolerance of RAID-15 is excellent, allowing you to lose up to three drives. In the worst-case scenario the first drive you lose is in one RAID-1 pair and the second drive you lose is the other drive in that same pair. This means the RAID-1 pair has failed, and to the RAID-5 configuration at the highest level this looks like the loss of one “drive.” At this point the RAID-15 configuration is running in degraded mode, where the RAID-5 level cannot tolerate the loss of any more “drives.” You can still lose one more drive in any of the surviving RAID-1 pairs, the third lost drive overall, without losing any data. But if you then lose the surviving drive of that pair, the pair fails, RAID-5 thinks it has lost another “drive,” and the whole array fails.
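
The survival rule is simple enough to express in a few lines of code. This sketch (my own model, assuming two-drive mirrors numbered left to right as in Figure 3) checks whether a set of failed drives still leaves a RAID-15 array readable: the array survives as long as no more than one RAID-1 pair has lost both of its drives.

def raid15_survives(failed_drives, raid1_groups, drives_per_pair=2):
    # failed_drives is a set of drive indices 0 .. raid1_groups*drives_per_pair - 1.
    dead_pairs = 0
    for pair in range(raid1_groups):
        members = range(pair * drives_per_pair, (pair + 1) * drives_per_pair)
        if all(drive in failed_drives for drive in members):
            dead_pairs += 1
    # The upper RAID-5 level can tolerate the loss of exactly one "drive" (pair).
    return dead_pairs <= 1

print(raid15_survives({2, 3}, 4))          # True: one full pair lost, RAID-5 degraded
print(raid15_survives({2, 3, 4}, 4))       # True: a third drive lost in a surviving pair
print(raid15_survives({2, 3, 4, 5}, 4))    # False: a second pair has now failed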

If you lose a drive in a RAID-15 configuration, then only the mirrored drive is read from and the replacement drive is the only one written to. If you lose a mirrored pair (two drives), all of the remaining drives in the array are read from during the rebuild. For example, if you lose disks 3 and 4 you have lost a RAID-1 pair and the RAID-5 level runs in degraded mode. If you replace disks 3 and 4, then disks 1-2 and 5-8 are read from, and disks 3-4 are written to. So when you lose two or more drives, all of the disks in the array are used for the rebuilding process.

Table 3 below is a quick summary of RAID-15 with a few highlights.

Table 3 – RAID-15 Highlights

Raid Level: RAID-15

Pros:
  • Excellent read performance because of both the mirroring (RAID-1) and RAID-5.
  • Good write performance because of RAID-5.
  • Excellent data redundancy with the ability to lose three drives without losing data.
  • In the event of a single drive loss, only the mirrored pair drive is used in the rebuild.

Cons:
  • You have to use at least 6 drives (large number of drives).
  • If you lose two or three drives, then all of the drives in the array are used for rebuilding.
  • Poor storage efficiency – worse than RAID-1.

Storage Efficiency: (Number of RAID-1 groups – 1) / (Number of RAID-1 groups * Number of drives in RAID-1 group)

Minimum Number of Disks: 6 (Note: you have to use an even number of disks)

RAID 5+1 (RAID-51)

As mentioned previously, putting the more performance-oriented RAID level on top of the more redundant level is desirable because fewer drives are touched during a rebuild, reducing the possibility of further data loss. But in the case of RAID-15 and RAID-51, we’re really combining two redundant levels together. We have already examined RAID-15 and found that for the loss of a single drive, only the mirror partner is used in the rebuild, but in the event of losing two drives, all of the drives are used in the rebuilding. We also found that RAID-15 can tolerate the loss of three drives without losing access to the data, which is better than even RAID-6.

So what can RAID-51 accomplish? Figure 4 below is an example showing the RAID-51 data layout for eight identical drives (the same number of drives as the RAID-15 example).

Figure 4: RAID-51 layout with Eight Drives

In this Nested RAID configuration, RAID-5 is at the lowest level and RAID-1 is at the highest level. So this eight-drive layout consists of two groups of four disks, for a total of eight disks. Each four-disk group is built using RAID-5, and then the two groups are combined using RAID-1.

In Figure 4 you can also see the data layout, where each data “stripe” has two copies of the RAID-5 parity block. Compared to the RAID-15 layout, the parity blocks are closer together in RAID-51 (compare Figure 4 to Figure 3).

The capacity of the RAID-51 configuration is fairly easy to compute:

Capacity = min(disk sizes) * (Number of disks in RAID-5 group - 1)

For Figure 4, this means,

Capacity = min(disk size) * (4-1)
Capacity = min(disk size) * 3

The resulting storage efficiency is also fairly easy to compute:

Storage Efficiency = (Number of drives in RAID-5 group - 1 ) / (Number of RAID-1 groups * Number of drives in RAID-5 group)

For Figure 4, the resulting storage efficiency is,

Storage Efficiency = (4 -1) / (2*4)
Storage Efficiency = 3/8
Storage Efficiency = 0.375 (37.5%)
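
And one last minimal sketch (my own helper names again, identical drives assumed) for the RAID-51 formulas, using the eight-drive layout of Figure 4.

def raid51_capacity(disk_size, drives_per_raid5_group):
    # Only one of the two mirrored RAID-5 groups holds unique data,
    # and that group gives up one drive to parity.
    return disk_size * (drives_per_raid5_group - 1)

def raid51_efficiency(drives_per_raid5_group, raid1_groups=2):
    return (drives_per_raid5_group - 1) / (raid1_groups * drives_per_raid5_group)

print(raid51_capacity(1000, 4))    # 3000 for eight hypothetical 1000 GB drives (Figure 4)
print(raid51_efficiency(4))        # 0.375 (37.5%)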

The paranoia about data protection has cost us a great deal in terms of capacity and storage efficiency just as it did for RAID-15.

Comments on "Nested-RAID: RAID-5 and RAID-6 Based Configurations"

pjwelsh

Nice information! Casual RAID users should keep in mind that the performance of any nested RAID choice will need to be evaluated based on the physical hardware RAID -vs- software RAID choice combinations a system has. Not all hardware RAID controllers or software RAID options are created equal! YMMV.

tarax

Hi,

And thank you very much for this excellent series. Printed and filed in my tech reference!
For the icing on the cake, it would be just great to have another article about ZFS and its RAIDZ technologies…

Thanks again and best wishes
