Thus far we have talked about single-level RAID configurations and Nested RAID configurations. But we've artificially restricted ourselves to only two levels in Nested RAID. Couldn't we have three RAID levels or more? The answer is yes, and in this article we'll talk about three levels (the proverbial "Triple Lindy") and have some fun with a couple of examples.
This is just a sample layout illustrating how a RAID-160 configuration is laid out. Remember that the layout goes from the lowest level (furthest left number in the RAID numbering), to the highest level (furthest right in the RAID numbering). So RAID-160 starts with RAID-1 at the lowest level (closest to the drives) that has pairs of drives in RAID-1 (I’m assuming that RAID-1 happens with two drives). Then the RAID-1 pairs are combined using RAID-6 in the intermediate layer to create RAID-6 groups (at least two are needed). Since RAID-6 requires at least four “drives” you need at least four RAID-1 pairs to create an intermediate RAID-6 group. Finally the RAID-6 groups are combined at the highest level using RAID-0 (a single RAID-0 group).
As with RAID-100 this configuration can make sense when you use multiple RAID cards that are capable of RAID-16. In the case of Figure 2, you use two RAID cards capable of RAID-16 and then combine them at the top level with software RAID-0 (i.e. RAID that runs in the Linux kernel). This makes sense for RAID-160 because RAID-6 requires a great deal of computational power and splitting drives into multiple RAID-6 groups each with their own RAID processor helps improve overall RAID performance.
Figure 2 shows the fewest drives one can use in a RAID-160 configuration, and this points out one of the potential problems with three levels of Nested RAID: the large number of drives that have to be used. Using sixteen drives to create the bare minimum RAID-160 configuration isn’t exactly inexpensive and also forces you to have some sort of “case” that can accommodate that many drives (not easy in a home-user case, but definitely possible).
The capacity of a RAID-160 configuration is fairly easy to compute assuming that all the drives have the same capacity.
Capacity = min(disk sizes) * (Number of RAID-1 groups in each RAID-6 group at the intermediate level - 2) * (Number of RAID-6 groups combined by RAID-0 at the top level)
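The capacity formula can be sketched as a small Python function. This is only an illustration, not part of any RAID tool: the function name, its parameters, and the 2 TB drive size are assumptions made for the example, and the last factor of the formula is read as the number of RAID-6 groups striped together by the top-level RAID-0.

```python
def raid160_capacity(disk_sizes, pairs_per_raid6, raid6_groups):
    """Usable capacity of a RAID-160 array, in the same unit as disk_sizes.

    Assumes two-drive RAID-1 pairs at the lowest level (as in the article).
    """
    if pairs_per_raid6 < 4:
        raise ValueError("RAID-6 needs at least four members")
    if raid6_groups < 1:
        raise ValueError("need at least one RAID-6 group")
    expected = 2 * pairs_per_raid6 * raid6_groups  # two drives per RAID-1 pair
    if len(disk_sizes) != expected:
        raise ValueError(f"expected {expected} drives, got {len(disk_sizes)}")
    # Each RAID-1 pair stores one drive's worth of data, each RAID-6 group
    # gives up two members to parity, and RAID-0 adds the groups together.
    return min(disk_sizes) * (pairs_per_raid6 - 2) * raid6_groups


# Figure 2 layout: sixteen drives, four RAID-1 pairs per RAID-6 group, two groups.
# The 2 TB drive size is an illustrative assumption.
print(raid160_capacity([2.0] * 16, pairs_per_raid6=4, raid6_groups=2))  # 8.0
```

With sixteen 2 TB drives the usable space is 8 TB, i.e. four drives' worth out of sixteen.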
For the sixteen-drive layout in Figure 2 this works out to a storage efficiency of 25% (four drives of usable capacity out of sixteen). This is the same storage efficiency as a sample RAID-16 configuration with eight drives that used four two-drive RAID-1 pairs and a single RAID-6 at the highest level. So we didn’t gain back any storage efficiency, but this was expected.
Notice that the minimum number of drives in a RAID-160 configuration is sixteen if you want to have more than one RAID-6 group at the intermediate level (it doesn’t make much sense to use RAID-0 across one intermediate RAID-6 group). This means that you have to have eight RAID-1 pairs that are combined to create two RAID-6 groups in the intermediate layer (four RAID-1 pairs per intermediate RAID-6 group). Then the two intermediate RAID-6 groups are combined with RAID-0 at the highest level. The result is that you need sixteen drives at a minimum for RAID-160.
To make a “balanced” intermediate RAID-6 layer (i.e. the same number of RAID-1 pairs per RAID-6 group), you need to increment the total number of drives by the number of drives in each RAID-6 group in the intermediate layer. In the case of Figure 2, that number is eight.
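The drive counts that give a balanced RAID-160 layout can be listed with a short sketch. The function name and its parameters are hypothetical and exist only for this illustration; the defaults follow the Figure 2 layout (two-drive mirrors, four RAID-1 pairs per RAID-6 group).

```python
def raid160_drive_counts(pairs_per_raid6=4, drives_per_mirror=2, max_groups=4):
    """Total drive counts for balanced layouts with 2..max_groups RAID-6 groups."""
    # Every extra RAID-6 group adds this many drives to the array.
    drives_per_group = pairs_per_raid6 * drives_per_mirror
    return [drives_per_group * groups for groups in range(2, max_groups + 1)]


print(raid160_drive_counts())  # [16, 24, 32]
```

The list starts at sixteen drives (two RAID-6 groups) and grows in steps of eight, one full RAID-6 group at a time.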
The fault tolerance of RAID-160 is based on that of RAID-16 and is five drives. You can lose two RAID-1 pairs within one RAID-6 group and still retain access to the data. You can then lose a fifth drive that is part of a third RAID-1 pair in the same RAID-6 group. Then if you lose its mirror (the sixth drive), you lose the RAID-6 group and the RAID-0 at the highest level goes down.
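The five-drive figure can be checked by brute force for the sixteen-drive layout. The sketch below is a simplified model, not a simulation of a real controller: the drive numbering (drives 2k and 2k+1 form pair k, and consecutive pairs form a RAID-6 group) and the function names are assumptions made for the example.

```python
from itertools import combinations


def raid160_survives(failed, pairs_per_raid6=4, raid6_groups=2):
    """Return True if the data is still accessible with the given failed drives."""
    failed = set(failed)
    for group in range(raid6_groups):
        dead_pairs = 0
        for p in range(pairs_per_raid6):
            pair = group * pairs_per_raid6 + p
            # A RAID-1 pair is lost only when both of its drives have failed.
            if {2 * pair, 2 * pair + 1} <= failed:
                dead_pairs += 1
        if dead_pairs > 2:
            return False  # RAID-6 group lost, so the RAID-0 on top is lost too
    return True


def guaranteed_tolerance(pairs_per_raid6=4, raid6_groups=2):
    """Largest k such that every possible set of k failed drives is survivable."""
    total_drives = 2 * pairs_per_raid6 * raid6_groups
    k = 0
    while k < total_drives and all(
        raid160_survives(c, pairs_per_raid6, raid6_groups)
        for c in combinations(range(total_drives), k + 1)
    ):
        k += 1
    return k


print(guaranteed_tolerance())  # 5
```

Five is the worst-case guarantee. Luckier failure patterns survive far more: losing one drive from every RAID-1 pair (eight drives) leaves every mirror degraded but the array intact in this model.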
Table 2 below is a quick summary of RAID-160 with a few highlights.
Table 2 – RAID-160 Highlights
Pros:
Excellent read performance because of both the mirroring (RAID-1) and RAID-6 (no parity is used during reading).
Outstanding data redundancy (can tolerate the loss of any five disks).
In the event of a single drive failure, only the mirrored drive is involved in the rebuild.
Good write performance because of RAID-0.
Cons:
You have to use at least 16 drives (very large number of drives).
Storage efficiency can be very low (lower than RAID-1).
Storage Efficiency = (Number of groups in each RAID-6 group at the intermediate level - 2) / ( (Number of drives in RAID-1) * (Number of groups in each RAID-6 group at the intermediate level) )
Minimum Number of disks: 16
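The storage efficiency formula from Table 2 can be evaluated for a few group sizes to see how it behaves. This is a minimal sketch; the function name and parameter names are made up for the illustration.

```python
def raid160_efficiency(pairs_per_raid6, drives_per_mirror=2):
    """Fraction of raw capacity that is usable in a RAID-160 array."""
    if pairs_per_raid6 < 4:
        raise ValueError("RAID-6 needs at least four members")
    return (pairs_per_raid6 - 2) / (drives_per_mirror * pairs_per_raid6)


# Efficiency for 4, 6, 8 and 10 RAID-1 pairs per RAID-6 group.
for pairs in (4, 6, 8, 10):
    print(pairs, f"{raid160_efficiency(pairs):.1%}")
```

For the Figure 2 layout (four pairs per group) the result is 25%. The number of RAID-6 groups does not appear in the formula, so adding groups under the RAID-0 layer leaves the efficiency unchanged. Widening each RAID-6 group raises it, but with two-drive mirrors it always stays below 50%.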
Other Interesting Triple Nested RAID Configurations
The two examples discussed here are somewhat at opposite ends of the Nested RAID spectrum. The first one, RAID-100, has a little bit of data redundancy (1 drive) but tons of write performance and very good storage efficiency. The second one, RAID-160, has tons of data redundancy (5 drives) but the write performance is just adequate and the storage efficiency is not so good. These two simple examples illustrate that you can mix the standard RAID levels (RAID-0, RAID-1, RAID-5, and RAID-6) in different ways to create different configurations, but you always have to ask yourself if some of them make sense.
For example, does RAID-000 make any sense? Isn’t that really just RAID-0 (the extra RAID controllers don’t give you any performance advantage)?
As mentioned earlier, three levels in a Nested RAID can result in a very large number of drives for the minimum configuration. For example, RAID-666 (truly the “evilest” of all Nested RAID configurations), requires four drives per RAID-6 at the lowest level, followed by four RAID-6 groups (that each use RAID-6) in the intermediate layer, that are combined at the highest level by RAID-6. So the result is that at a minimum 64 drives are required for a RAID-666 configuration (4*4*4).
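The drive-count arithmetic generalizes to any nested level: multiply the minimum member count of each level. The sketch below assumes two members for RAID-0 and RAID-1 and four for RAID-6, as in this article, plus the usual three-member minimum for RAID-5; the table and function names are hypothetical.

```python
# Minimum members per standard RAID level (RAID-5's value is the usual
# three-drive minimum; the others follow the assumptions in this article).
MIN_MEMBERS = {"0": 2, "1": 2, "5": 3, "6": 4}


def min_drives(nested_level):
    """Minimum drive count for a nested RAID level such as '160' or '666'."""
    total = 1
    for digit in nested_level:  # lowest level first, as in the RAID numbering
        total *= MIN_MEMBERS[digit]
    return total


for level in ("160", "666"):
    print(f"RAID-{level}: {min_drives(level)} drives")
```

This reproduces the sixteen drives for RAID-160 (2*4*2) and the 64 drives for RAID-666 (4*4*4).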
Other triple Nested RAID configurations can lead to terrible storage efficiencies but amazing data redundancy. For example, RAID-111 uses three levels of drive mirroring. The minimum configuration requires eight drives (2*2*2), only one of which is used for storing real data (the other 7 drives are used for mirroring). That’s a storage efficiency of only 12.5%!! However, you can lose up to seven drives without losing access to your data.
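Storage efficiency and worst-case fault tolerance of a minimum-size nested configuration can both be derived level by level. The sketch below rests on two assumptions: efficiencies multiply across levels, and to break a level that tolerates t member failures you must break t+1 of its members, so the minimum numbers of drive failures needed to break each level also multiply. All names are hypothetical and only minimum configurations are modeled.

```python
from fractions import Fraction

# Minimum members per level (RAID-5's value is the usual three-drive minimum).
MIN_MEMBERS = {"0": 2, "1": 2, "5": 3, "6": 4}


def level_properties(digit, n):
    """Return (efficiency, member failures tolerated) for one level of n members."""
    if digit == "0":
        return Fraction(1), 0
    if digit == "1":
        return Fraction(1, n), n - 1
    if digit == "5":
        return Fraction(n - 1, n), 1
    if digit == "6":
        return Fraction(n - 2, n), 2
    raise ValueError(f"unsupported RAID level: {digit}")


def nested_properties(nested_level):
    """Return (storage efficiency, guaranteed drive failures tolerated)."""
    efficiency = Fraction(1)
    failures_to_break = 1  # fewest drive failures that can destroy the array
    for digit in nested_level:  # lowest level first
        eff, tolerated = level_properties(digit, MIN_MEMBERS[digit])
        efficiency *= eff
        failures_to_break *= tolerated + 1
    return efficiency, failures_to_break - 1


for level in ("100", "160", "111"):
    eff, tolerance = nested_properties(level)
    print(f"RAID-{level}: efficiency {float(eff):.1%}, tolerates {tolerance} failures")
```

For RAID-111 this gives 12.5% efficiency and seven tolerated failures, and for RAID-160 it gives 25% and five, matching the figures in the text.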
Triple Nested RAID configurations have to be carefully designed and understood for them to be effective. Otherwise you could just be keeping data redundancy (or performance) about the same as a two-level Nested RAID configuration while increasing the minimum number of disks for the configuration. But, some configurations can lead to desirable behavior.
Just as Rodney Dangerfield said, “I don’t joke about dives. Especially that one. It almost killed me. …”, triple Nested RAID configurations are nothing to joke about because they can kill your performance, your storage capacity (efficiency), or your redundancy. They need to be carefully designed to make sure they make sense relative to a two-level Nested RAID or a single-level RAID configuration. Moreover, some of the configurations can require a large number of drives and multiple RAID cards, pushing up the cost of the configuration.
The two examples presented in this article, RAID-100 and RAID-160, show some extreme examples of what you can achieve with triple Nested RAID configurations. RAID-100 builds on RAID-10 and results in a configuration with a great deal of performance. RAID-160 begins with a very data-redundant configuration, RAID-16, and adds RAID-0 in an effort to improve performance and storage efficiency. Both configurations require a fair number of drives for the minimum possible configuration, reaching the extreme of RAID-160 where sixteen drives are required.
Typically triple Nested RAID configurations use software RAID at the highest level. Both examples can be implemented by using several RAID cards at the lowest two levels, and software RAID at the highest level (Figure 1 required three RAID cards, and Figure 2 required two RAID cards). This approach can be very efficient because you are using RAID cards in parallel which can improve the efficiency and the data throughput of the overall RAID configuration in addition to reducing the rebuild time of a lost drive. Given that we have desktops with four to six cores, using software RAID at the highest level isn’t a bad approach since one core can easily be focused on RAID functions while the other cores are doing something else.
We’re not quite done with RAID yet (even though you may be screaming at this point). In the next article, we’ll examine the software RAID tool in Linux, mdadm, at a higher level and discuss some of its unique features.