Nested-RAID: The Triple Lindy

Thus far we have talked about single-level RAID configurations and two-level Nested RAID configurations. But we’ve artificially restricted ourselves to only two levels of Nested RAID. Couldn’t we have three RAID levels or more? The answer is yes, and in this article we’ll talk about three levels (the proverbial “Triple Lindy”) and have some fun with a couple of examples.

This is just a sample layout illustrating how a RAID-160 configuration is put together. Remember that the layout goes from the lowest level (the leftmost digit in the RAID numbering) to the highest level (the rightmost digit). So RAID-160 starts with RAID-1 at the lowest level (closest to the drives), which groups the drives into RAID-1 pairs (I’m assuming that each RAID-1 mirror uses two drives). The RAID-1 pairs are then combined using RAID-6 in the intermediate layer to create RAID-6 groups (at least two are needed). Since RAID-6 requires at least four “drives,” you need at least four RAID-1 pairs to create each intermediate RAID-6 group. Finally, the RAID-6 groups are combined at the highest level using RAID-0 (a single RAID-0 group).
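
To make the hierarchy concrete, here is a minimal Python sketch (not from the article) that models the Figure 2 layout as nested lists. The device names are hypothetical, and the lists only describe the grouping, not an actual array:

from pprint import pprint

# Sixteen hypothetical drives, sda through sdp.
drives = ["sd" + chr(ord("a") + i) for i in range(16)]

# Lowest level: eight two-drive RAID-1 pairs.
pairs = [drives[i:i + 2] for i in range(0, len(drives), 2)]

# Intermediate level: two RAID-6 groups of four pairs each.
raid6_groups = [pairs[0:4], pairs[4:8]]

# Highest level: a single RAID-0 stripe across both RAID-6 groups.
raid0 = raid6_groups

pprint(raid0)  # prints the full RAID-160 tree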

As with RAID-100, this configuration can make sense when you use multiple RAID cards that are capable of RAID-16. In the case of Figure 2, you use two RAID cards capable of RAID-16 and then combine them at the top level with software RAID-0 (i.e., RAID that runs in the Linux kernel). This makes sense for RAID-160 because RAID-6 requires a great deal of computational power, and splitting the drives into multiple RAID-6 groups, each with its own RAID processor, helps improve overall RAID performance.

Figure 2 shows the fewest drives one can use in a RAID-160 configuration, and this points out one of the potential problems with three levels of Nested RAID – the large number of drives that have to be used. Using sixteen drives to create the bare-minimum RAID-160 configuration isn’t exactly inexpensive, and it also forces you to have some sort of “case” that can accommodate that many drives (not easy in a home-user case, but definitely possible).

The capacity of a RAID-160 configuration is fairly easy to compute assuming that all the drives have the same capacity.

Capacity = min(disk sizes) * (Number of RAID-1 groups in each RAID-6 group at the intermediate level - 2) * (Number of RAID-6 groups combined by RAID-0 at the top level)

For Figure 2, this means,

Capacity = min(disk size) * (4 - 2) * (2)
Capacity = min(disk size) * 4
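
As a sanity check, the capacity formula is easy to express in code. Here is a short Python sketch (the function name and the 1,000-unit disk size are purely illustrative):

def raid160_capacity(disk_size, pairs_per_raid6, raid6_groups):
    """Usable capacity of a RAID-160 array (all disks the same size).

    Each RAID-1 pair contributes one "member" to its RAID-6 group,
    and RAID-6 spends two members on parity.
    """
    return disk_size * (pairs_per_raid6 - 2) * raid6_groups

# Figure 2: two RAID-6 groups, each built from four RAID-1 pairs.
print(raid160_capacity(1000, 4, 2))  # 4000, i.e., 4 * disk size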

The resulting storage efficiency is also fairly easy to compute:

Storage Efficiency = (Number of RAID-1 groups in each RAID-6 group at the intermediate level - 2) / ( (Number of drives in RAID-1) * (Number of RAID-1 groups in each RAID-6 group at the intermediate level) )

As a reminder, for RAID-16, the storage efficiency is,

Storage Efficiency = (Number of RAID-1 groups - 2) / ( (Number of RAID-1 groups) * (Number of drives in RAID-1) )

which is the same expression as the storage efficiency for RAID-160, just applied to a single RAID-6 group – the top-level RAID-0 doesn’t change the efficiency.

For Figure 2, the resulting storage efficiency is,

Storage Efficiency = (4 - 2) / ( (2) * (4) )
Storage Efficiency = 2 / 8
Storage Efficiency = 0.25 (25%)
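
The same calculation works as a one-line Python function. Note that the efficiency is independent of the number of RAID-6 groups at the top level, since the RAID-0 adds drives and capacity in equal proportion (again, the function name is just for illustration):

def raid160_efficiency(drives_per_pair, pairs_per_raid6):
    # Two of the RAID-1 "members" in each RAID-6 group hold parity.
    return (pairs_per_raid6 - 2) / (drives_per_pair * pairs_per_raid6)

print(raid160_efficiency(2, 4))  # 0.25 (25%) for the Figure 2 layout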

This is the same storage efficiency as a sample RAID-16 configuration with eight drives that used four two-drive RAID-1 pairs and a single RAID-6 at the highest level. So we didn’t gain back any storage efficiency, but this was expected.

Notice that the minimum number of drives in a RAID-160 configuration is sixteen if you want to have more than one RAID-6 group at the intermediate level (it doesn’t make much sense to use RAID-0 across one intermediate RAID-6 group). This means that you have to have eight RAID-1 pairs that are combined to create two RAID-6 groups in the intermediate layer (four RAID-1 pairs per intermediate RAID-6 group). Then the two intermediate RAID-6 groups are combined with RAID-0 at the highest level. The result is that you need sixteen drives at a minimum for RAID-160.

To keep the intermediate RAID-6 layer “balanced” (i.e., the same number of RAID-1 pairs per RAID-6 group), you need to increment the total number of drives by the number of drives in each RAID-6 group in the intermediate layer. In the case of Figure 2, the number is eight.
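
A quick sketch of how a balanced configuration grows, assuming Figure 2’s two-drive pairs and four pairs per RAID-6 group (the function and its defaults are hypothetical):

def raid160_drive_counts(drives_per_pair=2, pairs_per_raid6=4, max_groups=5):
    # A balanced array grows by one whole intermediate RAID-6 group at a time.
    step = drives_per_pair * pairs_per_raid6  # eight drives per group here
    return [step * groups for groups in range(2, max_groups + 1)]

print(raid160_drive_counts())  # [16, 24, 32, 40] -- increments of eight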

The fault tolerance of RAID-160 is based on that of RAID-16 and is five drives. You can lose two complete RAID-1 pairs within one RAID-6 group and still retain access to the data. You can then lose a fifth drive that is part of a third RAID-1 pair in the same RAID-6 group. But if you then lose its mirror (the sixth drive), you lose the RAID-6 group, and the RAID-0 at the highest level goes down.
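
If you want to convince yourself of the five-drive figure, a brute-force check over the Figure 2 layout is small enough to run. This sketch assumes a simple drive numbering (drives 0-7 in the first RAID-6 group, 8-15 in the second, consecutive drives paired):

from itertools import combinations

def survives(failed):
    """True if a Figure 2-style RAID-160 array still has all of its data."""
    for group in range(2):                  # each intermediate RAID-6 group
        dead_pairs = 0
        for pair in range(4):               # each RAID-1 pair in the group
            base = group * 8 + pair * 2
            if {base, base + 1} <= failed:  # a pair dies only if both drives fail
                dead_pairs += 1
        if dead_pairs > 2:                  # RAID-6 tolerates two dead members;
            return False                    # a third kills the top-level RAID-0
    return True

# Any five failures are survivable ...
assert all(survives(set(c)) for c in combinations(range(16), 5))
# ... but six failures can kill the array (three whole pairs in one group).
assert not survives({0, 1, 2, 3, 4, 5})
print("RAID-160 (Figure 2) tolerates the loss of any five drives")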

Table 2 below is a quick summary of RAID-160 with a few highlights.

Table 2 – RAID-160 Highlights

RAID Level: RAID-160

Pros:
  • Excellent read performance because of both the mirroring (RAID-1) and RAID-6 (no parity is used during reading).
  • Outstanding data redundancy (can tolerate the loss of any five disks).
  • Good write performance because of RAID-0.
  • In the event of a single drive failure, only the mirrored drive is involved in the rebuild.

Cons:
  • You have to use at least sixteen drives (a very large number of drives).
  • Storage efficiency can be very low (lower than RAID-1).

Storage Efficiency: (Number of RAID-1 groups in each RAID-6 group at the intermediate level - 2) / ( (Number of drives in RAID-1) * (Number of RAID-1 groups in each RAID-6 group at the intermediate level) )

Minimum Number of Disks: 16

Other Interesting Triple Nested RAID Configurations

The two examples discussed here sit at somewhat opposite ends of the Nested RAID spectrum. The first one, RAID-100, has a little bit of data redundancy (one drive) but tons of write performance and very good storage efficiency. The second one, RAID-160, has tons of data redundancy (five drives), but the write performance is just adequate and the storage efficiency is not so good. These two simple examples illustrate that you can mix the standard RAID levels (RAID-0, RAID-1, RAID-5, and RAID-6) in different ways to create different configurations, but you always have to ask yourself whether some of them make sense.

For example, does RAID-000 make any sense? Isn’t that really just RAID-0? (The extra RAID controllers don’t give you any performance advantage.)

As mentioned earlier, three levels of Nested RAID can result in a very large number of drives for the minimum configuration. For example, RAID-666 (truly the “evilest” of all Nested RAID configurations) requires four drives per RAID-6 group at the lowest level, four lowest-level groups per RAID-6 group in the intermediate layer, and four intermediate groups combined at the highest level by RAID-6. So the result is that at a minimum 64 drives are required for a RAID-666 configuration (4*4*4).

Other triple Nested RAID configurations can lead to terrible storage efficiencies but amazing data redundancy. For example, RAID-111 uses three levels of drive mirroring. The minimum configuration requires eight drives (2*2*2), only one of which is used for storing real data (the other seven drives are used for mirroring). That’s a storage efficiency of only 12.5%! However, you can lose up to seven drives without losing access to your data.
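
Since each level’s minimum member count simply multiplies, these minimums take one line of Python apiece. The MIN_MEMBERS table below assumes the conventional minimums (two members for RAID-0 and RAID-1, three for RAID-5, four for RAID-6):

from math import prod

# Conventional minimum member counts for each standard RAID level.
MIN_MEMBERS = {0: 2, 1: 2, 5: 3, 6: 4}

def min_drives(levels):
    """Minimum drive count for a Nested RAID configuration, with levels
    listed from lowest (closest to the drives) to highest."""
    return prod(MIN_MEMBERS[level] for level in levels)

print(min_drives([6, 6, 6]))  # 64 drives for RAID-666
print(min_drives([1, 1, 1]))  # 8 drives for RAID-111
print(min_drives([1, 6, 0]))  # 16 drives for RAID-160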

Triple Nested RAID configurations have to be carefully designed and understood for them to be effective. Otherwise you could just be keeping data redundancy (or performance) about the same as a two-level Nested RAID configuration while increasing the minimum number of disks for the configuration. But, some configurations can lead to desirable behavior.

Summary

Just as Rodney Dangerfield said, “I don’t joke about dives. Especially that one. It almost killed me. …”, triple Nested RAID configurations are nothing to joke about, because they can kill your performance, your storage capacity (efficiency), or your redundancy. They need to be carefully designed to make sure they make sense relative to a two-level Nested RAID or a single-level RAID configuration. Moreover, some of the configurations can require a large number of drives and multiple RAID cards, pushing up the cost of the configuration.

The two examples presented in this article, RAID-100 and RAID-160, show some extremes of what you can achieve with triple Nested RAID configurations. RAID-100 builds on RAID-10 and results in a configuration with a great deal of performance. RAID-160 begins with a very data-redundant configuration, RAID-16, and adds RAID-0 in an effort to improve performance and storage efficiency. Both configurations require a fair number of drives for the minimum possible configuration, reaching the extreme of RAID-160, where sixteen drives are required.

Typically, triple Nested RAID configurations use software RAID at the highest level. Both examples can be implemented by using several RAID cards at the lowest two levels and software RAID at the highest level (Figure 1 required three RAID cards, and Figure 2 required two). This approach can be very efficient because you are using the RAID cards in parallel, which can improve the data throughput of the overall RAID configuration in addition to reducing the rebuild time for a lost drive. Given that we have desktops with four to six cores, using software RAID at the highest level isn’t a bad approach, since one core can easily be focused on RAID functions while the other cores are doing something else.

We’re not quite done with RAID yet (even though you may be screaming at this point). In the next article, we’ll take a higher-level look at the Linux software RAID tool, mdadm, and discuss some of its unique features.

Jeff Layton is an Enterprise Technologist for HPC at Dell. He can be found lounging around at a nearby Fry’s enjoying the coffee and waiting for sales (but never during working hours).

Comments on "Nested-RAID: The Triple Lindy"

khess

I’ll bet that hardly anyone gets that Triple Lindy thing. A reference from way back and way geeky. Good job.

dragonwisard

How can I maximize storage efficiency and redundancy across asymmetrical disks? I have a heterogeneous bunch of old drives (many pulled from dead systems) that I would like to attach to my NAS. I’ve seen proprietary solutions like Drobo, but is there anything free or open source?

    rikjwells

    I would think you might be able to arrange the drives in a RAID 01, building similarly-sized RAID 0 groups to mirror. Re-use being the higher priority for this application than performance or maintainability ;-)

davidbrown

Raid 100 does have some practical use – it allows larger scale deployments with more disks than you can achieve using Raid 10 on a controller. But it is not a question of performance – Raid 0 takes almost no processing power for either a host processor or a hardware raid card. Your twelve drive RAID100 layout using 3 cards with 4 disks will give worse performance than RAID10 on a single card with 12 disks. (For a twelve drive RAID10, the fastest solution is to run all the disks as individual disks and using Linux software Raid 10 with far layout. However, running RAID10 on hardware cards might be slightly faster when degraded or rebuilding.) But if you want a 48 drive RAID10 setup, you don’t get big enough raid cards – therefore you use RAID100.

Raid 160 is an interesting arrangement – but again, it is mainly about scalability. It has no real-world advantages over Raid 16 except when you want to have a very large number of drives. I haven’t heard of Raid 16 being used in practice – if Raid 15 doesn’t give you enough protection, you probably want redundant clustered file systems anyway. While the calculation of the Raid 5 parity is easy for modern processors, Raid 6 has not insignificant costs in processor time and memory bandwidth – it is worth the cost when comparing Raid 6 to Raid 5, but it’s a different balance for Raid 16 vs. Raid 15.

The idea of using cards that support Raid 16 directly is nice in theory – but do you actually know of any cards that support Raid 16 – or even Raid 15? I have never heard of any.

Generally speaking, triple-layer raid is about scalability, not extra redundancy or performance (as compared to a two-layer solution). The same applies to a lot of two-layer raids – Raid 11, for example, is meaningless – you are better off using a single 4x mirror Raid 1.

    rikjwells

    When considering the addition of a controller-oriented layer could there not be an additional redundancy introduced by mirroring the controllers? RAID 101 perhaps?

amadensor

I am doing something similar to nested RAID, but not quite. I use RAID-1 arrays for redundancy, but then I use LVM to do the striping. I can stripe across more than the RAID controller will handle, negating the need for RAID 100 (or 100000000) while still retaining all of the benefits, and gaining the ability to throw more storage at it in the future if needed.

arenasa

About the cons for RAID 100… Actually I can lose 6 disks without losing data access… if I am lucky enough to lose just one disk of every RAID 1 array… is that correct?

rrohbeck

Some of these make a lot of sense when you consider that bus or controller throughput is often the bottleneck when you run large arrays. In many of our systems we run 16 drives per controller in two RAID5 or RAID6 groups of 8 each, with up to 4 controllers, and everything striped together in software. That would make the systems RAID500 or RAID600.
I also run a file server with RAID55. Yeah that’s overkill but I’m not only protected from dual drive failure but also from a failed array [connection] or power loss on one array. That server is mostly read from so performance isn’t an issue.


Why not RAID 6 within RAID 6? Or RAID 66EE inside RAID 66EE? Nested in a 3-D manner and controlled by a low-cost 3-D processor such as those found on graphics cards?
