Linux Software RAID – A Belt and a Pair of Suspenders

Linux comes with software-based RAID that can be used either to provide a performance boost or to add a degree of data protection. This article gives a quick introduction to Linux software RAID and walks through how to create a simple RAID-1 array.

At this point a file system can be created on the md device.

# mkfs.ext3 /dev/md0
mke2fs 1.41.7 (29-June-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
30531584 inodes, 122095984 blocks
6104799 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
3727 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
        102400000

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

You can then mount the file system fairly easily. For example, the following line was added to /etc/fstab:

/dev/md0                /mnt/raid1_test        ext3    defaults,data=ordered   0 0

Then the file system is mounted using “mount -a” (assuming the mount point exists).

To make sure the file system is mounted, check the output of df:

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda3              16G  5.2G   11G  34% /
/dev/sda1             147G  5.6G  134G   5% /home
/dev/hda1             969M   17M  902M   2% /boot
tmpfs                 3.8G     0  3.8G   0% /dev/shm
/dev/md0              459G  199M  435G   1% /mnt/raid1_test
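
If you would rather script this check than read the df output by eye, something along the following lines should work. It is only a sketch: is_mounted is a made-up helper name, and it assumes the usual /proc/mounts layout in which the second field of each line is the mount point.

```shell
#!/bin/sh
# Sketch: report whether a mount point appears in a mounts table.
# is_mounted is a hypothetical helper; the table defaults to /proc/mounts.
is_mounted() {
    mount_point=$1
    mounts_table=${2:-/proc/mounts}
    # Field 2 of each line is the mount point; exit 0 only if it was found.
    awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$mounts_table"
}

# Example: the mount point used in this article.
if is_mounted /mnt/raid1_test; then
    echo "/mnt/raid1_test is mounted"
else
    echo "/mnt/raid1_test is NOT mounted"
fi
```

The second argument exists only so the function can be pointed at a saved copy of the table when experimenting.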

Step 3 – Monitoring the RAID group
Once the RAID group is up and a file system is created, the next step is to monitor the array. Monitoring has several pieces, starting with understanding the details of the array as reported by mdadm.

# mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Sun Aug 16 13:26:49 2009
     Raid Level : raid1
     Array Size : 488383936 (465.76 GiB 500.11 GB)
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Aug 16 15:35:13 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : f10d60d8:d6f47f28:f4379aeb:ea2e53d0
         Events : 0.34

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1

There is a great deal of information in this output, including how many devices are in the RAID array, how many devices have failed, and how many devices are labeled as hot spares. The key fields are:


  • Raid Level – In this case it is “raid1”

  • Array Size – the size of the RAID array, which in this case is 465.76 GiB

  • Raid Devices – the number of devices that make up the RAID group, which in this case is 2

  • Total Devices – the total number of devices attached to the array, including any spares or failed devices, which in this case is 2

  • Active Devices – the number of active devices which is 2 for this case

  • Working Devices – the number of working devices in the array, which in this case is 2

  • Failed Devices – the number of devices that have failed, which in this case is 0

  • Spare Devices – the number of spare devices to be used by the array, which in this case is 0

This command can be run periodically as a cron job, with its output either parsed or emailed to the system administrator.
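
As a rough illustration of the parsing idea, the sketch below reads “mdadm --detail” output on standard input and exits non-zero when the Failed Devices count is above zero. The function name check_md_detail is hypothetical, and the field labels are assumed to match the output shown above.

```shell
#!/bin/sh
# Sketch: read `mdadm --detail` output on stdin and flag an unhealthy array.
# check_md_detail is a hypothetical helper; field labels are those shown above.
check_md_detail() {
    awk -F' : ' '
        /^ *State :/          { state = $2; sub(/[ \t]+$/, "", state) }
        /^ *Failed Devices :/ { failed = $2 + 0 }
        END {
            printf "state=%s failed=%d\n", state, failed
            # Exit status 1 when at least one device has failed, else 0.
            exit (failed > 0)
        }'
}

# Intended use (as root, for example from a cron job):
#   mdadm --detail /dev/md0 | check_md_detail || logger "md0 has failed devices"

# Demonstration on two lines taken from the output above.
check_md_detail <<'EOF'
          State : clean
 Failed Devices : 0
EOF
# prints: state=clean failed=0
```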

The monitor mode of the mdadm command (“mdadm --monitor”) has an option, “--mail [address]”, that lets you supply an email address so that an email is sent to it whenever an event occurs. You can also add the following options:


  • --program : Allows you to specify a program that is to be run whenever an event is detected.

  • --syslog : This causes all events to be reported to the syslog of the system (always a good idea)
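
To make these options a little more concrete, here is a small sketch of what a handler for --program might look like. mdadm passes the handler the event name, the md device and, when relevant, the component device. The function name format_md_event, the email address and the handler path are all placeholders and not values from this article.

```shell
#!/bin/sh
# Sketch of a handler for the --program option of mdadm. mdadm passes the
# event name, the md device and, when relevant, the component device.
# format_md_event is a hypothetical helper name.
format_md_event() {
    event=$1
    md_device=$2
    component=${3:-none}
    echo "md event: $event on $md_device (component: $component)"
}

# Starting monitor mode (as root); the address and handler path are placeholders:
#   mdadm --monitor --scan --syslog --mail=admin@example.com \
#         --program=/usr/local/sbin/md-event-handler --daemonise

# Demonstration with the arguments mdadm would pass for a failed member.
format_md_event Fail /dev/md0 /dev/sdb1
# prints: md event: Fail on /dev/md0 (component: /dev/sdb1)
```

A real handler would call the function with "$@" and send the message wherever you want it to go.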

You can also get some of this information from the file /proc/mdstat.

Let’s dissect the output of cat /proc/mdstat.

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
      488383936 blocks [2/2] [UU]

unused devices: <none>

The output lists the active devices. The number in brackets after each device name is its position in the array, and in the case of RAID-1 the device with the lowest number is considered the primary device. In this case, it’s sdb1 because it is listed as “[0]”. The next line down, which contains the number of blocks, lists the status of each device. The “[2/2]” means that 2 of the array’s 2 devices are active, and the status of the individual devices is shown as “[UU]”. The letters are ordered by position in the array, not by the order in which the devices appear on the line above: the first “U” on the left corresponds to device 0 (sdb1) and the second “U” on the right corresponds to device 1 (sdc1). The “U” means that the device is up and running. So by looking at /proc/mdstat you can immediately tell if there is a device down (you will see a “_” in place of the “U” if the device is down or failed).
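
The “_” check described above is easy to automate. The following is a minimal sketch, assuming the /proc/mdstat layout shown here (an “mdN :” line followed by a status line); degraded_arrays is a hypothetical helper name, and the sample file exists only so the example can run on a machine without any md arrays.

```shell
#!/bin/sh
# Sketch: list md arrays whose status field contains "_" (a down or failed
# member). degraded_arrays is a hypothetical helper; it reads /proc/mdstat
# unless another file is given.
degraded_arrays() {
    awk '
        # Remember the array named on a line such as "md0 : active raid1 ...".
        /^md[^ ]* :/ { array = $1 }
        # The following status line carries a field such as [UU] or [_U].
        array != "" && /\[[U_]+\]/ {
            match($0, /\[[U_]+\]/)
            if (index(substr($0, RSTART, RLENGTH), "_"))
                print array
            array = ""
        }' "${1:-/proc/mdstat}"
}

# Demonstration on a made-up degraded array (device 0 has failed).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0](F)
      488383936 blocks [2/1] [_U]
EOF
degraded_arrays "$sample"
# prints: md0
rm -f "$sample"
```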

There is a whole host of commands and techniques for managing a RAID array. You can add hot spare devices to the RAID group or remove them (it is always good to have a hot spare). You can also mark devices as failed if you want to pull them out and replace them. However, managing a RAID array is beyond the scope of this particular article. There are tutorials on the web that present some practical techniques for accomplishing this.
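
For orientation only, the operations mentioned above map onto mdadm options roughly as sketched below. The commands are printed rather than executed, /dev/sdd1 is a hypothetical spare partition that does not appear elsewhere in this article, and the run wrapper is just a convenience for the example.

```shell
#!/bin/sh
# Sketch: common array-management commands, printed rather than executed.
# /dev/sdd1 is a hypothetical spare partition. Set DRY_RUN=0 (as root) to
# execute the commands instead of printing them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run mdadm /dev/md0 --add /dev/sdd1      # add a device; on a healthy array it becomes a hot spare
run mdadm /dev/md0 --fail /dev/sdb1     # mark a member as failed
run mdadm /dev/md0 --remove /dev/sdb1   # remove the failed member so the disk can be pulled
```

Read the mdadm man page before running any of these against a real array.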



Summary

Linux software RAID is a great free tool that can be used to help improve performance (RAID-0) or improve data protection (RAID-1, RAID-5, RAID-6). The developers of mdadm, led by Neil Brown, have developed a wonderful tool for administering and monitoring RAID arrays. This article very quickly shows how to create a RAID-1 array to help with data protection. Coupling the “suspenders” of RAID-1 with the “belt” of backups can save your bacon in many cases.

Jeff Layton is an Enterprise Technologist for HPC at Dell. He can be found lounging around at a nearby Frys enjoying the coffee and waiting for sales (but never during working hours).

Comments on "Linux Software RAID – A Belt and a Pair of Suspenders"

vu2lid

Other interesting/useful GNU/Linux replication ideas:

DRBD http://www.drbd.org
(over tcp/ip, easy to setup High Availability if required)

Unison http://www.cis.upenn.edu/~bcpierce/unison/
(user-space, cross platform)

j3

Typo: “These are /dev/sdb and /dev/sdb” s/b “These are /dev/sdb and /dev/sdc”

laytonjb

@j3:

Yep – good catch. I’ll fix that typo later.

Jeff

wodenickel

I have an existing fakeraid 5 created from within Windows XP Home. Is it possible to access this from Ubuntu 9.04 using dm? My goal would be to dual boot & access it from both Win & Linux.

thanks!
GREAT ARTICLE – I read the docs but a worked example made it sink in.

dbbd

The two articles, about LVM and about software RAID, call for a third one covering the two in conjunction.
To RAID LVs, or to make PVs out of RAID groups? Which is better from the performance, functionality, and stability points of view?

The articles are good introductions. I’d like to see you go deeper.

laytonjb

@dbbd:

I’m working on that article :) Determining which one is “best” is becoming somewhat subjective. In general, the best approach is to use RAID (md) on the lowest level and then use LVM on top of that. The simple reason is that you can expand the file system much more easily using LVM than md.

The questions I’ve been examining include:

  • Do you split the drives into partitions and if so, how?
  • How do you combine the drives into RAID groups depending upon the RAID level?
  • Do you use LVM for striping as well as md or is it one or the other?

So there are a bunch of considerations, which makes the article much more difficult to write – I need to examine lots of options.

Ultimately what I would like to produce is something of a “contrast” list. It will list the various approaches or ideas and then list the pros and cons because I think choosing the “best” is subjective (I haven’t seen an article like this – have you?).

Thanks for the feedback!

Jeff

laytonjb

@wodenickel

I don’t know if you can do that. I’m guessing that it will be very difficult. md would have to understand how Windows builds the RAID. Then Linux will have to understand the file system (if it’s NTFS then read-only is fairly straightforward and you can use NTFS-3G for read/write).

Did a google search turn up anything?

Jeff

lesatairvana

Thanks for this informative article. However, you say that “you can now put a filesystem onto /dev/md0”. Actually, it has been my experience that you MUST put the filesystem onto /dev/md0 and NOT any drive used in the RAID. If you have two drives that you want in your RAID and you mkfs.ext3 each, then when they are included in the RAID, the size of the filesystem will be larger than what the RAID can handle and you’ll get “attempt to write beyond end of device”. Doing mkfs.ext3 on /dev/md0 effectively puts a filesystem onto the whole RAID group but the number of available blocks is slightly smaller than the individual members could accommodate.

laytonjb

@lesatairvana:

You are correct, sort of. If you want to stay with a simple RAID-1 with two disks, then yes you have to put the file system on /dev/md0. But you can also use /dev/md0 as a building block for something else. For example, you can create /dev/md0 and /dev/md1 each from two pairs of disks, and then create a RAID-0 on top of that.

Disclaimer – I’ve never done this but I’ve been told you can do it. (If I can get a couple more disks into my case, I will try it.)

Jeff

almac

More title than topic – in the UK, we’d say “a belt and braces approach” – “suspenders” being the things which ladies used to hold up stockings, before the invention of tights. So what do you call those? I’d like to know, my wife would like to know, but we don’t want to get into porno hell trying to find out. She tells me.

levonshe

Hi, thanks for the article. One thing I do not understand – why synchronize the disks before any data has been put on them (even mkfs was done after the sync)?
