Now that we've completed our initial examination of the basics of RAID levels (including Nested RAID) it's time to turn our attention to RAID functionality on Linux using software. In this article we will be discussing mdadm -- the software RAID administration tool for Linux. It comes with virtually every Linux distribution and has some unique features that many hardware RAID cards don't.
Notice that the mdadm command begins with the “--create” option, which tells mdadm to operate in “create” mode. I like to define the specific md-device next, but be sure you aren’t using an existing md-device name. Specifying the “chunk” option is up to you. Then you define the RAID level you want with the “--level” option, using the options listed above. You also tell mdadm how many block devices you are using with the “--raid-devices=Z” option (Z is the number of devices). Finally, you give mdadm the list of block devices you are using.
An example of using the create option with mdadm is,
% mdadm --create --verbose /dev/md0 --level=0 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
which creates a RAID-0 configuration that is labeled as /dev/md0 and uses three block devices that are /dev/sda1, /dev/sdb1, and /dev/sdc1.
In the example, I have used the first partition of each of the three drives that are valid block devices as the block devices for mdadm. They could have easily been the entire disk such as /dev/sda or /dev/sdb. The point is that they need to be valid Linux block devices (they could even be network based devices but that’s a different discussion).
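Before creating an array it is worth confirming that the md-device name is not already taken. The following is a minimal sketch, assuming that active arrays appear in /proc/mdstat as lines that start with the array name followed by " :". The function name md_in_use is hypothetical, and the optional second argument exists only so the check can be tried against a saved copy of the file.

```shell
#!/bin/sh
# md_in_use NAME [MDSTAT_FILE]
# Succeeds (exit status 0) if an array called NAME (for example md0)
# is listed in the mdstat file, and fails otherwise.
md_in_use() (
    name=$1
    file=${2:-/proc/mdstat}
    # Array lines begin with the bare name followed by " :", so "md1"
    # does not accidentally match "md10".
    grep -q "^${name} :" "$file" 2>/dev/null
)
```

For example, `md_in_use md0 && echo "/dev/md0 is already in use"` prints a warning only when md0 is listed.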
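Mdadm is smart enough to cope with block devices of different sizes, but for most RAID levels each device contributes only as much space as the smallest one. (RAID-0 is the exception: once the smallest device is full, md keeps striping across the remaining larger devices, so the array no longer performs uniformly.) So it is recommended that you check the size of each block device using fdisk as shown below.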
root@laytonjb-laptop:~/# /sbin/fdisk /dev/sdb
The number of cylinders for this disk is set to 19457.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): p
Disk /dev/sdb: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000bca3e
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1       18704   150239848+  83  Linux
/dev/sdb2           18705       19457     6048472+   5  Extended
/dev/sdb5           18705       19457     6048441   82  Linux swap / Solaris
Be sure to look at the column labeled “Blocks” for the particular partition you are going to use. Do this for all the devices and make sure the number of blocks is the same or very close (otherwise you are wasting space).
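The size comparison can also be done with a little shell arithmetic. This is a sketch under the assumption that every device contributes only as much space as the smallest one. The function names smallest_size and unused_space are hypothetical, and the sizes can be in any single unit (blocks or bytes).

```shell
#!/bin/sh
# smallest_size SIZE...
# Prints the smallest of the sizes given on the command line.
smallest_size() (
    min=$1
    for s in "$@"; do
        if [ "$s" -lt "$min" ]; then
            min=$s
        fi
    done
    echo "$min"
)

# unused_space SIZE...
# Prints the total space left unused if every device contributes only
# as much as the smallest device.
unused_space() (
    min=$(smallest_size "$@")
    total=0
    for s in "$@"; do
        total=$((total + s - min))
    done
    echo "$total"
)
```

On a live system the sizes could come from `blockdev --getsize64 /dev/sda1` (bytes, run as root) rather than being read off the fdisk output by hand.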
You might also notice that I used the “--verbose” option in the mdadm create command. I like to use this option to get more information about what mdadm is doing (“better informed than sorry” is a good motto). This is always a good habit to develop.
At this point, your RAID array should be created and running. An easy way to check this is to look at /proc/mdstat.
% cat /proc/mdstat
Fortunately, the output should be fairly easy to read at first glance. For much more detailed information you can read this article.
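For redundant arrays, the status line in /proc/mdstat ends with a bracketed string such as [UU], where each "U" is a working member and "_" marks a missing one. The sketch below lists the arrays that show at least one "_". It assumes that layout (array name at the start of a line, status string on a following line), and the function name degraded_arrays is hypothetical.

```shell
#!/bin/sh
# degraded_arrays [MDSTAT_FILE]
# Prints the name of every array whose member status string, such as
# [U_], contains an underscore.
degraded_arrays() (
    file=${1:-/proc/mdstat}
    awk '
        # Remember which array the following lines describe.
        /^md[^ ]* :/ { name = $1 }
        # A status string made only of U and _ with at least one _.
        /\[[U_]*_[U_]*\]/ && name != "" { print name }
    ' "$file"
)
```

RAID-0 arrays have no such status string, so they never appear in the output.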
Immediately after the RAID array is created it may have to go through a synchronization process. This process performs the initial work required by the RAID level you chose. For example, for RAID-1, the blocks on the first drive are copied to the second drive even if there isn’t any information on the blocks.
Once the array has finished synchronizing and is ready, you can move on to the next step: either using the resulting RAID array device in LVM or creating a file system directly on the block device.
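Waiting for synchronization can be scripted by polling /proc/mdstat. This is a minimal sketch, assuming that a running resync, recovery, or reshape shows up as a progress line containing one of those words followed by "=". It looks at the whole file rather than a single array, and the function names array_busy and wait_for_sync are hypothetical. Reasonably recent mdadm releases also offer `mdadm --wait /dev/md0`, which blocks until such activity finishes on the named array.

```shell
#!/bin/sh
# array_busy [MDSTAT_FILE]
# Succeeds while any array reports resync, recovery, or reshape activity.
array_busy() (
    file=${1:-/proc/mdstat}
    grep -Eq '(resync|recovery|reshape) *=' "$file"
)

# wait_for_sync [SECONDS] [MDSTAT_FILE]
# Polls every SECONDS seconds (default 30) until no activity is reported.
wait_for_sync() (
    interval=${1:-30}
    file=${2:-/proc/mdstat}
    while array_busy "$file"; do
        sleep "$interval"
    done
)
```

A typical use would be `wait_for_sync 60 && mkfs.ext4 /dev/md0`, where the file system type is only an example.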
There are many options that you can use in “create” mode. You can read the man pages to get a list of them but below are some of the more important ones that haven’t been presented yet in this article.
-x, --spare-devices= This option allows you to specify spare devices in the initial array. These are devices (disks) that are used in the event that a disk in the RAID configuration fails. Mdadm then uses the spare drive immediately and starts restoring the array to the desired configuration. Mdadm also allows spare drives to be added and removed later (they don’t have to be added when the array is created). If you use spare devices, be sure that the “--raid-devices” option counts only the active RAID drives; the number of devices listed on the command line must equal the active drives plus the spares.
-p, --layout=, --parity= Mdadm gives you remarkable control over your RAID configuration. This option lets you control the fine details of the data layout for RAID-5 and RAID-10 arrays, and it also selects the failure mode for the “faulty” RAID level (a test personality that injects errors). Please read the manpages for more detail.
-z, --size= This option is the amount of space to be used from each drive in a RAID-1, RAID-4, RAID-5, or RAID-6 configuration. The size is given in kibibytes and must be a multiple of the chunk size. In addition, you must leave about 128KB (128 kibibytes) of space at the end of the drive for the RAID superblock. After the array is created you can use the “grow” mode of mdadm (--grow) to increase the size of the RAID configuration.
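According to the mdadm man page, the number of devices listed for a create must equal the “--raid-devices” value plus the “--spare-devices” value. The sketch below checks that rule and prints the resulting command instead of running it. The function name print_create_cmd is hypothetical, and /dev/md0 and RAID-5 are used purely for illustration.

```shell
#!/bin/sh
# print_create_cmd ACTIVE SPARES DEVICE...
# Prints an mdadm create command if the number of listed devices equals
# ACTIVE + SPARES; otherwise complains on stderr and fails.
print_create_cmd() (
    active=$1
    spares=$2
    shift 2
    expected=$((active + spares))
    if [ "$#" -ne "$expected" ]; then
        echo "expected $expected devices, got $#" >&2
        exit 1
    fi
    # Illustrative only: the md-device and RAID level are fixed here.
    echo "mdadm --create --verbose /dev/md0 --level=5 --raid-devices=$active --spare-devices=$spares $*"
)
```

For example, `print_create_cmd 3 1 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1` prints a command for a three-drive RAID-5 with one spare, which can be reviewed before running it by hand.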
Assembling an mdadm RAID array
One of the other “modes” in mdadm is “assemble”. After your RAID array has been created using mdadm, you can stop the array using the following command:
% mdadm --stop /dev/md0
which stops the RAID array /dev/md0 (be sure to unmount the file system that uses the RAID array first). However, a stopped array does not restart by itself. To bring it back you have to use mdadm to reassemble it. For example,
% mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1
This command assembles the parts of a previously created array into an active array and “restarts” the array (i.e., makes it functional again). To automate this, you could make this command part of the system startup (for example, in /etc/rc.d/rc.local) and you could create a simple script for stopping and starting the array. But mdadm can do some of the leg work for you with the following command:
% mdadm --assemble --scan
These options allow mdadm to scan the drives and reassemble the RAID array (it looks for the RAID superblocks on the drives). Typically this is done during the init phase of the system starting. For example, on my CentOS 5.5 system, there is a line in /etc/rc.d/rc.sysinit that looks like the following.
/sbin/mdadm -A -s
which will scan (“-s”) the drives and assemble (“-A”) the mdadm arrays.
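The simple start/stop script mentioned earlier could look something like the following. This is only a sketch: the function name md_ctl and the variables MD_DEVICE, MD_MEMBERS, and RUN are hypothetical, and the device names are the ones from the earlier example. Setting RUN=echo prints the mdadm commands instead of running them, which is a safe way to try it out.

```shell
#!/bin/sh
# Array to control and its member devices (defaults from the example).
MD_DEVICE=${MD_DEVICE:-/dev/md0}
MD_MEMBERS=${MD_MEMBERS:-"/dev/sda1 /dev/sdb1 /dev/sdc1"}

# md_ctl start|stop
# Assembles or stops the array. With RUN=echo the commands are printed
# rather than executed.
md_ctl() (
    case "$1" in
        start)
            # MD_MEMBERS is deliberately unquoted so that it splits
            # into one argument per device.
            ${RUN:-} mdadm --assemble "$MD_DEVICE" $MD_MEMBERS
            ;;
        stop)
            ${RUN:-} mdadm --stop "$MD_DEVICE"
            ;;
        *)
            echo "usage: md_ctl start|stop" >&2
            exit 2
            ;;
    esac
)
```

Remember to unmount any file system on the array before calling `md_ctl stop`.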
However, you can get into trouble with scanning and assembling mdadm RAID arrays when you have more than one array. What is recommended is that if you want to restart an array by hand you specify the uuid for the array (a unique “name” of the RAID array). For example,
% mdadm --scan --assemble --uuid=7121b438:7d36f9f6:8aa9c8b3:b5b0d211
Since the UUIDs are unique to each array, this will ensure that mdadm can reassemble the array properly. However, mdadm’s scanning and assembling capabilities are quite good. I routinely run two mdadm RAID configurations on my desktop and I’ve never had any confusion about which disks belong to which array (thanks to the superblocks on the devices).
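If you don’t have the UUID written down, it can be read from a running array with “mdadm --detail”, which prints a line of the form “UUID : ...”. The sketch below pulls the value out of that output. It assumes that line format, and the function name array_uuid is hypothetical.

```shell
#!/bin/sh
# array_uuid
# Reads "mdadm --detail" output on standard input and prints the value
# from the first "UUID : value" line.
array_uuid() {
    awk '$1 == "UUID" && $2 == ":" { print $3; exit }'
}
```

A possible use, run as root while the array is active, is `uuid=$(mdadm --detail /dev/md0 | array_uuid)`, followed later by `mdadm --assemble --scan --uuid="$uuid"`.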
Monitoring/Following an mdadm RAID array
The third mode of operation of mdadm is monitoring or following an mdadm RAID array. This mode monitors one or more arrays and allows action to be taken if the state of an array changes. According to the manpage for mdadm, this mode of operation is really only useful for RAID-1, 4, 5, 6, and 10, or multipath arrays, since only they have interesting states. Monitoring RAID-0 and linear RAID arrays is not useful because they cannot run with missing, spare, or failed drives: losing a drive makes the array non-operational, so there is nothing to watch.
The basic option for following or monitoring mdadm controlled arrays is the following:
mdadm --monitor options... devices...
(Note: you can use “-F” or “--follow” in place of “--monitor”). There are several options that can be used for monitoring and following mdadm arrays, as listed below:
-m, --mail This option allows you to define an email address where mdadm alerts are sent.
-p, --program, --alert This option allows mdadm to run a “program” whenever an event is detected (it is recommended to use the full path to the program).
-y, --syslog This option causes all events to be reported through “syslog”.
-d, --delay This option sets the polling interval: the delay (in seconds) between one poll of the arrays and the next.
-f, --daemonise This option tells mdadm to run as a background daemon if it is monitoring arrays. This causes mdadm to fork and run in the child process and disconnect from the terminal. Note that mdadm spells the long option “daemonise”.
-i, --pid-file This option tells mdadm to write the pid of the daemon process when mdadm is run as a daemon (see previous option). The pid is written to a specified file.
-1, --oneshot This option checks the arrays only once and generates “NewArray” events as well as “DegradedArray” and “SparesMissing” events (this will show up in the logs). According to the manpages, if you run the command “mdadm --monitor --scan -1” from a cron job, it will ensure regular notification of any degraded arrays (always a good thing).
-t, --test This option generates a “TestMessage” alert for every array found at startup. This alert gets mailed and passed to the alert program (if you have defined one). This is very useful for testing that alert messages get through successfully (i.e. they work).
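To show how the “--program” option fits together with the others, here is a sketch of a tiny alert handler. Per the mdadm man page, the program is run with two or three arguments: the event name, the array device, and possibly a component device. The function name format_md_event and the script and log paths mentioned below are hypothetical.

```shell
#!/bin/sh
# format_md_event EVENT MD_DEVICE [COMPONENT]
# Turns the arguments that mdadm passes to an alert program into a
# single readable line.
format_md_event() (
    event=$1
    array=$2
    component=${3:-}
    msg="mdadm event $event on $array"
    if [ -n "$component" ]; then
        msg="$msg (device $component)"
    fi
    echo "$msg"
)

# In an installed handler script the last line might be, for example:
#   format_md_event "$@" >> /var/log/md-events.log
```

If the handler were saved as /usr/local/sbin/md-event, monitoring could be started with something like `mdadm --monitor --scan --daemonise --delay=300 --syslog --program=/usr/local/sbin/md-event`, and adding “--test” once is a quick way to confirm that the handler is actually called.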