Linux Software RAID – A Belt and a Pair of Suspenders
Linux comes with software RAID that can provide either a performance boost or an added degree of data protection. This article gives a quick introduction to Linux software RAID and walks through how to create a simple RAID-1 array.
At this point a file system can be created on the md device.
# mkfs.ext3 /dev/md0
mke2fs 1.41.7 (29-June-2009)
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
30531584 inodes, 122095984 blocks
6104799 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
3727 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
You can then mount the file system in the usual way, for example by adding an entry for /dev/md0 to /etc/fstab.
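As a sketch, such an fstab entry might look like the following (the /mnt/raid mount point is an assumption for illustration):

```
# device    mount point   fs type   options    dump  fsck order
/dev/md0    /mnt/raid     ext3      defaults   0     2
```

With that entry in place, a plain `mount /mnt/raid` brings the file system up, and it will also be mounted automatically at boot.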
Step 3 – Monitoring the RAID group

Once the RAID group is up and a file system has been created, the next step is to monitor the array. Monitoring has several pieces, beginning with understanding the details of the array.
# mdadm --detail /dev/md0
Version : 00.90.03
Creation Time : Sun Aug 16 13:26:49 2009
Raid Level : raid1
Array Size : 488383936 (465.76 GiB 500.11 GB)
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sun Aug 16 15:35:13 2009
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : f10d60d8:d6f47f28:f4379aeb:ea2e53d0
Events : 0.34
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
There is a great deal of information in this output: how many devices are in the RAID array, how many have failed, how many are labeled as hot spares, and so on. The most useful fields include:
Raid Level – the RAID level of the array, which in this case is "raid1"
Array Size – the size of the RAID array, which in this case is 465.76 GiB
Raid Devices – the number of devices in the raid group, which in this case is 2
Total Devices – the total number of devices in the raid array, which in this case is 2
Active Devices – the number of active devices which is 2 for this case
Working Devices – the number of working devices in the array, which in this case is 2
Failed Devices – the number of devices that have failed, which in this case is 0
Spare Devices – the number of spare devices to be used by the array, which in this case is 0
This command can be run periodically, and its output parsed as part of a cron job or emailed to the system administrator.
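For instance, a cron entry along the following lines mails the detail report to root every morning (a sketch: the schedule, file path, and recipient are assumptions, not from the article):

```
# /etc/cron.d/raid-report — illustrative values; adjust to taste
0 6 * * * root /sbin/mdadm --detail /dev/md0 | mail -s "md0 daily status" root
```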
When run in monitor mode, mdadm has a "--mail [address]" option that lets you specify an email address; when events occur, an email is sent to the designated address. You can also add the following options:
--program : allows you to specify a program that is to be run whenever an alert is detected.
--syslog : causes all events to be reported to the system's syslog (always a good idea)
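Putting these together, a monitoring daemon can be started with something like the following (a sketch: the address and polling interval are illustrative, and these settings can also live in /etc/mdadm.conf):

```shell
# Run mdadm in monitor mode as a daemon: mail alerts to root,
# report events to syslog, and poll the array every 300 seconds.
mdadm --monitor --daemonise --mail root@localhost --syslog --delay 300 /dev/md0
```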
You can also get some of this information by looking at /proc/mdstat.
The output lists the active devices. In the case of RAID-1, it also tells you which device is considered the primary: look for the lowest device number in the line listing the members of the array. In this case it is sdb1, which holds device number 0. The next line down, which contains the number of blocks, lists the status of each device; here the status field reads "[UU]". The first "U" corresponds to the first device (sdb1, device 0) and the second "U" to the second device (sdc1). A "U" means the device is up and running, so by looking at /proc/mdstat you can immediately tell if a device is down: a "_" appears in place of the "U" for a failed or missing device.
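That "[UU]" check is easy to script. Below is a minimal POSIX-shell sketch that flags a degraded array by looking for a "_" inside the status field; the sample file stands in for live /proc/mdstat output and is illustrative, not captured from the article's system:

```shell
# Scan mdstat-format text for a degraded array. A "_" inside the
# [UU]-style status field marks a failed or missing member.
check_mdstat() {
    # $1 = path to an mdstat-format file (normally /proc/mdstat)
    if grep -q '\[[U_]*_[U_]*\]' "$1"; then
        echo "DEGRADED"
    else
        echo "OK"
    fi
}

# Illustrative sample modeled on the RAID-1 array built above
cat > /tmp/mdstat.sample <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
      488383936 blocks [2/2] [UU]
EOF

check_mdstat /tmp/mdstat.sample   # prints OK
```

In a cron job, the same function could be pointed at /proc/mdstat itself and the "DEGRADED" result mailed to the administrator.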
There is a whole host of commands and techniques for managing the RAID array. You can add and remove hot-spare devices in the RAID group (it is always good to have a hot spare), and you can fail devices when you want to pull them out and replace them. However, managing a RAID array is beyond the scope of this particular article; there are tutorials on the web that present practical techniques for accomplishing it.
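For reference, those operations have the following shape (the device names here are hypothetical; these commands modify a live array, so treat them as a sketch rather than a recipe):

```shell
# Add a hot spare to the array (sdd1 is a hypothetical new partition)
mdadm /dev/md0 --add /dev/sdd1

# Mark a member as failed, then remove it so the drive can be replaced
mdadm /dev/md0 --fail /dev/sdb1
mdadm /dev/md0 --remove /dev/sdb1
```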
Article on using Nagios to monitor remote systems with Linux Software RAID – very cool.
An article on how to replace a failed hard drive in a RAID-1 group.
These are only a few articles floating around the web.
Linux software RAID is a great free tool that can be used to improve performance (RAID-0) or protect data (RAID-1, RAID-5, RAID-6). The developers of mdadm, led by Neil Brown, have created a wonderful tool for administering and monitoring RAID arrays. This article very quickly showed how to create a RAID-1 array to help with data protection. Couple the "suspenders" of RAID-1 with the "belt" of backups and it can save your bacon in many cases.