Linux comes with software-based RAID that can be used either to provide a performance boost or to add a degree of data protection. This article gives a quick introduction to Linux software RAID and walks through how to create a simple RAID-1 array.
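For reference, a simple two-disk RAID-1 array such as the one used here is typically created with a single mdadm command along the following lines (the partitions shown, /dev/sdb1 and /dev/sdc1, are the same members that appear in the output later in this article):
# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
Once the array has been created and has finished its initial synchronization, it appears as the block device /dev/md0.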
At this point a file system can be created on the md device.
# mkfs.ext3 /dev/md0
mke2fs 1.41.7 (29-June-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
30531584 inodes, 122095984 blocks
6104799 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
3727 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
You can then mount the file system fairly easily. For example, the following line was added to /etc/fstab:
/dev/md0 /mnt/raid1_test ext3 defaults,data=ordered 0 0
Then the file system is mounted using “mount -a” (assuming the mount point exists).
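If the mount point does not already exist, the full sequence is simply the following (the directory name matches the /etc/fstab entry above):
# mkdir -p /mnt/raid1_test
# mount -a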
To make sure the file system is mounted, you can check with df:
# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/hda3 16G 5.2G 11G 34% /
/dev/sda1 147G 5.6G 134G 5% /home
/dev/hda1 969M 17M 902M 2% /boot
tmpfs 3.8G 0 3.8G 0% /dev/shm
/dev/md0 459G 199M 435G 1% /mnt/raid1_test
Step 3 – Monitoring the RAID group
Once the RAID group is up and a file system has been created, the next step is to monitor the array. Monitoring has several pieces, starting with understanding the details of the array itself.
# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90.03
Creation Time : Sun Aug 16 13:26:49 2009
Raid Level : raid1
Array Size : 488383936 (465.76 GiB 500.11 GB)
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Sun Aug 16 15:35:13 2009
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
UUID : f10d60d8:d6f47f28:f4379aeb:ea2e53d0
Events : 0.34
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
There is a great deal of information in this output: how many devices are in the RAID array, how many have failed, how many are labeled as hot spares, and so on. The key fields include:
- Raid Level – in this case it is "raid1"
- Array Size – the size of the RAID array, which in this case is 465.76 GiB
- Raid Devices – the number of devices in the RAID group, which in this case is 2
- Total Devices – the total number of devices in the RAID array, which in this case is 2
- Active Devices – the number of active devices, which in this case is 2
- Working Devices – the number of working devices in the array, which in this case is 2
- Failed Devices – the number of devices that have failed, which in this case is 0
- Spare Devices – the number of spare devices available to the array, which in this case is 0
This command can be run periodically, and the output can be parsed as part of a cron job or emailed to the system administrator.
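For example, a root crontab entry along the following lines (the schedule and recipient are only placeholders, and a local mail command is assumed to be available) would mail the array details once a day:
0 6 * * * /sbin/mdadm --detail /dev/md0 | mail -s "md0 status" root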
mdadm's monitor mode has a "--mail [address]" option that lets you specify an email address; whenever there is an event on the array, an email is sent to the designated address. You can also add the following options (an example combining them follows the list):
- --program : allows you to specify a program that is to be run whenever an alert is detected.
- --syslog : causes all events to be reported to the system's syslog (always a good idea)
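Putting these together, monitoring can be started with something like the following command (the email address is just a placeholder):
# mdadm --monitor --scan --daemonise --mail admin@example.com --syslog
In this form mdadm runs in the background, watches the arrays it finds, reports events to syslog, and mails the designated address when something happens.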
You can also get some of this information from the contents of /proc/mdstat. Let's dissect the output of cat /proc/mdstat.
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
488383936 blocks [2/2] [UU]
unused devices: <none>
The output lists the active devices. In the case of RAID-1, it also tells you which device is considered the primary device: look for the lowest number in the line listing the members of the array. In this case it is sdb1, because it is listed as "[0]". The next line down, which contains the number of blocks, lists the status of each device; here the status is shown as "[UU]". The positions in this string follow the device numbers, so the first "U" corresponds to device 0 (/dev/sdb1) and the second "U" corresponds to device 1 (/dev/sdc1). A "U" means that the device is up and running, so by looking at /proc/mdstat you can immediately tell if a device is down: you will see a "_" in place of the "U" if the device is down or has failed.
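For comparison, if /dev/sdc1 were to fail, the output would look roughly like this (illustrative only; the exact numbers will vary):
# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1](F) sdb1[0]
488383936 blocks [2/1] [U_]
unused devices: <none>
The "(F)" marks the faulty member, "[2/1]" shows that only one of the two devices is active, and the "_" replaces the "U" for the failed device.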
There is a whole host of commands and techniques for managing the RAID array. You can add and remove hot spare devices in the RAID group (it is always good to have a hot spare), and you can fail specific devices if you want to pull them out and replace them. Managing a RAID array in detail is beyond the scope of this particular article, but there are a number of tutorials on the web that present practical techniques for doing so.
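As a taste of what those tutorials cover, adding a hot spare, failing a member, and removing it for replacement look roughly like this (the device names are only illustrative):
# mdadm /dev/md0 --add /dev/sdd1
# mdadm /dev/md0 --fail /dev/sdc1
# mdadm /dev/md0 --remove /dev/sdc1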
Summary
Linux software RAID is a great free tool that can be used to improve performance (RAID-0) or improve data protection (RAID-1, RAID-5, RAID-6). The developers of mdadm, led by Neil Brown, have created a wonderful tool for administering and monitoring RAID arrays. This article very quickly shows how to create a RAID-1 array to help with data protection. Couple the "suspenders" of RAID-1 with the "belt" of backups and it can save your bacon in many cases.
Jeff Layton is an Enterprise Technologist for HPC at Dell. He can be found lounging around at a nearby Frys enjoying the coffee and waiting for sales (but never during working hours).
Comments on "Linux Software RAID – A Belt and a Pair of Suspenders"
Other interesting/useful GNU/Linux replication ideas:
DRBD http://www.drbd.org
(over tcp/ip, easy to set up High Availability if required)
Unison http://www.cis.upenn.edu/~bcpierce/unison/
(user-space, cross platform)
Typo: "These are /dev/sdb and /dev/sdb" s/b "These are /dev/sdb and /dev/sdc"
@j3:
Yep – good catch. I'll fix that typo later.
Jeff
I have an existing fakeraid 5 created from within Windows XP Home. Is it possible to access this from Ubuntu 9.04 using dm? My goal would be to dual boot & access it from both Win & Linux.
thanks!
GREAT ARTICLE – I read the docs but a worked example made it sink in.
The two articles, about LVM and about software RAID, require a third talking about the two in conjunction.
To RAID the LVs, or to use RAID groups as PVs? Which is better from a performance, functionality, and stability point of view?
The articles are good introductions. I'd like to see you go deeper.
@dbbd:
I\’m working on that article :) Determining which one is \”best\” is becoming somewhat subjective. In general, the best approach is to use RAID (md) on the lowest level and then use LVM on top of that. The simple reason is that you can expand the file system much easier using LVM than md.
The questions I\’ve been examining become things such as,
So there are a bunch of considerations which makes the article much more difficult to write – I need to examine lots of options.
Ultimately what I would like to produce is something of a "contrast" list. It will list the various approaches or ideas and then list the pros and cons, because I think choosing the "best" is subjective (I haven't seen an article like this – have you?).
Thanks for the feedback!
Jeff
@wodenickel
I don\’t know if you can do that. I\’m guessing that it ill be very difficult. md would have to understand how Windows builds the RAID. Then Linux will have to understand the file system (if it\’s NTFS then read-only is fairly straight forward and you can use NTFS-3G for read/write).
Did a Google search turn up anything?
Jeff
Thanks for this informative article. However, you say that "you can now put a filesystem onto /dev/md0". Actually, it has been my experience that you MUST put the filesystem onto /dev/md0 and NOT onto any drive used in the RAID. If you have two drives that you want in your RAID and you run mkfs.ext3 on each, then when they are included in the RAID, the size of the filesystem will be larger than what the RAID can handle and you'll get "attempt to write beyond end of device". Doing mkfs.ext3 on /dev/md0 effectively puts a filesystem onto the whole RAID group, but the number of available blocks is slightly smaller than what the individual members could accommodate.
@lesatairvana:
You are correct, sort of. If you want to stay with a simple RAID-1 with two disks, then yes, you have to put the file system on /dev/md0. But you can also use /dev/md0 as a building block for something else. For example, you can create /dev/md0 and /dev/md1, each from a pair of disks, and then create a RAID-0 on top of them.
Disclaimer – I\’ve never done this but I\’ve been told you can do it. (if I can get a couple more disks into my case, I will try it).
Jeff
More title than topic – in the UK, we'd say "a belt and braces approach" – "suspenders" being the things which ladies used to hold up stockings, before the invention of tights. So what do you call those? I'd like to know, my wife would like to know, but we don't want to get into porno hell trying to find out. She tells me.
Hi, thanks for the article. One thing I do not understand: why synchronize the disks before any data has been put on them (even mkfs was done after the sync)?