Linux Software RAID – A Belt and a Pair of Suspenders

Linux comes with software-based RAID that can be used to provide either a performance boost or a degree of data protection. This article gives a quick introduction to Linux software RAID and walks through how to create a simple RAID-1 array.

There is an old phrase about wearing a belt and a pair of suspenders if you want to make sure your pants stay up (why haven’t plumbers figured that out?). The point of the phrase is that if you want to be sure your plans succeed, you should have a backup plan as well. For file systems this is literally true: if you want to make sure you don’t lose any data, do backups and provide some other form of data protection. That other form of data protection is RAID (Redundant Array of Inexpensive Disks).

This article is a brief introduction to software RAID, which is really md (Multiple Device driver) for Linux. As with the article on LVM, this article is just a quick introduction and not a deep tutorial. The intent is to quickly demonstrate Linux software RAID using md and mdadm. Perhaps this article will show you how easy it is to add software RAID to your repertoire, either to help improve performance or to provide extra protection. In essence, this article introduces Linux software RAID as the “suspenders” to the “belt” of backups.

Quick Introduction

The original intent of RAID was to improve IO performance as well as to combine smaller disks into larger virtual disks (although the phrase “virtual” disk was not originally used, in this age of “virtual-everything” it seems appropriate). The basic concept has been embraced and developed continuously since its inception in 1987.

RAID has evolved into a technology that is as ubiquitous as storage drives themselves. It allows system designers to add performance while also providing some additional data protection (don’t forget to wear your “belt”). There are many choices with RAID, such as the various RAID levels and software and/or hardware RAID. Software RAID means the RAID functionality is provided in software by the OS. Hardware RAID means the RAID functionality is provided by a card, usually in a PCI or PCIe slot. There are articles elsewhere that present the pros and cons of the various RAID options, but this article will focus on software RAID with Linux using the md capability of Linux.

It’s beyond the scope of this article to discuss the various RAID level options; there are better articles for that (the Wikipedia entry, for one, is a good introduction to the various RAID levels). Instead, this article will go through the creation of a simple RAID-1 setup. RAID-1 mirrors disks (actually, disk partitions), so when you write to one, the data is copied to the other disk(s). This is a simple way to provide some data protection because you can lose a single disk without losing any data (but it is not a substitute for real backups). So what is a good way to create and manage RAID arrays on Linux?

Madam – I’m mdadm

Handling md groups can be complex and difficult. It can require hand-editing files, where a mistake can cause the loss of RAID groups. If you are careful, it works very well. But to help you maintain your RAID groups, Neil Brown started a project for an administrative tool for md called mdadm.

The mdadm tool is very comprehensive and has a variety of functions:


  • Assemble: Assemble the components of a previously created array into an active array

  • Build: Build an array that doesn’t have a superblock on each device

  • Create: Create a new RAID array with a superblock on each device

  • Monitor: Monitor one or more md devices and act on any changes

  • Grow: Change the size (grow or shrink) or reshape an md device. This also allows you to add devices as needed

  • Incremental Assembly: Add a single device to an array

  • Manage: This function allows you to manage specific components of the RAID array such as adding new spare devices or removing faulty devices

  • Misc: This is the function that contains all other functions that might be needed

  • Auto-detect: This function, while not explicit, has the kernel activate any auto-detected arrays

The man page is quite good, and it is also available online.
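Of these, Create and Manage are the modes most admins reach for first. As a quick sketch of Manage mode — the device names (/dev/md0, /dev/sdb1, /dev/sdd1) are illustrative only, and the commands are guarded so they do nothing unless you explicitly opt in on a scratch machine:

```shell
#!/bin/sh
# Sketch of mdadm Manage mode: fail, remove, and replace a member.
# Device names are hypothetical. Guarded: set RUN_RAID_DEMO=1 on a
# scratch machine before this touches anything.
ARRAY=/dev/md0
FAULTY=/dev/sdb1
SPARE=/dev/sdd1

if [ "${RUN_RAID_DEMO:-0}" = "1" ] && [ -b "$ARRAY" ] && [ -b "$FAULTY" ]; then
    mdadm --manage "$ARRAY" --fail "$FAULTY"      # mark a member as faulty
    mdadm --manage "$ARRAY" --remove "$FAULTY"    # pull it out of the array
fi

if [ "${RUN_RAID_DEMO:-0}" = "1" ] && [ -b "$ARRAY" ] && [ -b "$SPARE" ]; then
    mdadm --manage "$ARRAY" --add "$SPARE"        # add a replacement or spare
fi
```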

This article will present a simple example with two drives. For this article, a CentOS 5.3 distribution was used on the following system:


  • GigaByte MAA78GM-US2H motherboard
  • An AMD Phenom II X4 920 CPU
  • 8GB of memory
  • Linux 2.6.30 kernel
  • The OS and boot drive are on an IBM DTLA-307020 (20GB drive at Ultra ATA/100)
  • /home is on a Seagate ST1360827AS
  • There are two drives for testing. They are Seagate ST3500641AS-RK with 16 MB cache each. These are /dev/sdb and /dev/sdc.

Using this configuration a simple RAID-1 configuration is created between /dev/sdb and /dev/sdc.

Step 1 – Set the ID of the drives
The first step in the creation of a RAID-1 group is to set the ID of the drives that are to be part of the RAID group. The type is “fd” (Linux raid autodetect) and needs to be set for all partitions and/or drives used in the RAID group. You can check the partition types fairly easily:

# fdisk -l /dev/sdb

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       60801   488384001   fd  Linux raid autodetect
# fdisk -l /dev/sdc

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1       60801   488384001   fd  Linux raid autodetect
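If the partitions don’t already carry the “fd” id, fdisk’s interactive “t” command will set it; sfdisk can do the same non-interactively. A sketch of the sfdisk route — note the flag name changed across util-linux versions (older sfdisk used --change-id, newer releases call it --part-type), and the loop is guarded so it does nothing unless you explicitly opt in:

```shell
#!/bin/sh
# Sketch: set partition 1 on each test drive to type fd (Linux raid autodetect).
# Older sfdisk spells the flag --change-id; newer util-linux uses --part-type.
# Guarded: set RUN_RAID_DEMO=1 on a scratch machine before this does anything.
for disk in /dev/sdb /dev/sdc; do
    if [ "${RUN_RAID_DEMO:-0}" = "1" ] && [ -b "$disk" ]; then
        sfdisk --change-id "$disk" 1 fd 2>/dev/null || \
        sfdisk --part-type "$disk" 1 fd
    fi
done
```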

Step 2 – Create the RAID set using mdadm
The tool mdadm makes creating a RAID group easy. For this article, a simple two-disk RAID-1 group is created.

[root@test64 ~]# mdadm --create --verbose /dev/md0 --level raid1 --raid-devices=2 /dev/sdb1 /dev/sdc1
mdadm: /dev/sdb1 appears to contain an ext2fs file system
    size=244187136K  mtime=Sun Aug 16 13:06:51 2009
mdadm: /dev/sdc1 appears to contain an ext2fs file system
    size=244187136K  mtime=Sun Aug 16 13:06:51 2009
mdadm: size set to 488383936K
Continue creating array? y
mdadm: array /dev/md0 started.

The options are fairly easy to understand. The first option, “--create”, creates a RAID group (naturally). After the “--verbose” option is the md device, in this case /dev/md0. After that is the RAID level (“--level”) – in this case raid1. Finally, the RAID devices are specified using the “--raid-devices” option. Also notice that mdadm prompts the user if there is already a file system on the drives (partitions).
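One housekeeping step worth doing after the array exists: recording it in mdadm’s configuration file so it assembles at boot. A sketch, assuming the distribution keeps the file at /etc/mdadm.conf (some put it at /etc/mdadm/mdadm.conf instead), guarded so it only runs when you explicitly opt in:

```shell
#!/bin/sh
# Sketch: append an ARRAY line for each running md device to mdadm.conf
# so the arrays assemble automatically at boot.
# Guarded: set RUN_RAID_DEMO=1 on a scratch machine before this does anything.
CONF=/etc/mdadm.conf
if [ "${RUN_RAID_DEMO:-0}" = "1" ] && [ -b /dev/md0 ]; then
    mdadm --detail --scan >> "$CONF"
    cat "$CONF"    # typically one ARRAY line per md device
fi
```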

RAID works at the block level. That is, the RAID controller, be it software RAID or hardware RAID, operates on the blocks of the devices in the RAID group, which means it is independent of the file system. Consequently, immediately after the RAID-1 group is created, the drives are “synchronized”: the contents of the blocks from the first partition (drive) are copied to the second partition (drive). Below is the output of that synchronization process at various stages of completion (just to give you an idea of speed and time).

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
      488383936 blocks [2/2] [UU]
      [>....................]  resync =  0.2% (1444224/488383936) finish=112.3min speed=72211K/sec

unused devices: <none>

Notice that the status of the synchronization process is found by “cat-ing” the contents of the file /proc/mdstat.
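Because /proc/mdstat is plain text, progress is easy to scrape in a script. A small sketch that pulls the resync percentage out of mdstat-style output — demonstrated here against the sample line from above rather than a live /proc/mdstat:

```shell
#!/bin/sh
# Extract the resync percentage from mdstat-style text.
resync_pct() {
    grep -o 'resync = *[0-9.]*%' | awk '{print $NF}'
}

# Demo against the sample output shown above; on a real system you
# would feed it the live file instead:  resync_pct < /proc/mdstat
sample='      [>....................]  resync =  0.2% (1444224/488383936) finish=112.3min speed=72211K/sec'
echo "$sample" | resync_pct    # prints 0.2%
```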

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
      488383936 blocks [2/2] [UU]
      [==========>..........]  resync = 50.1% (245077952/488383936) finish=57.4min speed=70554K/sec

unused devices: <none>
[root@test64 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
      488383936 blocks [2/2] [UU]
      [================>....]  resync = 80.1% (391254144/488383936) finish=25.9min speed=62269K/sec

unused devices: <none>
[root@test64 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
      488383936 blocks [2/2] [UU]
      [===================>.]  resync = 99.6% (486830720/488383936) finish=0.5min speed=47731K/sec

unused devices: <none>

After the synchronization process is finished, the output should look like the following.

# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
      488383936 blocks [2/2] [UU]

unused devices: <none>
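At this point the array is ready for use. The file system goes onto /dev/md0 itself, never onto the member partitions (a point that also comes up in the comments). A sketch of the final step — mount point and ext3 are illustrative choices, and the commands are guarded so they do nothing unless you explicitly opt in:

```shell
#!/bin/sh
# Sketch: put an ext3 file system on the md device, then mount it.
# The file system must go on /dev/md0, NOT on /dev/sdb1 or /dev/sdc1.
# Guarded: set RUN_RAID_DEMO=1 on a scratch machine before this does anything.
MD=/dev/md0
MNT=/mnt/raid1
if [ "${RUN_RAID_DEMO:-0}" = "1" ] && [ -b "$MD" ]; then
    mkfs.ext3 "$MD"
    mkdir -p "$MNT"
    mount "$MD" "$MNT"
    df -h "$MNT"     # shows slightly less than one member's raw capacity
fi
```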

Comments on "Linux Software RAID – A Belt and a Pair of Suspenders"

vu2lid

Other interesting/useful GNU/Linux replication ideas:

DRBD http://www.drbd.org
(over tcp/ip, easy to setup High Availability if required)

Unison http://www.cis.upenn.edu/~bcpierce/unison/
(user-space, cross platform)

Reply
j3

Typo: “These are /dev/sdb and /dev/sdb” s/b “These are /dev/sdb and /dev/sdc”

Reply
laytonjb

@j3:

Yep – good catch. I’ll fix that typo later.

Jeff

Reply
wodenickel

I have an existing fakeraid 5 created from within Windows XP Home. Is it possible to access this from Ubuntu 9.04 using dm? My goal would be to dual boot & access it from both Win & Linux.

thanks!
GREAT ARTICLE – I read the docs but a worked example made it sink in.

Reply
dbbd

The two articles, about LVM and about software RAID, call for a third talking about the two in conjunction.
Do you RAID LVs, or build PVs from RAID groups? Which is better from performance, functionality, and stability points of view?

The articles are good introductions. I’d like to see you going deeper.

Reply
laytonjb

@dbbd:

I’m working on that article :) Determining which one is “best” is becoming somewhat subjective. In general, the best approach is to use RAID (md) at the lowest level and then use LVM on top of that. The simple reason is that you can expand the file system much more easily using LVM than md.

The questions I’ve been examining become things such as:

  • Do you split the drives into partitions and, if so, how?
  • How do you combine the drives into RAID groups depending upon the RAID level?
  • Do you use LVM for striping as well as md, or is it one or the other?

So there are a bunch of considerations which makes the article much more difficult to write – I need to examine lots of options.

Ultimately what I would like to produce is something of a “contrast” list. It will list the various approaches or ideas and then list the pros and cons, because I think choosing the “best” is subjective (I haven’t seen an article like this – have you?).

Thanks for the feedback!

Jeff

Reply
laytonjb

@wodenickel

I don’t know if you can do that. I’m guessing that it will be very difficult. md would have to understand how Windows builds the RAID. Then Linux would have to understand the file system (if it’s NTFS, then read-only is fairly straightforward, and you can use NTFS-3G for read/write).

Did a google search turn up anything?

Jeff

Reply
lesatairvana

Thanks for this informative article. However, you say that “you can now put a filesystem onto /dev/md0”. Actually, it has been my experience that you MUST put the filesystem onto /dev/md0 and NOT any drive used in the RAID. If you have two drives that you want in your RAID and you mkfs.ext3 each, then when they are included in the RAID, the size of the filesystem will be larger than what the RAID can handle and you’ll get “attempt to write beyond end of device”. Doing mkfs.ext3 on /dev/md0 effectively puts a filesystem onto the whole RAID group, but the number of available blocks is slightly smaller than the individual members could accommodate.

Reply
laytonjb

@lesatairvana:

You are correct, sort of. If you want to stay with a simple RAID-1 with two disks, then yes, you have to put the file system on /dev/md0. But you can also use /dev/md0 as a building block for something else. For example, you can create /dev/md0 and /dev/md1, each from a pair of disks, and then create a RAID-0 on top of them.

Disclaimer – I’ve never done this, but I’ve been told you can do it (if I can get a couple more disks into my case, I will try it).
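A sketch of what that nesting might look like, with four hypothetical partitions (sdb1 through sde1) — untested here as well, and guarded so it does nothing unless explicitly enabled on a scratch machine:

```shell
#!/bin/sh
# Sketch of RAID-1+0 via nested md devices. Partition names are hypothetical.
# Guarded: set RUN_RAID_DEMO=1 on a scratch machine before this does anything.
TOP=/dev/md2
if [ "${RUN_RAID_DEMO:-0}" = "1" ] && [ -b /dev/sdb1 ] && [ -b /dev/sde1 ]; then
    # Two mirrors, each built from a pair of partitions...
    mdadm --create /dev/md0 --level=raid1 --raid-devices=2 /dev/sdb1 /dev/sdc1
    mdadm --create /dev/md1 --level=raid1 --raid-devices=2 /dev/sdd1 /dev/sde1
    # ...then a stripe across the two mirrors.
    mdadm --create "$TOP" --level=raid0 --raid-devices=2 /dev/md0 /dev/md1
fi
```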

Jeff

Reply
almac

More title than topic – in the UK, we’d say “a belt and braces approach” – “suspenders” being the things which ladies used to hold up stockings, before the invention of tights. So what do you call those? I’d like to know, my wife would like to know, but we don’t want to get into porno hell trying to find out. She tells me.

Reply
levonshe

Hi, thanks for the article. One thing I do not understand – why to synchronize disks before any data was put on them (even mkfs was done after sync)?

Reply
