Data Replication Using rsync

Having just discussed replication in Linux -- what it is, how it can be used and how it's not the same as a backup -- it's time to tackle a simple example of one of the replication tools: rsync. You will be surprised how easy it is to use rsync to replicate data to a second storage pool.

Replication Review

Recently we discussed the importance of data replication for situations ranging from mission critical environments to home users. Replication ensures that you have a copy of your current data on a separate storage environment (secondary system) so that if you lose the first system (primary system), you still have access to the data.

In general there are two types of replication: synchronous and asynchronous. Synchronous replication, as the name implies, means that the primary storage and secondary storage are kept exactly the same. Any data writes (or deletes) have to complete on both the primary and secondary storage pools before returning to the application allowing it to continue. This means that the data in the data pools is an exact match.

Asynchronous replication allows any writes (or deletes) from an application to finish on the primary storage pool first. The data is then copied from the primary storage pool to the secondary storage pool, typically outside of the application I/O path. This means that there can be some data difference between the two pools at any instant in time. The amount of data difference you can tolerate depends on your requirements, but you can shrink that difference to something fairly small and tolerable.

But one of the most important aspects of data replication that you shouldn’t forget is that data replication and data backup are not the same thing. A backup can keep prior versions of data so that you can effectively go back in time over the life of the data to get prior versions. On the other hand, replication keeps a replica (copy) of the current data. You can get the most current state of the data from a backup but it will only be as accurate as when the backup was made. Replicas are much more recent so they will, in general, capture the data changes since the last backup. The question you need to answer is when do I need replication?

There is no universal answer to that question. You need to examine your data requirements and determine how important having the latest copy of the data is to your mission and the importance of the accessibility of that data. During your examinations you should also weave in discussions about off-site disaster recovery for your data center (or your home system).

In the previous article about replication, two replication techniques for Linux were discussed: DRBD and rsync. DRBD is a kernel-based replication method that automatically replicates all data from the primary storage pool to a secondary storage pool without the user or administrator having to intervene. On the other hand, rsync is a file-based approach that lets you selectively replicate directory trees so that you don’t have to replicate an entire file system. However, rsync must be invoked manually, so it’s not as automatic as DRBD, which increases the possible data differences between the primary and secondary storage pools. But many times the flexibility of replicating only portions of the data to one or more secondary storage pools makes it a popular choice despite the possibility of increased data differences.
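Because rsync only runs when something invokes it, one way to narrow the data difference is to invoke it on a schedule. The following is a minimal sketch of that idea, not a method taken from this article: the function name sync_if_reachable, the script path in the cron comment, and the 15-minute interval are all hypothetical, and the check simply skips a run when the secondary system does not answer a ping.

```shell
#!/bin/sh
# Hypothetical helper: replicate only when the secondary system is reachable.
#   $1 = server host, $2 = local source path, $3 = path on the server
sync_if_reachable() {
  server=$1
  src=$2
  dest=$3
  # One ping probe; skip this run if the server does not answer.
  if ping -c 1 "$server" >/dev/null 2>&1; then
    rsync --recursive --times --rsh=/usr/bin/ssh "$src" "$server:$dest"
  else
    echo "server $server not reachable; skipping this run" >&2
  fi
}

# Example crontab entry (assumed script path) that calls a script built
# around this function every 15 minutes:
#   */15 * * * * /home/user/bin/replicate.sh
```

The shorter the interval, the smaller the window in which the two pools can differ, at the cost of more frequent scans of the source tree.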

Since rsync is so flexible and is file based, in this article I want to show a simple rsync example to illustrate what it can do for data replication.

Simple Example Using rsync

I want to present a simple rsync example to illustrate the basic steps involved in getting rsync to replicate a directory tree. One of the advantages of rsync is that you don’t have to make a copy of the entire file system; you can copy just a specific directory tree, and even just specific file types or file names. This makes it incredibly flexible, since you can replicate different directory trees to various secondary storage servers as needed and/or focus on certain types of files.
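To make the "specific file types" point concrete, here is a small local sketch that copies only *.txt files out of a directory tree. It is an illustration under assumptions rather than part of the original example: the directory and file names are made up, and both "pools" are temporary directories on the same machine so the script can be tried without a second system.

```shell
#!/bin/sh
# Local demonstration of selective replication with rsync filter rules.
set -eu
command -v rsync >/dev/null 2>&1 || { echo "rsync is not installed" >&2; exit 0; }

# Throwaway source and destination trees (hypothetical contents).
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/reports" "$src/photos"
echo "notes"  > "$src/notes.txt"
echo "report" > "$src/reports/q1.txt"
echo "image"  > "$src/photos/cat.jpg"

# Rules are checked in order: descend into every directory, keep *.txt,
# exclude everything else. --prune-empty-dirs skips directories that end
# up with nothing to copy.
rsync --recursive --times --prune-empty-dirs \
  --include '*/' --include '*.txt' --exclude '*' \
  "$src/" "$dst/"

# List what was replicated.
find "$dst" -type f | sort
```

The order of the rules matters because rsync applies the first pattern that matches; placing the exclude first would filter out everything.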

For this article I will be using rsync to replicate data from my laptop to my main storage box at home. The overall concept is that I come home from being on the road, I fire up the laptop on the home network and my data gets replicated to my home server. In addition, I will only be replicating a specific directory from the laptop to the home server since that is where I keep all the data I modify while I travel.

Let’s start by defining terms within the rsync framework. The first two systems or terms that we need to define are the rsync server and the rsync client. One would think that the traditional client/server terms would apply, but a common source of confusion in rsync is that the rsync server does not necessarily have to be the system that has the original copy of the data and the rsync client does not have to be the recipient of the data. To better understand rsync, remember that there is a distinction between roles and processes in rsync. So to make sure we understand all the terms used in this article, there are four terms we’ll be using (taken from this link).


  • Client: This is a role within rsync where the client initiates the synchronization.
  • Server: This is a role within rsync and refers to the remote process (system) that clients connect to, whether within a local transfer, via a remote shell, or via a network socket.
  • Sender: This is a role and a process within rsync and applies to the particular rsync process that has access to the original files being synchronized. So the “sender” process reads the data and sends it to the “receiver” process.
  • Receiver: This is a role and a process within rsync and describes the receiving process that receives the updates to the data and writes it to the storage device (i.e. the secondary storage pool).
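The difference between these roles is easiest to see by comparing a push with a pull. The sketch below is only an illustration: the function names push_replica and pull_replica, and the host names and paths in the comments, are hypothetical. In both cases the machine where the command is typed is the client; what changes is which side is the sender and which is the receiver.

```shell
#!/bin/sh
# Push: run on the machine that holds the original data.
# The local rsync is the client and the sender; the remote side is the
# server and the receiver.
#   $1 = local directory, $2 = remote host, $3 = remote directory
push_replica() {
  rsync --recursive --times --rsh=/usr/bin/ssh "$1/" "$2:$3/"
}

# Pull: run on the machine that will hold the copy.
# The local rsync is still the client, but now it is the receiver; the
# remote side is the server and the sender.
#   $1 = remote host, $2 = remote directory, $3 = local directory
pull_replica() {
  rsync --recursive --times --rsh=/usr/bin/ssh "$1:$2/" "$3/"
}

# Hypothetical usage:
#   push_replica /home/user/Documents homeserver /data/user/replica
#   pull_replica laptop /home/user/Documents /data/user/replica
```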

To be honest, things can get a little confusing between roles and processes, but I like to keep things simple. In this example I’m replicating data from my laptop to my home server, so the home server is the rsync server and the laptop is the rsync client. The laptop is also the rsync sender, since it has the original data, and the home server is the rsync receiver. The idea is that when I plug the laptop into my home network, my data is automatically replicated to my home server so I have a copy in case the laptop dies.

A reasonably good tutorial to use to start learning rsync is here. It is a bit old (1999) but it has a very good overview of rsync and explains things fairly well. There are other tutorials that cover useful topics such as how to use ssh with rsync or using stunnel with rsync.

For the rsync command used for my simple scenario let’s start with the simple example in the “everythinglinux” tutorial. Here is the script I used to perform the rsync that is taken from the article and adapted to my situation (notice that there are few changes).

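# Replicate the contents of Documents to the home server over ssh.
# --delete removes files from the destination directories that are gone
# from the source; --exclude "*bak" skips names ending in "bak".
rsync --verbose --progress --stats --compress --rsh=/usr/bin/ssh \
  --recursive --times --perms --links --delete \
  --exclude "*bak" \
  /home/laytonjb/Documents/* 192.168.1.8:/data/laytonj/rsync_test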
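Since --delete removes files on the receiving side, it can be worth previewing a run before letting it change anything. The sketch below wraps the same options in a function so that extra options such as --dry-run (a standard rsync option that reports what would be transferred or deleted without doing it) can be passed in. The function name replicate and the SRC and DEST variables are hypothetical. It also uses a trailing slash on the source instead of the shell wildcard, which is a change from the command above: the rsync manual notes that --delete only acts on directories rsync was asked to send, so with Documents/* files removed from the top level of Documents would not be removed from the copy, and hidden files would not be sent at all.

```shell
#!/bin/sh
# Same source and destination as the article's command, but with a
# trailing slash on the source instead of a shell wildcard.
SRC=/home/laytonjb/Documents/
DEST=192.168.1.8:/data/laytonj/rsync_test

# Hypothetical wrapper: any arguments are passed to rsync as extra options.
replicate() {
  rsync --verbose --progress --stats --compress --rsh=/usr/bin/ssh \
    --recursive --times --perms --links --delete \
    --exclude "*bak" "$@" \
    "$SRC" "$DEST"
}

# Typical sequence (not run automatically here):
#   replicate --dry-run    # preview transfers and deletions only
#   replicate              # perform the replication
```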
