Let's face it, even rsync and tar are better than nothing.
Everyone in IT knows that a good backup strategy is critical. The unfortunate reality, however, is that all too many users, from the home to the enterprise, don’t yet have an adequate backup system in place. If you count yourself among them, consider this New Year’s resolution: “I will back up my data.”
When formulating your backup plan, ask yourself these questions: What do I need to back up? How often does it need to be backed up? How long will I need to keep the data? What medium would I prefer to back up to? Armed with answers, you can begin to search for a solution that fits your needs.
Your search may lead you to applications such as Arkeia, Amanda, and dump, and to a variety of storage media, such as DLT, DAT, and CD-R. Enterprise users will likely look into solutions such as Legato, ARCserve, and VERITAS. Your backup needs should dictate the solution. It’s also extremely important that you fully test your backup (and restore!) procedure after it’s put into place. The worst time to discover that your system is flawed is after data loss has already occurred.
Here, we’ll focus on two standard Linux utilities, rsync and tar, that can quickly and easily back up data over a network (LAN or WAN) to a hard drive on a remote machine. While rsync and tar lack some of the more advanced features of other (even commercial) backup applications, the two tools are simple to configure, free to use, and readily available. Chances are your Linux distribution includes both.
The first utility, rsync, synchronizes source and destination directories, either locally or remotely. One of rsync’s greatest strengths is that it only transfers the differences between two sets of files, which saves bandwidth and transfer time. However, a major drawback to using rsync as a backup method is that if a file becomes corrupted or is accidentally deleted, rsync replicates the corruption or deletion. You can somewhat mitigate this problem by syncing to rotating directories, such as one directory for each day of the week. (We’ll cover this option, but if you prefer discrete snapshots, tar may be a better option for you.)
The syntax for rsync is similar to that of cp. The basic command to replicate from a local machine to a remote one is:
$ rsync -e ssh -a --delete /usr/local/backup email@example.com:/home/backups
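This command recursively replicates /usr/local/backup on the local machine into /home/backups/ on the remote host, while preserving symbolic links, permissions, file ownership, timestamps, and devices. Because the source path has no trailing slash, rsync copies the directory itself, so the files land in /home/backups/backup/; add a trailing slash to the source if you want only its contents copied. The -e option tells rsync to use a secure ssh connection instead of rsh (the default in older rsync releases). The --delete option removes any file from the remote side that no longer exists on the local side.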
To use this as a backup method, simply schedule the above command with cron, setting your desired frequency. Before you do that, though, make sure the password-less logins we set up using ssh keys in the July 2004 “Tech Support” (http://www.linux-mag.com/2004-07/tech_support_01.html) are working. Without ssh keys, cron will just hang, waiting for your password.
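One way to confirm that key-based login works before handing the job to cron is sketched below. It assumes OpenSSH, whose BatchMode option disables password prompts, so a missing key produces an error instead of a hang. The function name and the script path in the sample crontab line are hypothetical.

```shell
#!/bin/sh
# can_login_without_password USER@HOST
# Succeeds only if ssh can log in without prompting for a password.
# Assumes OpenSSH: BatchMode=yes makes ssh fail rather than ask.
can_login_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Sample crontab entry (added with "crontab -e") that runs a backup
# script at 2:30 every night; /usr/local/bin/backup.sh is a placeholder:
#   30 2 * * * /usr/local/bin/backup.sh
```

If the function returns a non-zero status for your backup account, fix the key setup before scheduling anything.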
You can use the following script to keep any files that rsync would overwrite or delete in a separate directory named for the day of the week:
BACKUPDIR=$(date +%A)    # full weekday name in the current locale, e.g. Monday
rsync -e ssh -a --delete --backup --backup-dir=/home/backups/$BACKUPDIR /usr/local/backup firstname.lastname@example.org:/home/backups/today
rsync is extremely flexible and has tons of options, so read its man page to tweak the examples to better suit your needs.
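One option worth trying before you trust --delete with real data is --dry-run, which makes rsync report what it would do without changing anything. The following is a minimal sketch, assuming an rsync release that supports --itemize-changes; the function name preview_sync is made up for illustration.

```shell
#!/bin/sh
# preview_sync SRC DEST
# Show what "rsync -a --delete" would transfer or remove, without
# modifying either side. (preview_sync is a hypothetical helper name.)
preview_sync() {
    # --dry-run: make no changes
    # --itemize-changes: print one line per file that would be affected
    rsync -e ssh -a --delete --dry-run --itemize-changes "$1" "$2"
}

# Example, using the same paths as the rsync command earlier in this article:
#   preview_sync /usr/local/backup email@example.com:/home/backups
```

In the output, lines that begin with *deleting name the files that a real run would remove from the destination.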
tar is a backup program designed to store and extract files from an archive file, better known as a tarfile. Using tar for backup is easy: just place everything you want to archive into the tarfile, and copy the tarfile to another machine for safekeeping. This technique stores every file every time, and lets you recover a file from an arbitrary point in time. You can also use gzip to make tarfiles smaller.
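The basic create, list, and extract operations can be sketched as three small shell functions. This assumes GNU tar or a compatible implementation that supports the z and -C options; the function names are illustrative only.

```shell
#!/bin/sh
# make_archive ARCHIVE DIR          create a gzip-compressed tarfile of DIR
# list_archive ARCHIVE              list the files stored in ARCHIVE
# restore_file ARCHIVE DEST MEMBER  extract one MEMBER into directory DEST
# (All three names are hypothetical helpers, not part of tar.)
make_archive() {
    # -C switches to DIR's parent first, so stored paths are relative
    tar -czpf "$1" -C "$(dirname "$2")" "$(basename "$2")"
}

list_archive() {
    tar -tzf "$1"
}

restore_file() {
    tar -xzpf "$1" -C "$2" "$3"
}

# Example:
#   make_archive /tmp/backup.tar.gz /usr/local/backup
#   list_archive /tmp/backup.tar.gz
#   restore_file /tmp/backup.tar.gz /tmp/restore backup/some/file
```

Listing an archive first shows the exact member name to pass when you need to pull out a single file.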
The following script creates a compressed tarfile of the local /usr/local/backup/ directory, places the archive in /tmp/ with a filename that contains the year, month, and day, and copies it to /home/backups on backup.host:
date=$(date +%Y%m%d)    # year, month, day (assumed format: YYYYMMDD)
tar zcpf /tmp/backupfile-$date.gz /usr/local/backup
scp /tmp/backupfile-$date.gz email@example.com:/home/backups
The /home/backups/ folder on backup.host now accumulates your timestamped backup files. (If you want to run this script more than once a day, edit the date line to ensure unique filenames.) To periodically remove old files, write a script that removes them based on your backup policy.
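A cleanup script of that kind might look like the sketch below. It assumes a purely age-based policy and the backupfile-*.gz naming used above; the function name and the 30-day retention period are illustrative choices, not recommendations.

```shell
#!/bin/sh
# prune_backups DIR DAYS
# Delete backupfile-*.gz files in or below DIR that were last modified
# more than DAYS days ago. (prune_backups is a hypothetical helper name.)
prune_backups() {
    find "$1" -type f -name 'backupfile-*.gz' -mtime +"$2" -exec rm -f {} +
}

# Example, run on backup.host: keep roughly the last 30 days of archives
#   prune_backups /home/backups 30
```

Because the match is limited to the backupfile-*.gz pattern, other files in the directory are left alone.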
While neither rsync nor tar alone constitutes a comprehensive backup strategy, both allow you to quickly and reliably back up content to a remote machine using standard tools. They also make restores trivial.
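Since a backup is only as good as its restore, here is a minimal sketch of a restore test for the tar method. It assumes the archive was created from an absolute path, as in the script above, and that tar strips the leading slash from stored names (GNU tar does this by default). The function name verify_archive is made up.

```shell
#!/bin/sh
# verify_archive ARCHIVE SRCDIR
# Extract ARCHIVE into a scratch directory and compare the result with
# SRCDIR, which must be the absolute path that was archived.
# Assumes tar stored the path without its leading "/".
# (verify_archive is a hypothetical helper name.)
verify_archive() {
    scratch=$(mktemp -d) || return 1
    # "$scratch/$2" is where the restored copy of SRCDIR ends up
    tar -xzpf "$1" -C "$scratch" &&
        diff -r "$2" "$scratch/$2"
    status=$?
    rm -rf "$scratch"
    return $status
}

# Example:
#   verify_archive /tmp/backupfile-20050131.gz /usr/local/backup
```

A zero exit status means the restored tree matches the source. Any differences are printed by diff, which is expected if files have changed since the archive was made.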