AMANDA Network Backups

One problem faced by administrators of large and small networks alike is performing backups. While backups are simple to perform and a variety of software and hardware is available to help, the logistics of reliably archiving even a handful of machines are quite onerous. Having each user back up his or her own computer seems an obvious solution, but is really quite impractical: equipping each workstation and server with its own backup hardware is prohibitively expensive, and asking users to schedule and run their own backups is tantamount to ignoring the problem altogether.

So, many administrators favor a centralized approach, where a single computer, the backup server, controls a network-wide backup process. Using this centralized approach, once a system is configured as a backup client, little or no manual intervention is required to perform frequent backups of that system.

Unfortunately, this centralized approach to backups also creates a scheduling nightmare, and can pose authorization problems. You can use NFS or even tar’s own networking features to get the data from each personal computer to the server, but those solutions fail to scale well.

Another approach is to use more specialized network backup software. One such package is the Advanced Maryland Automatic Network Disk Archiver (or AMANDA), a powerful open source program for performing tape backups of an entire network. You can download AMANDA at http://www.amanda.org.

Principles of AMANDA Operation

For Linux or Unix boxes, AMANDA uses server software running on the backup clients. This software can access the computer’s entire disk, but requires password authentication, unlike NFS. For Windows boxes, AMANDA relies upon the standard Windows file sharing tools, which the AMANDA backup server or another Unix or Linux system can access via smbclient.

In either case, the network client running on the AMANDA backup server contacts the AMANDA servers running on the backup clients to initiate data transfers on a regular basis. To restore data, AMANDA runs a pair of servers on the backup server computer and a client program on the backup client.

In addition to the AMANDA network client and server, the system relies on archiving software on the backup clients themselves. Specifically, AMANDA uses tar or dump to archive data. On Linux, dump is of increasingly limited utility — it works with ext2fs and ext3fs filesystems, but not with other journaling filesystems, such as ReiserFS, XFS, and JFS. (XFS has its own dump variant, xfsdump.) When backing up Windows systems, AMANDA uses tar on the Unix or Linux interface system in conjunction with the Windows file sharing server software on the backup client system.

AMANDA ships with many distributions, so you may be able to install it very easily. Unfortunately, the program depends on certain defaults that are set at compile time, and these defaults vary from one distribution’s build to another. For this reason, if you’re trying to back up a network that uses a variety of Linux distributions, you may find it easiest to obtain the source code from the AMANDA web site and compile it yourself. Alternatively, you may be able to install an AMANDA package from one distribution on many distributions, but this approach may not always work due to library differences and other dependency issues. This column was based on binaries built from the AMANDA 2.4.4 source code.

AMANDA uses a special backup account, typically called amanda, and part of the amanda group. You must create an account for amanda on both the backup server and each backup client. On the backup clients, this account or group should have full read access to all files on the system. In some cases, this means you must use the root account, but careful planning may provide a way for you to use another account, or perhaps use a dedicated account in the root group. On the backup server computer, the backup account needs full read/write access to your tape device file. You may need to adjust ownership or permissions on this file.

One of AMANDA’s greatest strengths is that it functions as a scheduling tool. Rather than try to back up every file on every run, AMANDA spreads the backup load across the network in an even manner, managing backups and restores intelligently. AMANDA also keeps track of the tapes where files were stored, so that when you want to restore a file, the system can tell you which tape to insert in the tape drive.

Typically, you create a backup schedule and then call AMANDA from a cron job on the backup server computer. You must ensure that the correct tape is in the backup server’s drive, or AMANDA won’t be able to store data on the tape.

Configuring the Backup Clients

The AMANDA backup clients run server software called amandad. This software is normally run from a “super server,” such as inetd or xinetd. A typical /etc/inetd.conf entry for amandad looks like this:

amanda dgram udp wait amanda amandad amandad

If you’re running Mandrake or Red Hat, which use xinetd, you need to create a file in /etc/xinetd.d to launch the AMANDA server. This file must contain an entry like the following, which is equivalent to the preceding /etc/inetd.conf entry:

service amanda
{
    socket_type = dgram
    protocol = udp
    wait = yes
    user = amanda
    server = /usr/local/sbin/amandad
}

Of course, you may need to change the path to the server, depending on where amandad is installed on your system.

Whether you use inetd or xinetd to launch the server, you must also add an entry to /etc/services (if it’s not already present) to tell the super server what port to use:

amanda 10080/udp

And if it’s not already present on the system, you must also create an account for AMANDA. Examples here use the username amanda, but you can call the account something else, if you prefer.

In any case, the name you use must match the one specified with the --with-user option when you compiled the program. The user that you choose must have full read access to all of the files that are to be backed up.

Finally, the backup user’s home directory needs to contain an authentication file. This file is called .amandahosts, and contains the name of the backup server computer and the user on that computer who may contact the amandad server running on the backup client.

For instance, the following entry gives the amanda user on buserver.pangaea.edu the right to grab files for backup:

buserver.pangaea.edu amanda

Once you’ve finished configuring the backup client, you must restart the super server. Typing killall -HUP inetd or killall -HUP xinetd usually does the job.
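The client-side file changes described above could be scripted roughly as follows. This is a minimal sketch, not something that ships with AMANDA: the ensure_line function is a hypothetical helper, the demonstration writes to a scratch directory rather than to the real /etc/services or the amanda user's home directory, and the chmod 600 step is an assumption about keeping the authentication file private.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
# (Hypothetical helper for illustration; not part of AMANDA.)
ensure_line() {
    file=$1
    line=$2
    touch "$file"
    # -x matches whole lines, -F treats the text literally, -q is silent
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstration against scratch files; use the real paths only after
# reviewing the result.
scratch=$(mktemp -d)
ensure_line "$scratch/services" "amanda 10080/udp"
ensure_line "$scratch/services" "amanda 10080/udp"   # second call adds nothing
ensure_line "$scratch/amandahosts" "buserver.pangaea.edu amanda"
chmod 600 "$scratch/amandahosts"   # assumption: keep the file private
cat "$scratch/services" "$scratch/amandahosts"
```

Because the helper only appends when the exact line is missing, it can be run repeatedly on the same client without creating duplicate entries.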

Configuring the Backup Server

The backup server doesn’t need to run server software to handle backup operations, but it does need to run a pair of servers to handle client-initiated restores. These servers are handled by /etc/inetd.conf entries like the following:

amandaidx stream tcp nowait amanda amindexd amindexd
amidxtape stream tcp nowait amanda amidxtaped amidxtaped

The matching /etc/services entries are:

amandaidx 10082/tcp
amidxtape 10083/tcp

More important than the servers that handle restoration is the configuration to handle backups. This is handled through a file called amanda.conf, which is installed in a subdirectory named after a backup set.

For example, if you have a daily dump strategy, you might place its configuration in /usr/local/etc/amanda/daily/amanda.conf. Ordinarily, the user who runs AMANDA (such as amanda) must have write access to the AMANDA configuration directory because the program writes summary information to files located there.

An example configuration file is shipped in the example directory of the source distribution, but this file does not work as shipped. Some of the options you might need to change include:

* The dumpuser option should specify the username you provided with the --with-user parameter when you compiled the package.

* dumpcycle defines the length of time AMANDA takes to perform one complete backup of your network.

* runspercycle is the number of times the AMANDA program (amdump) is run per dump cycle.

* tapecycle is the number of tapes that may be used in a single dump cycle. This number is normally slightly larger than runspercycle to allow for damaged tapes.

* runtapes is the number of tapes to be used per run of amdump. It’s normally set to 1 for non-changer tape drives, but may be higher for changers.

* tapedev is the non-rewinding tape device, such as /dev/nst0 or /dev/nht0.

* tapetype is the type of the tape device. Several tape types are defined later in the file. You can also create a new definition, which you insert into amanda.conf, by using the amtapetype command, as in amtapetype -f /dev/nst0. This command can take several hours to complete, though.

* labelstr sets the label for AMANDA tapes. This option takes a regular expression as an argument.

* holdingdisk spans several lines and sets information on a holding disk — a part of your hard disk that AMANDA uses to temporarily store data retrieved from the backup clients. Ideally, this area should be at least as large as a full day’s backups, but it can be smaller.
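To show how the options in this list fit together, the following sketch writes an illustrative amanda.conf fragment to a scratch file for review. Every value is a placeholder rather than a recommendation: the five-run weekly schedule, the HP-DAT tape type name, the /dumps/amanda holding directory, and the 2000 Mb limit are all assumptions, and write_sample_conf is a hypothetical helper name.

```shell
#!/bin/sh
# write_sample_conf FILE: write an illustrative amanda.conf fragment to FILE.
# Every value below is a placeholder to adapt, not a recommendation.
write_sample_conf() {
    cat > "$1" <<'EOF'
dumpuser "amanda"           # must match the --with-user build setting
dumpcycle 1 week            # time allowed for one complete network backup
runspercycle 5              # amdump runs per dump cycle (weeknights here)
tapecycle 6 tapes           # slightly more than runspercycle
runtapes 1                  # one tape per run; no changer assumed
tapedev "/dev/nst0"         # non-rewinding tape device
tapetype HP-DAT             # placeholder; must name a defined tapetype
labelstr "^DailySet1[0-9][0-9]*$"
holdingdisk hd1 {
    directory "/dumps/amanda"   # hypothetical holding area
    use 2000 Mb
}
EOF
}

conf=$(mktemp)
write_sample_conf "$conf"
cat "$conf"
```

The heredoc delimiter is quoted so that the shell leaves the dollar sign in labelstr alone. Compare the result against the example file from the source distribution before copying anything into a live configuration.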

The trickiest part of the configuration may be setting the dump cycle. If you try to back up too much data in too short a time, either your backups will take long enough that they’ll interfere with normal operations, or you won’t be able to fit all the data from one day’s backups on a single tape. Setting the tape type can also be tricky, or at least tedious. If your tape drive supports compression, the amtapetype program may yield incorrect results for the tape type. You may be able to increase the capacity of the tape by a factor of 1.5X to 2X, but if you do so and then back up data that’s not easily compressed, your tape may run out of space, causing problems. Alternatively, you can disable compression on your tape hardware (using a jumper or mt commands) and use compression in AMANDA instead.
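A rough capacity check can help when choosing the dump cycle. The sketch below assumes full backups are spread evenly over the runs in a cycle and ignores incrementals, so treat the result as a lower bound. The function names and the 60000 MB and 20000 MB figures are made up for illustration.

```shell
#!/bin/sh
# full_mb_per_run TOTAL_MB RUNS: average full-backup data per amdump run,
# rounded up, if full backups are spread evenly over the dump cycle.
full_mb_per_run() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# fits_on_tape NEEDED_MB TAPE_MB: succeed if the load fits on one tape.
fits_on_tape() {
    [ "$1" -le "$2" ]
}

need=$(full_mb_per_run 60000 5)
echo "about $need MB of full backups per run"
if fits_on_tape "$need" 20000; then
    echo "fits on a 20000 MB tape, before incrementals"
else
    echo "too large: lengthen dumpcycle or add runs or tapes"
fi
```

With 60000 MB of data and five runs per cycle, the estimate is 12000 MB per run. If you rely on hardware compression, use the uncompressed capacity for the tape figure unless you know your data compresses well.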

The example file that ships with AMANDA contains a few lines that you may need to comment out. Most notably, the tpchanger and changerfile lines are likely to cause problems if your tape drive isn’t a changer model. If you do have a changer, you should set the changerfile line to point to a file describing your tape changer model.

You may also need to adjust paths to some data storage files and directories, such as infofile, logdir, and indexdir. These files and directories must exist, and most must be writeable by the AMANDA user.

Once the overall backup strategy is set in amanda.conf, you must define a backup set. This is done in a file called disklist, which can be found in the same directory as amanda.conf. As with amanda.conf, an example file ships with the AMANDA source in the example directory, but this file needs substantial changes to work correctly. Most of the entries in this file resemble the following:

buclient.pangaea.edu /home user-tar

This entry tells AMANDA to back up the /home directory on buclient.pangaea.edu using the user-tar dump type. This dump type sets options such as the backup program to use, whether or not to compress the data, the backup’s priority, and so on. It must be defined in amanda.conf, and the example file includes quite a few predefined dump types.
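Since every disklist entry of the kind shown needs a host, a disk, and a dump type, a quick sanity check before running AMANDA might look like this. It is only a sketch: check_disklist is a hypothetical helper, it understands only single-line entries, and it does not verify that the dump type is actually defined in amanda.conf.

```shell
#!/bin/sh
# check_disklist FILE: report entries with fewer than three fields.
# Exits non-zero if any such entry is found.
check_disklist() {
    awk '
        /^[ \t]*(#|$)/ { next }    # ignore comments and blank lines
        NF < 3 {
            printf "line %d: expected host, disk, and dump type\n", NR
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Demonstration on a scratch file using the host names from this article.
list=$(mktemp)
cat > "$list" <<'EOF'
# host                 disk            dumptype
buclient.pangaea.edu   /home           user-tar
buserver.pangaea.edu   //BILLY/DRIVEC  user-tar
EOF
check_disklist "$list" && echo "disklist entries look well formed"
```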

You can specify directories to be backed up either by directory name or by partition device file. To back up a Windows system, specify the backup server computer or some other Linux system on which Samba is installed as the backup client computer, and then provide the NetBIOS name and share name to be backed up as the directory, as in //BILLY/DRIVEC to back up the DRIVEC share on BILLY. By default AMANDA uses the SAMBA username when connecting to the Windows system. To change this behavior, you must recompile the program with the --with-samba-user option. In any event, the system looks for passwords in the /etc/amandapass file, which contains Windows share names followed by passwords.

Preparing Tapes

Prior to backing up a network, you must prepare tapes. You do that with the amlabel program, which takes two arguments: the backup set name (which is the same as the directory in which you store the AMANDA configuration files), and a label name. The label name must match the pattern provided on the labelstr option in amanda.conf.

For instance, with a labelstr of ^DailySet1[0-9][0-9]*$ and a backup set name of daily, you might type the following two commands to create two backup tapes:

% amlabel daily DailySet101
% amlabel daily DailySet102

You type these commands as the AMANDA user. Each command responds with a brief summary of what it did. When you perform a backup, AMANDA expects tapes in the order in which you prepared them. You should probably develop a naming system to help you locate and manage your backup tapes.
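If you have many tapes to label, you might generate the label names and check each one against the labelstr pattern before running amlabel. The sketch below is a dry run that only prints the commands it would issue. The helper names are hypothetical, and the pattern is the DailySet1 expression from this section, written with no embedded spaces.

```shell
#!/bin/sh
# next_labels PREFIX FIRST COUNT: print COUNT labels, PREFIX plus a
# two-digit number, starting at FIRST (a plain integer such as 1).
next_labels() {
    prefix=$1
    n=$2
    last=$(( $2 + $3 - 1 ))
    while [ "$n" -le "$last" ]; do
        printf '%s%02d\n' "$prefix" "$n"
        n=$(( n + 1 ))
    done
}

# matches_labelstr LABEL REGEX: succeed if LABEL matches the pattern.
matches_labelstr() {
    printf '%s\n' "$1" | grep -Eq -- "$2"
}

# Dry run: print the amlabel commands instead of executing them.
pattern='^DailySet1[0-9][0-9]*$'
for label in $(next_labels DailySet1 1 3); do
    if matches_labelstr "$label" "$pattern"; then
        echo "amlabel daily $label"
    else
        echo "skipping $label: does not match labelstr" >&2
    fi
done
```

Once the printed commands look right, you can run them one at a time as the AMANDA user, changing tapes between commands.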

Performing a Backup

Before you perform a backup, you should verify your configuration by using the amcheck utility. Type amcheck setname on the backup server computer, where setname is the backup set name, to perform this test. The result is a report of any problems amcheck finds with your configuration.

When you first configure the software, you may need to create directories and files to satisfy the software. In most cases, these directories and files must be readable, and often writeable, by the AMANDA user.

Once you’ve performed a test and corrected any problems, you’re ready to perform a backup. Insert one of your prepared tapes in the tape drive and type a command like the following on the backup server computer, which backs up using the daily set:

% amdump daily

If all goes well, your system will begin backing up data to the tape drive from one or more of your backup clients. When it’s done, it will create a log file in the directory you specified on the logdir line in amanda.conf. AMANDA will also email a report to the AMANDA user account.
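The check-then-dump sequence can be wrapped in a small function so that amdump runs only when amcheck succeeds. This is a sketch under two assumptions: that amcheck returns a non-zero exit status when it finds problems, and that both programs are on the AMANDA user's PATH. The run_backup name and the AMCHECK and AMDUMP override variables are hypothetical conveniences for testing, not AMANDA features.

```shell
#!/bin/sh
# run_backup SETNAME: run amdump for SETNAME only if amcheck succeeds.
# AMCHECK and AMDUMP may be set to alternative commands for testing.
run_backup() {
    set_name=$1
    if "${AMCHECK:-amcheck}" "$set_name"; then
        "${AMDUMP:-amdump}" "$set_name"
    else
        echo "amcheck reported problems for $set_name; skipping amdump" >&2
        return 1
    fi
}
```

Calling run_backup daily as the AMANDA user would then check and back up the daily set in one step.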

Ordinarily, you don’t run AMANDA manually, although such runs are useful for testing purposes. Instead, you create a cron job to run AMANDA on a regular basis. As an example, the following line can be added to the AMANDA user’s crontab file to run AMANDA on a daily basis at 2:00 AM:

00 02 * * * /usr/local/sbin/amdump daily

Restoring Data

Of course, backup is pointless if you can’t restore the data that’s archived. Ordinarily, AMANDA handles this task through software installed on the backup client computers. This software relies upon servers running on the AMANDA backup server computer, so be sure the backup server is running well before you proceed. These servers require an authentication file, .amandahosts, in the backup server’s AMANDA user’s home directory, so be sure it’s present and grants access to user root on each backup client.

The amrecover program provides a text-mode interactive utility for recovering data from an AMANDA backup server. You normally run this program as root, and provide several critical parameters to this program on its command line, including the backup set name (-C), the index server name (-s), the tape server name (-t), and the tape device on the tape server (-d). Ordinarily, the index server and the tape server are one and the same, and refer to your AMANDA backup server.

Once you’ve launched the program, it presents a prompt, at which you type commands, such as setdate to set the date of the backup you want to restore, add to add files or directories to be restored, and extract to perform the restore.

Once you type extract, amrecover prompts you to insert the appropriate tape or tapes in the backup server’s tape drive. If the backup server computer is in another room, you’ll have to go there or have somebody else help with the process.
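Putting the command-line parameters together, an amrecover invocation for the daily set might be assembled as in the sketch below. The function only prints the command so that you can inspect it first. The amrecover_cmd name is hypothetical, and the host and device values are the examples used earlier in this article.

```shell
#!/bin/sh
# amrecover_cmd SET INDEX_SERVER TAPE_DEVICE [TAPE_SERVER]
# Print an amrecover command line; the tape server defaults to the
# index server, since the two are ordinarily the same machine.
amrecover_cmd() {
    tape_server=${4:-$2}
    printf 'amrecover -C %s -s %s -t %s -d %s\n' \
        "$1" "$2" "$tape_server" "$3"
}

amrecover_cmd daily buserver.pangaea.edu /dev/nst0

# At the amrecover prompt, a session then uses the commands described
# above, in this order:
#   setdate   choose the date of the backup to restore
#   add       mark files or directories for restoration
#   extract   perform the restore, prompting for tapes as needed
```

Run the printed command as root on the backup client.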

AMANDA is a very powerful backup tool, and this article can only scratch the surface of using it. If you need more information, consult the software’s man pages or online documentation.

Roderick W. Smith is the author or co-author of eleven books, including Advanced Linux Networking and Linux Power Tools.
He can be reached at rodsmith@rodsbooks.com.
