Backups with Bacula: A cross-platform system to archive your data

Whether you have tens of gigabytes or hundreds of terabytes, the Bacula system makes backups easy. Here's a hands-on primer.

Crashes happen. People make mistakes. Hardware fails. It’s all just a matter of time before you lose your data. Indeed, certain failure is the impetus for backups. But creating backups is often painful, so much so that many users choose to ignore the procedure, albeit at great peril. Hence, the ideal backup system is thorough, powerful, automatic, and easy to maintain.

Bacula, an open source package written by Kern Sibbald, neatly ticks all of the above boxes. Bacula is network-based, too, and can run across multiple machines (but also works on a single machine). Better yet, it’s cross-platform: you can back up your Mac and Windows box to your Linux machine (or vice versa).

Of course, different sites have different needs. At home, you may only have tens of gigabytes across one or two machines, while your office may manage terabytes of data across dozens of desktops and servers. Bacula is sufficiently flexible to handle all of it.

How Bacula works

Bacula runs on Mac OS X, Linux, and Windows, and interoperates happily across a collection of heterogeneous systems. Choose any operating system as the server to back up all of your machines.

Bacula has three major components: the Director, the Storage Daemon, and the File Daemon. The Director and Storage Daemon run on the central server. A copy of the File Daemon runs on each client you want to back up. (Of course, the central server also runs a File Daemon to make copies of its own filesystem and to back up the Bacula catalog.)

The Director manages the backup and restore jobs, and acts as an administrative center. The Storage Daemon runs whatever storage method you’re using, such as a tape drive/multi-loader, file, DVD, and so on. The File Daemon connects with the Director and the Storage Daemon, and transfers data to be backed up.

This all requires a database backend, and you are free to choose between MySQL, SQLite (compiled into Bacula, but only recommended for small, non-production setups), and PostgreSQL.

Backup Strategy

When constructing a backup strategy, consider:

1. How much data do you want to back up?

2. How long do you want to keep it?

3. How often would you like to perform a backup? In other words, how many days’ worth of data are you prepared to lose?

4. How much space do you have available on disk/tape/DVD?

Combining a full backup with incremental backups is an effective strategy. Run the full backup once a week — which tends to be time-consuming — followed by nightly incremental backups to record anything that’s changed since the previous backup. Incremental backups are relatively fast. Be aware, however, that combining the techniques slows the restore process, since multiple tapes (real or logical), and multiple places on those tapes, need to be accessed. Bacula’s restore is pretty fast, so this may not matter, but if restore speed is important to you, you may want to consider your use of incremental backups.
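If restore speed matters, one way to shorten the chain of incrementals is Bacula's Differential level, which saves everything changed since the last full backup. A restore then needs the full backup, the latest differential, and only the incrementals made after it. The following is a minimal sketch rather than a recommendation: the schedule name and the times are assumptions chosen for illustration.

```conf
# Hypothetical schedule: the name and all times are illustrative only.
Schedule {
  Name = "MonthlyFullCycle"
  Run = Full 1st sun at 21:00               # one full backup a month
  Run = Differential 2nd-5th sun at 21:00   # changes since the last full
  Run = Incremental mon-sat at 21:30        # changes since the last backup of any level
}
```

The trade-off is that each differential grows through the month, so it uses more tape than an incremental would.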

Unless you’re a seriously obsessive archivist, use reusable media, such as tapes, rewriteable CD/DVD media, or an external hard disk. I’ll assume from here on that you’re using tapes, but the same theory applies to CDs/DVDs, or to external hard disks. The capacity and cost of the storage media can also influence your backup strategy.

As an example, consider a site with three sets of machines: “basic” servers (providing services such as home directories, an LDAP server, a Kerberos server, and Web serving), “data” servers (a RAID array and similar), and desktops. Both sets of servers are backed up completely once a week, followed by nightly incremental backups. The desktops only get one weekly full dump, and no incremental backups, meaning that a file created on day 3 and lost on day 6 can’t be retrieved.

Backup Tips

1. Think about what you really need to back up. It may not be worth backing up all of your software, for example, if you only use packaged software. However, you might want to back up the list of installed packages.

2. There’s a lot of useful configuration info in /etc. You may wish to back that up regularly.

3. Back up your local applications, especially those built from source. Consider storing the corresponding source code, too, especially if it contains local modifications or enhancements.

4. Be sure to back up those random files you never seem to need until it’s too late.

5. Measure whether off-site backups are worthwhile. Determine if you can afford the off-site storage fees.

6. Decide if you need an emergency backup. For example, you may want to have a spare disk with the backup copy of any main NFS directories, or of your home directory. This helps you to get going quickly in the event of disk or machine failure, as restoring large quantities of data can take several hours.
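Several of these tips map directly onto a FileSet and a Job in the Director configuration. The sketch below assumes a Debian client (hence dpkg), assumes local software lives under /usr/local, and reuses the ServerBasicJob defaults and the client-fd client from Listing One. The FileSet name, the Job name, and the package-list path are hypothetical.

```conf
FileSet {
  Name = "ServerConfig"                        # hypothetical name
  Include {
    Options {
      signature = MD5
    }
    File = /etc                                # configuration files
    File = /usr/local                          # assumed home of locally built software
    File = /var/lib/bacula/package-list.txt    # written by the job below
  }
}

Job {
  Name = "client-config"                       # hypothetical name
  JobDefs = "ServerBasicJob"
  Client = "client-fd"
  FileSet = "ServerConfig"
  # Runs on the client before the backup starts; sh -c is needed
  # because the command uses shell redirection.
  Client Run Before Job = "sh -c 'dpkg --get-selections > /var/lib/bacula/package-list.txt'"
}
```

On a distribution that doesn't use dpkg, substitute whatever command lists installed packages there.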

The schedule makes a trade-off: there isn’t enough time in the night to fit in a server-type schedule for everything, and daytime backups can affect performance. The full backups are spread over the week — the first night for the basic servers, the second and third night for the data servers, and nights 4-7 for the desktops — with incremental backups occurring nightly.
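A staggered week like this can be expressed as one Schedule resource per group of machines. The sketch below assumes that the first night is Monday and shows only one data-server group and one desktop group. Every name except ServerBasicCycle is hypothetical, and the times are illustrative.

```conf
# Basic servers: full on night 1, incrementals on the other nights.
Schedule {
  Name = "ServerBasicCycle"
  Run = Full mon at 21:00
  Run = Incremental tue-sun at 21:30
}

# First group of data servers: full on night 2.
Schedule {
  Name = "DataServerCycleA"          # hypothetical name
  Run = Full tue at 21:00
  Run = Incremental sun-mon at 21:30
  Run = Incremental wed-sat at 21:30
}

# One group of desktops: a weekly full on night 4 and nothing else.
Schedule {
  Name = "DesktopCycleThu"           # hypothetical name
  Run = Full thu at 21:00
}
```

The remaining groups follow the same pattern with the full backup moved to a different night.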

Installation and Configuration of the Director

You can, of course, compile Bacula from scratch, if you wish — the instructions are very good. Alternatively, use the packages available for your distribution. Choose a database to use, install it, and then install Bacula. The examples in this article are based on Bacula 1.38.11, the version available from the (current as of this writing) Debian stable repository. The latest version of Bacula is 2.0.3.

There are four configuration files, all found in /etc/bacula. The Director configuration can be found in bacula-dir.conf; the Storage Daemon is configured via bacula-sd.conf; the File Daemon is tweaked using bacula-fd.conf; and the Console configuration, described shortly, is controlled by bacula-console.conf.

bacula-dir.conf is the most complicated file. It specifies which files to back up, names clients, controls backup times, and more. Typically, this is the file that is edited most often. An example bacula-dir.conf is shown in Listing One.

LISTING ONE: A sample Bacula Director configuration file, bacula-dir.conf


Director {
  Name = server-dir
  QueryFile = "/etc/bacula/scripts/query.sql"
  WorkingDirectory = "/var/lib/bacula"
  PidDirectory = "/var/run/bacula"
  Password = "password"         # Console password
  Messages = Daemon
}

Storage {
  Name = "Overland Neo2000"
  Autochanger = "yes"
  Address = 192.0.0.70   # N.B. Use a fully qualified name here (not localhost)
  SDPort = 9103
  Password = "sdpassword"          # password for Storage daemon
  Device = "Overland Neo2000"   # must be same as Device in Storage daemon
  Media Type = Ultrium-3        # must be same as MediaType in Storage daemon
}

Catalog {
  Name = MyCatalog
  dbname = bacula; DB Address = dbserver.example.com; user = bacula; password = "dbpassword"
}

Messages {
  Name = Standard
  # Leave most of this as default (not shown here to save space)
  mail = myself@example.com = all, !skipped  # edit these lines
  operator = myself@example.com = mount      # to give your email address
  append = "/var/lib/bacula/log" = all, !skipped
}
# Similar message config for daemon messages (again not shown here)

Console {
  Name = server-mon
  Password = "monpasswd"
  CommandACL = status, .status
}

Schedule {
  Name = "ServerBasicCycle"
  Run = Full mon at 21:00
  Run = Incremental tue-sun at 21:30
}

FileSet {
  Name = "client"
  Include {
    Options {
      signature = MD5
    }
    File = /root
    File = /etc
  }
}

Pool {
  Name = ServerBasic
  Pool Type = Backup
  Recycle = yes                       # Bacula can automatically recycle Volumes
  Recycle Oldest Volume = yes
  AutoPrune = yes                     # Prune expired volumes
  Volume Retention = 20 days          # This is for a 4-week tape reuse cycle
  Accept Any Volume = yes             # Write on any volume in the pool
}

JobDefs {
  Name = "ServerBasicJob"
  Type = Backup
  Level = Incremental
  Schedule = "ServerBasicCycle"
  Storage = "Overland Neo2000"
  Messages = Standard
  Pool = ServerBasic
  Incremental Backup Pool = ServerBasic  # You can have different pools for
                                         # full and incremental backups
  Priority = 10
}

Job {
  Name = "client"
  JobDefs = "ServerBasicJob"
  Client = "client-fd"
  FileSet = "client"
  Write Bootstrap = "/var/lib/bacula/client.bsr"
}

Client {
  Name = client-fd
  Address = client.example.com
  FDPort = 9102
  Catalog = MyCatalog
  Password = "fdpassword"
  File Retention = 20 days
  Job Retention = 6 months
  AutoPrune = yes                     # Prune expired Jobs/Files
}

Let’s look at each section, or what Bacula calls a resource. Each resource has a unique name, and the configuration file may contain many resources of the same kind.

1. The Director resource defines the Director’s parameters. Name, Password, WorkingDirectory, and PidDirectory must be set. QueryFile specifies where the Director can find the SQL queries.

2. Job defines a backup or restore to perform. You will need at least one job per client. To simplify configuration of similar clients, create a common JobDefs resource and refer to it from within a Job. For example, if you have one set of defaults for desktops and another set for servers, you can create a Desktop and a Server JobDefs (these names are arbitrary and set with the Name attribute) and refer to those two collections of settings from a Job.

3. A Schedule resource is referred to within a Job to allow it to occur automatically.

4. A FileSet resource defines which files are to be backed up. You can both Include and Exclude files.

5. Each Client resource describes a client that this Director can back up.

6. The Storage resource specifies the storage daemon available to the Director.

7. Pool identifies a set of storage volumes (tapes/files) that Bacula can write data to. Each Pool can be configured to use different sets of tapes for different jobs.

8. The Catalog resource defines the Bacula catalog (database) to be used.

9. Finally, the Messages resource captures where to send messages and which messages to send.
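As a rough illustration of how these resources fit together, here is a sketch of the Desktop and Server JobDefs idea, plus a FileSet that both includes and excludes paths. It reuses the storage, schedule, and pool names from Listing One. The desktop client, schedule, pool, FileSet, and paths are hypothetical, and each would need its own resource elsewhere in the file.

```conf
JobDefs {
  Name = "Server"
  Type = Backup
  Level = Incremental
  Schedule = "ServerBasicCycle"
  Storage = "Overland Neo2000"
  Messages = Standard
  Pool = ServerBasic
  Priority = 10
}

JobDefs {
  Name = "Desktop"
  Type = Backup
  Level = Full
  Schedule = "DesktopCycle"        # hypothetical Schedule resource
  Storage = "Overland Neo2000"
  Messages = Standard
  Pool = Desktop                   # hypothetical Pool resource
  Priority = 10
}

# One Job per client; everything not set here comes from the JobDefs.
Job {
  Name = "desk1"
  JobDefs = "Desktop"
  Client = "desk1-fd"              # hypothetical Client resource
  FileSet = "DesktopHome"
}

FileSet {
  Name = "DesktopHome"
  Include {
    Options {
      signature = MD5
    }
    File = /home
  }
  Exclude {
    File = /home/scratch           # hypothetical directory to skip
  }
}
```

Adding another desktop then takes only a Client resource and a short Job that names the Desktop JobDefs.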

For convenience and readability, you can divide the file into subsections, say by resource type, place each subsection in a separate file, and reference each file from bacula-dir.conf with a line of the form @/etc/bacula/file-to-include.
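For instance, the end of bacula-dir.conf might pull in separate files like this. The three file names are made up for the example, and any names will do as long as the files exist and contain complete resources.

```conf
# In /etc/bacula/bacula-dir.conf, after the Director resource.
# The file names below are hypothetical.
@/etc/bacula/clients.conf      # Client and Job resources
@/etc/bacula/filesets.conf     # FileSet resources
@/etc/bacula/schedules.conf    # Schedule and Pool resources
```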

Setting Up the Storage Daemon

Again, the Storage Daemon’s configuration file is named bacula-sd.conf. It defines four resources, or five if you have an auto-changer. A sample is shown in Listing Two.

LISTING TWO: A sample Storage Daemon configuration file


Storage {
  Name = server-sd
  SDPort = 9103
  WorkingDirectory = "/var/lib/bacula"
  Pid Directory = "/var/run/bacula"
  Maximum Concurrent Jobs = 20
}

Director {
  Name = server-dir
  Password = "sdpassword"   # must match Password in the Director's Storage resource
}

Autochanger {
  Name = "Overland Neo2000"
  Device = "Overland-1"
  Changer Command = "/etc/bacula/scripts/mtx-changer %c %o %S %a %d"
  Changer Device = /dev/sg1  # You may need to alter this - check the
                             # manual and your setup
}

Device {
  Name = "Overland-1"
  Drive Index = 0
  Media Type = Ultrium-3
  Archive Device = /dev/st0
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
  Maximum Spool Size = 2000000000
  Spool Directory = /var/spool/bacula
}

Messages {
  Name = Standard
  director = server-dir = all
}

1. Storage defines the name of the Storage Daemon. The Name, WorkingDirectory, and PidDirectory attributes are all required; others can be left untouched.

2. The Director resource specifies the name and password of the Director.

3. A Device resource characterizes the details of a device (say, a tape drive) to be used by the Storage Daemon. Listing Two has one Device, an Overland Neo2000 tape library with an auto-changer. If you are using an autochanger, check the Bacula manual for details on how to set this up and test it in stages.

4. The Messages resource directs where messages are sent (usually back to the Director).
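If you back up to disk rather than tape, the Device resource is simpler. The following sketch shows a file-based device in bacula-sd.conf and the matching Storage resource in bacula-dir.conf. The resource names and the archive directory are assumptions, and the address follows the one used in Listing One.

```conf
# bacula-sd.conf: volumes are ordinary files in a directory.
Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /srv/bacula/volumes   # hypothetical directory
  LabelMedia = yes                       # let Bacula label new volumes itself
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = no
}

# bacula-dir.conf: the Director's view of the same device.
Storage {
  Name = File                            # hypothetical name
  Address = 192.0.0.70
  SDPort = 9103
  Password = "sdpassword"                # must match the Storage Daemon's Director resource
  Device = FileStorage                   # must match the Device Name above
  Media Type = File                      # must match the Media Type above
}
```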

Configuring the File Daemon and the Console

The File Daemon configuration file, bacula-fd.conf, is very straightforward. It contains the name of the Director you wish to connect to, and the client password (this should be different for each client, and needs to match the password in the relevant client section of the Director configuration). It also specifies the monitor to connect to (usually the same machine as the Director). Listing Three shows an example.

LISTING THREE: A File Daemon configuration file, bacula-fd.conf


Director {
  Name = server-dir
  Password = "fdpassword"   # must match Password in the Director's Client resource
}

FileDaemon {
  Name = client-fd
  FDport = 9102
  WorkingDirectory = /var/lib/bacula
  Pid Directory = /var/run/bacula
  Maximum Concurrent Jobs = 20
}

The default Debian configuration also includes the line:

FDAddress = 127.0.0.1

Remove this line if you’re using a network installation to allow the File Daemon to bind to any available network address.

Restart the File Daemon anytime you alter bacula-fd.conf. (If you get a password error during backup, check that the File Daemon was restarted.)

The remaining configuration file, bacula-console.conf, is also simple. It contains only the name of the Director you wish to connect to, here server-dir, and the password for that Director.
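A console configuration along these lines should be enough to connect to the Director in Listing One. The host name is an assumption, and 9101 is the Director's default port.

```conf
Director {
  Name = server-dir
  DIRport = 9101
  Address = server.example.com   # hypothetical host name of the Director
  Password = "password"          # must match Password in the Director resource of bacula-dir.conf
}
```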

Tapes and Other Media

Bacula can read tape barcode information from your auto-changer. Insert the tapes, check which slots they’re in (run the command mtx -f /dev/sg1 status; mtx is the utility that Bacula’s mtx-changer script calls), and then type label in the Bacula console. When prompted, provide the tape name (use the barcode), tape pool, and slot for each tape you wish to add. Labelling files or CDs/DVDs works similarly.

Running Backups

To interact with Bacula, you can use either a command-line interface, bacula-console, or a graphical interface, such as bacula-console-gnome. The graphical interface, even if you never use the graphical buttons, has the major advantage of being scrollable.

The same commands work in both interfaces, and running a backup or restore is very easy.

To see the current list of jobs, run status dir. To check the Storage Daemon, type status sd. The command list volumes is self-explanatory. And to cancel a command, type a . (period).

In general, Bacula runs your backups automatically, according to its configuration. However, once in a while (and for testing) you may wish to run a manual backup.

Type run and, when prompted, choose a client to back up. You will then be given the default settings for the backup. In most cases, the defaults are fine (the settings are the ones listed in the Job resource, and the start time is immediate). To change anything, type mod. The most likely things to change are timing and the file set. After you finish your changes, type yes to schedule the backup job. That’s it!

Performing a Restore

Restores are equally easy. Type restore in the console and choose from the numbered menu how to select what to restore, such as all files from a specific client (this gives the most recent backup), all files from a client before a certain time, and so on. Choose 7 if you know the exact filename (8 if you want a copy older than the most recent one). Option 5 gives you a directory tree to select from (6 for an older file copy). Options 5 and 6 are the most useful.

If you choose 5, Bacula prompts you to choose the client and generates the directory tree so you can mark files for restoring. mark dir/ marks a directory with all its files. help provides assistance.

When you’ve marked all the files to be restored, type done. The default settings are shown, and are editable. For instance, you may want to change the default restore directory, or the client, say, if a machine has failed and you are restoring its data to another machine. When you’re satisfied, type yes and you’re done.
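The restore defaults come from a Job of type Restore in the Director configuration. A sketch of such a job, reusing the names from Listing One, might look like the following. The job name and the Where directory are assumptions.

```conf
Job {
  Name = "RestoreFiles"            # assumed name; yours may differ
  Type = Restore
  Client = client-fd               # default client; can be changed at restore time
  FileSet = "client"
  Storage = "Overland Neo2000"
  Pool = ServerBasic
  Messages = Standard
  Where = /tmp/bacula-restores     # restored files are placed under this directory
}
```

Restoring under a separate directory rather than over the originals lets you inspect the files before moving them into place.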

Maintaining Bacula

Once you have everything running, there is very little maintenance required. If you have a system that involves swapping tapes (or other media), the rotation must be done at appropriate intervals. Use a cron job to email you a reminder.

Unfortunately, while Bacula can read the barcodes to identify what tapes are in the auto-changer, it can’t remove the old tape information. You must create an SQL query to blank all the slot information, by adding the following to the /etc/bacula/scripts/query.sql file:

# 17 - change this number if necessary 
# according to the number of queries
# already in the file
:Remove all slot/InChanger assignments
UPDATE Media set InChanger=0, Slot=0;

Then, once you’ve changed the tapes, type query at the Bacula console and choose 17 (or the appropriate number if different) to clear the old slot information. Next, type update slots to read in the new information.

If you add another machine, you need to add that to the configuration. Type reload at the Bacula console to reload the config on the fly. Bacula will send you email when backups do or don’t happen, and you can review these as well.

The underlying Bacula database does need a fair amount of disk space. You want to keep as much as possible, so give yourself lots of room. Similarly, the backup SQL dump of the database, which by default happens weekly, requires plenty of space.
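The catalog dump is itself run as a Bacula job. The sketch below follows the pattern of the sample configuration that ships with Bacula, but treat the details as assumptions: the script arguments differ between Bacula versions, so check the make_catalog_backup script installed on your system. The server-fd client name is hypothetical.

```conf
Job {
  Name = "BackupCatalog"
  JobDefs = "ServerBasicJob"
  Client = "server-fd"             # hypothetical: the File Daemon on the Bacula server
  Level = Full
  FileSet = "Catalog"
  # Dump the database before the backup and remove the dump afterwards.
  # The arguments (database name, user) are an assumption; check your script.
  RunBeforeJob = "/etc/bacula/scripts/make_catalog_backup bacula bacula"
  RunAfterJob = "/etc/bacula/scripts/delete_catalog_backup"
  Write Bootstrap = "/var/lib/bacula/BackupCatalog.bsr"
  Priority = 11                    # run after the ordinary backups (Priority 10)
}

FileSet {
  Name = "Catalog"
  Include {
    Options {
      signature = MD5
    }
    File = /var/lib/bacula/bacula.sql   # the dump written by make_catalog_backup
  }
}
```

The dump lands in the working directory, so that filesystem needs room for a full copy of the database.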

Troubleshooting

A tape error can occur for several reasons. You may well be able to fix it by purging the volume (purge from the Bacula console). This deletes all jobs and files on the tape, so be sure that you don’t need the data! Obviously, in some cases, the tape may really be dead.

If you must restore from a tape that has been marked for recycling but not yet recycled, you can. Find the job number (by querying the SQL database directly), type run, and choose the “Restore Files” job. You can then specify the job number to restore, and the bootstrap file that corresponds to the required machine.

If you’re running out of tape space, create another tape (by adding it to the changer and labelling it as discussed above), or purge an existing tape if you are certain you don’t need the data. One feature that can be slightly irritating is that Bacula won’t skip the problem job and move on to another; the whole lot is stopped until you fix the tape problem.

Bacula is an excellent network backup system: easy to set up, easy to use, scalable, and well-supported. About the only thing it doesn’t do for you is physically switch the tapes in and out of the autochanger!
