Backup using Amazon S3

Learn how to use Amazon’s S3 to quickly, easily, and cheaply perform off-site backups.

Everyone knows it’s important to back up vital data, but
for a business, it’s an absolute necessity. The reality,
however, is that despite the imperative, far too few systems are
actually archived. Worse, many sites that do create archives store
the backups physically adjacent to the live data, either on a
separate drive or within the same facility. If disaster strikes,
the backups and the original data are both lost. Safety is just
one of the many good reasons to keep archives offsite, even
if the offsite location is your home office or attic.

But don’t start cleaning out your attic yet.
Amazon’s Simple Storage Service (S3)
offers a quick, easy, and frugal way to store data remotely. S3
provides a simple web services interface to store and retrieve any
amount of data, at any time, from anywhere on the Internet, and
gives you access to the same highly-scalable, reliable, fast, and
inexpensive data storage infrastructure that Amazon uses to run its
own global network of Web sites. S3 is part of Amazon’s Web
Services (AWS), and its API is simple and flexible.

The first thing to do is head over to http://aws.amazon.com/s3
and sign up for an
account. The fees are very reasonable: there’s no sign-up
fee, storage is $0.15 per gigabyte per month, and bandwidth is
just $0.20 per gigabyte of data transferred. Moreover, the
peace of mind is invaluable. Once you have an S3 account, you can
use the service in any way you’d like: you can write your own
backup application or use one of the myriad solutions already
available.
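
To get a feel for what those rates mean in practice, here is a minimal Python sketch that estimates a monthly bill. The two rates come from the figures quoted above; the function name and the sample workload are made up for illustration, and Amazon's actual pricing may differ from what is shown here.

```python
# Rates quoted in the article (US dollars); check Amazon's current pricing.
STORAGE_USD_PER_GB_MONTH = 0.15
TRANSFER_USD_PER_GB = 0.20


def monthly_cost(stored_gb: float, transferred_gb: float) -> float:
    """Estimate one month's S3 bill for a given archive size and traffic."""
    if stored_gb < 0 or transferred_gb < 0:
        raise ValueError("sizes must be non-negative")
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    transfer = transferred_gb * TRANSFER_USD_PER_GB
    return round(storage + transfer, 2)


if __name__ == "__main__":
    # Hypothetical workload: a 50 GB archive, with 5 GB of changes uploaded.
    print(f"${monthly_cost(50, 5):.2f}")
```

For the hypothetical 50 GB archive with 5 GB of monthly traffic, storage comes to $7.50 and transfer to $1.00.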

One backup package is s3sync. It
transfers directories between your local system and an S3
“bucket,” using a syntax that’s very similar to
rsync. (However, due to the way that S3
currently works, it doesn’t provide the full features of
rsync, especially the speed and bandwidth
savings.) s3sync is available from http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
and is released under a minimal license. Its only
prerequisites are Ruby and OpenSSL. After unpacking the s3sync tar file, the best place to start is README.txt, which describes configuration and operation
of the tool.
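
The article does not describe how s3sync decides what to transfer, so the following Python sketch is not its implementation. It only illustrates one plausible way a directory-sync tool could work against a store that holds whole objects: compare a checksum of each local file with a checksum recorded for the remote copy, and upload the file only if they differ. The function names and the `remote_digests` dictionary (standing in for a bucket listing) are hypothetical.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_upload(local_root, remote_digests):
    """List relative paths whose content is new or changed.

    remote_digests is a hypothetical stand-in for a bucket listing:
    a dict mapping object key -> hex MD5 of the stored object.
    """
    to_upload = []
    for folder, _subdirs, files in os.walk(local_root):
        for name in files:
            full_path = os.path.join(folder, name)
            # Use forward slashes so keys look the same on every platform.
            key = os.path.relpath(full_path, local_root).replace(os.sep, "/")
            if remote_digests.get(key) != md5_of_file(full_path):
                to_upload.append(key)
    return sorted(to_upload)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "a.txt"), "wb") as f:
            f.write(b"unchanged")
        with open(os.path.join(root, "b.txt"), "wb") as f:
            f.write(b"edited locally")
        remote = {
            "a.txt": hashlib.md5(b"unchanged").hexdigest(),
            "b.txt": hashlib.md5(b"older version").hexdigest(),
        }
        print(plan_upload(root, remote))  # only b.txt differs
```

Under this scheme an unchanged file costs no upload bandwidth, but a file with even a small edit is sent again in full. That is the kind of saving rsync's block-level transfers provide and a whole-file comparison cannot.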

[Image: http://www.linux-mag.com/images/2007-02/tech/linux-browse1.jpg]

If you’re not the command-line type, you may prefer a tool
such as Cockpit, a graphical Java application that allows you to fully manage an S3
account. Licensed under the Apache 2 license
and available for download at
https://jets3t.dev.java.net/cockpit.html, Cockpit can be run as a
standalone application or as a browser applet. If you choose the
former, the package contains a script named cockpit.sh. Run this script to start the application;
no installation is needed.

There are other Amazon S3 solutions, too:

Backup Manager (http://www.backup-manager.org/) is a command-line
backup tool for Linux, designed to make
daily archives of your file system. The development version fully
supports S3, and the software is licensed under the GPL.

Jungle Disk (http://www.jungledisk.com/, pictured) is a
proprietary application that uses a local WebDAV server. [Jungle Disk was lauded recently in
Linux Magazine. See http://www.linux-mag.com/content/view/2632/.]

Brackup (http://brad.livejournal.com/2205967.html), written
by Brad Fitzpatrick, who also wrote memcached and Perlbal, is a
versioning backup tool written in Perl.

Keep in mind that some tools have unique storage formats, so an
archive created by one tool may not be readable by another
tool. And while many existing solutions are available, most of them
are in their infancy, so it’s important that you test thoroughly
and pick the software that best suits your needs.

The Web page located at http://developer.amazonwebservices.com/connect/kbcategory.jspa?categoryID=66
contains a more comprehensive list of S3 clients, maintained by
Amazon. Used properly, S3 is an inexpensive solution for off-site
backups.

Use it wisely.
