Backups are a technology or process that everyone -- everyone! -- needs to consider. This article looks at some on-line backup options for Linux that can apply to the spectrum of home to enterprise-class users.
One of the more popular services in this class of on-line backup is Dropbox. Dropbox advertises itself as a synchronization service rather than an on-line backup service, but it provides both file sharing and on-line backup. One fairly unusual feature is that it can track changes in a file and keep separate versions. It keeps a 30-day history, so undos can be performed within that 30-day limit.
Dropbox uses a single directory on your desktop. If you want a file copied to Dropbox, you need to copy or move it into that directory ($HOME/Dropbox by default). You can create subdirectories and control access to them from the Dropbox GUI, including public folders. Public files are only viewable by people who have a link to the file(s), and public folders are not browsable or searchable. Because of the single-directory limitation, Dropbox is perhaps not as well suited for on-line backups as other approaches.
Dropbox has Windows and Mac versions as well as Linux. For Linux it has versions for Ubuntu 7.10+ and Fedora Core 9+, and there have been reports of Dropbox running successfully on Debian, OpenSUSE, Arch Linux, Gentoo, and several other distributions. If you are an iPhone or Blackberry user (or even an iPod touch user), you can also get a version of Dropbox that allows you to access your files (a great way to upload photos from your mobile device).
The system requirements are fairly modest. You need 512MB of memory along with free space on your system that is equal to your space on Dropbox. The software requirements are:
- GTK 2.12 or higher
- GLib 2.14 or higher
- Nautilus 2.16 or higher
- Libnotify 0.4.4 or higher
Dropbox also has Ubuntu repositories for updating their tool.
Dropbox has two components. The first is a closed-source daemon called dropboxd, which makes sure that your $HOME/Dropbox directory is synchronized with the Dropbox servers and with your other systems that access the account. The second is an open-source application called nautilus-dropbox, which plugs into Nautilus, connects to dropboxd via two Unix domain sockets, and serves as the GUI for Dropbox.
All transmission of file data and metadata to and from Dropbox occurs over an encrypted channel (SSL). All files stored on Dropbox are encrypted (AES-256) and can only be accessed with your account password, and Dropbox states that its employees are not able to view any user's files. Dropbox also transmits only the parts of a file that have changed, which saves bandwidth. You can limit the bandwidth Dropbox uses from the Dropbox GUI.
Dropbox gives you up to 2GB for free with no time limit. For more than 2GB and up to 50GB, they charge $9.95/month. For 50GB to 100GB they charge $19.95/month.
Installing Dropbox is very easy. You just download the package and install it. This was tested on a simple laptop with Ubuntu 8.04. You just use the following command to install the .deb file:
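To make the idea of sending only the changed parts of a file more concrete, here is a minimal sketch. The dropboxd daemon is closed source, so this is not Dropbox's actual algorithm; it simply assumes fixed-size blocks that are compared by SHA-256 checksum, and the function names block_hashes and changed_blocks are made up for illustration.

```shell
#!/bin/sh
# Sketch only: fixed-size block comparison, NOT Dropbox's real (closed-source) method.

# Print one SHA-256 digest per block of a file.
# $1 = file, $2 = block size in bytes
block_hashes() {
    size=$(wc -c < "$1")
    blocks=$(( ($size + $2 - 1) / $2 ))
    i=0
    while [ "$i" -lt "$blocks" ]; do
        dd if="$1" bs="$2" skip="$i" count=1 2>/dev/null | sha256sum | cut -d' ' -f1
        i=$((i + 1))
    done
}

# Print the zero-based index of every block in the new file that differs
# from (or does not exist in) the old file. Only these would need uploading.
# $1 = old file, $2 = new file, $3 = block size in bytes
changed_blocks() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    block_hashes "$1" "$3" > "$old_list"
    block_hashes "$2" "$3" > "$new_list"
    # paste puts block N of the new file next to block N of the old file
    paste "$new_list" "$old_list" |
        awk -F'\t' '$1 != "" && $1 != $2 { print NR - 1 }'
    rm -f "$old_list" "$new_list"
}

# Small demonstration with 4-byte blocks
demo=$(mktemp -d)
printf 'AAAABBBBCCCCDDDD' > "$demo/old"
printf 'AAAABBBBXXXXDDDD' > "$demo/new"
changed_blocks "$demo/old" "$demo/new" 4   # prints 2 (only the third block differs)
rm -rf "$demo"
```

A real client would use much larger blocks and would also have to cope with data being inserted in the middle of a file, which shifts every later block; this sketch ignores that case.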
# dpkg -i nautilus-dropbox_0.6.1_i386_ubuntu_8.04.deb
After this you start up the application (it was inserted into the menu options). You create an account (very easy to do), and then start up the nautilus application as shown below in Figure 1.
Figure 1 – Screenshot of Dropbox Application at Startup
At this point you just copy files into the $HOME/Dropbox folder and they automatically get copied to your folder on the Dropbox servers. You can either manually copy the files into the folder using “cp” or you can use the graphical browser as shown in Figure 2 below.
Figure 2 – Screenshot of Dropbox Application with Files
Dropbox is very easy to use, either from the command line or via the GUI. You could even use symlinks to link files into $HOME/Dropbox. A quick Google search will return a large number of positive reports of using Dropbox with Linux.
SpiderOak is another very popular on-line backup service for Linux. It too has a custom client for performing backups. But unlike Dropbox, you are not limited to a single directory for backups. Rather, the client allows you to specify which files and/or directories are to be backed up. Consequently it is more like a backup tool than Dropbox is. You can still use it for collaboration and/or synchronization, including sharing documents with friends or family or with co-workers. Overall, SpiderOak likes to say that it has 5 major capabilities: (1) Backup, (2) Sync, (3) Sharing, (4) Access, and (5) Storage.
SpiderOak saves files using versioning with a date-stamp, so you can always roll back the clock to previous versions (this is true for the “life” of the file, not just a limited time period as in the case of Dropbox). It saves only the differences in the files using de-duplication and compression, greatly reducing the storage space (and saving you money). Even better, there is a “Recycle” bin so that if you erase files you have a window in which to retrieve them.
SpiderOak has gone to great lengths to ensure security. The password that you create is used to generate encryption keys for the data. The password is never stored on SpiderOak’s servers, so even SpiderOak cannot look at users’ data. The data is encrypted both during transmission and during storage. SpiderOak uses a layered approach to encryption using a combination of 2048-bit RSA and 256-bit AES. The details of the encryption scheme are described at this link.
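From the command line, the copy and symlink approaches look roughly like the sketch below. To avoid touching a real Dropbox folder it works in a scratch directory; the paths and file names are hypothetical stand-ins, and on a real system dropbox_dir would be $HOME/Dropbox.

```shell
#!/bin/sh
# Hypothetical stand-ins: a scratch area instead of the real $HOME/Dropbox
sandbox=$(mktemp -d)
dropbox_dir="$sandbox/Dropbox"
project_dir="$sandbox/projects/thesis"
mkdir -p "$dropbox_dir" "$project_dir"
echo "draft" > "$project_dir/chapter1.txt"

# Option 1: copy a single file into the synchronized folder
cp "$project_dir/chapter1.txt" "$dropbox_dir/"

# Option 2: leave the directory where it is and link it into the folder
ln -s "$project_dir" "$dropbox_dir/thesis"

ls -l "$dropbox_dir"
# Remove the scratch area with: rm -rf "$sandbox"
```

The symlink keeps a single copy of the data on disk, whereas the copy has to be refreshed by hand whenever the original changes.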
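As a rough illustration of how de-duplication, compression, and date-stamped versions can fit together, here is a small sketch. It is not SpiderOak's implementation: it de-duplicates whole files rather than differences within files, and the backup_file function, the objects directory, and the manifest file are all invented for this example.

```shell
#!/bin/sh
# Sketch only: whole-file de-duplication, not SpiderOak's actual scheme.

# Store a compressed copy of a file under its SHA-256 digest, skipping the
# write if identical content is already stored, and record a dated version.
# $1 = backup store directory, $2 = file to back up
backup_file() {
    mkdir -p "$1/objects"
    digest=$(sha256sum "$2" | cut -d' ' -f1)
    # Identical content always has the same digest, so it is stored only once
    [ -e "$1/objects/$digest.gz" ] || gzip -c "$2" > "$1/objects/$digest.gz"
    # Each run appends a date-stamped entry, giving a version history
    printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$digest" "$2" >> "$1/manifest"
}

store=$(mktemp -d)
work=$(mktemp -d)
echo "version 1" > "$work/notes.txt"
backup_file "$store" "$work/notes.txt"
backup_file "$store" "$work/notes.txt"    # unchanged: no new object is written
echo "version 2" > "$work/notes.txt"
backup_file "$store" "$work/notes.txt"
cat "$store/manifest"                     # three dated entries, two distinct digests
```

Rolling back to an earlier version then amounts to picking a manifest entry and running gzip -dc on the object it names.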
SpiderOak also allows you to define “Sharerooms.” These are the locations that allow you to place data to be shared with friends or co-workers. They are easy to create and manage via the SpiderOak GUI.
The requirements for the SpiderOak tool are fairly small:
- 300 MHz Processor
- 92 MB of RAM
- 75 MB of free disk space
They have versions available for: Debian Etch, Debian Lenny, Fedora 10, Mac OS X, Slackware 12.1, Ubuntu Gutsy Gibbon, Ubuntu Hardy Heron, Ubuntu Jaunty Jackalope, and Windows.
The pricing for SpiderOak is also very good. The first 2GB is free for an unlimited period of time. After 2GB it costs $10/month for up to 100GB. It then costs $10/month per 100GB. They also have discounts for larger amounts as well as pricing for geographically different sites (if you want that kind of dispersal for redundancy).
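The pricing rule is easy to turn into a quick calculation. The sketch below assumes whole gigabytes, no charge at or below 2GB, and $10/month for every started block of 100GB; it ignores the volume discounts and multi-site pricing, and the function name is made up for this example.

```shell
#!/bin/sh
# Monthly SpiderOak cost in dollars for a given number of whole gigabytes.
# Assumes: free up to 2GB, then $10/month per started 100GB block.
spideroak_monthly_cost() {
    gb=$1
    if [ "$gb" -le 2 ]; then
        echo 0
    else
        # Integer division rounds down, so adding 99 first rounds up
        echo $(( ($gb + 99) / 100 * 10 ))
    fi
}

spideroak_monthly_cost 2     # prints 0
spideroak_monthly_cost 100   # prints 10
spideroak_monthly_cost 250   # prints 30
```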
Installing SpiderOak is very simple. Much like Dropbox, you download the .deb file and install it.
# dpkg -i spideroak_9530_i386.deb
Selecting previously deselected package spideroak.
(Reading database ... 256598 files and directories currently installed.)
Unpacking spideroak (from spideroak_9530_i386.deb) ...
Setting up spideroak (9530) ...
* Reloading system message bus config... [ OK ]
Processing triggers for man-db ...
Processing triggers for menu ...
Processing triggers for libc6 ...
ldconfig deferred processing now taking place
At this point you can start the application. In the case of the Ubuntu laptop, a menu item is inserted in the “Internet” menu. This allows you to start the SpiderOak application.
When you run the application for the first time, you will be prompted to create an account as shown in Figure 3 below.
Figure 3 – Screenshot of SpiderOak Application when Creating an Account
Once you create the account you get a quick message about how SpiderOak is creating the encryption keys and how to contact SpiderOak support if you need help, as shown in Figure 4 below.
Figure 4 – Screenshot of SpiderOak Application After Creating an Account
Once the application starts you are presented with 5 tabs, as shown below in Figure 5.
Figure 5 – Screenshot of Main Screen of SpiderOak Application
These tabs control how you will create backups and on-line sharing.
For backups, you use the “Backup” tab to define what files and/or directories you want included in your backup as shown in Figure 6.
Figure 6 – Screenshot of SpiderOak Defining a Directory for Backups
In this case a single directory has been chosen for backup. At this point, the backup begins since the files on the SpiderOak servers are out of sync. At the bottom of Figure 6 you can see that the sync operation has begun and 364MB of data has already been uploaded.
During the syncing process, you can click on the “Status” tab to find out the status of the backup process and get details as shown in Figure 7.
Figure 7 – Screenshot of SpiderOak Status
SpiderOak is probably the most complete backup service available for Linux, as you can tell from these simple screenshots.
The technology of on-line backups has gotten very mature, even for Linux on the desktop, which has a lower market share than Windows and OS X. It is now very easy to define on-line backups using standard Linux backup tools such as rsync or to use tools from the backup provider. In addition, there are a number of on-line backup providers to choose from that offer various features and capabilities.
The advent of large on-line storage capabilities such as Amazon’s S3 has created a very large pool of storage that is relatively inexpensive. Even if you have a full 250GB hard drive in your desktop or laptop, you can back up the entire drive for as little as $30/month. The price structures vary by provider, ranging from paying for access and bandwidth to/from the servers, to paying for how much data you actually store.
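For readers who have not used rsync for backups, a minimal sketch follows. The function name is made up, the remote account in the usage comment is a placeholder, and it assumes the provider (or your own server) accepts rsync over SSH, which not every service does.

```shell
#!/bin/sh
# Copy a directory tree to a backup destination with rsync.
# $1 = source directory
# $2 = destination: a local path, or user@host:path for a remote server over SSH
backup_with_rsync() {
    # -a preserves permissions, timestamps, and symlinks
    # -z compresses data in transit
    # The trailing slash on the source copies its contents, not the directory itself
    rsync -az "$1/" "$2"
}

# Example (placeholder account and host, not a real service):
#   backup_with_rsync "$HOME/Documents" "user@backup.example.com:documents"
```

Because rsync only sends files that have changed since the last run, repeated backups are much faster than the first one.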
One of the obvious concerns with on-line backups is the security of the data. Many if not all of the on-line providers use encryption during transmission and storage. In fact, most of them have no idea how your data is encrypted so they cannot even examine it! If a provider does not have encryption capability, you can easily encrypt the data before performing a backup.
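One way to encrypt data yourself before it leaves your machine is to pipe a tar archive through GnuPG. The sketch below is one possible approach rather than a recommendation from any particular provider; it assumes gpg is installed, the function name is invented, and gpg will prompt interactively for a passphrase.

```shell
#!/bin/sh
# Pack a directory into a compressed archive and encrypt it with a passphrase.
# $1 = directory to back up, $2 = encrypted output file
encrypt_for_backup() {
    # tar writes the archive to stdout; gpg reads it from stdin, so no
    # unencrypted archive is ever written to disk
    tar -czf - -C "$(dirname "$1")" "$(basename "$1")" |
        gpg --symmetric --cipher-algo AES256 --output "$2"
}

# Example (hypothetical paths):
#   encrypt_for_backup "$HOME/Documents" /tmp/documents.tar.gz.gpg
# To restore later:
#   gpg --decrypt /tmp/documents.tar.gz.gpg | tar -xzf -
```

The resulting file can then be uploaded with whatever tool the provider supports. Keep in mind that if you lose the passphrase, the backup is unrecoverable.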
In addition to security, another concern is how safe your backup is. Many providers work very hard to make sure that their servers do not fail. For example, SpiderOak claims 0.0000% failures (4 nines). Many providers also have the option of geographically dispersed sites so that you actually have multiple copies of your data (this is usually an extra fee).
This article is just a quick overview of four providers. It is not intended to be complete and comprehensive, but rather it is intended to whet your appetite to investigate this capability on your own. If you have a favorite provider that is not listed here, please tell us about it in the comments section. Tell us how you use them and why you like them. In the meantime, if you aren’t using an on-line backup provider, they are definitely worth considering. If you aren’t doing any backups at all, then on-line providers are an easy way to perform backups without having to buy extra hardware.
Jeff Layton is an Enterprise Technologist for HPC at Dell. He can be found lounging around at a nearby Frys enjoying the coffee and waiting for sales (but never during working hours).