Secure encrypted backup using duplicity for Linux and Mac

By David Mytton,
CEO & Founder of Server Density.

Published on the 4th July, 2013.

I have been looking for an alternative to Dropbox, which I use on my Mac mainly for backups (I rarely use the sharing features). The requirements were secure, encrypted backup (where I control the keys) and enough “intelligence” to perform incremental backups, i.e. not copying everything every time.

Searching for a Dropbox alternative

The first thing I considered was rsync. This is a well known tool which uses a clever algorithm to ensure that data is copied incrementally after the initial backup. It can be tunnelled over SSH for security but doesn’t include any encryption of the stored data, and it is really a synchronisation tool rather than a backup tool. This is great if you want a direct Dropbox alternative – you could set up multiple machines to sync through rsync – however, I wanted to be able to restore files from a specific date, keep deleted files for a period and only sync one way.

The solution to this is rdiff-backup. It uses the rsync algorithm to ensure efficient backups but is designed as a backup tool with point-in-time snapshots. You can restore to a specific time and even get back deleted files (which can be purged after a period).
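As a sketch of how that works (the paths here are assumptions for illustration, not from my setup), a typical rdiff-backup workflow looks like this:

```shell
# Back up a home directory to an external disk; each run stores
# a new point-in-time increment on top of the current mirror.
rdiff-backup /Users/david /Volumes/Backup/david

# Restore a file as it existed 10 days ago (hypothetical file path)
rdiff-backup -r 10D /Volumes/Backup/david/Documents/notes.txt notes-10-days-ago.txt

# Purge increments older than one year (--force is needed when
# deleting more than one increment at once)
rdiff-backup --remove-older-than 1Y --force /Volumes/Backup/david
```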

Secure encrypted backup

However, rdiff-backup is not encrypted. It’s not designed to be, so it falls to the user to secure their backups. I’d need to set up my own destination (e.g. an external disk or my own server) and configure encryption myself. With the availability of volume encryption like TrueCrypt, or the full disk encryption built into Ubuntu, it’s not a difficult task, but there is some work involved in provisioning my own server, setting things up and making sure I don’t make a mistake. rdiff-backup is really designed for internal backups, e.g. on your own secured network to existing storage.

My preference would be to encrypt locally and upload to a public file hosting service. Luckily, the same author has created exactly this, called duplicity. It works exactly the same way as rdiff-backup using the rsync algorithm but encrypts everything locally before uploading to whatever destination you wish.

duplicity storage can be local, or via SCP, rsync, WebDAV, Google Docs, Amazon S3, Rackspace Cloud Files, Ubuntu One and even IMAP. A full backup is completed initially, then further incremental backups ensure only the changed portions of files are copied, again using the rsync algorithm. The main advantage is that you can use “public” cloud services such as Amazon S3 or Rackspace Cloud Files without worrying about the files being compromised, because they are encrypted locally using GnuPG.

Being open source and written in Python, I can also inspect the source code if I should wish.

Setting up duplicity on Mac

As a command line tool, duplicity works the same on every supported platform. On Mac, it’s available via Homebrew and MacPorts, which makes it easy to install and resolves all the dependencies.

Having installed MacPorts, install duplicity:


sudo port install duplicity

Generating your encryption keys

MacPorts will install GnuPG so you then need to generate some keys:


gpg --gen-key

Follow the instructions and your key is ready. Since all your backups are encrypted using that key, if you lose your passphrase or key file then you lose access to your backups. Remember your passphrase and export/backup your key:


gpg --export-secret-key -a > mykey.key

You can import your key again using:


gpg --import < mykey.key

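You can confirm the key pair is in your keyring at any time:

```shell
# List the public keys GnuPG knows about; your new key should appear here
gpg --list-keys

# And the corresponding secret keys
gpg --list-secret-keys
```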

Backing up to Rackspace Cloud Files using duplicity

Having signed up for a Rackspace Cloud files account, you can simply run duplicity from the command line and it’ll go about encrypting and uploading your files. You need to give it your Cloud Files credentials which can be set as environment variables. From the command line:


export CLOUDFILES_USERNAME="usernamehere"
export CLOUDFILES_APIKEY="apikeyhere"
export PASSPHRASE="passphrasehere"

duplicity --exclude "**.DS_Store" --exclude "**.dropbox*" --verbosity i /Users/david/Dropbox/ cf+http://backup

This will back up the contents of my /Users/david/Dropbox directory to a new Cloud Files container called “backup”, which will be created if it doesn’t exist. It excludes the Mac OS X and Dropbox metadata files and outputs plenty of information to the terminal so you can see what it’s doing.

On first run it’ll back up everything, then subsequent runs will check for changes and only upload the differences. It supports resuming even on the initial run so if you need to stop it halfway through, it’ll continue when it left off the next time.
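If you want reassurance that the remote copy matches your local files, duplicity can compare the two without changing anything (a sketch reusing the same placeholder credentials and container name as above):

```shell
export CLOUDFILES_USERNAME="usernamehere"
export CLOUDFILES_APIKEY="apikeyhere"
export PASSPHRASE="passphrasehere"

# Compare the backup archive against the local directory and report
# any files that differ; neither side is modified.
duplicity verify cf+http://backup /Users/david/Dropbox/
```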

Using Rackspace Cloud Files UK (or other regions)

Being in the UK, it makes sense to upload to the closest version of Cloud Files. Rackspace have several locations, one of which is London, so I can specify the location by defining an additional environment variable before running duplicity:


export CLOUDFILES_AUTHURL=https://lon.auth.api.rackspacecloud.com/v1.0

The different location options are provided in the Rackspace documentation. Setting CLOUDFILES_AUTHURL also allows you to upload to other Cloud Files compatible locations e.g. your own OpenStack deployment.

Backing up to Amazon S3 using duplicity

Sign up to Amazon S3 and then run duplicity from the command line, and it’ll go about encrypting and uploading your files. You need to give it your security credentials which can be set as environment variables. From the command line:


export AWS_ACCESS_KEY_ID="accesskeyhere"
export AWS_SECRET_ACCESS_KEY="secretaccesskeyhere"

duplicity --exclude "**.DS_Store" --exclude "**.dropbox*" --verbosity i /Users/david/Dropbox/ s3+http://dmbackup1111

This will back up the contents of my /Users/david/Dropbox directory to a new Amazon S3 bucket called “dmbackup1111” (bucket names have to be globally unique), which will be created if it doesn’t exist. It excludes the Mac OS X and Dropbox metadata files and outputs plenty of information to the terminal so you can see what it’s doing.

On first run it’ll back up everything, then subsequent runs will check for changes and only upload the differences. It supports resuming even on the initial run so if you need to stop it halfway through, it’ll continue when it left off the next time.

Backing up to Amazon S3 Ireland (EU-West) using duplicity

If you create the bucket yourself then you can specify the region (you still need the --s3-use-new-style flag), but if you let duplicity create it then you need to specify some additional options:


duplicity --s3-use-new-style --s3-european-buckets --exclude "**.DS_Store" --exclude "**.dropbox*" --verbosity i /Users/david/Dropbox/ s3+http://dmbackup1111

This will create the bucket in the EU region. See also the man page note on European buckets because there can be some weird behaviour.

Restoring files using duplicity

At some point you’ll want to restore files, which is an easy command:


duplicity --file-to-restore "David Mytton - Dublin Blue.png" cf+http://backup /Users/david/Desktop/image.png --verbosity i

Here I am restoring a single file called David Mytton - Dublin Blue.png from my Cloud Files backup to /Users/david/Desktop/image.png. It will restore the latest version of that file, but I can specify a snapshot time using the -t flag with a formatted timestamp. The file name is relative to the root of the backup, so if you have a directory structure you need to specify the full path.
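For example, to get back the version of the same file from three days ago (reusing the container name from above), the -t flag accepts interval strings like 3D as well as full timestamps:

```shell
# Restore the copy of the file as it existed 3 days ago
duplicity -t 3D --file-to-restore "David Mytton - Dublin Blue.png" \
  cf+http://backup /Users/david/Desktop/image-3-days-ago.png
```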

Other duplicity options

duplicity has various options to allow you to clean up your backups, list all the available files, etc. You can find these in the documentation.
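A few of the commands I find useful, sketched against the same Cloud Files container as above:

```shell
# List every file in the latest backup set
duplicity list-current-files cf+http://backup

# Show the backup chains and sets stored remotely
duplicity collection-status cf+http://backup

# Delete backup sets older than six months
# (without --force it only prints what would be deleted)
duplicity remove-older-than 6M --force cf+http://backup
```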

Why not tarsnap?

Tarsnap is a very similar tool, but provided as a service which offers encrypted backups on top of Amazon S3. It’s well respected and used by a lot of people, but it is not open source (although you download and compile the client source yourself, it is not released under an open source license). All uploads go through the Tarsnap central server running on Amazon EC2, which is not freely released. This presents a potential single point of failure, especially since it’s run by an individual (who could disappear). Duplicity also provides time-based restores, whereas Tarsnap always restores the latest version, and duplicity lets me upload to a specific region, so I get faster transfer times.

With duplicity, I now have a secure backup which I run as a regular cron job to keep my backup up to date. With complete control over the destination, storage location and local encryption using my own key, I can be sure my data is secure.
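As a sketch, the cron entry might look like this (the wrapper script path is an assumption; the script would export the credentials and run the duplicity command shown earlier):

```shell
# crontab -e — run the backup every night at 2am
0 2 * * * /Users/david/bin/backup.sh
```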
