Tuesday, 13 September 2011

A tarsnap client script wrapper

I've been using tarsnap as my personal “cloud” backup solution for a while now and can warmly recommend it.

The tarsnap client tool (called tarsnap) feels and behaves just like you would expect of any respectable UNIX command: it has a well-written manpage; it has command-line arguments that you cannot remember, but which sort of agree with the conventions set by the elder UNIX commands. In fact, the tarsnap syntax is almost a superset of tar's.

In other words, it's practically crying out to be used as a building block in some script; and that's exactly what I've done here.

backup.py is my convenient tarsnap wrapper (github).

The idea is that since you generally back up the same set of archives regularly, you want to define the contents of each archive somewhere.

backup.py makes this easy by looking into a single (configurable) directory and making an archive from each directory entry. If the entry is a directory, it uses that. If it's a symbolic link, it uses that. If it's a directory containing symlinks, it follows them. On top of that, you can define exclusions (for example, archive everything in firefox-profile/ except for Cache) via the config file.
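To make the directory-scanning idea concrete, here is a minimal sketch of how such a scan could look. It is not the actual backup.py code: the function name plan_archives, the exclusions dictionary, and the exact handling of a directory containing symlinks (replace each link by its target, keep the other children) are my assumptions for illustration.

```python
import os


def plan_archives(backup_dir, exclusions=None):
    """Return one (archive name, paths, exclude patterns) tuple per entry.

    Hypothetical helper, not taken from backup.py. `exclusions` maps an
    entry name to a list of patterns, standing in for the config file.
    """
    exclusions = exclusions or {}
    plans = []
    for name in sorted(os.listdir(backup_dir)):
        entry = os.path.join(backup_dir, name)
        if os.path.islink(entry):
            # A symlink entry: archive whatever it points to.
            paths = [os.path.realpath(entry)]
        elif os.path.isdir(entry):
            children = [os.path.join(entry, c) for c in sorted(os.listdir(entry))]
            if any(os.path.islink(c) for c in children):
                # A directory containing symlinks: follow each link,
                # keep ordinary children as they are.
                paths = [os.path.realpath(c) if os.path.islink(c) else c
                         for c in children]
            else:
                # A plain directory: archive it directly.
                paths = [entry]
        else:
            # Plain files are ignored in this sketch.
            continue
        plans.append((name, paths, exclusions.get(name, [])))
    return plans
```

Calling plan_archives("/path/to/backup-dir", {"firefox-profile": ["Cache"]}) would then give one plan per archive, with the Cache pattern attached only to the firefox-profile entry.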

Since tarsnap has no concept of different versions of a backup archive, backup.py will append the current date (yyyy-mm-dd) to each archive name. (This works well with tarsnap since it does deduplication, so you automatically only pay for the diffs between archives.)
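The date suffix is easy to sketch. The snippet below builds the argument list for one dated archive; the function name tarsnap_command and its parameters are hypothetical and not taken from backup.py, and it relies only on the tar-style -c, -f and --exclude options of tarsnap. It returns the command instead of running it, so nothing is uploaded.

```python
import datetime


def tarsnap_command(name, paths, excludes=(), today=None):
    """Build a tarsnap invocation for an archive named <name>-yyyy-mm-dd.

    Hypothetical helper for illustration; `today` can be passed in to
    make the result reproducible.
    """
    today = today or datetime.date.today()
    # date.isoformat() yields yyyy-mm-dd, e.g. 2011-09-13.
    archive = f"{name}-{today.isoformat()}"
    cmd = ["tarsnap", "-c", "-f", archive]
    for pattern in excludes:
        cmd += ["--exclude", pattern]
    return cmd + list(paths)
```

The resulting list could be handed to subprocess.run(cmd, check=True) once per archive. Running it twice on the same day would try to create an archive name that already exists, which a real wrapper would have to handle somehow.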