Tuesday Aug 05, 2008

'Timemachine'ish backup with ZFS and rsync

Apple MacOS X Timemachine is a nice piece of software. However it does not compress the data, and it only works on MacOS X. I was looking to a similar solution that works also on other Unix Systems (as well as MacOS X) and does transparent compression of the data. The idea is to have multiple backups on one disk, each backup showing the state of the source hard disk or directory at the time of the backup, browsable by the normal file system tools, without storing any data duplicated. I found a solution using the ZFS file system and 'rsync' (rsync is pre-installed on most Unixish operating systems).


My tutorial is for MacOS X, but it can be adapted to any of the Systems that support the ZFS file system

Step 1: preparing a ZFS file system

I used the steps from the ZFS on MacOS-Forge site to create a ZFS pool on an external USB drive: Finding the disk:
# diskutil list . . . /dev/disk2 #: type name size identifier 0: Apple_partition_scheme *9.4 GB disk2 1: Apple_partition_map 31.5 KB disk2s1 2: Apple_HFS FW 9.2 GB disk2s3

writing a GPT label on the external disk (be sure to not format your 'main' disk here!):

# diskutil partitiondisk /dev/disk2 GPTFormat ZFS %noformat% 100% Started partitioning on disk disk2 Creating partition map [ + 0%..10%..20%..30%..40%..50%..60%..70%..80%..90%..100% ] Finished partitioning on disk disk2 /dev/disk2 #: type name size identifier 0: GUID_partition_scheme *9.4 GB disk2 1: EFI 200.0 MB disk2s1 2: ZFS 9.0 GB disk2s2

createing a ZFS pool on the disk called 'macbook-backup':

# zpool create macbook-backup /dev/disk2s2

enable compression on the new pool and disable ATIME:

# zfs set compression=on macbook-backup # zfs set atime=off macbook-backup

the hard drive is now prepared.

Step 2: creating the first (the 'base') backup

now I create the first full backup, which I call the 'base' backup. For this I create a new file system called 'base' in the ZFS pool 'macbook-backup':
# zfs create macbook-backup/base

next I copy all files from my backup-source directory (or the whole source disk) to the backup:

# rsync -avh --progress --delete /Users/myuser /Volumes/macbook-backup/base/

depending on the size of the data to backup, this will take a while.

Once the backup is finished, we can access all files under '/Volumes/macbook-backup/base'. With the ZFS command I can check the compression-ratio of our backup:

# zfs get all macbook-backup/base NAME PROPERTY VALUE SOURCE macbook-backup/base type filesystem - macbook-backup/base creation Wed Jan 31 9:08 2007 - macbook-backup/base used 2.21G - macbook-backup/base available 1.76G - macbook-backup/base referenced 2.21G - macbook-backup/base compressratio 1.38x - macbook-backup/base mounted yes - macbook-backup/base quota none default macbook-backup/base reservation none default macbook-backup/base recordsize 128K default macbook-backup/base mountpoint /Volumes/macbook-backup/base default macbook-backup/base sharenfs off default macbook-backup/base shareiscsi off default macbook-backup/base checksum on default macbook-backup/base compression on local macbook-backup/base atime off local macbook-backup/base devices on default macbook-backup/base exec on default macbook-backup/base setuid on default macbook-backup/base readonly off default macbook-backup/base zoned off default macbook-backup/base snapdir hidden default macbook-backup/base aclmode groupmask default macbook-backup/base aclinherit secure default macbook-backup/base canmount on default macbook-backup/base xattr on default

Step 3: creating an incremental backup

Now, a few weeks later, I want to make a new, incremental backup. So I create a new snapshot of the base file system and then clone that snapshot into a new file system:
zfs snapshot macbook-backup/base@20080804 zfs clone macbook-backup/base@20080804 macbook-backup/20080804

The directory for my new, incremental backup will be '/Volumes/macbook-backup/20080804'. So far the new file system does not use any space on the hard drive. Now I do the new backup with 'rsync':

# rsync -avh --progress --delete /Users/myuser /Volumes/macbook-backup/20080804/

The new backup will only take as much new space on the backup hard drive as there were changes compared to the base backup. But still I am able to browse through the file system at '/Volumes/macbook-backup/20080804' and see all files that were available at the date of the 2nd backup.

Any subsequent snapshots for more backups will be done from the 20080804 file system.


Post a Comment:
  • HTML Syntax: Allowed