posted on 03:57 AM on Friday 15 August 2008

This page describes my backup strategy. I have a Linux machine (fillmore) which I use to store all my data. It has more than 2TB of storage, so backing up everything is not practical, and I dislike using DVDs for backups because I do not think they are very reliable. To make the backup manageable, I categorize the data into the following types:

  • Important data - data which I generated myself or which is important for my work. This can be further split into two types: dynamic and static. Dynamic data is important data which is still being worked on, while static data is not going to change. This category includes all files generated during my work and important personal data like photos.

  • Normal data - mostly material obtained from external sources, which can be obtained again or is not essential. This is not backed up.

  • System data - data which is used by and came with the operating system. It can always be restored by a fresh installation, so the only reason to back it up would be a faster system restore.
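
The categories above can be written down as a small lookup, sketched here in Python. The directory names are hypothetical (this page does not list any paths), and treating photos as static and unknown paths as normal data are assumptions made only for the example.

```python
from enum import Enum
from pathlib import PurePosixPath


class Category(Enum):
    IMPORTANT_DYNAMIC = "important-dynamic"
    IMPORTANT_STATIC = "important-static"
    NORMAL = "normal"
    SYSTEM = "system"


# Hypothetical directory layout, for illustration only.
CATEGORY_BY_ROOT = {
    PurePosixPath("/data/work"): Category.IMPORTANT_DYNAMIC,
    PurePosixPath("/data/photos"): Category.IMPORTANT_STATIC,  # assumed static
    PurePosixPath("/data/downloads"): Category.NORMAL,
    PurePosixPath("/usr"): Category.SYSTEM,
}

# Only important data is backed up; normal and system data are skipped.
BACKED_UP = {Category.IMPORTANT_DYNAMIC, Category.IMPORTANT_STATIC}


def categorize(path: str) -> Category:
    """Return the category of the first configured root containing `path`."""
    p = PurePosixPath(path)
    for root, category in CATEGORY_BY_ROOT.items():
        if p == root or root in p.parents:
            return category
    # Assumption: anything not listed is treated as normal data.
    return Category.NORMAL


def needs_backup(path: str) -> bool:
    """True if the path falls under an important (dynamic or static) root."""
    return categorize(path) in BACKED_UP
```

With this layout, needs_backup("/data/work/report.txt") is true while anything under /usr or /data/downloads is skipped.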


All of the dynamic important data lives on my MacBook Pro (ernest), which usually holds the most recent copy. Unison is used to sync the non-code data to fillmore. This results in two copies of the data in two different places. Fillmore lags behind ernest between syncs, but I sync quite often, so the difference is typically small.
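
As a rough sketch of the Unison step, the Python helper below only builds the command line. It assumes the sync is started from ernest and reaches fillmore over ssh; the paths in the usage note are hypothetical, and this is not necessarily the exact invocation I use.

```python
def unison_command(local_root: str, remote_host: str, remote_root: str) -> list:
    """Build a non-interactive Unison invocation for two roots.

    Unison addresses a remote root as ssh://host/path; an absolute remote
    path therefore shows up with a double slash after the host name.
    """
    remote = f"ssh://{remote_host}/{remote_root}"
    # -batch makes Unison run without asking questions.
    return ["unison", local_root, remote, "-batch"]
```

Passing the returned list to subprocess.run(cmd, check=True) would start the sync, for example with unison_command("/Users/me/Documents", "fillmore", "/srv/data/documents").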


All my code is kept in a version control system called Monotone. Fillmore typically acts as the server while ernest checks in the code over the network, which Monotone supports. So again there are two copies of the code, and they are version controlled.


Rsync is then used to sync important dynamic data on fillmore to another partition. A cron job runs the sync on a regular basis. Static important data is also housed on fillmore, and rsync is used to maintain two copies of it on two different hard drives.
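
A minimal sketch of such a mirror job in Python follows. The paths, the script location, and the schedule in the comment are all hypothetical, and using --delete (so the copy is an exact mirror) is an assumption rather than something stated on this page.

```python
def mirror_command(source: str, destination: str) -> list:
    """Build an rsync command that mirrors `source` into `destination`.

    The trailing slash on the source tells rsync to copy the directory's
    contents rather than the directory itself.
    """
    src = source.rstrip("/") + "/"
    dst = destination.rstrip("/") + "/"
    # -a preserves permissions, times and symlinks; --delete removes files
    # from the copy that no longer exist in the source (assumed behaviour).
    return ["rsync", "-a", "--delete", src, dst]


# A crontab entry along these lines could run a script that calls
# subprocess.run(mirror_command(...), check=True) every night at 03:00
# (hypothetical schedule and script path):
#   0 3 * * * /usr/bin/python3 /home/user/bin/mirror.py
```

Because --delete propagates deletions, a mirror like this guards against a disk failure but not against accidentally removing a file.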


Rsnapshot is used to maintain weekly snapshots of the dynamic important non-code data. None of these measures places the data in two different physical locations, apart from the data which I carry around on ernest. This is probably the biggest drawback of this setup.
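
To illustrate the idea behind weekly snapshots, here is a small Python sketch of directory rotation combined with rsync's --link-dest option, which hard-links unchanged files against the previous snapshot. It is an approximation of the technique, not rsnapshot's own code or configuration; the weekly.N directory names and the function names are assumptions for the example.

```python
import shutil
from pathlib import Path


def rotate_snapshots(root: Path, keep: int, prefix: str = "weekly") -> Path:
    """Shift prefix.0 -> prefix.1 -> ... and drop the oldest snapshot.

    Returns the path where the new prefix.0 snapshot should be written.
    """
    oldest = root / f"{prefix}.{keep - 1}"
    if oldest.exists():
        shutil.rmtree(oldest)
    # Rename from the oldest remaining snapshot down to the newest so that
    # no rename overwrites an existing directory.
    for i in range(keep - 2, -1, -1):
        current = root / f"{prefix}.{i}"
        if current.exists():
            current.rename(root / f"{prefix}.{i + 1}")
    return root / f"{prefix}.0"


def snapshot_command(source: str, root: Path, prefix: str = "weekly") -> list:
    """Build the rsync command that fills prefix.0 after a rotation."""
    newest = root / f"{prefix}.0"
    previous = root / f"{prefix}.1"
    cmd = ["rsync", "-a", "--delete"]
    if previous.exists():
        # Unchanged files become hard links into the previous snapshot,
        # so each extra snapshot only costs the space of changed files.
        cmd.append(f"--link-dest={previous.resolve()}")
    cmd += [source.rstrip("/") + "/", f"{newest}/"]
    return cmd
```

Calling rotate_snapshots(root, keep=4) followed by running snapshot_command(source, root) once a week would keep roughly a month of history on the same disk, which still does not address the single-location drawback.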