
March 2, 2005

What do you back up?

I'm facing this question on multiple fronts right now. For the book, I have been hacking away at Chapter 17, on data backup and restore (after this one, only three more to go!).

At work, we're in the process of bringing a bunch of new machines into service.

The real question is which filesystems actually need to be backed up. That leads to another question: in which cases would we find having a backup necessary? My answer: any machine holding important data (test data does not qualify) that is not available, and already being backed up, on another machine.
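The rule above can be written down as a small predicate. This is only a sketch of the reasoning, not something we actually run; the `Machine` class, its field names, and the example hosts are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical description of a machine for the backup decision."""
    name: str
    has_important_data: bool        # test data does not count as important
    data_backed_up_elsewhere: bool  # data is available, and backed up, on another machine


def needs_backup(machine: Machine) -> bool:
    """Back up only machines whose important data is not already covered elsewhere."""
    return machine.has_important_data and not machine.data_backed_up_elsewhere


# Hypothetical examples, one per kind of outcome.
webserver = Machine("web1", has_important_data=True, data_backed_up_elsewhere=True)
logging_server = Machine("log1", has_important_data=True, data_backed_up_elsewhere=False)
test_server = Machine("test1", has_important_data=False, data_backed_up_elsewhere=False)

for m in (webserver, logging_server, test_server):
    print(m.name, "yes" if needs_backup(m) else "no")
```

Running each of our machine roles through this question gives the yes/no answers in the rest of this post.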

webservers - no
Data on our webservers is pulled from a central machine: MySQL data over the network and files via rsync. In the event of a compromise or failure, we'd most likely have the machine jumpstarted and then resync any necessary data.

primary data and database server - yes
This machine holds our end-user shell accounts (very limited), the master instance of the database, and a central copy of all non-database files. In the event of a failure, we'd have to decide whether to restore everything from the backup. Obviously the user accounts would need to be restored, but the data would probably be fresher if pulled from a replicated server (whichever one stepped in as the primary server at the time of failure).

replicated database servers - no
The data for these machines comes from the primary database server. If one of them fails, we'll always want to rebuild from the data on the primary database machine.

media servers - no
The data for the media servers is synced from the primary data machine. Again, if a failure occurs we'd repair or rebuild and then resync the data files.

logging server - yes
The system logs from all our machines go onto this server. A backup of that data is pretty important. On a machine failure we'd definitely restore from the backup to get the log history.

test servers - no
On test machines, the code is pulled from CVS and the data is pulled (periodically) from the primary database server backup. No need for a backup after a failure: rebuild or repair, then resync the data from the appropriate places.

development server - yes
CVS and all the developer accounts reside on this machine, so we definitely want that backed up. If our dev server goes down, we want /home and /data restored from the backup.
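For the roles marked "no" above, recovery is a rebuild followed by a resync from the machine that owns the data. As a rough sketch of that resync step, here is a helper that builds an rsync command line. The host name and paths are placeholders, not our real layout, and the `-a` and `--delete` flags are my assumption about what a full resync would want.

```python
import shlex
from typing import List


def resync_command(source_host: str, source_path: str, dest_path: str) -> List[str]:
    """Build an rsync command that pulls a directory tree from the source machine."""
    # A trailing slash on the source copies the directory's contents
    # rather than creating the directory itself inside the destination.
    src = f"{source_host}:{source_path.rstrip('/')}/"
    dst = dest_path.rstrip("/") + "/"
    # -a: archive mode (recursive, keeps permissions, ownership, timestamps)
    # --delete: remove local files that no longer exist on the source
    return ["rsync", "-a", "--delete", src, dst]


# Hypothetical host and paths, for illustration only.
cmd = resync_command("primary.example.com", "/data/media", "/data/media")
print(shlex.join(cmd))
# prints: rsync -a --delete primary.example.com:/data/media/ /data/media/
```

The command is returned as a list so it can be handed straight to `subprocess.run` without going through a shell.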

We back up our MySQL database with mysqlhotcopy every night (actually, the backup runs against a replicated database, so as not to disturb the production database). That way, if just the database goes down, or there is a problem with the data, we have a history of data files on hand.
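To show how a nightly run can build up that history, here is a minimal sketch that puts each night's copy in its own dated directory. The database name and backup root are invented for the example, and the basic `mysqlhotcopy db_name /path/to/directory` form is the only invocation assumed; our actual script and options may differ.

```python
import datetime
import posixpath
from typing import List


def hotcopy_command(db_name: str, backup_root: str, day: datetime.date) -> List[str]:
    """Build a mysqlhotcopy command that copies one database into a dated directory."""
    # One directory per night (e.g. .../2005-03-02) keeps a history of data files.
    # Assumption: the dated directory is created before the command is run.
    target = posixpath.join(backup_root, day.isoformat())
    return ["mysqlhotcopy", db_name, target]


# Hypothetical database name and backup location; meant to run on the replica.
cmd = hotcopy_command("appdb", "/backup/mysql", datetime.date(2005, 3, 2))
print(" ".join(cmd))
# prints: mysqlhotcopy appdb /backup/mysql/2005-03-02
```

Run from cron on the replicated server, something like this would leave one directory per night to restore from.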

Posted by mike at March 2, 2005 3:50 PM