Know Thy Backups – Part I

Ben Keener

More often than not, server backups are misunderstood. With dozens of hardware options and hundreds of software options, finding the right backup can be intimidating. To assuage some of those fears and clear up a bit of that confusion, let’s go over a few of the most common backup schemes. This list isn’t all-inclusive, and the options presented shouldn’t be mistaken for backup plans. A backup scheme is simply a method of creating backups. A backup plan (or disaster recovery plan) is a scheduled implementation of a backup scheme. As we evaluate each scheme, we’ll look at the requirements, costs and benefits, and by the end of our tour, you can decide which best fits your business.

Before we get too far into the specifics of the different schemes, we should define some fundamental terms that we’ll use throughout the comparison:

  • An archive is a set of data that is being preserved
  • A reference point is a single archive against which comparisons are made
  • A restore point is the most recent working backup

The key question a backup scheme answers is this: “If a server suffers a catastrophic failure, what is needed to resume operations with minimal downtime and data loss?” Again, the backup scheme is not a complete disaster recovery plan — its focus is the restoration of data.

The four basic backup schemes we’ll compare are full server backups, simple incremental backups, multi-level incremental backups and differential incremental backups. The primary considerations in choosing a scheme are the server load the backup process generates, the size of the backup files, and how quickly a backup can be restored.

Full Server Backups

A full server backup is one of the simplest backup schemes. It takes only a single backup archive to create a restore point, which makes data restoration simple and fast. The drawbacks are the amount of time it takes to make the backup, the load it generates, and the total size of the backup. Each backup scheme we’re comparing uses a full backup of the server.

As we evaluate the other schemes, you’ll note they all start with a full backup as a reference point, and create their own restore points as they move forward.

Simple Incremental Backups

A simple incremental backup attempts to resolve some of the issues with full backups, and it does a good job. With an incremental backup, a single full backup is made that serves as both a restore point and the initial reference point. Subsequent backups are a little more complex. Instead of making a new full backup each time, this scheme compares the current state of the server against the state recorded in the most recent reference point (for the first incremental backup, that is the initial full backup). If it finds any changes, it backs up those changes and generates a new snapshot of the drive as another reference point. This new reference point is then used for the next incremental backup.
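
The comparison step described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the server's state is modeled as a dictionary of file paths and contents, and the names make_incremental, State and Increment are hypothetical, not part of any real backup tool.

```python
from typing import Dict, Optional, Tuple

# Assumption for illustration: a server's state is a mapping of path -> contents.
State = Dict[str, str]
# An incremental archive maps path -> new contents, or None if the file was deleted.
Increment = Dict[str, Optional[str]]


def make_incremental(reference: State, current: State) -> Tuple[Increment, State]:
    """Compare the current state with the reference point.

    Returns the changes to back up and the new reference point.
    """
    changes: Increment = {}
    for path, contents in current.items():
        if reference.get(path) != contents:
            changes[path] = contents  # new or modified file
    for path in reference:
        if path not in current:
            changes[path] = None  # file deleted since the reference point
    # The snapshot of the current state becomes the next reference point.
    return changes, dict(current)


# Hypothetical data: Sunday's full backup, then the server's state on Monday.
full_backup = {"users.db": "v1", "app.conf": "a"}
monday_state = {"users.db": "v2", "app.conf": "a", "log.txt": "x"}
monday_changes, reference = make_incremental(full_backup, monday_state)
# monday_changes holds only users.db and log.txt; app.conf is unchanged and skipped.
```

Real backup software usually detects changes with timestamps, checksums or block-level tracking rather than comparing full contents, but the shape of the process is the same: compare, store the differences, and move the reference point forward.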

This structure means the restore point for a server using this scheme consists of the initial full backup and every incremental backup made since. This dependency is the primary weakness of simple incremental backups: every backup in the chain, from the original reference point through each incremental addition, must be complete and uncorrupted for the data to be fully restored. If any backup is missing, corrupt or incomplete, the restoration can’t proceed past that point.

The server load and storage space required for this type of backup are generally lower than what you’ll see in a full backup scheme, especially when there aren’t many differences between the current state of the server and the reference point. On the other side of the spectrum, if the entire data set changes between backups, the storage requirements and server load will be the same as they were when full backups were being performed.
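
To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It assumes a constant fraction of the data changes each day and ignores compression and deduplication; the function name weekly_storage_gb and the figures used are hypothetical.

```python
def weekly_storage_gb(full_size_gb: float, daily_change_fraction: float,
                      days: int = 7) -> dict:
    """Rough storage totals for one backup cycle of `days` backups."""
    # Full scheme: every backup is a complete copy of the server.
    full_scheme = full_size_gb * days
    # Simple incremental: one full backup, then only the changed data
    # on each of the remaining days.
    incremental_scheme = (
        full_size_gb + full_size_gb * daily_change_fraction * (days - 1)
    )
    return {"full": full_scheme, "incremental": incremental_scheme}


# Hypothetical server: 100 GB of data, 5% of it changing each day.
light_change = weekly_storage_gb(100, 0.05)
# Worst case: the entire data set changes every day.
total_change = weekly_storage_gb(100, 1.0)
```

With these assumed figures, a week of full backups takes about 700 GB while the incremental cycle takes about 130 GB. When everything changes daily, both schemes land at the same 700 GB.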

Example: Simple Incremental Backups

I am implementing incremental backups for a database that houses all of my users’ data. I decide I am going to start with a full backup each Sunday — the slowest day of the week for the database — and do an incremental backup on each subsequent day. This process starts over again every Sunday. On Friday, my server suffers a catastrophic hard drive failure. I am told by the technician who replaced the drive that the controller failed, and the heads were idly tapping the side of the drive cage. Everything on the drive is lost.

I gather my backups and begin to restore them on the new replacement drive. The backups from Sunday, Monday and Tuesday restore without a hitch, but Wednesday’s backup is corrupted and will not complete. This means I have lost all of the data from Wednesday and Thursday. Without Wednesday’s backup, the rest of my incremental backups are useless.
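
The failed restore above can be modeled with a short Python sketch. It is an illustration only: archives are plain dictionaries, a corrupt or missing archive is represented as None, and the names restore, State and Increment are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumption for illustration: a server's state is a mapping of path -> contents.
State = Dict[str, str]
# An incremental archive maps path -> new contents, or None for a deleted file.
Increment = Dict[str, Optional[str]]


def restore(
    full_backup: State,
    increments: List[Tuple[str, Optional[Increment]]],
) -> Tuple[State, List[str]]:
    """Apply increments in order, stopping at the first unreadable archive.

    Returns the restored state and the labels of the archives that
    could not be used.
    """
    state = dict(full_backup)
    for index, (label, changes) in enumerate(increments):
        if changes is None:
            # Every later increment depends on this one, so none can be applied.
            unusable = [name for name, _ in increments[index:]]
            return state, unusable
        for path, contents in changes.items():
            if contents is None:
                state.pop(path, None)  # file was deleted in this increment
            else:
                state[path] = contents
    return state, []


# Hypothetical week: Sunday's full backup plus four daily increments.
sunday = {"users.db": "sunday"}
week = [
    ("Monday", {"users.db": "monday"}),
    ("Tuesday", {"users.db": "tuesday"}),
    ("Wednesday", None),  # corrupted archive
    ("Thursday", {"users.db": "thursday"}),
]
restored, unusable = restore(sunday, week)
# restored reflects Tuesday's data; Wednesday and Thursday are unusable.
```

Thursday's archive is intact, yet it is still reported as unusable, because it only records changes relative to Wednesday's reference point.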

There are two incremental backup schemes that attempt to address this issue: the differential and the multi-level incremental backup schemes. In Part II of “Know Thy Backups,” we’ll explain the pros and cons of these methods, and you’ll be ready to plan your backup strategy.

-Ben
