SQL Server Reference Guide

Backup and Recovery

Last updated Mar 28, 2003.

It usually only takes one bad experience with a database to make you realize the importance of backups and recovery. In my case I was the Database Administrator for a small development firm, and I also wrote the system installation and upgrade code. One of the clients of the software experienced a physical problem with an upgrade, and contrary to instructions stopped the process mid-stride. The administrator tried (once again, against the instructions in the application documentation) to re-start the upgrade after the failed attempt, not once, but three times. This of course left the system in a wholly unusable state, and they contacted us.

As part of the instructions I had written that the administrator should take a complete system backup before attempting the upgrade, which is a practice you should follow even without direct instruction. But the administrator didn't do that.

The unfortunate part of the whole situation is that after a lot of finger pointing they finally agreed to rebuild their system and restore the database from the last backup. But when they tried that, the DBA found a problem with their tape drive, and even after replacing it they found the last six months of backups were not valid. The company suffered huge financial problems, the DBA and the Administrator lost their jobs, and I changed my installation and upgrade code to automatically take a backup whether the administrator checked off that it had been done or not.

In this tutorial I'll explain how backups and restores work, and in others I'll explain the strategies you can use to implement these two commands. A good backup and recovery strategy is one of the first things you should work on in any DBA role.

Database Physical Architecture

Before I cover the details of backup and recovery, you need a little background information on how SQL Server databases work. I’ve covered this material in other articles, but it’s worth noting here again.

Logs and Files

SQL Server is made up of two basic components. The first component is the file set that SQL Server uses. The second is the software engine and tools used to control those files.

The files are broken down further into two camps. The first camp is the set of database files. There can be more than one database file, used for performance, size or safety reasons. The primary file normally has an extension of .mdf, and any additional data files normally use .ndf.

The second camp is the set of transaction log files. And just what is a transaction log? Whenever a user writes or edits data, SQL Server first records the change in a sequential fashion in a special file, called the transaction log. This data is then written to the database file or files by another process.

Why all this writing and re-writing? It has to do with safety. The most important thing a database server does is guarantee the integrity of the database. If data were written directly to the database, an error could cause the entire file to become corrupt. The transaction log provides an "airlock" to make sure the data is kept clean in the database files.

Another benefit in using a transaction log involves performance. If you physically separate the drive where the log files are stored from the database file drive, then the data being written doesn’t slow down the data being read.

Yet another benefit in using a transaction log is that you have a record of each transaction. As the data is written to the database files, a "recovery model" setting determines when the data is erased from the transaction log file. (I’ll explain those models a little later.) If the log files are kept intact, just having the last backup and the transaction log can help you recover the database to current data. Think about it: the backup has the data from the database files, and the transaction log has the changes since that time. Keep those things safe, and you can recover the data even if you lose the disk on which the database files are stored.
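The idea in the previous paragraph can be sketched in T-SQL. This is only a sketch under some assumptions: the database is the pubs sample database, it uses the Full or Bulk-Logged recovery model, the data files are lost but the log file and the SQL Server instance survived, and the backup file name is one I made up for illustration.

```sql
-- Assumption: the pubs data files are gone, but the transaction log file is intact.
-- NO_TRUNCATE lets SQL Server back up the log even though the database is damaged,
-- capturing the changes made since the last log backup.
BACKUP LOG pubs
TO DISK = 'c:\temp\pubs_tail.bak'   -- hypothetical file name
WITH NO_TRUNCATE
GO
```

You would then restore the last full backup and the log backups in order, finishing with this one.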

Now that you have a feel for how the server works (at a high-level), we can continue our discussion of backups and recovery.

SQL Server Backup

The SQL Server backup program is built right into SQL Server — you don’t need to buy anything else. To be sure, you can get other backup software that can back up a SQL Server database, but you don’t have to.

SQL Server’s native backup uses the Microsoft Tape Format (MTF), so it’s compatible with the backup built into the Windows operating system.

You can also back up a database to a file on the hard drive or to a share location on the network. I have a pointer to more information about that at the end of this article.

SQL Server also has support for the Virtual Device Interface (VDI) that allows "instant" or snapshot backups. I’ll explain that concept a little further on in this article. Another important fact is that SQL Server backups are incredibly fast. The restore process is a bit slower, but backups are really quick.

SQL Server 2000 allows real-time backups. That means that you don’t have to keep users out of the database while you’re backing it up. Your backup is current as of the time the backup ends. If the backup starts at midnight but takes two hours to complete, you’ll have the data current as of 2:00am.

I’m often asked if the backup file is compressed. The answer is no, at least with the native backup commands when you're using a version lower than SQL Server 2008 (where it is an option). If you are going to back up the database to the hard drive, it’s important that you have at least as much room on the hard drive as the database files consume.

All SQL Server editions support the major features of the various backup and recovery scenarios I’ll describe in this article. In SQL Server 2000, the primary backup difference between editions is the Log Shipping option, available only in the Enterprise edition. This feature allows you to back up the transaction log and apply it to another server, providing a "warm" standby.

You can password-protect a backup. This helps keep someone from accidentally restoring your backup to another server, although the password doesn’t encrypt the backup contents. It also means that if you forget the password, you can’t restore the backup, so make sure you keep this password safe.

You don’t have to do anything beforehand to restore a backup. You don’t have to create any files, services or anything like that. You can restore to a database that is already there (as long as no one is using the database), or you can use the restore process to create an entirely new database.

Finally, you can use SQL Server’s Enterprise Manager or SQL Server Management Studio to point-and-click your way to backup and restore commands, or you can type them in directly using Transact-SQL (T-SQL). I’ll use T-SQL commands throughout this article, since the graphical tools carry the same concepts.

To help you develop your comprehensive backup and recovery plan, I’ll explain two key concepts: recovery models and backup types.

Recovery Models

Every SQL Server database has a recovery model. The recovery model setting has to do with how the transactions are handled in the transaction log, and when the log entries are erased. There are three models: Full, Simple, and Bulk-logged.

Each of these models has advantages and disadvantages, and as you make your choices it’s important to keep the tradeoffs in mind. With the recovery model, it’s a tradeoff between backup speed, size, safety, and ease. Each model has a mix slanted in favor of one or more of these features. You should understand what each model means so that you can make the right choices for your situation.
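If you want to see which model a database is using, or change it, a minimal sketch looks like this. I'm assuming the pubs sample database here; substitute your own database name.

```sql
-- Show the current recovery model: FULL, BULK_LOGGED, or SIMPLE
SELECT DATABASEPROPERTYEX('pubs', 'Recovery') AS RecoveryModel
GO

-- Switch the database to the Full recovery model
ALTER DATABASE pubs SET RECOVERY FULL
GO
```

After you switch to the Full model, take a full database backup so that later log backups have a starting point.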

Full Recovery Model

This is the most comprehensive model there is. As the user enters or changes data, each operation passes through the transaction log, just as always. The difference is that when the data is written to the database file, the entry in the transaction log is not erased.

You might be thinking that the data has to get erased at some point, and you’re right. Whenever you back up the transaction log (more on that in a minute), the entries that have made it into the database since the last log backup are "truncated" from the log file. A full database backup on its own doesn’t truncate the log in this model.

As you can imagine, you need to back up the log fairly often, especially if you have a lot of data entry going on; otherwise the log file keeps growing.

As you can probably guess, this model scores high on safety, but lower on speed, ease and size. This model is more difficult to implement during restores, since you need the last full database backup, and then you need to sequentially restore each transaction log backup in order to restore the database to a full state.

There’s an obvious data safety advantage to the full model. But there’s also the fact that, since you have the base data in the database and all the changes in the log files, you can restore the database to a point-in-time. A special qualifier at the end of the restore command allows you to set a time to which you want the database to be restored and voila! Your data stops at that time mark.
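The qualifier in question is STOPAT. Here is a minimal sketch of a point-in-time restore, assuming a full backup in c:\temp\pubs.bak, a single log backup in c:\temp\3AMLog.bak, and a target time I picked purely for illustration.

```sql
-- Restore the full backup, but leave the database ready to accept log restores
RESTORE DATABASE pubs
FROM DISK = 'c:\temp\pubs.bak'
WITH REPLACE, NORECOVERY
GO

-- Apply the log backup, stopping at the chosen moment (hypothetical time)
RESTORE LOG pubs
FROM DISK = 'c:\temp\3AMLog.bak'
WITH RECOVERY, STOPAT = '2003-03-28T02:15:00'
GO
```

Any transactions committed after the STOPAT time are left out of the restored database.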

Another recover-to-mark is possible with this model using a "Named Transaction." When you want to set a mark like this in the transaction log, issue this command:

BEGIN TRANSACTION transactionlabel WITH MARK 

The transaction log now carries a mark with that name. End the transaction block with this command:

COMMIT TRANSACTION transactionlabel

Now you can use the command RESTORE LOG using the WITH STOPATMARK='transactionlabel' clause to restore to just after the mark. To restore the data to just before the data mark, use RESTORE LOG and the WITH STOPBEFOREMARK='transactionlabel' clause.
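Putting those pieces together, here is a sketch of a marked transaction and a later restore to just before it. The mark name PriceUpdate, the price change itself, and the backup file names are all assumptions for illustration; the titles table is part of the pubs sample database.

```sql
USE pubs
GO

-- Mark the log at the start of a risky change (hypothetical mark name)
BEGIN TRANSACTION PriceUpdate WITH MARK 'Before raising prices'
    UPDATE titles SET price = price * 1.10
COMMIT TRANSACTION PriceUpdate
GO

-- Later, after a log backup has captured the mark, restore to just before it.
-- Run the restore from master, since nobody can be using pubs.
USE master
GO
RESTORE DATABASE pubs
FROM DISK = 'c:\temp\pubs.bak'
WITH REPLACE, NORECOVERY
GO
RESTORE LOG pubs
FROM DISK = 'c:\temp\3AMLog.bak'
WITH RECOVERY, STOPBEFOREMARK = 'PriceUpdate'
GO
```

The mark is only written to the log when the marked transaction actually changes data, which the UPDATE here does.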

Simple Recovery

The Simple recovery model is just what its name implies: easy, with low log space and maintenance requirements. In this model, transactions are erased from the log soon after they are committed and written to the database files (at the next checkpoint). As you can imagine, this method isn’t super-safe. The advantages with the simple model are the ease, speed and space overhead issues.

With the simple recovery model, you still have a transaction log. It just gets truncated so often that you can’t recover the database to a point in time with it.

Bulk-Logged

The next type of recovery model is Bulk-Logged. This model has high performance and low log space requirements for bulk operations. It is similar to a mixture of the simple and full models: ordinary transactions are still fully logged, while bulk operations such as CREATE INDEX, Bulk Load, SELECT INTO, WRITETEXT and UPDATETEXT are only minimally logged.

This model is useful when you have large import jobs that you run. If you can reproduce that data, you can lessen the load on the log files with this model. Many people take a full backup after each large load process, and pick up the OLTP data with log backups.
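One way that pattern might look in T-SQL is sketched below. The table sales_import, the source file c:\temp\sales.txt, and the backup file name are hypothetical, and the field and row terminators are assumptions about the file's layout.

```sql
-- Switch to Bulk-Logged just for the load
ALTER DATABASE pubs SET RECOVERY BULK_LOGGED
GO

-- Load the data (hypothetical table and file)
BULK INSERT pubs.dbo.sales_import
FROM 'c:\temp\sales.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK)
GO

-- Switch back and take a full backup so the loaded data is protected
ALTER DATABASE pubs SET RECOVERY FULL
GO
BACKUP DATABASE pubs
TO DISK = 'c:\temp\pubs_afterload.bak'
WITH INIT
GO
```

Keep in mind that you can't do a point-in-time restore within a log backup that contains minimally logged operations, which is one reason to switch back to Full once the load is done.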

Backup Types

The next important concept to understand is the type of backups that SQL Server supports. We’ll take a look at each, and then I’ll show you how to combine the recovery model with the backup type to create your recovery plan.

Full Backup

The full backup operation backs up all the data in the database. Every table, stored procedure and all other objects in the database are placed into a single backup file on a hard drive or tape. In the Full and Bulk-Logged recovery models, this operation does not truncate the transaction log; only a log backup does that, so you still need separate log backups. In the Simple model there are no log backups to take.
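A minimal full backup to a file on the hard drive might look like this. The path is an assumption; point it at a drive with enough free space.

```sql
-- Back up the whole pubs database to a disk file.
-- INIT overwrites any backup sets already in that file.
BACKUP DATABASE pubs
TO DISK = 'c:\temp\pubs.bak'
WITH INIT
GO
```

Leave off INIT if you would rather append each backup to the same file.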

Log Backup

The log backup operation makes a backup file of all items in the transaction log, and then truncates the log. This operation is normally done throughout the day using what is called your "tolerance level." That is the largest span of time for which your company finds a loss of data acceptable. Normal intervals are anywhere from six hours down to an hour. Unless your system is really high-volume, backing up the log much more often than that leaves you with so many log backups that the restore process becomes pretty ridiculous. If your system is that heavily used, you should consider a cluster.
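A log backup is a single command. In this sketch the file name is an assumption, and the database needs to be in the Full or Bulk-Logged recovery model with at least one full backup already taken.

```sql
-- Back up the transaction log for pubs, then truncate the backed-up portion
BACKUP LOG pubs
TO DISK = 'c:\temp\3AMLog.bak'
WITH INIT
GO
```

You would normally schedule this as a job so that it runs at your chosen interval.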

Differential Backup

The Differential Backup is very useful. Even if you’re in the simple model, this will gather all changes since the last full backup into a backup file. It doesn’t use the transaction log to do this; it uses a mark in the database that shows what’s been backed up and what hasn’t. This means that the differential database backups will grow over time until the next full backup.

The differential backup is kind of like a huge log backup. If you’re using the simple model, I advise that you use this kind of backup at least once a day, in addition to the regular complete backup you make at night.
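Here is a sketch of taking a differential backup and then restoring with it. The file names are assumptions, and the differential has to be based on the full backup you restore first.

```sql
-- Back up only the changes made since the last full backup
BACKUP DATABASE pubs
TO DISK = 'c:\temp\pubs_diff.bak'
WITH DIFFERENTIAL, INIT
GO

-- To restore: the full backup first, then the most recent differential
RESTORE DATABASE pubs
FROM DISK = 'c:\temp\pubs.bak'
WITH REPLACE, NORECOVERY
GO
RESTORE DATABASE pubs
FROM DISK = 'c:\temp\pubs_diff.bak'
WITH RECOVERY
GO
```

You only need the latest differential, not every one taken since the full backup.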

Filegroup Backup

This type of backup is used for really large systems. When you have some big databases (like into the multi-terabyte range), you might not be able to complete the backup in a realistic time period. If this situation occurs, you can segment the database objects into separate filegroups. You then use this type of operation to back up the filegroups, one at a time, to bring down the time requirement.
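A filegroup backup could be sketched like this. The database BigSales and its filegroup Archive are hypothetical names; the pubs sample database has only the PRIMARY filegroup.

```sql
-- Back up a single filegroup of a large database (hypothetical names)
BACKUP DATABASE BigSales
FILEGROUP = 'Archive'
TO DISK = 'c:\temp\BigSales_Archive.bak'
WITH INIT
GO
```

Filegroup backups go hand in hand with log backups, since the log is what brings a restored filegroup back in line with the rest of the database.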

Snapshot Backup

The snapshot backup is used in conjunction with hardware and software vendors to do an "instant backup and restore." If you’re not sure if you have that type of hardware or software, you probably don’t. If you do, see your hardware vendor’s documentation to find out how they’ve implemented this operation.

With those two concepts in hand, it can still be a bit daunting to figure out which backup type goes with which recovery model. This chart will help you sort it out.

Recovery Model/Backup Type Matrix:

                 Backup Type
Recovery Model   Database   Differential   Log        File
Full             Required   Optional       Required   Optional
Bulk-Logged      Required   Optional       Required   Optional
Simple           Required   Optional       N/A        N/A

I’ve been telling you about the backup plan. Now I need to let you know how to do the restore part.

Restore

Implementing the restore process is much simpler than planning for the backup process, since the restore type is totally dependent on the type of backup you’re trying to restore.

The basic command for the restore of the main database backup looks like this:

RESTORE DATABASE nameofdatabase
FROM devicename
WITH lotsofoptions

For instance, a simple restore of a database that already exists, where the backup is a file on the hard drive, looks like this:

RESTORE DATABASE pubs 
FROM disk='c:\temp\pubs.bak'
WITH REPLACE

Basically, those commands say to restore a database called pubs from a file on the hard drive called c:\temp\pubs.bak and to replace the current database called pubs. As I mentioned earlier, no one can access the database during the restore. This set of commands also assumes that the file locations recorded in the backup exist on the destination system. For instance, if you backed up a database on a system where the database or log files are stored on drive J:, the example command above assumes that you have a J: drive on the restoring system as well, with the same subdirectories where the files were originally stored.
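If the destination system doesn't have the same drives or directories, the MOVE option lets you place the files somewhere else. In this sketch the d:\data paths are assumptions, and so are the logical file names pubs and pubs_log; use whatever names RESTORE FILELISTONLY reports for your backup.

```sql
-- List the logical and physical file names stored in the backup
RESTORE FILELISTONLY
FROM DISK = 'c:\temp\pubs.bak'
GO

-- Restore, relocating each file to a path that exists on this server
RESTORE DATABASE pubs
FROM DISK = 'c:\temp\pubs.bak'
WITH REPLACE,
     MOVE 'pubs' TO 'd:\data\pubs.mdf',
     MOVE 'pubs_log' TO 'd:\data\pubs_log.ldf'
GO
```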

If you’re in the Full recovery model, you normally restore the full backup and then each log since that backup in turn.

As an example, let’s look at the following situation. Then I’ll use that example situation to show you the restore commands I would use to bring a database up to date.

Let’s assume you have backed up the pubs database as of last night at midnight. It’s a heavily used database, so you’ve backed up the transaction logs every three hours since then. It’s now 9:30 AM, and after a hardware crash you want to restore the database to be as current as possible.

The first thing to do is restore the full database backup. You want to notify SQL Server that it isn’t to close out the restore process, as you’ve got a few more transaction log backups to apply after the main backup is restored. To do this, issue the following commands:

RESTORE DATABASE pubs 
FROM disk='c:\temp\pubs.bak'
WITH REPLACE,
NORECOVERY

That last bit (WITH ... NORECOVERY) is the magic part. It tells the database to wait for another backup.

Next you’d locate the first transaction log backup (assume that it’s called 3AMLog.bak) and type the following command:

RESTORE LOG pubs 
FROM disk='c:\temp\3AMLog.bak'
WITH NORECOVERY

Notice that although you specify that you’re restoring a log file, the first argument is the name of the database to which the log belongs.

Again, you specify the WITH NORECOVERY qualifier so that you can continue to restore more logs. Now, assuming that you’ve got two more transaction log backups called 6AMLog.bak and 9AMLog.bak, you’d issue the following commands:

RESTORE LOG pubs 
FROM disk='c:\temp\6AMLog.bak'
WITH NORECOVERY
GO
RESTORE LOG pubs 
FROM disk='c:\temp\9AMLog.bak'

And you’re all set. Notice that the last line doesn’t have the WITH NORECOVERY qualifier, which closes out the restore process and marks the database ready for use.

There are quite a few other qualifiers and options for the RESTORE command, and you can find those in Books Online. With what you’ve learned in this article, you should be able to put that command to good use.

And use it you should. The biggest mistake most DBAs make is not to practice restores. This process is called validation, and is the mark of someone who’s been burnt by not doing it. I advise that you restore your backups to another server at least once a month or more frequently, just to make sure they work, and also to give you the confidence you need when the real thing comes along. When you have a real meltdown, you’ll be glad you practiced a restore operation.
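A practice run might be sketched as follows. The database name pubs_test, the target paths, and the logical file names pubs and pubs_log are assumptions for illustration.

```sql
-- Quick check that the backup set is complete and readable.
-- This does not prove that the data inside is sound.
RESTORE VERIFYONLY
FROM DISK = 'c:\temp\pubs.bak'
GO

-- The real test: restore under a different name so nothing live is touched
RESTORE DATABASE pubs_test
FROM DISK = 'c:\temp\pubs.bak'
WITH MOVE 'pubs' TO 'c:\temp\pubs_test.mdf',
     MOVE 'pubs_log' TO 'c:\temp\pubs_test_log.ldf'
GO
```

Only the second command tells you that the backup can actually be turned back into a working database.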

Backups are the best insurance you have as a DBA — but ensure that the media is good.

InformIT Articles and Sample Chapters

I didn't talk a lot about the locations for backup in this tutorial. My friend Richard Waymire does that in this article.

Online Resources

SimpleTalk also has a great series on SQL Server Backups. You can check those out here.