SQL Server Reference Guide

Developing an Exit Strategy

Last updated Mar 28, 2003.

We're all familiar with Murphy's Law: If something can go wrong, it will. As a DBA, I've seen this proven time and time again. But perhaps that's too negative a view of the world. I like the quote I heard long ago: "I'm an optimist who carries an umbrella." I think that should describe the professional DBA: someone who has a positive, can-do attitude with lots of confidence, and a backup.

I was recently writing a chapter for my latest book when I got to a section on upgrading a system. I put a small section there to highlight some of the things that you should think about when you're upgrading a platform from one version to another. I call that kind of preparation an Exit Strategy. An exit strategy involves planning what to do if things don't go well. This kind of plan goes by other names as well, such as disaster recovery, emergency planning, and change management, but I differentiate an exit strategy from those kinds of plans because of when it is created and what it contains.

I create an exit strategy whenever I touch a system and that touch might result in a change. Obviously upgrading an entire platform from one version to another fits that description, but I include an exit strategy even when I'm only setting up monitoring, if that monitoring requires me to change the system in any way. I create an exit strategy not only for database changes, but also for any change that might affect the system where the database is stored, such as patches, security fixes, and hardware changes.

The exit strategy is a formal process. I make sure I record any changes I intend to make, and what I plan to do if they don't work out. I communicate this plan to as many people involved with the system as will listen. I do this to protect myself ("Nobody said you were going to do this"), but I also try to involve as many people as I can in the change process because I want the change to succeed. Many times when I send out an e-mail or call a meeting to discuss the change someone speaks up and says "Are you aware that they are running program X on there?" or "Did you know they took memory out of that system last week?" I always find out more when I do the discovery. Every time that I don't get the feedback, there's a little "gotcha" that makes life hard for everyone.

So how do you create an exit strategy, and how do you format it to get feedback from everyone? I've put together a few questions you should answer to help you create a plan to do just that. These are the questions that a manager should ask, but often doesn't.

What is the Change?

First, you need to define what the change is in two ways: a high-level, business-language description, and a technical description of exactly what you're doing, step by step.

These descriptions help you think through all of the steps in the change, and they help everyone understand why those steps are important. They can be as simple as a paragraph or two in an e-mail sent to a wide distribution list, or they can be military-style documents with executive summaries and paragraph numbers. Your organization's documentation conventions should set the tone.

I'm a big fan of keeping everything electronic, and I'm also fond of a single copy. That means I try to post everything on the web, with an e-mail notification sent with a link to that page. Depending on the impact of the change, I'll also have an electronic "sign-off" where I ask application or hardware owners to give their blessing to the change.

What is Affected?

This might seem obvious – at least at first. "I plan to apply the latest security patches to the XYZ database server", you think to yourself, "...so that's all that is affected." In fact, the security patch might make material changes to the Microsoft Database Access Components (MDAC) layer. If it does, then the effect on the server might be a significant change in the application that talks to the server over that layer.

In the case of hardware, you have to be specific about which parts will be changed and how thoroughly they have been regression tested with the system. Notifying others enables you to leverage their knowledge about impacts that you aren't aware of.

This might bring up a serious deficiency in your organization. Do you know all of the applications that use your database servers, and do you know who owns them? Have any "special reports" been written against the database using Office applications (shudder) or other interfaces? Are the security and access for the database well documented? Your exit plan might cause others to do a little work first, but it's worth it when something goes wrong.

How is it Affected?

Your exit strategy now contains a description of the change and the items that are affected, but you also need to explain how they are affected. Is this a replacement, an enhancement, or a fix? Will it cause the system to operate in some new way or have new requirements? In the case of software, does this change require a higher service pack, which itself is another change? In the case of hardware, does this have any new power requirements? Will additional backup storage be needed?

What Will You Do If It Doesn't Work?

This section is the heart of the exit strategy. All of the other items lead up to this part of the plan.

As you describe each change in the system, you should include a chart that shows what is affected and how it is affected. You'll also need to include a risk factor. This is a number that estimates how much damage a failure in that step could cause. For instance, one step might be shutting the system down to install more memory. The risk there is minimal, since shutting a system down rarely affects whether it will come back up again. But the step where you install the memory might entail removing old memory to put the new in. If you don't protect against electrostatic discharge (ESD), you could damage both the old and new memory. Even though the risk is remote, it's the highest in the process, so it gets a higher number.

Beside that information goes what you'll do to mitigate that risk (wear an ESD wristband, for instance) and what you'll do if it fails anyway (take memory from a non-critical server). That's the core of your exit strategy, and here's what it looks like:

Process: Change Memory in Server XYZ

| Step | What is Affected | How is it Affected | Risk Factor | Exit Strategy |
| --- | --- | --- | --- | --- |
| Notify Users | Application X | Down | 0 | Current Backup |
| Bring Down Server | Server, Application X | Down | 1 | Determine reason for failure, correct with binned parts |
| Remove Old Memory | Hardware | NA | 3 | Replace with memory from server X |
| Replace Memory | Hardware | NA | 4 | Replace with previous memory |
| Restart Server | NA | NA | 1 | Determine reason for failure, correct with binned parts |
| Test Application | Application X | Data checked | 1 | Current Backup |
| Notify Users | Application X | Data operational | 1 | Current Backup |
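If you keep the plan electronically, the chart above can also live as structured data, which makes it easy to pick out the riskiest steps before you send the plan around. The Python sketch below is only an illustration: the Step class and the two helper functions are hypothetical names I made up for this example, and the rows are simply copied from the chart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One row of the exit strategy chart (hypothetical structure)."""
    name: str
    affected: str
    effect: str
    risk: int
    exit_strategy: str


# Rows copied from the "Change Memory in Server XYZ" chart.
PLAN = [
    Step("Notify Users", "Application X", "Down", 0, "Current Backup"),
    Step("Bring Down Server", "Server, Application X", "Down", 1,
         "Determine reason for failure, correct with binned parts"),
    Step("Remove Old Memory", "Hardware", "NA", 3,
         "Replace with memory from server X"),
    Step("Replace Memory", "Hardware", "NA", 4,
         "Replace with previous memory"),
    Step("Restart Server", "NA", "NA", 1,
         "Determine reason for failure, correct with binned parts"),
    Step("Test Application", "Application X", "Data checked", 1,
         "Current Backup"),
    Step("Notify Users", "Application X", "Data operational", 1,
         "Current Backup"),
]


def riskiest_step(plan):
    """Return the step with the highest risk factor (first one wins a tie)."""
    return max(plan, key=lambda step: step.risk)


def steps_at_or_above(plan, threshold):
    """Return the steps whose risk factor is at least `threshold`, in order."""
    return [step for step in plan if step.risk >= threshold]


if __name__ == "__main__":
    worst = riskiest_step(PLAN)
    print(f"Highest risk: {worst.name} (risk {worst.risk})")
    print(f"If it fails: {worst.exit_strategy}")
    for step in steps_at_or_above(PLAN, 3):
        print(f"Review before sign-off: {step.name}")
```

The risk numbers are still your own estimates. The code only sorts and filters what you wrote down, so the thinking has to happen when you fill in the chart.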

How Long Will It Take To Implement That?

You'll need to add one more column to the previous chart: how long you expect each step to take. For most of the steps it's an easy estimate, because you have a general understanding of how long it takes to shut down or bring up a server. In other cases, such as a restore operation, you might not be sure. Keep in mind that a restore is much slower than a backup, often taking twice as long or more.

What you're doing is setting expectations. I've had managers push to implement a change such as a software upgrade in the middle of the workweek. When things go awry, they are shocked to find that even with a good plan to restore the system, it will be down for over 8 hours, simply due to restore time. Alerting them to this risk ahead of time helps them understand what they are asking you to do.
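To put a number in front of a manager, a rough calculation is usually enough. The sketch below assumes the rule of thumb from this section, that a restore takes about twice as long as the backup did, and it assumes the worst case is that every step runs and a full restore is still needed at the end. The step durations and the four-hour backup time are invented for illustration, and the function names are hypothetical, so substitute your own measurements.

```python
def estimate_restore_minutes(backup_minutes, restore_factor=2.0):
    """Estimate restore time as a multiple of the measured backup time.

    restore_factor=2.0 is the "two times or more" rule of thumb, not a
    measured value for any particular system.
    """
    if backup_minutes < 0 or restore_factor <= 0:
        raise ValueError("backup_minutes must be >= 0 and restore_factor > 0")
    return backup_minutes * restore_factor


def worst_case_downtime_minutes(step_minutes, backup_minutes, restore_factor=2.0):
    """Planned time for every step plus a full restore at the end."""
    return sum(step_minutes) + estimate_restore_minutes(backup_minutes, restore_factor)


# Assumed durations, in minutes, for the seven steps in the memory chart.
STEP_MINUTES = [5, 10, 15, 15, 10, 20, 5]
# Assumed: the last full backup took four hours.
BACKUP_MINUTES = 240

if __name__ == "__main__":
    planned = sum(STEP_MINUTES)
    worst = worst_case_downtime_minutes(STEP_MINUTES, BACKUP_MINUTES)
    print(f"Planned outage: {planned} minutes")
    print(f"Worst case with restore: {worst / 60:.1f} hours")
```

With these made-up figures the planned work is short, but the restore alone accounts for eight hours, which is exactly the kind of surprise you want on the table before the change is scheduled.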

InformIT Articles and Sample Chapters

Although it deals more with disaster recovery than with an exit strategy, this free sample chapter from the book Disaster Recovery Planning: Preparing For The Unthinkable by Jon William Toigo has some great material that works for both.

Online Resources

Want to see what good (and bad) planning can do? Check out this article from the Chronicle of Higher Education.