Creating a Performance Tuning Audit - Developing an Action Plan
Last updated Mar 28, 2003.
This is the last of the conceptual tutorials on creating a Performance Tuning Audit. I've explained the steps that you should follow to document and monitor your entire application landscape, and how to interpret the results of the monitoring in light of the documentation. Putting all that together will point you to the areas that need further attention. Once you evaluate those areas, as I described in the last tutorial, you can interpret what the possible causes for slowdown might be. Not every problem needs this level of detail. If you're dealing with a simple slowdown on one component because load has increased, you may not need a full audit process. But if you have a large, complex system, I recommend a review like this once a year.
Throughout the process I've reminded you not to interpret what you're seeing until you're in the third phase of the audit process. If your workplace is like most of the shops where I've conducted these audits, the technical staff has so much work to do that they haven't had time to go back and evaluate their entire landscape. It takes most of the day just to keep the systems up and running and deal with new requests, which leaves little time to go back and document and monitor all of the current systems.
It is almost certain that as you perform the tuning audit you will run into obvious problems, like systems not having the latest service packs installed or not enough RAM for the applications installed on that device. The temptation is to immediately correct the issues you find along the way. This temptation is even greater when the person doing the audit is not part of the local technical team. The team might be embarrassed or concerned that their knowledge will be called into question if problems are exposed. I faced this situation many times as a consultant.
Unless the problem is extremely severe, however, you shouldn't make any corrections, not even at this last phase of the audit. Making changes while you're still in the process changes the system, and makes the previous and subsequent evaluations less effective. For instance, if midway through the audit you find a system without the proper drivers installed and you correct that issue, you may correct a bottleneck that then exposes another in a system earlier in the chain. By making that correction, you've lost the ability to use the metrics you gathered earlier, and you'll have to repeat the entire process.
There's another danger in making corrections, and staying within the framework I've been explaining keeps this problem from happening. The danger is that the change you make may not be isolated, and implementing it may damage the integrity of the system. For example, I once had a developer notice and correct an issue with an index on the database structure. Making that correction caused another part of the application to generate repeated timeouts, and because that component was not designed to handle timeouts, it allowed incorrect data to be stored in the system. For a DBA, the worst possible condition is not a loss of system availability, but allowing incorrect data into the system.
To keep the process on track, the last phase is arguably the most important. Remember that you are performing an audit — that's a passive word. In the audit you are only evaluating the system and recommending changes. There are no changes made to the system at this point.
The last phase of the process is to create your action plan. As I've mentioned before, the tool and format of the presentation are less important than following the process for the audit. I use Microsoft Office software to create my presentations, primarily because I find it the best fit for the job. Almost everyone has Office products or can at least download the readers for free, and all Office products allow text and graphical representations. Remember that your audit has two audiences: one technical, and one business. The technical audience wants the data in as raw a format as possible, so that they can use it within their favorite evaluation and presentation tools. The business audience wants you to get to the point, and often wants graphs and charts that show trends and end results. The Microsoft Office components I use to satisfy both audiences are Excel for raw numbers, PowerPoint for presentations, and Word for recommendations, including the action plan.
The action plan contains several components, but what it delivers is quite simple. Here are the main elements:
- What you've found
- The interpretation of those findings
- What you think should be done about it
- Who needs to do the work
- How long the work will take
- What the impact of the change will be
- The exit strategy for each item
I normally create a Word document with each of these headings and full details and a set of PowerPoint slides with those headings with any graphics that make the point clearer. Let's take a look at each of these elements in detail.
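If you would rather track the plan as structured data before writing it up, the seven elements above map naturally onto one record per finding. The following is only a minimal sketch under my own assumptions: the class name, field names, and the `missing_elements` helper are hypothetical and are not part of any tool mentioned in this tutorial.

```python
from dataclasses import dataclass, fields


@dataclass
class ActionPlanItem:
    """One finding in the action plan; one field per element listed above."""
    finding: str             # What you've found
    interpretation: str      # The interpretation of those findings
    recommendation: str      # What you think should be done about it
    owner_role: str          # Who needs to do the work (a role, not a name)
    estimated_duration: str  # How long the work will take
    expected_impact: str     # What the impact of the change will be
    exit_strategy: str       # How to put the system back


def missing_elements(item):
    """Return the names of any elements that were left blank."""
    return [f.name for f in fields(item) if not getattr(item, f.name).strip()]


# Hypothetical example item with the exit strategy still to be written.
draft_item = ActionPlanItem(
    finding="Query timeouts on the order lookup screen",
    interpretation="A missing index forces a full table scan",
    recommendation="Add a nonclustered index on the lookup column",
    owner_role="Senior DBA",
    estimated_duration="2 hours",
    expected_impact="Faster lookups, slightly slower inserts",
    exit_strategy="",
)
unfinished = missing_elements(draft_item)
```

A check like this is a cheap way to catch an item that has no exit strategy before the plan goes in front of the business.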
What You've Found
In this section you have two choices. You can list out every area you evaluated, with links to the raw data in Excel. If you're in a situation where things have gone horribly wrong, you might want to go to this level of detail. More likely, however, you only need to include the "outliers", which are just the items that you've found with possible issues.
I normally allude to all of the testing components so that the technical and business communities know that we've looked at everything, and then I drill down a little on the items that show up as issues.
The Interpretation of Those Findings
This is the most important section to the technical audience. They will carefully evaluate the statements you make here, so be sure that you include all of the supporting information you collected to validate your conclusions.
If you found that the main cause of the slowdowns or issues was an incorrectly configured middle tier and some poorly designed T-SQL statements, make sure you support your arguments as scientifically as you can. Do not assign blame, use the "they" word, or disparage anyone else's work. You're trying to solve a problem, not cast blame. Make sure that comes through in the documentation.
What You Think Should be Done About It
With the problems identified, list the possible corrections for each of the items you've found that need attention. You may recommend more than one course of action for a particular area. For instance, "We can add more RAM, or redesign the query." If you do that, make sure you list the pros and cons of each action: "If we add more RAM, we need to take the system down for a short while, but the system will work faster. If we redesign the query, it will take longer to implement but will not entail any downtime."
If more than one course of action gives the same benefit, allow the business to decide. Give them enough technical information to make the decision without overwhelming them with jargon. You may think that the technical team knows best, but the business has information you do not. For instance, they may be shutting down that line or plant soon, so they just want a band-aid to limp along until that happens.
Don't worry that a bad decision made by the business will come back to haunt you. Remember, you're documenting everything and presenting it, so you have a permanent record of your suggestions and which route the business chose.
Who Needs to Do the Work
In this section you should indicate which department and resources need to work the issues. Don't assign names here. Just indicate that the "Senior DBA should handle X, and a System Administrator should handle Y". That way the project managers can decide who is available and what the work impact will be.
How Long the Work Will Take
For the business, this is one of the primary questions. Detail each correction, and indicate how long it will take to do that task.
It is also important to explain which items can be done in parallel and which ones are dependent on each other. If X needs to be done before Y, then the resources and downtimes for X have to be scheduled first.
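To make the dependency point concrete, here is a minimal sketch that groups tasks into "waves": everything in one wave can be done in parallel, and each wave must finish before the next one starts. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). The function name and the task names are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter


def schedule_in_waves(dependencies):
    """Group tasks into waves; tasks within one wave can run in parallel.

    `dependencies` maps each task to the set of tasks that must finish first.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if tasks depend on each other in a loop
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to keep the output stable
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so dependent tasks become ready
    return waves


# Hypothetical corrections from an audit.
example_dependencies = {
    "add RAM": set(),
    "apply service pack": set(),
    "rebuild indexes": {"apply service pack"},
    "rewrite slow query": {"rebuild indexes"},
}
example_waves = schedule_in_waves(example_dependencies)
```

With these assumed tasks, the RAM upgrade and the service pack land in the first wave, so their resources and downtime are the ones to schedule first.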
What the Impact of the Change Will Be
Not all of your suggestions should be implemented — at least not right away. You may come away with the suggestion that the team review all indexes or stored procedures in the system. While that's a valid proposal, you may not have enough time and resources to do it all.
The way out of this dilemma is to rank your suggestions by the amount of work and the expected gain. For instance, changing three indexes might provide an estimated 30% performance increase, while changing all of the rest might only bring another 5%. Ranking each of your suggestions this way, along with the time it takes to implement the change, allows everyone to choose the right course of action.
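One simple way to produce such a ranking is to divide the estimated gain by the effort and sort on the result. The sketch below is an illustration under assumptions, not a prescribed method: the 30% and 5% figures come from the example above, while the effort hours, class name, and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    name: str
    estimated_gain_pct: float  # expected performance gain, in percent
    effort_hours: float        # estimated time to implement (assumed values)

    def __post_init__(self):
        if self.effort_hours <= 0:
            raise ValueError("effort_hours must be positive")

    @property
    def gain_per_hour(self):
        """Estimated percentage gain per hour of work."""
        return self.estimated_gain_pct / self.effort_hours


def rank_suggestions(suggestions):
    """Return suggestions ordered from best to worst gain per hour."""
    return sorted(suggestions, key=lambda s: s.gain_per_hour, reverse=True)


suggestions = [
    Suggestion("Change all remaining indexes", estimated_gain_pct=5.0, effort_hours=40.0),
    Suggestion("Change three key indexes", estimated_gain_pct=30.0, effort_hours=6.0),
]
ranked = rank_suggestions(suggestions)
for s in ranked:
    print(f"{s.name}: {s.gain_per_hour:.3f}% per hour")
```

Gain per hour is only one possible score. If downtime matters more than staff time in your shop, the same structure works with a different denominator.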
The Exit Strategy for Each Item
This is one of the most overlooked parts of any evaluation. For each item, on each line, for any course of action, make sure that you indicate what you can do to "put the system back." I've seen numerous occasions where an obvious course of action is indicated, the team makes and implements a suggestion, and something entirely unexpected happens. The team sits in stunned silence as the business asks "what can we do to at least get back where we were?" Answer that question here and now and you won't have to go through this pain.
The exit strategy might be as simple as "restore from backup." Make sure you really can do that, and that you indicate how long that will take. This is referred to in some documentation as the "risk mitigation factor," and it allows the business to determine how dangerous a particular item is. Remember that they don't have your level of technical expertise, so you need to indicate in terms of downtime and repair time how much a bad decision might cost.
Informit Articles and Sample Chapters
You may decide that you are going to run a continuous performance audit process. In addition to using Microsoft Office products to do collection and presentation, you can use SQL Server to present the data to both technical and business staff, using Reporting Services. Check out this resource to help you understand how to do that.
Not happy with your PowerPoint skills? Check out this site for a quick tutorial to help you present your data effectively.