
SQL Server Reference Guide


Creating a Performance Tuning Audit - Evaluation Part One

Last updated Mar 28, 2003.

In the Performance Tuning section of the guide I've been describing how you can tune a complete application. While you probably won't be able to perform all of the tuning efforts yourself on each and every component within the landscape, it's not unusual for the Database Administrator to lead the charge. The reason is that the database sits at the end of the chain, so to speak, so you're usually familiar with the path the data takes to get there. That makes it easier for you to make the initial efforts in gathering everyone together and defining the parts of the landscape for investigation. I call this process a "Performance Tuning Audit." It's really a set of documentation that shows the parts of the landscape, the measurements you've taken, what they mean, and what you plan to do about them.

After you define the parts of the landscape, which I covered in the last tutorial in this series, you now need to zero in a little more on those components. That's the focus of this tutorial and the next. In fact, let's back up a moment and review the four phases of the Performance Tuning Audit:

  1. Defining Components
  2. Evaluation
  3. Interpretation
  4. Action Plan

In the last tutorial I explained the way you can explore the landscape for the first phase, and the method you can use to limit the systems you'll evaluate. In a large system, there's simply no practical way to look at every component. What you're looking for is a way to "zoom in" and select the components that indicate they might be an issue.

So that becomes the next part of what you do. You'll need to break the Evaluation Phase into two parts. The first part deals with the measurements you can take for an individual component, called metrics. Each component will have different metrics based on what it is and what it does. In this tutorial I'll focus on that part of the process.

At this point in the process you've gathered all of the information you can around the design and function of your system. As I mentioned in the last tutorial, one of the biggest temptations, and the biggest dangers, you face is that during the component discovery process you'll uncover some obvious things that need correcting. I'll reiterate here that you should avoid making those corrections at (almost) all costs. The reason I qualify the "all costs" part is that if there's something that might cause data corruption, hurt someone, or stop your business, by all means, document and implement the correction.

If, however, you can wait, then do, because making the obvious correction carries several dangers. First and foremost, you're altering the original conditions of the test. In any kind of scientific testing, control is essential. Control simply means making sure you know what the state of every component in the experiment is. In effect, you're performing a kind of experiment when you set up a baseline set of measurements for a system and then measure it again when it performs poorly. By comparing those two measurements, you can see the parts that move. And that's the evaluation part that we're looking at now. So it's best that you document your findings and then wait until you've collected all of your information before you make any changes.

Once you've selected your test systems, your next decisions involve what to monitor. Your first impression might be that there are only a few things you can watch on a particular component, but on most components you'll find yourself faced with a long list of things you can collect. There are two ways you can approach your decision.

The first approach is to collect everything the component and its subsystems can report. This is usually the right thing to do when you are creating a baseline, since you don't have to dive in very far and you aren't looking for anything in particular. It does, however, create a mountain of data, which has to be categorized and stored somewhere. After all, no matter what data you collect, you'll need to be able to compare it to other metrics to see correlations later, so the more you collect, the more work you'll have to do.
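To make the "collect everything" approach concrete, here's a minimal sketch that takes one broad snapshot of the four core subsystems and appends it to a file for later comparison. It isn't from the original article: it assumes the Python psutil package is installed, and the file name and the particular fields are placeholders you'd replace with the full set of counters your components actually report.

# Baseline-collection sketch: take one broad snapshot of the four core
# subsystems (Processor, Memory, I/O, Network) and append it to a CSV
# for later comparison. psutil, the file name, and the field list are
# assumptions for illustration only.
import csv
import datetime
import os

import psutil


def snapshot():
    """Return one wide row of system-level metrics."""
    mem = psutil.virtual_memory()
    disk = psutil.disk_io_counters()
    net = psutil.net_io_counters()
    return {
        "timestamp": datetime.datetime.now().isoformat(),
        "cpu_percent": psutil.cpu_percent(interval=1),
        "mem_available_mb": mem.available // (1024 * 1024),
        "disk_reads": disk.read_count,
        "disk_writes": disk.write_count,
        "net_bytes_sent": net.bytes_sent,
        "net_bytes_recv": net.bytes_recv,
    }


def append_row(path, row):
    """Append one snapshot to the CSV, writing a header on first use."""
    new_file = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if new_file:
            writer.writeheader()
        writer.writerow(row)


if __name__ == "__main__":
    append_row("baseline.csv", snapshot())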

The other approach is to move from broad to specific, and from outside to inside. That means you start with the broadest objects and counters the component reports, covering the four parts of any computer system: Processor, Memory, I/O, and Network Interfaces. By comparing this data against the baseline measurements, you can see which of those outer objects have moved. Once you locate an area (say, in the Processor counters), you can drill in to just that object to see if there's a true correlation. Remember, an integrated system has many dependencies, so keep this in mind when you're selecting metrics.
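Here's a small sketch of that broad-to-specific comparison: it ranks the outer-level metrics by how far they've moved from the baseline, which tells you where to drill in next. It's an illustration rather than anything prescribed in the article; the metric names, the sample values, and the 25% threshold are all made up.

# Drill-down sketch: compare current broad metrics against a baseline
# and rank them by relative change, so you know which subsystem
# (Processor, Memory, I/O, Network) to investigate further.
# The names, values, and threshold below are illustrative placeholders.

def biggest_movers(baseline, current, threshold=0.25):
    """Return (metric, relative_change) pairs that moved more than threshold."""
    movers = []
    for name, base_value in baseline.items():
        if base_value == 0:
            continue  # skip idle counters to avoid division by zero
        change = (current.get(name, 0) - base_value) / base_value
        if abs(change) >= threshold:
            movers.append((name, change))
    return sorted(movers, key=lambda pair: abs(pair[1]), reverse=True)


baseline = {"cpu_percent": 20.0, "mem_available_mb": 4096, "disk_writes": 1500}
current = {"cpu_percent": 85.0, "mem_available_mb": 3900, "disk_writes": 1600}

for name, change in biggest_movers(baseline, current):
    print(f"{name}: {change:+.0%} versus baseline")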

If you're forced to do your measurements without a baseline, you're going to have to use your judgment about the metrics you're seeing, watch for dramatic movements, or both. This brings us to the most difficult part of this tutorial — choosing the individual objects and counters within each object to measure. For SQL Server, I've given you many things you can watch and even the process and tools to use in another tutorial. While that tutorial gives you a starting point, it is by no means exhaustive. Even if it were, you're evaluating the whole landscape, not just SQL Server. To choose the parts of the component you want to look at, you need to be very knowledgeable about the component, specifically regarding how it interacts with other parts of the landscape.
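For the SQL Server component itself, one convenient starting point in recent versions is the sys.dm_os_performance_counters view, which exposes the same objects and counters you'd see in System Monitor. The sketch below pulls a few broad counters from it; the pyodbc package, the connection string, and the particular counter list are my own assumptions for illustration, not the set recommended in the tutorial mentioned above.

# SQL Server starting-point sketch: read a few broad counters from
# sys.dm_os_performance_counters (available in recent versions of SQL
# Server). pyodbc, the connection string, and the counter list are
# assumptions for illustration.
import pyodbc

CONN_STR = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=yourserver;Trusted_Connection=yes"
)

QUERY = """
SELECT RTRIM(object_name) AS object_name,
       RTRIM(counter_name) AS counter_name,
       cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Batch Requests/sec',
                       'Page life expectancy',
                       'Buffer cache hit ratio');
"""

with pyodbc.connect(CONN_STR) as conn:
    for object_name, counter_name, value in conn.execute(QUERY):
        print(f"{object_name} - {counter_name}: {value}")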

But odds are you're not an expert in every component from networking to a given application. If that's the case, it's up to you to engage others, explain what you've learned here, and ask them to provide the input you need. Even so, I still recommend that you study up on what you're dealing with, since it may help you to ask questions about what the experts are measuring and why. Don't worry if the other experts get irritated with you for asking what they consider basic questions; just explain that it's part of doing an effective job.

Now let's return to the outline and discuss the second part of the Evaluation Phase. In this part of the process you will apply measurement tools and processes against the components you selected, using the metrics you decide on. You'll need to make more decisions here, on everything from the timing and interval of collection to which instance of each metric to evaluate. That's the focus of the next tutorial.
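As a preview of those decisions, here's a small sketch of what an interval-based collection run might look like. The 60-second interval, one-hour window, and file name are arbitrary examples rather than recommendations, and it again assumes the psutil package; the next tutorial covers how to choose these values properly.

# Interval-sampling sketch: once the metrics are chosen, the remaining
# decisions include how often and how long to sample. The interval,
# sample count, and file name here are arbitrary placeholders.
import datetime
import time

import psutil

INTERVAL_SECONDS = 60   # how often to take a sample
SAMPLES = 60            # how many samples (one hour at this interval)

with open("evaluation_run.csv", "a") as log:
    for _ in range(SAMPLES):
        stamp = datetime.datetime.now().isoformat()
        cpu = psutil.cpu_percent(interval=None)
        mem_mb = psutil.virtual_memory().available // (1024 * 1024)
        log.write(f"{stamp},{cpu},{mem_mb}\n")
        log.flush()  # keep the file current during a long run
        time.sleep(INTERVAL_SECONDS)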

Although I'll mention this again in the next tutorial, it's important not to get ahead of the process. Just as in the discovery part of the process, you may notice a counter moving in a certain way and be tempted to make a broad statement about what you're seeing. It's fine to watch the monitor moving, but it's important to allow the tests to complete before you make your interpretations of the data, and even more important that you wait until the interpretation is evaluated before you create a corrective action plan. I'll detail all these steps throughout this series.

Informit Articles and Sample Chapters

If you can't measure a particular application directly because the developers didn't add metrics to their software, you may need to combine system metrics where the application is running and a set of counters that watch the results on the database side. I explain how to create your own counters in this tutorial.
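As a rough illustration of that idea (the specifics are in the linked tutorial), SQL Server ships with ten user-settable counters, sp_user_counter1 through sp_user_counter10, which surface under the SQLServer:User Settable performance object. The sketch below pushes an application-level figure into one of them so it can be charted alongside the system metrics; pyodbc, the connection string, and the dbo.WorkQueue table are hypothetical stand-ins, not anything from the linked tutorial.

# User-settable counter sketch: publish an application-level number
# (here, a backlog count from a hypothetical dbo.WorkQueue table)
# through sp_user_counter1 so it shows up under the SQLServer:User
# Settable performance object. pyodbc, the connection string, and the
# table are assumptions for illustration.
import pyodbc

CONN_STR = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=yourserver;Trusted_Connection=yes"
)

with pyodbc.connect(CONN_STR) as conn:
    backlog = conn.execute(
        "SELECT COUNT(*) FROM dbo.WorkQueue WHERE ProcessedDate IS NULL"
    ).fetchval()
    conn.execute("EXEC sp_user_counter1 ?", backlog)
    conn.commit()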

Online Resources

One of the more difficult measurements to take is latency. There's a great article on that very topic, but from a rather interesting angle. This is a snippet from a product called VR Juggler — which has nothing to do with SQL Server, but has some fascinating information regarding latency measurements within their application. Even if you don't care about the application itself, take a look at the process shown here.