
SQL Server Reference Guide

Forming Queries Part 3: Query Optimization

Last updated Mar 28, 2003.

This is the final article in the "Forming Queries" series.

Now that you have the syntax in hand, it's time to create the query. This is the point at which you should think about optimization. It's often tempting to "just get the query out there" and re-evaluate it later to make it faster. Far too often, you'll forget to come back and correct the query. Worse, if you create the query incorrectly, you might cause related performance issues that are difficult to detect.

So how do you identify an inefficient query? First, you need to understand the route the query takes to return its results. This route, called a query execution plan, shows the methods the SQL Server engine uses to locate the rows of data, whether that means reading a table from beginning to end or using one or more indexes.

Microsoft SQL Server provides three tools to evaluate a query and the query plan. Two are available inside Query Analyzer, and the other is in the SQL Server Profiler tool.

The two methods inside the Query Analyzer tool are a graphical query plan and a text display of that plan. We'll start with the graphical tool, then examine the text you can receive from Query Analyzer, and finally I'll explain how to use SQL Server Profiler to see the execution plan as well.

I'll begin by opening Query Analyzer and connecting to the pubs database. Once there, I type a simple query:

SELECT * 
FROM authors
GO

Before I run the query, however, I'll select "Show Execution Plan" from the "Query" menu:

Figure 131

NOTE

One thing before I actually run the query: you can also choose to display an estimated execution plan. This means the system will attempt to predict what might happen as far as the path goes. I don't use this option very often, since there's rarely a reason to do so.

Some might argue that a particular query might take too long to run. My pushback is that a developer should be working on a development system – preferably a virtual system anyway. So who cares if it takes a while? The other reason I don't use this option is that if your query creates temporary tables, it won't display the plan; since the query doesn't really run, it doesn't create the temp tables, and there you are.

Getting back to our example, I press F5 to run the query, and I get the results. I also get an additional tab in the results pane, where I can see the graphical representation of the query plan:

Figure 132

There's quite a bit of information, even in this small display. You read the plan from right to left. If the query is complex, you'll see a lot of information. I recommend that you split the larger query into smaller ones until you know what they are doing, and then put them back together a little at a time.

You can see two icons which display the operations the engine used to get the data. I've clicked on one of them; that graphic then displays more detail about that operation.

Notice the thin arrow pointing to the left. It's thin because only a few rows moved through that step. On larger queries, where a step passes many more rows, this arrow will be much thicker.

Such small clues make this a powerful tool. For instance, if the icon turns red, it indicates that the operation could benefit from statistics. Right-click on the red icon, and you can create the statistics on the fly! Also, try moving your cursor over the direction arrows. You'll get the number of records that were transferred in that step.
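The right-click option has a T-SQL counterpart. Here's a minimal sketch, assuming the authors table in pubs; the statistics name stat_authors_city is made up, and the city column is chosen only for illustration:

```sql
-- Create column statistics by hand.
-- stat_authors_city is a hypothetical name; city is just an example column.
CREATE STATISTICS stat_authors_city
ON authors (city)
GO
```

Whether the optimizer actually benefits depends on the columns your query filters or joins on, so treat this as a starting point rather than a fix.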

You'll also see within the information box the CPU and I/O costs. Don't put too much stock in the numbers themselves – they just help the query optimizer create a total cost for the query, and don't reflect real world CPU ticks.

Look for the operations with the highest cost. Those are the steps to attack first.

This particular query used a clustered index scan, which means that the query processor satisfied the query by reading all the rows in order from the index, which is really the table. In the case of a clustered index, the data is physically stored in the order of the index, so reading that index turned out to be 100% of the cost of the query. The icon showing a tree of computers with a blue arrow indicates the clustered index scan.

An index scan isn't a great thing. We can do better.

Figure 133

That is better - look closely at the graphics. This time, we got an Index Seek instead of an Index Scan. That's an important distinction. A scan means that the system had to read through all the rows to get the data it was looking for. A seek means that the query processor found what it needed directly from the pointers in the index.

It's similar to having to look through the whole house to find your keys, versus knowing to look on the dresser. The reason we're doing better now is that I've added a WHERE clause. Getting just the data we need makes use of the index properly. As a matter of fact, without an index, you're more often than not going to receive a table scan, which is usually quite bad.

I say usually, because in some cases it's faster for SQL Server to read an entire table than it is to use an index. This is normally the case for any small table, say under a few hundred rows.
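The query behind that seek isn't reproduced in the text. A query of roughly this shape would fit the description, assuming the filter is on au_lname (which has a nonclustered index in the stock pubs database) and that 'Ringer' is only a sample value:

```sql
-- Assumed reconstruction: same query as before, narrowed with a WHERE clause.
-- The filter column and value are illustrative, not taken from the figure.
SELECT *
FROM authors
WHERE au_lname = 'Ringer'
GO
```

On a table this small the optimizer may still choose a scan, so the plan you see can differ from the figure.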

We've still got an issue, though, because now we have a Bookmark Lookup icon as well. As a matter of fact, it's half the cost of the entire query. A Bookmark Lookup means the system found the rows quickly in the index, but then had to go back to the table to fetch the columns the index doesn't contain. The reason this happened with my query is that I used a SELECT * statement (which you should never do in production). That asks for all columns, and some aren't covered by the index, so the engine had to get them from the table rather than the index. While SELECT * brings back every column without my having to figure out which ones I want, it's very inefficient.

I'll go back to the program to see what's really needed, and I find that I only really want the last name of the author.

Figure 134

That's better. In fact, it doesn't get any better than this. A 100% index seek is exactly what you're after - if you can get it.
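As a sketch of what "only the last name" looks like in practice, assuming the filter is on au_lname and using 'Ringer' purely as a sample value:

```sql
-- Ask only for a column the index already holds, so the engine
-- has no reason to go back to the table for anything else.
SELECT au_lname
FROM authors
WHERE au_lname = 'Ringer'
GO
```

Because the only column requested is one the index on au_lname already stores, the index alone can answer the query. That is what removes the extra lookup step.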

You can use Books Online to find other query plan symbols and what they mean; look up the topic "Graphically Displaying the Execution Plan Using SQL Query Analyzer." Here are some issues to watch out for:

Index or Table Scans

If you're getting a scan, the system has to read the entire table or index to find the data. Look for an existing index that the query could use to become more efficient, or consider creating one.
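If no suitable index exists, creating one might look like the following. This is a sketch only: the index name ix_authors_city is hypothetical, and it assumes your query filters on the city column of authors.

```sql
-- Hypothetical nonclustered index to support filters such as WHERE city = '...'.
CREATE INDEX ix_authors_city
ON authors (city)
GO
```

Every index adds work to inserts and updates, so check the plan again afterward to confirm the query really uses it.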

Sort

A sort happens when you use an ORDER BY on the query. If you need the data in that order, fine. If it's not necessary, however, consider leaving it out.

Bookmark Lookup

As I mentioned earlier, these are often caused by using a SELECT * statement. There is almost never a reason to do this in production systems.

Filter

This one is a bit trickier. You'll often see a filter when you use a function in the query, and sometimes a function is the best way to get the data. Again, see if you can restructure the query to use an index, or create one if possible.
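One common restructuring is to take the function off the column so the engine can use an index on it. Here's a sketch against the authors table, assuming an index on au_lname; the exact operators in your plan may differ:

```sql
-- Wrapping the column in a function forces the engine to evaluate
-- the function for every row it reads.
SELECT au_lname
FROM authors
WHERE LEFT(au_lname, 1) = 'R'
GO

-- The same question asked as a prefix search on the bare column,
-- which lets the engine seek on an index that starts with au_lname.
SELECT au_lname
FROM authors
WHERE au_lname LIKE 'R%'
GO
```

Run both with the execution plan turned on and compare the results before deciding which form to keep.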

A textual representation of this kind of data is available as well. Type:

SET SHOWPLAN_TEXT ON

The information is largely the same, but the graphical method is better. I'm normally biased towards command-line operations, but in this case, the graphical plan really does show you more information quickly.
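For reference, a complete session might look like the sketch below. The SET statement has to be in a batch by itself, which is why each step ends with GO. While the option is on, SQL Server returns the plan text instead of running the query. The SELECT here is only a sample.

```sql
-- Turn on the text plan; this must be the only statement in its batch.
SET SHOWPLAN_TEXT ON
GO

-- This query is not executed; only its plan is returned.
SELECT au_lname
FROM authors
WHERE au_lname = 'Ringer'
GO

-- Turn the option off again so later queries run normally.
SET SHOWPLAN_TEXT OFF
GO
```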

Finally, there's another method to see the query plans. You can use the SQL Server Profiler tool – just capture these events:

  • Performance: Execution Plan

  • Performance: Show Plan All

  • Performance: Show Plan Statistics

  • Performance: Show Plan Text

And then pick these data columns:

  • Start Time

  • Duration

  • Text data

You might want to limit the duration to the larger queries so that you don't get inundated with data.

Online Resources

The most awesome site for database and query optimization is http://www.sql-server-performance.com.

InformIT Tutorials and Sample Chapters

Using views with your tables? Check out the article by Andy Baron and Mary Chipman called Creating and Optimizing Views in SQL Server.