Home > Articles > Databases > SQL Server

Toggle Open Article Table of ContentsArticle Contents

Close Table of ContentsArticle Contents

  1. Synonyms
  2. Including Non-Key Columns in Non-Clustered Indexes
  3. Summary
Close Table of Contents
  • PrintPrint
  • Share ThisShare This
  • DiscussDiscuss
First Look at SQL Server 2005 for Developers, A

Like this article? We recommend
First Look at SQL Server 2005 for Developers, A

Including Non-Key Columns in Non-Clustered Indexes

This is a great feature that is somehow not advertised well by Microsoft (surprise! surprise!!). In a non-clustered index, you can include other columns in addition to the index keys. Okay—that is a vague statement, so let's see what it means in practice.

NOTE

You will need the pubs database to try out these queries.

In the SQL Server Management Studio, open a new query window. Place the cursor in the window; using the Query menu, select the Include Actual Execution Plan option, as shown in the following figure.

Figure 01 Figure 1

Now run this query:

USE PUBS 
SELECT au_lname, au_fname FROM authors WHERE au_lname = 'White' 

Select the execution plan tab and note that the engine performs an Index Seek operation (see the following figure) because the authors table has an index on the au_lname, au_fname columns called aunmind. So it can query and return all the data using the index without ever having to go to the table. Such an index is called a covered index (all the data that needs to be returned is covered in the index columns).

Figure 02 Figure 2

Now let's try returning an additional column, for example, the phone column. Let's run this query:

USE PUBS 
SELECT au_lname, au_fname, phone FROM authors WHERE au_lname = 'White' 

Select the execution plan tab and note that the engine performs a clustered index scan (see the next figure). This is equivalent to a table scan.

Figure 03 Figure 3

By asking for the phone column in the output, we made the engine do extra work. It decided that a table scan would be more efficient in this case rather than using the non-clustered index aunmind. Note that if there are a large number of rows in the table, the engine may first locate row that match the au_lname in the non-clustered index and then use the row identifiers to get the data from the table. In this case the number of rows was small, so it decided to bypass the non-clustered index and directly did a clustered index scan to get the data. (The second example shows that the SQL engine took a different approach.) But in any case, the engine would have to do extra work to fetch the phone column.

If you frequently need the phone column in the output, you want to make the query more efficient. In previous versions of SQL Server, you would use an indexed view (a.k.a. materialized view or virtual table). You would create a shadow table with three columns: au_lname, au_fname, and phone. SQL Server would use the shadow table for the above query and would never have had to go to the authors table to get any data (because all required columns were available in the shadow table).

But indexed views are usually harder to plan, create, and maintain (or, in other words, they're easy to mess up). There are also a number of restrictions on creating an indexed view—the most restrictive is that the index must be unique. So in the previous scenario, if the three columns au_lname, au_fname, and phone are not unique, you're out of luck.

SQL Server 2005 offers you an amazing alternative: You can simply include the phone column as a non-key column in the non-clustered index on the au_lname, au_fname columns. As a result, the index now becomes a covered index, and the engine does not have to go to the table to get any data. This can significantly speed up your queries, especially for large tables and for queries that return many rows.

Let's see how to do this. In the Object Explorer window, locate the aunmind index, right-click, and select Properties. Then select the Included Columns property and add the phone column (see the following figure). Click OK to exit the dialog box.

Figure 04 Figure 4

Now run the query again:

USE PUBS 
SELECT au_lname, au_fname, phone FROM authors WHERE au_lname = 'White' 

Notice that the execution plan is back to the Index Seek operation! And wasn't it easy to do? (This makes it a good candidate for abuse—see the next paragraph.)

It is very easy to abuse this feature by adding one gazillion columns as included columns, which will make your insert and update operations very inefficient. So don't tell your rookie developer about this feature. SQL Server 2005 allows up to 1023 columns to be included as non-key columns. The CREATE INDEX and ALTER INDEX columns now have an INCLUDE clause to let you add these columns.

Also note that to achieve full coverage of the index, you need to add all columns in the query's SELECT, WHERE, and ORDER BY clauses.

Here's another example of this feature: I work a lot with a patient database, in which one of the tables that holds patient data looks like this:

patient_id

Unique per patient

fname

Patient first name

lname

Patient last name

...

Numerous other columns holding data on the patient: date of birth, gender, allergies, blood type, last hospital visit, and so on


The table has about one-half million rows and will grow about 5 percent every month. Because it is frequently used, it is important that the queries be efficient.

Most of the queries on this table are focused on getting the patient's name by using the patient_id. So I simply create a non-clustered index on the patient_id column and add the fname and lname as the included columns. This ensures that all such queries are covered in the index itself. But wait, this gets interesting...I also have a large percentage of queries that are interested in finding patients on their lname and/or fname combination and returning their patient_id. So I simply create another non-clustered index on the lname, fname columns and add the patient_id as the included column!

Here are the execution plans before and after the columns are included for the first index. Query:

SELECT lname, fname 
FROM patients 
WHERE patient_id = 71119613 

The following figure shows the execution plan without the included columns in the index having the patient_id as the key column (note that the Bookmark Lookup task is taking up a big chunk of work).

Figure 05 Figure 5

The following figure shows the execution plan after addition of the lname and fname as the included columns.

Figure 06 Figure 6

If the queries return a large number of rows and also have an ORDER BY clause, an indexed view might be a better choice because the indexed view will have the data physically sorted in the required order and can return the data more quickly.

  • Share ThisShare This
  • Your Account

Discussions

Make a New Comment

You must log in in order to post a comment.

Related Resources

There are currently no related podcasts. Please check back later.

Great password information at a small price
By John Traenkenschuh on June 13, 2009 No Comments

Where can cash-strapped security pro's get great information on security basics??

Steven HainesOracle Buys Sun of $7.4B
By Steven Haines on April 20, 2009 No Comments

In a stunning turn of events, Oracle steps in and buys Sun amist the breakdown of IBM's attempt to acquire Sun.

Buck WoodyIf it's Free it's for Me
By Buck Woody on January 26, 2009 No Comments

Sign me up for anything free these days. I just ran across a book that promises to help you build a web site for free...

See All Related Blogs

Informit Network