Understanding the Optimizer and Associated Tools

To see how query variants affect performance, you need two things: information about indexes and a way to see the choices the optimizer made. Most systems provide tools for both, but the tools vary widely from one system to the next.

Getting Information on Indexes

You'll always be able to find out what indexes are connected to a particular table, through the meta-data system catalogs if nothing else (see "Getting Meta-Data from System Catalogs" in Chapter 7). For example, in Adaptive Server Anywhere, a quick survey of the system catalogs reveals two likely suspects: sysindex and systable. A first try at a query to locate indexes for a particular table might look like this:

Adaptive Server Anywhere
select sysindex.index_name, systable.table_name
from sysindex, systable
where sysindex.table_id = systable.table_id
  and table_name = 'product'

index_name            table_name
===================== =================================
prodix                product
pricex                product

[2 rows]

(See "Writing Queries Using System Catalogs" in Chapter 7 for more examples of generating this kind of code.)
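If you don't have Adaptive Server Anywhere at hand, the same catalog-lookup idea can be tried in SQLite through Python's standard sqlite3 module. This is only a sketch: SQLite keeps its catalog in a single table called sqlite_master rather than in sysindex and systable, and the column types given to product here are assumptions, not the book's definitions.

```python
import sqlite3

# In-memory stand-in for the book's product table.
# Column types are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table product (prodnum int, type text, price real, description text);
    create unique index prodix on product (prodnum);
    create index pricex on product (price);
""")


def indexes_for(conn, table):
    """Return (index_name, table_name) pairs from SQLite's catalog."""
    rows = conn.execute(
        "select name, tbl_name from sqlite_master "
        "where type = 'index' and tbl_name = ? "
        "order by name",
        (table,),
    )
    return rows.fetchall()


index_rows = indexes_for(conn, "product")
for index_name, table_name in index_rows:
    print(index_name, table_name)
```

The table name is passed as a bound parameter, so the same helper can be reused for any table in the database.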

In Adaptive Server Enterprise and MS SQL Server, an easy way to investigate indexes is the sp_helpindex stored procedure (the output from the two RDBMSs is not identical). It tells you the names of the indexes and the columns in each. ("Nonclustered" is a type of Transact-SQL index with pointers from the bottom level of the index to the data; the index and the data are not in the same order. This contrasts with a clustered index, in which the index and data are in the same order.)

Adaptive Server Enterprise
exec sp_helpindex product

index_name index_description                          index_keys  ix_max_rows_per_page
---------- ------------------------------------------ ----------- --------------------
prodix     nonclustered, unique located on default     prodnum           0
pricex     nonclustered located on default             price             0

(2 rows affected)
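SQLite has no sp_helpindex, but its index_list and index_info pragmas supply roughly the same facts: the index name, whether it is unique, and its key columns in order. The helper below is a hypothetical sketch built on those pragmas (describe_indexes is not part of any library), using the same assumed product definition as a stand-in for the book's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table product (prodnum int, type text, price real, description text);
    create unique index prodix on product (prodnum);
    create index pricex on product (price);
""")


def describe_indexes(conn, table):
    """Map each index name to (is_unique, [key columns in index order])."""
    summary = {}
    # PRAGMA arguments cannot be bound as parameters, so the names are
    # interpolated into the statement; pass trusted identifiers only.
    for row in conn.execute(f"pragma index_list('{table}')").fetchall():
        name, is_unique = row[1], bool(row[2])
        key_rows = conn.execute(f"pragma index_info('{name}')").fetchall()
        summary[name] = (is_unique, [key[2] for key in key_rows])
    return summary


index_summary = describe_indexes(conn, "product")
for name, (is_unique, keys) in sorted(index_summary.items()):
    print(name, "unique" if is_unique else "nonunique", ", ".join(keys))
```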

In many systems, including the Adaptive Server Anywhere on this book's CD, the index information source of choice is a graphical user interface (GUI) tool. Since you may not be much of an ASA user but would like to follow along, here are basic instructions on using ASA Sybase Central for index information (see the help files for details). If you're using another system, the tools and commands will look different but will probably supply very similar information—the name of the index, the columns that make it up, the order in which the columns appear, and the index type.

Open Sybase Central and click Connect under the Tools button (Figure 6–2). Choose Adaptive Server Anywhere. Fill in the forms for the Login and Database tabs for msdpn (Figure 6–3), logging in as DBA with a password of SQL (both must be uppercase) and substituting your msdpn6.db address for C:/SQLBook on the Database file line.

Figure 6–2. Connecting on Sybase Central

Figure 6–3. Logging In

After you click OK, you'll see your server (here msdpn, with the terminal icon) listed in the Sybase Central window (Figure 6–4). Click it and then click the database-owner combination you want to use—msdp(DBA)—with the disk icon. You'll see a list of objects (Figure 6–5). Click Tables. From the detailed list of tables and views, click the table name you want (say, product) and then Indexes (Figure 6–6). Pick the index you want. Use the tabs on the window to get details.

Figure 6–4. Choosing a Database

Figure 6–5. Listing Objects

Figure 6–6. Getting Index Details

There is information on the msdpn indexes in "Table Details" in Chapter 1.

Checking the Optimizer

To check your work, you need to know a little about your optimizer, the part of the SQL engine that decides how to process a query. A cost-based optimizer looks at processing options and chooses the one that is "cheapest" in terms of time. If your system is cost-based, you may need to run commands that make sure the statistics on data are current (see Table 6–2). In a rule-based system, the optimizer makes choices based on a set of ranked guidelines.
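As one illustration of keeping statistics current, SQLite's cost-based planner reads statistics that the ANALYZE command writes into a table named sqlite_stat1. The sketch below assumes SQLite rather than any of the systems covered in this chapter, and the 50 sample rows are made up purely so that ANALYZE has something to count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table product (prodnum int, type text, price real, description text);
    create unique index prodix on product (prodnum);
    create index pricex on product (price);
""")

# Made-up sample rows; the values carry no meaning beyond filling the table.
conn.executemany(
    "insert into product values (?, ?, ?, ?)",
    [(1100 + i, "sample", 10.0 + i, "sample row") for i in range(50)],
)

# Refresh the statistics the planner uses when it estimates costs.
conn.execute("analyze")

stats = conn.execute(
    "select idx, stat from sqlite_stat1 where tbl = 'product' order by idx"
).fetchall()
for index_name, stat in stats:
    # The first number in stat is the row count seen for that index.
    print(index_name, stat)
```

Other systems use different commands for the same job, so check your own documentation for the equivalent.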

Most systems support a command or a tool that shows you what the optimizer is doing. In Adaptive Server Anywhere, the software included on the disk, there is one of each.

  • The command is the PLAN function, which takes a query (in quotes) as its argument. For a list of related commands, see Table 6–1.

  • The tool is the Performance Monitor on Sybase Central (an option under Statistics).

If you're following along with the Adaptive Server Anywhere software included on the CD, you may need to send PLAN results to an output file in order to read more than the first line. To do this, end the query with a semicolon and follow it with an OUTPUT TO line naming a file in which to store the query results, and a FORMAT line prescribing the format of columns in the output file. The following examples illustrate this method.

Adaptive Server Anywhere
select plan 
( 'select prodnum, type, price, description from product 
             where prodnum in (1104, 1105, 1106, 1107)' );
output to out.txt
format text

The out.txt file (it could have any name) is located in the directory that holds the ASA database file. When you open the out.txt file, you'll see information on how the query was processed.

Adaptive Server Anywhere
Estimate 1 I/O operations (best of 2 plans considered)
Scan product sequentially
 Estimate getting here 21 times
 For _value_1 in (1104,1105,1106,1107)

If you change the IN phrase to a BETWEEN, the output is different.

Adaptive Server Anywhere
select plan 
( 'select prodnum, type, price, description from product
             where prodnum between 1104 and 1107' );
output to out.txt
format text

Estimate 5 I/O operations
Scan product using unique index prodix
for rows where prodnum is between 1104 and 1107
 Estimate getting here 4 times

Without knowing much about the PLAN messages, you can see that the IN query doesn't use an index; it does a table scan. The BETWEEN query uses the prodix index. The optimizer estimates that the first query reaches its scan step 21 times, while the second gets there just four times.

You can get a shorter version of that information by looking at the ASA Interactive SQL Statistics window (Figures 6–7 and 6–8). Notice that the PLAN> line differs for the two queries. The first indicates a sequential scan of the table. The second shows index use. This parallels the PLAN results and is easier to read and generate. Figure 6–8 shows just the statistics part of the screen produced by the BETWEEN version of the query. Adaptive Server Anywhere provides more detailed performance information in the Sybase Central Performance Monitor.

Figure 6–7. Interactive SQL Window

Figure 6–8. Statistics Pane of the Interactive SQL Window

For a summary of commands that keep tabs on the optimizer, see Table 6–1. You'll need to do some research before you can get much information from the output of any of these commands. You may also find additional tools in use at your site, whether GUI-based, third-party, or homegrown.

Table 6–1. Monitoring Performance

System                      Command
Adaptive Server Anywhere    PLAN ( 'query' )
MS SQL Server               [entry not recoverable]

SQL Conventions

Before diving into performance in SQL queries, consider how your code looks. Code is easier to read and understand if you present it consistently. In some systems, reusability of cached code may depend on the various copies being identical. Differences as small as a single space character may be relevant. In addition, training time for new employees is shorter if they can expect consistent patterns. For your sanity, develop coding guidelines. Here are some common suggestions.

  • Start each line with a SQL keyword (SELECT, FROM, WHERE).

       select prodnum, type, price
       from product
       where prodnum between 104 and 107
  • Indent continued lines.

       select prodnum, type, price
       from product
       where prodnum between 104 and 107
         and price > 50.00
  • Be consistent in naming tables and columns—don't make some table names singular and others plural, don't use case randomly, and don't call one column pubdate and a related column in another table pub_date.

  • If you use table aliases, stick to the same ones, and don't use nonmnemonic aliases such as a, b, and c for supplier, product, customer.

  • Put in lots of comments: the date, your name, what the query or script is about—everything you'd want to know.
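The point made at the start of this section, that cached code may be reused only when the copies are identical, can be sketched with a toy cache keyed on the exact statement text. StatementCache and normalize are hypothetical names invented for this illustration; real systems decide cache matches in their own ways, so treat this as a picture of the idea, not of any product's behavior.

```python
import re


class StatementCache:
    """Toy cache keyed on the exact text of a statement (illustration only)."""

    def __init__(self):
        self._plans = {}
        self.hits = 0
        self.misses = 0

    def prepare(self, sql):
        """Return the cached 'plan' for sql, creating one on a miss."""
        if sql in self._plans:
            self.hits += 1
        else:
            self.misses += 1
            self._plans[sql] = f"plan {len(self._plans) + 1}"
        return self._plans[sql]


def normalize(sql):
    """Collapse runs of whitespace. Naive: it would also alter string literals."""
    return re.sub(r"\s+", " ", sql).strip()


# Two copies of one query that differ by a single extra space.
first = "select prodnum, type, price from product"
second = "select prodnum,  type, price from product"

exact = StatementCache()
exact.prepare(first)
exact.prepare(second)       # different text, so this is a second miss

consistent = StatementCache()
consistent.prepare(normalize(first))
consistent.prepare(normalize(second))   # same text after cleanup, so a hit

print("exact text:     ", exact.hits, "hits,", exact.misses, "misses")
print("consistent text:", consistent.hits, "hits,", consistent.misses, "misses")
```

Writing queries to one consistent pattern in the first place gets you the same benefit without any cleanup step.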
