Home > Articles > Data > SQL Server

SQL Server Reference Guide

Hosted by

Toggle Open Guide Table of ContentsGuide Contents

Close Table of ContentsGuide Contents

Close Table of Contents

Forming Queries Part 1: Design

Last updated Mar 28, 2003.

Programming against databases requires a firm understanding of the language used to create objects, such as tables and views, read data, update data and remove data and objects. You might see these commands referred to as CRUD (Create, Read, Update and Delete).

Many new developers rush into this phase of programming as their first step. Instead of jumping right into syntax, however, it's important to understand all of the processes involved in the development cycle. You can find out more about these processes in the series of articles that begin with Database Design: Requirements, Entities, and Attributes.

Once you fully understand those concepts, you can begin to work on the heart of the programming task: forming queries. Before we get to the syntax, however, there are a few formalities to deal with.

Dozens of Transact-SQL commands fall into two broad categories, or "languages:" Data Definition Language (DDL) and Data Manipulation Language (DML). DDL statements create, alter, and drop (remove) database objects. (Those statements are covered in other articles beginning with Database Objects: Databases.) The statements I'll focus on in the next few articles involve DML.

In this series of articles I'll cover the INSERT, UPDATE, DELETE and SELECT statements. Along the way, I'll map these to the "CRUD" matrix.

There are two methods to get run T-SQL statements against a SQL Server. The first method is to use "Dynamic SQL," which involves sending SQL statements directly to the database through a client program. If you're using Query Analyzer and typing one of these four commands, that's Dynamic SQL. Using Visual Basic or C#, you can create a connection object, build a T-SQL String, and send it on to the Database Server. This is often the most effective way to send the commands, especially if your higher-level programming code builds the query "on the fly." Since the database has no foreknowledge of what the demands are, you can't plan ahead very well to run them any other way.

The second method is to use server-side programming. This involves the use of stored procedures, user-defined functions, and even views in the statements sent to the server. The process in the higher level language remains the same, with the exception that the string you send is the name of the stored procedure with any variables it requires. The statements in the stored procedures or views run on the server, instead of coming from the client program.

Server-side programming has many advantages. For instance, the first time a stored procedure runs, SQL Server places it in a special location in memory called the procedure cache. The next time the stored procedure is called, SQL Server accesses the faster memory location rather than direct storage. Designing often-used queries into stored procedures provides a great performance boost.

Which method is best? There's an old saying: "When all you have is a hammer, everything looks like a nail." I've seen fairly heated arguments among developers on whether to use a server-side versus a dynamic-SQL approach in a particular situation. In practice, you'll often see a mixture of both, especially in larger, more complex programs.

With that in mind, you can determine the best method use by taking a holistic view of the program, using your previous experience, and testing. That's also the right order to begin the process.

Once you've created the requirements and the outline for the program, stop. Step back, and view the program from end to end. Evaluate the program flow diagrams, and see if you can determine any patterns. Follow this step before you write any code.

Patterns form the basis of reusable objects that you should create, which are prime candidates for stored procedures. If there is a common "engine" that evaluates a condition, determine the variables you'll need to pass to make the stored procedure as useful as possible in multiple situations.

Taking a holistic view also includes larger program elements, such as scope, error handling, security, optimization and even data archival.

The original requirements document should contain a good definition of the program's scope. One of the most troubling issues in the modern development world is a disconnect between what the user thinks the program will do and what the developer codes. It's here, more often than not, that the primary stress factors of the programming effort lie. Make sure you have a firm understanding of the program's bounds. If you're not sure, check.

Determine, early in the process, how you'll handle errors. Of course, each component will have unit-level error handling, where you deal with any program errors that arise, but you'll want to determine how to handle larger error elements. Questions to ask here are things such as "If the program has an error, would you rather process 'X' roll all the way back to the beginning, or are partial data inserts permissible?" Most often it's best to prevent the user from entering "bad" data (a pessimistic design), but if the data depends on unknown states of data, this might not be possible (an optimistic design). These decisions will determine the transactions you create and even the database's original design.

Security is of paramount concern, right from the beginning. Think of the design as a building, and detail all possible entries and exits. This section requires a thorough knowledge of SQL Server access. The task becomes more difficult if the system exports or imports data. If it does, be sure to extend your security boundaries to include the source and destination systems.

Most of the performance of a system lies not in the hardware or database size, but in the proper design and use of indexes and queries. The important thing to remember during the initial design is that it is exponentially more difficult to optimize the design later. I'll explain how to optimize the individual queries in Part 2 of this series, but program optimization encompasses proper design, index creation and use, and effective queries.

Data archival involves a strategy for dealing with the data as time passes. This means that you should determine the effective "life" for each business data element that your program will store. How long will the data be used? Is it governed by any legal or ethical requirements? How often should it be rolled up? Is this the "system of record," where the data is created?

Once you've completed these steps, you're ready to code. We'll continue this article next week, where I'll explain the basics of creating queries.

Online Resources

More online content for the SELECT statement is available here. By Judith S. Bowman, Marcy Darnovsky, Sandra L. Emerson do a great job in their article called "Practical SQL: Selecting Data from the Database".

InformIT Tutorials and Sample Chapters

Find a good "Beginning SQL" Book on Safari. One of the best is Sams Teach Yourself SQL in 24 Hours, Second Edition by Ronald R. Plew and Ryan K. Stephens.