Home > Articles > Data > SQL Server

SQL Server Reference Guide

Hosted by

Toggle Open Guide Table of ContentsGuide Contents

Close Table of ContentsGuide Contents

Close Table of Contents

Database Objects: Keys

Last updated Mar 28, 2003.

We've been covering quite a bit of ground on database objects, so far, and we'll continue that discussion in this tutorial. We've learned about databases, tables, and table relationships. Along the way, I've briefly mentioned the concepts of keys, but I waited until now to discuss them in depth.

There are two types of keys: Primary and foreign. Both of them can be created when you create the table or after the fact, as long as you don't violate any of the rules I'll show you as we go along.

Let's start at the beginning and talk about primary keys.

The concept of a primary key is pretty simple. It's a value or set of values that makes a particular row (or record) in a table unique. A primary key forces something called entity integrity in a table. That means that no duplicate entries are allowed in a database, something I'll cover again later when we talk about table normalization.

The primary key of a table has to be unique, and it also can't be null. Recall from my last tirade about the value null – that many people think "null" contains a blank value, or zero. In fact, the null value is a placeholder; it means that the value isn't known yet. As such, you can see that a null value may or may not make a record unique, since the value isn't known.

A primary key can be made up of several columns. Sometimes one column alone doesn't make a row unique. Perhaps an order contains the same item as another, the same amount, and is even sold on the same date. What makes a particular record in that table unique is a sequence number, along with the other information. This type of primary key is called a composite key, because it's made up of several columns.

When you select your primary key, you look at all the information in a record and decide what makes each one unique. What we're talking about here are key candidates – columns that are potentially useful as a primary key. Among the guidelines for picking the primary key: look for columns that contain a unique value and aren't null. They are called "highly selectable" and are primary key candidates.

Another important consideration regarding primary keys are that they should never change. We'll see why in a moment.

Using this information, let's return to the authors table from the pubs database in SQL Server. Here's a record from that table:

Authors

au_id

au_lname

au_fname

phone

address

city

state

zip

contract

172-32-1176

White

Johnson

408 496-7223

10932 Bigge Rd.

Menlo Park

CA

94025

1


Let's take these columns one at a time, and see which ones are candidates for the primary key.

The first column is au_id. You'll notice it looks suspiciously similar to a US Social Security number. At first blush, that seems a good choice, since Social Security numbers are supposed to be unique – right? Wrong! Social Security numbers have been known to be re-used, and even worse, not everyone has a US Social Security number! In addition, what if you hire a writer from the UK, or from Africa? They certainly won't have a US Social Security number. This column would fall under the category of bad choice number 1.

The next columns are the first and last names; we all know those aren't always unique, regardless of the country. These columns are bad choice numbers 2 and 3.

How about Phone, Address, City, State or Zip? Again, these are bad choices, since they might be duplicated or non-existent.

There's an interesting column there at the end that looks promising – it's the contract number. There should only be one contract per book, right? Perfect! But wait, it seems that the accounting department tells us that the contract is by work, not by person. Sometimes many people are signed to one effort.

By the way, Microsoft uses the au_id column, and gets away with it by saying that it's not a Social Security number, it's a generated number (right).

Sometimes there just aren't any columns that meet the requirements for a primary key. There are also some pretty good arguments that you shouldn't use a composite key, because the query optimizer engine has to work harder to track the keys. What you do in this situation is to make a new column, and use that for the key. This column can be auto-numbered (using the identity type) and is never shown to the users. This is sometimes called an artificial key, which has no meaning outside of identifying the record. The only thing to be careful about here is the design of the number, to allow for growth and uniqueness in the future.

Now that you've discovered how to select a primary key, how do you create one? That part is pretty simple. You can create the primary key when you create the table or even after, assuming that the data in the table doesn't violate the key. For instance, if you have an existing table, and it has a column you wish to use for the primary key, that column can't have null or duplicate values.

You can create the keys by using Enterprise Manager or with T-SQL code. To use Enterprise manager, open the database in question, navigate to the table object, then right-click the table name and select "Design Table" from the menu that appears. Once the design tool is open, just select the column(s) and click the key icon. Voila!

To create the key via code, just include the following syntax on the CREATE TABLE command:

CREATE TABLE test
(
columnname smallint
   IDENTITY(1,1)
   PRIMARY KEY 
)

If the table already exists, you can use the ALTER TABLE command with the following syntax:

ALTER TABLE tablename ADD newkeycolmun IDENTITY(1,1) 
  CONSTRAINT PRIMARY KEY

As you can see, primary keys are pretty simple to implement, and at first glance don't appear to be that big a deal – right? Au-contraire! Primary keys are vital to entity integrity, by guaranteeing that no one can enter duplicate data into the table. They also serve as the reference anchor to the foreign keys, as we'll see.

Foreign keys are just as important and as simple to implement as primary keys. Foreign keys are simply a column you add in one table that points to the primary key column in another. It contains exactly the same value in that columns' field as the "parent" table. Here's a simple example:

Table Clients:

ClientID

Fname

Lname

1

Buck

Woody

2

Eric

Keenan


Table Orders:

OrderID

ClientID

Amount

100

2

25.00

101

1

10.00

102

2

15.00

103

2

10.00


In the example shown above, the Clients table has a primary key of ClientID. The Orders table has a primary key of OrderID. I want to make sure that all orders have a client associated with them.

To do that, I created a ClientID column in the Orders table and made it a foreign key to the ClientID in the Clients table. By making the foreign key on the "child" table, SQL Server will prevent anyone from entering an order unless there is a corresponding ClientID in the "parent" table.

While primary keys enforce entity integrity as I explained earlier, foreign keys are used to enforce referential integrity. Referential integrity guarantees that references from one table to another are enforced by the database, and not left up to the developer or user.

Foreign keys can point in many directions. They often point in stair-stepping fashion (one table pointing to another), circular (two tables with primary and foreign keys pointing at each other), and even self-referencing (a foreign key in a table point to its own primary key). We'll see many of these types in our database design discussions.

You'll notice that, unlike the primary key, the foreign key has repeating values – and that's OK. If you think about it, you'll see that the foreign keys are the "many" side of those one-to-many relationships I mentioned in an earlier article.

One fact that's important to mention here is that you're not allowed to "orphan" entries that are defined as foreign keys. What that means is that in the example I mentioned earlier, you can't delete Buck or Eric's entries from the Clients table as long as there are foreign keys pointing to it in the Orders table. If those foreign key entries exist, SQL requires you to delete the child entries first. The same applies if you try to modify the ClientID values in the Clients.

Of course, primary keys aren't supposed to change, but it happens. If so, you'll need to alter the values of the children first and then change the values in the parent table.

To deal with all this, Microsoft SQL Server 2000 supports cascading referential integrity – and we'll see more of that as we go along.

So how do you create foreign keys? Once again, you can use the Enterprise Manager tool or T-SQL commands.

To create the foreign keys using Enterprise Manager, open the database in question, navigate to the table object, then right-click the table name. Select Design Table from the menu that appears. Once the design tool is open, click the Relationships icon (it looks like three tables with lines between them). Once the properties appear, click the New button.

You'll get a selection for the tables you want to relate, and from there you can select the columns to relate.

If you want to make the changes using T-SQL code, here's the way I did it for the example above:

ALTER TABLE Orders ADD CONSTRAINT
 FK_Orders_Clients FOREIGN KEY
 (
 ClientID
 ) REFERENCES Clients
 (
 ClientID
 ) ON UPDATE CASCADE
  ON DELETE CASCADE

Did you catch that first line? What we've been discussing in this entire article are constraints – and when you think of the meaning of that word, that makes sense. If you want to ensure that you only enter "good" data, you need to constrain your entries into a certain domain of data.

Primary and foreign keys are just the first set of constraints you'll have at your disposal to make your users behave. Well, during the data entry, anyway!

We've scratched the surface of keys – but don't worry, we'll see them again in our database design tutorial.