SQL Server Reference Guide

NULLs

Last updated Mar 28, 2003.

I've mentioned using NULL values in a SQL Server database in various places on InformIT, but I've found that describing them and how to use them requires a separate tutorial. NULL values are an interesting construct, because they aren't really values at all.

To talk about what NULL values are and how to deal with them, I'll take you back to the first design for a relational database. In the late 1960s a gentleman named E. F. Codd invented a different way to store data. Up until that time, databases were largely made up of independent files with duplicated elements of data. Mr. Codd felt that a more flexible design stored each discrete data element only once, with references to the independent rows of data. These references, or keys, made up relationships that were not static, but could be joined together in multiple ways. And so the Relational Database Management System (RDBMS) was born.

As the design was refined, there came a separation of ideas about what should theoretically be allowed and what was practically necessary. On the theoretical side, each element would exist only one time, and every data point was known at the time of the element's inception. In other words, if you created a storage element for a list of names, all parts of the name (first, middle, and last in English-speaking countries) would have a value included or not exist at all. In practical use, however, it is not always possible to have each and every value determined, perhaps not ever. In our example, we may want to have a place for the middle names where they exist, but also allow for people who have no middle name.

So a new construct was asked for, and an interesting solution was created, one that didn't have universal agreement. Database vendors created an attribute of a column which allowed for no value at all, and called this construct a NULL. The definition of a NULL value, then, is simply a placeholder, or an undetermined value. And that is what has caused a great deal of confusion ever since.

The confusion comes because a lot of developers feel that a NULL value is either a 0, a blank, or nothing. If you follow this logic, treating a NULL as a number causes issues with calculations. Treating a NULL as blank causes issues with sorting data. And treating a NULL as nothing is logically problematic, because "nothing" implies that there will not be a value. But the real problem comes with joins and comparisons. The reason is that unlike a 0, a blank or nothing, a NULL does not equal a NULL. Let me explain.

Suppose you have some coins in your pocket, but you don't know what they are. I might ask, "Do you have a dime?" You would probably answer, "I don't know. Let me check." But if I then told you, "No, don't check. Just tell me if you have a dime or not!" you honestly couldn't answer. You know that your pocket holds a "value," but you don't know what it is yet. Suppose I then continued, "OK, I'll tell you what. I'll give you as many dimes as you have in your pocket." You still can't perform this "join," even if you think you may have some dimes. Because the amount is unknown, you can never answer the question. It's the same with a database. A strict design doesn't allow you to compare unknown values.
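
The pocket full of coins can be sketched in T-SQL. This is only a minimal illustration, and it assumes the session is using the default ANSI_NULLS ON behavior; the column aliases are names I made up for the example.

```sql
-- Sketch only: assumes the default ANSI_NULLS ON session setting.
-- Arithmetic with an unknown value yields an unknown value.
SELECT 1 + NULL AS SumWithNull
GO
-- Comparing NULL to NULL evaluates to unknown rather than true,
-- so the CASE expression falls through to its ELSE branch.
SELECT CASE WHEN NULL = NULL THEN 'equal'
            ELSE 'not known to be equal'
       END AS ComparisonResult
GO
```

The first query should return NULL, and the second should return the ELSE text, because "unknown" is not the same thing as "true."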

Let's take a look at how you create NULL values, and the functions SQL Server provides to work with them.

Creating a NULL-Enabled Column and Inserting NULL Values

To create a placeholder for unknown values, you enable NULLs on a column. That means you can insert a NULL explicitly, or supply no value for the column and have a NULL take its place. You can allow NULLs on a column graphically when you're creating a table by checking the box marked "Allow Nulls" in Management Studio (SQL Server 2005) or Enterprise Manager (SQL Server 2000).

To allow or disallow NULLs in code, you append either NULL or NOT NULL to the end of the column definition. The following is an example of creating a database called Test with the default locations and settings, and then a table called ClientNames.

CREATE DATABASE Test
GO
USE Test
GO
CREATE TABLE ClientNames(
	FirstName varchar(50) NOT NULL
	, MiddleName varchar(50) NULL
	, LastName varchar(50) NOT NULL
)
GO

This table has three columns. The first and last names are declared NOT NULL, so we will always have to enter them. The middle name, however, is allowed to be empty, or unknown. We don't want to omit the name if we have it, and we don't want to put something like 'N/A' in the column if it doesn't exist. In this case, NULLs are the way to go.
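
To see what NOT NULL buys you, here is a small sketch that tries to leave the first name unknown. It assumes the ClientNames table described above already exists in the Test database.

```sql
-- Sketch: assumes the ClientNames table from this article exists.
-- FirstName is declared NOT NULL, so SQL Server should reject this row
-- with an error instead of storing an unknown first name.
INSERT INTO ClientNames (FirstName, MiddleName, LastName)
VALUES (NULL, NULL, 'Woody')
GO
```

Because the statement fails, no row is added and the table is left unchanged.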

With this table structure in place, we can enter data in two ways. The first way is to specify a NULL in the insert statement, as in this example:

INSERT INTO ClientNames
VALUES ( 'Buck'
		, NULL
		, 'Woody')
GO

Selecting all data from that new table shows the NULL value:

SELECT * 
FROM ClientNames
GO
-----------------------------------------
FirstName          MiddleName	        LastName
-----------------	-----------------	-----------------
Buck		        NULL		        Woody
(1 row(s) affected)

When a column allows NULLs and has no default defined, you can also insert a NULL by leaving that column out of the INSERT statement entirely, as in this example:

INSERT INTO ClientNames 
(FirstName
,LastName)
VALUES 
('John'
,'Glandon')
GO

Selecting all data from the table shows both NULL values:

SELECT * 
FROM ClientNames
GO
-----------------------------------------
FirstName          MiddleName	        LastName
-----------------	-----------------	-----------------
Buck		        NULL		        Woody
John		        NULL		        Glandon
(2 row(s) affected)

There are a couple of restrictions with NULL values. First, a column that allows NULLs can't be part of a Primary Key. This makes sense, because you can't uniquely identify a row (which is the purpose of a PK) with an unknown value. You can have NULLs in a unique constraint, but SQL Server treats them as duplicates of each other there, so a single-column unique constraint can hold only one NULL. You also cannot use NULL values in an IDENTITY column, since it automatically applies a value for you.
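
The unique constraint restriction is easy to try for yourself. The following is a sketch under stated assumptions: ClientCodes and ClientCode are hypothetical names invented for this illustration, not part of the ClientNames example.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE ClientCodes(
	ClientCode varchar(10) NULL UNIQUE
)
GO
-- The first NULL is accepted.
INSERT INTO ClientCodes VALUES (NULL)
GO
-- The second NULL counts as a duplicate of the first, so this insert
-- should fail with a unique constraint violation.
INSERT INTO ClientCodes VALUES (NULL)
GO
-- Clean up the illustration table.
DROP TABLE ClientCodes
GO
```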

Searching and Comparing NULL Values

I mentioned earlier that the primary difficulty in allowing NULL values in your database comes when you search on them or compare them. Assume that we now want to search for all rows in the table where there is no middle name. Let's try a standard query:

SELECT * 
FROM ClientNames
WHERE MiddleName = NULL
GO
-----------------------------------------
FirstName          MiddleName	        LastName
-----------------	-----------------	-----------------
(0 row(s) affected)

That doesn't seem right, since we know that we have two rows with a NULL middle name. This is where the NULL design presents a problem. Just like the coins in the pocket example, we aren't allowed to compare NULLs with the equals operator: the comparison MiddleName = NULL evaluates to unknown rather than true, and a WHERE clause returns only the rows for which its condition is true.

But there is a way to test the columns for NULLs. The first tools you have available are the IS NULL and IS NOT NULL predicates. Let's try those:

SELECT * 
FROM ClientNames
WHERE MiddleName IS NULL
GO
-----------------------------------------
FirstName 	        MiddleName	        LastName
-----------------	-----------------	-----------------
Buck		        NULL		        Woody
John		        NULL		        Glandon
(2 row(s) affected)

That's better. And this query should return no rows, since we don't have middle names in our table yet:

SELECT * 
FROM ClientNames
WHERE MiddleName IS NOT NULL
GO
-----------------------------------------
FirstName          MiddleName	        LastName
-----------------	-----------------	-----------------
(0 row(s) affected)

You can also test for NULL values, and even replace them where you find them, with the T-SQL function ISNULL. It takes two arguments: the expression being tested, and the value to return in its place when that expression is NULL:

SELECT FirstName
, ISNULL(MiddleName, 'None') AS MiddleName
, LastName
FROM ClientNames
GO
-----------------------------------------
FirstName          MiddleName	        LastName
-----------------	-----------------	-----------------
Buck		        None		        Woody
John		        None		        Glandon
(2 row(s) affected)
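
ISNULL is also handy when you build a value out of several columns. The sketch below assembles a full name; it assumes the default CONCAT_NULL_YIELDS_NULL ON setting, under which adding a string to a NULL produces a NULL, and FullName is simply an alias I chose for the example.

```sql
-- Sketch: assumes the ClientNames table and rows from this article,
-- and the default CONCAT_NULL_YIELDS_NULL ON setting.
-- MiddleName + ' ' is NULL when there is no middle name, so ISNULL
-- swaps in an empty string and no stray extra space is left behind.
SELECT FirstName + ' ' + ISNULL(MiddleName + ' ', '') + LastName AS FullName
FROM ClientNames
GO
```

For the two rows in our table, each result should be just the first name, a single space, and the last name.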

There are other settings that affect the behavior of NULLs in a database. You can use the SET ANSI_NULLS statement in your T-SQL batch. Setting this value ON during a query makes the comparisons of NULLs behave just as I've shown you. You can, however, set this value OFF, and then a comparison such as MiddleName = NULL matches the rows where the column is NULL. I recommend against this behavior, however, since it only tends to confuse the issue.
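
If you want to watch the setting change the result, here is a minimal sketch. It assumes the ClientNames rows inserted earlier, and it switches the setting back on afterward. Microsoft's documentation marks the OFF setting as deprecated in later versions of SQL Server, which is one more reason to treat this as a demonstration rather than a technique.

```sql
-- Sketch: assumes the ClientNames rows inserted earlier in this article.
SET ANSI_NULLS OFF
GO
-- With the setting OFF, = NULL matches rows whose MiddleName is NULL,
-- so this should now return the rows the earlier query missed.
SELECT *
FROM ClientNames
WHERE MiddleName = NULL
GO
-- Restore the standard behavior for the rest of the session.
SET ANSI_NULLS ON
GO
```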

We'll see NULL values again, both in code and in database settings.

Informit Articles and Sample Chapters

If you're interested in reading a little more about those NULL restrictions in constraints, you can check out this sample chapter from Sams Teach Yourself SQL in 24 Hours in our free Reference Library.

Online Resources

Want to see how passionate people are about NULLs? Check out this blog entry.