Home > Articles > Data > SQL Server

SQL Server Reference Guide

Hosted by

Toggle Open Guide Table of ContentsGuide Contents

Close Table of ContentsGuide Contents

Close Table of Contents

SQL Server Security: Encrypting Data

Last updated Mar 28, 2003.

High security requirements exist in more places than you might think. The headlines are rife with examples of careless government and commercial employees that have left laptops unprotected, ruining the credit and privacy of hundreds of thousands of people that trusted them with their private data. Even if you don’t think you store personal or private data, you should understand the options you have for protecting this kind of data, since at some point you’ll want to do so.

Private or sensitive data should absolutely never be removed from a protected environment. When companies (and the government) decide to do this, they put us all at risk. There are, however, certain exceptions. The police need instant access to data to find out if the bad guys in the building have a past record. The fire department needs to know where the dangerous chemicals are, and medical personnel need access to your records to make sure they don't cause a dangerous drug interaction. For data that will be "at rest", or located on a device, encryption might be the right thing to do.

Security for your data involves not just the techniques I’ll describe in this tutorial, but a comprehensive security plan. I’ve got a series on that starting here, so make sure you pay attention to physical security, settings, Service Packs and the other areas that make up a secure system. Again, I’m focusing on protecting the data itself, should someone gain access to it.

There are two basic ways to protect data within the database: hashing, and encryption. Hashing data involves taking the data and putting it through a formula, giving a result. For instance, assume you have my social security number (SSN) in a database. You could multiply each number in my SSN by, say, 2:

My SSN: 123456789

New value: 24681012141618

Now take that and divide it by, say 26: 949269697754.53846153846153846154

So to “hash” my SSN, we did this: f(x) = ((x of each N)*2)/26

This is the number you store on both “sides” of the transaction you want to make, so the other system never stores the original number, but verifies that it’s the same thing. It’s a one-way operation, meaning that there is no way to get back to my SSN, since you’re sending neither the original value or even the formula to the other system. This is how passwords work in many operating systems.

Of course, since this is a one-way transaction, it wouldn’t be very useful to use in a single database, since at some point you do actually want to get the original value back. For that, you need encryption.

Encryption (literally, “to bury”) is the process of applying a formula on data that you can reverse in some way. It takes the original value in, stores another value and “knows” the formula to unlock it again. SQL Server 2005 (and higher) has functions to help you do this, and that’s the focus of this tutorial.

In this tutorial I'll explain how encryption is used in SQL Server version 2005 and higher. If you have high demands for security, you should immediately switch from SQL Server 2000 to the later versions anyway, since there are several security improvements in the later versions — and the small fact that SQL Server 2000 is now out of primary support. If you have to use SQL Server 2000, you'll need to use Application Programming Interfaces (API's) to call an encryption function, or you can buy a package that will encrypt data for you. Keep in mind, with that version out of support, you won’t have the security updates needed to keep it safe.

The encryption I'll describe in this tutorial deals with columns of data, not the entire database. It's rare that you need to encrypt the entire database, but normally only one or more columns of particularly sensitive data. This brings up an important point. Before you start encrypting data, you should think about a proper design. If an application is designed and implemented properly, you may not need to encrypt data at all. Let's take a look at a concrete example.

Assume that your organization approaches you and tells you that the requirements for a new application they are developing involve storing a client's name along with their Social Security Number. They also want the field agents to have copy of a subset of this data on their laptops.

Your first step should be to ask if it is absolutely necessary to have this information on a remote database. You should explain the financial and personal impacts if this data is compromised, and you should point out recent high-profile cases as backup. If the name and number are required for identity verification (a common practice, although illegal), then you should ask if only a portion of the number can be used. You can then create a practice of removing all but a few of the numbers that would be stored remotely, lessening your risk. Or perhaps you could set up a hash for the data for those remote systems.

But let's assume that your firm (foolishly) requires the number to be stored in these laptop databases. The next thing you should do is work with the broader IT department and ensure that the laptops have an encrypted directory for the data, that they have strong passwords, and a "wipe when stolen" mechanism. You should also brief each laptop user about how dangerous it is to have this data, and that they will be held responsible for its loss. After all that, you'll need to develop the processes to encrypt and decrypt the data.

Encryption Background

Encryption is simply changing one form of text (plaintext) into another (ciphertext) by passing it through a formula (encryption algorithm), which includes a number or string that is known by one or more people (key). A simple form of encryption is letter substitution, where one letter stands for another. Of course this kind of encryption is quite easy to break, because you can study the distribution of letters and begin to make guesses about what letters are standing in for others, simply because most languages have certain letters that are used more often.

But a more robust form of encryption does the character substitution according to rather lengthy formulas that shift the substitution around quite frequently. While this is far more secure, you would have to have a separate formula for each use. To get around that, some of the parts of the formula involve a number or string that is changed for each person or organization. Now you can reuse the formula that a lot of people know, because each one will have their own "key" that is substituted in the formula.

There are two types of encryption keys. One is called symmetric, and the other is called asymmetric. A symmetric key is the same to encrypt the data (change it from plaintext to ciphertext) that you use to decrypt the data (change it back from ciphertext to plaintext). Creating a simple password on a file and then telling someone the password is an example of using a symmetric key. The same password encrypts the file and decrypts it.

An asymmetric key has two parts. One key is known to someone else (called the public key), and they can use it to encrypt data. The other half of the key (called the private key) is used along with the public key to decrypt the data. An asymmetric key is also called a public/private key pair.

You can use either kind of key to encrypt data by inserting the data through a function. I'll discuss the options and mechanisms in a moment.

SQL Server uses encryption by leveraging the Windows operating system. The operating system has an encryption mechanism called the Data Protection API (or DAPI) built right in. When you install SQL Server, it uses this DAPI to create the main key that SQL Server users, called the Service Master Key. This key is generated automatically, without any input from you. Your only job is to back up that key, using the command BACKUP SERVICE MASTER KEY — a VERY important step that you should take right now, if you’ve never done so.

For you to encrypt data, you'll also need for the database to have a key. I'll show you how to create that in a moment. That key is then used in all of the encryption routines.

Asymmetric Certificates and Keys

As I mentioned earlier, there are two types of keys, symmetric and asymmetric. There are two kinds of asymmetric mechanisms: certificates and keys. They are both used the same way, and are the same strength. You can create a new asymmetric key with this command:

CREATE ASYMMETRIC KEY MyNewAsymmetricKey 
  WITH ALGORITHM = RSA_512
  ENCRYPTION BY PASSWORD = ’NeedAReallyStrongPasswordHere!’; 
GO

I'll show you how you can use this key in a moment. For the full syntax of this command and all its options, check this reference.

Certificates are just files that have keys in them, and they work the same way. In fact we used them in my tutorial on encrypting connections. I talk about them more in that tutorial. To create a certificate, use the CREATE CERTIFICATE command. You can find out more about that in this reference.

You might wonder why I'm not spending a lot of time on those commands. The reason is that you may not want to use asymmetric methods for data encryption. Asymmetric encryption is inherently slower than symmetric encryption, and has some limitations on how much data can be encrypted. Not only that, asymmetric encryption is really more useful when you want someone to know how to encrypt data (using the public key) but not decrypt it (you do that with the private key). In my example, I want the remote laptop to be able to both encrypt and decrypt the data, so I'll focus on the symmetric keys in this tutorial. If you’d like to learn more about asymmetric keys in general, there’s a great “Video Mentor” download here.

Symmetric Keys

Interestingly enough, you can create a symmetric key (and this holds true for asymmetric keys as well) by using certificates, passwords, and even other keys. I'll keep it simple in this example and use a password, but you should give this some thought. You're going to have to give that key to someone (like the developer), so a certificate might be a good choice.

I'll use the CREATE SYMMETRIC KEY command, and you can find the full syntax for that here. I'll explain the parts we need as I go.

Make sure you're following along on a test system, and on test data. You can really hurt yourself if you lose your keys or forget your password. In that case you can't get your data back, or replace the key. You're just out of luck. So make sure you're using a test system for this exercise.

First, if you want to follow along, set up a database to work with — on your test system, of course. Open Management Studio, and connect to your test server. Then open a query in the master database, and type the following command:

CREATE DATABASE EncryptionTest;
GO
USE EncryptionTest;
GO

That creates a simple database, with all the defaults. We'll get rid of it at the end of this exercise. Next, you need a table to work with. For this simple exercise, we'll create only two columns: one to hold the name and the other to hold the identification number. Let me say again how bad of an idea this is, since we shouldn't store an ID remotely no matter what. But here is the syntax to create the table nonetheless:

CREATE TABLE SensitiveData
(FullName VARCHAR(255)
, IDNumber varbinary(128));
GO

You can see that I've made the ID a large variable binary number, even though it will be a human-readable string. That's because the encrypted data will be in binary format.

Now let's get started on the encryption. Before you do anything else, you need to get some basic maintenance out of the way. We need to back up the server Service Master Key — again, this is for your test server, not production. You should have an entirely different place to store that key!

Here are the commands for your test server:

BACKUP SERVICE MASTER KEY 
TO FILE = ’c:\temp\SMK.buf’ 
ENCRYPTION BY PASSWORD = ’UseAStrongPassword!1231’;
GO

This example sends the Service Master Key to a file in a “temp” directory, and secured that with a password, which is required. Make sure you pick something strong, and then store that backup file somewhere.

Now you need the database master key, since that isn't created automatically, and back it up right away as well:

USE EncryptionTest;
GO
CREATE MASTER KEY 
ENCRYPTION BY PASSWORD = ’UseAStrongPassword!1232’;
GO
BACKUP MASTER KEY 
TO FILE = ’c:\temp\DBMK.buf’ 
ENCRYPTION BY PASSWORD = ’UseAStrongPassword!1233’;
GO

Now you're ready to create a symmetric key. Here's the command script, which creates the key, sets it to use a password, using the DES level of strength (more on that here in this reference for the syntax) and a password. No, you can't back this one up:

CREATE SYMMETRIC KEY SymmetricKeyTest
WITH ALGORITHM = DES
ENCRYPTION BY PASSWORD = ’UseAStrongPassword!1234’;
GO

With the key in place, now you can use a function to insert data into your table, encrypting only the ID along the way. The process is to open the key, and then use it to encrypt the data. I'll show you how to insert the data first, and then I'll explain what you're doing here:

OPEN SYMMETRIC KEY SymmetricKeyTest 
DECRYPTION BY PASSWORD = ’UseAStrongPassword!1234’;
GO
INSERT INTO SensitiveData
VALUES 
(
’Buck Woody’
, EncryptByKey(Key_GUID(’SymmetricKeyTest’), ’1234567890’)
);
GO

Here’s the breakdown of what you just typed. The first thing you need to do is open the key so you can use it, with the same password. That's the symmetric part.

The next thing you need to do is insert the data. The first field is easy, since it's a simple insert for character data. The more interesting part is the EncryptByKey() function. It needs a key number, so you use yet another function to find that, called Key_GUID(). Then you give it the key name, and the value you want to encrypt. That's all there is to it.

Now assume the laptop with the database is stolen, and the criminal that finds it wants to take a look at the data. He (or she) opens the database, and issues the following query:

SELECT * 
FROM SensitiveData;
GO

What does the perpetrator get back? Not much:

Buck Woody	0x0036CB777A989E4FB0215F2F2828179201000000FB39971B4C90A681936573137522
              56467621B07BC5A12D716EEEE699E275E318

Which of course isn't terribly useful. But assume you haven't lost the laptop. You now need to see the data. You can use a corresponding function to read the encrypted data:

OPEN SYMMETRIC KEY SymmetricKeyTest 
DECRYPTION BY PASSWORD = ’UseAStrongPassword!1234’
SELECT FullName
, CAST(DecryptByKey(IDNumber) AS VARCHAR)
FROM SensitiveData;
GO

And there you have it. Now let's clean all this up:

USE master;
GO
DROP DATABASE EncryptionTest;
GO

Careful with that Data

All this being said, you shouldn't enter into a decision to encrypt data lightly. Unless you carefully create and implement your plan, you can lock yourself out of your data. The keys and certificates that you can use to encrypt your data require special maintenance and backup, in excess of the regular maintenance your system requires. The backup for a key is potentially more sensitive than the database backups you take. If you lose an unprotected database backup, only that data is compromised. If you lose the database key, any data encrypted with it is at risk. So be careful with this powerful tool.