SQL Server Encryption Options Overview
Last updated Mar 28, 2003.
I'm fascinated by etymology, or word origins. I think that when you learn where a word comes from, you have a much deeper understanding of its meaning. For instance, the word "encryption" essentially means to scramble the bits of some computer storage so that others can't read it. But the word comes from the ancient Greek kryptos, meaning "hidden," which also gives us "crypt." A crypt, following the chain a little further, is a cavern or cave used to hide something. It's the same root we use when someone is not really telling us everything, or being cryptic. Interestingly, a crypt had religious connotations, because these special burial places were for some of the most important artifacts the Greeks had.
What has this got to do with SQL Server? In SQL Server, there are times when you want to be a little cryptic: you want to hide the meaning of what you are doing from others. In essence, encryption is the process of "scrambling" or mixing up important data such that it can't be read by others without a corresponding code that you specify. Microsoft has included several features in SQL Server to help you do that.
In this section of the guide I normally cover the concepts around specific features of either SQL Server in general or a specific release. You might think this topic belongs in the Security section of this site, which indeed it does. I am covering it here because it is what I call a "covering" topic: it belongs to more than one section of the guide, and you need to consider it in a holistic fashion. I'm also focusing on SQL Server 2005 and later in particular, since those versions include more encryption features and options than previous versions. In fact, security features like encryption are often one of the primary drivers for an upgrade decision.
I'll cover encryption in this article in more general terms, and then I'll show you how to implement it in practical examples in a few tutorials in the Administration and Programming sections of the guide. This is meant as an overview of your options, not a tutorial for implementing any of these specific features.
I should state from the outset that any kind of encryption on almost any system involves some overhead, so there is always a performance implication. That's normally acceptable, because if you're going to all the trouble to encrypt something it is understood that the protection is worth the slight penalty for the safety. Also, encryption isn't available only at the SQL Server level. There are other software solutions for encryption, and there are even hardware devices that can encrypt transmissions to and from the server.
To encrypt something, you need to apply some method of mixing up the data. In SQL Server 2005 and higher, you have multiple options to scramble the data including symmetric keys, asymmetric keys and certificates. In an interesting twist, you can create a key (which I'll describe in a moment) and then encrypt the key with a certificate.
It’s important to note here that I am talking about the data itself. There are other forms of encryption, such as encrypting the connection at the client, network and server (called data in flight) and in SQL Server 2008 and higher, Transparent Data Encryption, which encrypts the files that hold the data and the backups, but not the data itself. I’ll give you a quick overview of encryption in those areas, but I’ll focus mostly on the options for encrypting the data in the database, so that even with a SELECT statement you can’t read it without the keys or certificates.
Symmetric keys are simple to understand and implement. They are just like your password that you use on a computer. If you encrypt an object with a symmetric key, you can decrypt it if you know the “password”. Of course, this makes it the least secure of your encryption options.
You can use the CREATE SYMMETRIC KEY statement in Transact-SQL (T-SQL) to create and store a key in the database. From then on you can use that key to scramble data. To decrypt the data, the developer must be able to open that same key, for instance by knowing the key password. "Symmetric" means exactly that: the same key is used both to encrypt and to decrypt, over and over.
Let's look at a simplified example. Let's assume that you're going to tell me your United States Social Security Number. If your number is 111-11-1111, I would encrypt it with the number "2," making the number 222-22-2222. Anyone looking at the number wouldn't know your original Social Security code, so your data is protected. You would know that I multiplied by 2, so you would divide by two to get the real number. This is, of course, not a practical example, since a single number is far too easy to break, and you should never ever store a sensitive number like a Social Security code in a database. Ever.
The Symmetric Key function in SQL Server is quite a bit more secure than a simple multiplication, however, and is suitable for a wide range of data that you want to secure.
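As a minimal sketch of the CREATE SYMMETRIC KEY statement mentioned above, the following T-SQL creates a password-protected key and round-trips a single value. The key name DemoSymKey, the password and the sample value are placeholders made up for illustration, and AES_256 is just one of the algorithms you could pick. Treat it as something to try in a scratch database, not as a recommended design.

```sql
-- Placeholder key name and password; choose your own.
CREATE SYMMETRIC KEY DemoSymKey
    WITH ALGORITHM = AES_256
    ENCRYPTION BY PASSWORD = 'Pl@ceholder-Passw0rd-ChangeMe!';
GO

-- A key must be opened in the session before it can be used.
OPEN SYMMETRIC KEY DemoSymKey
    DECRYPTION BY PASSWORD = 'Pl@ceholder-Passw0rd-ChangeMe!';

DECLARE @plain nvarchar(100), @cipher varbinary(8000);
SET @plain = N'Sensitive value';

-- Encrypt with the open key; the result is binary, not text.
SET @cipher = EncryptByKey(Key_GUID('DemoSymKey'), @plain);

-- DecryptByKey works out which open key to use from the ciphertext itself.
SELECT @cipher AS Ciphertext,
       CONVERT(nvarchar(100), DecryptByKey(@cipher)) AS Decrypted;

CLOSE SYMMETRIC KEY DemoSymKey;
GO
```

Anyone who can open the key can read the data, which is why the way the key itself is protected matters so much.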
An asymmetric key is the next level of security. Asymmetric means that two things are uneven, or not the same. In this arrangement you actually have two keys: a public key that you tell everyone about, and a private key that you keep to yourself. Anyone can encrypt data by using your "public" key, but only you would be able to decrypt the data with your "private" key. Let's look at another simple example. Again, this isn't a real-world example; it's just here to help you understand the concept.
Let's assume that we take your 111-11-1111 number again and create two keys. This time the public key is "3," but the private key is "4." The public key would use a secret algorithm to actually store the number as 1332-132-13332. This means that just having the "3" won't help, since you don't have the other "half" of the key.
Public/private key arrangements are used quite frequently to protect very sensitive data and, depending on how they are implemented, can be quite secure.
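To see the public/private split in SQL Server terms, here is a hedged sketch using CREATE ASYMMETRIC KEY with the EncryptByAsymKey() and DecryptByAsymKey() functions. The key name DemoAsymKey and the password are placeholders of my own. Note that encrypting needs no secret at all, while decrypting requires the password that unlocks the private key.

```sql
-- Create an RSA key pair; the private key is protected by a placeholder password.
CREATE ASYMMETRIC KEY DemoAsymKey
    WITH ALGORITHM = RSA_2048
    ENCRYPTION BY PASSWORD = 'Pl@ceholder-Passw0rd-ChangeMe!';
GO

DECLARE @plain nvarchar(50), @cipher varbinary(8000);
SET @plain = N'Short secret';

-- Encryption uses only the public half of the key, so no password is supplied.
SET @cipher = EncryptByAsymKey(AsymKey_ID('DemoAsymKey'), @plain);

-- Decryption needs the private half, unlocked here with the password.
SELECT CONVERT(nvarchar(50),
               DecryptByAsymKey(AsymKey_ID('DemoAsymKey'),
                                @cipher,
                                N'Pl@ceholder-Passw0rd-ChangeMe!')) AS Decrypted;
GO
```

Asymmetric encryption is slower than symmetric encryption and can only handle small values, so in practice it is more often used to protect a symmetric key than to encrypt the data directly.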
The next level of security is using another mechanism (different than the system itself) to validate all the codes used by each party. In effect, this is like getting introduced to someone you don't know. They tell you that you can trust them to hold your wallet. In fact, if you don't know them, you probably won't do that. But if your banker comes along with the person and says, "You can trust John, he keeps my wallet for me as well" you might be more inclined to trust them, because someone you trust does.
That's exactly the way a certificate works. A server is set up to issue special codes and vouch for them. This server, which might be one of your own or at a different place entirely, signs a set of numbers called a certificate, which ties a public key to its owner. Using that certificate, you scramble the data or connection. The third party that issues the certificates and vouches for them is called a Certificate Authority, or CA.
SQL Server also gives you another way to hide data, called the HashBytes function, which has been available since SQL Server 2005. A “Hash” is a one-way operation: you can’t use it to get back to the actual data, so it isn’t really an encryption method. So why include it here?
Let’s assume that you have a password on your computer. You want to access a resource on another computer. You could send your password across the network, where the server would read the characters “1234,” see that the password for you is indeed “1234,” and let you in. The trouble is, if someone is watching the network traffic, they could see the password, so that isn’t very secure. Even if you encrypted the password and then sent it, the attacker could grab that and begin to hack at it. Over time, given enough talent and tools, they could get your password, and odds are you’re using it in other places.
So a better solution is to hash the password into a long number. In fact, with standard hashing algorithms, no matter how long the original string is, the hash is always the same length, so the attacker can’t even tell how many letters or numbers you sent in the first place. By changing the “1234” into a long number using some codes, you get an entirely new number that won’t reveal the initial values; you just get the result. On the other end, the server applies the same agreed-on function to the password it has on file and just compares the results. The actual password isn’t sent through the network.
So in effect the hash function is really a way of doing comparisons, not a way of encrypting data that you want to decrypt later. In fact you can’t, so make sure you understand how this feature works and when it should be used. I’ll put some links at the end of this article to help you understand more.
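The comparison idea can be sketched with HASHBYTES. The values below are placeholders, and the SHA2_256 algorithm name assumes SQL Server 2012 or later (older versions offer algorithms such as SHA1 instead). A real password store would also add a per-user salt, which this sketch leaves out to keep the idea visible.

```sql
-- Placeholder values; in practice the stored hash would live in a table.
DECLARE @stored varbinary(32), @attempt nvarchar(100);

-- What was saved when the password was first set.
SET @stored  = HASHBYTES('SHA2_256', N'correct horse battery staple');
-- What the user typed this time.
SET @attempt = N'correct horse battery staple';

-- Compare the hashes, never the plain text.
SELECT CASE WHEN HASHBYTES('SHA2_256', @attempt) = @stored
            THEN 'match' ELSE 'no match' END AS Result;

-- The hash length is fixed regardless of input length: both columns return 32 (bytes).
SELECT DATALENGTH(HASHBYTES('SHA2_256', N'a'))                  AS ShortInput,
       DATALENGTH(HASHBYTES('SHA2_256', REPLICATE(N'a', 1000))) AS LongInput;
```

There is no function that turns @stored back into the original string, which is exactly the property described above.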
With those explanations complete, there are three general areas that SQL Server 2005 allows you to secure with encryption: connections to SQL Server, data, and file backups. In SQL Server 2008 and higher, you can also encrypt the actual files used by the system, called “data at rest”.
Encrypting Connections to SQL Server
The first line of defense using encryption is the connection to SQL Server. There are actually several ways to connect to SQL Server, not just through the client tools Microsoft provides. The protocol layer they all go through is called the SQL Server Network Interface (SNI), which replaced the Network Libraries used in SQL Server 2000. SQL Server also has the concept of database endpoints, which are connection points, such as TCP/IP ports, that you can create for specific services. These endpoints and other connection methods can be secured with a certificate, using the Secure Sockets Layer (SSL) over TCP/IP. At the end of this article I'll point you to a reference that shows the steps to set this up, and in another tutorial I'll also provide instructions for this kind of connection.
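Setting up an encrypted connection is mostly configuration work, but you can check the result from T-SQL. As a small sketch, this query reads the sys.dm_exec_connections dynamic management view for your own session; it assumes you have the VIEW SERVER STATE permission.

```sql
-- encrypt_option is 'TRUE' when this connection is encrypted, 'FALSE' otherwise.
SELECT session_id,
       net_transport,
       auth_scheme,
       encrypt_option
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;
```

Dropping the WHERE clause shows every current connection, which is a quick way to spot clients that are still connecting in the clear.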
Encrypting SQL Server Data
Using the OPEN SYMMETRIC KEY statement, the ENCRYPTION BY CERTIFICATE clause, and functions such as EncryptByKey(), DecryptByKey(), EncryptByAsymKey() and DecryptByAsymKey(), you can encrypt the contents of a column of data. (There is no OPEN statement for asymmetric keys; you pass the key to the function directly.) In another tutorial I'll show you how to use each one of these step-by-step. You also get the HashBytes function, and combined with those other statements these are your primary options.
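To make the pieces concrete, here is a hedged sketch of one common arrangement: a symmetric key protected by a certificate, used to encrypt a single column. Every object name (DemoCert, DemoColumnKey, dbo.DemoCustomer), the password and the sample card number are placeholders invented for this illustration, and the certificate is self-signed by SQL Server and not issued by an outside CA.

```sql
-- The database master key protects the certificate's private key.
-- Skip this statement if the database already has a master key.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Pl@ceholder-Passw0rd-ChangeMe!';
CREATE CERTIFICATE DemoCert WITH SUBJECT = 'Protects DemoColumnKey';
CREATE SYMMETRIC KEY DemoColumnKey
    WITH ALGORITHM = AES_256
    ENCRYPTION BY CERTIFICATE DemoCert;
GO

CREATE TABLE dbo.DemoCustomer (
    CustomerId int IDENTITY PRIMARY KEY,
    CardNumber varbinary(256) NOT NULL   -- holds ciphertext, not readable text
);
GO

OPEN SYMMETRIC KEY DemoColumnKey DECRYPTION BY CERTIFICATE DemoCert;

INSERT INTO dbo.DemoCustomer (CardNumber)
VALUES (EncryptByKey(Key_GUID('DemoColumnKey'), N'4111-1111-1111-1111'));

-- With the key open the value comes back; without it, DecryptByKey returns NULL.
SELECT CustomerId,
       CardNumber AS Stored,
       CONVERT(nvarchar(30), DecryptByKey(CardNumber)) AS Decrypted
FROM dbo.DemoCustomer;

CLOSE SYMMETRIC KEY DemoColumnKey;
GO
```

A plain SELECT against the table shows only the binary ciphertext, which is the point of encrypting the data itself.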
Encrypting Backup Files
You've learned the background behind securing the connections and even the data for a database. But one backup file left at the wrong location and the bad guys have instant access to your entire system. Microsoft provides a mechanism in all of its current operating systems, the Encrypting File System (EFS), to store files in an encrypted area. Using this feature you can encrypt the entire backup set, simply by placing your backup files there.
There are third-party methods for extending encryption for SQL Server database backups, but make sure you understand what these are, so that you are certain they meet your needs.
Transparent Data Encryption
In SQL Server 2008 and later, Microsoft added a feature called “Transparent Data Encryption,” or TDE. This feature is something you “turn on” for a single database, and it automatically encrypts the following objects:
- The database .MDF files
- The database transaction log .LDF files
- The database backups
- The tempdb system database
This feature is transparent to the database and the applications that use it. In other words, you don’t have to do anything else to the database or change the application in any way to use it.
It’s important to understand that this feature does not encrypt the data in a table: if the application or user can read the data, TDE doesn’t change that one bit. It also does not change the encryption over the network. If the data is sent in a clear network channel, it is still in the clear even with TDE turned on. TDE is designed only to protect the files, not the data within them. If someone were to steal the hard drives or obtain your backups or the tempdb files, they couldn’t read them without the certificate and key that protect the database.
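"Turning on" TDE comes down to a handful of statements. The sketch below assumes a database called DemoDb and a certificate called DemoTdeCert, both placeholder names, and an edition of SQL Server that supports TDE (edition requirements vary by version).

```sql
USE master;
GO
-- Skip this statement if master already has a database master key.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Pl@ceholder-Passw0rd-ChangeMe!';
CREATE CERTIFICATE DemoTdeCert WITH SUBJECT = 'TDE certificate for DemoDb';
GO

USE DemoDb;
GO
-- The database encryption key is protected by the server certificate.
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE DemoTdeCert;
GO

ALTER DATABASE DemoDb SET ENCRYPTION ON;
GO

-- encryption_state 2 = encryption in progress, 3 = encrypted.
SELECT DB_NAME(database_id) AS DatabaseName,
       encryption_state
FROM sys.dm_database_encryption_keys;
```

Once any database on the instance is encrypted, tempdb appears in that last query as well, which matches the list above.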
One final, VERY important note on all of these features. You need to fully understand how these features work before you implement them, paying special attention to the keys, passwords and/or certificates you use to implement them. In some cases, no one can help you if you “lock yourself out” of your database or even server.
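One practical way to avoid locking yourself out is to back up the certificates and keys as soon as you create them. This is only a sketch: it assumes a certificate named DemoTdeCert already exists in master, and the file paths and passwords are placeholders. Store the resulting files somewhere other than alongside the database backups they protect.

```sql
USE master;
GO

-- Export the certificate and its private key, protected by a password.
BACKUP CERTIFICATE DemoTdeCert
    TO FILE = 'C:\SecureBackups\DemoTdeCert.cer'
    WITH PRIVATE KEY (
        FILE = 'C:\SecureBackups\DemoTdeCert.pvk',
        ENCRYPTION BY PASSWORD = 'An0ther-Pl@ceholder-Passw0rd!'
    );

-- Export the database master key of the current database as well.
BACKUP MASTER KEY
    TO FILE = 'C:\SecureBackups\master_dmk.key'
    ENCRYPTION BY PASSWORD = 'An0ther-Pl@ceholder-Passw0rd!';
GO
```

Without these files and their passwords, an encrypted database or backup cannot be restored on another server.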
As I mentioned, I'll cover practical examples of all of these features throughout the site. You should consider the benefits of encrypting your connections, data and backups for your systems carefully to select the right balance of security and performance.
InformIT Articles and Sample Chapters
We have a fantastic article on Encryption from our old Security Reference Guide, A Beginner's Guide to Encryption.