Table of Contents
- Microsoft SQL Server Defined
- Microsoft SQL Server Features
- Microsoft SQL Server Administration
- Microsoft SQL Server Programming
- Performance Tuning
- Practical Applications
- Professional Development
- Application Architecture Assessments
- Business Intelligence
- Tips and Troubleshooting
- Additional Resources
SQL Server and Microsoft Excel Integration
Last updated Mar 28, 2003.
As data professionals, most of us concentrate on establishing a secure, reliable, well-performing back-end server for various applications. We seldom worry about those front-end applications, unless they cause issues with a cluster or server.
It's a different matter, however, when it comes to integration. No matter what your level of involvement with front-end applications, it's your duty to assist with the efforts to directly access data in a SQL Server database. No one is more familiar with the effects that will occur when data is imported, exported, or linked to one of your databases.
Enter Microsoft Office. It's the industry standard office tool, known by most business users, and it works on everything from Windows 95 and Windows XP all the way to Crossover Office on Linux. In short, Office is everywhere. And, in particular, Microsoft Excel is relied on by most every company that has it installed. In fact, some have called it “the world’s most frequently used database system.” And that may not be far from the truth.
Because of its easy availability, Excel is a natural choice for manipulating data. It has an amazing array of tools for applications, from statistics to engineering, and business functions besides. Excel is taught at most business schools and colleges as part of a standard business curriculum.
Another reason to integrate SQL Server data with Microsoft Excel is that the workbooks, which are composed of one or more spreadsheets, are portable. There are viewers and converters for almost all major operating systems, and other office programs can often read their format natively.
Microsoft Excel also has great presentation and graphing tools. I know of several shops that use Excel exclusively for reporting. Microsoft has even announced that the newer versions of Excel are the tool of choice for working with, displaying and reporting on even large data sets.
Add to this the ability to program right inside Microsoft Excel using Visual Basic for Applications (VBA), and you've got a very powerful argument for using this tool to access and process SQL Server data.
Now that you're convinced ☺ that Excel is a natural fit for data integration, how do you get that data in and out of your databases? There are several methods, each with their own application to consider. Before we begin, however, there are a couple of things to keep in mind.
First, because Excel is a flat-file based medium, you may have to change your way of thinking about the data stored in your relational databases. You might have to create a few more database objects, such as views and stored procedures, to format the data in a more de-normalized, pre-joined way for the spreadsheet user.
The same concepts hold true for imports and exports. When creating your design for the transactions, this difference between flat-file and relational structure will figure heavily into the design. You may have to create a process to break the flat design into a relational structure during the import or export.
The other issue to confront with Excel-SQL Server integration is data conversion. The most classic example is the way that SQL Server interprets hexadecimal values coming from Microsoft Excel. Dates can also be problematic; it's best to check the results after you import or export data between the two programs. The issue arises because SQL Server handles lots of data types, depending on the version and edition of SQL Server you’re using. Here are just a few:
- char, nchar, varchar, nvarchar
- bigint, int, smallint and tinyint
- Datetime and smalldatetime
- money and smallmoney
- decimal and numeric
- float and real
- text and ntext
- binary, varbinary and image
- bit, uniqueidentifier and timestamp
However, Excel groups data into only a few basic types, once again depending on the version you have. Some of those include Text, Values, Dates and Formulae. While both Excel and SQL Server do a pretty good job at guessing the type of data that's coming in, you can't trust the transfer between them blindly.
SQL Server and Excel Integration Methods
Even with a few limitations, Excel remains a wonderful integration tool. There are several methods of integration; divided primarily into categories I'll call "Data Copy" and "Linked Data." There are lots of ways to perform this process; I’ll focus on the more popular ones here.
The primary difference between these categories is the way they treat the data after the interaction. In a "Data Copy" method, the data is essentially copied from one place to another, and then begins to diverge from its source immediately after the transfer. In the "Linked Data" method, the data is stored in only one source, so changes are reflected immediately in both interfaces.
So, your first decision to make is which category to use, and then choose the method in that category that best suits the need. After that, test the process with sample data, and then work on measuring the throughput and writing error handlers.
You can use this category of methods when you just need to have a handoff of data stored in SQL Server or Excel. In other words, the criterion for the following methods is that you are finished with the data in one program, and need it in another.
For example, your users may need data which is stored in a SQL Server database to present in a management report. In another example, data stored in various spreadsheets is needed in SQL Server to complete a data set. I've also seen more complex situations, where the data originates in SQL Server, is transferred to spreadsheets for processing, and is re-imported back into the SQL data-stream for use by another application.
In all of these cases, there's a discreet handoff of the data ownership. The limitation is that these methods don't allow edits to the same base data, at least at the same time.
Now that you understand when to use Data Copy, I’ll drill into the specifics on how to make that happen.
Simple Copy and Paste
The first of these methods is good-old copy and paste. You can use this method to copy a small set of records quickly. Understand that this method is quite hands-on; it's definitely a "high-touch" process! Even so, it's often adequate for ad-hoc or one-time solutions.
Copying data from SQL Server to Excel using a manual process is quite simple:
- Open Query Analyzer (QA) or a SQL Server Management Studio (SSMS) query window. Set the output to "grid" with the icon in the icon bar at the top of the tool. This is the default most of the time anyway.
- Run whatever query you're interested in.
- If you're using Query Analyzer, click the small box on the top left part of the results pane to select the results of the query, and hit CTRL-C. You can do the same thing in SQL Server Management Studio, and in fact there you can select only a few cells or rows, or even hold down CTRL and click to select cells that are not contiguous.
- Open Excel and paste the data where you want it.
The reason this works is that when you copy data from the grid view in SQL Server, it is stored in the clipboard as a tab-separated value string. When you paste into Excel (or almost any Microsoft Office grid), it places the values there as tab-separated values.
The reverse direction (Excel to SQL Server) isn’t as simple. You can use Enterprise Manager’s “data view” or SQL Server Management Studio’s “Edit Data” option for a table right-click, but it may not work. You'll need to make sure that the table constraints allow the data you're trying to paste, and that you’re really sure about those data types. In most all circumstances, I recommend against the copy-and-paste method for importing data to SQL Server.
Using the Import/Export Wizard
While the copy-and-paste method is great for one-time simple transfers, it's not very sustainable. If you're after a more permanent, controlled method of data transfer, consider using SQL Server's Data Transformation Services (DTS) in version 2000 and lower, and SQL Server Integration Services (SSIS) in SQL Server 2005 and higher. These features have several powerful functions, from transferring the data to developing data maps, handling errors and notifications, and transforms. In addition, the process can be saved and used over and over, even triggered by another SQL event.
I have articles about using DTS and SSIS here on InformIT and there are several written by other folks as well. You can certainly begin to explore these powerful tools from their respective interfaces, but there’s actually a simpler way to bring Excel (or many other types of inputs for that matter) data in and out of SQL Server.
- Open Enterprise Manager (2000 and lower) or SQL Server Management Studio (2005 and higher)
- Right-click any database name and select "All Tasks" from the menu that appears.
- Select "Import and Export Data" from that submenu.
- You'll see a source selection dialog, in which you can select SQL Server or Microsoft Excel, depending on which way you want to transfer data
- Follow the rest of the wizard, which leads you through the process of field-mapping, transforms, and saving the selections (called a package) for later use
- Run the package at the last dialog.
(A reference at the end of this article goes into more depth on this process.)
Using Excel for Data Copy
To move data using Excel, you have a few choices. Under the Excel "Data" menu, you'll find "Import External Data" as well as a "Query Wizard" submenu, both of which use ODBC drivers to access data in SQL Server. In later versions of Excel, you have a lot more options – There’s even a “Get Data from SQL Server” option right in the Data option of the Ribbon.
Each option has a “refresh” capability, so you can copy the changes in data on the SQL Server side, but keep in mind this process is one-way only. Data changed in Excel doesn't automatically flow back to SQL Server.
This method is best used for small, ad-hoc queries into the database, and should normally be used through views on database tables, not through direct access to base tables.
While we're on that topic, it's important to remember that any access to a SQL Server database has locking and blocking implications. Allowing unplanned, ad-hoc access into an OLTP production system is almost always a bad idea. It's a better process to create special read-only accounts that can access a few views on a reporting system only. Using a reporting system allows proper index design and access control for unplanned queries.
Once you've considered the performance implications, here's the rundown on one of those Excel import processes – this one should work with earlier versions:
- Open Excel
- Select "Data" from the menu bar, then "Import External Data”
- Select "Import Data...”
- Select the "New Source..." button
- Select "Microsoft SQL Server" as a source, then "Next"
- Enter the server name, and then select the method of authentication fill that out as required
- Select the database you want, then the table or view, and then select "Next"
- Name the connection and then select "Finish"
- You're now returned to the "Data Sources" area
- Click "Open", and then select the area of the spreadsheet where the data should start
- Click "OK"
As I mentioned a moment ago, a couple of other methods in Excel can bring data in from SQL Server. While in step 2, you can also select the "Microsoft Query" option, which uses a wizard similar to the steps above. Microsoft Query has a more powerful interface, where you can make selections all the way down to the column level. You can also edit the query in an interface similar to the one you'll find in the table access area in Enterprise Manager or Microsoft Access. The query can be saved for later use.
Using VBA to Get At SQL Server Data
If you're after a less manual process, the last method is a bit more complex, but even more powerful. Microsoft Excel, like all newer versions of Microsoft Office products, has a complete programming interface in the guise of Visual Basic for Applications (VBA). If you've got any programming experience at all, you can write code against a database.
Again, all the previous warnings about locking apply. Additional warnings are warranted here, since with programming you can affect data in the database as well as reading from it.
There are a few places you can use VBA in Excel, such as custom functions and macros, but we'll stick with macros for this example. The basic process is that you create a macro, edit it, and then run it.
You edit the macro inside an editor, in which you type the code to connect, access, and close the connection to a database. Here's the process to create your own macro to connect to SQL Server programmatically:
- Open Excel
- Click on “Tools,” then “Macro,” and then “Macros...”
- Name the Macro, and then click Create
- In the editor window, type the following information, substituting the proper names for the server and the tables you want in between the "Sub xxxx" and "End Sub" tags:
' Declare the QueryTable object Dim qt As QueryTable ' Set up the SQL Statement sqlstring = "select au_fname, au_lname from authors" ' Set up the connection string, reference an ODBC connection ' There are several ways to do this ' Leave the name and password blank for NT authentication connstring = _ "ODBC;DSN=pubs;UID=;PWD=;Database=pubs" ' Now implement the connection, run the query, and add ' the results to the spreadsheet starting at row A1 With ActiveSheet.QueryTables.Add(Connection:=connstring, Destination:=Range("A1"), Sql:=sqlstring) .Refresh End WithSave and close the macro, and run it from the same menu you accessed in step 2.
I’ll now move away from the "Copy Data" category to the "Linked Data" category of methods. I'll describe the two main methods to link data, both of which link Excel data into a SQL Server query.
There are ways to use Excel to manage data directly in SQL Server, but they involve a bit more programming, and I've found them to be a bit clumsy especially for daily use. Excel simply doesn't handle locks and connectivity issues as well as you need for large-scale production solutions.
You could also code a solution that accesses data stored in an Excel spreadsheet and update a SQL Server table as a result, but this is really event-driven and not a linked solution. (Again, there's a great deal of programming help in that vein here on InformIT.)
So returning to the methods I've found easy to implement, there are two options you can use to query data in an Excel spreadsheet in SQL Server.
Temporary Access to Excel Data from SQL Server
The first method is quite simple, and uses the OPENROWSET function. Here’s the syntax for a quick query from an Excel spreadsheet with one default tab in it, called c:\temp\test.xls:
SELECT * FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0', 'Excel 8.0;Database=c:\temp\test.xls;HDR=YES', 'SELECT * FROM [Sheet1$]')
Permanent Access to Excel Data from SQL Server
If you think you might want to query the spreadsheet in multiple lines of code, you might want to create a re-usable connection to it in other words, treat it like a server. You can tell SQL Server that you want to treat another data source (like Excel, text files or Oracle databases) as a server using a “linked server.” It’s fairly simple to create one. Here's how to do that for a spreadsheet called “test.xls” in the c:\temp directory:
- Open Query Analyzer or a SQL Server Management Studio Query window
- Run the following code:
-- Here we set up the linked server using the JET provider EXEC sp_addlinkedserver N'ExcelLink', @srvproduct = N'', @provider = N'Microsoft.Jet.OLEDB.4.0', @datasrc = N'C:\temp\test.xls', @provstr = N'Excel 8.0;' GO
By doing this, you create a linked server, and give that connection a name. Notice also the name of the spreadsheet, which can also be on a network share. You only have to do this once for each spreadsheet; if you're not going to access that spreadsheet again, it's a good idea to drop the linked server after you've used it.
Now that you have a linked server, you can access the data. The process for this method is as follows:
In a query tool such as Query Analyzer or SSMS Query window, type the following:
-- Setup the user connection for the spreadsheet EXEC sp_addlinkedsrvlogin 'ExcelLink', 'false' GO -- Get the spreadsheet data – “Sheet1” is the tab name SELECT * FROM OPENQUERY(ExcelLink, 'select * from [Sheet1$]') GO
In this section, we've used the OPENQUERY function, which passes the query on to the provider. You'll find the query language is pretty limited at times, with Excel. If the queries aren't selective enough, set up another worksheet in the workbook with the data you want, and query that one.
As you can see, you have several options open to integrate data between Microsoft Excel and SQL Server. I have quite a few resources below that might be useful.
InformIT Articles and Sample Chapters
William E. Pearson, III has a good article on integrating Microsoft SQL Server 2000 OLAP and Microsoft Office, Integrating Microsoft SQL Server 2000 OLAP and Microsoft Office, Part 1: Creating an Excel PivotTable Report with an OLAP Cube. You can explore the analysis capabilities of the Excel PivotTable report to present data from an OLAP (OnLine Analytical Processing) cube. In this tutorial, learn how to build a PivotTable report from scratch, and explore the details of its cube-focused functionality.
Books and eBooks
Need something a little more current? Check out Roger Jenning’s book, Special Edition Using Microsoft Office Access 2007, that also covers Excel 2007 and SQL Server.
These links are from Microsoft. There are several others out there but these deal with some of the issues that I brought up in the article.
Here's an article that details the data type woes with DTS and Excel imports.
This article from Microsoft details the process of using DTS with Excel.
Microsoft has a larger discussion of ODBC to Excel datatype issues here.
This Microsoft article details programming with ADO against Excel files it shows you how to open one as a database.