Migrating Departmental Data Stores to SQL Server
Last updated Feb 12, 2010.
All organizations store and process data. For-profit or non-profit, small or large, every one of them does.
To the Data Professional or Database Administrator, whenever we hear the word “data,” we think “database.” But in fact, most of the data in an organization isn’t in a database. By importance, or in some cases even in volume, data is all over the organization, in word-processing documents, spreadsheets, text files, pictures and of course e-mails and their attachments.
For the most part, Data Professionals don’t concern themselves with this data. We not only don’t control it, we’re not always sure where it is. But in some cases, we should.
Often the data stored all over the organization has a use beyond the person that created or saved it. For instance, many organizations use a lot of spreadsheets to track events, actions or things that are important to a particular department. These spreadsheets are stored on a user’s network share location, and the user maintains the data. At some point, another user wants access to that same data. The first user grants access to the other, either by copying the data or just placing the file on a share where others can get to it.
In another case, a larger database holds data for a program the entire organization uses. If it’s a “closed” system, the users might want to extract a report that only a single department needs. The users ask IT (or in some cases, they don’t) to extract a subset of data into a text file or spreadsheet. The department then uses that data to create reports not found in the larger system.
Perhaps the users are a bit more sophisticated than just spreadsheets. They use another data program such as Microsoft Access or perhaps even something open-source like MySQL with some sort of front-end.
Over time, lots of people start using that data, not just reporting on it but adding their own data to it. And that’s where the issue really starts.
Why Move the Data?
When data in a spreadsheet or a smaller database product like Microsoft Access is accessed by multiple users, it’s normally not a problem for the Data Professional. The data is created, controlled and accessed by a small group of people, and it neither affects nor is affected by anyone else.
But in some cases it does. I can’t tell you how many times I’ve “inherited” a data system or departmental application. This usually comes about from two vectors. The first event that brings the department’s application (and subsequently its data) to my attention is the loss of that application. Perhaps someone deletes that file, unintentionally or otherwise, or perhaps it gets corrupted in some way.
The second vector that brings the Data Professional into a departmental data store is when the data store in the application “hits a wall.” What I mean by that is either the design of the data or the application that stores and processes it just can’t handle the load or width of access. In some cases, they have the data, but they just can’t make it display or report like they wanted to, because the design evolved from something simple to something a bit more (or a lot more) complicated.
So you should consider moving that data store when two primary conditions are met:
- The application is “mission critical” if lost
- The data needs a formal level of security and access control
Notice that it doesn’t matter how many people need access to the data, which is often a requirement cited for movement to SQL Server. If the data is that important, and if it has security ramifications, then the Data Professional should follow proper protocol to protect the organization.
Consider that you may not have to migrate the data, only integrate it. In that case, you can query the data in data stores like text files, Excel and Access with the OPENQUERY function against a linked server, as well as other methods. If you follow this route, you fall outside of what I am discussing in this series of articles.
Following the Process
There are two sides to moving a departmental data store to SQL Server. The first is technical, and that’s what I’ll deal with in this series of tutorials. The second, possibly more difficult side of the process is political. The reason many departments created the application (and subsequently the data store) was because they didn’t want to wait on the IT department, or felt it wasn’t important enough for them to worry about.
And of course now you’re going to take that control away from them. Almost no one I know likes that, so you’ll need to work with management to impress on the group that you are there to help, and not to hurt. In fact, if you’re careful, you can actually help them keep their application, and explain that you’ll just handle keeping the data safe, protected and performing well. Once you’ve developed that trust, you can move forward.
Before you can start the process of bringing departmental data into SQL Server, however, you have to find it. That’s the first step.
Locate the Data
As I mentioned, many times the application data owners will come to you and tell you about the data issues they face. But you may want to take the lead and locate potential data sources first.
Locating data sources also gives you intelligence about how widespread these silos of data are. So how do you locate the data?
There are no foolproof methods of finding the data if the user really doesn’t want it found – but there are ways, both technically and socially, of locating the major applications, and their data.
The first way to locate department data is simply to ask. I have had far more success with this method than you might think.
The process I follow is to schedule meetings with the department heads to explain the logic I described above. I take no more than 15-20 minutes to tell them what the criteria are for identifying departmental applications, and when they should be considered for migration.
After I brief the executives, I ask them to brief their own people. I ask them to emphasize that it isn’t about wresting control; it’s all about protecting the data. In fact, I explain that I’ll work hard to ensure that the system will look like it does now, as much as possible. I ask for follow-on meetings with the department when they aren’t sure about whether something should be in spreadsheets, small databases or in SQL Server.
If the department trusts me, they bring me in earlier to talk with them about their applications, and in some cases I’ve even managed to hold a few “lunch and learns” where I explain the basics of data design. Over time this makes it easier for everyone. I keep a channel of communication open, so I can intercept new projects and host them properly to begin with. The users end up with more reliable and better performing data, and if I do ever have to bring the data in, it’s in a much better format for me to deal with.
If the users or managers don’t want to communicate, you may have to resort to a little detective work. The basic process is pretty simple: you interrogate file locations for certain patterns and check the software installed on the workstations for potential “targets.” You then monitor those targets to see who (or what) changes them.
There are, of course, security constraints to consider, so you will want to involve your system administrators in this endeavor. In fact, they may already have the information you need, so be sure and get with them first.
The first tool in your arsenal is the Microsoft Assessment and Planning Solution Accelerator, or much more simply, MAPS.
The MAPS tool is a free download from Microsoft that can work across your domain to locate all kinds of data. You can use it to find SQL Server Instances and capacity limits, and even to get consolidation advice, but where I think it is most useful in this context is in locating the various versions of Microsoft Office.
You can configure the tool to use a network range, a list of machine names and more, so you have a lot of control for the discovery. I have a pointer at the end of this article that will show you how to use this tool.
Something to keep in mind is that this tool only finds Microsoft Office products on the systems you interrogate. Just finding Office doesn’t indicate that the users have a data store you want to migrate; it just limits the targets.
Also keep in mind that the users might have installed something other than Microsoft Office. If they are using another program you’ll have to rely on whatever methods that vendor uses to discover their products.
If you don’t want to (or can’t) use MAPS to locate the Microsoft Office installations, you can use PowerShell to ask Windows what is installed on a computer. Once again, you’ll need rights to do that, and it is a more manual method of querying the system. This script uses Windows Management Instrumentation (WMI) to “ask” a system what software is installed:
gwmi win32_product | format-list -Property Name,Vendor,Version
With that list developed, you can now audit three kinds of files that are often used as data stores: spreadsheets (like Excel), database files (like Access or FileMaker Pro) and XML. The XML documents will get a lot of hits, so I tend not to focus on these unless the names stand out.
I normally only focus on shared locations. I especially suspect shares on a user’s workstation, since that’s often an indication that something is shared out for a department. It’s easy enough to find a share on Windows: for example, the built-in net share command lists them from a command prompt or from PowerShell.
I also look for shares that have full department access. Once I find the shares, I detail out the files using this command:
DIR *.XLS /S
And of course I change the extensions. I look for the “last modified date” column, and if it’s current within a day, I check it again later. Then I check it again in a week, and then each week for a month. If I find that file is being accessed a lot, I try to see who is doing that. More than one or two people? Time for a few questions. For a product like Access, FileMaker Pro or MySQL, I just assume that I need to have the discussion.
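The directory listing and the last-modified check can also be combined in one PowerShell pass. What follows is only a minimal sketch, not a finished tool: the function name Get-DataStoreCandidate, its parameters and the default extension list are hypothetical names chosen for this example, and it assumes PowerShell 3.0 or later (for the -File switch) and read access to the share.

```powershell
# Hypothetical helper: list likely data-store files under a path that were
# modified within the last N days, newest first.
function Get-DataStoreCandidate {
    param(
        [Parameter(Mandatory = $true)]
        [string] $Path,

        # Extensions to look for; adjust to the products you expect to find.
        [string[]] $Extension = @('.xls', '.xlsx', '.mdb', '.accdb'),

        # How recent "current" means, in days.
        [int] $ModifiedWithinDays = 1
    )

    $cutoff = (Get-Date).AddDays(-$ModifiedWithinDays)

    # Walk the tree, keep matching extensions that changed since the cutoff.
    Get-ChildItem -Path $Path -Recurse -File -ErrorAction SilentlyContinue |
        Where-Object { ($Extension -contains $_.Extension) -and ($_.LastWriteTime -ge $cutoff) } |
        Sort-Object -Property LastWriteTime -Descending |
        Select-Object -Property FullName, Length, LastWriteTime
}

# Example call (the share name is a placeholder):
# Get-DataStoreCandidate -Path '\\workstation01\DeptShare' -ModifiedWithinDays 7
```

Running the same call each week and comparing the LastWriteTime values gives a rough picture of which files are still changing. It does not show who changed them, so that part remains a conversation with the system administrators.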
From there I follow the same process as the meeting approach. I just explain what I’ve seen and ask if the data store meets the criteria for the migration, and then offer to help. Most of the time this approach works pretty well.
PowerShell has other uses as well, like interrogating the services that are running (to find things like MySQL) and locating other potential targets. The general approach is to find files, software and services that could run the engine for the data store, and then watch the files to see if they are being accessed frequently and by other folks.
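As one illustration of the service check, the sketch below filters a list of services by name pattern. The function name Select-DatabaseService and the default patterns are hypothetical, chosen only for this example, so adjust the patterns to the engines you expect to find. It takes its input from the pipeline, which on a Windows machine would normally be the output of Get-Service.

```powershell
# Hypothetical filter: pass through only services whose Name or DisplayName
# matches one of the wildcard patterns.
function Select-DatabaseService {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Service,

        # Illustrative patterns only; extend for other database engines.
        [string[]] $Pattern = @('mysql*', 'MSSQL*')
    )

    process {
        foreach ($p in $Pattern) {
            if (($Service.Name -like $p) -or ($Service.DisplayName -like $p)) {
                $Service   # emit the matching service once
                break      # stop checking further patterns for this service
            }
        }
    }
}

# Example call on a Windows machine:
# Get-Service | Select-DatabaseService | Format-Table Name, DisplayName, Status
```

A match only tells you that an engine is installed and registered as a service. Whether it hosts a departmental data store is still something to confirm with the people who use it.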
In the next few articles, I’ll explain what to do now that you’ve found the potential files for movement.
InformIT Articles and Sample Chapters
I have an entry elsewhere in this Reference Guide that details the use of the MAPS tool, Microsoft Assessment and Planning Solution Accelerator. It’s the first place you should start.
Books and eBooks
As I explain how to migrate the data from one source to another, you’ll most definitely need to know about SQL Server Integration Services, or SSIS. Microsoft SQL Server 2008 Integration Services Unleashed, by Kirk Haselden, can help.
In some cases, you may decide to leave the data where it is and simply link to it. In that case, you might want to research your programming options for data here.