What Is Integration Services and Why Do I Need It?
IN THIS CHAPTER
- What Is SQL Server Integration Services?
- A Short Integration Services History
- How This Book Is Organized
- The Sample Projects
What Is SQL Server Integration Services?
Depending on whom you ask, you might get different answers to that question: a data import/export wizard, an ETL tool, a control flow engine, an application platform, or a high-performance data transformation pipeline. All are correct because Integration Services is a set of utilities, applications, designers, components, and services wrapped up into one powerful software application suite. SQL Server Integration Services (SSIS) is many things to many people.
Data Import/Export Wizard
One of the most popular features of Integration Services is the Import/Export Wizard, which makes it easy to move data from a source location such as a flat file or database table to a flat file, table, or other destination. The Import/Export Wizard was the first utility developed back in the SQL Server 7.0 time frame and continues today as an important utility in the database administrator’s (DBA) toolbox.
ETL Tool
ETL is an acronym for Extract, Transform, and Load. It describes the processes that take place in data warehousing environments: extracting data from source transaction systems; transforming, cleaning, deduplicating, and conforming the data; and finally loading it into cubes or other analysis destinations. Although Data Transformation Services (DTS), Integration Services’ predecessor application, was considered a valuable tool for doing ETL, Integration Services is where true Enterprise ETL became available in SQL Server.
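To make the three ETL stages concrete, here is a minimal sketch in plain Python. It is not Integration Services code and does not use any SSIS API. The source data, the `dim_customer` table, and every function name are hypothetical, chosen only to illustrate extracting rows, cleaning and deduplicating them, and loading the result into a destination.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: a CSV export from a transaction system.
SOURCE_CSV = """customer_id,name,country
1, Alice ,us
2,Bob,US
2,Bob,US
3,Carla,de
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean whitespace, conform country codes, drop duplicates."""
    seen = set()
    result = []
    for row in rows:
        cleaned = (
            int(row["customer_id"]),
            row["name"].strip(),
            row["country"].strip().upper(),
        )
        if cleaned not in seen:  # deduplicate on the cleaned values
            seen.add(cleaned)
            result.append(cleaned)
    return result


def load(rows, conn):
    """Load: write the conformed rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dim_customer ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, text=SOURCE_CSV):
    """Run extract, transform, and load in order; return the loaded rows."""
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT customer_id, name, country FROM dim_customer "
        "ORDER BY customer_id"
    ).fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` loads three rows: the duplicate Bob row is dropped, and the names and country codes are conformed before they reach the destination.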
Control Flow Engine
The processes involved in moving data from location to location and transforming it along the way are not restricted to processing data alone. Integration Services provides a full-featured control flow for performing work that is tangentially related to the actual processing that happens in the data flow, including downloading and renaming files, dropping and creating tables, rebuilding indexes, performing backups, and any number of other tasks.
Application Platform
Developers can create applications that use Integration Services as a platform, embedding the Integration Services engines and components within their own applications using the provided object models.
High Performance Data Transformation Data Pipeline
That’s a mouthful and really incorporates two ideas: high performance and data pipelining. The Data Flow Task is a high-performance tool because you can use it to perform complex data transformations on very large datasets quickly. The pipeline concept means that you can process data from multiple heterogeneous data sources, through multiple parallel or sequential transformations, into multiple heterogeneous data destinations, making it possible to process data found in differing formats and on differing media in one common “sandbox” location.
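The pipeline idea can be sketched with Python generators. This is only an illustration of the concept, not the Data Flow Task itself: rows from two differently shaped sources are conformed to one row format, passed through a transformation, and routed to two destinations. All names, the tax rate, and the threshold are hypothetical, and the sketch runs on a single thread, so it shows the sequential case only.

```python
from itertools import chain


def from_csv_lines(lines):
    """Source 1: comma-separated text lines (flat-file style)."""
    for line in lines:
        name, amount = line.split(",")
        yield {"name": name, "amount": float(amount)}


def from_records(records):
    """Source 2: (name, amount) tuples (table style)."""
    for name, amount in records:
        yield {"name": name, "amount": float(amount)}


def add_tax(rows, rate):
    """Transformation: derive a new 'total' column for each row."""
    for row in rows:
        yield {**row, "total": round(row["amount"] * (1 + rate), 2)}


def route(rows, threshold, large, small):
    """Destinations: split rows between two targets by total."""
    for row in rows:
        (large if row["total"] >= threshold else small).append(row)


def run_pipeline():
    """Wire sources, transformation, and destinations together."""
    csv_source = from_csv_lines(["widget,10.00", "gadget,250.00"])
    table_source = from_records([("gizmo", 99.5)])
    merged = chain(csv_source, table_source)  # union of both sources
    large, small = [], []
    route(add_tax(merged, rate=0.10), threshold=100.0,
          large=large, small=small)
    return large, small
```

Because each stage is a generator, rows stream through one at a time rather than being materialized between steps, which is the essence of a pipeline. A real data flow engine adds buffering and parallel execution on top of this idea.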