Home > Articles > Data > SQL Server

  • Print
  • + Share This

Types of Data Storage

If you always generated, used, and stored data in a standard format, you wouldn't need to do much data transformation. But as data is used for different purposes, it is stored in different formats. A data transformation tool has to be able to take data in one format and move it to another.

This section outlines the support provided by DTS for transforming data in some of the primary types of storage. More technical information about creating DTS connections for various types of data storage systems can be found in Chapter 5, "DTS Connections."

Text Files

Text files remain one of the primary ways for storing data. They are often used to transfer data from one system to another.

DTS provides access to text files through an OLE DB provider. You can import data from or export data to text files. You can also use the Bulk Insert task with text files, but only to transfer data from a text file into SQL Server.

Listing 3.1 shows a standard text file of clickstream data from a web server.

Listing 3.1 Clickstream Data in a Text File

cacheflow19.isu.net.sa - - [01/Jan/2000:02:04:52 -0600]
  "GET /whatis.htm HTTP/1.0" 200 13589
cacheflow19.isu.net.sa - - [01/Jan/2000:02:04:53 -0600]
  "GET /image4.gif HTTP/1.0" 200 6276
cacheflow19.isu.net.sa - - [01/Jan/2000:02:04:54 -0600]
  "GET /navibar.gif HTTP/1.0" 200 1031
cacheflow19.isu.net.sa - - [01/Jan/2000:02:04:54 -0600]
  "GET /image6.gif HTTP/1.0" 200 991
cacheflow19.isu.net.sa - - [01/Jan/2000:02:04:54 -0600]
  "GET /home.gif HTTP/1.0" 200 1272
cacheflow19.isu.net.sa - - [01/Jan/2000:02:04:54 -0600]
  "GET /image5.gif HTTP/1.0" 200 1552

XML

XML is an Internet-ready form of text file data storage. It provides a powerful format for sharing data between different applications.

SQL Server 2000 provides XML support in a number of ways:

  • You can export relational data to XML by using the FOR XML clause in a SELECT statement.

  • You can import data into a relational database by using the OpenXML rowset provider.

  • You can access SQL Server through a URL using XML.

  • You can use XML-Data schemas and Xpath queries.

  • You can set XML documents as command text and return result sets as a stream.

Listing 3.2 shows the output of the following SELECT query:

select top 8
  o.OrderID, o.CustomerID,
  od.ProductID, od.Quantity
from orders o
join [order details] od
  on o.orderid = od.orderid
order by o.OrderID

Listing 3.2 Result Set from Querying the Orders and Order Details Tables

OrderID   CustomerID ProductID  Quantity
----------- ---------- ----------- --------
10248    VINET   11     12
10248    VINET   42     10
10248    VINET   72     5
10249    TOMSP   14     9
10249    TOMSP   51     40
10250    HANAR   41     10
10250    HANAR   51     35
10250    HANAR   65     15

Listing 3.3 shows the same query with the FOR XML AUTO clause:

select top 8
  o.OrderID, o.CustomerID,
  od.ProductID, od.Quantity
from orders o
join [order details] od
  on o.orderid = od.orderid
order by o.OrderID
for xml auto

Listing 3.3 The Same Result Set Returned with FOR XML AUTO

XML_F52E2B61-18A1-11d1-B105-00805F49916B
<o OrderID="10248" CustomerID="VINET">
 <od ProductID="11" Quantity="12"/>
 <od ProductID="42" Quantity="10"/>
 <od ProductID="72" Quantity="5"/>
</o>
<o OrderID="10249" CustomerID="TOMSP">
 <od ProductID="14" Quantity="9"/>
 <od ProductID="51" Quantity="40"/>
</o>
<o OrderID="10250" CustomerID="HANAR">
 <od ProductID="41" Quantity="10"/>
 <od ProductID="51" Quantity="35"/>
 <od ProductID="65" Quantity="15"/>
</o>

The Parallel Data Pump task (described in Chapter 10) is a useful tool for handling XML data because it provides support for hierarchical recordsets. The Transform Data task (see Chapter 6) has transformations designed to load text into a database, which can be used for loading XML files. There are third-party companies that are offering OLE DB providers that read XML data so that it can be used as the source for a transformation task.

Spreadsheets

Many companies store some of their data, and especially their business analysis data, in spreadsheets. DTS uses the Jet OLE DB provider to connect to Microsoft Excel spreadsheets. DTS can import and export data to Excel.

Relational Database Management Systems

A significant portion of corporate data is stored in Relational Database Management Systems (RDBMSs). These systems provide powerful tools for querying and updating the data, using SQL.

DTS has an OLE DB provider for Microsoft SQL Server, Microsoft Access, and Oracle. The OLE DB provider for ODBC can be used for other database systems that have ODBC drivers available, but not OLE DB providers.

Normalized Database Schema

A normalized database schema is the most efficient format for storing OLTP data. In a normalized database, each piece of information is only stored once, so it can be updated in a single location. It is the more efficient schema for accessing individual records.

Figure 3.8 shows a typical normalized database schema.

Figure 3.8
A diagram of a database with a normalized schema.

Multidimensional (Star) Database Schema

A multidimensional, or star, schema is the most efficient format for storing data used for Business Analysis. In a star schema, the facts being analyzed are stored in a central fact table. This table is surrounded by dimension tables, which contain all the perspectives by which the facts are being analyzed.

Figure 3.9 shows a typical star database schema.

Chapter 4, "Using DTS to Move Data into a Data Mart," discusses the use of the star schema.

Figure 3.9
A diagram of a database with a star schema.

Multidimensional Database Management Systems (OLAP)

You can create a multidimensional database schema in a relational database system. There are also database systems that are specifically designed to hold multidimensional data. These systems are typically called OLAP servers. Microsoft Analysis Server is an example of an OLAP server.

The primary unit of data storage in a relational database system is a two-dimensional table. In an OLAP system, the primary unit of storage is a multidimensional cube. Each cell of a cube holds the data for the intersection of a particular value for each of the cube's dimensions.

The actual data storage for an OLAP system can be in a relational database system. Microsoft Analysis Services gives three data storage options:

  • MOLAP—Multidimensional OLAP. Data and calculated aggregations stored in a multidimensional format.

  • ROLAP—Relational OLAP. Data and calculated aggregations stored in a relational database.

  • HOLAP—Hybrid OLAP. Data stored in a relational database and calculated aggregations stored in multidimensional format.

  • + Share This
  • 🔖 Save To Your Account

Related Resources

There are currently no related titles. Please check back later.