Home > Articles > Programming > Windows Programming

  • Print
  • + Share This
This chapter is from the book

System.Data Architecture

You can think of the System.Data namespace as consisting of two primary parts. The first part is the managed providers that allow you to connect to a data source, issue commands against the data source, and read data directly from the data store or into a DataSet. The managed providers contain classes analogous to the Connection, Command, and Parameter objects in classic ADO as well as adding support for iterating through a result set in a forward-only manner. The second part is the DataSet and its various supporting classes that allow you to manipulate data in a disconnected fashion. As mentioned previously the DataSet is most like a disconnected Recordset, although much more powerful and flexible.

Managed Providers

To manipulate data in a data store you obviously first need to open a connection to it and pass it commands to execute. In ADO.NET this is accomplished by using a managed provider. ADO.NET ships with two managed providers, the OleDb managed provider and the SqlClient managed provider. The former is contained in the System.Data.OleDb namespace, which like classic ADO, allows you to access any data source for which an OLE DB provider is available. The latter is in System.Data.Sql and provides native access to SQL Server by writing directly to the tabular data stream protocol (TDS) used by SQL Server. In other words, the OleDb managed provider simply offers a data access model that sits on top of the existing OLE DB infrastructure and requires data source specific OLE DB providers or ODBC drivers. Conversely, the SqlClient managed provider takes the place of the existing OLE DB provider for SQL Server (SQLOLEDB) because it writes directly to SQL Server without the assistance of any other software. A diagram of these components and their relation to OLE DB and ADO can be seen in Figure 7.1.

Figure 7.1 ADO.NET Data Access Architecture. This diagram depicts the data access components used in a VB.NET application. Note that OLE DB and ODBC are still a part of this data access model.

Each of the managed providers includes a set of classes that implement interfaces and derive from classes found in System.Data. The common types of classes included in a managed provider are as follows:

  • Connection—A class that implements IDbConnection and includes members to connect to a data source, handle transactional behavior and connection pooling, and receive notifications when the state of the connection changes.

  • Command—A class that implements IDbCommand and includes members used to execute queries against a Connection. It can execute queries that do not return results and queries that return an IDataReader object for iterating through a result set. Also includes events for responding to schema changes in the underlying data source.

  • DataAdapter—A class derived from DbDataAdapter that implements IDbDataAdapter. It includes a set of Command objects and a Connection used to populate a DataSet and update the underlying data source. It also maps data source tables and columns to tables and columns in a DataSet. Note that only relational providers will inherit directly from DbDataAdapter.

  • Parameter—A class that implements IDataParameter used to pass parameters to Command objects. Parameters are tracked through a ParameterCollection object that implements the IDataParameter collection interface.

  • Transaction—A class that implements IDbTransaction used to represent a transaction in the data source.

  • DataReader—A class that implements the IDataReader interface used to read a forward-only stream of records from the data source. Note that this is analogous to a firehose cursor in SQL Server.

The managed providers can also include their own classes to represent properties, events, exceptions, command builders for use with a DataAdapter, and errors. The naming convention used is to prefix the name of the base class or interface with the namespace for the managed provider. For example, for the SqlClient managed provider the classes shown in Table 7.1 are available.

Table 7.1  Classes found in the System.Data.Sql namespace for the Sql Server managed provider.

Class

Description

SqlConnection

Implements IDbConnection and represents the connection to the SQL Server database. Includes its own syntax for the connection string.

SqlCommand

Implements IDbCommand and is used to execute Transact-SQL commands such as stored procedures.

SqlDataAdapter

Derived from DbDataAdaper and used to map data from SQL Server to a DataSet for select, insert, update, and delete.

SqlDataReader

Implements IDataReader as a means of returning a forward-only, read-only cursor on a result set.

SqlParameter

Represents a parameter to a SqlCommand object.

SqlParameterCollection

Represents a collection of SqlParameter objects.

SqlTransaction

Represents a transaction in SQL Server.

SqlError

Represents an error returned by the SqlDataAdapter.

SqlException

Thrown by the SqlDataAdapter when a SQL Server error occurs.


Obviously, by abstracting the basic functionality for managed providers into base classes and interfaces it is possible for database vendors and corporate developers to write their own managed providers. Although this is possible, it is probably not recommended for the majority of developers because the managed providers shipped with .NET handle all the connectivity required for most data sources. However, it might be warranted for companies that have large-scale systems and have developed a proprietary data access interface to their backend.

In one respect the availability of multiple managed providers represents a decision point for developers using SQL Server (or a backend for which a managed provider is available) as their backend database. Because the SqlClient managed provider handles TDS natively it performs better than incurring the extra translation layer when using the OleDb managed provider against SQL Server as noted in Figure 7.1. As a result, most SQL Server developers will likely want to use the SqlClient managed provider. The downside of this choice is that if you change your backend database you'll have to change and recompile your code to use the OleDb provider as a result. To avoid this, where it is likely that the backend database may change, you may want to create your own set of generic classes that abstract the instantiation of objects from the managed providers. Obviously these classes could run the gamut from statically invoking a particular managed provider such as SqlClient to dynamically invoking the correct one based on a property setting. The cost of creating and maintaining this extra layer may be justified if your target database changes.

That being said, for most projects a better approach is to be sure the data access code is abstracted from the user and business logic in custom classes so that if necessary the code will be fairly easy to change and yet not incur the overhead of creating and maintaining a separate set of generic data access classes. This is the approach used in this chapter.

Using a Managed Provider

The basics of using a managed provider are analogous to using the connection and command objects in classic ADO. However, a managed provider also includes a DataAdapter used to populate and control DataSet objects that have no equivalent in classic ADO.

To illustrate the use of the connection and command objects, the code snippet in Listing 7.1 shows a simple example of connecting to SQL Server using the SqlClient managed provider and executing a simple stored procedure to return the contents of the Courses table using the SqlCommand object. The resulting rows are then iterated upon using the SqlDataReader class.

Listing 7.1  Using a managed provider. This listing shows the simplest example of executing a stored procedure in SQL Server using the SqlClient managed provider and streaming through the resulting rows.

Imports System.Data
Imports System.Data.SqlClient
...

Dim cnSQL As SqlConnection
Dim cmSQL As SqlCommand
Dim drCourses As SqlDataReader

cnSQL = New 
SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment")
cmSQL = New SqlCommand("usp_ListCourses", cnSQL)
cmSQL.CommandType = CommandType.StoredProcedure

Try
 cnSQL.Open()
 drCourses = cmSQL.ExecuteReader()

 Do While drCourses.Read()
  Console.WriteLine(drCourses("Description").ToString())
 Loop
Catch e As Exception
 Console.WriteLine(e.ToString())
Finally
 cmSQL.Close()
 cnSQL.Close()
End Try

There are several items to note in Listing 7.1 including the fact that constructor strings are used with both the SqlConnection and SqlCommand objects to pass the connection string and the command text with a reference to the connection respectively. Furthermore, you'll notice that the syntax of the connection string passed to SqlConnection is similar to that used with the SQLOLEDB provider. In addition, the ExecuteReader method is called, which returns a SqlDataReader object used to simply iterate through the rows using the Read method. The SqlDataReader implements a forward-only cursor that allows access to the data through a simple default Item property. For optimum performance the DataReader provides only the basic constructs for reading data.

Note

While much of the code in this chapter uses the SqlClient managed provider, the code for using the OleDb managed provider is almost exactly analogous.

Using a DataReader is the fastest way to get data from SQL Server and involves no overhead on the client because data is not persisted. However, as with a "firehose" cursor in classic ADO, the database connection stays open until the Close method of the SqlDataReader or SqlConnection is called or until the object goes out of scope.

Although it may at first feel strange to access forward-only result sets in this manner, moving the functionality provided by the SqlDataReader into the managed provider and away from the DataSet (where you might expect to find it because it is analogous to the Recordset in most respects) is a good thing. This is because the implementation must clearly be managed by the data source and it allows the DataSet to be used strictly as a client-side data cache.

Tip

For more information on creating firehose cursors in classic ADO see Chapter 14 of Pure Visual Basic.

The interesting aspect of the SqlCommand is that it also implements other methods to execute commands against SQL Server as shown in Table 7.2.

Table 7.2  Execute methods of SqlCommand. These methods are used to retrieve other types of results from SQL Server.

Class

Description

ExecuteNonQuery

Executes the command and does not return a result. Useful for executing statements that do not return result sets.

ExecuteScalar

Executes the command and returns the first column and first row of the result set only. Discards all other data.

ExecuteResultset

Reserved for future use.

ExecuteStream

Executes the command and returns the results as an XML stream.


Several of the methods shown in Table 7.2 such as ExecuteNonQuery and ExecuteScalar also exist in the OleDb managed provider and will be discussed in more detail later in this chapter. However, as an example of the customization possible when using managed providers for a particular data source consider the code in Listing 7.2.

Listing 7.2  Returning a stream. This listing shows how a managed provider can implement specific features of the data source, in this case the ability to return XML as a stream.

Dim cnSQL As SqlConnection
Dim cmSQL As SqlCommand
Dim sCourses As Stream
Dim objSr As StreamReader
Dim strXML As String

cnSQL = New 
SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment")
cmSQL = New SqlCommand("usp_ListCourses", cnSQL)
cmSQL.CommandType = CommandType.StoredProcedure

Try
 cnSQL.Open()
 sCourses = cmSQL.ExecuteStream()
 objSr = New StreamReader(sCourses)
 strXML = objSr.ReadToEnd()
Catch e As Exception
 Console.WriteLine(e.ToString())
Finally
 cmSQL.Close()
 cnSQL.Close()
End Try

In this example the same stored procedure is executed as in Listing 7.1, however, rather than execute the procedure using the ExecuteReader method, the ExecuteStream method is invoked, which is unique to the SqlClient managed provider. In this case, instead of returning a result set that you can iterate on using a DataReader, an XML stream is returned and stored in a string variable using a StreamReader (discussed in Chapter 11). This is possible because SQL Server 2000 has the ability to generate XML at the server using the FOR XML clause. To generate the XML the usp_ListCourses stored procedure was modified as shown in this Transact-SQL snippet:.

CREATE PROCEDURE usp_ListCourses
AS
SELECT * FROM Course
ORDER BY CourseNum
FOR XML AUTO

Several other techniques for using command objects of the managed providers will be shown later in the chapter.

The other key aspect of using a managed provider is to use the DataAdapter to populate a DataSet and control how changes to the DataSet are sent to the underlying data store. To do this the DataAdapter references select, insert, update, and delete command objects that are invoked when the DataSet is manipulated. The code snippet shown in Listing 7.3 creates a new SqlDataAdapter and sets the SelectCommand and UpdateCommand properties.

Listing 7.3  Using SqlDataAdapter. This simple example creates a DataAdapter and uses it to specify a stored procedure for updating a DataSet.

Dim cnSQL As SqlConnection
Dim daSQL As SqlDataAdapter
Dim parmWork As SqlParameter
Dim dsCourses As New DataSet("Offerings")
Dim intRows as Integer

cnSQL = New 
SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment")

' Create the adapter and set the SelectCommand through the constructor
daSQL = New SqlDataAdapter("usp_ListCourses", cnSQL)
daSQL.SelectCommand.CommandType = CommandType.StoredProcedure

' Create the update command
daSQL.UpdateCommand = New SqlCommand("usp_UpdateCourse", cnSQL)
daSQL.UpdateCommand.CommandType = CommandType.StoredProcedure

' Set the parameters for the update command
parmWork = daSQL.UpdateCommand.Parameters.Add(New SqlParameter("@Days", 
SqlDbType.TinyInt))
parmWork.SourceColumn = "Days"
parmWork.SourceVersion = DataRowVersion.Current

parmWork = daSQL.UpdateCommand.Parameters.Add(New SqlParameter("@CourseID", 
SqlDbType.Int))
parmWork.SourceColumn = "CourseID"
parmWork.SourceVersion = DataRowVersion.Original

parmWork = daSQL.UpdateCommand.Parameters.Add(New SqlParameter("@Cost", 
SqlDbType.Money))
parmWork.SourceColumn = "CourseWareCost"
parmWork.SourceVersion = DataRowVersion.Current

' Populate the DataSet
intRows = daSQL.Fill(dsCourses, "Courses")

' Make a change to the underlying data
dsCourses.Tables("Courses").Rows(0).Item("Days") = 0

Try
 ' Save the changes
 daSQL.Update(dsCourses,"Courses")
Catch e As Exception
 ' Report the exception
End Try

Note that like any command object in classic ADO the SqlCommand object referenced in the UpdateCommand property accepts parameters stored in a Parameters collection. The constructor of SqlParameter accepts the name of the parameter and the data type using the SqlDbType enumeration. The SourceColumn and SourceVersion properties are used to instruct the SqlDataAdapter as to which column and which version of the column (Current, Default, Original, Proposed) should be used from the DataSet to populate the parameter. In this case the original version of the CourseID is used because it is the primary key while the current versions of Days and CourseWareCost are used because they are updateable.

Tip

Unlike in classic ADO when using parameters with a SqlCommand or OleDbCommand, the parameters needn't be added to the parameters collection in the order in which they are defined in the stored procedure because the procedure will be executed using named arguments. For example in Listing 7.3 the @Days parameter is defined prior to the @CourseID although in the stored procedure the reverse is true. Of course, this means that you must use the exact name of the parameter when adding it to the collection.

The SqlDataAdapter can then populate the underlying DataSet using the Fill method, which returns the number of rows added to the underlying DataTable. Note that if the SqlConnection object is not yet open, the Fill method will first open it before executing the query used to retrieve the data before closing it. If the connection is already open, the Fill method simply uses it and does not close it. This behavior is efficient because holding on to an expensive resource such as database connections is not desirable in distributed applications.

A simple change is then made to the Days column of the first row and the Update method of the SqlDataAdapter is then called. The end result is a single call to the usp_UpdateCourse stored procedure with the appropriate parameters.

In both the Fill and Update methods, the second argument is used to specify the DataTable object within the DataSet that is to be populated (more about this follows).

As mentioned previously, instead of specifying the insert, update, and delete commands executed by the DataAdapter in code as shown in Listing 7.3, it is possible to instruct the managed provider to create them on-the-fly. This can be done with SqlCommandBuilder and OleDbCommandBuilder classes. In fact by simply instantiating a command builder and passing it the DataAdapter in the constructor, the commands will be built on-the-fly by the CommandBuilder when they are needed. The code in Listing 7.4 shows how this works using the SqlCommandBuilder class.

Listing 7.4  Using a CommandBuilder. This listing shows how to use the SqlCommandBuilder to autogenerate commands for data manipulation through the SqlDataAdapter.

Dim cnSQL As SqlConnection
Dim daSQL As SqlDataAdapter
Dim cmdSql As New SqlCommandBuilder(daSQL)
Dim cmUpd As SqlCommand
Dim dsCourses As New DataSet("Offerings")

cnSQL = New 
SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment")
' Create the adapter and set the SelectCommand through the constructor
daSQL = New SqlDataAdapter("SELECT * FROM fn_ListCourses()", cnSQL)

' Create the command builder
cmdSql = New SqlCommandBuilder(daSQL)

' Populate the DataSet
daSQL.Fill(dsCourses, "Courses")

' Make a change to the underlying data
dsCourses.Tables("Courses").Rows(0).Item("Days") = 0
daSQL.Update(dsCourses,"Courses")

' Get the command that was used to update the database
cmUpd = cmdSql.GetUpdateCommand
Console.WriteLine(cmUpd.CommandText)

Notice that rather than create UpdateCommand explicitly, the SqlCommandBuilder is instantiated and passed a reference to daSQL. At this point no further intervention is required. When the Update method of the SqlDataAdapter is invoked, the command builder sends a request to SQL Server to discover the column names and data types of the base table used in the SelectCommand. In this way the command builder only incurs the overhead of the extra roundtrip to the server when it is required. It then populates the underlying SqlDataAdapter command objects, which are then immediately used to execute the insert, delete, or update (as in this case) using the sp_executesql system stored procedure. The commands are then cached for the lifetime of the SqlDataAdapter. Note that the code in Listing 7.4 also uses the GetUpdateCommand method of the SqlCommandBuilder to get a reference to the SqlCommand used for the update and simply prints the CommandText that was generated.

Before you rely on the command builders, there are a couple of caveats to be aware of. First, command builders only work when the statement used for the SelectCommand pulls data from a single base table. In this example, the SelectCommand uses a call to an in-line table function in SQL Server 2000 that queries only the Course table and so it is successful. However, if the function were to use a SELECT statement with aggregate functions or a JOIN clause, the command builder would return an exception when trying to generate the commands because either no base table or multiple base tables would be returned. As shown however, with SQL Server you can use functions and stored procedures that return data from a single table.

Tip

A second type of function supported in SQL Server 2000 is the multistatement table function. This function returns a result set built on-the-fly by Transact-SQL inside the function. While you can use multistatement table functions in a SelectCommand by simply referencing them in the FROM clause, the SqlCommandBuilder cannot create commands from such a statement because base table information is missing. This is the case even if the function returns data from a single table.

The second point to note is that the SQL syntax that gets built when using a command builder is not necessarily optimum. For example, all the columns returned in the SelectCommand are included inside the WHERE clause for the update and delete commands even if the primary key of the underlying DataTable is set. This obviously wastes network bandwidth and causes the SQL Server to have to parse the statement and create an optimized query plan before it can be executed. An example of the update CommandText is shown here. Note that CourseID is the primary key.

UPDATE Course SET CourseNum = @p1 , Description = @p2 , ProductID = @p3 ,
LOBID = @p4 , Days = @p5 , CourseWareCost = @p6
WHERE ( CourseID = @p7 AND CourseNum = @p8 AND Description = @p9 AND
ProductID = @p10 AND LOBID = @p11 AND Days = @p12 AND CourseWareCost = @p13 )

In addition, the insert command is built based on only the columns that are set when populating the new row. In other words, you must ensure that you populate all the columns that are required and that do not have defaults if the insert is to succeed.

The end result is that for most sophisticated applications you'll want to populate the data modification commands yourself using more efficient techniques such as stored procedures.

Using Events

Just as the Connection and Command objects in classic ADO support events so do the managed providers. The connection and DataAdapter classes for both OleDb and SqlClient include events that your code can capture. For example, both the SqlConnection and the OleDbConnection classes support the InfoMessage and StateChanged events to capture informational messages from the data source and monitor when the connection changes its state. SqlDataAdapter and OleDbDataAdapter support RowUpdating and RowUpdated events that fire before and after the UpdateCommand of the adapter is executed.

To handle one of these events you can declare the object using the WithEvents keyword, use a dynamic event handler, or the Delegate class explicitly as discussed in Chapter 4. As an example, consider a client that wants to capture messages produced with the PRINT statement in Transact-SQL. To do this the InfoMessage event must be handled. Assuming that the SqlConnection object is declared as cnSQL, a dynamic event handler can be constructed using the AddHandler statement.

AddHandler cnSQL.InfoMessage, AddressOf SqlConnInfoMessage

The private SqlConnInfoMessage method must then take as its second argument the type SqlInfoMessageEventArgs, which exposes a collection of SqlError objects. The code to handle the event, construct a message string, and pass it to a method used to log the message to a file is shown here:.

Private Sub SqlConnInfoMessage(ByVal sender As Object, ByVal e As 
  SqlInfoMessageEventArgs)
Dim i As Integer
Dim strMessage As String

 For i = 0 to e.Errors.Count - 1
  strMessage = "Source: " & e.Errors(i).Source & vbCrlf & _
  "Number: " & e.Errors(i).Number.ToString() & vbCrlf & _
  "State: " & e.Errors(i).State.ToString() & vbCrlf & _
  "Class: " & e.Errors(i).Class.ToString() & vbCrlf & _
  "Server: " & e.Errors(i).Server & vbCrlf & _
  "Message: " & e.Errors(i).Message & vbCrlf & _
  "Procedure: " & e.Errors(i).Procedure & vbCrlf & _
  "LineNumber: " & e.Errors(i).LineNumber.ToString()
 Next

 LogToFile(strMessage)
End Sub

Connection Pooling

Developers interested in developing scalable distributed applications for use by hundreds or thousands of concurrent users must concern themselves with using expensive resources such as database connections. As a result you'll want to be sure your applications do not hold connections for an extended period of time and that multiple users can reuse connections no longer in use without having to incur the overhead of rebuilding the connection.

In classic ADO, pooling database connections was handled either through the session pooling mechanism of OLE DB or the connection pooling code implemented by the ODBC Driver manager when using the MSDASQL provider.

Tip

For more information on the differences and specifics of these two methods see Chapter 14 of Pure Visual Basic.

When using the OleDb managed provider both of these features are still available and are handled automatically by OLE DB and ODBC depending on which OLE DB provider you use. However, because the SqlClient managed provider communicates directly with SQL Server using TDS, it must implement its own form of connection pooling and does so by relying on Windows 2000 Component Services. As discussed in chapter 8, COM+ in Windows 2000 supports pooled components whereby a predefined number of instantiated object instances are hosted in a pool managed by COM+. If an object instance is available, one is handed out to a client application when it requests a new instance and subsequently returned to the pool when dereferenced for use by another client. This scheme allows clients to reuse objects that are expensive to create thereby reducing the resources required on the server. The class that represents a connection to SQL Server, SqlConnection, is a perfect candidate for a pooled component.

To support creating and configuring a connection pool several additional properties are included in the ConnectionString for the SqlConnection. For example, by simply instantiating and opening a SqlConnection object with the Pooled property set to True the connection will be added to a new or existing pool. As with OLE DB session pooling, connections are grouped into pools based on the distinct text of the ConnectionString property. Any differences at all in connection strings results in the creation of a new pool.

As an example consider this code snippet:

cnSQL = New 
SqlConnection("server=ssosa\sql2k;uid=dfox;pwd=dlf;database=enrollment;
pooling=true")
cnSQL.Open

cnSQL2 = New 
SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment;
pooling=true")
cnSQL2.Open

cnSQL3 = New 
SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment;
pooling=true")
cnSQL3.Open
cnSQL3.Close

cnSQL4 = New 
SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment;
pooling=true")
cnSQL4.Open

In this example, four SqlConnection objects are instantiated and all are left open with the exception of cnSQL3. In this case because all have the Pooling attribute set to True two pools are created: one containing cnSQL1; and the other containing cnSQL2 and cnSQL3 because the ConnectionString for cnSQL1 differs from that of cnSQL2 and cnSQL3. However, when the Close method of cnSQL3 is called the connection is not torn down but returned to the pool. As a result when the Open method of cnSQL4 is called the connection previously used by cnSQL3 will be pulled from the pool and assigned to cnSQL4 rather than a new connection.

Obviously, this contrived example concerns a single client application but if you assume that each connection represents a separate user that attempts to create a database connection nearly simultaneously and you begin to see the usefulness of connection pooling.

Unlike OLE DB session pooling, by default the pooled SqlConnection objects are not destroyed until the process that created them ends or the connection is somehow broken. In addition, the pools created are also subdivided based on whether the connection is enlisted in a particular transaction context. This allows the client application to use distributed transactions with SqlConnection objects and be sure that all work done by the transaction is committed together.

Finally the syntax of the ConnectionString has been augmented with additional properties that control the size and behavior of the pool as shown in Table 7.3. Note that because these are included in the ConnectionString they are set when the pool is first created because pools are created on the basis of the syntax of the ConnectionString.

Table 7.3  SqlClient Connection pooling properties. These properties are used in the ConnectionString of the SqlConnection object to configure connection pooling.

Property

Description

Connect Timeout

Length of time to wait for a connection to succeed. Default is 15 seconds.

Connection Lifetime

Specifies the maximum amount of time in seconds that a connection can live. Default is 0, meaning for the duration of the process. Checked when the connection is returned to the pool.

Connection Reset

Specifies whether the connection is reset when returned to the pool by undoing any session level (SET) statements issued during the user session. Defaults to True but incurs an extra roundtrip when the connection is returned to the pool.

Enlist

Defaults to True and automatically enlists the connection in the current transaction context.

Min Pool Size, Max Pool Size

Defaults to 0 and 100, respectively, and determines how many connections may live in the pool.


Disconnected Data

As previewed in the code in the previous section, the second major partition of the System. Data namespace is the DataSet and its related classes. As previously mentioned the DataSet is fundamental to ADO.NET because it represents data from one or more underlying data sources in a disconnected fashion. In addition it includes metadata such as relationships and constraints that provide structure for the data and allow code to navigate through the data. And of course the DataSet can be defined using an XSD schema and stored as XML so that it is a great vehicle for passing data between tiers of a distributed application.

Note

The DataSet class is derived from the System.ComponentModel. MarshalByValueComponent class, which allows it to be marshaled by value across process boundaries whereby a copy of the serialized object is passed rather than a reference to it. In this way it is analogous to the disconnected Recordset in classic ADO.

In addition to scalar properties used to configure the DataSet, the DataSet contains references to the major classes shown in Figure 7.2 and discussed in Table 7.4.

Figure 7.2 The DataSet object model. The primary classes referenced by the DataSet.

Table 7.4  DataSet classes. The primary classes referenced by the DataSet.

Class

Description

DataRelationsCollection

Exposed through the Relations property and contains a collection of DataRelation objects that map all the parent/child relationships between DataTable objects in a DataSet.

PropertyCollection

Exposed through the ExtendedProperties property and contains a collection of custom properties based on the Hashtable class.

DataTablesCollection

Exposed through the Tables property and contains a collection of DataTable objects contained in the DataSet.

DataViewManager

Exposed through the DefaultViewManager property and used to create custom sort and filter settings (DataView) for each DataTable in the DataSet. Includes a reference to a DataViewSettingsCollection where settings for each DataTable are stored.


As you'll notice from Table 7.4, the DataSet contains references to objects that expose collections containing the data stored in the DataSet. From that perspective you can also think of the DataSet as simply a container for tables, properties, and relations.

As depicted in Figure 7.2 the DataTableCollection exposes the collection of DataTable objects that are used to represent the actual data. Figure 7.3 and Table 7.5 expand the DataTable and show its constituent classes.

Figure 7.3 The DataTable model. This diagram depicts the relationships between the DataTable and other System.Data classes.

Table 7.5  DataTable classes. The primary classes referenced by the DataTable.

Class

Description

DataColumnCollection

Exposed through the Columns property and contains a collection of DataColumn objects that represent each column in the DataTable.

DataRowCollection

Exposed through the Rows property and contains a collection of DataRow objects in the DataTable.

DataView

Exposed through the DefaultView property and contains a reference to a DataView object that contains sort and filter settings for this DataTable object.

DataRelationCollection

Exposed through the ChildRealtions and ParentRelations properties and contains a collection of DataRelation objects contained in the DataSet.

PropertyCollection

Exposed through the ExtendedProperties property and contains a collection of custom properties based on the Hashtable object.

ConstraintCollection

Exposed through the Constraints property and references a collection of objects derived from Constraint such as ForeignKeyConstraint and UniqueConstraint.

DataColumn

In addition to being referenced in ColumnsCollection, also exposed in the PrimaryKey property as an array of objects to enforce the primary key.


The DataTable is the heart of the DataSet and contains the representation of the actual data. You'll notice that like the DataSet, it contains a series of collection objects that reference the columns, rows, properties, constraints, and relations. From this perspective the DataTable is analogous to the classic ADO Recordset.

It also contains a reference to a default DataView object that controls how the data is sorted and filtered. DataView objects can also be created independently to provide multiple views of the same data, for example, in the case where changed rows need to be viewed in one grid control while new rows are viewed in another.

As an example of using the DataSet and its associated collections consider the code in Listing 7.5.

Listing 7.5  Manipulating a DataSet. This code example populates a DataSet using a SqlDataAdapter and then exercises the various collections of the DataSet and DataTable.

Dim cnSQL As SqlConnection
Dim daSql As SqlDataAdapter
Dim daSql1 As SqlDataAdapter
Dim drProdCourse As DataRelation
Dim objCon As Constraint
Dim objUn As UniqueConstraint
Dim rowProd As DataRow
Dim rowCourse As DataRow
Dim CourseRows() As DataRow
Dim dtProducts As DataTable
Dim dtCourses As DataTable
Dim pk() As DataColumn
Dim dsCourses As New DataSet("Offerings")

cnSQL = New 
SqlConnection("server=ssosa\sql2k;trusted_connection=yes;database=enrollment")

' Create the adapters and set the SelectCommand through the constructor
daSql = New SqlDataAdapter("usp_ListCourses", cnSQL)
daSql.SelectCommand.CommandType = CommandType.StoredProcedure

daSql1 = New SqlDataAdapter("usp_ListProducts", cnSQL)
daSql1.SelectCommand.CommandType = CommandType.StoredProcedure

' Populate the DataSet
daSql.Fill(dsCourses, "Courses")
daSql1.Fill(dsCourses, "Products")

dtProducts = dsCourses.Tables("Products")
dtCourses = dsCourses.Tables("Courses")

' Set the primary keys
ReDim pk(1)
pk(0) = dtProducts.Columns("ProductID")
dtProducts.PrimaryKey = pk
ReDim pk(1)
pk(0) = dtCourses.Columns("CourseID")
dtCourses.PrimaryKey = pk

' Add a relationship
drProdCourse = new DataRelation("ProdCourse", _
          dtProducts.Columns("ProductID"), _
          dtCourses.Columns("ProductID"))

drProdCourse.Nested = True
dsCourses.Relations.Add(drProdCourse)

' Look at each row in Products
For Each rowProd in dtProducts.Rows
 CourseRows = rowProd.GetChildRows("ProdCourse")
 Console.WriteLine(rowProd("Name").ToString() & " has " & _
          CourseRows.Length.ToString() & " Courses")

 ' Loop through each correlated row for the Courses
 For Each rowCourse in CourseRows
  Console.WriteLine(vbTab & rowCourse("CourseNum").ToString() _
    & ":" & rowCourse("Description"))
 Next
 Console.WriteLine()
Next

' Print out the constraints
For Each objCon in dtProducts.Constraints
 If TypeOf objCon Is UniqueConstraint Then
  objUn = objCon
  Console.WriteLine("UniqueConstraint on " & objUn.Columns(0).ColumnName)
  If objUn.IsPrimaryKey Then
    Console.WriteLine("It is the primary key")
  End If End If
Next

In this example, two SqlDataAdapter objects use stored procedures to fill two DataTable objects within the DataSet dsCourses with data for products and their associated courses.

Tip

Although this example illustrates the use of several DataAdapters being used to populate a single DataSet, multiple resultsets can be returned from a single stored procedure or SQL batch to the same effect. Doing so will increase performance because only a single roundtrip to the server will be incurred. Keep in mind, however, that you can only associate a single insert, update, and delete command to a DataAdapter. This means that if you plan on updating multiple DataTables within a DataSet with a single DataAdapter you'll need to programmatically switch the insert, update, and delete commands before invoking Update or exceptions will likely result.

The remainder of this section discusses several features of Listing 7.5 that illustrate the structure and uses of the DataSet and DataTable classes.

Mapping Data to the DataSet

Note that the second argument of the overloaded Fill method specifies the name of the DataTableMapping to populate. In this case because no mapping was created before the Fill method was invoked, the SqlDataAdapter creates DataTable objects using the provided mapping names and maps each DataColumn in the DataSet to a column from the underlying database. However, you can programmatically create a mapping layer before populating the DataSet using the TableMappings collection. This is important for situations where the schema of the DataSet will not exactly match the names of the columns in the database. For example, instead of simply using the database column names in Listing 7.5 the following code could be inserted before the Fill method is invoked to map some of the columns of the Course table to new values.

daSql.TableMappings.Add("Table","Courses")

daSql.TableMappings(0).ColumnMappings.Add("CourseNum","CourseNumber")
daSql.TableMappings(0).ColumnMappings.Add("Description","CourseDesc")
daSql.TableMappings(0).ColumnMappings.Add("ProductID","CourseProdID")
daSql.TableMappings(0).ColumnMappings.Add("LOBID","CourseLineOfBusinessID")
daSql.TableMappings(0).ColumnMappings.Add("Days","CourseDays")
daSql.TableMappings(0).ColumnMappings.Add("CourseWareCost","Cost")

daSQL.Fill(dsCourses)

In this example a DataTableMapping object is added to the collection exposed by the DataAdapter and called "Table." In this case "Table" has a special connotation as it represents the default table mapping. In other words, when a DataSet is filled the SqlDataAdapter looks for a table mapping with the name of "Table" and uses it to map the column information in the DataTable. Optionally, another name for the mapping could be used and then passed as the second argument to the Fill method. Because of this default behavior the second argument to the Fill method is omitted.

Once the mapping has been created, columns can be added to it using the ColumnMappings collection by simply passing the name of the database column followed by the name it will have in the resulting DataTable. In this, all the columns except CourseID are being mapped, if the database column does not exist in the ColumnMappings collection it will by default automatically be added to the resulting DataTable (although this can be changed by setting the MissingSchemaAction property of the DataAdapter). After the Fill method is invoked a DataTable is created called "Courses" and populated with the columns from the ColumnsMapping collection. A snippet of the resulting XML from the DataSet returned using the Xml or ReadXml properties of the DataSet follows (note that the Nested property is set to False for this example as discussed later).

 <Courses>
  <CourseID>12</CourseID>
  <CourseNumber>SYB </CourseNumber>
  <CourseDesc>FastTrack to Adaptive Server</CourseDesc>
  <CourseProdID>8</CourseProdID>
  <CourseLineOfBusinessID>2</CourseLineOfBusinessID>
  <CourseDays>5</CourseDays>
  <Cost>300</Cost>
 </Courses>
 <Courses>
  <CourseID>15</CourseID>
  <CourseNumber>SYBA</CourseNumber>
  <CourseDesc>Administering Adaptive Server</CourseDesc>
  <CourseProdID>8</CourseProdID>
  <CourseLineOfBusinessID>2</CourseLineOfBusinessID>
  <CourseDays>5</CourseDays>
  <Cost>300</Cost>
 </Courses>

Tip

Optionally, you can programmatically create the DataTable and add new columns and rows to it with the Add method of the Columns collection and the NewRow method, respectively.

After the DataSet has been populated, the DataTablesCollection can then be accessed through the Tables property of the DataSet, in Listing 7.5 it is used to assign the tables to object variables for easier access. Note that because the DataAdapters are separate objects the data that they use to populate the tables of the DataSet could come from heterogeneous sources. You also can set the PrimaryKey property of the DataTable to an array of DataColumn objects. This also has the effect of automatically creating a UniqueConstraint object and placing it in the ConstraintCollection exposed through the Constraints property.

In situations such as that shown in Listing 7.5 where you want to populate the schema of the DataSet directly from a database table, you can use the FillSchema method of a DataAdapter. This method accepts the DataSet to populate in addition to the SchemaType and optionally arguments such as the table mapping to populate. In addition to the column information, FillSchema retrieves both the primary key and unique constraints. For example, the following line of code will prepopulate the schema of the Courses table mapping used in Listing 7.5.

daSql.FillSchema(dsCourses, SchemaType.Mapping, "Courses")

Note that this method returns the DataTable object that was populated.

Tip

A more convenient way to retrieve the constraint information automatically is to set the MissingSchemaAction property of the DataSet equal to MissingSchemaAction.AddWithKey before invoking the Fill method.

Relating Data

To associate the rows in the Courses table with their parent rows in the Products table you can create a DataRelation object as shown in the listing. In this case the constructor of the DataRelation accepts the name of the relation in addition to the parent column and child column within their respective tables. The DataRelation is then added to the appropriate DataRelationCollection for each table and exposed through the ChildRelations and ParentRelations properties. For example, in this case the relation is added to the ChildRelations collection of dtProducts and the ParentRelations collection of dtCourses.

You'll also notice that the Nested property of the DataRelation is set to True. Setting this property changes the XSD schema produced by the DataSet from one that produces a union like view of the data with one that is hierarchical. For example, changing the Nested property from the defaulted value of False to True changes the resulting XML from the following:

<Offerings>
 <Courses>
 ...
 </Courses>
 <Courses>
 ...
 </Courses>
 <Products>
 ...
 </Products>
 <Products>
 ...
 </Products>
</Offerings>

where all the Products elements follow all the Courses elements, to this:

<Offerings>
 <Products>
  <Courses>
   ...
  </Courses>
  <Courses>
   ...
  </Courses>
 </Products>
 <Products>
  <Courses>
   ...
  </Courses>
 </Products>
</Offerings>

where all the Courses for a particular product are nested within the appropriate Products element. Of course, the underlying XSD also changes to reflect the new hierarchical nature of the DataSet and can be queried using the XmlSchema property.

Tip

Keep in mind that even though all the database columns are mapped by default to elements in the XML representation of the DataSet, you can override this behavior by altering the schema used to produce the XML. One technique for doing so is to change the ColumnMapping property of the DataColumn object to a value from the MappingType enumeration. For example, to change the ProductID column to an attribute you would use the syntax: dtProducts.Columns("ProductID"). ColumnMapping = MappingType.Attribute in Listing 7.5.

Regardless of which way the XML is nested for display, once the relationship is in place the code can navigate through the DataSet. In this example, each DataRow in the dtProducts table is iterated on using a For Each loop. The GetChildRows method of the DataRow object can then be called to return an array of DataRow objects, in this case CourseRows, found via the relationships. In turn, the rows in the array are iterated to print out various columns.

Using Constraints

Finally, the code in Listing 7.5 iterates through each Constraint object in the ConstraintCollection exposed by the Constraints property of the dtProducts DataTable. Because Constraint is the base class, the TypeOf statement is used to determine if the constraint is of type UniqueConstraint. If so, the first column of the constraint is printed to the console. In this case the act of creating the relation automatically puts a unique constraint on the ProductID column of the dtProducts DataTable. However, unless the primary key is explicitly created as in this example, the IsPrimaryKey property of the UniqueConstraint object will be defaulted to False.

An excerpt of the sample output for Listing 7.5 is shown here:

SQL Server has 3 Courses
    1140:SQL Server 7.0 Intro
    2072:System Administration for SQL Server 2000
    2073:Implementing a Database Design on SQL Server 2000

Visual Basic has 2 Courses
    1013:Mastering Visual Basic
    1016:Enterprise Development with Visual Basic

Visual Interdev has 1 Courses
    1017:Mastering Web Site Development

Using Select

The final important concept that you should be familiar with when using the DataTable class is the capability to select rows with the Select method. This overloaded method returns an array of DataRow objects corresponding to the criteria provided. For example, in the case of the Courses DataTable shown in Listing 7.5 you can use the Select method to return only those rows for a particular product.

Dim drRows() As DataRow

drRows = dtCourses.Select("ProductID = 1","CourseNum ASC")

For Each rowProd in drRows
 StrCourseNum = rowProd("CourseNum"))
Next

The second argument (exposed in an overloaded signature) also can include a sort criteria, in this case sorting by CourseNum in ascending order. Further, a third overloaded method allows you to select rows based on the values of the DataViewRowState enumeration, which includes CurrentRows, Deleted, ModifiedCurrent, ModifiedOriginal, New, None, OriginalRows, and Unchanged.

Note

Other techniques can also be used to return a subset of data from a DataSet or DataTable. These include the GetErrors method of the DataTable, which returns an array of rows that contains errors after a call to Update, and the GetChanges method of the DataSet which returns a new DataSet that contains only data that has been modified.

Although this discussion highlights the constituent pieces and mechanics of a DataSet and using the managed providers, it does not place them in the context of an actual application. The remainder of the chapter is devoted to using ADO.NET in a distributed application.

Using Events

Just as the Recordset object in classic ADO supports events, both the DataSet and DataTable classes support a variety of events that you can use to respond to changes in the structure and the data in the application. For example, the DataSet supports the MergeFailed and PropertyChanged events that fire when errors occur as two DataSet objects are merged with the Merge method and when any property of the DataSet changes respectively. The DataTable supports seven events, six of which are found in pairs as shown in Table 7.6.

Table 7.6  DataTable events. This table lists the events for the DataTable class.

Events

Event Args

Description

ColumnChanging, ColumnChanged

DataColumnChange EventArgs

Fired immediately before and immediately after a column value is modified in the DataTable.

RowChanging, RowChanged

DataRowChange EventArgs

Fired immediately before and immediately after a row has been successfully edited in a DataTable.

RowDeleting, RowDeleted

DataRowChange EventArgs

Fired immediately before and immediately after a row has been successfully deleted in a DataTable.

PropertyChanged

PropertyChanged EventArgs

Fired whenever a property of the DataTable has changed.


These events can be used to add functionality to your applications such as creating a client-side audit trail by capturing the RowChanged event and inspecting the DataRowAction enumeration exposed through the Action property of the DataRowChangeEventArgs. For each row marked as Delete, Add, Change, and Commit, your application could log the fact to a file. For example, the RowChanged event of the dtCourses table created in Listing 7.5 can be handled by using the AddHandler statement shown here:

AddHandler dtCourses.RowChanged, AddressOf CoursesRowChanged

The CoursesRowChanged method is then implemented to handle the event, extract the Action and the primary key of the row from the event arguments, and log them to a log file using a private LogToFile method.

Private Sub CoursesRowChanged(ByVal sender As Object, ByVal e As 
DataRowChangeEventArgs)

Dim strAction As String

 Select Case e.Action
  Case DataRowAction.Add
   strAction = "Action=Add"
  Case DataRowAction.Change
   strAction = "Action=Change"
  Case DataRowAction.Commit
   strAction = "Action=Commit"
  Case DataRowAction.Delete
   strAction = "Action=Delete"
  Case Else ' Do Nothing
   Return
 End select

 LogToFile(strAction & ", CourseID = " & e.Row.Item("CourseID") & _
  " at " & DateTime.Now.ToShortTimeString()
  & " " & DateTime.Now.ToShortDateString())
End Sub
  • + Share This
  • 🔖 Save To Your Account