Home > Articles

Working with DataSets

  • Print
  • + Share This
This chapter is from the book

Populating a DataSet

As you learned yesterday, a DataSet can be populated by a data adapter by calling the data adapter's Fill method. The Fill method invokes the command object referenced in the data adapter's SelectCommand property, and the data is subsequently loaded into the DataSet using the mapping found in the TableMappings property of the data adapter. This technique is far and away the one that you'll use most frequently to load data into a DataSet.

However, because the DataSet is a standalone object, it can also be programmatically loaded using its Tables collection. For example, the code in Listing 3.1 written in VB .NET creates a DataSet and populates it with two stores.

Referencing ADO.NET

Because ADO.NET is so integral to creating applications in .NET, when using VS .NET, new projects in both VC# .NET and VB .NET automatically add a reference to the ADO.NET assembly. However, in VC# .NET, you must include a using System.Data; statement in your source code file to avoid having to fully qualify the names of ADO.NET objects as in System.Data.DataSet. In VB .NET, the story is a little different because its Project Property dialog includes the Imports page that by default includes the System.Data namespace. Therefore, the Imports System.Data statement doesn't have to appear in VB .NET source code files (unless you're going to compile them from the command line). The page can be seen and modified to include other namespaces, such as System.Data.SqlClient or System.Data.OleDb, by right-clicking on the project name in the Solution Explorer when working with a VB .NET project and selecting properties. The Imports page is under Common Properties.

Listing 3.1 Populating a DataSet programmatically. This code populates a DataSet with store information.

Dim stores As New DataSet("NewStores")
Dim storesTable As DataTable
Dim store As DataRow
Dim dcID As DataColumn

stores.CaseSensitive = False
stores.Namespace = "http://www.compubooks.com/stores"
stores.Prefix = "cbkss"

' Add the new table
storesTable = stores.Tables.Add("Stores")

' Define the columns
With storesTable
 .Columns.Add("StoreID", GetType(System.Guid))
 .Columns.Add("Address", GetType(String))
 .Columns.Add("City", GetType(String))
 .Columns.Add("StateProv", GetType(String))
 .Columns.Add("PostalCode", GetType(String))
End With

' Create a new row
store = storesTable.NewRow
With store
 .Item("Address") = "5676 College Blvd"
 .Item("City") = "Overland Park"
 .Item("StateProv") = "KS"
 .Item("PostalCode") = "66212"
 .Item("StoreID") = System.Guid.NewGuid
End With

' Add it

' Add a second row
Dim newValues() As Object = {System.Guid.NewGuid, _
 "5444 Elm", "Shawnee", "KS", "66218"}

You'll notice in Listing 3.1 that the constructor of the DataSet accepts a string that is used to specify the name of the DataSet. This can also be set using the DataSetName property if you use the alternative constructor that accepts no arguments. Properties of the DataSet including the XML Namespace and Prefix are then set to ensure that if it were serialized to XML and transported to a trading partner, for example, the data could be differentiated.


In this example, the CaseSensitive property is set to False to disable case-sensitive searching and filtering on the DataSet. As you'll learn tomorrow, there is also a CaseSensitive property on the DataTable class. As you would expect, setting the property on the table overrides the property on the DataSet, and resetting the property on the DataSet has no effect on tables that have already had their CaseSensitive property set. It should also be noted that the CaseSensitive property affects only string comparisons with data and not the names of columns. In other words, case is not taken into account when accessing column names even if the CaseSensitive property is set to True.

To track store information, a DataTable object is created and added to the Tables collection of the DataSet using the Add method of the underlying DataTableCollection object. As you'll learn tomorrow, the DataTable exposes a collection of the DataColumn objects in its Columns collection. In Listing 3.1, columns are added to the table using the Add method. Although there are several overloaded Add methods, the one used here simply accepts the name of the column and its data type. The StoreID column is the unique identifier, and so its data type is System.Guid, whereas the rest simply map to the VB .NET String data type (System.String behind the scenes).

After the table's structure is created, the first row is added by first creating the DataRow object using the NewRow method and then populating each individual column exposed through the Item collection. Although the new row was created with the NewRow method, it must be added to the table explicitly using the Add method of the DataRowCollection class exposed through the Rows property.

The second row is added using a different technique. In this case, the values for the new row are placed into an array of type Object and then simply passed to the overloaded Add method of the DataRowCollection class. Note that in this case the values must be placed into the array positionally coinciding with the order of the columns in the DataTable. In addition, notice that the expression System.Guid.NewGuid is added as the first element in the array. This is the case because the first position represents the StoreID column, which must be uniquely generated.

Of course, you can also use a combination of the Fill method and the programmatic approach to build a DataSet. One typical example is when you want the DataSet to include both data from a data source and some application-generated data. For example, the rather hard-coded example in Listing 3.2 shows how the DataSet can first be filled using the Fill method and then modified to add a second table to store the criteria used to populate the other table. The criteria can then subsequently be used to remind the users how the rows they're seeing were found.


As you'll learn on Day 7, "DataSets and XML," a DataSet can also be loaded from an XML document.

Listing 3.2 Combining techniques to populate a DataSet. This listing uses both the Fill method and the programmatic approach to populate a DataSet.

Dim con As New OleDbConnection( _
Dim da As New OleDbDataAdapter("usp_GetTitles", con)
Dim books As New DataSet("ComputeBooksTitles")
Dim criteria As DataTable
Dim strISBN As String = "06720001X"

da.SelectCommand.CommandType = CommandType.StoredProcedure
da.SelectCommand.Parameters.Add("@isbn", strISBN)

da.Fill(books, "Titles")

criteria = books.Tables.Add("Criteria")

' Define the columns
With criteria
 .Columns.Add("ISBN", GetType(String))
 .Columns.Add("Title", GetType(String))
 .Columns.Add("Author", GetType(String))
 .Columns.Add("PubDate", GetType(Date))
 .Columns.Add("CatID", GetType(System.Guid))
End With

' Add the row
Dim newValues() As Object = {strISBN, Nothing, Nothing, Nothing, Nothing}

' Make it permanent

A second interesting aspect of Listing 3.2 is that it calls the AcceptChanges method of the criteria DataTable when the criteria row has been added. This is done to make sure that the new row has been committed to the DataSet and so if the DataSet is passed to a data adapter for update, the adapter will not attempt to synchronize the changes with the data store. The code could have alternatively called the AcceptChanges method of the DataSet, as shown in Table 3.1. This would have had the effect of committing all changed or new rows in all tables within the DataSet. Obviously, in this case, it makes sense to call AcceptChanges because the criteria will never actually be stored in the database. In Listing 3.1, the AcceptChanges method wasn't called because the new stores may be later inserted into a database using a data adapter.

Traversing a DataSet

It should come as no surprise that after a DataSet has been populated, both its structure and data can be traversed programmatically. Listing 3.3 uses the collections of the DataSet and its children to write out information about each of the tables as well as the rows in a DataSet populated through the stored procedure usp_GetTitlesLookups.

Listing 3.3 Traversing a DataSet. This listing populates and traverses both the structure and data in a DataSet.

Dim con As New SqlConnection( _
Dim da As New SqlDataAdapter("usp_GetTitlesLookups", con)
Dim dsLookup As New DataSet("LookupData")
Dim dt As DataTable
Dim dc, pk As DataColumn
Dim dr As DataRow

da.SelectCommand.CommandType = CommandType.StoredProcedure

da.MissingSchemaAction = MissingSchemaAction.AddWithKey

For Each dt In dsLookup.Tables
 ' Table info
 Console.WriteLine(" has " & dt.Rows.Count & " rows")
 Console.Write("Primary Key: ")
 For Each pk In dt.PrimaryKey

 ' Column Info
 For Each dc In dt.Columns
  Console.Write(dc.ColumnName & " ")
  Console.Write(dc.DataType.ToString & " ")
  Console.Write(dc.AllowDBNull & " ")

 ' Data
 For Each dr In dt.Rows
  For Each dc In dt.Columns
   Console.Write(dc.ColumnName & ":")


In Listing 3.3, you'll notice that the DataSet is populated with the Fill method from the stored procedure. The interesting aspect of the usp_GetTitlesLookups stored procedure is that it contains SELECT statements from not just one but three tables, as shown here:

Create Procedure usp_GetTitlesLookups
Select * From Titles
Order By Title
Select * From Categories
Order by Description
Select * From Publishers
Order by Name

As a result, if the data source can stream multiple result sets to the client in a single statement, the Fill method of the data adapter can simply iterate them and create multiple tables within the DataSet. You'll also notice that the MissingSchemaAction property of the data adapter is set to the AddWithKey value from the MissingSchemaAction enumeration. Although covered in more detail tomorrow, this setting ensures that the tables within the DataSet will be loaded with the primary key and other information about the columns.

After filling the DataSet, the code consists of a series of For Each loops used to iterate the various collections. Each loop prints table and column information to the console followed by all the data in the table. A sample of the results printed to the console is shown here:

Table has 605 rows
Primary Key: ISBN
ISBN System.String False True
Title System.String False False
Description System.String True False
Author System.String False False
PubDate System.DateTime False False
Price System.Decimal False False
Discount System.Decimal True False
BulkDiscount System.Decimal True False
BulkAmount System.Int16 True False
Cover System.Byte[] True False
CatID System.Guid False False
Publisher System.String True False

Title:.NET Framework Essentials
Description:Great Books
PubDate:6/1/2001 12:00:00 AM

Note that unlike in ADO, the number of rows returned for each table can be queried before the data is traversed because the entire result set is first downloaded to the client. In addition, by setting the MissingSchemaAction property, the PrimaryKey property of the DataTable and the AllowDBNull and Unique properties of the DataColumn objects are populated with the correct values corresponding to the primary key constraint, nullability (NULL or NOT NULL) setting, and unique constraint information in SQL Server, respectively.

Finally, the data is traversed by iterating the Rows collection of DataRow objects. In this case, the ColumnName is printed along with the value.


Note that even though Option Strict is turned on, you don't have to convert the value of each column, dr.Item(dc), to a string using the CType function. This is because the WriteLine method of the Console object actually supports 18 different signatures that accept the data types returned by the Item collection. As a result, the appropriate data type will be chosen at runtime and converted to a String for display. This is an excellent example of an effective way to use overloaded members in a class.

Selecting Data

Although it is more efficient to supply the appropriate row selection criteria to a SELECT statement through a WHERE clause, there are times when you'll need to create a subset of the rows loaded to a DataSet. This can be accomplished through the use of the Select and Find methods of the DataTable and DataRowCollection objects, respectively.

The Select method of the DataTable object is overloaded and can return the entire contents of the table, the contents filtered by an expression, filtered and sorted rows, as well as filtered and sorted rows that match a particular state. For example, building on the code in Listing 3.3, the following code could be written to create an array of DataRow objects containing the books for a particular author:

Dim titles As DataTable
Dim foundRows() As DataRow

titles = dsLookup.Tables(0)

foundRows = titles.Select("Author = 'Fox, Dan'")

As you can see, the filter expression is similar to a WHERE clause but contains its own set of rules, as shown in Table 3.2. The array of DataRow objects can then be traversed using the For Each syntax shown in Listing 3.3. Of course, if just the rows for this author were required from the database, you should use the following syntax directly in the SelectCommand of the data adapter used to populate the DataSet or in the stored procedure with the author name passed in as a parameter:

WHERE Author = 'Fox, Dan'

Table 3.2 Expression Syntax Rules



Column Names

You refer to column names using the name defined in the DataSet rather than the name in the underlying data store. Further, if a column name contains a special character, such as ~ ( ) # / \ = > < + - * & % | ^ ' " [ ], the name must be wrapped in brackets.

Literal Values

Strings must be placed in single quotes, dates must be bracketed in # (for example, #11/25/2001#), and numeric expressions can contain decimal points.


AND, OR, NOT, =, <, >, <=, >=, IN, LIKE, +, -, *, /, % (modulus) operators are supported with optional parentheses; otherwise, normal precedence rules apply. Note that string comparisons are case sensitive based on the CaseSensitive property of the DataSet object.

Wildcard Characters

Both * and % can be used interchangeably in wildcard expressions (for example, LIKE 'Fox*' or LIKE 'Fox%'). If the literal string contains one of these characters, it can be set off in brackets.


The CONVERT, LEN, ISNULL, IIF, and SUBSTRING functions are supported. They work as you would expect. More information can be found in the online help.

Aggregate Functions

The Sum, Avg, Min, Max, Count, StDev, and Var functions are typically used in conjunction with a reference to parent or child rows.

Parent and Child Relations

Rows associated through relationships can be accessed using dot notation, as in Child.column or Parent.column. If more than one child exists, you can pass the name of the relation using the syntax Child(relation).column.


Table 3.2 lists the rules for creating expressions that are also used with the Expression property of the DataColumn class, which we'll discuss tomorrow, and the RowFilter property of the DataView class. However, each of these properties can be used for different purposes, and so although legal, not all the available syntax will make sense when used with a particular property. For example, using a string appropriate for a filter in the Expression property of the DataColumn will simply evaluate to a Boolean statement that will cause the value of the column to be a 0 (False) or 1 (True).

WHERE Versus Select

So, when should you use the Select method versus explicit parameters? The rule of thumb is always to retrieve only the data from the database that you're going to use. This means only the data the user needs to see and perhaps modify. In other words, it should be rare that you'd select the contents of an entire table and load it into a DataSet. Most DataSets will contain rows already filtered through a WHERE clause. They can then be further filtered for a particular use through the Select method.

This is the case because in most enterprise databases, the primary tables will contain many more than the approximately 600 rows in the Titles table in the sample database. Can you imagine loading the millions of titles tracked by a real bookseller into a DataSet? That approach would be terribly slow and consume a tremendous amount of memory on the middle-tier server where the DataSet is created. Database products such as SQL Server and Oracle are optimized to select data very quickly based on indexes created by the database administrator. As long as you query data based on these indexes, you're always better off letting the database do the heavy lifting.

In the previous example, if the DataTable has a primary key defined on it, as this one does, the rows will be added to the array in the order of the key. If no key is specified, the array will simply be populated in the order the rows were added to the table.

To explicitly control the sort order, you can specify the sort expression in the overloaded Select method. For example, to select all the books published by Sams in order of publication date, starting with the most recent, the following code could be used:

foundRows = titles.Select("Publisher = 'Sams'", "PubDate DESC")

In the previous two examples, the Select method filtered and sorted the current rows in the DataSet. However, because the DataSet stores both original and modified versions of each row, the Select method also supports selecting rows based on the state of the row in the table. The states are tracked using the DataViewRowState enumeration and can be set to one of eight values (Added, CurrentRows, Deleted, ModifiedCurrent, ModifiedOriginal, None, OriginalRows, and Unchanged). In other words, the previous example is equivalent to

foundRows = titles.Select("Publisher = 'Sams'", "PubDate DESC", _

So, for example, you can use this overloaded signature to select rows that have not yet been modified like so:

foundRows = titles.Select("Publisher = 'Sams'", "PubDate DESC", _

You'll learn more about changing data and how it is tracked on Day 5, "Changing Data."

The second technique for selecting data is to use the Find method of the DataRowCollection object (exposed through the Rows collection of the DataTable object) to find one particular DataRow in the table based on the primary key. For example, using the Titles table retrieved in Listing 3.3 once again, the following syntax can be used to find a specific title:

Dim isbn As String
Dim foundRow As DataRow

isbn = "06720606X"

foundRow = titles.Rows.Find(isbn)

Note that the Find method is overloaded and accepts either a single value of type Object or an array of objects. In this case, a single value is passed because the primary key contains a single column. If the primary key were a composite key consisting of more than one value, an array containing these values in the order they are defined in the key could be passed. In addition, note that although Find accepts arguments of type Object, a String can be passed (even with Option Strict On) because all types in .NET derive from System.Object.

Of course, if no primary key is defined on the DataSet—for example, if the DataSet were not populated using MissingSchemaAction.AddWithKey—the Find method would throw a MissingPrimaryKeyException.

  • + Share This
  • 🔖 Save To Your Account