Home > Articles > Web Services > XML

  • Print
  • + Share This
This chapter is from the book

This chapter is from the book

7.2 XML and ADO.NET

One of the most useful features of ADO.NET is its integration with portions of the managed XML data stack. Traditional data access and XML data access have the following integration points:

  • The DataSet class integrates with the XML stack in schema, data, and serialization features.

  • The XML schema compiler lets you generate typed DataSet subclasses.

  • You can mix nonrelational XML data and the relational DataSet through the XmlDataDocument class and do XPath queries on this data using DataDocumentXPathNavigator class.

  • ADO.NET supports SQL Server 2000 XML integration features, both in the SqlClient data provider and in an add-on product called SQLXML. The latter product features a series of SqlXml managed data classes and lets you update SQL Server via updategram or DiffGram format.

These features, although unrelated in some aspects, work to complete the picture of support of nonrelational as well as relational data in ADO.NET, and direct support of XML for marshaling and interoperability. Let's look first at integration in the System.Data.DataSet class and its support for XML.

7.2.1 Defining a DataSet's Schema

In many ways, the ADO.NET DataSet class mimics a relational database. Each DataSet instance contains a schema—the set of tables, columns, and relationships—as does a relational database. You can define the schema of an ADO.NET DataSet instance in at least four ways:

  • Use the DataSet APIs directly to create DataTables, DataColumns, and DataRelations. This approach is similar in concept to using DDL in relational databases.

  • Infer the schema using database metadata through a DataAdapter class. Using DataAdapter.Fill creates tables and columns matching the metadata from DataAdapter's SelectCommand. For this to work, DataAdapter's MissingSchemaAction property must be set to Add or AddWithKey.

  • Define the desired DataSet schema using XSD (XML Schema Definition language), and use DataSet.ReadXmlSchema to load the schema definition into the DataSet. The schema may not use nonrelational data definition styles, or else ReadXmlSchema will throw an error.

  • Use DataSet.InferXmlSchema. The DataSet class will use a set of schema inference rules to infer a DataSet schema from a single XML document.

You can also define DataSet's schema incrementally by using a combination of these methods, as shown in Figure 7–1. Note that in each case the result

Figure 7-1Figure 7-1 Ways to fill the DataSetÕs schema.

DataSet is not aware of the source of the schema, and therefore any method of defining the schema works as well as any other. For example, let's define a simple schema that includes a customers table, an orders table, and a one-to-many relationship between customers and orders. Listing 7–1 uses the four schema-definition methods to accomplish this. Note that, when using DataSet or DataAdapter, you need additional code to set up the relationship, whereas in the case of XML schema or document inference, this information may be available in the schema or exemplar document.

Listing 7–1 Ways to create a DataSet schema

DataSet ds1, ds2, ds3, ds4;

// Use DataSet APIs
ds1 = new DataSet();
ds1.Tables.Add("Customers");
ds1.Tables[0].Columns.Add("custid", typeof(int));
ds1.Tables[0].Columns.Add("custname", typeof(String));
ds1.Tables.Add("Orders");
ds1.Tables[1].Columns.Add("custid", typeof(int));
ds1.Tables[1].Columns.Add("orderid", typeof(int));
ds1.Relations.Add(
  ds1.Tables["Customers"].Columns["custid"],
  ds1.Tables["Orders"].Columns["custid"]);

// Create schema from SQL resultset metadata
ds2 = new DataSet();
SqlDataAdapter da = new SqlDataAdapter(
  "select * from customers;select * from orders",
  "server=localhost;uid=sa;database=northwind");
da.FillSchema(ds2, SchemaType.Source);
ds2.Tables[0].TableName = "Customers";
ds2.Tables[1].TableName = "Orders";
ds2.Relations.Add(
  ds2.Tables["Customers"].Columns["customerid"],
  ds2.Tables["Orders"].Columns["customerid"]);

// Read schema from a file
// Contains customers and orders
ds3 = new DataSet();
ds3.ReadXmlSchema(
  @"c:\xml_schemas\customers.xsd");

// Infer schema from exemplar document
// Contains customers and orders
ds4 = new DataSet();
ds4.InferXmlSchema(
  @"c:\xml_documents\customers.xml",
  null);

Past Microsoft (and other) APIs let you map XML to relational data but require that you specify the XML in a special format. ADO classic, for example, requires the XML to be specified in the format used by ADO's XML support. Some XML support systems require manual coding for each case, based on the programmer's knowledge of the underlying structure and the underlying types of all the elements or attributes involved. Some systems use a document type definition to specify type, but because the DTD is designed around XML's original use as a document markup language it is largely type-ignorant in the traditional programming sense, and the programmer still must know each type that is not string.

ADO classic's XML support uses a Microsoft-specific predecessor of XSD schemas; it's known as XDR (XML Data Reduced). XDR's type system is limited. Simple types are based loosely on OLE DB types, and there is no notion of type derivation, precluding the use of the object-oriented concepts of inheritance and polymorphism. .NET object persistence and DataSet take full advantage of XSD's improvements in these areas. This "object-relational" mapping layer includes two main classes: System.Data.DataSet's XML support and System.Xml.XmlDataDocument, a class that represents a hybrid of the DOM/DataSet model. Unlike the ADO Recordset, which can consume and produce a single XML format based on a single schema style, DataSet can read and write XML corresponding to almost any schema.

Because XSD supports complex user-defined types as well as type derivation by extension and restriction, using XSD to define types supports a schema-based representation of objects and facilitates a natural mapping to object-based data stores, such as object databases. The hierarchical nature of XML makes this a natural for mapping to homogeneous hierarchies and databases such as IMS. Adding support for navigation and multisets brings CODASYL and object-relational data into the picture. Of course, data can also be stored in files using XML's native serialization format.

Using the XML native serialization format, which is defined in the XML 1.0 and Namespaces recommendations, is arguably a specialization of the object database case, although it enables standards compliance and a measure of code similarity to other platforms using the same object models to access serialized documents. In addition, including DataSet in the mix enables codifying of a set of default rules for mapping XML to relational databases. The relational structure of any Infoset can be defined directly using an XML schema for mapping, and it can also be inferred by using a rule-based approach. You can use relations to map XML hierarchies to multiple relational tables. As you've seen, you can map columns to either attributes or subelements of a particular element.

XML schema definition language is more flexible than the rules of a relational database. This means that when you load a DataSet via an XSD schema, you use a simple set of rules to map the XSD schema to a relational schema:

  • ComplexTypes are mapped to DataTables.

  • Nested ComplexTypes are mapped to related DataTables.

  • Key or Unique constraints are mapped to UniqueConstraints.

  • KeyRefs are mapped to ForeignKeyConstraints.

Schema inference is the attempt to deduce a schema from a single exemplar document. Because a single document is used, this technique is more error-prone than simply supplying the schema, and using a predefined schema should always be preferred over schema inference. When you attempt to infer a nonrelational XSD schema into DataSet format, it is either coerced into a relational schema (as in Listing 7–2) or you receive an error (as in Listing 7–3). Just as the schema compiler (XSD.exe) can infer an XSD schema for an XML document, DataSet uses a set of schema inference rules to infer a relational schema for an XML document. The complete set of rules is described in Appendix C. The DataSet.InferSchema method permits you to specify as an optional parameter, a set of namespaces that will be excluded from schema inference.

Listing 7–2 Schema coercion to relational

<!-- This results in a table with three columns -->
<root>
  <document>
     <name>Bob</name>
     <address>111 Any St</address>
  </document>
  <document>
     <name>Bird</name>
     <livesin>tree</livesin>
  </document>
</root>      

Listing 7–3 Attempt to infer a nonrelational schema

// Attempt to infer a schema from this document
/*
<book>
<chapter>
<title>Testing your <noun>typewriter</noun></title>
<p>The quick brown <noun>fox</noun> jumps over
the lazy <noun>dog</noun></p>
</chapter>
</book>  
*/
DataSet ds = new DataSet();
ds.InferXmlSchema(
  @"c:\xml_documents\semistruct.xml",
  null);

// produces error:
// System.ArgumentException,
// The same table (noun) cannot be the child
// table in two nested relations.

Methods that read XML data in .NET usually have the same four overloads. You can read or write XML schema using Stream, String, TextReader, or XmlReader. The XML methods in DataSet follow the same pattern, as shown in Listing 7–4. The ReadXmlSchema method can read the schema in either XDR format or XSD (XML Schema Definition language) format. InferXmlSchema can exclude a set of namespaces from the schema inference process; there is no way to do this with ReadXmlSchema.

Listing 7–4 Overloads of ReadXmlSchema

public class DataSet 
{
  // ... other methods omitted
  public void ReadXmlSchema(Stream stream);
  public void ReadXmlSchema(string);
  public void ReadXmlSchema(TextReader reader);
  public void ReadXmlSchema(xmlReader reader);
}

DataSet also exposes symmetric WriteXmlSchema methods. WriteXmlSchema uses a hard-coded algorithm to write schemas (see "Writing XML Schemas from DataSet" later in this chapter). You can also obtain schema information by using GetXmlSchema, which returns the XML schema as a string.

7.2.2 Refining DataSet's XML Schema

You can further refine DataSet's XML schema by using the properties of DataColumn, DataTable, and DataSet that relate to XML. Properties that are useful for setting the schema include Namespace, Prefix, ColumnMapping, and DataSet's DataRelations collection. Listing 7–5 shows an example of changing the XML schema for DataSet. The default namespace for DataSet and its underlying DataColumns and DataTables is no namespace.

Listing 7–5 Changing a schema using column mappings

SqlDataAdapter da = new SqlDataAdapter(
     "select * from authors", 
     "server=localhost;uid=sa;database=pubs");
DataSet ds = new DataSet();

da.TableMappings.Add("authors", "AllAuthors");
da.TableMappings[0].ColumnMappings.Add("au_id","AuthorID");

da.Fill(ds, "authors"); 
ds.Tables[0].Namespace = "http://www.develop.com/dmauth";
ds.Tables[0].Prefix = "dmauth";
ds.WriteXmlSchema("myauthors.xsd");

XML data can take the form of elements, attributes, or element text. The default in DataSet is elements, but you can override it by specifying an XML schema or by using the InferXmlSchema method. For those DataSets that do not populate their schema from XML, it can be specified by using DataColumn's ColumnMapping property. ColumnMapping choices are MappingType.Element, Attribute, SimpleText, or Hidden. The latter indicates that this column will not be serialized when DataSet is serialized as XML. Listing 7–6 shows the results of using the XML properties of DataSet.

Listing 7–6 Using namespace, prefix, and mappings

// This code:

DataSet ds = new DataSet();
SqlDataAdapter da = new SqlDataAdapter(
  "select top 2 * from jobs", 
  "server=localhost;uid=sa;database=pubs");
da.Fill(ds);
DataTable t = ds.Tables[0];
t.Columns[1].Namespace = "http://www.develop.com";
t.Columns[1].Prefix = "DM";
t.Columns[2].ColumnMapping = MappingType.Attribute;
ds.WriteXml(@"c:\xml_documents\dmjobs.xml");

// Produces this document:

<?xml version="1.0" standalone="yes"?>
<NewDataSet>
  <Table min_lvl="10">
    <job_id>1</job_id>
    <DM:job_desc 
     xmlns:DM="http://www.develop.com">
    Vice Chairman</DM:job_desc>
    <max_lvl>10</max_lvl>
  </Table>
  <Table min_lvl="200">
    <job_id>2</job_id>
    <DM:job_desc 
     xmlns:DM="http://www.develop.com">
    Grand Poobah</DM:job_desc>
    <max_lvl>250</max_lvl>
  </Table>
</NewDataSet>

When you write a DataSet containing multiple DataTables in XML format, each DataTable is an immediate child of the root element, as shown in Listings 7–7 and 7–8.

Listing 7–7 Data relations and XML

SqlDataAdapter da = new SqlDataAdapter(
  "select au_id, au_fname, au_lname from authors;" +
  "select au_id, title_id from titleauthor", 
  "server=localhost;uid=sa;database=pubs");      
DataSet ds = new DataSet();
da.Fill(ds, "foo");
ds.Tables[0].TableName = "authors";
ds.Tables[1].TableName = "titleauthor";

// non-nested relation
ds.Relations.Add(ds.Tables[0].Columns["au_id"],
  ds.Tables[1].Columns["au_id"]);
ds.Relations[0].Nested = false;
ds.WriteXml("myfile.xml");

Listing 7–8 With Nested=false (default)

<?xml version="1.0" standalone="yes"?>
<NewDataSet>
  <authors>
    <au_id>213-46-8915</au_id>
    <au_fname>Bob</au_fname>
    <au_lname>Green</au_lname>
  </authors>
  <authors>
    <au_id>472-27-2349</au_id>
    <au_fname>Burt</au_fname>
    <au_lname>Gringlesby</au_lname>
  </authors>
  <titleauthor>
    <au_id>213-46-8915</au_id>
    <title_id>BU1032</title_id>
  </titleauthor>
  <titleauthor>
    <au_id>213-46-8915</au_id>
    <title_id>BU2075</title_id>
  </titleauthor>
  <titleauthor>
    <au_id>486-29-1786</au_id>
    <title_id>PS7777</title_id>
  </titleauthor>
</NewDataSet>

To produce hierarchical XML from multiple DataTables, you must define a DataRelation between them, and the DataRelation's Nested property must be set to true. This is illustrated in Listing 7–9.

Listing 7–9 With Nested=true

<NewDataSet>
  <authors>
    <au_id>213-46-8915</au_id>
    <titleauthor>
      <au_id>213-46-8915</au_id>
      <title_id>BU1032</title_id>
    </titleauthor>
    <titleauthor>
      <au_id>213-46-8915</au_id>
      <title_id>BU2075</title_id>
    </titleauthor>
    <au_fname>Bob</au_fname>
    <au_lname>Green</au_lname>
  </authors>
  <authors>
    <au_id>472-27-2349</au_id>
    <titleauthor>
      <au_id>472-27-2349</au_id>
      <title_id>TC7777</title_id>
    </titleauthor>
    <au_fname>Burt</au_fname>
    <au_lname>Gringlesby</au_lname>
  </authors>
</NewDataSet>

7.2.3 Reading XML into DataSet

XML document data can be read into DataSet just as data from DataAdapter can, using one of the overloads of DataSet.ReadXml. There are the usual four overloads: String, Stream, TextReader, and XmlReader. To refine the process, you can use another set of overloads via a second XmlReadMode parameter. There is also a set of mostly symmetric WriteXml methods.

There are two ways to read and write the contents of DataSet. The default is to read or write all the current values in DataSet that are not specified as DataColumn.ColumnMapping = MappingType.Hidden. An example is shown earlier in Listing 7–6. A second format, known as DiffGram, includes all the current rows, before images for those rows that you have changed since the last time you called AcceptChanges, and a set of error elements that contains errors for specific rows. Listing 7–10 shows the general layout of a DiffGram.

Listing 7–10 General format of a DiffGram

<?xml version="1.0"?>
<diffgr:diffgram 
 xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
 xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema">

   <DataInstance>
   </DataInstance>

  <diffgr:before>
  </diffgr:before>

  <diffgr:errors>
  </diffgr:errors>
</diffgr:diffgram>

The DiffGram format adds annotations from the namespace urn:schemas-microsoft-com:xml-diffgram-v1 to the XML output. The most important annotation is the id attribute, which is used to tie before images and errors to specific rows. Listing 7–11 shows a typical DiffGram. DiffGrams are discussed further in "Writing XML Data from DataSet" later in this chapter.

Listing 7–11 A DiffGram in which both rows have been changed

<?xml version="1.0" standalone="yes"?>
<diffgr:diffgram 
 xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" 
 xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
  <NewDataSet>
    <Table diffgr:id="Table1" msdata:rowOrder="0" 
           diffgr:hasChanges="modified">
      <job_id>1</job_id>
      <job_desc>Different value</job_desc>
      <min_lvl>10</min_lvl>
      <max_lvl>10</max_lvl>
    </Table>
    <Table diffgr:id="Table2" msdata:rowOrder="1" 
           diffgr:hasChanges="modified">
      <job_id>2</job_id>
      <job_desc>Different value</job_desc>
      <min_lvl>200</min_lvl>
      <max_lvl>250</max_lvl>
    </Table>
  </NewDataSet>
  <diffgr:before>
    <Table diffgr:id="Table1" msdata:rowOrder="0">
      <job_id>1</job_id>
      <job_desc>Vice Chairman</job_desc>
      <min_lvl>10</min_lvl>
      <max_lvl>10</max_lvl>
    </Table>
    <Table diffgr:id="Table2" msdata:rowOrder="1">
      <job_id>2</job_id>
      <job_desc>Grand Poobah</job_desc>
      <min_lvl>200</min_lvl>
      <max_lvl>250</max_lvl>
    </Table>
  </diffgr:before>
</diffgr:diffgram>

When you read an XML document into a DataSet using ReadXml, you might have a document with an embedded schema (one that is contained in the same file or stream as the document) or a document that contains only the data. In addition, you can use ReadXml to read from a document that contains a DiffGram or a "normal" XML document. XmlReadMode specifies how the ReadXml method works. The following XmlReadModes are affected by the existence of a schema:

  • InferSchema: Ignores an inline schema and infers the schema from the data. If tables already exist in the DataSet but have incompatible properties, an error is thrown. If the DataSet's existing schema overlaps, tables or columns will be added.

  • IgnoreSchema: Uses only the DataSet's existing schema. Data in the XML input that does not match the existing schema is thrown away.

  • ReadSchema: Uses the inline schema if one exists. If tables with the same name already exist in the DataSet's schema, an exception is thrown.

  • DiffGram: Applies the changes assuming DiffGram format as input. If the input is in DiffGram format, this works in the same way as the DataSet.Merge method. If the XML input is not in DiffGram format, this works the same as IgnoreSchema.

If the XmlReadMode is not specified, the default is XmlReadMode.Auto, which works as follows:

  • If the XML input is a DiffGram, it is read as a DiffGram, possibly populating multiple RowVersions in the DataSet.

  • If the input is not in DiffGram format and if the DataSet already has a schema or there is an inline schema, ReadSchema is used.

  • If no schema exists in the DataSet or inline, InferSchema is used.

Schemas used by ReadXml can be in either XDR or XSD format. Either XDR or XSD format can use inline schemas. XSD can also use the xsi:SchemaLocation attribute to specify the schema location.

Filling the DataSet by using ReadXml works differently than filling the DataSet using DataAdapter.Fill. When you use DataAdapter, by default, changes are implicitly accepted when Fill is finished; you can override this by setting DataAdapter's AcceptChangesDuringFill property. Calling RejectChanges immediately after calling Fill has no effect. When ReadXml is used, the changes are not accepted until AcceptChanges is explicitly called. The reasoning behind this behavior is that ReadXml can be used to read rows into the DataSet, and these rows can be used to update a database through a DataAdapter, as shown in Listing 7–12. If AcceptChanges were implicit, attempting to update a database in this way would fail because the rows in the DataSet would be marked as original rows only. Calling RejectChanges after using ReadXml rolls back any rows added but leaves the schema intact, as shown in Listing 7–13.

Listing 7–12 Changes are not accepted during ReadXml

DataSet ds = new  DataSet();
SqlDataAdapter da = new SqlDataAdapter(
  "select * from jobs",
  "server=localhost;uid=sa;database=pubs");
SqlCommandBuilder bld = new SqlCommandBuilder(da);

da.MissingSchemaAction = MissingSchemaAction.AddWithKey;
da.Fill(ds, "jobs");

ds.ReadXml(@"c:\xml_documents\more_jobs.xml");

// adds jobs in more_jobs.xml to database
da.Update(ds, "jobs");

Listing 7–13 Calling RejectChanges deletes rows added by ReadXml

SqlDataAdapter da = new SqlDataAdapter(
  "select * from jobs",
  "server=localhost;uid=sa;database=pubs");
DataSet ds = new DataSet();
da.Fill(ds);
// 15 rows here
Console.WriteLine("{0} rows in table 0", 
  ds.Tables[0].Rows.Count);
ds.RejectChanges();
// still 15 rows here
Console.WriteLine("{0} rows in table 0", 
  ds.Tables[0].Rows.Count);

DataSet ds2 = new DataSet();
ds2.ReadXml(@"c:\xml_documents\jobs.xml");
// 15 rows here
Console.WriteLine("{0} rows in table 0", 
  ds2.Tables[0].Rows.Count);
ds2.RejectChanges();
// table still exists but 0 rows
Console.WriteLine("{0} rows in table 0", 
  ds2.Tables[0].Rows.Count);

The DiffGram format works differently from "ordinary" XML, no matter which XmlReadMode is specified. With any of the XmlReadModes, if the table or tables that the DiffGram refers to do not already exist in the DataSet's schema, ReadXml results in a System.Data.DataException being thrown. In addition, you must be very careful when mixing DiffGrams with DataSets that are filled using DataSet.Fill. If all the metadata (such as PrimaryKey or Unique constraint) is not present when the DiffGram is read, the DataSet may contain duplicate rows, and the resulting call to DataAdapter.Update may fail or produce the wrong results. For best results when using DataAdapter to update a database, always retrieve the metadata by specifying MissingSchemaAction.AddWithKey or explicitly calling DataAdapter.FillSchema. The rule is to remember that using XmlReadMode.DiffGram acts like DataSet.Merge.

7.2.4 Writing XML Schemas from DataSet

When you write XML from DataSet, you have similar options as when reading XML. The instructions for writing the XML data and schema are specified as properties of DataSet and related classes, so the process is straightforward. WriteXmlSchema has the same four overloads as ReadXmlSchema; it can write to Stream, String, TextWriter, or XmlWriter. In contrast to reading, WriteXmlSchema does not allow a choice of schema format; only XSD schemas are supported. The schemas written reflect the Namespace and Prefix properties of the DataSet, DataTables, and DataColumns as well as the DataRelation's Nested property and the DataColumn's ColumnMapping. Although the schema is a full-fidelity XML schema, it does reflect the relational nature of the DataTables in the DataSet. In addition, the DataSet's schema contains annotations from a Microsoft-specific namespace, making XML documents that use the schema load faster into the DataSet than those that don't use the schema.

Let's look at a sample DataSet containing two DataTables and a primary-foreign key relationship between them. The code that produced this DataSet is shown in Listing 7–14.

Listing 7–14 Code to produce an XML schema

SqlConnection conn = new SqlConnection(
  "server=(local);uid=sa;pwd=;database=pubs");
SqlDataAdapter da = new SqlDataAdapter(
  "select * from authors;select * from titleauthor", 
  conn);

da.TableMappings.Add("Table", "authors");
da.TableMappings.Add("Table1", "titleauthors");

DataSet ds = new DataSet("OneMany");
da.MissingSchemaAction = 
  MissingSchemaAction.AddWithKey;
da.Fill(ds);
ds.Relations.Add(
  "au_ta",
  ds.Tables[0].Columns["au_id"], 
  ds.Tables[1].Columns["au_id"], 
  true); 	
ds.Relations[0].Nested = true;
ds.WriteXmlSchema("one_to_many.xsd");

Listing 7–15 shows the schema produced by calling DataSet.WriteXml-Schema. Examining the schema in detail elucidates the relationship between relational data and XML. The most interesting point is that the schema is nonrelational! Although both the authors and the titleauthors DataTables are represented as xsd:ComplexType, the titleauthors table is represented as a nested xsd:ComplexType, an embedded member of authors. This representation is more similar to an object-oriented, embedded table view than to a relational view, where the tables are not hierarchical. This schema does not allow a "child" type without an existing parent, something that is consistent with the primary-foreign key constraint in a relational database.

Listing 7–15 Schema produced by the code in Listing 7–14

<?xml version="1.0" standalone="yes"?>
<xsd:schema id="OneMany" targetNamespace="" 
   xmlns="" xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
   xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xsd:element name="OneMany" msdata:IsDataSet="true">
    <xsd:complexType>
      <xsd:choice maxOccurs="unbounded">
        <xsd:element name="authors">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="au_id" 
                           type="xsd:string" />
              <xsd:element name="au_lname" 
                           type="xsd:string" />
              <xsd:element name="au_fname" 
                           type="xsd:string" />
              <xsd:element name="phone" 
                           type="xsd:string" />
              <xsd:element name="address" 
                           type="xsd:string" 
                           minOccurs="0" />
              <xsd:element name="city" 
                           type="xsd:string" 
                           minOccurs="0" />
              <xsd:element name="state" 
                           type="xsd:string"  
                           minOccurs="0" />
              <xsd:element name="zip" 
                           type="xsd:string" 
                           minOccurs="0" />
              <xsd:element name="contract" 
                           type="xsd:boolean" />
              <xsd:element name="titleauthors" 
                           minOccurs="0"
                           maxOccurs="unbounded">
               <xsd:complexType>
                 <xsd:sequence>
                   <xsd:element name="au_id" 
                                type="xsd:string" />
                   <xsd:element name="title_id" 
                                type="xsd:string" />
                   <xsd:element name="au_ord" 
                                type="xsd:unsignedByte" 
                                minOccurs="0" />
                   <xsd:element name="royaltyper" 
                                type="xsd:int" 
                                minOccurs="0" />
                 </xsd:sequence>
                </xsd:complexType>
               </xsd:element>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:choice>
    </xsd:complexType>
    <xsd:unique name="titleauthors_Constraint1" 
     msdata:ConstraintName="Constraint1" 
     msdata:PrimaryKey="true">
      <xsd:selector xpath=".//titleauthors" />
      <xsd:field xpath="au_id" />
      <xsd:field xpath="title_id" />
    </xsd:unique>
    <xsd:unique name="Constraint1" 
                msdata:PrimaryKey="true">
      <xsd:selector xpath=".//authors" />
      <xsd:field xpath="au_id" />
    </xsd:unique>
    <xsd:keyref name="au_ta" 
                refer="Constraint1" 
                msdata:IsNested="true">
      <xsd:selector xpath=".//titleauthors" />
      <xsd:field xpath="au_id" />
    </xsd:keyref>
  </xsd:element>
</xsd:schema>

Because the code does not specify otherwise, all DataColumns are represented as elements. This results in DataColumns that can be NULL, using the minOccurs="0" facet in the schema. NULL values will be represented, not by empty elements but instead by the element's absence in the data. This distinguishes a NULL value from an empty string, as in relational databases. Both the authors and the titleauthors primary keys are represented by the xsd:unique production. The relationship between the two tables is represented by an xsd:keyref production. Note that the xsd:key production is not used to indicate that the unique fields are primary keys. Instead, this is indicated by a Microsoft-specific annotation.

7.2.5 Microsoft-Specific Annotations

The urn:schemas-microsoft-com:xml-msdata namespace annotations used in the DataSet's schema are the most controversial part of the schema, although they are in complete compliance with the schema recommendation document. XML purists argue that they are a "hack" and should not be included in an XML schema at all. Another opinion is that they are convenience features that can be safely ignored by systems that do not support .NET DataSets. The truth falls somewhere in between.

Let's look at some of these annotations. The first annotation, msdata:IsDataSet="true", appears on the element that will become the DataSet. Although it may appear at first that there should be a DataSet xsd:ComplexType, it is impossible because DataSet is a generic container. The number of DataTable, DataColumn, and DataRelation items can vary with each instance, although each DataSet instance can be precisely described. The msdata:IsDataSet annotation assists in deserialization for systems that support such a container; other systems can safely ignore it, although the resulting structure graph will not have DataSet semantics, such as the ability to dynamically add a DataTable.

The msdata:ConstraintName="Constraint1" annotation is an interesting way to specify a relational constraint. It combines with the "name" attribute of the authors table PrimaryKey and the "refer" attribute of the xsd:keyref production to define a relationship. This definition is nonstandard and outside the realm of XSD, but it is a compliant schema element. Using the annotation that results in using msdata:PrimaryKey="true" on an xsd:unique production instead of the xsd:key production is not technically the best way to describe the key in XML schemas, but it does permit the DataSet to have primary-foreign key relationships using unique fields that are not primary keys. Because DataSet allows this, using the PrimaryKey production simplifies deserialization.

The last annotation, msdata:IsNested="true", allows DataSet to serialize hierarchical XML when the relationship is not actually a nested table hierarchy. This is a convenience feature for XML processors that "prefer" a hierarchical representation.

Using Microsoft-specific annotations to coerce relational data into different shapes is a step you can take toward bridging the gap between relational and other data representations. This step is necessary because strict relational is a simplification of the rich set of data representations possible in XML. I discuss this issue further in Chapter 10.

7.2.6 Writing XML Data from DataSet

XML data can be written from DataSet in a variety of formats. Each format reflects the Prefixes, Namespaces, Relations, and ColumnMappings and can be validated using the schema that would be written using WriteXmlSchema against the same DataSet. The DataSet.WriteXml method has two series of the same four overloads as DataSet.ReadXml; the second parameter is the XmlWriteMode enumeration. The XmlWriteMode is a simple enumeration; a DataSet can be written with or without the corresponding XSD schema. In addition, it can be written as DiffGram format.

DiffGram format is an XML representation of an entire DataSet with extra rows that correspond to pending changes. Listing 7–16 shows an example that adds, deletes, and changes rows in an existing DataSet, along with the DiffGram produced. Note that this code uses annotations from the msdata prefixed namespace and that the DiffGram itself comes from a new namespace (urn:schemas-microsoft-com:xml-diffgram-v1). The msdata annotation is used to define the order in which the rows occur in the DataSet. Everything else is defined in the DiffGram's namespace. The DiffGram assigns each row a unique ID composed of the TableName and an ordinal, in this case Table1X, where Table1 is the TableName and X is the unique-ifier. The DiffGram consists of a set of current rows in the DataSet followed by a set of rows contained within a diffgram:before element. Changes are represented by before images and the diffgram:hasChanges attribute. Changed rows have a before and current row, and each has a hasChanges="modified" attribute. Inserted rows appear in the current set of rows only and have a hasChanges="inserted" attribute. Deleted rows appear in the before section only and are marked with hasChanges="deleted". From this format, an entire DataSet of rows and changes is represented. A DiffGram can be persisted with or without schema.

Listing 7–16 DiffGram produced after making changes

//
// This code:

DataSet ds = new DataSet();
SqlDataAdapter da = new SqlDataAdapter(
  "select * from jobs",
  "server=localhost;uid=sa;database=pubs");

da.MissingSchemaAction = MissingSchemaAction.AddWithKey;
da.Fill(ds);

ds.Tables[0].Rows[1][1] = "newer description";
ds.Tables[0].Rows[2].Delete();
ds.Tables[0].Rows.Add(
  new object[4] { null, "new row", 40, 40 });

ds.WriteXml(@"c:\xml_documents\writeDiffGram.xml", 
  XmlWriteMode.DiffGram);

<!-- produces this DiffGram -->
<?xml version="1.0" standalone="yes"?>
<diffgr:diffgram 
 xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" 
 xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"
>
  <NewDataSet>
    <Table diffgr:id="Table1" msdata:rowOrder="0">
      <job_id>1</job_id>
      <job_desc>Vice Chairman</job_desc>
      <min_lvl>10</min_lvl>
      <max_lvl>10</max_lvl>
    </Table>
    <Table diffgr:id="Table2" msdata:rowOrder="1" 
           diffgr:hasChanges="modified">
      <job_id>2</job_id>
      <job_desc>newer description</job_desc>
      <min_lvl>200</min_lvl>
      <max_lvl>250</max_lvl>
    </Table>
    <Table diffgr:id="Table4" msdata:rowOrder="3">
      <job_id>4</job_id>
      <job_desc>Chief Financial Officier</job_desc>
      <min_lvl>175</min_lvl>
      <max_lvl>250</max_lvl>
    </Table>
    <Table diffgr:id="Table5" msdata:rowOrder="4">
      <job_id>5</job_id>
      <job_desc>Publisher</job_desc>
      <min_lvl>150</min_lvl>
      <max_lvl>250</max_lvl>
    </Table>
    <Table diffgr:id="Table6" msdata:rowOrder="5">
      <job_id>6</job_id>
      <job_desc>Managing Editor</job_desc>
      <min_lvl>140</min_lvl>
      <max_lvl>225</max_lvl>
    </Table>
    <!-- some rows (6-14) deleted from diagram here... -->
    <Table diffgr:id="Table16" msdata:rowOrder="15" 
           diffgr:hasChanges="inserted">
      <job_id>0</job_id>
      <job_desc>new row</job_desc>
      <min_lvl>40</min_lvl>
      <max_lvl>40</max_lvl>
    </Table>
  </NewDataSet>
  <diffgr:before>
    <Table diffgr:id="Table2" msdata:rowOrder="1">
      <job_id>2</job_id>
      <job_desc>zzz</job_desc>
      <min_lvl>200</min_lvl>
      <max_lvl>250</max_lvl>
    </Table>
    <Table diffgr:id="Table3" msdata:rowOrder="2">
      <job_id>3</job_id>
      <job_desc>Business Operations Manager</job_desc>
      <min_lvl>175</min_lvl>
      <max_lvl>225</max_lvl>
    </Table>
  </diffgr:before>
</diffgr:diffgram>

Finally, you can produce a second DataSet consisting only of changes and then write the changes as a DiffGram. This technique is useful when you're marshaling changes because the entire DataSet need not be passed across the wire. To implement this, you use DataSet's GetChanges method. Listing 7–17 shows an example of the code and resulting DiffGram.

Listing 7–17 Using GetChanges and DiffGrams

//
// This code

DataSet ds = new DataSet();
SqlDataAdapter da = new SqlDataAdapter(
  "select * from jobs",
  "server=localhost;uid=sa;database=pubs");
da.MissingSchemaAction = MissingSchemaAction.AddWithKey;
da.Fill(ds);

ds.Tables[0].Rows[1][1] = "newer description";
ds.Tables[0].Rows[2].Delete();
ds.Tables[0].Rows.Add(
  new object[4] { null, "new row", 40, 40 });

DataSet ds2 = new DataSet();
ds2 = ds.GetChanges();
ds2.WriteXml(
 "c:\\xml_documents\\writeDiffGramChanges.xml", 
 XmlWriteMode.DiffGram);

<!-- produces this DiffGram -->
<?xml version="1.0" standalone="yes"?>
<diffgr:diffgram 
 xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" 
 xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1"
>
  <NewDataSet>
    <Table diffgr:id="Table1" msdata:rowOrder="0" 
           diffgr:hasChanges="modified">
      <job_id>2</job_id>
      <job_desc>newer description</job_desc>
      <min_lvl>200</min_lvl>
      <max_lvl>250</max_lvl>
    </Table>
    <Table diffgr:id="Table3" msdata:rowOrder="2" 
           diffgr:hasChanges="inserted">
      <job_id>0</job_id>
      <job_desc>new row</job_desc>
     <min_lvl>40</min_lvl>
      <max_lvl>40</max_lvl>
    </Table>
  </NewDataSet>
  <diffgr:before>
    <Table diffgr:id="Table1" msdata:rowOrder="0">
      <job_id>2</job_id>
      <job_desc>zzz</job_desc>
      <min_lvl>200</min_lvl>
      <max_lvl>250</max_lvl>
    </Table>
    <Table diffgr:id="Table2" msdata:rowOrder="1">
      <job_id>3</job_id>
      <job_desc>Business Operations Manager</job_desc>
      <min_lvl>175</min_lvl>
      <max_lvl>225</max_lvl>
    </Table>
  </diffgr:before>
</diffgr:diffgram>

After you have produced a DataSet as a result of using GetChanges, you can persist it as a DiffGram (current and changed rows) or as a normal XML document (current rows only). You can also use GetChanges to get the rows that have a certain RowState, such as only Added rows. Listing 7–18 shows the use of GetChanges to marshal changes made on the client tier back to a middle tier, where they are used in an update. Note that, although all data is always marshaled in XML format, only changed data is sent across the wire, thereby cutting down on network traffic. Finally, the changed rows are returned to the client to refresh the client's copy of the DataSet.

Listing 7–18 Round-trip update; only changes are marshaled in XML format

// 1. middle tier
DataSet ds = new DataSet();
SqlDataAdapter da = new SqlDataAdapter(
  "select * from jobs",
  "server=localhost;uid=sa;database=pubs");

da.MissingSchemaAction = MissingSchemaAction.AddWithKey;
da.Fill(ds);
// 2. pass to client here

ds.Tables[0].Rows[1][1] = "newer description";
ds.Tables[0].Rows[14].Delete();
ds.Tables[0].Rows.Add(
  new object[4] { null, "new row", 40, 40 });

DataSet ds2 = new DataSet();
ds2 = ds.GetChanges();
// 3. pass ds2 back to middle tier here
// just pass back the changes

SqlCommandBuilder bld = new SqlCommandBuilder(da);
da.InsertCommand = bld.GetInsertCommand();
da.UpdateCommand = bld.GetUpdateCommand();
da.DeleteCommand = bld.GetDeleteCommand();

// make sure we get the identity column on insert
da.InsertCommand.CommandText += 
  ";select * from jobs where job_id = @@identity";
da.InsertCommand.UpdatedRowSource = 
  UpdateRowSource.FirstReturnedRecord;

// update on middle tier using changes only.
// refresh with most current rows
// this is only needed for insert, 
// update has most current rows
da.Update(ds2);
// 4. pass changes only back to client

// client now contains latest changes
ds.Merge(ds2);
  • + Share This
  • 🔖 Save To Your Account