InformIT

.NET Tools for Working with XML

Date: May 21, 2004


.NET provides a powerful set of classes for working with XML directly. This article provides an overview of the most important of these classes, and some examples of what you can do with them.

A lot of people associate the .NET Framework with XML, and for good reason. .NET uses XML behind the scenes to implement many of its development tools, such as SOAP and Web services. Beyond that, however, .NET provides a powerful set of classes for working with XML directly. Whatever you need to do with XML (sequential or random access, validation, transforms, or output), the .NET Framework provides you with tools that are not only powerful but easy to use.

This article provides an overview of the most important of these classes, and some examples of what you can do with them. All of .NET's XML classes are in the System.Xml namespace and its child namespaces, and support the following standards (listed with their WWW namespaces):

XmlTextReader

The XmlTextReader class provides non-cached, forward-only access to a stream of XML data. It is designed specifically for fast access to XML data while placing minimal demands on the system's resources. Functionally, XmlTextReader is similar to the Simple API for XML (SAX), another technique for reading XML that is popular with non-.NET programmers.

XmlTextReader steps through the XML data one node at a time. At each node, your program can use the class properties to obtain information about the node: its type (element or attribute, for example), data, number of attributes, and so on. You use the Read method to advance to the next node, and the EOF property to determine when the end of the data has been reached.

This class does not perform data validation; that's one reason why it is so fast. Nor does it support default attributes or resolve external entities. It does, however, enforce the rules of well-formed XML, which makes it a good well-formedness parser. Because of its speed, it is also well suited for looking through an XML file for a specific piece of information, or for processing an entire XML file sequentially, as when you are generating HTML from the XML data.

Let's look at an example. Listing 1 shows part of an XML data file that will be used in this example. It is data from a checkbook register. Listing 2 shows Visual Basic code that will display the number of checks written to the category "groceries" and the total amount of those checks.

Listing 1. The XML data file used in the examples.

<?xml version="1.0"?>
<checkbook>
<check number="100" date="2004-04-05">
<payee>Wilson Oil Co.</payee>
<amount>156.25</amount>
<category>utilities</category>
</check>
<check number="101" date="2004-04-07">
<payee>Kroger Foods</payee>
<amount>98.25</amount>
<category>groceries</category>
</check>
<check number="102" date="2004-04-07">
<payee>Cancer Society</payee>
<amount>100.00</amount>
<category>charity</category>
</check>
</checkbook>

Listing 2. Using the XmlTextReader class to extract data from an XML file.

' Requires Imports System.Xml at the top of the file.
Dim rdr As XmlTextReader = Nothing
Dim amount As String = ""
Dim total As Decimal = 0
Dim count As Integer = 0
Dim isAmountElement As Boolean
Dim isCategoryElement As Boolean

Try
  rdr = New XmlTextReader("checkbook.xml")
  While rdr.Read()
    ' At each start tag, record whether it opens an "amount" or a
    ' "category" element, so the text node that follows can be identified.
    If rdr.NodeType = XmlNodeType.Element Then
      isAmountElement = (rdr.Name = "amount")
      isCategoryElement = (rdr.Name = "category")
    End If
    If rdr.NodeType = XmlNodeType.Text Then
      ' Is it a "category" element with the value "groceries"? If so, increment
      ' the count and add the saved amount to the total.
      If isCategoryElement AndAlso rdr.Value = "groceries" Then
        count += 1
        ' XmlConvert parses the number independently of the local culture.
        total += XmlConvert.ToDecimal(amount)
      End If
      ' If it is an "amount" element, save the value for possible future use
      ' (in this file the amount precedes its category).
      If isAmountElement Then
        amount = rdr.Value
      End If
    End If
  End While
Catch ex As Exception
  MsgBox("XML error " & ex.Message)
Finally
  If Not rdr Is Nothing Then rdr.Close()
End Try
MsgBox("You wrote " & count.ToString & " checks for groceries totaling " _
  & Format(total, "C"))
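Listing 2 reads only element text. The node properties mentioned earlier (such as the number of attributes) can be inspected in the same loop. The following is a minimal sketch, not part of the original listings: the module name AttributeSketch and the helper DescribeChecks are hypothetical, and the XML is read from an in-memory string rather than from checkbook.xml so that the example is self-contained.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module AttributeSketch
    ' Hypothetical helper: returns one line per <check> element, giving its
    ' "number" attribute and how many attributes the element carries.
    Function DescribeChecks(ByVal xml As String) As String
        Dim sb As New StringBuilder()
        Dim rdr As New XmlTextReader(New StringReader(xml))
        Try
            While rdr.Read()
                If rdr.NodeType = XmlNodeType.Element AndAlso rdr.Name = "check" Then
                    sb.Append(rdr.GetAttribute("number"))
                    sb.Append(" (")
                    sb.Append(rdr.AttributeCount)
                    sb.AppendLine(" attributes)")
                End If
            End While
        Finally
            rdr.Close()
        End Try
        Return sb.ToString()
    End Function

    Sub Main()
        Dim sample As String = _
            "<checkbook><check number=""100"" date=""2004-04-05"">" & _
            "<payee>Wilson Oil Co.</payee></check></checkbook>"
        ' Prints: 100 (2 attributes)
        Console.Write(DescribeChecks(sample))
    End Sub
End Module
```

GetAttribute returns Nothing when the named attribute is absent, so production code would normally check for that before using the value.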

XmlValidatingReader

The XmlValidatingReader class, as its name implies, provides data validation capabilities. Specifically, it can validate XML data against a document type definition (DTD), an XML schema definition language (XSD) schema, or an XML Data Reduced (XDR) schema. This class does not work alone; it must be used in conjunction with an instance of XmlTextReader that is passed to the constructor. Thus, this class gives you the forward-only capabilities of XmlTextReader with validation added. XmlValidatingReader also adds support for default attributes and the ability to resolve external references. Validation of XML data is an inherently complex and slow process.

XmlValidatingReader validates as it goes along. To validate an entire XML file, you must step through it from beginning to end. In broad outline the procedure is as follows:

Validation errors will throw an exception, which you can catch using the handler created in step 3. If the process completes without an exception, then the XML data is valid.
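As a rough illustration of this read-to-the-end approach, here is a minimal sketch. It is an assumption about how the outline might look in code, not a reproduction of the article's steps: the module name ValidationSketch and the helper IsValid are hypothetical, and ValidationType.Auto is chosen so that the reader validates against whatever DTD or schema the document itself references (with none present, only well-formedness is checked). Later versions of the Framework mark XmlValidatingReader as obsolete, so expect a compiler warning there.

```vbnet
Imports System
Imports System.Xml
Imports System.Xml.Schema

Module ValidationSketch
    ' Hypothetical helper: returns True if the file can be read from start to
    ' finish without a well-formedness or validation exception.
    Function IsValid(ByVal path As String) As Boolean
        Dim rdr As XmlTextReader = Nothing
        Dim vrdr As XmlValidatingReader = Nothing
        Try
            ' The validating reader wraps an XmlTextReader.
            rdr = New XmlTextReader(path)
            vrdr = New XmlValidatingReader(rdr)
            vrdr.ValidationType = ValidationType.Auto
            ' Step through every node; validation happens as a side effect.
            While vrdr.Read()
            End While
            Return True
        Catch ex As XmlSchemaException
            Console.WriteLine("XML error: " & ex.Message)
            Return False
        Catch ex As XmlException
            Console.WriteLine("XML error: " & ex.Message)
            Return False
        Finally
            If Not vrdr Is Nothing Then vrdr.Close()
            If Not rdr Is Nothing Then rdr.Close()
        End Try
    End Function

    Sub Main()
        ' Assumes checkbook.xml is in the working directory.
        Console.WriteLine(IsValid("checkbook.xml"))
    End Sub
End Module
```

Both exception types are caught because a malformed document and an invalid one fail in different ways; the sketch treats them alike and simply reports the message.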

XmlDocument

The XmlDocument class implements the W3C Document Object Model (DOM) core levels 1 and 2. This class provides random, cached access to the XML data. In other words, the data is held in memory and your program can move forward and backward as needed. Actually, "forward" and "backward" are not really accurate, because the DOM represents XML data as a tree of nodes, so what you are really doing is "walking the tree." The XmlDocument class also permits you to modify the document's data and structure.

You have a lot of flexibility for loading XML into the XmlDocument class: from a string, a stream, a URL, validated XML from an instance of XmlValidatingReader, or partial data (data from a specified node) from an instance of XmlTextReader.
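Two of these loading routes can be sketched briefly. This is an illustrative example rather than one of the article's listings: the module name LoadSketch and the helper RootName are hypothetical. LoadXml parses a string, while Load accepts a file name or URL (among other sources).

```vbnet
Imports System
Imports System.Xml

Module LoadSketch
    ' Hypothetical helper: parses XML held in a string and returns the name
    ' of the document's root element.
    Function RootName(ByVal xml As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Return doc.DocumentElement.Name
    End Function

    Sub Main()
        ' Prints: checkbook
        Console.WriteLine(RootName("<checkbook><check number=""100""/></checkbook>"))

        ' Loading from a file name or URL instead (assumes the file exists):
        ' Dim doc As New XmlDocument()
        ' doc.Load("checkbook.xml")
    End Sub
End Module
```

Both methods throw an XmlException if the input is not well formed, so real code would wrap the call in a Try block as Listing 2 does.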

Once the XmlDocument instance is loaded with XML data you "walk the tree," usually starting at the document's root element, as referenced by the class's DocumentElement property. From this element (or any other element) you move around using these class members:

You can create a simple demo program that uses the XmlDocument class to walk through an XML data file, reading the names and values of the elements and attributes and displaying them in a text box. This demo uses the same checkbook data XML file that was shown earlier. To create the demo:

  1. Start a new Windows Application project. Change the form's Text property to "XmlDocument Demo".
  2. Place a TextBox control on the form. Set its properties as follows:
      Name: txtOutput
      Multiline: True
      Text: (a blank string)
      ScrollBars: Vertical
  3. Resize the TextBox control to fill the width of the form and about two-thirds of its height.
  4. Add a Button control beneath the TextBox control. Set its properties as follows:
      Name: btnProcess
      Text: Process
  5. Place the code from Listing 3 into the Click event procedure for the button. If necessary, edit the line that creates the XmlTextReader so it points to the location of the checkbook.xml file.

Listing 3. Demonstrating the XmlDocument Class

Private Sub BtnProcess_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles btnProcess.Click

  ' Requires Imports System.Xml at the top of the file.
  ' Load the whole file into memory, then release the reader.
  Dim rdr As New XmlTextReader("checkbook.xml")
  Dim xmlDoc As New XmlDocument()
  Try
    xmlDoc.Load(rdr)
  Finally
    rdr.Close()
  End Try

  Dim n As XmlNode
  Dim n1 As XmlNode
  Dim i As Integer
  Dim s As String = ""

  ' Set n to the first <check> node.
  n = xmlDoc.DocumentElement.FirstChild
  Do While Not n Is Nothing
    ' Write out this element's attributes.
    With n.Attributes
      For i = 0 To .Count - 1
        s &= .ItemOf(i).Name & ": " & .ItemOf(i).Value & vbCrLf
      Next
    End With
    ' Set n1 to the first child node (a <payee> node).
    n1 = n.FirstChild
    Do While Not n1 Is Nothing
      s &= n1.Name & ": " & n1.InnerText & vbCrLf
      n1 = n1.NextSibling
    Loop
    n = n.NextSibling
    ' A blank line between checks.
    If Not n Is Nothing Then s &= vbCrLf
  Loop

  txtOutput.Text = s

End Sub

Other Classes in System.Xml

We've looked at three of the most important XML-related .NET classes, but there are many more that cannot be covered here. For example, the XmlTextWriter class lets you create XML output that conforms to the W3C Extensible Markup Language (XML) 1.0 and the Namespaces in XML recommendations. And the XslTransform class transforms XML data using an XSLT stylesheet.
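To give a flavor of XmlTextWriter, here is a very small sketch that builds one check element in memory. It is only an illustration under stated assumptions: the module name WriterSketch and the helper BuildCheck are hypothetical, and the output goes to a StringWriter rather than a file so that the result can be inspected directly.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module WriterSketch
    ' Hypothetical helper: writes a single <check> element and returns the markup.
    Function BuildCheck(ByVal number As String, ByVal payee As String, _
                        ByVal amount As String) As String
        Dim sw As New StringWriter()
        Dim w As New XmlTextWriter(sw)
        w.WriteStartElement("check")
        w.WriteAttributeString("number", number)
        w.WriteElementString("payee", payee)
        w.WriteElementString("amount", amount)
        w.WriteEndElement()
        w.Flush()
        Dim result As String = sw.ToString()
        w.Close()
        Return result
    End Function

    Sub Main()
        ' Prints: <check number="100"><payee>Wilson Oil Co.</payee><amount>156.25</amount></check>
        Console.WriteLine(BuildCheck("100", "Wilson Oil Co.", "156.25"))
    End Sub
End Module
```

The writer escapes reserved characters such as & and < in the values it is given, which is the main advantage over assembling the markup by string concatenation.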

XML is becoming increasingly important as a data storage and transfer standard in many areas of information technology. With the tools provided by the .NET Framework, you should be able to handle essentially any XML-related programming task.
