Seven Steps to XML Mastery, Step 4: Parsing and Processing XML (Part 1 of 2)

In this fourth step to XML mastery, Frank Coyle starts us into the world of parsing technology with a look at the major parsing models: DOM, SAX, and StAX (a newcomer on the block). With some parsing technology under your belt, you can programmatically extract, modify, and even create XML - and it's actually much less complicated than it sounds.
For more information about this series, start by reading Frank Coyle's introduction, Seven Steps to XML Mastery: About This Series.

Now it’s time to move to step 4 in our series and look at options for working with XML at a programming level. For a company like ZwiftBooks, building a corporate infrastructure around XML implies being able to move XML code into and out of programs seamlessly. This means extracting, modifying, and creating XML by using an XML parser. In this article, we’ll look at how ZwiftBooks can utilize XML parsing technology to integrate with an existing warehouse alert program.

Event Versus Tree Parsing

XML parsers fall into two categories: event-based parsers (such as SAX and StAX) and tree-based parsers (such as DOM).

Figure 1 illustrates the two major families of parsers for programmatically working with XML. Both event-based parsers and tree-based parsers take an XML document as input, but the two types of parsers treat that XML very differently.

Figure 1 Event versus tree parsing for XML documents.

Event-Based APIs

An event-based API reports parsing events to your application as the XML streams through the parser. With SAX, the parser calls your handler back as it encounters events of interest: start of document, start of element, end of element, and end of document (to name a few). With StAX, your code instead asks the parser for the next event. Either way, writing a SAX or StAX application means writing code that reacts when an element or attribute of interest is encountered in the XML.
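To make the callback idea concrete, here is a minimal sketch using the SAX-style API in Python's standard library (xml.sax). The inventory document, its element and attribute names, and the LowStockHandler and find_low_stock names are all assumptions made up for illustration; they are not taken from the ZwiftBooks system.

```python
import xml.sax

# Hypothetical inventory document; every name and value here is invented.
SAMPLE_XML = """<inventory>
  <book isbn="isbn-001" stock="2"><title>XML Basics</title></book>
  <book isbn="isbn-002" stock="40"><title>Web Services</title></book>
  <book isbn="isbn-003" stock="0"><title>Parsing in Practice</title></book>
</inventory>"""


class LowStockHandler(xml.sax.ContentHandler):
    """Collects the isbn of every <book> whose stock is at or below a threshold."""

    def __init__(self, threshold):
        super().__init__()
        self.threshold = threshold
        self.low_stock = []

    def startElement(self, name, attrs):
        # Called by the parser each time an opening tag is encountered.
        if name == "book" and int(attrs.get("stock", "0")) <= self.threshold:
            self.low_stock.append(attrs["isbn"])


def find_low_stock(xml_text, threshold=5):
    handler = LowStockHandler(threshold)
    # The parser drives the handler; no tree of the document is built.
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.low_stock


if __name__ == "__main__":
    print(find_low_stock(SAMPLE_XML))
```

The handler only ever sees one event at a time, so it has to keep any state it needs (here, the list of matches) itself.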

Tree-Based APIs

A tree-based API such as the W3C’s DOM maps an XML document into an internal tree structure, providing programmatic interfaces for navigating that tree. Methods are available to determine the child and parent elements of nodes as well as to extract the content of elements or attributes. With DOM, it’s also possible to modify the tree and thus create new XML.
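As a rough illustration of tree navigation and modification, the sketch below uses xml.dom.minidom, a lightweight DOM implementation in Python's standard library. The sample document, the alert element, and the add_reorder_alerts function are hypothetical names chosen for this example only.

```python
from xml.dom import minidom

# Hypothetical one-book inventory; names and values are invented.
SAMPLE_XML = (
    '<inventory><book isbn="isbn-001" stock="2">'
    "<title>XML Basics</title></book></inventory>"
)


def add_reorder_alerts(xml_text, threshold=5):
    """Parse into a tree, add an <alert> child to low-stock books, return XML."""
    doc = minidom.parseString(xml_text)
    for title in doc.getElementsByTagName("title"):
        book = title.parentNode  # navigate from child to parent
        if int(book.getAttribute("stock")) <= threshold:
            alert = doc.createElement("alert")
            text = doc.createTextNode("reorder " + title.firstChild.data)
            alert.appendChild(text)
            book.appendChild(alert)  # modify the in-memory tree
    # Serialize the modified tree back to XML text.
    return doc.documentElement.toxml()


if __name__ == "__main__":
    print(add_reorder_alerts(SAMPLE_XML))
```

Because the whole document is in memory, the code can move freely between a title and its enclosing book, which is the kind of context an event handler would have to track by hand.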

Choosing a Parser

The choice of event versus tree parser depends on the application requirements:

  • Event-based parsers are good for extracting an element or attribute from some XML and reacting to it in some way. Since event parsers look at only one small part of an XML document at a time, you can parse very large documents. Even documents in the terabyte range can be handled by a SAX or StAX parser.
  • Tree-based APIs build a navigable internal representation of a document. This approach is useful for a wide range of applications, but has a heavy impact on system resources—especially with large documents or special data-modeling requirements. For example, building a DOM tree, mapping it onto a new data structure, and discarding the original is typically not worth the effort. However, if data context is important, DOM is the way to go.
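The trade-off in the list above does not have to be all or nothing. As one possible middle ground, the sketch below uses xml.dom.pulldom from Python's standard library: it pulls events from the stream one at a time and expands only the elements of interest into small DOM subtrees. This is a Python stand-in chosen for illustration, not SAX, StAX, or the W3C DOM itself, and the sample document and the low_stock_titles function are made-up names.

```python
from xml.dom import pulldom

# Hypothetical inventory document; names and values are invented.
SAMPLE_XML = """<inventory>
  <book isbn="isbn-001" stock="2"><title>XML Basics</title></book>
  <book isbn="isbn-002" stock="40"><title>Web Services</title></book>
</inventory>"""


def low_stock_titles(xml_text, threshold=5):
    """Stream through the document, building a subtree only for low-stock books."""
    titles = []
    events = pulldom.parseString(xml_text)
    for event, node in events:  # the application pulls each event
        if event == pulldom.START_ELEMENT and node.tagName == "book":
            if int(node.getAttribute("stock")) <= threshold:
                # Build a DOM subtree for this one <book> only.
                events.expandNode(node)
                title = node.getElementsByTagName("title")[0]
                titles.append(title.firstChild.data)
    return titles


if __name__ == "__main__":
    print(low_stock_titles(SAMPLE_XML))
```

Memory use stays proportional to the largest expanded element rather than to the whole document, while the code inside the branch still gets tree-style access to that element's children.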