Alternative API: SAX

Benoit Marchal discusses how to read XML documents with SAX, the powerful API, in this sample chapter from XML by Example.
This sample chapter is excerpted from XML by Example, by Benoit Marchal.

In the previous chapter you learned how to use DOM, an object-based API for XML parsers. This chapter complements the discussion on XML parsers with an introduction to SAX.

You will see that SAX

  • Is an event-based API.

  • Operates at a lower level than DOM.

  • Gives you more control than DOM.

  • Is almost always more efficient than DOM.

  • But, unfortunately, requires more work than DOM.

Why Another API?

Don't be fooled by the name. SAX may be the Simple API for XML but it requires more work than DOM. The reward—tighter code—is well worth the effort.

» The "What Is a Parser?" section in Chapter 7, "The Parser and DOM" (page 211), introduced you to XML parsers.

In the previous chapter, you learned how to integrate a parser with an application. Figure 8.1 shows the two components of a typical XML program:

  • The parser, a software component that decodes XML files on behalf of the application. Parsers effectively shield developers from the intricacies of the XML syntax.

  • The application, which consumes the file content.

Figure 8.1: Architecture of an XML program.

Obviously, the application can be simple (in Chapter 7, we saw an application to convert prices between euros and dollars) or very complex, such as a distributed e-commerce application to order goods over the Internet.

The previous chapter and this chapter concentrate on the dotted line in Figure 8.1—the interface or API (Application Programming Interface) between the parser and the application.

Object-Based and Event-Based Interfaces

In Chapter 7, "The Parser and DOM," you learned that there are two classes of interfaces for parsers: object-based and event-based interfaces.

» The section "Getting Started with DOM" in Chapter 7 introduced DOM as the standard API for object-based parsers. DOM was developed and published by the W3C.

DOM is an object-based interface: it communicates with the application by explicitly building a tree of objects in memory. The tree of objects is an exact map of the tree of elements in the XML file.

DOM is simple to learn and use because it closely matches the underlying XML document. It is also ideal for what I call XML-centric applications, such as browsers and editors. XML-centric applications manipulate XML documents for the sake of manipulating XML documents.

However, for most applications, processing XML documents is just one task among many others. For example, an accounting package might import XML invoices, but it is not its primary activity. Balancing accounts, tracking expenditures, and matching payments against invoices are.

Chances are the accounting package already has its own data structure, most likely a database. The DOM model is ill suited to that case because the application would have to maintain two copies of the data in memory (one in the DOM tree and one in the application's own structure).

At the very least, it's inefficient. It might not be a major problem for desktop applications but it can bring a server to its knees.

SAX is the sensible choice for non–XML-centric applications. Indeed, SAX does not explicitly build the document tree in memory. Instead, it lets the application store the data in whatever way is most efficient for it.

Figure 8.2 illustrates how an application can map between an XML tree and its own data structure.

Figure 8.2: Mapping the XML structure to the application structure.
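
As a rough illustration of this mapping, the following sketch copies each price-quote element of the price list used later in this chapter (Listing 8.1) straight into an application-side list, without building any intermediate tree. The names QuoteLoader, Quote, and load are hypothetical, and the sketch assumes the JAXP parser bundled with the JDK (SAXParserFactory), which is not necessarily the parser setup used in the book's own listings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: fills the application's own list as events arrive.
class QuoteLoader extends DefaultHandler {
    // Hypothetical application-side structure for one vendor's quote.
    static final class Quote {
        final String vendor;
        final double price;
        Quote(String vendor, double price) {
            this.vendor = vendor;
            this.price = price;
        }
    }

    // Copy of Listing 8.1, inlined so the sketch runs on its own.
    static final String PRICE_LIST =
        "<?xml version=\"1.0\"?>"
        + "<xbe:price-list xmlns:xbe=\"http://www.psol.com/xbe2/listing8.1\">"
        + "<xbe:product>XML Training</xbe:product>"
        + "<xbe:price-quote price=\"999.00\" vendor=\"Playfield Training\"/>"
        + "<xbe:price-quote price=\"699.00\" vendor=\"XMLi\"/>"
        + "<xbe:price-quote price=\"799.00\" vendor=\"WriteIT\"/>"
        + "<xbe:price-quote price=\"1999.00\" vendor=\"Emailaholic\"/>"
        + "</xbe:price-list>";

    final List<Quote> quotes = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Attributes arrive together with the opening-tag event.
        if ("price-quote".equals(localName)) {
            quotes.add(new Quote(atts.getValue("vendor"),
                                 Double.parseDouble(atts.getValue("price"))));
        }
    }

    // Parses the XML text and returns the application-side list.
    static List<Quote> load(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so localName is "price-quote"
            QuoteLoader handler = new QuoteLoader();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.quotes;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the price list", e);
        }
    }

    public static void main(String[] args) {
        for (Quote q : load(PRICE_LIST)) {
            System.out.println(q.vendor + ": " + q.price);
        }
    }
}
```

The only copy of the data that survives parsing is the list of Quote objects; a real application would replace that list with its own structure, such as rows in a database.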

Event-Based Interfaces

As the name implies, an event-based parser sends events to the application. The events are similar to user-interface events such as ONCLICK (in a browser) or AWT/Swing events (in Java).

Events alert the application that something happened and the application needs to react. In a browser, events are typically generated in response to user actions: a button fires an ONCLICK event when the user clicks.

With an XML parser, events are not related to user actions, but to elements in the XML document being read. There are events for

  • Element opening and closing tags

  • Content of elements

  • Entities

  • Parsing errors

Figure 8.3 shows how the parser generates events as it reads the document.

Figure 8.3: The parser generates events.

Listing 8.1 is a price list in XML. It lists the prices charged by various companies for XML training. The structure of this document is shown in Figure 8.4.

Listing 8.1: pricelist.xml

<?xml version="1.0"?>
<xbe:price-list xmlns:xbe="http://www.psol.com/xbe2/listing8.1">
  <xbe:product>XML Training</xbe:product>
  <xbe:price-quote price="999.00" vendor="Playfield Training"/>
  <xbe:price-quote price="699.00" vendor="XMLi"/>
  <xbe:price-quote price="799.00" vendor="WriteIT"/>
  <xbe:price-quote price="1999.00" vendor="Emailaholic"/>
</xbe:price-list>

Figure 8.4: The structure of the price list.

The XML parser reads this document and interprets it. Whenever it recognizes something in the document, it generates an event.

When reading Listing 8.1, the parser first reads the XML declaration and generates an event for the beginning of the document.

When it encounters the first opening tag, <xbe:price-list>, the parser generates its second event to notify the application that it has encountered the starting tag for a price-list element.

Next, the parser sees the opening tag for the product element (for simplicity, I'll ignore the namespaces and indenting whitespaces in the rest of this discussion) and it generates its third event.

After the opening tag, the parser sees the content of the product element: XML Training, which results in yet another event.

The next event indicates the closing tag for the product element. The parser has now completely parsed the product element. It has fired five events so far: one for the beginning of the document, one for the price-list opening tag, and three for the product element (opening tag, content, and closing tag).

The parser now moves to the first price-quote element. It generates two events for each price-quote element: one event for the opening tag and one event for the closing tag.

Yes, even though the closing tag is reduced to the / character in the opening tag, the parser still generates a closing event.

There are four price-quote elements, so the parser generates eight events as it parses them. Finally, the parser meets price-list's closing tag and generates its last two events: closing price-list and end of document.
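
The sequence described above can be observed with a small handler that records every event in the order it arrives. This is a minimal sketch under some assumptions: EventLog and record are hypothetical names, the parser is the JAXP implementation bundled with the JDK, and, as in the walk-through, whitespace used for indenting is ignored. SAX allows a parser to deliver text in more than one chunk, so the number of text events is not strictly guaranteed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that records every event it receives, in order.
class EventLog extends DefaultHandler {
    // Copy of Listing 8.1, inlined so the sketch runs on its own.
    static final String PRICE_LIST =
        "<?xml version=\"1.0\"?>\n"
        + "<xbe:price-list xmlns:xbe=\"http://www.psol.com/xbe2/listing8.1\">\n"
        + "  <xbe:product>XML Training</xbe:product>\n"
        + "  <xbe:price-quote price=\"999.00\" vendor=\"Playfield Training\"/>\n"
        + "  <xbe:price-quote price=\"699.00\" vendor=\"XMLi\"/>\n"
        + "  <xbe:price-quote price=\"799.00\" vendor=\"WriteIT\"/>\n"
        + "  <xbe:price-quote price=\"1999.00\" vendor=\"Emailaholic\"/>\n"
        + "</xbe:price-list>\n";

    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("start document");
    }

    @Override
    public void endDocument() {
        events.add("end document");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start " + localName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Also fired for empty elements written as <.../>.
        events.add("end " + localName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip the whitespace that only serves to indent the document.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text " + text);
        }
    }

    // Parses the XML text and returns the recorded events.
    static List<String> record(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // report names without the xbe: prefix
            EventLog log = new EventLog();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), log);
            return log.events;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the price list", e);
        }
    }

    public static void main(String[] args) {
        for (String event : record(PRICE_LIST)) {
            System.out.println(event);
        }
    }
}
```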

As Figure 8.5 illustrates, taken together, the events describe the document tree to the application. An opening tag event means "going one level down in the tree" whereas a closing tag event means "going one level up in the tree."

Figure 8.5: How the parser builds the tree implicitly.
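
One way to picture this is a handler that keeps nothing but a depth counter: it adds one on every opening tag and subtracts one on every closing tag. The sketch below uses that counter to print an indented outline. TreeOutline, outlineOf, and the abbreviated sample document are assumptions made for illustration, as is the use of the JDK's JAXP parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that tracks the current level in the tree.
class TreeOutline extends DefaultHandler {
    // Abbreviated copy of Listing 8.1 (two quotes instead of four).
    static final String SAMPLE =
        "<xbe:price-list xmlns:xbe=\"http://www.psol.com/xbe2/listing8.1\">"
        + "<xbe:product>XML Training</xbe:product>"
        + "<xbe:price-quote price=\"999.00\" vendor=\"Playfield Training\"/>"
        + "<xbe:price-quote price=\"699.00\" vendor=\"XMLi\"/>"
        + "</xbe:price-list>";

    private int depth = 0;
    final StringBuilder outline = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Indent by the current depth, then go one level down.
        for (int i = 0; i < depth; i++) {
            outline.append("  ");
        }
        outline.append(localName).append('\n');
        depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // A closing tag takes us one level back up.
        depth--;
    }

    // Parses the XML text and returns one indented line per element.
    static String outlineOf(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            TreeOutline handler = new TreeOutline();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.outline.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(outlineOf(SAMPLE));
    }
}
```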


An event-based interface is the most natural interface for a parser: It simply has to report what it sees.

Note that the parser passes enough information to build the tree of the XML document but, unlike a DOM parser, it does not explicitly build the tree.


If needed, the application can build a DOM tree from the events it receives from the parser. In fact, several DOM parsers are built on top of a SAX parser.
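
To suggest how a tree can be assembled from events, here is a minimal sketch that keeps a stack of the elements currently open. The Node class is a hypothetical, much simplified stand-in for a real DOM node, and TreeBuilder and build are likewise illustrative names; this is not how any particular DOM parser is implemented. The parser is assumed to be the JAXP implementation bundled with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that turns the event stream into a small tree.
class TreeBuilder extends DefaultHandler {
    // Hypothetical, simplified tree node (not a DOM node).
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        Node(String name) {
            this.name = name;
        }
    }

    // Abbreviated copy of Listing 8.1 (two quotes instead of four).
    static final String SAMPLE =
        "<xbe:price-list xmlns:xbe=\"http://www.psol.com/xbe2/listing8.1\">"
        + "<xbe:product>XML Training</xbe:product>"
        + "<xbe:price-quote price=\"999.00\" vendor=\"Playfield Training\"/>"
        + "<xbe:price-quote price=\"699.00\" vendor=\"XMLi\"/>"
        + "</xbe:price-list>";

    private final Deque<Node> open = new ArrayDeque<>(); // elements not yet closed
    Node root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Node node = new Node(localName);
        if (open.isEmpty()) {
            root = node;                    // first opening tag: the root
        } else {
            open.peek().children.add(node); // child of the innermost open element
        }
        open.push(node);                    // one level down
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();                         // one level up
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) {
            open.peek().text.append(ch, start, length);
        }
    }

    // Parses the XML text and returns the root of the resulting tree.
    static Node build(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            TreeBuilder builder = new TreeBuilder();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), builder);
            return builder.root;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample", e);
        }
    }

    public static void main(String[] args) {
        Node root = build(SAMPLE);
        System.out.println(root.name + " has " + root.children.size() + " children");
    }
}
```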

Why Use Event-Based Interfaces?

Now I'm sure you're confused. Which type of API should you use, SAX or DOM, and when? Unfortunately, there is no clear-cut answer to this question. Neither API is intrinsically better; they serve different needs.

The rule of thumb is to use SAX when you need more control and DOM when you want increased convenience. For example, DOM is popular with scripting languages.

The main reason to adopt SAX is efficiency. SAX does fewer things than DOM but it gives you more control over the parsing. Of course, if the parser does less work, it means you (the developer) have more work to do.

Furthermore, as already discussed, SAX consumes fewer resources than DOM, simply because it does not need to build the document tree.

In the early days of XML, DOM benefited from being the official, W3C-approved API. Increasingly, developers trade convenience for power and turn to SAX.

The major limitation of SAX is that it is not possible to navigate backward in the document. Indeed, after firing an event, the parser forgets about it. As you will see, the application must explicitly buffer those events it is interested in.
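
As a small illustration of such buffering, the sketch below remembers the product name, which appears only once near the top of the price list, so that it is still available when the price-quote events arrive later. The names OfferCollector and collect, the abbreviated sample document, and the use of the JDK's JAXP parser are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that buffers the one piece of data it needs later.
class OfferCollector extends DefaultHandler {
    // Abbreviated copy of Listing 8.1 (two quotes instead of four).
    static final String SAMPLE =
        "<xbe:price-list xmlns:xbe=\"http://www.psol.com/xbe2/listing8.1\">"
        + "<xbe:product>XML Training</xbe:product>"
        + "<xbe:price-quote price=\"999.00\" vendor=\"Playfield Training\"/>"
        + "<xbe:price-quote price=\"699.00\" vendor=\"XMLi\"/>"
        + "</xbe:price-list>";

    private final StringBuilder buffer = new StringBuilder();
    private boolean inProduct = false;
    private String product;               // remembered after the parser moves on
    final List<String> offers = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("product".equals(localName)) {
            inProduct = true;
            buffer.setLength(0);          // start collecting the product name
        } else if ("price-quote".equals(localName)) {
            // The product events are long gone; only our own copy remains.
            offers.add(product + " from " + atts.getValue("vendor")
                       + " at " + atts.getValue("price"));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append rather than assign.
        if (inProduct) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("product".equals(localName)) {
            product = buffer.toString().trim();
            inProduct = false;
        }
    }

    // Parses the XML text and returns one line per offer.
    static List<String> collect(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            OfferCollector handler = new OfferCollector();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.offers;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample", e);
        }
    }

    public static void main(String[] args) {
        for (String offer : collect(SAMPLE)) {
            System.out.println(offer);
        }
    }
}
```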

Of course, whether it implements the SAX or DOM API, the parser does a lot of useful work: It reads the document, enforces the XML syntax, and resolves entities—to name just a few. A validating parser also enforces the document schema.
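
The syntax checking comes for free, even when the application ignores every event. In the following sketch, a handler with no callbacks of its own is enough to tell a well-formed document from a malformed one. WellFormedCheck and isWellFormed are hypothetical names, and the sketch assumes the JAXP parser bundled with the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: lets the parser enforce the XML syntax for us.
class WellFormedCheck {
    // Returns true if the parser accepts the text, false if it reports an error.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                // DefaultHandler ignores content events and rethrows fatal errors.
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Syntax errors surface as a SAXException (usually a SAXParseException).
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser setup or input failure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<list><item/></list>")); // closing tags match
        System.out.println(isWellFormed("<list><item></list>"));  // item is never closed
    }
}
```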

There are many reasons to use a parser, and you should master both APIs, SAX and DOM. Knowing both gives you the flexibility to choose the better API for the task at hand. Fortunately, modern parsers support both APIs.
