


XML is still a buzzword, so of course it has to be included here.

Xerces 2

If you want to parse XML, you could do a lot worse than Xerces from the Apache XML project. This is yet another Apache project that's used in many other open source systems, and not for lack of alternatives: there are plenty. Xerces' main strength is its Apache pedigree, which is a good indicator of quality; the Apache projects take their software very seriously. Xerces isn't the fastest parser (and if benchmarks are to be believed, it isn't even the fastest open-source parser), but it's nearly defect-free, and is constantly being updated to follow the latest standards.

Xerces is fully JAXP-compliant, and it supports DOM and SAX.
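Because Xerces sits behind JAXP, parsing code never needs to name it directly. The following is a minimal sketch, not taken from the Xerces documentation: it counts elements once with DOM and once with SAX, using whichever JAXP parser the JDK or classpath supplies. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; only the javax.xml, org.w3c.dom, and org.xml.sax calls are standard.
final class JaxpParseSketch {

    /** Counts elements named tagName by building a DOM tree in memory. */
    static int countWithDom(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tagName).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    /** Counts the same elements with SAX, which streams events instead of building a tree. */
    static int countWithSax(String xml, String tagName) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals(tagName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return count[0];
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tools><tool>Xerces</tool><tool>Xalan</tool></tools>";
        System.out.println(countWithDom(xml, "tool")); // prints 2
        System.out.println(countWithSax(xml, "tool")); // prints 2
    }
}
```

Putting Xerces on the classpath (or selecting it through the JAXP factory properties) changes the parser that does the work without changing any of this code.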


dom4j

According to the dom4j web site, dom4j paired with Jaxen (described later) hugely outperforms Xerces and Xalan. The project's data seems to show that if your code uses XPath, you're much better off with the dom4j/Jaxen combination than with the Apache XML software.

The choice really can be about performance in this case, because both dom4j and Xerces support the standard APIs, which means that code written against those APIs stays the same no matter which parser and transformation technology we choose.
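To make the XPath case concrete, here is a minimal sketch of the kind of query those benchmarks are about. It is written against the JDK's own javax.xml.xpath package, which is an assumption made to keep the example free of extra libraries; it is not Jaxen's or dom4j's API, and the class name and sample document are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses the JDK's XPath API, not Jaxen's own.
final class XPathSketch {

    /** Evaluates an XPath expression and returns the text of every matching node. */
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tools>"
                + "<tool type='parser'>Xerces</tool>"
                + "<tool type='xslt'>Xalan</tool>"
                + "<tool type='parser'>dom4j</tool>"
                + "</tools>";
        // Select only the parsers.
        System.out.println(select(xml, "/tools/tool[@type='parser']"));
    }
}
```

The expression stays the same whichever engine evaluates it; what differs between engines is the Java API around it and how quickly the query runs.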


Xalan

Xalan is a tool that allows you to process XML documents using XSLT. It works well, but slowly. I use it to generate my web site statically; using it dynamically would take far too much processing power. My web site only contains static information anyway, so it's no great loss to me, and if you're in the same boat then you may come to the same conclusion.

If you really need real-time transformations of XML data, however (as a lot of people do), then Xalan may not be the right tool for you.


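Jaxen

Jaxen is a standards-compliant XPath engine that works with several Java XML object models, including dom4j. In the dom4j benchmarks, Jaxen was reported to be between 800 and 1,000 times faster than Xalan.

Transformation tools such as Xalan conform to the TrAX API (the javax.xml.transform package in JAXP), so the choice between them isn't going to affect your code, only your performance. Jaxen itself is used through its own XPath API.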
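A minimal sketch of a TrAX transformation shows why the processor can be swapped without touching the code. It uses only javax.xml.transform and runs on whichever XSLT processor the JDK or classpath provides; the class name, stylesheet, and sample document are invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; only the javax.xml.transform calls are standard.
final class TraxSketch {

    /** Applies an XSLT stylesheet to an XML document and returns the result as a string. */
    static String transform(String xml, String xslt) {
        try {
            // The factory lookup is the only place a concrete processor gets chosen.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xslt = "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"text\"/>"
                + "<xsl:template match=\"/\">"
                + "<xsl:for-each select=\"tools/tool\">"
                + "<xsl:value-of select=\".\"/><xsl:text>;</xsl:text>"
                + "</xsl:for-each>"
                + "</xsl:template>"
                + "</xsl:stylesheet>";
        String xml = "<tools><tool>Xerces</tool><tool>Xalan</tool></tools>";
        System.out.println(transform(xml, xslt)); // prints Xerces;Xalan;
    }
}
```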

The fact that both dom4j and Jaxen follow Java API standards is a good thing. If you're looking for a quick performance boost in your code and test cycles on a J2EE project, you could make all of your open source tools use these instead of the Apache XML tools.


Xindice

Pronounced zeen-dee-chay, Xindice is an XML database. If you have very little relational data (that is, data about relationships), and many hierarchical data structures, you'd probably be fine using Xindice. I use both Xindice and MySQL, one for each kind of data. It's important to remember to use a mapping tool rather than parsing all of that data by hand; this helps to ensure a consistent interface across data stores, so that they become transparent to the rest of your code.

There is another benefit to Xindice. If your web site's contents are stored in XML, and the transformations are themselves XML (XSLT stylesheets), you can set up your site so that your gatekeeper servlet queries and extracts both the content and the transformation script straight from Xindice. If you ever need to render to a different format, you write a new transformation script, and tell your servlet the conditions under which to use it.
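The dispatch step in that setup can be sketched as follows. This is a rough illustration under stated assumptions: an in-memory map stands in for the stylesheets that would be fetched from Xindice, no servlet or Xindice API appears, and the class, method, and format names are all hypothetical. Only the javax.xml.transform calls are standard.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical renderer: picks a stylesheet by output format and applies it to the content.
final class RenderDispatchSketch {
    private final TransformerFactory factory = TransformerFactory.newInstance();
    // Stand-in for the transformation scripts that would live in the XML database.
    private final Map<String, Templates> stylesheets = new HashMap<>();

    /** Compiles a stylesheet once and files it under the given format name. */
    void register(String format, String xslt) {
        try {
            stylesheets.put(format,
                    factory.newTemplates(new StreamSource(new StringReader(xslt))));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Bad stylesheet for " + format, e);
        }
    }

    /** Renders the content with the stylesheet registered for the requested format. */
    String render(String format, String contentXml) {
        Templates templates = stylesheets.get(format);
        if (templates == null) {
            throw new IllegalArgumentException("No stylesheet registered for " + format);
        }
        try {
            StringWriter out = new StringWriter();
            // A fresh Transformer is created per request; the compiled Templates is reused.
            templates.newTransformer().transform(
                    new StreamSource(new StringReader(contentXml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Rendering failed for " + format, e);
        }
    }

    /** Wraps a template body in a one-rule stylesheet (sample data for the demo only). */
    static String stylesheet(String method, String body) {
        return "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:output method=\"" + method + "\"/>"
                + "<xsl:template match=\"/\">" + body + "</xsl:template>"
                + "</xsl:stylesheet>";
    }

    public static void main(String[] args) {
        RenderDispatchSketch renderer = new RenderDispatchSketch();
        renderer.register("text", stylesheet("text", "<xsl:value-of select='page/title'/>"));
        renderer.register("html", stylesheet("html", "<h1><xsl:value-of select='page/title'/></h1>"));
        String page = "<page><title>Hello</title></page>";
        System.out.println(renderer.render("text", page));
        System.out.println(renderer.render("html", page));
    }
}
```

Supporting a new output format then comes down to one more register call with a new stylesheet; the content and the rendering code stay as they are.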


Castor

We mentioned mapping tools earlier in the context of XML persistence. Castor is one such tool, and so much more.

Castor is a relational-hierarchical-object mapping tool from Exolab. It's also reputed to be the model for Sun's JAXB program. JAXB (Java Architecture for XML Binding) is being rewritten because of early problems moving from descriptors to schemas, and in this move Castor is being seen as the current technology leader. There were even early rumors that Sun would adopt Castor as its standard rather than develop a new one, but we aren't that lucky.

Castor, like many of the other tools mentioned in this article, is fantastic. Point it at a schema and it generates domain-like classes to marshal and unmarshal XML data. If the default behavior isn't good enough, you can script certain key elements of it. It's an extremely fast, clean, and efficient way to access your XML data.

Castor will also quite happily bind through JDBC to a database (or any other JDBC-compliant data source). This means that you can use a single tool to persist your data to the majority of places you'll ever want to persist it.

Using this tool, I haven't hand-coded any interaction with XML for the last six months. It's a really useful time-saver.
