

The major complexities in your design process may have little to do with XML itself; they come with adding any sort of new functionality to an existing application. If your code is well-designed, modular, cohesive, and coherent, adding XML support may not be much of a problem. However, if you're dealing with a pile of spaghetti, you're going to have a tough time. In either situation, the specific changes you have to make depend entirely on the design of the application. Some design decisions are directly related to XML, however, and most of those depend on the constraints identified during the analysis phase.

The nature of the interface may seem like a big consideration, and it will be for the overall design of the application changes. However, the parts of the application that deal strictly with XML are largely independent of the specific interface. This is because most conceivable interfaces (whether a SOAP messaging facility for a web service, a message queuing system for in-house application integration, or a batch import/export to the file system) deal with XML data as a serialized character stream. To consume that character stream, your application must parse the XML; to produce one, your application must take the XML instance document data (in whatever internal form it's stored) and serialize it. Your task is to determine how you get that character stream to and from your chosen interface.
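
As a rough illustration of that separation, here is a minimal sketch in Python using the standard library's `xml.dom.minidom`. The function names `read_document` and `write_document` and the sample `order` document are hypothetical, chosen only for this example. The XML code sees nothing but a stream; where the stream comes from is the interface's business.

```python
import io
from xml.dom import minidom


def read_document(stream):
    """Parse XML from any readable text or byte stream into a DOM document."""
    return minidom.parseString(stream.read())


def write_document(document, stream):
    """Serialize the DOM document's root element to a character stream."""
    stream.write(document.documentElement.toxml())


# Here the stream is an in-memory buffer; it could equally be a file,
# a socket, or the body of a queued message.
incoming = io.StringIO("<order id='42'><item>widget</item></order>")
document = read_document(incoming)

outgoing = io.StringIO()
write_document(document, outgoing)
serialized = outgoing.getvalue()
```

Swapping the in-memory buffers for a file or network stream would leave both functions unchanged, which is the point the paragraph above makes.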

For design considerations involving validation, there are two main issues to consider:

  • An XML parser can pass a well-formed document to your application even if the document doesn't comply with its schema.

  • At least for now, schema validation is a Boolean proposition; there are no shades of gray between success and failure. A validation error might be as major as an unexpected element or an overall structure that doesn't bear any relation to the schema, or as minor as invalid content in an enumerated element or attribute. Many parsing APIs, as they are commonly used, stop and return a failure after encountering the first error, so you don't know what other errors might be lurking.

The point is that if you have a requirement to process documents that are not schema-valid, such as "in suspense" purchase orders, safe programming requires that you make very few assumptions about the content of these documents.
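
A minimal sketch of that defensive style, in Python with the standard library's `xml.etree.ElementTree`, might look like the following. The element names `PurchaseOrder`, `OrderNumber`, and `Quantity`, and the function `summarize_order`, are assumptions made up for this example; the parser here checks well-formedness only and performs no schema validation.

```python
import xml.etree.ElementTree as ET


def summarize_order(xml_text):
    """Return (fields, problems) for a well-formed but possibly invalid order."""
    root = ET.fromstring(xml_text)  # raises ParseError only if not well-formed
    fields, problems = {}, []

    if root.tag != "PurchaseOrder":
        problems.append(f"unexpected root element: {root.tag}")

    # Look each value up individually; never assume a child is present.
    for name in ("OrderNumber", "Quantity"):
        text = root.findtext(name)
        if text is None or not text.strip():
            problems.append(f"missing {name}")
        else:
            fields[name] = text.strip()

    # Convert types defensively instead of trusting the schema's datatype.
    if "Quantity" in fields:
        raw = fields.pop("Quantity")
        try:
            fields["Quantity"] = int(raw)
        except ValueError:
            problems.append(f"non-numeric Quantity: {raw!r}")

    return fields, problems


fields, problems = summarize_order(
    "<PurchaseOrder><Quantity>ten</Quantity></PurchaseOrder>"
)
```

For this sample, `problems` records both the missing `OrderNumber` and the non-numeric `Quantity`, so the application sees every issue it checked for rather than only the first.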

Document size and portability requirements can constrain your choice of XML parsing API. Simple API for XML (SAX) is still probably your best choice if your application must consume very large instance documents. If you don't have that requirement, the DOM is the most mature and flexible API, and Level 3 of the DOM finally offers standard approaches for loading and saving documents. Both SAX and the DOM are widely implemented, so portability shouldn't be too much of a concern. However, you'll save yourself a lot of trouble by finding a single parsing API, such as Xerces-C or Xerces-J, that runs or can be built on all of your target platforms.
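
To show why SAX suits large documents, here is a small sketch using Python's standard `xml.sax` binding. The `ElementCounter` class, the `count_elements` function, and the sample data are hypothetical. The handler receives one event per element as the parser reads the stream, so no in-memory tree of the document is ever built.

```python
import io
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Count elements by name as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per start tag; only the running totals are retained.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_elements(stream):
    """Parse a byte or character stream and return element counts by name."""
    handler = ElementCounter()
    xml.sax.parse(stream, handler)
    return handler.counts


sample = io.BytesIO(b"<orders><order/><order/><order/></orders>")
counts = count_elements(sample)
```

The same handler would work on a multi-gigabyte file opened in binary mode, since memory use depends on what the handler keeps, not on document size.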

If portability isn't a constraint, you may save yourself some coding effort by using one of the newer class-binding APIs such as Java Architecture for XML Binding (JAXB). These tools read schemas and create classes for languages such as C++, Java, and C#, allowing you to process data in XML documents the same way as you do any other object in your program. However, as with the DOM, they may not be appropriate for processing very large instance documents, because they tend to load the complete document into memory. In addition, these tools are still relatively immature. Some still don't handle certain schema constructs gracefully, such as a choice content model in a complex type.
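
To give a feel for the programming model, the sketch below hand-writes, in Python, the kind of classes a binding tool would typically generate from a schema. It is a stand-in for illustration, not the output of JAXB or any other tool, and the `Invoice` and `LineItem` names and attributes are assumptions.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class LineItem:
    sku: str
    quantity: int


@dataclass
class Invoice:
    number: str
    items: list

    @classmethod
    def from_xml(cls, xml_text):
        """Unmarshal: the whole document is read into memory as objects."""
        root = ET.fromstring(xml_text)
        items = [
            LineItem(sku=e.get("sku"), quantity=int(e.get("quantity")))
            for e in root.findall("LineItem")
        ]
        return cls(number=root.get("number"), items=items)


invoice = Invoice.from_xml(
    '<Invoice number="INV-1">'
    '<LineItem sku="A1" quantity="2"/><LineItem sku="B2" quantity="5"/>'
    "</Invoice>"
)
# After unmarshalling, the data is handled like any other object.
total_units = sum(item.quantity for item in invoice.items)
```

Note that `from_xml` materializes every line item at once, which is the memory cost mentioned above.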

Finally, you always have the choice of writing your own parser and serializer, rather than using someone else's. You may have reasonable requirements for doing this, but for most of us it would be like writing our own web browser instead of using Internet Explorer or Mozilla.

Beyond these considerations, there are a few other high-level design issues. Most APIs require some up-front housekeeping, such as creating a DocumentBuilder with the Java API for XML Processing (JAXP), or initializing the COM library if you're using MSXML. The API may also require you to release resources that you no longer need. You have to determine how localized within the application to make your XML support. If you do this housekeeping at a fairly high level (say, in application startup and shutdown), the associated resources can be available globally, and you save the processing overhead of creating and releasing them on demand. The trade-off is that you also have to manage them and make them accessible globally. For some situations, such as exporting a batch of invoices as XML documents, it may make more sense to keep all of the XML-related code fairly localized.
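
A localized version of the invoice export might be sketched as follows, in Python with `xml.dom.minidom`. Here each call acquires, uses, and releases its own DOM resources; the `export_invoice` function and the `Invoice` and `Amount` names are hypothetical. In minidom, the release step is the document's `unlink()` method, which the `with` block calls on exit.

```python
from xml.dom import minidom


def export_invoice(number, amount):
    """Build, serialize, and release one invoice document."""
    impl = minidom.getDOMImplementation()
    # Leaving the with block calls unlink(), releasing the DOM tree.
    with impl.createDocument(None, "Invoice", None) as document:
        root = document.documentElement
        root.setAttribute("number", number)
        amount_node = document.createElement("Amount")
        amount_node.appendChild(document.createTextNode(f"{amount:.2f}"))
        root.appendChild(amount_node)
        return root.toxml()  # serialized before the tree is released


batch = [("INV-1", 19.5), ("INV-2", 7.0)]
exported = [export_invoice(number, amount) for number, amount in batch]
```

Nothing XML-related outlives a single call, so the rest of the application needs no knowledge of the API. The global alternative would create the shared objects once at startup and pass them in, trading this isolation for less per-call overhead.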

Finally, be prepared to do some prototyping and throw away approaches that don't work, particularly if this is your first foray into XML. You probably won't hit any walls and need to backtrack until you get into implementation, so let's talk about that next.
