XML Software Infrastructure
Executive Summary
The previous chapters examined the theory behind the XML paradigm. But realizing practical benefits requires successfully deploying XML-based systems. As with any software technology, the ability to leverage off-the-shelf infrastructure greatly improves your chances of successful deployment. Luckily, the XML paradigm's high degree of standardization and openness has resulted in rich layers of software infrastructure for both data-oriented and content-oriented applications.
This chapter focuses on the basic infrastructure used in many XML projects. This core functionality supports the rest of the application. Of course, not every application needs every piece of infrastructure, and there are often choices within an infrastructure category. Understanding when different components are necessary and the issues in choosing a particular component will help you ensure that your projects have a strong foundation for success. Figure 5-1 presents a conceptual model of the basic infrastructure.
As you can see, fundamental components provide the basic services for the rest of the categories. Fundamental components, such as XML processors that implement the XML specification and XSLT processors that implement the XSLT specification, are already widely available from open source projects and software vendors. They have had plenty of time to mature and provide a robust infrastructure foundation.
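To make the role of an XML processor concrete, here is a minimal sketch using the parser in Python's standard library (`xml.etree.ElementTree`). The sample document, its element names, and the helper functions `list_titles` and `is_well_formed` are illustrative assumptions, not part of any particular product discussed in this chapter.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative.
SAMPLE = """<catalog>
  <book id="b1"><title>XML Basics</title><price>29.95</price></book>
  <book id="b2"><title>XSLT in Practice</title><price>34.50</price></book>
</catalog>"""


def list_titles(xml_text):
    """Parse the text and return (id, title) pairs for each book element."""
    root = ET.fromstring(xml_text)
    return [(book.get("id"), book.findtext("title"))
            for book in root.findall("book")]


def is_well_formed(xml_text):
    """Return True if the processor accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(list_titles(SAMPLE))
    print(is_well_formed("<a><b></a>"))  # mismatched tags are rejected
```

The application code never touches angle brackets directly. The processor handles parsing and well-formedness checking, which is the kind of service the rest of the infrastructure builds on.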
Because XML is about data and content, these fundamental components implicitly assume they have a means of persistent storage. A simple application may use the local filesystem, but one that is more sophisticated requires mechanisms with greater reliability, scalability, and flexibility. There is no single optimal storage solution. The appropriate choice of database management system, content management system, or native XML store depends on the intended purpose and access pattern of your information.
Enterprise and Web architectures depend on various types of servers to manage data, execute behavior, and distribute content. For developers and authors to move XML documents around in this environment, servers must be at least XML-aware and may need XML-specific functions. Data servers must provide interfaces for accessing data in an XML format. Application servers must deliver scalable execution of fundamental components. Content servers must facilitate the delivery, distribution, and cataloging of XML-packaged content.
Above the server layer, infrastructure diverges along two paths: the data path and the content path. The data path includes XML components related to the generation, processing, and movement of machine-readable XML data. Development tools enable programmers to rapidly create software that moves data between internal data structures and XML documents. Transformation tools enable programmers to convert data between XML formats or between XML formats and other data sources. Web services components help programmers package, process, and interpret XML-encoded messages.
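As an illustration of what development tools on the data path automate, the following hand-written sketch moves data between an internal data structure and an XML document. The `Order` class, its fields, and the `order` document format are hypothetical examples chosen for this sketch. A real binding tool would typically generate equivalent code from a schema rather than require it to be written by hand.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Order:
    """Hypothetical internal data structure used only for illustration."""
    order_id: str
    customer: str
    total: float


def order_to_xml(order):
    """Serialize an Order into an XML string (assumed format)."""
    root = ET.Element("order", {"id": order.order_id})
    ET.SubElement(root, "customer").text = order.customer
    ET.SubElement(root, "total").text = f"{order.total:.2f}"
    return ET.tostring(root, encoding="unicode")


def order_from_xml(xml_text):
    """Rebuild an Order from the XML produced by order_to_xml."""
    root = ET.fromstring(xml_text)
    return Order(
        order_id=root.get("id"),
        customer=root.findtext("customer"),
        total=float(root.findtext("total")),
    )


if __name__ == "__main__":
    original = Order("A17", "Acme", 125.5)
    document = order_to_xml(original)
    print(document)
    print(order_from_xml(document) == original)
```

Even in this small case, the mapping code must be kept in step with both the class and the document format, which is why tools that derive one from the other can speed development.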
The content path includes components related to the authoring, presentation, and distribution of human-readable XML content. Authoring tools facilitate the creation and editing of XML documents. Layout tools facilitate the presentation of XML documents in a variety of media. Content management components facilitate collaboration, packaging, and delivery of XML documents.
The two paths of XML infrastructure merge again with design tools. The process of information design is what sets XML apart from other software technologies. This process gives the programmer or author the means to design formats for XML, whether as data or as content. In fact, it is the information design process that determines whether your XML documents are data or content. Although this process should occur very early in application development, its results drive the choice of other components, which is why design tools appear as the top layer in Figure 5-1.
Figure 5-1: Conceptual Model of XML Software Infrastructure