Seven Steps to XML Mastery: About This Series
It’s been just about eight years now since XML arrived on the scene in the form of a W3C recommendation outlining the rules for writing one’s own tag-based language. Before XML there was SGML, another tag description language, but one considered too hefty to jump-start the new breed of data-centric apps just beginning to emerge from the Web.
Now, of course, XML is ubiquitous. XML vocabularies have been defined for everything from human resources data to RSS feeds and SOAP envelopes. The main reason for XML’s success is its simplicity, which has fostered numerous vocabularies along with a broad range of support tools and associated specs. Together, these have enabled developers to apply XML to a wide variety of tasks.
Figure 1 illustrates how XML sits at the core of a family of related technologies contributing to XML’s power and range of applications. Trying to get a handle on all these technologies can be tricky, as I’ve learned over the past few years giving seminars and designing undergraduate and graduate classes in XML at SMU in Dallas. One thing I’ve learned is that, while technical mastery of the details of the XML specification is useful on a syntactic level, real learning comes from trying to use XML to solve problems. In trying to build XML solutions, you come to see how XML’s simplicity and support systems make it easy for a developer to turn out sophisticated apps with much less heartache and pain than with traditional development.
Figure 1 XML and the family of XML technologies.
In trying to convey XML’s broad capabilities, I’ve found it useful to approach the study of XML and its family of technologies by structuring things around a seven-step program.
- Read before you write. Just as children learn to read before
they can write, we’ll begin our study of XML syntax and structure
by learning to read, deconstructing some current XML vocabularies.
First, we’ll look at the structure of Rich Site Summary (RSS) documents and how RSS magic works. Then we’ll take a look at how Scalable Vector Graphics (SVG) defines a lean XML format for drawing complex graphics. As we examine these XML vocabularies, we’ll look at how they use elements and attributes, as well as some of the other components you can add to an XML document, such as processing instructions and CDATA sections. With an ability to read XML, you’re ready to move on to more ambitious tasks.
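To make "reading" concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The feed itself, including its URLs, titles, and the `feed.css` stylesheet, is invented for illustration and is not taken from any real site. It shows an attribute (`version`), nested elements, a processing instruction, and a CDATA section.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document, made up for this example.
FEED = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="feed.css"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>A made-up channel</description>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description><![CDATA[Markup like <b>this</b> needs no escaping here.]]></description>
    </item>
  </channel>
</rss>"""


def read_feed(text):
    """Return the feed version, channel title, and a list of item tuples."""
    root = ET.fromstring(text)          # the <rss> element
    channel = root.find("channel")      # first child element named <channel>
    items = [
        (item.findtext("title"), item.findtext("link"), item.findtext("description"))
        for item in channel.findall("item")
    ]
    return {
        "version": root.get("version"),        # attribute on <rss>
        "title": channel.findtext("title"),    # text content of <title>
        "items": items,
    }


feed = read_feed(FEED)
```

The default ElementTree parser skips the processing instruction, and the CDATA section arrives as ordinary text, so the `<b>` markup inside it is returned as literal characters rather than as child elements.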
- Display for the Web. With some foundational XML technology
under your belt, you’ll be ready to go to work writing some XML and
getting things set up for web display. To do this, we’ll set you up as a
consultant for a hot new startup company that’s looking to take the Web by
storm. You’ll be working for the up-and-coming ZwiftBooks Corp, where
you’ll jump in to create some XML vocabularies and work with a web design
team to establish the company’s web presence.
The trick here will be to bring company XML data in line with the Cascading Style Sheets (CSS) and XHTML of the design team. Once an interface has been defined between the development and web teams, the web team can work their design magic without worrying about stepping on the data. We’ll walk through the steps involved in transforming company XML data into a form suitable for the Web.
- Transform with XSLT. The key to XML manipulation is XSLT, a powerful XML transformation technology that processes XML input and can generate XML, HTML, or plain-text output. In this step, we’ll set things up so that XSLT does some of the heavy lifting on the detail work necessary to generate our company web pages automatically. As we’ll see, XSLT is well suited to transforming XML into a form suitable for web display, such as XHTML. Then you’ll learn how to put all this in play by leveraging the power of scripting languages to execute our XSLT transforms painlessly, creating a truly dynamic web site for ZwiftBooks Corp.
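The kind of mapping an XSLT stylesheet expresses, from data-oriented XML to XHTML, can be previewed with a small sketch. Python's standard library has no XSLT processor, so the code below hand-codes the transformation with `xml.etree.ElementTree` instead; it is a stand-in for the idea, not XSLT itself. The catalog format, the ISBNs, and the `catalog_to_xhtml` name are assumptions invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical ZwiftBooks catalog; element names and values are made up.
CATALOG = """<catalog>
  <book isbn="0-000-00000-1"><title>XML Basics</title><price>29.95</price></book>
  <book isbn="0-000-00000-2"><title>Styling Data</title><price>34.50</price></book>
</catalog>"""

XHTML_NS = "http://www.w3.org/1999/xhtml"


def catalog_to_xhtml(catalog_xml):
    """Turn each <book> into an XHTML list item, roughly what a template rule would do."""
    catalog = ET.fromstring(catalog_xml)
    html = ET.Element(f"{{{XHTML_NS}}}html")
    body = ET.SubElement(html, f"{{{XHTML_NS}}}body")
    ul = ET.SubElement(body, f"{{{XHTML_NS}}}ul")
    for book in catalog.findall("book"):
        # The class attribute gives the design team a hook for CSS.
        li = ET.SubElement(ul, f"{{{XHTML_NS}}}li", {"class": "book"})
        li.text = f"{book.findtext('title')} ({book.findtext('price')})"
    return ET.tostring(html, encoding="unicode")


page = catalog_to_xhtml(CATALOG)
```

In an XSLT version, the loop body would become a template matching `book`, and the surrounding `html`, `body`, and `ul` elements would come from a template matching the document root.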
- Apply parsing power. In step 4, we’ll take a look at
some of the different parser technologies that have sprung up to help deal with
XML at a programming level. For a company like ZwiftBooks, interested in
building a corporate infrastructure around XML, it’s crucial that we
understand programming options for reading and writing XML.
In this step, we’ll look at the two major parsing models, event-driven SAX and tree-based DOM, and then take a peek at how the mobile world is gearing up to handle the expected increase in XML traffic, using a newer pull-parsing approach called StAX.
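The difference between the parsing models is easiest to see side by side. The sketch below extracts book titles three ways using only Python's standard library: a SAX handler that reacts to events as the parser pushes them, a DOM tree that is loaded in full and then queried, and `ElementTree.iterparse`, used here as a rough stand-in for the pull style (it is not StAX, which is a Java API). The catalog document and all function names are hypothetical.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical catalog, invented for illustration.
CATALOG = """<catalog>
  <book isbn="0-000-00000-1"><title>XML Basics</title></book>
  <book isbn="0-000-00000-2"><title>Styling Data</title></book>
</catalog>"""


class TitleHandler(xml.sax.ContentHandler):
    """SAX: the parser calls these methods as it streams through the input."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join it later.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_with_sax(text):
    handler = TitleHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles


def titles_with_dom(text):
    """DOM: build the whole tree in memory, then navigate it."""
    doc = minidom.parseString(text)
    return [node.firstChild.data for node in doc.getElementsByTagName("title")]


def titles_with_pull(text):
    """Pull-style: the caller iterates over events at its own pace."""
    source = io.BytesIO(text.encode("utf-8"))
    return [elem.text for _event, elem in ET.iterparse(source, events=("end",))
            if elem.tag == "title"]
```

All three functions return the same list of titles. The trade-off is that SAX and the pull loop never need the full document in memory, while DOM allows random access to any node after parsing.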
- Add web services. Before XML, a company looking to set up a
distributed network needed to decide on both a data format and a network
infrastructure for moving its data around. Now, XML vocabularies such as
SOAP, WSDL, and UDDI enable a web services
model by delivering XML over widely used and established web protocols such as HTTP.
In this step, we’ll look at XML from the standpoint of both consumers and providers of web services. We’ll explore how SOAP can be used to structure XML messages, how WSDL is used to specify the messaging to request a service, and how UDDI enables service lookup from repositories. Along the way, we’ll also look at how companies like Amazon.com are making use of Representational State Transfer (REST) as an alternative to their SOAP interfaces.
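As a taste of how SOAP structures a message, the following sketch builds a SOAP 1.1 envelope with Python's standard library. The envelope namespace is the standard SOAP 1.1 one; everything inside the body, including the `GetBookPrice` operation, the `isbn` element, and the `http://example.com/zwiftbooks` namespace, is a hypothetical service invented for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ZB_NS = "http://example.com/zwiftbooks"                # hypothetical service namespace


def build_request(isbn):
    """Wrap a hypothetical GetBookPrice request in a SOAP envelope and body."""
    # Register readable prefixes so the serialized XML uses soap: and zb:.
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("zb", ZB_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{ZB_NS}}}GetBookPrice")
    ET.SubElement(request, f"{{{ZB_NS}}}isbn").text = isbn
    return ET.tostring(envelope, encoding="unicode")


message = build_request("0-000-00000-1")
```

In a real exchange, the operation name and message shape would come from the service's WSDL description rather than being chosen by the client, and the resulting string would be sent as the body of an HTTP POST.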
- Employ the semantic web. Tim Berners-Lee’s
original vision for the Web included a framework not just for linking pages to
each other but also for providing a semantic underpinning to web page content, so that
software agents would be free to roam from page to page carrying out sophisticated tasks for
users.
In this step, we’ll consider how this vision is taking shape in the form of several semantic web initiatives. Your assignment will be to use semantic web technologies such as RDF and OWL to create new categories of web services for your users.
- Ensure XML security. The expanding role of XML in network traffic means looking at options for keeping that data secure. XML often travels along a path with multiple players adding and transforming data, so keeping it secure is more complex than encrypting a whole document: we often want to encrypt or sign only part of the XML. In this step, I’ll show you how XML encryption and digital signatures can help secure your company data.