Sams Teach Yourself XML in 21 Days

By Steven Holzner

How XML Is Used in the Real World

As you already know, XML is designed to help store, structure, and transfer data; because it's written using plain text, it can be sent on the Internet and handled by software on many different platforms. XML was designed to let people circulate data. In its five years, hundreds of XML sublanguages—that is, sets of predefined XML elements—have appeared.

For example, suppose you want to perform genealogical research. To search through many genealogical records rapidly, you would need to have those records in a predetermined form, not just in any order in a simple text file. To do that, you could use a specialized XML sublanguage, Genealogical Data Communication (GEDCOM), which defines its own tags for storing names, dates, marriages, and so on. Using GEDCOM, people from all over the world can search genealogical databases rapidly.
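To make the idea of a predetermined form concrete, here is a minimal Python sketch that searches a tiny genealogy document. The element names (family, person, name, birth, marriage) and the people in it are invented for illustration; they are not taken from the GEDCOM specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical genealogy markup; these element names are made up for
# illustration and are not taken from the GEDCOM specification.
RECORDS = """<?xml version="1.0"?>
<family>
    <person id="p1">
        <name>Ada Example</name>
        <birth>1901-04-12</birth>
    </person>
    <person id="p2">
        <name>Ben Example</name>
        <birth>1899-11-03</birth>
    </person>
    <marriage husband="p2" wife="p1" date="1925-06-20"/>
</family>"""


def births_before(xml_text, year):
    """Return the names of people born before the given year."""
    root = ET.fromstring(xml_text)
    names = []
    for person in root.iter("person"):
        # Dates are assumed to be written as YYYY-MM-DD.
        birth_year = int(person.findtext("birth")[:4])
        if birth_year < year:
            names.append(person.findtext("name"))
    return names


print(births_before(RECORDS, 1900))  # ['Ben Example']
```

Because every record uses the same elements, the search is a short loop rather than a guess at where the dates sit in free-form text.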

XML sublanguages like GEDCOM are called XML applications (the term is a little unfortunate, because software packages are also called applications, but the idea is that these sublanguages are applications of XML). There are hundreds of XML applications, allowing various groups of people to communicate and exchange data. Here's a list of a few of these applications:

You can find information about XML applications like these by watching the XML news releases from W3C. The Web site http://www.xml.org/xml/marketplace_company.jsp also lists many XML applications. To get an idea of what's going on in XML these days, we'll take a look at a few of these applications next—and we're going to see more throughout this book.

Using XML: Mathematical Markup Language

Mathematical Markup Language, MathML, was designed to let people embed mathematical and scientific equations in Web pages (in fact, Tim Berners-Lee first developed the World Wide Web so that physicists could exchange papers and documents).

MathML is itself a W3C specification, and you can find it at http://www.w3.org/TR/MathML2/. Using MathML, you can display all kinds of equations, but there's only one commonly used Web browser that supports MathML—the Amaya browser, which is W3C's own testbed browser for testing new HTML elements. You can download Amaya for free from http://www.w3.org/Amaya/.

You can see a MathML document, ch01_08.ml, in Listing 1.8. This document just displays the equation 4x² – 5x + 6 = 0.

Example 1.8. A MathML Document (ch01_08.ml)

<?xml version="1.0"?>
<math xmlns="http://www.w3.org/1998/Math/MathML">
    <mrow>
        <mrow>
            <mn>4</mn>
            <mo>&#x2062;</mo>
            <msup>
                <mi>x</mi>
                <mn>2</mn>
            </msup>
            <mo>-</mo>
            <mrow>
                <mn>5</mn>
                <mo>&#x2062;</mo>
                <mi>x</mi>
            </mrow>
            <mo>+</mo>
            <mn>6</mn>
        </mrow>
        <mo>=</mo>
        <mn>0</mn>
    </mrow>
</math>

You can see how this document looks in the Amaya browser in Figure 1.6.

Figure 1.6 A MathML document displayed by the Amaya browser.
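If you don't have Amaya handy, you can still get a feel for how the nested elements build up the equation. The following Python sketch walks the tree from Listing 1.8 and flattens it into plain text. It handles only the handful of elements the listing uses, and it writes the invisible-times operator as the character reference &#x2062; because Python's parser has no MathML DTD from which to resolve a named entity. The to_text function is an illustrative helper, not part of any MathML tool.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"

# Listing 1.8, with the invisible-times operator written as the character
# reference &#x2062; so that the parser needs no DTD to resolve it.
DOC = """<?xml version="1.0"?>
<math xmlns="http://www.w3.org/1998/Math/MathML">
    <mrow>
        <mrow>
            <mn>4</mn>
            <mo>&#x2062;</mo>
            <msup>
                <mi>x</mi>
                <mn>2</mn>
            </msup>
            <mo>-</mo>
            <mrow>
                <mn>5</mn>
                <mo>&#x2062;</mo>
                <mi>x</mi>
            </mrow>
            <mo>+</mo>
            <mn>6</mn>
        </mrow>
        <mo>=</mo>
        <mn>0</mn>
    </mrow>
</math>"""


def to_text(node):
    """Flatten a small subset of presentation MathML into plain text."""
    tag = node.tag.replace(MATHML_NS, "")
    if tag in ("mn", "mi"):
        return node.text.strip()
    if tag == "mo":
        op = node.text.strip()
        # U+2062 INVISIBLE TIMES draws nothing on screen; show it as "*".
        return "*" if op == "\u2062" else " " + op + " "
    if tag == "msup":
        base, exponent = list(node)
        return to_text(base) + "^" + to_text(exponent)
    # math and mrow simply group their children.
    return "".join(to_text(child) for child in node)


print(to_text(ET.fromstring(DOC)))  # 4*x^2 - 5*x + 6 = 0
```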

Using XML: Chemical Markup Language

Chemical Markup Language (CML) was developed by Peter Murray-Rust and lets you view three-dimensional representations of molecules in a Jumbo browser. Using CML, one chemist can publish a visual model of a molecule and exchange that model with others.

For example, this CML document, from the CML Web site at http://www.xml-cml.org, displays the formamide molecule:

<molecule xmlns="http://www.xml-cml.org" id="formamide">
  <atomArray>
    <stringArray builtin="atomId">H1 C1 O1 N1 Me1 Me2</stringArray>
    <stringArray builtin="elementType">H C O N C C</stringArray>
    <integerArray builtin="hydrogenCount">0 1 0 1 3 3</integerArray>
  </atomArray>
  <bondArray>
    <stringArray builtin="atomRef">C1 C1 C1 N1 N1</stringArray>
    <stringArray builtin="atomRef">H1 O1 N1 Me1 Me2</stringArray>
    <stringArray builtin="order">1 2 1 1 1</stringArray>
  </bondArray>
  <h:html xmlns:h="http://www.w3.org/TR/html20">
    <p>Formamide is the simplest amide ...</p>
    <p>
      This represents a
      <emph>connection table</emph>
      for formamide. The structure corresponds to the diagram:
    </p>
    <pre>H3 H1 \ / N1-C1=O1 / H2</pre>
  </h:html>
  <float title="molecularWeight" units="g">45.03</float>
  <list title="local information">
    <!--
    <link title="safety" href="/safety/chemicals.xml#formamide">
    </link>
    -->
    <string title="location">Storeroom 12.3</string>
  </list>
</molecule>

We'll see CML at work tomorrow when we take a look at the Jumbo CML browser.
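In the meantime, here is a rough Python sketch of how a program might read the connection table out of that document. It keeps only the atom and bond arrays from the listing, and it assumes that the two atomRef arrays line up position by position to give the two ends of each bond. The column helper is an illustrative name, not part of any CML library.

```python
import xml.etree.ElementTree as ET

CML_NS = {"cml": "http://www.xml-cml.org"}

# The atom and bond arrays from the CML document, trimmed to the parts
# this sketch reads.
DOC = """<molecule xmlns="http://www.xml-cml.org" id="formamide">
  <atomArray>
    <stringArray builtin="atomId">H1 C1 O1 N1 Me1 Me2</stringArray>
    <stringArray builtin="elementType">H C O N C C</stringArray>
    <integerArray builtin="hydrogenCount">0 1 0 1 3 3</integerArray>
  </atomArray>
  <bondArray>
    <stringArray builtin="atomRef">C1 C1 C1 N1 N1</stringArray>
    <stringArray builtin="atomRef">H1 O1 N1 Me1 Me2</stringArray>
    <stringArray builtin="order">1 2 1 1 1</stringArray>
  </bondArray>
</molecule>"""


def column(parent, builtin):
    """Return the split values of every child array with this builtin."""
    return [
        array.text.split()
        for array in parent
        if array.get("builtin") == builtin
    ]


root = ET.fromstring(DOC)
atom_array = root.find("cml:atomArray", CML_NS)
bond_array = root.find("cml:bondArray", CML_NS)

# Pair each atom ID with its element symbol.
atom_ids = column(atom_array, "atomId")[0]
elements = column(atom_array, "elementType")[0]
atoms = dict(zip(atom_ids, elements))

# Assumption: the first atomRef array holds one end of each bond and the
# second holds the other end.
starts, ends = column(bond_array, "atomRef")
orders = [int(order) for order in column(bond_array, "order")[0]]
bonds = list(zip(starts, ends, orders))

print(atoms)
print(bonds)
```

Under that assumption, the second bond comes out as ('C1', 'O1', 2), the carbon-oxygen double bond.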

Using XML: Synchronized Multimedia Integration Language

Synchronized Multimedia Integration Language (SMIL, pronounced "smile") lets you customize multimedia presentations, and we'll take a look at SMIL in depth in this book. We'll even be able to create SMIL files that can be run in RealNetworks' RealPlayer (now called RealOne). SMIL is a W3C standard, and you can find out more about it at http://www.w3.org/AudioVideo/#SMIL.

For example, here's the beginning of a SMIL document that plays background music and displays a slide show of images and text:

<?xml version="1.0"?>
<!DOCTYPE smil PUBLIC "-//W3C//DTD SMIL 1.0//EN"
  "http://www.w3.org/TR/REC-smil/SMIL10.dtd">
<smil>
    <body>
        <par id="show">
            <audio src="river.wav" region="background_audio"
                type="audio/x-wav" dur="20s"/>
            <seq id="slides">
                <par id="slide01">
                    <img src="mountain.jpg" type="image/jpeg" dur="5s"/>
                    <text src="welcome.txt" type="text/plain" dur="5s"/>
                </par>
        .
        .
        .
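The par and seq elements control timing: the children of a par element play in parallel, and the children of a seq element play one after another. The Python sketch below estimates playing time on that basis. It is a simplified model that reads only explicit dur values given in seconds, and it closes off the fragment with a second slide whose file names (lake.jpg and goodbye.txt) are invented so that the document parses.

```python
import xml.etree.ElementTree as ET

# The fragment above, closed off so that it parses. The second slide and its
# file names (lake.jpg, goodbye.txt) are invented for this sketch.
DOC = """<smil>
    <body>
        <par id="show">
            <audio src="river.wav" region="background_audio"
                type="audio/x-wav" dur="20s"/>
            <seq id="slides">
                <par id="slide01">
                    <img src="mountain.jpg" type="image/jpeg" dur="5s"/>
                    <text src="welcome.txt" type="text/plain" dur="5s"/>
                </par>
                <par id="slide02">
                    <img src="lake.jpg" type="image/jpeg" dur="5s"/>
                    <text src="goodbye.txt" type="text/plain" dur="5s"/>
                </par>
            </seq>
        </par>
    </body>
</smil>"""


def duration(node):
    """Estimate playing time in seconds, handling only 'Ns' dur values."""
    if node.tag == "seq":
        # Children of seq play one after another.
        return sum(duration(child) for child in node)
    if node.tag == "par":
        # Children of par play together; the group ends with the longest.
        return max(duration(child) for child in node)
    # Media elements: read an explicit duration such as "5s".
    return float(node.get("dur").rstrip("s"))


root = ET.fromstring(DOC)
show = root.find("body/par")
print(duration(show.find("seq")))  # 10.0
print(duration(show))              # 20.0
```

With two five-second slides, the slide sequence lasts 10 seconds, while the presentation as a whole lasts as long as the 20-second audio clip.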

Using XML: XHTML

Despite HTML's popularity, W3C thinks it has a lot of problems (and, having created it, they should know). For example, some HTML elements don't need closing tags but may be used with them, while others require closing tags. Many Web pages have all kinds of HTML errors, like overlapping elements, that Web browsers struggle to fix. To make HTML more rigorous, and in an attempt to let you extend it with your own tags, W3C introduced Extensible Hypertext Markup Language, or XHTML. XHTML is HTML 4.01 (the current version of HTML) in XML form. We'll be seeing XHTML in depth in Day 11, "Extending HTML with XHTML," and Day 12, "Putting XHTML to Work."

In other words, XHTML is simply an XML application that mimics HTML 4.01 in such a way that you can display the results, which are true XML documents, in today's Web browsers and extend the language with your own new elements. Here are some XHTML resources online:

XHTML 1.0 comes in three different versions: transitional, frameset, and strict. The transitional version is the most popular version of XHTML because it supports HTML as it's used today. The frameset version supports XHTML documents that display frames. The strict version omits all the HTML elements deprecated in HTML 4.01 (of which there were quite a few).

XHTML 1.1 is a form of the XHTML 1.0 strict version made a little more strict by omitting support for some elements and adding support for a few more (such as <ruby> for annotated text). You can find a list of the differences between XHTML 1.0 and XHTML 1.1 at http://www.w3.org/TR/xhtml11/changes.html#a_changes.

As an example, you can see an XHTML 1.0 transitional document in Listing 1.9 called ch01_09.html (XHTML documents use the extension .html so they can appear in standard Web browsers—note that all the element names are in lowercase). We're going to take XHTML documents like this apart piece by piece in Days 11 and 12.

Example 1.9. An XHTML Document (ch01_09.html)

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            An XHTML Page
        </title>
    </head>

    <body>
        <h1>
            Welcome to XHTML!
        </h1>
        <center>
        <p>
        This is an XHTML document.
        </p>
        <p>
        Pretty cool, eh?
        </p>
        </center>
    </body>
</html>

You can see the results of this XHTML in Figure 1.7. Writing XHTML is a lot like writing HTML, except that you have to adhere to XML syntax (which means, for example, that every element needs a closing tag or must be written as an empty element, like <br />).

Figure 1.7 Displaying an XHTML page in Internet Explorer.
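One practical consequence of following XML syntax is that any XML parser can check your page. As a minimal sketch, the Python code below feeds two snippets to the standard library's XML parser: one written with typical HTML habits and one written to XML rules. It checks only well-formedness, not validity against the XHTML DTD, and is_well_formed is just an illustrative helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Typical HTML habits: an unclosed <p>, a bare <br>, overlapping <b> and <i>.
html_style = "<body><p>First<br><p><b><i>Second</b></i></body>"

# The same content written to XML rules.
xhtml_style = "<body><p>First<br/></p><p><b><i>Second</i></b></p></body>"

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

A Web browser would try to repair the first snippet; an XML parser simply rejects it.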

Using XML: HTML+TIME

Here's another XML application: HTML+TIME. This one was created by Microsoft, Macromedia, and Compaq as an alternative to SMIL for multimedia presentations. You can find out about HTML+TIME at http://msdn.microsoft.com/workshop/Author/behaviors/time.asp.

You can see a sample HTML+TIME document that displays the words "Welcome," "to," and "HTML+TIME." in Listing 1.10. If you open this document in Internet Explorer, you'll see that the words appear one at a time, two seconds apart, and then the whole process repeats.

Example 1.10. An HTML+TIME Document (ch01_10.html)

<HTML>
    <HEAD>
        <TITLE>
            Using HTML+TIME
        </TITLE>
        <STYLE>
            .time {behavior: url(#default#time);}
        </STYLE>
    </HEAD>

    <BODY>
        <DIV CLASS="time" t:REPEAT="5" t:DUR="10" t:TIMELINE="par">
            <DIV CLASS="time" t:BEGIN="0" t:DUR="10">Welcome</DIV>
            <DIV CLASS="time" t:BEGIN="2" t:DUR="10">to</DIV>
            <DIV CLASS="time" t:BEGIN="4" t:DUR="10">HTML+TIME.</DIV>
        </DIV>
    </BODY>
</HTML>

You can see the results of this HTML+TIME document in Figure 1.8.

Figure 1.8 Viewing an HTML+TIME document in Internet Explorer.

Using XML: Microsoft's .NET

Microsoft's .NET initiative took what had been local Windows functionality to the Internet. Components in .NET use XML to communicate, often even when they're on the same machine. You don't usually see the XML in .NET, but each time you communicate between components, it's there.

For example, ADO.NET (ActiveX Data Objects) is the .NET protocol for working with databases, and all communication between your code and the data provider that hosts the database uses XML. You can see an example demonstrating how ADO.NET works in Visual Basic .NET, one of the programming languages in Visual Studio .NET, in Figure 1.9.

Figure 1.9 Writing data in XML in Visual Basic .NET.

When the user clicks the Write Data to XML Document button, the code connects to the SQL Server data provider, opens the sample database named pubs that comes with SQL Server, and reads the data in the employee table from that database using XML. It'll then write that data out to an XML document, data.xml. When the user clicks the Get Data from XML Document button, the code reads in that XML and displays the data in it in the grid you see in Figure 1.9.

Here is the Visual Basic .NET code that handles the button clicks and that does the actual work:

Private Sub Button1_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles Button1.Click
    DataSet11.Clear()
    OleDbDataAdapter1.Fill(DataSet11)
    DataSet11.WriteXml("data.xml")
End Sub

Private Sub Button2_Click(ByVal sender As System.Object, _
    ByVal e As System.EventArgs) Handles Button2.Click
    ' Read data.xml into a new DataSet and bind its employee table to the grid
    Dim ds As New DataSet()
    ds.ReadXml("data.xml")
    DataGrid1.SetDataBinding(ds, "employee")
End Sub

And here is the XML that was written out to disk in data.xml—note that it matches the data you see in Figure 1.9:

<?xml version="1.0" standalone="yes"?>
<DataSet1 xmlns="http://www.tempuri.org/DataSet1.xsd">
  <employee>
    <emp_id>PMA42628M</emp_id>
    <fname>Paolo</fname>
    <minit>M</minit>
    <lname>Accorti</lname>
    <job_id>13</job_id>
    <job_lvl>35</job_lvl>
    <pub_id>0877</pub_id>
    <hire_date>1992-08-27T00:00:00.0000000-04:00</hire_date>
  </employee>
  <employee>
    <emp_id>PSA89086M</emp_id>
    <fname>Pedro</fname>
    <minit>S</minit>
    <lname>Afonso</lname>
    <job_id>14</job_id>
    <job_lvl>89</job_lvl>
    <pub_id>1389</pub_id>
    <hire_date>1990-12-24T00:00:00.0000000-05:00</hire_date>
  </employee>
        .
        .
        .

That's what the XML used to move data between components in .NET looks like behind the scenes.
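Because data.xml is plain XML, it isn't tied to .NET. As a rough sketch, the Python code below reads the first two employee records back into a list of dictionaries. It copies those two records from the output above and closes the root element so that the text parses; read_employees is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

NS = {"ds": "http://www.tempuri.org/DataSet1.xsd"}

# The first two records from data.xml, with the root element closed.
DOC = """<?xml version="1.0" standalone="yes"?>
<DataSet1 xmlns="http://www.tempuri.org/DataSet1.xsd">
  <employee>
    <emp_id>PMA42628M</emp_id>
    <fname>Paolo</fname>
    <minit>M</minit>
    <lname>Accorti</lname>
    <job_id>13</job_id>
    <job_lvl>35</job_lvl>
    <pub_id>0877</pub_id>
    <hire_date>1992-08-27T00:00:00.0000000-04:00</hire_date>
  </employee>
  <employee>
    <emp_id>PSA89086M</emp_id>
    <fname>Pedro</fname>
    <minit>S</minit>
    <lname>Afonso</lname>
    <job_id>14</job_id>
    <job_lvl>89</job_lvl>
    <pub_id>1389</pub_id>
    <hire_date>1990-12-24T00:00:00.0000000-05:00</hire_date>
  </employee>
</DataSet1>"""


def read_employees(xml_text):
    """Return one dictionary per employee element, keyed by field name."""
    root = ET.fromstring(xml_text)
    rows = []
    for employee in root.findall("ds:employee", NS):
        # Strip the "{namespace}" prefix ElementTree adds to each tag.
        rows.append({field.tag.split("}")[1]: field.text for field in employee})
    return rows


for row in read_employees(DOC):
    print(row["emp_id"], row["fname"], row["lname"])
```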

Using XML: Scalable Vector Graphics

A number of popular XML applications revolve around graphics, and one of these applications is Scalable Vector Graphics (SVG), a W3C-based XML application. Until recently, SVG found only limited support, notably because Microsoft had its own XML-style graphics language for Internet Explorer, Vector Markup Language (VML), followed by its DirectAnimation tools. Now, however, Adobe has created an SVG viewer as a browser plug-in, and we'll take a look at SVG and that plug-in in Day 13, "Creating Graphics and Multimedia: SVG and SMIL." You can find the SVG specification itself at http://www.w3.org/TR/SVG11/, and an SVG overview at http://www.w3.org/Graphics/SVG/Overview.htm8.

Millions of SVG viewers from Adobe have already been downloaded (Adobe calls SVG "the future of Web graphics") and you can get the SVG viewer at http://www.adobe.com/svg/. You can see a sample SVG document in Listing 1.11, which draws a blue ellipse filled in with light blue color.

Example 1.11. An SVG Document (ch01_11.svg)

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg">
    <title>SVG Example</title>
    <ellipse cx="200" cy="100" rx="100" ry="60"
        style="fill:lightblue; stroke:blue; stroke-width:6"/>
</svg>

You can see ch01_11.svg at work in Figure 1.10, where we're using the Adobe SVG plug-in in Internet Explorer.

Figure 1.10 Viewing an SVG example.
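Since SVG is just XML, you can also generate it from a program rather than typing it by hand. The Python sketch below builds a document like the one in Listing 1.11 using the standard library. The ellipse_svg function and its parameters are illustrative names, not part of any SVG toolkit.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def ellipse_svg(cx, cy, rx, ry, fill, stroke, stroke_width):
    """Return an SVG document, as a string, that draws one ellipse."""
    # Serialize SVG elements without a prefix, as in Listing 1.11.
    ET.register_namespace("", SVG_NS)
    svg = ET.Element("{%s}svg" % SVG_NS)
    title = ET.SubElement(svg, "{%s}title" % SVG_NS)
    title.text = "SVG Example"
    ET.SubElement(svg, "{%s}ellipse" % SVG_NS, {
        "cx": str(cx),
        "cy": str(cy),
        "rx": str(rx),
        "ry": str(ry),
        "style": "fill:%s; stroke:%s; stroke-width:%d"
                 % (fill, stroke, stroke_width),
    })
    return ET.tostring(svg, encoding="unicode")


print(ellipse_svg(200, 100, 100, 60, "lightblue", "blue", 6))
```

Changing the arguments gives you a different ellipse without touching the markup yourself.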

Using XML: SOAP

More and more Web applications are appearing every day. Based on the Internet, these programs can communicate with each other, transferring data back and forth as needed. For example, a Web application might provide real estate agents in the field with today's real estate listings, which they can download into their laptops.

One problem with Web applications is that they can end up using their own XML element sets only, making it difficult for a Web application written in Java to communicate with one written in a .NET language like Visual Basic .NET or C# .NET. To make communication between Web applications easier, the XML-based Simple Object Access Protocol (SOAP, which you can read about at http://www.w3.org/TR/SOAP/) was created. SOAP defines a widely accepted lightweight XML protocol that lets you send messages between Web applications, no matter what language such Web applications might have been written in.

You'll see more about SOAP in Day 18, "Working with SOAP and RDF," when you take a look at some examples. SOAP messages contain a SOAP envelope that acts like the root element of the message, a SOAP header that tells the recipient what kind of message this is, and a SOAP body that holds the message. For example, if you wanted to tell a Web application that there are currently 200 desks in stock in your warehouse, you might send a SOAP message like this:

<?xml version="1.0" encoding="UTF-8"?>
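<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:desks="http://www.example.com/desks">
    <!-- The desks namespace URI is a placeholder used for illustration. -->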
    <soap:Header>
        SOAP Example
    </soap:Header>
    <soap:Body>
        <desks:NumberInStock>
            200
        </desks:NumberInStock>
    </soap:Body>
</soap:Envelope>
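On the receiving end, a Web application unwraps the envelope to get at the body. Here is a minimal Python sketch of that step. A namespace-aware parser needs every prefix to be declared, so the sketch binds the desks prefix to a placeholder URI (http://www.example.com/desks) that is an assumption made for this example and not a real service. The number_in_stock function is likewise an illustrative name.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Placeholder namespace for the desks prefix; an assumption for this sketch.
DESKS_NS = "http://www.example.com/desks"

MESSAGE = """<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:desks="http://www.example.com/desks">
    <soap:Header>
        SOAP Example
    </soap:Header>
    <soap:Body>
        <desks:NumberInStock>
            200
        </desks:NumberInStock>
    </soap:Body>
</soap:Envelope>"""


def number_in_stock(message):
    """Pull the desk count out of the SOAP body as an integer."""
    envelope = ET.fromstring(message)
    if envelope.tag != "{%s}Envelope" % SOAP_NS:
        raise ValueError("not a SOAP envelope")
    body = envelope.find("{%s}Body" % SOAP_NS)
    count = body.find("{%s}NumberInStock" % DESKS_NS)
    return int(count.text.strip())


print(number_in_stock(MESSAGE))  # 200
```

The sender could be written in any language, as long as it produces the same envelope.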

That gives us a taste of how XML is put to use these days. Before finishing up today, we'll take a quick look at some of the rich XML resources available online—there's a great deal of free stuff out there for you.
