Sams Teach Yourself XML in 21 Days

By Steven Holzner

All About DTDs

Yesterday we discussed creating well-formed XML documents, and while an XML document needs to be well-formed to be considered a true XML document, that's only part of the story. In real life, we also need to give an XML processor some way of checking the syntax (also called the grammar) of an XML document to make sure the data remains intact. For example, take a look at the XML document you created yesterday that contains data about employees:

<?xml version = "1.0" standalone="yes"?>
<document>
    <employee>
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
        .
        .
        .
</document>

Say we've expanded to 5,000 employees, and that we have a team of typists typing in all that employee data. The likelihood is high that there are going to be errors in all that data entry. But how will an XML processor know that a <project> element must contain at least one <product> element unless we tell it so? How do we tell an XML processor that each <employee> element must contain one <name> element? To do this and more, we can use a DTD. DTDs are all about specifying the structure of an XML document, not the data in that document. The formal rules for DTDs are available in the XML 1.0 recommendation, http://www.w3.org/TR/REC-xml. (Note that the XML 1.1 candidate recommendation has nothing to add about DTDs as of this writing.)
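To make the problem concrete, here is a sketch of the kind of entry a typist might produce. The mistake is hypothetical and not part of the original data: the <product> element has been left out of the <project> element. The document is still well-formed, so a processor that checks only well-formedness would accept it without complaint.

```xml
<?xml version = "1.0" standalone="yes"?>
<document>
    <employee>
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <!-- Hypothetical typing error: the <product> element is missing here -->
                <id>111</id>
                <price>$111.00</price>
            </project>
        </projects>
    </employee>
</document>
```

Every tag is properly nested and closed, which is why well-formedness alone cannot catch this kind of error.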

We define the syntax of an XML document by using a DTD, and we declare that definition in a document by using a document type declaration, which is written with <!DOCTYPE>. The DTD can appear inside that declaration, be referenced from it, or both, so the declaration can take several different forms (in the usual notation, URI is the URI of a DTD outside the current XML document and rootname is the name of the root element).
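As a rough guide, the forms permitted by the XML 1.0 recommendation can be sketched as follows. This is a summary based on the recommendation, not necessarily the exact list the author had in mind. The words rootname, URI, identifier, and DTD are placeholders rather than literal text, and identifier (the public identifier used with PUBLIC) is a name introduced here for illustration.

```xml
<!-- Internal DTD only: the declarations sit between the square brackets -->
<!DOCTYPE rootname [DTD]>

<!-- External DTD located by a system identifier -->
<!DOCTYPE rootname SYSTEM "URI">

<!-- External DTD plus additional internal declarations -->
<!DOCTYPE rootname SYSTEM "URI" [DTD]>

<!-- External DTD named by a public identifier, with a URI as a fallback location -->
<!DOCTYPE rootname PUBLIC "identifier" "URI">

<!-- Public external DTD plus additional internal declarations -->
<!DOCTYPE rootname PUBLIC "identifier" "URI" [DTD]>
```

The examples in this section all use the first form, in which the whole DTD is internal to the document.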

To use a DTD, we need a document type declaration, which is written with <!DOCTYPE> and belongs in the document's prolog. For example, here's how we would add a <!DOCTYPE> declaration to the employees example:

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE document [

           .

           .

      <!-- DTD goes here! -->

           .

           .

   ]>
<document>
    <employee>
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
        .
        .
        .
</document>

So what does a DTD look like? The actual XML syntax for DTDs is pretty terse, so today's discussion is dedicated to unraveling that terseness. To get started, Example 4.1 shows a full <!DOCTYPE> declaration that contains a DTD for the employee document. We're going to dissect that DTD today.

Example 4.1. A Sample XML Document with a DTD (ch04_01.xml)

<?xml version = "1.0" standalone="yes"?>
<!DOCTYPE document [

   <!ELEMENT document (employee)*>

   <!ELEMENT employee (name, hiredate, projects)>

   <!ELEMENT name (lastname, firstname)>

   <!ELEMENT lastname (#PCDATA)>

   <!ELEMENT firstname (#PCDATA)>

   <!ELEMENT hiredate (#PCDATA)>

   <!ELEMENT projects (project)*>

   <!ELEMENT project (product,id,price)>

   <!ELEMENT product (#PCDATA)>

   <!ELEMENT id (#PCDATA)>

   <!ELEMENT price (#PCDATA)>

   ] >
<document>
    <employee>
        <name>
            <lastname>Kelly</lastname>
            <firstname>Grace</firstname>
        </name>
        <hiredate>October 15, 2005</hiredate>
        <projects>
            <project>
                <product>Printer</product>
                <id>111</id>
                <price>$111.00</price>
            </project>
            <project>
                <product>Laptop</product>
                <id>222</id>
                <price>$989.00</price>
            </project>
        </projects>
    </employee>
    <employee>
        <name>
            <lastname>Grant</lastname>
            <firstname>Cary</firstname>
        </name>
        <hiredate>October 20, 2005</hiredate>
        <projects>
            <project>
                <product>Desktop</product>
                <id>333</id>
                <price>$2995.00</price>
            </project>
            <project>
                <product>Scanner</product>
                <id>444</id>
                <price>$200.00</price>
            </project>
        </projects>
    </employee>
    <employee>
        <name>
            <lastname>Gable</lastname>
            <firstname>Clark</firstname>
        </name>
        <hiredate>October 25, 2005</hiredate>
        <projects>
            <project>
                <product>Keyboard</product>
                <id>555</id>
                <price>$129.00</price>
            </project>
            <project>
                <product>Mouse</product>
                <id>666</id>
                <price>$25.00</price>
            </project>
        </projects>
    </employee>
</document>
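As a first reading aid, the excerpt below repeats a few declarations from Example 4.1 with comments added. The comments are an informal gloss based on the XML 1.0 content-model syntax, not part of the original listing. They show how the DTD answers the two questions raised earlier about <employee> and <project>.

```xml
<!-- The root element holds zero or more <employee> elements (the * means "zero or more") -->
<!ELEMENT document (employee)*>

<!-- Each <employee> must contain exactly one <name>, one <hiredate>, and one <projects>, in that order -->
<!ELEMENT employee (name, hiredate, projects)>

<!-- Each <project> must contain exactly one <product>, one <id>, and one <price>, in that order -->
<!ELEMENT project (product,id,price)>

<!-- #PCDATA stands for parsed character data, that is, text with no child elements -->
<!ELEMENT product (#PCDATA)>
```

A validating processor that reads these declarations would reject a <project> element that lacks a <product> child, or an <employee> element that lacks a <name> child.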
