Sams Teach Yourself XML in 21 Days
Creating CDATA Sections
When an XML processor parses an XML document, it interprets the markup in that document and replaces entity references (like the built-in general entity reference &quot;) with whatever those entity references refer to (which is a double quotation mark, ", for the general entity reference &quot;). On the other hand, sometimes you might not want text data parsed. For example, what if your text contains many < and & characters? When parsed, those characters will be interpreted as part of the markup unless you convert them to &lt; and &amp;, which is called escaping them. To avoid that, you can specify that you don't want the XML processor to parse part of your text data by placing it in a CDATA section. CDATA stands for character data, as opposed to parsed character data, which is PCDATA.
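As a rough illustration of escaping and entity replacement, here is a minimal sketch that uses Python's standard library. Python itself, the sample text, and the <code> element name are assumptions made for this example and are not part of the chapter.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Sample text containing characters that would otherwise be read as markup.
raw = 'if (a < b && b < c) { print("ok") }'

# escape() replaces &, <, and > with &amp;, &lt;, and &gt;.
escaped = escape(raw)
print(escaped)  # if (a &lt; b &amp;&amp; b &lt; c) { print("ok") }

# Wrap the escaped text in a hypothetical <code> element and parse it.
element = ET.fromstring("<code>" + escaped + "</code>")

# The parser replaces the entity references, restoring the original text.
print(element.text == raw)  # True

# The built-in &quot; reference is replaced with a double quotation mark.
quoted = ET.fromstring("<q>&quot;hi&quot;</q>")
print(quoted.text)  # "hi"
```

Escaping works, but every < and & in the text has to be converted by hand or by a tool, which is the chore a CDATA section avoids.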
You use the CDATA section to tell the XML processor to leave the enclosed text alone, and pass it on unchanged. You start a CDATA section with the markup <![CDATA[ and end it with ]]>.
For example, suppose you are documenting how your XML application works, and want to say this:
Here's how the element starts:
<employee status="retired">
    <name>
        <lastname>Kelly</lastname>
        <firstname>Grace</firstname>
    </name>
    <hiredate>October 15, 2005</hiredate>
    <projects>
        <project>
            <product>Printer</product>
            <id>111</id>
            <price>$111.00</price>
        </project>
        .
        .
        .
This partial <employee> element has no closing </employee> tag, so it is not well-formed and an XML processor would report an error on it. To prevent that, you should enclose this text in a CDATA section to tell the XML processor not to parse it, as you see in Example 2.3. When an XML processor parses this document, it is supposed to remove the <![CDATA[ and ]]> markup and place the text in the CDATA section directly into the output it produces, without trying to interpret that text.
Example 2.3. Using a CDATA Section in an XML Document (ch02_03.xml)
<?xml version = "1.0" standalone="yes"?>
<document>
    <text>
        Here's how the element starts:
        <![CDATA[
        <employee status="retired">
            <name>
                <lastname>Kelly</lastname>
                <firstname>Grace</firstname>
            </name>
            <hiredate>October 15, 2005</hiredate>
            <projects>
                <project>
                    <product>Printer</product>
                    <id>111</id>
                    <price>$111.00</price>
                </project>
                .
                .
                .
        ]]>
    </text>
</document>
You can see that Internet Explorer treats this CDATA section as unparsed text in Figure 2.11. (If it had parsed the text, you would see an error instead of the display you see in the figure.)
Figure 2.11 Viewing a CDATA section in Internet Explorer.
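You can also check this behavior outside a browser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser (a choice made for this example, not one the chapter makes) and a shortened copy of the document above.

```python
import xml.etree.ElementTree as ET

# A shortened version of the CDATA document, held in a string.
document = """<?xml version="1.0" standalone="yes"?>
<document>
<text>
Here's how the element starts:
<![CDATA[
<employee status="retired">
<name>
<lastname>Kelly</lastname>
</name>
]]>
</text>
</document>"""

root = ET.fromstring(document)
text_element = root.find("text")

# Nothing inside the CDATA section was parsed as markup,
# so <text> has no child elements.
print(len(text_element))  # 0

# The CDATA delimiters are removed, but the enclosed text,
# including the unclosed <employee> tag, arrives as written.
print(text_element.text)
```

If the <employee> fragment had been parsed as markup, the parser would have stopped with an error at the mismatched closing tag instead of returning the text.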
Here's another example using XHTML, the version of HTML that is written in XML. XHTML pages can be parsed like other XML documents, but that can cause problems if you've included certain characters that a scripting language like JavaScript uses, such as the less than (<) JavaScript operator. To avoid confusing an XML processor reading an XHTML page with this embedded JavaScript operator, you can enclose that JavaScript in a CDATA section:
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>
            Checking the temperature
        </title>
    </head>
    <body>
        <script type="text/javascript" language="javascript">
        <![CDATA[
            var temperature
            temperature = 234.77
            if (temperature < 32) {
                document.writeln("Below freezing!")
            }
        ]]>
        </script>
        <center>
            <h1>
                Checking the temperature
            </h1>
        </center>
    </body>
</html>
Unfortunately, there's a problem here: the <![CDATA[ and ]]> markup confuses HTML browsers, which means you can't use syntax like this until those browsers are fully equipped to handle XHTML. You can, however, include JavaScript in XHTML pages like this one if they're intended only for HTML browsers, not XML processors, by omitting the <![CDATA[ and ]]> markup.
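The XML-processor side of this trade-off can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree parser, and the is_well_formed helper is a hypothetical name introduced only for this example.

```python
import xml.etree.ElementTree as ET

# JavaScript that uses the less-than operator, as in the page above.
script_body = "if (temperature < 32) { document.writeln('Below freezing!') }"

# The same script with and without a CDATA section around it.
bare = "<script>" + script_body + "</script>"
wrapped = "<script><![CDATA[" + script_body + "]]></script>"


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Without CDATA, the parser treats "<" as the start of a tag and fails.
print(is_well_formed(bare))     # False

# With CDATA, the script text is passed through untouched.
print(is_well_formed(wrapped))  # True
print(ET.fromstring(wrapped).text == script_body)  # True
```

So omitting the CDATA markup keeps HTML browsers happy, but it leaves the page unreadable to an XML processor whenever the script contains < or &.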