

Parsed Character Data

XML documents are read and processed by a specific piece of software called an XML parser. When a document is processed by the XML parser, each character in the document is read, or parsed, in order to create a representation of the data.
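As a rough illustration of the "representation of the data" that a parser builds, the following sketch uses the parser in Python's standard library (`xml.etree.ElementTree`). The sample document and the `outline` helper are made up for this example and do not come from the chapter.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
doc = "<note><to>Ann</to><body>Lunch at noon</body></note>"

# The parser reads the text character by character and builds a tree of elements.
root = ET.fromstring(doc)

def outline(element, depth=0):
    """Return one line per element: indentation, tag name, and any text it holds."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(root)))
```

Here the `note` element contains only other elements, while `to` and `body` contain only text, which matches the two kinds of element content discussed in this section.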

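Text that the parser examines for markup is called Parsed Character Data, or PCDATA. The term matters because it appears throughout XML, most visibly in DTDs, where #PCDATA declares that an element may contain text. Element content consists of other elements, PCDATA, or a mix of the two. Attribute values are parsed as well, although a DTD declares them with the type CDATA rather than PCDATA.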

By definition, PCDATA is parsed, which means that the parser looks at each character and tries to determine its meaning. For example, when the parser encounters a <, it knows that the characters that follow are markup, such as the start tag of an element. When the < is immediately followed by a /, it knows that it has encountered an end tag.

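Because PCDATA is parsed, it cannot contain a literal < or & character, as these characters have special meaning in markup: < begins a tag and & begins an entity reference. The > and / characters are allowed in text, although > is often escaped as well for consistency. For example: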

<math>
If you want to denote one number is smaller than another, 
you can use the < less-than sign
</math>

This element will cause an error, because the parser will interpret the < as the start of a tag. To include such a character in text, use the equivalent entity reference: &lt; represents a less-than sign, and &amp; represents an ampersand.
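The failure and the fix can be checked with a short sketch. It assumes Python's standard-library parser (`xml.etree.ElementTree`) and the `escape` function from `xml.sax.saxutils`. The helper name `is_well_formed` is hypothetical, and the sample is condensed to a single line.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def is_well_formed(text):
    """Return True if the parser accepts the text, False if it reports an error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

sentence = "you can use the < less-than sign"

# A literal < in the text is read as the start of a tag, so parsing fails.
bad = "<math>" + sentence + "</math>"

# escape() replaces &, <, and > with &amp;, &lt;, and &gt;.
good = "<math>" + escape(sentence) + "</math>"

print(is_well_formed(bad))
print(is_well_formed(good))
print(ET.fromstring(good).text)
```

After parsing, the entity reference is expanded again, so the element's text contains the original < character.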
