XML Must Be Well-Formed
- Oct 7, 2002
Well-Formed XML Documents
If XML is to be used as a format for data interchange, it must adhere to a consistent syntax so that programs can reliably produce and parse XML documents. An XML document that adheres to proper XML syntax is said to be well-formed.
If an XML processor (also known as an XML parser) is to pass the results of parsing to its associated application, the XML document must be well-formed. If the document is not well-formed, the XML processor reports the error or errors it encounters and stops normal processing, including the passing of parsed data to the application. Ensuring that the XML documents you write are well-formed is therefore crucial to achieving the desired processing of the data that they contain.
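The behavior described above can be observed with almost any XML parser. The following is a minimal sketch in Python, assuming the standard library's `xml.etree.ElementTree` module as the XML processor; the helper name `check_well_formed` and the sample documents are illustrative only and do not come from this chapter.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message).

    Hypothetical helper: it only reports whether the parser accepted the text.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The parser raises on a well-formedness error and hands
        # no parsed data back to the caller.
        return False, str(err)
    return True, None


good = "<order><item>Widget</item></order>"
bad = "<order><item>Widget</order>"  # <item> is never closed

print(check_well_formed(good))
print(check_well_formed(bad))
```

The first call reports success; the second reports failure together with the parser's own error message, and no element tree is returned for the broken document.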
Some of the rules for well-formedness are straightforward. Others can seem obscure the first time you read them, so don't worry if some of the rules in this chapter don't make complete sense on a first reading. As you learn more about other aspects of XML in later chapters, the pieces of the syntax jigsaw will fit together more clearly.
In Chapter 2, "The Structure of an XML Document," you learned about the structure that an XML document must conform to. All well-formed XML documents must follow the permitted options of that structure. In addition to those rules, an XML document must satisfy several other rules to be considered well-formed.
This chapter gives a complete description of well-formedness constraints. To do so, it is necessary to refer to concepts described more fully in later chapters. You might find it helpful to reread parts of this chapter after reading Chapter 4, "Valid XML: Document Type Definitions," and Chapter 5, "XML Entities."
The term well-formed is used to describe the rules that all XML documents must satisfy. If an XML document is not well-formed, an XML processor signals an error and stops normal processing. It is crucial that you understand the well-formedness constraints in XML 1.0, to ensure that the XML documents that you create will be processed correctly and without errors.
To be well-formed, an XML document must satisfy each of three broad rules or sets of rules:
The structure of the document must follow that described in Chapter 2: an optional prolog, followed by a required document element (and any content that it has) and, finally, an optional miscellaneous section.
The document must satisfy the well-formedness constraints described in the following sections of this chapter.
Any parsed entities referenced from the document, whether directly or indirectly, must themselves be well-formed.
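The first of these rules, the three-part document structure, can be illustrated with a short sketch. It assumes Python's standard `xml.etree.ElementTree` module as the parser; the helper `parses` and the sample documents are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Prolog, document element, and trailing miscellaneous section.
three_parts = (
    '<?xml version="1.0"?>\n'        # prolog: XML declaration
    '<!-- prolog comment -->\n'      # prolog: comment
    '<catalog><book/></catalog>\n'   # required document element
    '<!-- trailing comment -->\n'    # optional miscellaneous section
)

# The prolog and the miscellaneous section are both optional.
element_only = "<catalog><book/></catalog>"

# Exactly one document element is allowed.
two_roots = "<catalog/><catalog/>"

# The document element itself is required.
no_element = '<?xml version="1.0"?>\n<!-- only a prolog -->\n'


def parses(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


for label, text in [
    ("three_parts", three_parts),
    ("element_only", element_only),
    ("two_roots", two_roots),
    ("no_element", no_element),
]:
    print(label, parses(text))
```

The first two documents are accepted, while the last two are rejected: one has a second element after the document element, and the other has no document element at all.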
The following several sections consider each of the XML 1.0 well-formedness constraints.