
Summary

This chapter covered XML syntax rules and basic parsing concepts.

  • We were introduced to fundamental XML terminology, such as element, attribute, tag, and content.

  • XML document structure was discussed, including the XML prolog, consisting of the XML declaration and the document type declaration, both of which are optional but desirable.

  • Names of elements, attributes, and many other XML identifiers are required to conform to the definition of an XML Name.

  • An XML Name consists of a leading letter, underscore, or colon, followed by zero or more name characters (letters, digits, hyphens, underscores, colons, or periods).

  • XML is case-sensitive. Although there is no universal convention concerning use of uppercase or lowercase when developing your own language, one recommendation is to use UpperCamelCase for elements and lowerCamelCase for attributes, a convention used in SOAP.

  • We learned the difference between markup and character data; all text that isn't markup is character data.

  • We covered most of the types of markup, including start and end tags, empty element tags, entity references, character references, comments, CDATA sections, document type declarations, processing instructions, and XML declarations.

  • The minimal requirement for an XML document is that it be well-formed, meaning that it adheres to a number of XML syntax rules.

  • Well-formedness is a prerequisite for validity: a document is valid only if it is well-formed and also conforms to the constraints imposed by a DTD or XML Schema.

  • Most modern parsers can be switched between two modes: validating and nonvalidating. Validation is crucial during development. In a production environment, however, it may be desirable (under certain circumstances) to disable validation for efficiency.

  • Event-based (e.g., SAX) and tree-based (e.g., DOM) parsing were briefly contrasted.
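
Several of the points above can be tried out with a short sketch. The following uses Python's standard library, which is our choice for illustration and not something the chapter prescribes. The helper names (is_ascii_xml_name, is_well_formed, element_names_sax, element_names_dom) and the SAMPLE document are hypothetical. The name check is an ASCII-only approximation of the XML Name rule, since the full definition also admits many non-ASCII letters. The standard-library parsers used here are nonvalidating, so the sketch tests well-formedness only, not validity.

```python
import re
import xml.sax
import xml.etree.ElementTree as ET
from xml.dom import minidom

# ASCII-only approximation of an XML Name (assumption: the full rule also
# allows many non-ASCII letters, which this pattern rejects).
ASCII_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_ascii_xml_name(candidate: str) -> bool:
    """Leading letter, underscore, or colon, then zero or more name characters."""
    return ASCII_NAME.fullmatch(candidate) is not None


def is_well_formed(text: str) -> bool:
    """Report whether a nonvalidating parser accepts the text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


class ElementNameCollector(xml.sax.ContentHandler):
    """Event-based parsing: the parser calls back once per start tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def element_names_sax(text: str) -> list:
    handler = ElementNameCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names


def element_names_dom(text: str) -> list:
    """Tree-based parsing: build the whole document, then walk it."""
    document = minidom.parseString(text)
    return [node.tagName for node in document.getElementsByTagName("*")]


# Hypothetical sample document, UpperCamelCase elements and lowerCamelCase attributes.
SAMPLE = '<Order orderId="42"><Item sku="A-1"/><Item sku="B-2"/></Order>'

if __name__ == "__main__":
    print(is_ascii_xml_name("xsl:template"))    # legal leading letter
    print(is_ascii_xml_name("2ndItem"))         # illegal leading digit
    print(is_well_formed(SAMPLE))
    print(is_well_formed("<Order></order>"))    # case mismatch in end tag
    print(element_names_sax(SAMPLE))
    print(element_names_dom(SAMPLE))
```

Both parsing styles report the same element names for this sample. The difference is in how they get there: the SAX handler sees each start tag as the parser reaches it and keeps only what it chooses to record, while the DOM version holds the entire tree in memory before any query runs.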
