XML Family of Specifications: A Practical Guide

Book

  • Sorry, this book is no longer in print.

Description

  • Copyright 2002
  • Dimensions: 7-3/8" x 9-1/4"
  • Pages: 1168
  • Edition: 1st
  • Book
  • ISBN-10: 0-201-70359-9
  • ISBN-13: 978-0-201-70359-7

As XML continues to mature, developers need to understand how this standard and its related technologies are revolutionizing software development. XML Family of Specifications: A Practical Guide, now a two-volume set, provides a complete roadmap for understanding how XML, XSL, XML Schema, and related specifications interlink to create powerful, real-world applications.

Sample Content

Online Sample Chapters

Exploring the XML Infoset

Working with Canonical XML

XML and Namespaces

XML Syntax and Parsing Concepts

Table of Contents



List of Figures.


List of Tables.


Preface.


Acknowledgments.

INTRODUCTION.

1. History of the Web and XML.

Ancient History (1945 to 1985).

Hypertext and Early User Interfaces.

GML and SGML: Content vs. Presentation.

ARPANET and the Internet: Infrastructure.

Medieval History (1986 to 1994).

Berners-Lee, the Web, and HyperText Markup Language.

Historic Timelines.

Premodern History (1994 to 1998).

Ultramodern History (1998 to 2001).

Summary.

For Further Exploration.

Notes.

I. Fundamental XML Concepts, Syntax, and Modeling.

2. Overview of the XML Family of Specifications.

Fixing the Web.

HTML Standards Change Too Slowly.

Browser-Specific Extensions Are Problematic.

No Meaningful Markup of Data.

Presentation Is Often Fixed for Monitors.

Content Changes Cause Problems.

Browser Paradigm Is Too Constraining.

Search Engines Need Better Focusing.

Can't Specify Collections of Related Pages.

One-Way Linking Is Too Limited.

Enter XML and Its Many Benefits.

What Does XML Look Like?

Presentation vs. Structure.

XML Representation with a DTD.

Document-centric vs. Data-centric.

Processing XML with XSLT.

XPath Language.

Benefits and Applications.

Domain-Specific Vocabularies.

XML Can Describe User Interfaces.

XML Complements HTML.

Validated, Self-Describing Data.

Metadata.

Search Engines.

Distributed Applications.

Granular Updates.

User-Selected and User-Specific View of Data.

Device-Dependent Display of Data.

Resolution-Independent Graphics in a Text Format.

Rendering with Formatting Objects.

Unicode and Alternate Character Set Support.

The Big Picture and the Role of the W3C.

W3C Recommendation Process.

W3C Domains, Activities, and Working Groups.

The Big Picture.

Summary.

For Further Exploration.

3. XML Syntax and Parsing Concepts.

Elements, Tags, Attributes, and Content.

XML Document Structure.

XML Declaration.

Document Type Declaration.

Document Body.

Markup, Character Data, and Parsing.

XML Syntax Rules.

Well-Formedness.

Legal XML Name Characters.

Elements and Attributes Are Case-Sensitive.

Uppercase Keywords.

Case Conventions or Guidelines.

Root Element Contains All Others.

Start and End Tags Must Match.

Empty Elements.

Proper Nesting of Start and End Tags.

Parent, Child, Ancestor, Descendant.

Attribute Values Must Be Quoted.

White Space Is Significant.

Comments.

Processing Instructions.

Entity References.

CDATA Sections.

Well-Formed vs. Valid Documents.

Validity.

Well-Formed or Toast?

Validating and Nonvalidating Parsers.

Event-Based vs. Tree-Based Parsing.

Event-Based Parsing.

Tree-Based Parsing.

Summary.

For Further Exploration.

4. DTD Syntax.

What Is a DTD and When Is It Needed?

Elements and Content Models.

Element Content Model.

Sequencing and Choosing Elements.

Occurrence Indicators.

Mixed Content Model.

EMPTY Content Model.

ANY Content Model.

Expressions: The Good, the Bad, and the Ugly.

Deterministic Content Models.

Comments.

Attributes, Attribute Types, and Default Values.

Attribute-List Declaration.

Attribute Types.

Attribute Default Values.

Elements vs. Attributes: Guidelines.

Example DTD and XML Instance.

Invoice DTD.

Invoice XML Instance.

External and Internal DTD Subsets.

External Subsets.

Internal Subsets.

External Subset with Internal Subset.

Entities and Notations.

Document Entity.

General Entities.

Parameter Entities.

Notations.

Generating DTDs and XML Instances.

Generating a DTD from an XML Instance.

Generating XML Instances from a DTD.

Overall DTD Structure.

Summary.

For Further Exploration.

5. Namespaces, XML Infoset, and Canonical XML.

Namespaces.

Why Namespaces Are Needed: Resolving Name Conflicts.

Qualified Names, Prefixes, Local Names, and Other Terminology.

Declaring Namespaces in XML Documents.

Default Namespace.

Handling Namespaces in a DTD or XML Schema.

Validating Documents with Namespaces.

What Does a Namespace Point To?

Namespace Support and Use.

Special Attributes: xmlns, xml:space, xml:lang, and xml:base.

Common Namespaces.

XML Information Set.

What's in the Infoset?

What's Not in the Infoset?

Canonical XML.

Why Is Canonical XML Needed?

Canonical XML Terminology.

Canonical XML Example: Different but Equal.

Summary.

For Further Exploration.

6. XML Schema: DTDs on Steroids.

The Need for Schemas: Why DTDs Aren't Always Enough.

Document-centric.

No Metadata Access.

Limited Datatypes.

Hard-to-Define Ranges or Sets.

No Subclassing.

Order of Children Is Too Rigid.

Limited Way to Express Number of Repetitions.

Lacks Namespace Support.

Enter XML Schema.

Historical Perspective: Forerunners of XML Schema.

XML-Data and XML-Data Reduced.

Document Content Description.

Schema for Object-Oriented XML.

Document Definition Markup Language.

Relevant Specifications.

Basic Example.

Address DTD and Instance.

Address Schema and Instance.

What Have We Gained?

Collection Schema Example.

Collection DTD.

Collection XML Schema.

Collection Schema Instance.

Key Concepts and Terminology.

Schema Components.

Keywords: DTD vs. XML Schema.

Elements, Declarations, and Definitions.

Element Repeatability: minOccurs and maxOccurs.

XML Representation Summary of xsd:element.

Local vs. Global Scope.

Attribute Declarations and Occurrence.

Content Model and Model Groups: Introduction.

Creating and Using Datatypes.

Definition of Datatype.

Object-Oriented Analogy: A Brief Detour.

Simple Types, Complex Types, Simple Content, Complex Content, and Derivation.

Built-in Datatypes.

Derivation by Constraining a Simple Type with Facets.

Regular Expressions for Pattern Facet.

Derivation by List.

Derivation by Union.

More about Complex Types.

Adding and Constraining Attribute Values.

Restricting a User-Defined Type.

Adding Element Children.

Adding Elements to a User-Defined Type.

Extension of Complex Types with Complex Content.

Restriction of Complex Types with Complex Content.

Empty Elements with and without Attributes.

Summary of Type Definition Cases.

Limiting Derivation.

More about Content Models and Model Groups.

Mixed Content Model.

Generic xsd:any Content Model.

Group Element.

Attribute Groups.

Miscellaneous XML Schema Topics.

Annotations.

Namespaces and XML Schema.

Import and Include.

Working with XML Schema.

XML Schema Software.

Converting DTDs to XML Schema.

Converting with TurboXML by TIBCO Extensibility.

Converting with XML Spy IDE by Altova.

Schema Validation.

Validation Using XSV by Henry Thompson.

Validation Using XML Spy.

Schema Repositories and Registries.

XML.org.

BizTalk.

Open Applications Group.

More Miscellaneous Topics.

Shortcomings of XML Schema.

When to Use DTDs Instead of XML Schema.

XML Schema Topics Not Covered in Detail.

XML Schema Alternatives.

RELAX.

TREX.

RELAX NG.

Schematron.

Summary.

For Further Exploration.

II. PARSING AND PROGRAMMING APIS.

7. Parsing with SAX.

Overview of Parsing and Processing XML.

Parsing, Validation, APIs, and Consumers.

Different Approaches: SAX and DOM, JDOM, and JAXP.

XML Infoset and Its Relation to Parsing.

Recommended Java Parsers and Non-Java Support.

Development of SAX.

SAX: Event Handler Model.

SAX2 Interfaces and Classes.

Major Interfaces and Classes.

Overall SAX Application Sequence.

ContentHandler and Context Tracking.

XMLReader.

ErrorHandler: Well-Formedness and Validation.

Features and Properties.

SAX1 vs. SAX2.

Using SAX with Java.

Determining the Parser Driver Classname.

Compiling and Running a SAX Application.

Minimal SAX Example.

More Robust SAX Example.

Valid Parse Results.

Error Results.

More Representative Input.

SAX Filters.

Writing XML Using SAX.

Summary.

For Further Exploration.

8. Parsing with the DOM.

Overview of the DOM.

Historical Perspective.

Tree vs. Event Model.

Generic Interfaces: Good or Bad?

DHTML Comparison.

Relevant Specifications and Key Resources.

DOM Levels.

DOM Level 2 Specifications.

DOM Level 3 Specifications.

Testing for Feature Support.

Collection DTD and Instance Revisited.

DOM Nodes and How the DOM Works.

Node Interface.

NodeList and NamedNodeMap.

Node IDL Definition.

Node Java Binding.

Overview of DOM Interfaces and Their Methods.

Core DOM Level 2 Interfaces and Methods.

Document Interface.

Using the DOM.

Compiling and Running a DOM Application.

Overview of Apache Xerces Packages.

Minimal DOM Application.

Handling Additional DOM Processing Requirements.

Error Handling.

Checking for Feature Availability.

Accessing Node Type-Specific Properties.

DOM2 Output with Valid Input.

Examples of Well-Formedness and Validation Error Handling.

Accessing Attributes.

Filtering Nodes of Interest.

Adding, Changing, and Removing DOM Nodes.

Serializing DOM Trees.

Script Access to the DOM.

Microsoft's DOM.

DOM-Related Markup Languages.

DOM for MathML 2.0.

DOM for SMIL Animation.

DOM for SVG 1.0.

Complete DOM Code Example.

Summary.

For Further Exploration.

9. Processing with JDOM and JAXP.

Overview of Java XML APIs.

JDOM: A Java-centric Parsing Approach.

How JDOM Differs from the DOM.

JDOM Packages.

Using JDOM.

Reading and Writing with JDOM.

JDOM Output with Valid Input.

Examples of Well-Formedness and Validation Error Handling.

JDOM Summary.

Sun's XML APIs: The Java XML Pack.

JAXP: Sun's Java API for XML Processing.

JAXP 1.1 Components and Packages.

Specifications Supported by JAXP.

Using JAXP 1.1.

Enabling Validation.

Using SAX with JAXP.

Using DOM with JAXP.

Java System Properties for JAXP.

JAXP Code Example.

TrAX Overview and Basic Example.

JAXP Summary.

SAX vs. DOM vs. JDOM vs. JAXP--Who Wins?

For Further Exploration.

III. TRANSFORMING AND DISPLAYING XML.

10. Styling XML Using CSS2.

What Is CSS?

CSS Basics: Declarations, Selectors, and Use with HTML.

Rules, Declarations, Selectors, Properties, and Values.

Embedded CSS Style Sheet in HTML.

External CSS Style Sheet.

Associating External CSS with HTML.

CSS2 Selectors.

Netscape 6 vs. Internet Explorer 5.5.

Style Sheet or Stylesheet?

Using CSS with XML.

Associating CSS with XML.

Element Names and the Display Property.

Rendering Example of XML with CSS.

Improved CSS Style Sheet for XML.

Robust CSS Style Sheet with Generated Text.

CSS with Internet Explorer 5.5.

Limitations of Using CSS for XML.

Summary.

For Further Exploration.

11. Transforming XML with XSLT and XPath.

Overview of XSLT and XPath.

What Is XSL? XSLFO? XSLT? XPath?

XSLT Processing Model.

Historical Perspective.

Relevant Specifications.

Using XSLT.

Server-Side Transformations.

XSLT and XSLFO Software Lists.

Functional Capabilities of XSLT.

Advantages of XSL/XSLT Compared to CSS.

Choosing and Using XSLT Processors.

Running the Xalan XSLT Processor.

Hello XSLT Example.

XSLT Concepts and Examples.

Example XML Document: collection6.xml.

Central Concepts.

Value of a Node or Expression.

Default Built-in Template Rules and Node Tests.

The <xsl:stylesheet> Element.

Namespaces for XSLT, XPath, and XSLFO.

Stylesheet Structure.

Implicit vs. Explicit Stylesheets and Push vs. Pull.

Conditionals and Variables.

More about Setting Variables.

Multiple Decisions.

Generating an HTML Table.

Accessing Attributes.

Attribute Sets for Reuse.

XML to XML: Shallow Copies.

XML to XML: Deep Copies and Creating Elements.

More XML to XML Transformations.

Reader Challenge: Invoice XML to XML Transformation.

Iterating and Sorting.

Primary and Secondary Sort Keys.

Associating XSL with XML: Processing Instruction or Element.

Special Characters: Disabling Output Escaping.

Output Methods Revisited: XML, HTML, Text.

Reuse: Named Templates and Passing Parameters.

Attribute Value Templates.

Reuse: Including and Importing.

XPath Concepts and Examples.

The XPath Model.

Location Paths and Steps.

XPath Axes.

Node Types, Node Values, and Node Tests.

Data Types.

XPath Expressions vs. XSLT Patterns.

XPath Operators.

XPath Functions.

Node-Set Functions in XPath.

String Functions in XPath.

Boolean Functions in XPath.

Numeric Functions in XPath.

XSLT Elements and Instructions.

Root and Top-Level Elements.

XSLT Instructions.

XSLT Functions.

Examples of XSLT Functions.

Case Study: Generating Link Pages from This Book's “For Further Exploration” Sections.

Problem Statement and Goals.

Examining the Generated XML Structure.

Basic Structure and Pitfalls.

Extracting the Links.

Adding Chapter Information.

Sorting the Links.

Adding Style.

Final Solution.

Additional XSLT Topics.

Microsoft XSL: Old and in the Way.

Topics Not Covered in Detail.

Beyond XSLT 1.0: XSLT 2.0, XPath 2.0, EXSLT, and XSLTSL.

Summary.

For Further Exploration.

12. Practical Formatting Using XSLFO.

Introduction.

The Context of XSLFO.

Extensible Markup Language.

The XML Family of Recommendations.

Examples.

Basic Concepts of XSLFO.

Basic Concepts.

Area and Page Basics.

Page Geometry.

XSLFO Objects Related to Basic Areas and Simple Page Definitions.

Generic Body Constructs.

XSLFO Objects Related to Generic Body Constructs.

Tables.

XSLFO Objects Related to Tables.

Static Content and Page Geometry Sequencing.

XSLFO Objects Related to Static Content and Page Geometry Sequencing.

Floats and Footnotes.

XSLFO Objects Related to Floats and Footnotes.

Keeps, Breaks, Spacing, and Stacking.

Interactive Objects.

XSLFO Objects Related to Dynamic Properties and Dynamic Rendering Sequencing.

Supplemental Objects.

Lesser Used XSLFO Objects.

For Further Exploration.

IV. RELATED CORE XML SPECIFICATIONS.

13. XLink: XML Linking Language.

Overview.

Why HTML Linking Isn't Sufficient.

What Is XLink?

Link Types and the xlink:type Attribute.

Terminology and Concepts.

Support for XLink.

Relevant Specifications.

Historical Perspective.

Simple Link: Reinventing the Anchor.

Simple Link Code Example.

Declaring XLinks in DTDs.

Validating Simple XLinks with a Modified DTD.

XLink Attributes.

Attributes by Purpose and Datatype.

Attributes by Link Type.

Link Behavior.

xlink:show Attribute.

xlink:actuate Attribute.

Extended Links: The True Flexibility of XLink.

Basic Extended Link Example.

Locator Link Type.

Resource Link Type.

Arc Link Type.

Title Link Type.

Extended Link Code Example.

Outbound, Inbound, and Third-Party Links.

Third-Party Extended Link Example.

Linkbases.

XML Base Support for Relative URIs.

XLink Implementations.

Summary.

For Further Exploration.

14. XPointer: XML Pointer Language.

Overview.

Relevant Specifications.

Relationship to XPath.

Relationship to XLink.

Forms of XPointers.

Terminology and Concepts.

Historical Perspective.

Code Example Modification.

Forms of XPointers.

Full XPointers.

Bare Names.

Child Sequences.

XPointer Functions.

start-point() and end-point().

here() and origin().

range-to().

range() and range-inside().

string-range().

Node-Points and Character-Points.

Node-Points.

Character-Points.

Escaping in XPointers.

XPointer Implementations.

Summary.

For Further Exploration.

V. SPECIALIZED XML VOCABULARIES.

15. XHTML: HTML for the Present and the Future.

The XHTML Family.

Why Do We Need XHTML?

Overlap of XHTML Family.

Relevant Specifications: The XHTML Family.

XHTML 1.0.

Strictly Conforming XHTML Documents.

Simple XHTML Example.

Three Flavors of XHTML 1.0 DTDs: Which Should You Use?

Differences between XHTML 1.0 and HTML 4.01.

Sloppy HTML Example.

Validating HTML 4.01 Transitional and Strict.

HTML Tidy: Converting HTML to XHTML.

Validating XHTML 1.0 Transitional and Strict.

Considerations for Displaying XHTML.

Combining XHTML with Another Vocabulary.

Modularization of XHTML.

Abstract Modules and Implementations.

The Modules.

Anatomy of a Module.

Drivers.

Building DTD Modules (Adding Your Own Module).

XHTML Basic.

RDDL: An Extension of XHTML Basic.

RDDL DTD.

RDDL Qualified Names Module.

RDDL 1.0 Document Model Module.

RDDL Resource Module.

RDDL XLink Module.

Sample RDDL Document.

XHTML 1.1--Module-based XHTML.

XHTML 1.1 Modules and Elements.

Conforming XHTML 1.1 Documents and the Driver.

Document Model Module and Customizing XHTML 1.1.

Near Future XHTML.

XForms: Next Generation Web Forms.

XML Events: An Events Syntax for XML.

Modularization of XHTML in XML Schema.

XHTML 2.0.

Summary.

For Further Exploration.

16. RDF: Resource Description Framework.

Overview.

Motivation.

History of RDF.

Model, or “Why Couldn't I Just Use XML?”

RDF Specifications.

Terminology.

Core Data Model of RDF.

Vocabulary for Ontological Description Using RDF.

Qualified Values: Special Case of Structured Values.

Classes and Instances.

Properties, Subproperties, and Property Constraints.

Containers.

Miscellaneous Vocabulary.

XML Serialization Syntax for RDF.

Full Syntax.

Container Syntax.

Abbreviated Syntax.

Parse Types.

Syntactic Conventions for Schemata.

Advanced Topics.

Statements about Statements.

Introducing Additional Constraints.

Model Theory.

For Further Exploration.

APPENDICES.

A. XML Family of Specifications in a Nutshell.
B. E-Commerce Specifications.
C. HTML 4.01 Character Entities.
D. Setting Up Your XML Environment.
E.
Index.

Preface

XML: It's a cheese spread. No, it's a floor wax. No, it's two--two--two products in one! Or maybe it's everything but the kitchen sink? Say, did you hear the one about the XML Kitchen Sink Language? (see http://blogspace.com/xkitchensink/)

XML: What It's All About

It has been said that XML, the Extensible Markup Language, will become the ASCII of the twenty-first century because it is rapidly becoming ubiquitous. XML is expected to have an impact on both the Web and application development comparable to that of Java and JavaScript because it has opened up a wide variety of new capabilities and has been embraced by so many sectors of human endeavor.

XML is a metalanguage--a syntax for describing other languages. These languages span diverse vertical industries including accounting, advertising, aerospace, agriculture, astronomy, automotive products, biology, chemistry, database management, e-commerce/EDI, education, financial institutions, health care, human resources, mathematics, publishing, real estate, software programs, supply chain management, and many more (for the many more, see http://www.xml.org/ml/industry_industrysectors.jsp). In one sense, XML is really a very trivial thing--just a markup syntax for describing structured text using angle brackets. But in another sense, XML is a basic building block--an enabling technology that makes it possible to develop more complex, more interesting, and more powerful tools.
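
To make the "metalanguage" idea concrete, here is a minimal sketch in Python (chosen only for brevity; the book's own examples use Java). The element names invoice, customer, and item and the helper summarize are hypothetical and invented for this illustration. XML itself defines none of them; it supplies only the angle-bracket syntax, and a generic parser can read any such vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: XML defines none of these element or
# attribute names. The document author (or an industry group) does.
INVOICE_XML = """\
<invoice number="1001">
  <customer>Example Corp</customer>
  <item sku="A-1" quantity="2">Widget</item>
  <item sku="B-7" quantity="1">Gadget</item>
</invoice>
"""


def summarize(xml_text):
    """Return the customer name and a list of (sku, quantity, description)."""
    root = ET.fromstring(xml_text)  # generic parser, no vocabulary knowledge
    customer = root.findtext("customer")
    items = [
        (item.get("sku"), int(item.get("quantity")), item.text)
        for item in root.findall("item")
    ]
    return customer, items


if __name__ == "__main__":
    print(summarize(INVOICE_XML))
```

The same parser would work unchanged on a chemistry or real estate vocabulary; only the element names passed to findtext and findall would differ.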

In the Web arena, XML is facilitating exciting improvements such as user-controllable views and filtering of information, creation of truly device-independent content that can be re-purposed for vastly different devices, highly focused searching based on element hierarchies, and more sophisticated and flexible linking mechanisms. In the business and application arena, XML makes it easier to deliver filtered content from databases, to more readily share data between applications and between companies, and to exchange EDI messages that describe complex transactions. In the scientific arena, XML is a natural fit for describing complex datasets, models, control of instruments, images, chemical compounds, and much more.

Just as Java made data processing platform-independent, XML has done the same for data, making the exchange of information much easier than ever before. But, no, XML is not the kitchen sink; it is not the solution to all of the world's problems in one tidy package; nor is it the solution to all your computer needs either, at least not alone. Rather, XML is a tool, or more accurately, a set of tools from the same toolbox. That toolbox is the XML family of specifications. This book will help you see what XML can and cannot do by describing how to use each tool.

Although XML shares a number of concepts with its ancestor, SGML (Standard Generalized Markup Language), XML is said to yield 80 percent of the benefits of SGML, but with only 20 percent of the complexity. It is precisely this 80/20 rule that has excited countless companies and developers, encouraging them to support the efforts of the World Wide Web Consortium (W3C) in the development of XML. A few of the more than 500 companies and organizations that actively support XML development as members of the W3C include IBM, Sun, Microsoft, Oracle, Commerce One, and NASA.

Audience: Who Should Read This Book?

The book is intended for Web developers, which includes programmers, content writers, and designers. Depending on your background and interests, some chapters may be more relevant to you than others. It's intended for those who may be familiar with particular aspects of XML but who have not been formally exposed to all of the major W3C specifications, as well as those who have never dealt with XML before. Later in this preface, I provide a roadmap to help orient you.

I've assumed that most readers are familiar with HTML elements and syntax, although the XML and DTD syntax discussions in Chapters 3 and 4 pretty much cover the concepts of elements, attributes, types, entities, and content that carry over from HTML to XML. In other words, you can get by without knowing HTML, except the XHTML chapter, which will make much more sense to you if you do. For those who would like to brush up on HTML, see "For Further Exploration: HTML and Java" at the end of this preface.

Some examples require programming knowledge, but for most examples, anyone with general Web development skills will find them beneficial. Generally, scope and breadth of treatment are favored over depth. On the other hand, some readers will find that the depth is more than they expected, but they should still be able to "tread the water." My intent in writing this book was to cover a number of XML-related technologies in varying degrees of detail. I'd like to make it clear that although there are three chapters containing Java examples, this is not a book about Java and XML. You don't need a Java background for the vast majority of what's in this book.

Although I do assume the Windows operating system, this is not a statement of preference. My formative years were spent on UNIX (I still use UNIX utilities to maintain a ski club site) at the office and a Mac at home. Rather, since Windows tends to be somewhat ubiquitous, it seems appropriate to show Windows command lines and mention some Windows-only tools. UNIX and Mac users are encouraged to share their experiences with fellow readers via the book's Web site. Personally, I have found cygwin--a UNIX environment for Windows developed by Red Hat--to be very handy (see http://cygwin.com/).

What's Special About This Book?

There are several features that contribute to making this book an invaluable resource for anyone beginning to plunge into the somewhat turbulent "seas" of XML.

  • XML Family of Specifications Big Picture--Since early 1998, I've periodically updated a diagram I call "The Big Picture of the XML Family of Specifications." This unique diagram (front inside cover) depicts virtually all of the key W3C efforts related to XML, with colors to indicate each specification's status (maturity); it includes related non-W3C efforts as well. Physical positioning denotes a relationship among neighboring specifications, as explained in Chapter 2. Best of all, the Big Picture diagram appears as an imagemap on the CD-ROM and on this book's Web site, possibly as a more up-to-date version. The Big Picture imagemap on the Web site expands acronyms as your mouse hovers over a term. Clicking on the acronym or name connects you instantly to the actual specification or, in some cases, a collection of documents relating to that specification.

  • History Timeline--A detailed "History of the Web and XML" in timeline form--the product of a considerable amount of research--is broken down into three time periods in Chapter 1, which should be interesting to many readers. Historical perspectives are also presented for particular specifications in their own chapters. A rather unique pullout at the back of the book shows, in bar chart format, the gestation periods of all of the XML specifications in this book, giving you a visual picture of what developments occurred in sequence and/or in tandem.

  • Coverage--I've selected what are generally considered to be the most significant XML-related specifications from the W3C: XML/DTDs, XML Namespaces, XML Schema, the DOM, CSS, XSLT, XPath, XSLFO, XLink, XPointer, XHTML, and RDF. Several of the less frequently discussed specifications, such as XML Infoset, Canonical XML, XML Base, and XML Inclusions, are also covered. In addition, I've included four topics that are not under the purview of the W3C: RDDL, SAX, JDOM, and JAXP. The focus is on breadth rather than depth of coverage because if you have a general understanding of a lot of XML topics, you can better appreciate which are most relevant to your needs and you can "drill down" to the details by following the links I provide. The hope is that as you become more familiar with each of the topics I present, you'll know which areas you'll want to explore by buying more specialized Addison-Wesley or Prentice Hall books (e.g., about XSLT, XML with Java, or XHTML). I've tried hard to make the information current and have spent a good bit of time in the final months polishing and updating details here and there. All topics are as up-to-date as possible, except where noted otherwise.

  • For Further Exploration--Each chapter ends with a section called "For Further Exploration," which presents quite a few links that serve not only as my bibliography but also as pointers to resources that contain more details than can be provided here without killing way more than my fair share of trees. Links are provided to the specifications themselves, to articles that explain the specs in more everyday language than the precision required for formal specifications, and to articles describing subtleties or nuances of the specs. Links to tutorials, books, software, special references, and so on are also supplied. My intention is that readers will use the links, so they all appear in HTML form on the book's CD. Professors may wish to consider some of these links for students' research assignments.

  • Tables--I'm a big fan of the use of tables. When I read a technical book, I seldom read it word for word, cover to cover. Often I want to locate some particular detail pretty quickly, so I look it up in the table of contents or index--I don't want to have to skim through paragraph after paragraph to find the little tidbit I need. Therefore, I feel that tables will help you do the same thing, maximizing the use of your time. The List of Tables is something with which you might want to familiarize yourself--let a table be your friend.

  • CD-ROM--The CD that accompanies the book contains all the sample code presented in the text, as well as most of the software I used while writing this book, including the following:
    - Code Examples--every example that appears as a code listing plus a number of variations
    - XML Environment--batch files to simplify using XML with Java on Windows operating systems
    - For Further Exploration--all links from the end of each chapter
    - Big Picture of XML Family of Specifications Imagemap--links to more than 60 specifications, including many not covered in this book (see Chapter 2)
    - W3C XML Specifications in PDF Form--every W3C specification discussed in this book is available (unedited) for offline reading (hours and hours of fun for the whole family)
    - Glossary of terms
    - Chapter 12, "Practical Formatting Using XSLFO" by G. Ken Holman, in HTML format with two useful appendices which aren't included in the printed book
    - Freeware and evaluation copies of commercial software (XML/DTD/XML Schema editors, validators, parsers, XSLT processors, and more)
  • Web Site--The book's main Web site is hosted by Web Developer's Virtual Library, an Internet.com site. I maintain the extensive XML section of WDVL.com. The book's URL there is http://WDVL.Internet.com/Authoring/Languages/XML/XML-Family. There you'll find all the links from the "For Further Exploration" sections organized by chapter, as well as the online version of the Big Picture imagemap, and of course the inevitable corrections to the text. While this material appears on the CD-ROM, the Web site versions may be more up-to-date. The Web site will be updated periodically; you can register to receive e-mail when the site is updated, if you wish.

Organization and Roadmap: How You Should Read This Book

This book is divided into five conceptual parts. With the exception of a few chapters in Part I, it is not absolutely necessary to read this book chapter by chapter (and I'll tell you right up front: "the butler did it"). Chapter 1, "History of the Web and XML," provides an interesting historical perspective of the development of XML, but some readers may prefer to skip it entirely, or at least defer reading it until they've completed other chapters or find themselves on a long, boring plane flight with neither good movies nor readable magazines. Readers without a Java background may wish to gloss over the three chapters that contain Java examples, instead focusing on the concepts that are discussed in these chapters. The following describes the book's organization and suggested reading emphasis.
  • Introduction: History of the Web and XML--As mentioned, Chapter 1 provides an historical perspective. It's divided into three eras: Ancient History (1945 to 1985), Medieval History (1986 to 1994), and Modern History: From HTML to XML (1994 to 2001).
  • Part I: Fundamental XML Concepts and Syntax--This part introduces XML Syntax, DTD Syntax, the XML Infoset abstraction, Canonical XML, Namespaces, RDDL (Resource Directory Description Language), and XML Schema, corresponding to Chapters 2 through 6, intended to be read in sequence. All readers should read these chapters, although if you won't be developing your own vocabularies, you might be able to skim the DTD and XML Schema chapters (4 and 6, respectively). Although XML Schema is expected to replace the use of DTDs in many applications, your own project needs may dictate sticking with DTDs, in which case you could skip the XML Schema chapter, although I still recommend that you read the sections in Chapters 4 and 6 that highlight DTD limitations and XML Schema advantages. If you are tempted to skip the chapter on Infoset, Canonical XML, Namespaces and RDDL (Chapter 5), be sure to at least read the Namespaces section because this concept is central to many XML specifications. All chapters following 5 assume you are familiar with XML Namespaces. Although RDDL is a recent grassroots effort as I write this, it's bound to have gathered a lot of momentum by the time you read this.
  • Part II: Parsing and Programming APIs--This part presents SAX (Simple API for XML), DOM (Document Object Model), JAXP (Java API for XML Processing), and JDOM--Chapters 7 through 9. All of these are application programming interfaces (APIs) for parsing and manipulating XML documents. This is the part of the book with the most Java examples. While all readers are encouraged to read the initial sections of the SAX and DOM chapters, non-Java developers can completely skip Chapter 9, which covers JAXP and JDOM, as well as the code examples in the SAX and DOM chapters. However, be sure to read the explanation of parsing at the beginning of Chapter 7 and study the comparison, "SAX vs. DOM vs. JDOM vs. JAXP--Who Wins?" at the end of Chapter 9.
  • Part III: Displaying and Transforming XML--This part covers CSS (Cascading Style Sheets), XSLT (Extensible Stylesheet Language Transformations), XPath (XML Path Language), and XSLFO (Extensible Stylesheet Language Formatting Objects), presented in Chapters 10 to 12. Of these, the lengthy Chapter 11 on XSLT and XPath is essential reading for anyone who wishes to display or transform XML into other formats (including HTML, XHTML, text, or other kinds of XML, particularly in e-commerce applications). Chapter 10 on CSS is more important if your XML display needs are more modest and your transformation needs are nil. The chapter can be skimmed for XML hooks if you are already familiar with CSS. Chapter 12 concerns XSL Formatting Objects, sort of the next-generation CSS for desktop-publishing-quality layout, PDF, and targeting your output for different devices. The XSLFO chapter was contributed by noted XSL expert and instructor G. Ken Holman, chair of the OASIS XSLT/XPath Conformance Technical Committee (see his home page at http://www.cranesoftwrights.com/).
  • Part IV: Related Core XML Specifications--This part focuses on XLink (XML Linking Language) and XPointer (XML Pointer Language)--Chapters 13 and 14. Most developers will benefit from reading about XLink and XPointer because they greatly extend the notion of linking and fragment access beyond what is possible in HTML 4.01, including one-to-many links, multidirectional links, links stored external to the documents, and linking to specific elements without hooks being provided by the original author.
  • Part V: Specialized XML Vocabularies--This part presents two unrelated XML-based languages: XHTML (Extensible HyperText Markup Language) in Chapter 15 and RDF (Resource Description Framework) in Chapter 16. Please consider Chapter 15 on XHTML as essential reading for all developers. As you'll see, XHTML is its own nuclear family of specifications that is currently replacing HTML, especially in the increasingly popular world of handheld devices, voice browsers, and other alternative Web interfaces. RDF should be of particular interest to developers and scientists with an interest in metadata (data about data), site descriptions, catalogs, intelligent software agents, and so on. RDF attempts to add semantics to the Web; related concepts are the recent XML Topic Maps (XTM) effort and the older Dublin Core work. The RDF chapter was contributed by Ora Lassila, co-author of the Resource Description Framework Model and Syntax Specification for the W3C and contributor to the RDF Core Working Group and Web-Ontology (WebOnt) Working Group (see his home page at http://www.lassila.org/).

This book does not cover XQuery, an XML Query language, nor Scalable Vector Graphics (SVG), except in passing. XQuery was still very much in flux at the time of this writing. As for SVG, with a more than 500-page specification, I felt I could not do the topic justice in the time I had left after writing the rest of this book. Well, there's always the Second Edition, I guess.

What You Need to Get the Most Out of This Book

All code examples have been developed on a Dell Dimension XPS R450 PC (a paltry 450 MHz) running Windows 98. DOS .bat files are provided to help you configure your environment so that you can run the examples on your own. UNIX developers should be able to study the .bat files and set environment variables accordingly, such as CLASSPATH for Java and variables that point to the location of XML parsers and XSLT processors. I'm afraid I can't say much to Mac developers at this point (sadly, my own ancient PowerMac 7100/80 hasn't been used for the better part of three years), but if you contact me via the Web site and want me to share your experiences with others, I will gladly do so. I'll give you credit and a free copy of this book--it makes a great gift and keeps its flavor longer than fruitcake.

XML and DTD examples are plain text, so they are viewable in their raw form on all platforms using any text editor. To process XML in a browser, however, you'll need the most current generation of browsers, such as Netscape 6.x, Internet Explorer 5.5 or 6.x, Amaya 5.x, or Opera 5.x or higher. If you're not the type of reader who has to try out every example in his or her own browser, then perhaps the many screenshots in this book will be sufficient. Evaluation copies of commercial XML, DTD and XML Schema editors appear on the CD that accompanies this book; XML parsers and XSLT processors also appear there. The CD also contains a page of links to the current versions of all provided software, as well as links to software that couldn't be included on the CD for a variety of reasons.

The Java code examples should compile and run fine with either JDK 1.2.x or 1.3.x, also known by other confusing names and numbers such as Java 2 SDK, J2EE, and J2SE--or their equivalent as provided with your favorite Java IDE (Integrated Development Environment). This book does not attempt to teach Java; on the other hand, you really don't need to know Java to follow most of the discussions. Interested readers who desire a better Java background should refer to the key Java resources listed in "For Further Exploration: HTML and Java" that follows.

I truly hope you enjoy this book and find the XML family of specifications as fascinating as I do.

Conventions Used in This Book

The typographic conventions used in this book are as follows:

  • Glossary terms look like this where they are defined: node-set
  • Code excerpts, code listings, command lines, filenames, element names, and attribute names look like this: <xsl:template match="/CD"> or collection8.xml.
  • Quotations (material excerpted from another source) are indented both left and right and are set in a smaller type size.
  • Notes, important information or things to watch out for, are set off by an arrow in the margin and rules above and below their text.

For Further Exploration: HTML and Java

Dave Raggett's Getting Started with HTML
http://www.w3.org/MarkUp/Guide/

Web Design Group's HTML 4.0 Reference
http://www.htmlhelp.com/reference/html40/

Google's HTML Tutorials category
http://directory.google.com/Top/Computers/Data_Formats/Markup_Languages/HTML/Tutorials/

Java Technology Products and APIs
http://java.sun.com/products/

The Java Tutorial
http://java.sun.com/docs/books/tutorial/

Google Web Directory: Java includes a Books category
http://directory.google.com/Top/Computers/Programming/Languages/Java/

Google Web Directory: Java IDEs
http://directory.google.com/Top/Computers/Programming/Languages/Java/Development_Tools/Integrated_Development_Environments/

Cafe au Lait Java FAQs, News, and Resources
http://www.ibiblio.org/javafaq/
