
Charles F. Goldfarb's XML Handbook, 4th Edition


  • Sorry, this book is no longer in print.
Not for Sale


  • Copyright 2002
  • Edition: 4th
  • Premium Website
  • ISBN-10: 0-13-065198-2
  • ISBN-13: 978-0-13-065198-3

  • The proven XML resource: applications, products, technologies, and tutorials!
  • Revised and enlarged, with the latest standards and trends: schemas, datatypes, XSL, voice, wireless
  • Two CD-ROMs: 175 genuinely free software packages, including the IBM alphaWorks suite
  • Web services: SOAP, WSDL, UDDI

FREE Trial Version TurboXML IDE & Schema Editor

FREE NeoCore XMS Native XML Database—Personal Edition

100,000 copies in print


The proven resource for the Semantic Web and Web Services—100,000 copies in six languages!

Developers, managers, consultants, and VCs rely on its technical accuracy, accessible writing style, and broad and deep coverage.

Learn XML

Start by learning what XML is, why it came to be, how it differs from HTML, and the handful of vital concepts that you must understand to apply XML quickly and successfully—in your business and in your code.


Experience XML

Experience XML through illustrated discussions of tools and applications: Web services, B2B, B2C, EDI, exchanges, e-commerce, integration, portals, content management, databases, conversion, syndication, telephony, wireless, customization, publication, presentation.

Master XML

Master the details from friendly, in-depth presentations: XML, schemas, DTDs, datatypes, XSLT, XSL-FO, XLink, XPath, XPointer, XSDL, namespaces, topic maps, RDF, SOAP, UDDI, WSDL, VoiceXML.

"This book is an excellent starting point where you can learn and experiment with XML. As the inventor of SGML, Dr. Charles F. Goldfarb is one of the most respected authorities on structured information."

—From the Foreword by Jean Paoli,
Microsoft XML architect and co-editor of the W3C XML specification

2 CD-ROMs: 175 no-time-limit FREE packages

Sample Content

Online Sample Chapter

The XML Usage Spectrum

Table of Contents

Preface by Charles F. Goldfarb.

Foreword by Jean Paoli, co-editor of W3C XML Recommendation.

Prolog by Jon Bosak, chair of W3C XML Working Group.


 1. Why XML?
 2. Just enough XML.
 3. The XML usage spectrum.
 4. Better browsing through XML.
 5. Taking care of e-business.
 6. XML Jargon Demystifier(tm).


 7. Personalized frequent-flyer website.
 8. Building an online auction website.
 9. Enabling data sources for XML.


10. From EDI to IEC: The new Web commerce.
11. XML and EDI: Working together.
12. An information pipeline for petrochemicals.


13. Application integration with Web and email.
14. Integrating the mainframe.
15. Integrated provisioning.
16. Business integration.


17. “World” class content management.
18. Content systems.
19. Components: Key to content management.
20. Components for graphic content.


21. Portal servers for e-business.
22. Content systems for portals.
23. RxML: Your prescription for healthcare.
24. Information and Content Exchange (ICE).


25. Personalized financial publishing.
26. High-volume data reporting.
27. Developing reusable content.


28. XML and databases.
29. XPath-based XML DBMS.
30. Storing XML in a relational DBMS.
31. XML, SQL, and XPath: Getting it all together.


32. XML mass-conversion facility.
33. Do-it-in-house mass conversion.
34. Integrating legacy data.
35. Acquiring reusable renditions.


36. Building a schema for a product catalog.
37. Schema management.
38. Building your e-commerce vocabulary.
39. XML design.


40. VoiceXML in a mobile environment.
41. Adding telephony to your website.


42. Extended linking.
43. Topic maps: Knowledge navigation aids.
44. RDF: Metadata description for Web resources.
45. Application integration using topic maps.


46. The Web services vision.
47. Web services technologies.
48. Deploying a Web service.


49. XML processing.
50. Java technology for XML development.
51. Compression techniques for XML.
52. New directions for XML applications.


53. XML basics.
54. Creating a document type definition.
55. Entities: Breaking up is easy to do.
56. Advanced features of XML.
57. Reading the XML specification.


58. Namespaces.
59. Datatypes.
60. XML Schema (XSDL).


61. XML Path Language (XPath).
62. XSL Transformations (XSLT).
63. XSL formatting objects (XSL-FO).
64. XML Pointer Language (XPointer).
65. XML Linking Language (XLink).


66. Free resources on the CD-ROM.
67. Repositories and vocabularies.
68. Acronyms and initialisms in The XML Handbook.
69. Other books on XML.



When Paul Prescod and I wrote the first edition of this book—four years and 100,000 copies ago—XML was brand new and the subject of extraordinary hype. It promised to provide universal data interchange, revolutionize publishing on the Web, and transform distributed computing.

Those claims were amazing, not just because of the extent of the promised impact, but because of the diversity of the areas affected. More amazingly, the claims have largely been fulfilled. With the support of the entire computer industry, an XML-based infrastructure is being constructed for modern computing; indeed, for modern business itself.

In some ways, though, the construction site resembles the Tower of Babel. The professionals in the areas affected by XML tend to talk and write about it in their own way, from each area's unique perspective, and in its specialized jargon.

But not in The XML Handbook!

From the first edition, our aim has been to integrate and unify the teaching of XML so that any tech industry professional can learn it, regardless of background. And by "learn it" we mean not just the technical details but the way that XML is used. Specifically:

  • We use a unified standards-based vocabulary consistently. We explain when particular disciplines or industries use terms in conflicting or ambiguous ways.
  • We explain all technical concepts as we introduce them, even the basics, but we don't indulge in "simplification by distortion". We clarify without sacrificing accuracy.
  • We describe major trends, applications, and product categories objectively, employing the unified vocabulary, so you can see clearly how they relate to one another and to XML technology.

As a result, developers with diverse backgrounds found they could get the full picture of XML from The XML Handbook. Moreover, they also found they could encourage management to read the book and learn why XML is so important to the enterprise.

XML in a nutshell

HTML—the HyperText Markup Language—made the Web the world's library. XML—the Extensible Markup Language—is its sibling, and it is making the Web the world's commercial and financial hub.

In the process, the Web is becoming much more than a static library. Increasingly, users are accessing the Web for "Web pages" that aren't actually on the shelves. Instead, the pages are generated dynamically from information available to the Web server. That information can come from databases on the Web server, from the site owner's enterprise databases, or even from other websites.

And that dynamic information needn't be served up raw. It can be analyzed, extracted, sorted, styled, and customized to create a personalized Web experience for the end-user. To coin a phrase, Web pages are evolving into Web services.

For this kind of power and flexibility, XML is the markup language of choice. You can see why by comparing XML and HTML. Both are based on SGML—the International Standard for structured information—but look at the difference:


HTML:

<p>P200 Laptop<br>Friendly Computer Shop<br>$1438

XML:

<product><model>P200 Laptop</model><dealer>Friendly Computer Shop</dealer><price>$1438</price></product>

Both of these may appear the same in your browser, but the XML data is smart data. HTML tells how the data should look, but XML tells you what it means.

With XML, your browser knows there is a product, and it knows the model, dealer, and price. From a group of these it can show you the cheapest product or closest dealer without going back to the server.
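
As a minimal sketch of that idea, the Python snippet below parses a small group of product elements and picks the cheapest one locally, with no further request to a server. Only the first product comes from the example above; the `catalog` wrapper element, the other two dealers and prices, and the function names `price_of` and `cheapest` are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: only the first <product> comes from the text above;
# the <catalog> wrapper and the other entries are invented for illustration.
CATALOG = """
<catalog>
  <product><model>P200 Laptop</model><dealer>Friendly Computer Shop</dealer><price>$1438</price></product>
  <product><model>P200 Laptop</model><dealer>Corner Computing</dealer><price>$1399</price></product>
  <product><model>P200 Laptop</model><dealer>Laptops R Us</dealer><price>$1510</price></product>
</catalog>
"""

def price_of(product):
    """Return the product's price as a float, dropping the leading '$'."""
    return float(product.findtext("price").lstrip("$"))

def cheapest(xml_text):
    """Return (dealer, price) for the lowest-priced product in the XML."""
    root = ET.fromstring(xml_text)
    best = min(root.iter("product"), key=price_of)
    return best.findtext("dealer"), price_of(best)

dealer, price = cheapest(CATALOG)
print(f"{dealer}: ${price:.0f}")
```

The lookup works because the tags name what each value means; the same query against the HTML version would have to guess which line of text is the price.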

Unlike HTML, XML allows custom tags that can describe exactly what you need to know. Because of that, your client-side applications can access data sources anywhere on the Web, in any format. New "middle-tier" servers sit between the data sources and the client, translating everything into your own task-specific XML.
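
A rough sketch of what such a middle-tier translation step might look like is shown below. It assumes the upstream data source delivers comma-separated text; the CSV column names, the `products` wrapper element, and the function name `csv_to_product_xml` are hypothetical choices for this illustration, not something the book prescribes.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical upstream feed in CSV form; the column names are assumptions.
CSV_FEED = "model,dealer,price\nP200 Laptop,Friendly Computer Shop,1438\n"

def csv_to_product_xml(csv_text):
    """Translate each CSV row into a <product> element inside <products>."""
    root = ET.Element("products")
    for row in csv.DictReader(io.StringIO(csv_text)):
        product = ET.SubElement(root, "product")
        ET.SubElement(product, "model").text = row["model"]
        ET.SubElement(product, "dealer").text = row["dealer"]
        ET.SubElement(product, "price").text = "$" + row["price"]
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")

print(csv_to_product_xml(CSV_FEED))
```

The client then sees only the task-specific `product` vocabulary, whatever format the original source used.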

But XML data isn't just smart data, it's also a smart document. That means when you display the information, the model name can be in a different font from the dealer name, and the lowest price can be highlighted in green. Unlike HTML, where text is just text to be rendered in a uniform way, with XML text is smart, so it can control the rendition.

And you don't have to decide whether your information is data or documents; in XML, it is always both at once. You can do data processing or document processing or both at the same time.
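
To illustrate doing both at once, the sketch below treats the same parsed tree as data (finding the lowest price) and as a document (rendering each element in its own style, with the lowest price in green). The `products` wrapper, the second dealer, the choice of HTML tags for styling, and the function name `render` are all assumptions for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical input; only the first <product> comes from the text above.
PRODUCTS = """<products>
  <product><model>P200 Laptop</model><dealer>Friendly Computer Shop</dealer><price>$1438</price></product>
  <product><model>P200 Laptop</model><dealer>Corner Computing</dealer><price>$1399</price></product>
</products>"""

def render(xml_text):
    """Return HTML in which each element is styled according to its meaning."""
    root = ET.fromstring(xml_text)
    products = list(root.iter("product"))
    # Data processing: compute the lowest price across all products.
    lowest = min(float(p.findtext("price").lstrip("$")) for p in products)
    # Document processing: bold model, italic dealer, lowest price in green.
    lines = []
    for p in products:
        price_text = p.findtext("price")
        is_lowest = float(price_text.lstrip("$")) == lowest
        colour = ' style="color: green"' if is_lowest else ""
        lines.append(
            f"<p><b>{escape(p.findtext('model'))}</b> "
            f"<i>{escape(p.findtext('dealer'))}</i> "
            f"<span{colour}>{escape(price_text)}</span></p>"
        )
    return "\n".join(lines)

print(render(PRODUCTS))
```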

With that kind of flexibility, it's no wonder that we're starting to see a new Web of smart, structured information. It's a "Semantic Web" in which computers understand the meaning of the data they share.

Your broker sends your account data to Quicken using XML. Your imaging software keeps its templates in XML. Everything from math to multi-media, chemistry to commerce, wireless to Web services, is using XML or is preparing to start.

The XML Handbook will help you get started too!

What about SGML?

This book is about XML. You won't find feature comparisons to SGML, or footnotes with nerdy observations like "the XML empty-element tag does not contradict the rule that every element has a start-tag and an end-tag because, in SGML terms, it is actually a start-tag followed immediately by a null end-tag".

Nevertheless, for readers who use SGML, it is worth addressing the question of how XML and SGML relate. There has been a lot of speculation about this.

Some claim that XML will replace SGML because there will be so much free and low-cost software. Others assert that XML users, like HTML users before them, will discover that they need more of SGML and will eventually migrate to the full standard.

The truth is that XML is a simplified subset of SGML. The subsetting was optimized for the Web environment, which implies data-processing-oriented (rather than publishing-oriented), short life-span (in fact, usually dynamically-generated) information. The vast majority of XML documents will be created by computer programs and processed by other programs, then destroyed. Humans will never see them.

Eliot Kimber, who was a member of both the XML and SGML standards committees, says:

There are certain use domains for which XML is simply not sufficient and where you need the additional features of SGML. These applications tend to be very large scale and of long term; e.g., aircraft maintenance information, government regulations, power plant documentation, etc.
Any one of them might involve a larger volume of information than the entire use of XML on the Web. A single model of commercial aircraft, for example, requires some four million unique pages of documentation that must be revised and republished quarterly. Multiply that by the number of models produced by companies like Airbus and Boeing and you get a feel for the scale involved.

I agree with Eliot. I invented SGML, I'm proud of it, and I'm awed that such a staggering volume of the world's mission-critical information is represented in it.

I'm gratified that SGML made the Web possible and that the Society for Technical Communication awarded joint Honorary Fellowships to the Web's inventor, Tim Berners-Lee, and myself in recognition of the synergy.

But I'm also proud of XML. I'm proud of my friend Jon Bosak who made it happen, and I'm glad that the World Wide Web is becoming XML-based.

If you are new to XML, don't worry about any of this. All you need to know is that the XML subset of SGML has been in use for a decade or more, so you can trust it.

SGML still keeps the airplanes flying, the nuclear plants operating safely, and the defense departments in a state of readiness. You should look into it if you produce documents on the scale of an Airbus or Boeing. For the rest of us, there's XML.

About our sponsors

With all the buzz surrounding a hot technology like XML, it can be tough for a newcomer to distinguish the solid projects and realistic applications from the fluff and the fantasies. It is just as tough for authors to keep track of all that is happening in a field expanding as rapidly as this one.

In this case, the solution to both problems was to seek support and expert help from friends in the industry. I know the leading companies in the XML arena and know they have experience with both proven and leading-edge applications and products.

In the usual way of doing things, had we years to write this book, we would have interviewed each company to learn about its strategies, products and/or application experiences, written some chapters, asked the companies to review them, etc., and gone on to the next company. To save time and improve accuracy, we engaged in parallel processing. I spoke with each sponsor, agreed on subject matter for a chapter that would fit the book plan, and asked them to write the first draft.

I used their materials as though they were my own interview notes—editing, rewriting, deleting, and augmenting as necessary to achieve my objective for the chapter in the context of the book. I used consistent standards-based terminology and an objective factual style. All sponsored chapters are identified with the name of the sponsor, and usually with the names of the experts who contributed to it. I'd like to take this opportunity to thank them for being so generous with their time and knowledge.

We are grateful to our sponsors just as we are grateful to you, our readers. Both of you together make it possible for The XML Handbook to exist. In the interests of everyone, we make our own editorial decisions and we don't recommend or endorse any product or service offerings over any others.

Our thirty-four sponsors are:

  • Adobe Systems Incorporated, http://www.adobe.com/products/framemaker
  • Altova, Inc., http://www.xmlspy.com
  • Auto-trol Technology, http://www.auto-trol.com
  • Business Layers, http://www.businesslayers.com
  • Chrystal Software, http://www.chrystal.com
  • eidon GmbH, http://www.eidon-products.com
  • Four J's Development Tools, http://www.4js.com
  • Hewlett-Packard Company, http://www.hp.bluestone.com
  • HostBridge Technology, http://www.hostbridge.com
  • IBM Corporation, http://www.ibm.com/xml
  • IBM Informix, http://www.informix.com
  • ICE Authoring Group, http://www.icestandard.org
  • Infoteria Corporation, http://www.infoteria.com
  • Innodata Corporation, http://www.innodata.com
  • Intel Corporation, http://www.intel.com/eBusiness
  • Intelligent Compression Technologies, Inc., http://www.ictcompress.com
  • Interwoven, Inc., http://www.interwoven.com
  • Liquent, http://www.liquent.com
  • Microsoft Corporation, http://msdn.microsoft.com/xml
  • Mozquito Technologies AG, http://www.mozquito.com
  • NeoCore, http://www.neocore.com
  • Oracle Corporation, http://www.oracle.com/xml
  • Peregrine Systems, http://www.peregrine.com
  • Profium Ltd., http://www.profium.com
  • Sequoia Software, http://www.sequoiasoftware.com
  • Software AG, http://www.softwareagusa.com
  • Sun Microsystems, http://java.sun.com/xml
  • Synth-Bank LLC, http://www.synthbank.com
  • TIBCO Software Inc., http://www.extensibility.com
  • Vertical Computer Systems/Emily Solutions, http://www.emilysolutions.com
  • Voxeo Corporation, http://community.voxeo.com
  • XCare.net, http://www.xcare.net
  • XMLCities, Inc., http://www.XMLCities.com
  • XyEnterprise, http://www.xyenterprise.com

How to use this book

The XML Handbook has eighteen parts, consisting of 69 chapters, that we intend for you to read in order.

Well, if authors didn't have dreams they wouldn't be authors.

In reality, we know that our readers have diverse professional and technical backgrounds and won't all take the same route through a book this large and wide-ranging. Here are some hints for planning your trip.

To start, you can get the best feel for the subject matter by reading the Table of Contents and the introductions to each part. The introductions are less than a page long and usually epitomize the subject area of the part in addition to introducing the chapters within it.

Part One contains introductory tutorials and establishes the terminology used in the remainder of the book. Please read it first.

Parts Two through Fourteen cover different application domains. The chapters are application discussions, case studies, and tool category discussions, plus some introductory discussions and tutorials. You can read them with only the preceding parts (especially Part One) as background, although technical readers may want to complete the remaining tutorials first.

Those can be found in Parts Fifteen through Seventeen. We strove to keep them friendly and understandable for readers without a background in subjects not covered in this book. Tutorials whose subject matter thwarted that goal are labeled as being a tad tougher so you will know what to expect, but not to discourage you from reading them.

Part Eighteen contains resources: guides to the CD-ROM and to public XML vocabularies, an acronym dictionary, and a guide to the other books in this series.

