Building XML Messages from Processes to Data

This section looks at the process for building business messages with XML. As we discussed earlier in this chapter, XML lets trading partners define their own elements and tags, taking advantage of XML's extensible nature—the X in XML. But XML messages also represent the structure of those elements, following their prescribed relationships in the hierarchy. The message schema DTD captures the names of the elements as represented by the tags and their hierarchical structure. Messages exchanged among trading partners therefore must represent the rules and practices of a business or industry, as captured in the schema DTD.

For example, in Chapter 3, "ebXML at Work," the Marathoner running store case study points out how a retailer and its manufacturers can exchange product identifiers and precise inventory levels, so that manufacturers can compare inventory levels to predefined reorder points and decide whether they need to ship more product. Before any of these exchanges can happen, however, the retailer and the manufacturers (or, better yet, the entire industry) need to agree on common terminology and on the structure of the messages. With this common set of rules, shoe manufacturers and retailers can use the same basic set of messages, which promotes the use of packaged software and makes it possible for the parties to develop their systems faster and for less money.

We call this common set of rules a data model because, like a schematic drawing, it offers a skeleton view of the messages, specifies the order of the elements in a message, and shows how the various elements relate to one another in a hierarchy. The term comes from the database world, where database design needs to meet the users' business requirements as efficiently as possible, yet still allow for future growth. The logical model defines the information fields and their relationships in a database (much like a schema DTD in an XML message), while the physical model details field sizes and datatypes, such as alphanumeric or date formats.30 In fact, defining an XML schema of information is analogous to creating traditional row-and-column layouts for a database design system.

XML syntax alone does not determine what a message means. The business process gives the content its context, so the modeling work starts with the process and moves down to the data, with XML helping along the way.

Determine Processes

As shown in the case studies in Chapter 3, the parties identify business processes or actions taken by the companies to achieve their business goals. For example, the travel agency case proposes a process to decide on a tour package. This process has contingencies built in for continued bids and best-and-final offers if the customers don't want to accept one of the first offers. By working out these larger processes, the trading partners can agree on the overall conduct of the business, before trying to determine the individual messages.

A tool called use cases can help identify these processes. Use cases describe scenarios in which users interact with each other and the systems under development. Each scenario describes the accomplishment of a specific task or achievement of a goal. They also identify the players, steps in the process, and the messages or even the data exchanged. By describing these situations in a storytelling mode, use cases often uncover the processes underlying business practices.31

One of the ebXML development activities involves identifying similarities in business processes across industries. While each industry has its own language and culture, using these common processes helps speed the work and improves the chances for interoperability among industries.32

Determine Message Flows

Each process contains a set of individual messages exchanged among the trading partners. In Chapter 3, the running store case listed a series of messages in the process of reporting inventory levels and replenishing the stock:

  • Periodic inventory report sent from the store to the manufacturer

  • Ship notice sent from the manufacturer to the store with the shipment details

  • Receiving report sent from the store to the manufacturer once inventory is accepted

Industries defining their processes can identify the individual messages contained in those processes, as well as how and when the companies send and receive the messages. These messages may resemble EDI transaction sets (see Chapter 5, "The Road Toward ebXML," for a discussion of EDI), as in the running store case, or look nothing like EDI transactions, as in the travel agency case.

Identify Data in the Messages

Once industries identify the messages, they next need to identify the sets of business data that go into those messages. Industry organizations that have previously developed EDI transactions can use this work as the basis for identifying data for XML messages. Newer business processes must rely on information analysis between companies to determine the content required, often replacing older, paper-based documents. But the objective is to improve the way companies do business—not necessarily to follow the current EDI transactions or old paper-process documents. Industries sometimes use this exercise to test traditional assumptions and practices, which can cut out captured or exchanged data that's no longer needed. On the other hand, this process can generate more pieces of data needed by trading partners to meet their business requirements.

When applying this process to XML, industry groups develop XML vocabularies that put these groups of data into definable messages, also identifying the structure of the data in the messages. To aid understanding and reuse, the XML structure should link related and most-used pieces of information together as logical blocks. The messages thus embody the rules and practices of doing business in a particular industry, defined in terms of XML. In this way, industries can design common groups of data with common structures as industry-wide rules for processing XML messages.

XML vocabularies can represent more than vertical industries. Vocabularies can also define business functions found in multiple industries, or entire frameworks that provide interoperability across industries and functions. One of these frameworks is ebXML itself, which provides the underpinning for global business, not just an industry sector.33

Business Schema DTDs

As discussed earlier, DTDs, as specified for XML, contain the rules for both constructing and structurally validating XML messages. We'll now describe schemas in more detail to give you an understanding of how this key piece of the XML technology is used to enable consistent electronic business.

DTDs assemble information into elements with connected attributes. Elements are the basic building blocks of XML messages, and therefore the basic components of DTDs. Elements can contain other elements expressed in a hierarchy (compound elements), or they can stand alone as simple containers for character data. Compound elements for parent/child blocks can be referenced together. When the modeling process identifies the data in proposed XML messages, most of these data items will become elements, identified as such in DTDs. In XML messages themselves, elements are marked up as tags within the now familiar angle brackets (<>). Element definitions can indicate the frequency with which the elements occur—once or more than once—and whether they're required or optional.34
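
The occurrence rules described above can be sketched in DTD syntax. The following is a minimal, self-contained illustration rather than a fragment of any published vocabulary; the element names (Address, NumberStreet, RoomFloor, City, Country) are assumptions chosen to echo the traveler example used later in this section. In a content model, a bare name means exactly once, ? means optional, * means zero or more, and + means one or more.

```xml
<?xml version="1.0"?>
<!DOCTYPE Address [
  <!-- Compound element: children must appear in this order.
       NumberStreet: exactly once; RoomFloor: zero or more;
       City: exactly once; Country: optional (zero or one).
       All names here are hypothetical. -->
  <!ELEMENT Address (NumberStreet, RoomFloor*, City, Country?)>
  <!-- Simple elements: containers for character data only. -->
  <!ELEMENT NumberStreet (#PCDATA)>
  <!ELEMENT RoomFloor    (#PCDATA)>
  <!ELEMENT City         (#PCDATA)>
  <!ELEMENT Country      (#PCDATA)>
]>
<Address>
  <NumberStreet>312 Sycamore St</NumberStreet>
  <City>Buffalo</City>
</Address>
```

The instance omits RoomFloor and Country, which this content model permits; leaving out City would make the document invalid.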

Attributes, in turn, provide additional description or qualification for elements. Using the language metaphor often applied to XML, one can think of elements as nouns and attributes as adjectives. The XML document example presented earlier and the following DTD fragment identify PostalCode as an element, with codetype as an attribute of that element indicating the kind of postal code in use:

<PostalCode codetype='ZIP'>96045</PostalCode>

<!-- DTD definition for element and attribute -->

<!ELEMENT PostalCode (#PCDATA) >
<!ATTLIST PostalCode
         codetype CDATA #IMPLIED >

With the schema DTD syntax, attributes also provide a limited form of data typing: the attribute declaration describes the kind of value the attribute may hold. Attribute values can be strings (character data), enumerated lists, or tokenized types such as ID and IDREF, which identify or refer to other components in the document.

Enumerated lists restrict the attribute to only the permitted character strings. For example, an attribute to identify smoking preferences for hotel reservations would have the following as its enumerated listing: SMOKING or NONSMOKING. Attributes can likewise indicate a default response, used routinely unless the customer requests otherwise. Returning to the hotel example, the NONSMOKING response could serve as the default, unless the customer specifically requests SMOKING.35 DTD datatyping is deliberately simple and therefore easy to understand, whereas the datatyping in the newer W3C XML Schema is extensive and sophisticated.36
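
The hotel example can be sketched as a DTD attribute declaration. This is a minimal illustration only; the element name RoomRequest and the attribute name smoking are assumptions, not names taken from any published travel vocabulary.

```xml
<?xml version="1.0"?>
<!DOCTYPE RoomRequest [
  <!ELEMENT RoomRequest (#PCDATA)>
  <!-- Enumerated list: only the two listed values are permitted.
       "NONSMOKING" is the default, applied when the attribute is omitted.
       RoomRequest and smoking are hypothetical names. -->
  <!ATTLIST RoomRequest
            smoking (SMOKING | NONSMOKING) "NONSMOKING">
]>
<!-- No smoking attribute is given, so a parser that reads the DTD
     supplies smoking="NONSMOKING" for this element. -->
<RoomRequest>One room, two nights</RoomRequest>
```

A value outside the list, such as smoking="MAYBE", would be reported as a validity error by a validating parser.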

The Entity Referencing System

Entities are rather misnamed. They're really aliases or substitution strings, intended to identify the reusable objects in a schema DTD, providing handy shortcuts and helping to ensure consistency in the rules expressed by the DTD. These reusable objects can consist of text strings, such as legal boilerplate, or more complex data element and attribute combinations, defined in advance and recalled when needed. Entities can be internal to the DTD or stored as fragments externally.37
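
As a small sketch of the boilerplate case, the document below declares an internal entity once and recalls it by name. The element names and the entity name cancelPolicy are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE Confirmation [
  <!ELEMENT Confirmation (Terms)>
  <!ELEMENT Terms (#PCDATA)>
  <!-- Internal entity: a named substitution string, declared once.
       The name cancelPolicy is hypothetical. -->
  <!ENTITY cancelPolicy "Cancellations require 48 hours' notice.">
]>
<Confirmation>
  <!-- The parser replaces the reference with the declared text. -->
  <Terms>&cancelPolicy; See your itinerary for details.</Terms>
</Confirmation>
```

Because the wording lives in one declaration, every message that uses the reference carries identical text, which is the consistency benefit described above.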

Entities also help when a character that would otherwise be confused with XML markup, such as &, <, >, or ", has to appear in the character data or attribute values of an XML document. The predefined entity references &amp;, &lt;, &gt;, and &quot; stand in for these characters.
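
A short sketch of these predefined references in use follows. The element name, attribute name, and content are invented for illustration.

```xml
<?xml version="1.0"?>
<!-- &amp; and &lt; stand in for & and < in character data;
     &quot; stands in for " inside a double-quoted attribute value.
     AgencyName and note are hypothetical names. -->
<AgencyName note="Known as &quot;P&amp;B&quot;">Peabody &amp; Beebe Travel &lt;East&gt;</AgencyName>
```

After parsing, an application sees the attribute value as Known as "P&B" and the element content as Peabody & Beebe Travel <East>.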

Consider the telephone number in the following example. The <Telephone> element is a substitution string declared as an entity in the schema DTD telephone-usa.xml, and then included as needed in XML documents based on that DTD. The OpenTravel Alliance uses this technique in its customer profile, which specifies several telephone numbers (customer, emergency contact, travel agency, and so on). The use of this technique simplifies the schema DTD and guarantees that all telephone numbers in the valid messages are defined consistently.38

<?xml version="1.0"?>
<!DOCTYPE Cust.Telephone SYSTEM 'http://xml.org/telephone-usa.xml' []>
<Cust.Telephone PhoneTech="Voice" PhoneUse="Home">
  <Telephone CountryAccessCode="1">
    <Phone.AreaCityCode>703</Phone.AreaCityCode>
    <Phone.Number>555-9999</Phone.Number>
  </Telephone>
</Cust.Telephone>

Example of Building a Data Model and XML Equivalent

Using a traveler's customer profile, we can show an example of a DTD and how it helps build and validate an XML message.

Table 4.1 lists the pieces of information in a scaled-down traveler profile database. It shows three levels in the data hierarchy and, for each item, its content (element, text, or attribute), whether it occurs once or multiple times, whether it is required, and the allowable options.

The control information identifies the creator of the profile (a travel agency, for the purpose of this exercise), whether it's a new record or an update, whether the customer has given permission to share the data in the profile, and a date/time stamp that most systems can generate routinely.

Table 4.1: Traveler Profile Database Structure

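| Data level 1 | Data level 2 | Data level 3 | Content | Occurs | Required? | Options |
|---|---|---|---|---|---|---|
| Control info | | | Element | Single | Yes | |
| | Share permission? | | Attribute | | | Yes, No |
| | Agency | | Element | Single | Yes | |
| | | Agency name | Text | Single | Yes | |
| | | Agency ID | Text | Single | | |
| | New/Update | | Text | Single | Yes | New, Update |
| | Date-time | | Text | Single | Yes | |
| Traveler ID | | | Element | Multiple | Yes | |
| | Traveler name | | Element | Single | Yes | |
| | | Title | Text | Multiple | | |
| | | Family name | Text | Single | Yes | |
| | | Given names | Text | Multiple | | |
| | Address | | Element | Multiple | Yes | |
| | | Address type | Attribute | | | Mailing, Delivery |
| | | Number/street | Text | Single | Yes | |
| | | Room/floor | Text | Multiple | | |
| | | City name | Text | Single | Yes | |
| | | Postal code | Text | Single | Yes | |
| | | State/Province | Text | Multiple | | |
| | | Country | Text | Single | | |
| | Telephone | | Element | Multiple | Yes | |
| | | Telephone use | Attribute | | | Work, Home |
| | | Country access | Text | Single | | |
| | | Area/city code | Text | Single | Yes | |
| | | Tel. number | Text | Single | Yes | |
| | Email | | Element | Multiple | | |
| | | Email type | Attribute | | | Work, Personal |
| | | Email address | Text | Single | | |
| Form of payment | | | Element | Multiple | Yes | |
| | Payment type | | Attribute | | | Credit card, Debit card |
| | Payment detail | | Element | Multiple | Yes | |
| | | Card number | Text | Single | Yes | |
| | | Exp. date | Text | Single | Yes | |
| | | Name on card | Text | Single | Yes | |
| Travel preferences | | | Element | Multiple | | |
| | General | | Element | Multiple | | |
| | | Smoking section | Text | Single | | Smoking, Non-smoking |
| | | Meal preferences | Text | Multiple | | |
| | | Special needs | | Multiple | | |
| | Loyalty programs | | Element | Multiple | | |
| | | Program type | Attribute | | | General, Airline, Hotel, Rental car |
| | | Program name | Text | Single | | |
| | | Program ID | Text | Single | | |
| | Airline | | Element | Multiple | | |
| | | Departure airport | Text | | | |
| | | Seat selection | Text | | | Aisle, Center, Window |
| | Hotel | | Element | Multiple | | |
| | | City section | Text | | | Downtown, Suburbs, Airport |
| | | Room type | Text | | | Single, Double |
| | Car rental | | Element | Multiple | | |
| | | Car type | Text | | | Compact, Midsize, Full, SUV, Truck |
| | | Child seat | Text | Single | | Yes, No |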


The DTD for this database structure (Traveler.dtd) is found on this book's web site (http://www.ebxmlbooks.com). Please note that this DTD example is meant only to illustrate how a DTD works, and should not be used for normal business messages.

From this database structure, a travel agency wants to create a traveler profile record for a traveler, with the following specific data and preferences:

Administrative control data

  • Agency name: GoGo Travel

  • Agency ID code: ZZY98234

  • Purpose of record: new

  • Date/time: 21 June 2001, 3:55 pm

  • Permission to share data in profile? No

Traveler identification

  • Traveler's name: Ms. Phoebe P. Peabody-Beebe

  • Address (delivery): 312 Sycamore St., Buffalo, NY 14204

  • Telephone (work): 716-555-9999

  • Email: Phoebe@PeabodyBeebe.com

Payment data

  • Type of payment: Credit card

  • Card number: 0000111122223333

  • Expiration date: 12/2002

  • Name on card: Phoebe P Peabody-Beebe

Preferences

  • Nonsmoking

  • Meal type: Vegetarian

  • Loyalty program—airlines: US Airways, no. 24680

  • Loyalty program—car rental: National Car Rental, no. 54321

  • Loyalty program—general: AmEx Membership Miles, no. 09876

  • Departure airport (IATA code): BUF

  • Airline seat preference: Aisle

  • Hotel, city section preference: downtown

  • Hotel room preference: single

  • Car type preference: Compact

Listing 4.4 gives a validated XML document for these entries based on the rules presented in Traveler.dtd.

Listing 4.4 Sample XML Document Based on Traveler.dtd

<Traveler>
  <Control>
    <Agency>
      <AgencyName>GoGo Travel</AgencyName>
      <AgencyID>ZZY98234</AgencyID>
    </Agency>
    <Purpose>New</Purpose>
    <DateTime>20010621t15:55:00</DateTime>
  </Control>
  <TravelerID Share="No">
    <TravelerName>
      <Title>Ms</Title>
      <Family>Peabody-Beebe</Family>
      <Given>Phoebe</Given>
      <Given>P.</Given>
    </TravelerName>
    <Address AddressType="Deliver">
      <NumberStreet>312 Sycamore St</NumberStreet>
      <City>Buffalo</City>
      <PostalCode>14204</PostalCode>
      <StateProv>NY</StateProv>
    </Address>
    <Telephone PhoneUse="Work">
      <AreaCity>716</AreaCity>
      <PhoneNumber>555-9999</PhoneNumber>
    </Telephone>
    <Email>
      <EmailAddress>Phoebe@PeabodyBeebe.com</EmailAddress>
    </Email>
  </TravelerID>
  <Payment>
    <PayDetail>
      <CardNumber>0000111122223333</CardNumber>
      <ExpDate>12/2002</ExpDate>
      <NameOnCard>Phoebe P Peabody-Beebe</NameOnCard>
    </PayDetail>
  </Payment>
  <Preferences>
    <General>
      <Smoking>Non-smoking</Smoking>
      <MealPref>Vegetarian</MealPref>
    </General>
    <Loyalty LoyalType="Airline">
      <LoyalName>US Airways</LoyalName>
      <LoyalID>24680</LoyalID>
    </Loyalty>
    <Loyalty LoyalType="Car Rental">
      <LoyalName>National Car Rental</LoyalName>
      <LoyalID>54321</LoyalID>
    </Loyalty>
    <Loyalty LoyalType="General">
      <LoyalName>AmEx Membership Miles</LoyalName>
      <LoyalID>09876</LoyalID>
    </Loyalty>
    <Airline>
      <DepartAirport>BUF</DepartAirport>
      <SeatSelect>Aisle</SeatSelect>
    </Airline>
    <Hotel>
      <CitySection>Downtown</CitySection>
      <RoomType>Single</RoomType>
    </Hotel>
    <CarRent>
      <CarType>Compact</CarType>
    </CarRent>
  </Preferences>
</Traveler>

This message, which references Traveler.dtd, contains all of the required data, uses tags that match the element names in the DTD, and presents the elements in the order prescribed by the DTD; it therefore conforms to that DTD as a valid document. Notice that the example doesn't have any data for child seat preferences under the car rental section, but does list three different loyalty programs. The rules expressed in the DTD allow for such variations. However, if a message left out the traveler's name, a validating parser would return an error.
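
To make the required, optional, and repeating rules concrete, here is a much-reduced sketch of how such a DTD could express them. It is not the actual Traveler.dtd, which is not reproduced in this chapter; the element names are borrowed from Listing 4.4, but the content models are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE Traveler [
  <!-- Hypothetical content models, not the real Traveler.dtd.
       TravelerName is required; Loyalty may repeat; CarRent is optional. -->
  <!ELEMENT Traveler (TravelerName, Loyalty*, CarRent?)>
  <!ELEMENT TravelerName (Family, Given+)>
  <!ELEMENT Family (#PCDATA)>
  <!ELEMENT Given (#PCDATA)>
  <!ELEMENT Loyalty (LoyalName, LoyalID)>
  <!ATTLIST Loyalty LoyalType CDATA #REQUIRED>
  <!ELEMENT LoyalName (#PCDATA)>
  <!ELEMENT LoyalID (#PCDATA)>
  <!-- ChildSeat is optional inside CarRent, so it can be left out. -->
  <!ELEMENT CarRent (CarType, ChildSeat?)>
  <!ELEMENT CarType (#PCDATA)>
  <!ELEMENT ChildSeat (#PCDATA)>
]>
<Traveler>
  <TravelerName>
    <Family>Peabody-Beebe</Family>
    <Given>Phoebe</Given>
  </TravelerName>
  <Loyalty LoyalType="Airline">
    <LoyalName>US Airways</LoyalName>
    <LoyalID>24680</LoyalID>
  </Loyalty>
  <Loyalty LoyalType="General">
    <LoyalName>AmEx Membership Miles</LoyalName>
    <LoyalID>09876</LoyalID>
  </Loyalty>
  <CarRent>
    <CarType>Compact</CarType>
  </CarRent>
</Traveler>
```

Under these assumed rules the instance is valid with two loyalty programs and no child seat, while deleting the TravelerName element would cause a validating parser to reject it.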

XML Schema

The generic name for DTDs is schemas, a term borrowed from the database world. DTDs represent data only in a hierarchy, which works fine for documentation; remember that the W3C borrowed DTDs from SGML, designed for electronic documentation and the predecessor to XML.

However, many business databases use other kinds of structures—such as relational databases or object-oriented classes and properties—some of which don't always lend themselves to a hierarchical model. In some cases, particularly when working with a simple data structure, data architects have been able to adapt object-oriented structures or relational data models to the kind of hierarchies represented in DTDs. But business doesn't always deal a simple hand, and technologists need more robust and flexible tools than the DTD to be prepared for these more complex conditions.

The W3C has developed XML Schema, a major enhancement to XML that offers extended tools for representing information structures and objects, as well as providing extended datatypes beyond those in DTDs. In May 2001, XML Schema reached full recommendation status.39

XML Schema provides more power for defining the structure, content, and semantics of XML documents. The W3C specifications document has three parts:

  • Methods for describing the structure of data

  • Definition of datatypes

  • A primer, explaining its features40

The first part of the specification deals with structures, documenting the meaning, use, and relationships of the components of an XML document, such as elements, attributes, and entities. It provides the rules for validating XML documents, based on the rules described in the schemas. It also allows for referencing partial or multiple schemas, thus providing a great deal more flexibility and power than DTDs.41

The second part of XML Schema covers datatypes and addresses the need for defining more kinds of data in the rules used to validate XML documents. This part of the specification identifies a group of basic (or primitive) datatypes such as strings, integers, dates, and sequences. The specification describes features of a datatype system, including acceptable ranges of values and valid representations of the data (such as whole numbers or scientific notation).

The specification identifies datatypes derived from those built into the basic XML recommendations, such as character data (CDATA), tokens, and entities. And it defines various components of datatypes to allow for the development of unanticipated datatypes.42
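
A brief sketch of what this looks like in XML Schema syntax follows. The element and type names (DateTime, Travelers, PartySize) and the range of 1 to 12 are assumptions chosen for illustration; only the xs: constructs come from the W3C recommendation.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Built-in date/time type: accepts values such as 2001-06-21T15:55:00 -->
  <xs:element name="DateTime" type="xs:dateTime"/>

  <!-- Derived type: an integer restricted to an acceptable range.
       PartySize and its bounds are hypothetical. -->
  <xs:simpleType name="PartySize">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="12"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="Travelers" type="PartySize"/>
</xs:schema>
```

A DTD could only declare both of these elements as character data; with the schema, a validator can reject a malformed date or a party size of zero.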

This greater flexibility comes with a price, however. While it's tempting to use many of these new features, many business applications require just a few of them at any time. For example, being able to validate dates and times will be a significant addition to XML's ability to support business. Few businesses, however, will need the ability to create entirely new datatypes. Software and systems supporting XML Schema will need to resist the temptation to cover all of the bells and whistles, since they build in more complexity and cost than is needed.43

As an alternative, an OASIS Technical Committee is developing RELAX NG, with eventual submission to ISO planned. RELAX NG is designed as a simpler and more accessible approach to providing schema functionality for XML documents.44

Other Details

XML Schema incorporates one of the first enhancements to the XML specification, called XML Namespaces. With XML Namespaces, schemas can address multiple XML vocabularies in a single document. Namespaces make element and attribute names unique by combining a namespace name (a uniform resource identifier, like a web address, to which the prefix is mapped) with the local part, that is, the element or attribute name.45

Put simply, XML Namespaces let different companies or industries avoid name clashes when they use the same tag name with different meanings or in different contexts. One example is the word stock, which has at least six possible meanings. Another is address: qualified names such as billing:address and supplier:address make it clear that address is being used in two different contexts.
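
The address case might look like the following in a document. The Invoice element, the two namespace URIs, and the supplier address are placeholders invented for this sketch; a namespace URI serves as a unique name and need not point to a real resource.

```xml
<?xml version="1.0"?>
<!-- Both namespace URIs are placeholders, used only as unique names. -->
<Invoice xmlns:billing="http://example.com/ns/billing"
         xmlns:supplier="http://example.com/ns/supplier">
  <billing:address>312 Sycamore St, Buffalo, NY 14204</billing:address>
  <supplier:address>1 Warehouse Way, Rochester, NY 14604</supplier:address>
</Invoice>
```

A namespace-aware parser treats the two address elements as different names because their prefixes map to different URIs, even though the local part is identical.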
