This chapter is from the book

The Technology of Web Services

Programs that interact with one another over the Web must be able to find one another, discover information allowing them to interconnect, figure out what the expected interaction patterns are (a simple request/reply or a more complicated process flow?), and negotiate such qualities of service as security, reliable messaging, and transactional composition. Some of these qualities of service are covered in existing technologies and proposed standards, but others are not. In general, the Web services community is working to meet all these requirements, but it's an evolutionary process, much as the Web itself has been. Web services infrastructure and standards are being designed and developed from the ground up to be extensible, like XML and HTML before them, so that whatever is introduced in the short term can continue to be used as new standards and technologies emerge.

The New Silver Bullet?

Web services are sometimes portrayed as "silver-bullet" solutions to contemporary computing problems, filling the role previously played by the original Web, relational databases, fourth-generation languages, and artificial intelligence. Unfortunately, Web services by themselves can't solve much. Web services are a new layer—another way of doing things—but are not a fundamental change that replaces the need for existing computing infrastructure. This new layer of technology performs a new function—a new way of working—but, most important, provides an integration mechanism defined at a higher level of abstraction.

Web services are important because they are capable of bridging technology domains, not because they replace any existing technology. You could say that newer languages, such as Visual Basic, C#, C/C++, and Java, replace older languages, such as COBOL and FORTRAN, although a lot of programs in those languages are still around, as are Web-services mappings for them. Web services, like Web servers, are complementary to, not in conflict with, existing applications, programs, and databases. Application development continues to require Java, VB, and C#. All that's new is a way of transforming data in and out of programs and applications, using standard XML data formats and protocols to reach a new level of interoperability and integration.

Developers may have to take Web services into account when designing and developing new programs and databases, but those programs and databases will still be required behind Web services wrappers. Web services are not executable things in and of themselves; they rely on executable programs written using programming languages and scripts. Web services define a powerful layer of abstraction that can be used to accomplish program-to-program interaction, using existing Web infrastructure, but they are nothing without a supporting infrastructure.

Web services require several related XML-based technologies to transport and to transform data into and out of programs and databases.

  • XML (Extensible Markup Language), the basic foundation on which Web services are built, provides a base language for defining data and how to process it. XML represents a family of related specifications published and maintained by the World Wide Web Consortium (W3C) and others.

  • WSDL (Web Services Description Language), an XML-based technology, defines Web services interfaces, data and message types, interaction patterns, and protocol mappings.

  • SOAP (Simple Object Access Protocol), a collection of XML-based technologies, defines an envelope for Web services communication—mappable to HTTP and other transports—and provides a serialization format for transmitting XML documents over a network and a convention for representing RPC interactions.

  • UDDI (Universal Description, Discovery, and Integration), a Web services registry and discovery mechanism, is used for storing and categorizing business information and for retrieving pointers to Web services interfaces.

Usage Example

The basic Web services standards are used together. Once the WSDL is obtained from the UDDI or other location, a SOAP message is generated for transmission to the remote site.

As shown in Figure 1-6, a program submitting a document to a Web service address uses an XML schema of a specific type, such as WSDL, to transform data from its input source—a structured file in this example—and to produce an XML document instance in the format consistent with what the target Web service expects, as described in the same WSDL file. The WSDL file is used to define both the input and the output data transformations.

Figure 1-6 Web services use XML documents and transform them into and out of programs.

The sending computer's SOAP processor transforms the data from its native format into the predefined XML schema data types contained in the WSDL file for text, floating point, and others, using mapping tables. The mapping tables associate native data types with corresponding XML schema data types. (Standard mappings are widely available for Java, Visual Basic, CORBA, and other commonly used type systems. Many XML mapping tools are available for defining custom or special mappings.) The receiving computer's SOAP processor performs the transformation in reverse, mapping from the XML schema data types to the corresponding native data types.
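A mapping table of this kind can be sketched in a few lines of Python. The table below is a hypothetical illustration, not the mapping used by any particular SOAP processor: the `xsd:` names are built-in XML Schema data types, while `NATIVE_TO_XSD`, `to_xsd`, and `from_xsd` are names assumed for this example.

```python
# Hypothetical mapping table: native Python types -> XML Schema built-in types.
NATIVE_TO_XSD = {
    bool: "xsd:boolean",
    int: "xsd:integer",
    float: "xsd:double",
    str: "xsd:string",
}

# Reverse table, used by the receiving side.
XSD_TO_NATIVE = {xsd: native for native, xsd in NATIVE_TO_XSD.items()}


def to_xsd(value):
    """Sender side: return (xsd_type, lexical_text) for a native value."""
    xsd_type = NATIVE_TO_XSD[type(value)]
    if isinstance(value, bool):
        # XML Schema writes booleans in lowercase.
        text = "true" if value else "false"
    else:
        text = str(value)
    return xsd_type, text


def from_xsd(xsd_type, text):
    """Receiver side: map the lexical text back to a native value."""
    native = XSD_TO_NATIVE[xsd_type]
    if native is bool:
        return text in ("true", "1")
    return native(text)


xsd_type, text = to_xsd(59.95)      # sending computer
restored = from_xsd(xsd_type, text)  # receiving computer
```

A real processor would cover many more types (dates, arrays, structures) and would take the expected type of each field from the WSDL file rather than from the value itself.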

The URL, in widespread use on the Web, points to an Internet address, reached over TCP (Transmission Control Protocol), that hosts a Web resource. Web services schemas are a form of Web resource, contained in files accessible over the Internet and exposed to the Web using the same mechanism as for downloading HTML files. The major difference between HTML file downloading and accessing Web services resources is that Web services use XML rather than HTML documents and rely on associated technologies, such as schemas, transformation, and validation, to support remote communication between applications. But the way in which Web services schemas are published and downloaded is the same: an HTTP operation on a given URL.

When it receives a document, a Web service implementation must first parse the XML message and validate the data, perform any relevant quality-of-service checking, such as enforcing security policies or trading-partner agreements, and execute any business process flow associated with the document. The Web service at the fictional skateboots.com Web site is located in the skateboots.com/order folder, which is what the URL points to.3

The Web services available at this Internet address are identified within a public WSDL file that can be downloaded to the sending computer and used to generate the message. The Skateboots Company also posted a listing in the public UDDI directory, pointing to the same WSDL file, for customers who might discover the company through the UDDI service. In general, anyone wishing to interact with the Web services that place or track orders for the Skateboots Company over the Web must find a way to obtain and to use that particular WSDL file to generate the message.

Programs at the skateboots.com address provide an HTTP listener associated with the names of the Web services in order to recognize the XML messages sent in the defined format. The programs include XML parsers and transformers and map the data in the SOAP message into the formats required by the Skateboots Company order entry system.
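The receiving steps described above (parse the message, validate the data, then map it into the format the order entry system needs) can be sketched as follows. This is a minimal illustration under stated assumptions: the `order` and `quantity` element names and the `handle_order` function are hypothetical, and the hand-written checks stand in for schema validation, which Python's standard library does not provide.

```python
import xml.etree.ElementTree as ET


def handle_order(xml_text):
    """Parse an incoming order document, validate it, and map it to a dict."""
    # Step 1: parse; a document that is not well-formed is rejected outright.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return {"status": "rejected", "reason": f"not well-formed: {exc}"}

    # Step 2: validate the structure (a stand-in for schema validation).
    if root.tag != "order":
        return {"status": "rejected", "reason": "unexpected root element"}
    try:
        quantity = int(root.findtext("quantity"))
    except (TypeError, ValueError):
        return {"status": "rejected", "reason": "quantity missing or not a number"}
    if quantity < 1:
        return {"status": "rejected", "reason": "quantity must be positive"}

    # Step 3: hand the native data to the (hypothetical) order entry system.
    return {"status": "accepted", "quantity": quantity}


result = handle_order("<order><quantity>2</quantity></order>")
```

Quality-of-service checks, such as security policies, would sit between the validation step and the hand-off; they are omitted here.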

These technologies are enough to build, deploy, and publish basic Web services. In fact, even basic SOAP is enough. Other technologies are continually being added to the expanding Web services framework as they emerge. However, these fundamental technologies are enough to support use of the Internet for basic business communication and to bridge disparate IT domains, and this form of Web interaction is being adopted very quickly.

Over time, as standards for registry, discovery, and quality of service mature, the vision of an ad hoc, dynamic business Web will start to take hold, and Web services will begin to operate more like the current Web, allowing companies to find and to trade with one another purely by using Internet-style communications. In the meantime, the basic Web services technologies and standards covered in this book are sufficient for many solutions, such as integrating disparate software domains—J2EE and .NET, for example—connecting to packaged applications, such as SAP and PeopleSoft, and submitting documents to predefined business process flows.

XML: The Foundation

In the context of Web services, XML is used not only as the message format but also as the way in which the services are defined. Therefore, it is important to know a little bit about XML itself, especially within the context of how it is used to define and to implement Web services.

Reinventing the Wheel

Some people say that Web services are reinventing the wheel because they share many characteristics with other distributed computing architectures, such as CORBA or DCOM. Web services do share considerable common ground with these and other distributed computing architectures and implementations, but there's also a good reason for inventing a new architecture. The Web is established, and to take advantage of this tremendous global network, the concepts of distributed computing need to be adapted. First, the Web is basically disconnected; that is, connections are transient and temporary. Distributed computing services, such as security and transactions, traditionally depend on a transport-level connection and have to be redesigned to provide equivalent functionality for the disconnected Web. Second, the Web assumes that parties can connect without prior knowledge of one another, by following URL links and observing a few basic rules. For Web services, this means that any client can access Web services published by anyone else, as long as the information about the service—the schema—is available and understandable and XML processors are capable of generating messages conforming to the schema.

Traditional distributed computing technologies assume a much more tightly coupled relationship between client and server and therefore cannot inherently take advantage of the existing World Wide Web. Because Web services adopt the publishing model of the Web, it's possible to wrap and to publish a specific end point, or business operation, using a Web services interface definition, without the existence of a client for that end point. This paradigm shift, in which clients can be developed and integrated later, has many advantages in the elusive solution to the problem of enterprise integration.

Purposes of XML

XML was developed to overcome limitations of HTML, especially to better support dynamic content creation and management. HTML is fine for defining and maintaining static content, but as the Web evolves toward a software-enabled platform, in which data has associated meaning, content needs to be generated and digested dynamically. Using XML, you can define any number of elements that associate meaning with data; that is, you describe the data and what to do with it by using one or more elements created for the purpose. For example:

<Company>
  <CompanyName region="US">
  Skateboots Manufacturing
  </CompanyName>
  <address>
   <line>
    200 High Street
   </line>
   <line>
   Springfield, MA 55555
   </line>
   <Country>
   USA
   </Country>
  </address>
  <phone>
  +1 781 555 5000
  </phone>
</Company>

In this example, XML allows you to define not only elements that describe the data but also structures that group related data. It's easy to imagine a search for elements that match certain criteria, such as <Country> and <phone> for a given company, or for all <Company> elements and to return a list of those entities identifying themselves as companies on the Web. Furthermore, as mentioned earlier, XML allows associated schemas to validate the data separately and to describe other attributes and qualities of the data, something completely impossible using HTML.
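The kind of search described here can be sketched with Python's standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. The document is the company example from above with the whitespace tightened; the variable names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

COMPANY_XML = """<Company>
  <CompanyName region="US">Skateboots Manufacturing</CompanyName>
  <address>
    <line>200 High Street</line>
    <line>Springfield, MA 55555</line>
    <Country>USA</Country>
  </address>
  <phone>+1 781 555 5000</phone>
</Company>"""

root = ET.fromstring(COMPANY_XML)

# Look up single elements by their path from the root.
country = root.findtext("address/Country")
phone = root.findtext("phone")

# Collect every <line> element anywhere below the root.
address_lines = [line.text for line in root.findall(".//line")]

# Match on an attribute value as well as on the element name.
us_names = [e.text for e in root.findall(".//CompanyName[@region='US']")]
```

Because the element names carry the meaning, the same queries work on any document that uses the same structure, which is the point the following paragraphs make about shared content models.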

Of course, significant problems result from the great flexibility of XML. Because XML allows you to define your own elements, it's very difficult to ensure that everyone uses the same elements in the same way to mean the same thing. That's where the need for mutually agreed on, consistent content models comes in.

Two parties exchanging XML data can understand and interpret elements in the same way only if they share the same definitions of what they are. If two parties that share an XML document also share the same schema, they can be sure to understand the meaning of the same element tags in the same way. This is exactly how Web services work.

Technologies

XML is a family of technologies: a data markup language, various content models, a linking model, a namespace model, and various transformation mechanisms. The following are significant members of the XML family used as the basis of Web services:

  • XML v1.0: The rules for defining elements, attributes, and tags enclosed within a document root element, providing an abstract data model and serialization format

  • XML schema: XML documents that define the data types, content, structure, and allowed elements in an associated XML document; also used to describe semantic-processing instructions associated with document elements

  • XML namespaces: The uniquely qualified names for XML document elements and applications

The Future of the Web

The inventor of the World Wide Web, Tim Berners-Lee, has said that the next generation of the Web will be about data, not text; XML is to data what HTML is to text. The next generation of the Web is intended to address several shortcomings of the existing Web, notably the difficulty searching the Web for exact matches on text strings embedded in Web pages. Because the Web has been so successful, however, the future of the Web must be accomplished as an extension, or an evolution, of the current Web. It's impossible to replace the entire thing and start over! Solutions for application-to-application communication must be derived from existing Internet technologies.

If the future of the Web depends on its ability to support data communications as effectively and easily as it supports text communications, Web services need to be able to refer dynamically to Web end points, or addresses (URLs), and to map to and from XML transparently. These end points, or addresses, provide the services that process the XML data, in much the same way that browsers process HTML text. These addresses also can be included in any program capable of recognizing a URL and parsing XML. Thus it will be possible to communicate from your spreadsheet to a remote source of data or from your money management program to your bank account management application, make appointments with colleagues for meetings, and so on.

Microsoft and others are already developing these kinds of standard services accessible from any program, and a large part of Microsoft's .NET strategy is focused on development tools for creating and stitching together applications that use predefined Web services. But getting this to happen requires significant standardization, comparable to the effort involved in standardizing PC components, and might therefore not happen for several years.

  • XML Information Set: A consistent, abstract representation of the parts of an XML document

  • XPointer: A pointer to a specific part of a document; XPath, expressions for searching XML documents; and XLink, for linking multiple XML documents

  • Extensible Stylesheet Language Transformations (XSLT): Transformation for XML documents into other XML document formats or for exporting into non-XML formats

  • DOM (Document Object Model) and SAX (Simple API for XML): Programming libraries and models for parsing XML documents, either by creating an entire tree to be traversed or by reading and responding to XML elements one by one

These technologies and others are described in further detail in Chapter 2.
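The difference between the two parsing models in the last bullet can be seen in a short sketch that uses Python's standard `xml.dom.minidom` and `xml.sax` modules. The `order` and `item` element names and the `ItemHandler` class are hypothetical, chosen only for this illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = "<order><item>boots</item><item>laces</item></order>"

# DOM: build the entire tree in memory, then traverse it.
dom = minidom.parseString(DOC)
dom_items = [node.firstChild.data for node in dom.getElementsByTagName("item")]


class ItemHandler(xml.sax.ContentHandler):
    """SAX: respond to elements one by one as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_item:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._buffer))
            self._in_item = False


handler = ItemHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_items = handler.items
```

Both approaches yield the same list of items. The DOM version is shorter but holds the whole document in memory; the SAX version keeps only what the handler chooses to retain, which matters for large messages.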

WSDL: Describing Web Services

The Web Services Description Language (WSDL) is an XML schema format that defines an extensible framework for describing Web services interfaces. WSDL was developed primarily by Microsoft and IBM and was submitted to W3C by 25 companies.4 WSDL is at the heart of the Web services framework, providing a common way in which to represent the data types passed in messages, the operations to be performed on the messages, and the mapping of the messages onto network transports.

WSDL is, like the rest of the Web services framework, designed for use with both procedure-oriented and document-oriented interactions. As with the rest of the XML technologies, WSDL is so extensible and has so many options that ensuring compatibility and interoperability across differing implementations may be difficult. If the sender and the receiver of a message can share and understand the same WSDL file the same way, however, interoperability can be ensured.

WSDL is divided into three major elements:

  • Data type definitions
  • Abstract operations
  • Service bindings

Each major element can be specified in a separate XML document and imported in various combinations to create a final Web services description, or they can all be defined together in a single document. The data type definitions determine the structure and the content of the messages. Abstract operations determine the operations performed on the message content, and service bindings determine the network transport that will carry the message to its destination.
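To make the three major elements concrete, the sketch below embeds a small WSDL 1.1 document and reads one item from each part with Python's standard library. The service, message, and operation names, and the use of skateboots.com/order as a namespace and address, are assumptions for illustration only; the WSDL and SOAP-binding namespace URIs are those of WSDL 1.1.

```python
import xml.etree.ElementTree as ET

WSDL_DOC = """<definitions name="OrderService"
    targetNamespace="http://skateboots.com/order"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://skateboots.com/order"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- 1. Data type definitions -->
  <types>
    <xsd:schema targetNamespace="http://skateboots.com/order">
      <xsd:element name="quantity" type="xsd:int"/>
    </xsd:schema>
  </types>
  <!-- 2. Abstract operations (messages and port types) -->
  <message name="PlaceOrderRequest">
    <part name="quantity" element="tns:quantity"/>
  </message>
  <portType name="OrderPortType">
    <operation name="placeOrder">
      <input message="tns:PlaceOrderRequest"/>
    </operation>
  </portType>
  <!-- 3. Service bindings (transport mapping and end point address) -->
  <binding name="OrderSoapBinding" type="tns:OrderPortType">
    <soap:binding style="document"
        transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="placeOrder">
      <soap:operation soapAction="http://skateboots.com/order/placeOrder"/>
      <input><soap:body use="literal"/></input>
    </operation>
  </binding>
  <service name="OrderService">
    <port name="OrderPort" binding="tns:OrderSoapBinding">
      <soap:address location="http://skateboots.com/order"/>
    </port>
  </service>
</definitions>"""

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
    "xsd": "http://www.w3.org/2001/XMLSchema",
}

root = ET.fromstring(WSDL_DOC)
type_names = [e.get("name")
              for e in root.findall("wsdl:types/xsd:schema/xsd:element", NS)]
operations = [op.get("name")
              for op in root.findall("wsdl:portType/wsdl:operation", NS)]
transport = root.find("wsdl:binding/soap:binding", NS).get("transport")
address = root.find("wsdl:service/wsdl:port/soap:address", NS).get("location")
```

Note that the `portType` section never mentions SOAP or HTTP. Only the binding and service sections do, which is what allows the same abstract operation to be bound to more than one transport.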

Figure 1-7 shows the elements of WSDL, layered according to their levels of abstraction, which are defined independently of the transport, specifically so that multiple transports can be used for the same service. For example, the same service might be accessible via SOAP over HTTP and SOAP over JMS. Similarly, data type definitions are placed in a separate section so that they can be used by multiple services. Major WSDL elements are broken into subparts.

Figure 1-7 WSDL consists of three major elements and seven parts.

The definition parts include data type definitions, messages, and abstract operations, which are similar to interface definitions in CORBA or DCOM. Messages can have multiple parts and can be defined for use with the procedure-oriented interaction style, the document-oriented interaction style, or both. Through the abstraction layers, the same messages can be defined and used for multiple port types. Like the other parts of WSDL, messages also include extensibility components, for example, for including other message attributes.

WSDL data type definitions are based on XML schemas, but another, equivalent or similar type definition system can be substituted. For example, CORBA Interface Definition Language (IDL) data types could be used instead of XML schema data types. (If another type definition system is used, however, both parties to a Web services interaction must be able to understand it.)

The service bindings map the abstract messages and operations onto specific transports, such as SOAP. The binding extensibility components are used to include information specific to SOAP and other mappings. Abstract definitions can be mapped to a variety of physical transports. The WSDL specification includes examples of SOAP one-way mappings for SMTP (Simple Mail Transfer Protocol), SOAP RPC mappings for HTTP, SOAP mappings to HTTP GET and POST, and a mapping example for the MIME (Multipurpose Internet Mail Extensions) multipart binding for SOAP.

XML namespaces are used to ensure the uniqueness of the XML element names used in each of the three major WSDL elements. Of course, when the WSDL elements are developed separately and imported into a single complete file, the namespaces used in the separate files must not overlap. Associated schemas are used to validate both the WSDL file and the messages and operations defined within the WSDL file.
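A short sketch shows how namespaces keep identically named elements apart. The two `urn:example:` namespace URIs and the `order` element name are hypothetical; the `{namespace}name` notation is how Python's `xml.etree.ElementTree` reports qualified names.

```python
import xml.etree.ElementTree as ET

DOC = """<root xmlns:t="urn:example:types" xmlns:o="urn:example:operations">
  <t:order>a data type named order</t:order>
  <o:order>an operation named order</o:order>
</root>"""

root = ET.fromstring(DOC)

# Both children share the local name "order", but each qualified name
# includes its namespace URI, so the two never collide.
tags = [child.tag for child in root]

NS = {"t": "urn:example:types", "o": "urn:example:operations"}
type_text = root.findtext("t:order", namespaces=NS)
operation_text = root.findtext("o:order", namespaces=NS)
```

The prefixes `t` and `o` are only local shorthand. What must not overlap between separately developed files is the namespace URI itself.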

It's safe to say that WSDL is likely to include many extensions, changes, and additions as Web services mature. Like SOAP, WSDL is designed as an extensible XML framework that can easily be adapted to multiple data type mappings, message type definitions, operations, and transports. For example, IETF (Internet Engineering Task Force) working groups are proposing a new protocol standard—Blocks Extensible Exchange Protocol (BEEP)—to define a useful connection-oriented transport. (HTTP, by contrast, is inherently connectionless, making it difficult to resolve quality-of-service problems at the transport level.) Companies interested in using Web services for internal application or integration may choose to extend WSDL to map to more traditional protocols, such as DCOM or IIOP (Internet Inter-ORB Protocol).

SOAP: Accessing Web Services

So far, you have defined the data (XML) and expressed the abstraction of the service necessary to support the communication and processing of the message (WSDL). You now need to define the way in which the message will be sent from one computer to another and so be available for processing at the target computer.

The SOAP specification defines a messaging framework for exchanging formatted XML data across the Internet. The messaging framework is simple, easy to develop, and completely neutral with respect to operating system, programming language, or distributed computing platform. SOAP is intended to provide a minimum level of transport on top of which more complicated interactions and protocols can be built.

SOAP is fundamentally a one-way communication model that ensures that a coherent message is transferred from sender to receiver, potentially including intermediaries that can process part of or add to the message unit. The SOAP specification contains conventions for adapting its one-way messaging for the request/response paradigm popular in RPC-style communications and also defines how to transmit complete XML documents. SOAP defines an optional encoding rule for data types, but the end points in a SOAP communication can decide on their own encoding rules through private agreement. Communication often uses literal, or native XML, encoding.

As shown in Figure 1-8, SOAP is designed to provide an independent, abstract communication protocol capable of bridging, or connecting, two or more businesses or two or more remote business sites. The connected systems can be built using any combination of hardware and software that supports Internet access to existing systems such as .NET and J2EE. The existing systems typically also represent multiple infrastructures and packaged software products. SOAP and the rest of the XML framework provide the means for any two or more business sites, marketplaces, or trading partners to agree on a common approach for exposing services to the Web.

Figure 1-8 SOAP messages connect remote sites.
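When the transport is HTTP, sending a SOAP 1.1 message from one site to another amounts to an HTTP POST that carries the XML as its body. The sketch below only prepares such a request with Python's standard `urllib.request` module and does not send it. The `build_soap_request` helper, the `placeOrder` operation, and the SOAPAction value are assumptions for illustration; the address is the fictional skateboots.com/order end point.

```python
import urllib.request

# A literal-encoded message; element names inside the body are hypothetical.
SOAP_MESSAGE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body>"
    '<placeOrder xmlns="http://skateboots.com/order">'
    "<quantity>2</quantity>"
    "</placeOrder>"
    "</soap:Body>"
    "</soap:Envelope>"
)


def build_soap_request(url, soap_action, message):
    """Prepare (but do not send) an HTTP POST carrying a SOAP 1.1 message."""
    return urllib.request.Request(
        url,
        data=message.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            # SOAP 1.1 over HTTP expects a quoted SOAPAction header value.
            "SOAPAction": f'"{soap_action}"',
        },
        method="POST",
    )


request = build_soap_request(
    "http://skateboots.com/order",
    "http://skateboots.com/order/placeOrder",
    SOAP_MESSAGE,
)
# urllib.request.urlopen(request) would transmit the message; it is left out
# because the end point is fictional.
```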

SOAP has several main parts:

  • Envelope: Defines the start and the end of the message

  • Header: Contains any optional attributes of the message used in processing the message, either at an intermediary point or at the ultimate end point

  • Body: Contains the XML data comprising the message being sent

  • Attachment: Consists of one or more documents attached to the main message (SOAP with Attachments only)

  • RPC interaction: Defines how to model RPC-style interactions with SOAP

  • Encoding: Defines how to represent simple and complex data being transmitted in the message

Only the envelope and the body are required.
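The required and optional parts can be seen by building a message programmatically. This sketch uses Python's standard `xml.etree.ElementTree` module and the SOAP 1.1 envelope namespace; the `build_envelope` function, the `placeOrder` and `quantity` elements, and the `transactionId` header entry are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
ORDER_NS = "http://skateboots.com/order"  # hypothetical application namespace

# Ask ElementTree to serialize the envelope namespace with the "soap" prefix.
ET.register_namespace("soap", SOAP_ENV_NS)


def build_envelope(quantity, transaction_id=None):
    """Return a SOAP envelope; the header is added only when it is needed."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")

    # Header: optional, read by intermediaries or by the ultimate end point.
    if transaction_id is not None:
        header = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Header")
        entry = ET.SubElement(header, f"{{{ORDER_NS}}}transactionId")
        entry.text = transaction_id

    # Body: required, carries the XML data that makes up the message.
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    order = ET.SubElement(body, f"{{{ORDER_NS}}}placeOrder")
    ET.SubElement(order, f"{{{ORDER_NS}}}quantity").text = str(quantity)
    return envelope


minimal = build_envelope(2)
with_header = build_envelope(2, transaction_id="abc-123")
message = ET.tostring(minimal, encoding="unicode")
```

The minimal message contains only the envelope and the body; attachments, RPC conventions, and encoding rules build on this same structure.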

UDDI: Publishing and Discovering Web Services

After you have defined the data in the messages (XML), described the services that will receive and process the message (WSDL), and identified the means of sending and receiving the messages (SOAP), you need a way to publish the service that you offer and to find the services that others offer and that you may want to use. This is the function that UDDI (Universal Description, Discovery, and Integration) provides.

Inside the Enterprise

Many companies are exploring the potential advantages of using Web services both inside and outside the enterprise. This is analogous to using browsers and Web servers inside the enterprise in internal networks. Existing internal Web infrastructure can be put to good use in support of Web services–style interactions. Although unlikely to replace existing distributed computing environments, such as COM and CORBA, Web services can be a valuable supplement to existing technologies. Sometimes, all you have is an HTTP or an SMTP connection. Because they represent a completely neutral format that can be used to achieve a new level of interoperability, Web services can also be used to bridge across COM, CORBA, EJB, and message queueing environments. Finally, because Web services use existing HTTP infrastructure, the impact on system administrators is minimal compared to introducing other distributed computing technologies into an IT department. Performance is certainly an issue compared to more traditional binary-oriented transports and protocols, but the potential benefits outweigh the costs for many applications, and performance issues tend to get solved over time, as they have been for the original Web.

The UDDI framework defines a data model in XML and SOAP application programming interfaces (APIs) for registering and discovering business information, including the Web services a business publishes. UDDI is produced by an independent consortium of vendors, founded by Microsoft, IBM, and Ariba, to develop an Internet standard for Web service description registration and discovery. Microsoft, IBM, Hewlett-Packard, and SAP are hosting the initial deployment of a public UDDI service, which is conceptually patterned after DNS, the Internet domain name service that translates Internet host names into IP addresses. In reality, UDDI is much more like a replicated database service accessible over the Internet.

UDDI is similar in concept to a Yellow Pages directory. Businesses register their contact information, including such details as phone and fax numbers, postal address, and Web site. Registration includes category information for searching, such as geographical location, industry type code, business type, and so on. Other businesses can search the information registered in UDDI to find suppliers for parts, catering services, or auctions and marketplaces. A business may also discover information about specific Web services in the registry, typically finding a URL for a WSDL file that points to a supplier's Web service.

Businesses use SOAP to register themselves or others with UDDI; then the registry clients use the query APIs to search registered information to discover a trading partner. An initial query may return several matches from which a single entry is chosen. Once a business entry is chosen, a final API call is made to obtain the specific contact information for the business.

Figure 1-9 shows how a business would register Web service information, along with other, more traditional contact information, with the UDDI registry. A business first generates a WSDL file to describe the Web services supported by its SOAP processor (1) and uses UDDI APIs to register the information with the repository (2). After a business submits its data to the registry, along with other contact information, the registry entry contains a URL that points to the SOAP server site's WSDL or other XML schema file describing the Web service. Once another business's SOAP processor queries the registry (3) to obtain the WSDL or other schema (4), the client can generate the appropriate message (5) to send to the specified operation over the identified protocol (6). Of course, both client and server have to be able to agree on the same protocol—in this example, SOAP over HTTP—and share the same understanding, or semantic definition of the service, which in this example is represented via WSDL. With the widespread adoption of these fundamental standards, however, this common understanding of WSDL seems ensured.
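The publish-and-discover flow in these steps can be mimicked with a toy in-memory registry. This is emphatically not the UDDI API, which is a set of SOAP calls against a replicated registry; `ToyRegistry`, `BusinessEntry`, the `footwear` category, and the WSDL file name are all hypothetical, used only to show the order of the steps.

```python
from dataclasses import dataclass


@dataclass
class BusinessEntry:
    """Contact and service information that a business registers."""
    name: str
    category: str
    wsdl_url: str


class ToyRegistry:
    """In-memory stand-in for a registry; not the real UDDI data model."""

    def __init__(self):
        self._entries = []

    def publish(self, entry):
        # Step 2: the business registers its information.
        self._entries.append(entry)

    def find_by_category(self, category):
        # Step 3: another business queries the registry.
        return [e for e in self._entries if e.category == category]


registry = ToyRegistry()

# Steps 1 and 2: the provider has described its service in a WSDL file and
# registers a pointer to that file along with its contact details.
registry.publish(BusinessEntry(
    name="Skateboots Manufacturing",
    category="footwear",
    wsdl_url="http://skateboots.com/order/service.wsdl",  # hypothetical file
))

# Steps 3 and 4: the client searches, picks one match, and obtains the URL
# of the WSDL file, from which it can generate the message (steps 5 and 6).
matches = registry.find_by_category("footwear")
chosen = matches[0]
wsdl_url = chosen.wsdl_url
```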

Figure 1-9 The UDDI repository can be used to discover a Web service.
