Definition and Ownership of Web Service Interfaces: WSDL or XSD?
Date: Jun 4, 2004
Introduction
Web services are a preferred way to expose business modules. But the decision to use web services, or any other distributed technology, should be independent of the interface definition or development of the business module. The business module should be developed first, and then the decision can be made to use a suitable distributed technology. This strategy implies that WSDL may not be the appropriate choice for defining the interface, as it's part of a specific distributed technology specification. If you already have XML-based messages as an interface to your module, the XSD for the messages could also be the web service interface for the module.
Our previous article discussed the practice of separating the business module from the web services infrastructure, using web services as a "packaging strategy" for an application. This technique gives the company the flexibility to have separate development teams: one to build the business module and another to develop the web services layer to expose that module. In that article, we also introduced the central question of this article: determining the appropriate technology to expose the web service interface. In this article, we explore the use of XML Schema (XSD) as a web service message interface and identify the appropriate group in the enterprise to take ownership of that interface.
Using XML Schema To Define Messages
The intention behind separating the business module implementation from the web services layer is for each of these designs to be independent of the other. The business module implementation is a business problem requiring domain knowledge; the web services layer solves an infrastructure issue to enable remote access. The two design processes require different, specific skill sets.
Our previous article discussed how web services enable us to define a business-type interface for a business module and use XML for that interface. For a standalone business module that has an XML interface and that would be exposed to external clients, you would typically define an XML schema (XSD) for the message exchanges and validate incoming messages against that schema before processing any requests. Even if the clients are always trusted or are internal, it helps to define a schema for the messages in order to achieve loose coupling between the systems and allow parallel development of all the internal software modules. In essence, the schema becomes the contract among the entities that exchange these messages. A schema for the messages is a reasonable expectation for all software modules with XML interfaces.
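As a minimal sketch of validating an incoming message against its schema, the following Java example uses the standard javax.xml.validation API that ships with the JDK. The "order" message, its schema, and the names MessageContract and isValid are hypothetical and chosen only for illustration; they do not come from any real system.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class MessageContract {
    // Hypothetical schema for an illustrative "order" business message.
    static final String ORDER_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='sku' type='xs:string'/>"
      + "<xs:element name='quantity' type='xs:positiveInteger'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    /** Returns true only if the message is well-formed and conforms to the schema. */
    static boolean isValid(String xsd, String messageXml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(messageXml)));
            return true;
        } catch (SAXException e) {
            // Schema violation, malformed XML, or an unusable schema.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<order><sku>A-100</sku><quantity>3</quantity></order>";
        String bad = "<order><sku>A-100</sku><quantity>0</quantity></order>";
        System.out.println("good message valid: " + isValid(ORDER_XSD, good));
        System.out.println("bad message valid: " + isValid(ORDER_XSD, bad));
    }
}
```

The second message is rejected because a quantity of 0 is not an xs:positiveInteger. Providers and consumers that share only this schema can build and test against it in parallel.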
If web services are considered purely as a way to expose the business module to remote clients, then they really are just a technology enabler. The schema for the business messages should also be exposed as the schema for the web services messages. A web service does package messages inside its envelope (SOAP), and we would need to use WSDL to define the SOAP message exchange and identify the endpoints of the service. But the actual message should remain unaltered even inside the SOAP envelope and be validated against its schema.
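To illustrate the idea that the envelope only wraps the business message, here is a rough Java sketch that pulls the first element out of a SOAP 1.1 Body using the JDK's DOM and transformer APIs. The class name SoapUnwrapper, the method extractPayload, and the sample "order" payload are assumptions made for this example; a real web services engine would normally do this unwrapping for you.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class SoapUnwrapper {
    static final String SOAP_11_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Returns the first element inside the SOAP Body, serialized back to XML text. */
    static String extractPayload(String soapXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed to match the SOAP namespace
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(soapXml)));

            Node body = doc.getElementsByTagNameNS(SOAP_11_NS, "Body").item(0);
            if (body == null) {
                throw new IllegalArgumentException("no SOAP Body element");
            }
            // Skip whitespace text nodes; the payload is the first child element.
            for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    Transformer t = TransformerFactory.newInstance().newTransformer();
                    t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                    StringWriter out = new StringWriter();
                    t.transform(new DOMSource(n), new StreamResult(out));
                    return out.toString();
                }
            }
            throw new IllegalArgumentException("SOAP Body is empty");
        } catch (ParserConfigurationException | SAXException
                 | IOException | TransformerException e) {
            throw new IllegalArgumentException("unreadable SOAP message", e);
        }
    }

    public static void main(String[] args) {
        String envelope =
            "<soap:Envelope xmlns:soap='" + SOAP_11_NS + "'>"
          + "<soap:Body>"
          + "<order><sku>A-100</sku><quantity>3</quantity></order>"
          + "</soap:Body>"
          + "</soap:Envelope>";
        System.out.println(extractPayload(envelope));
    }
}
```

The extracted payload is the business message itself, with no envelope elements, so it can be validated against the same schema that a non-SOAP client of the business module would use.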
WSDL Interfaces for RPC-Style Web Services
An RPC-style web service mimics the way in which distributed technologies worked prior to web services. It takes the low-level data objects that make up the service API or operation, serializes them, and then transports them over the wire. The operations are performed in the reverse order on the other side to re-create the data objects.
Developers who are used to prior distributed technologies can relate to this style of distributed access and are comfortable using it. It's also the easiest way to build a web service. Hence, it has become the de facto way to start learning more about web services. But there are issues with this approach:
Performance. Our next article in this series will talk about performance issues with web services in greater detail. The relevant observation here is that RPC-style web services perform very poorly when used in production for even a moderately sized user base. This is true even for very small payloads. Thus, RPC-style services are not really suitable for production usage where scalability is of utmost importance.
Auto-generation of the actual interface (WSDL). In RPC-style web services, the WSDL is the only interface contract binding all entities. Even though the technology-specific data objects may act as the API to the business module, the XML-serialized form represented in the WSDL is the interface to the web service.
API fidelity. As we mentioned in our previous article, web services are all about interoperability to gain technology and platform independence. RPC-style services attempt to exchange technology-specific data objects and APIs by serializing the data objects into XML and de-serializing again at the other end. Since web services are technology- and platform-neutral, de-serialization on the client side (based on the web services engine) to convert back to the data objects may not re-create the exact signature of the objects as defined by the provider.
NOTE
Exposing legacy systems as web services may be critical for the enterprise to enable remote access and to extend the system's reach. In such cases, exposing the legacy API as an RPC-style service makes perfect sense despite the possible poor performance. Over time, such a service should be migrated to document style to address the performance problem.
For a service-oriented architecture, the interface definition is key to achieving loose coupling between systems. Enterprises must have full control over this definition and the technology to specify it. RPC-style web services offer convenient tools (such as Axis' Java2WSDL) to auto-generate the WSDL along with the server and client stubs. But while this setup may be convenient for prototypes, it could be the downfall of production environments. Auto-generating a key artifact such as the interface should not be an option for systems in production because this technique forces the enterprise to depend on the convenience tools for service interfaces. On the contrary, a process must be put in place by the enterprise to manage interface definitions and their future enhancements. The auto-generated WSDL merely serves as a starting point to build the WSDL manually for production-ready services.
Axis, an open source web service engine, offers the utility tools Java2WSDL and WSDL2Java to convert the data objects to XML and back to data objects for Java consumers and producers. Even with the same technology performing these two operations on your API, however, you'll get an altered version of the API you started with. The problem could be worse if you have to use another web service engine for another technology (for example, Perl's SOAP::Lite) and expect it to convert your serialized XML string into Perl objects.
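The loss of fidelity can be illustrated with a deliberately simplified, tool-agnostic Java sketch. It does not reproduce what Axis or any other engine actually generates; the Order class, the hand-written toXml serializer, and the generic fromXml reader are all hypothetical. The point is only that a consumer holding nothing but the XML has no access to the provider's original types.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class RoundTripSketch {
    /** Hypothetical provider-side data object. */
    static class Order {
        final String sku;
        final int quantity;
        final List<String> tags;
        Order(String sku, int quantity, List<String> tags) {
            this.sku = sku; this.quantity = quantity; this.tags = tags;
        }
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Provider side: every field becomes element text, whatever its Java type. */
    static String toXml(Order o) {
        StringBuilder sb = new StringBuilder("<order>");
        sb.append("<sku>").append(escape(o.sku)).append("</sku>");
        sb.append("<quantity>").append(o.quantity).append("</quantity>");
        for (String tag : o.tags) {
            sb.append("<tag>").append(escape(tag)).append("</tag>");
        }
        return sb.append("</order>").toString();
    }

    /** Consumer side: without the provider's class, only names and text remain. */
    static Map<String, List<String>> fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, List<String>> fields = new LinkedHashMap<>();
            for (Node n = doc.getDocumentElement().getFirstChild();
                 n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    fields.computeIfAbsent(n.getNodeName(), k -> new ArrayList<>())
                          .add(n.getTextContent());
                }
            }
            return fields;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("unreadable message", e);
        }
    }

    public static void main(String[] args) {
        Order sent = new Order("A-100", 3, Arrays.asList("rush", "gift"));
        Map<String, List<String>> received = fromXml(toXml(sent));
        // The int quantity arrives as text, and the 'tags' list as repeated 'tag' elements.
        System.out.println(received);
    }
}
```

In this sketch, the integer quantity comes back as a string and the tags field comes back under a different name and shape. Real engines use schema type information to do better than this, but the reconstructed objects are still whatever the consumer's engine chooses to generate, not the provider's original API.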
What problem does this mangled API/object definition cause? From the business module's perspective, the API of the business module should be the contract to the clients and not the intermediate XML message stream defined by the WSDL. Otherwise, the business module owner loses control of the module's interface, a critical artifact for a loosely coupled system.
All these limitations make RPC-style services less desirable for production-quality web services for enterprises (except for legacy systems, in which the opportunity to bring the system online far outweighs these drawbacks).
XSD Interfaces for Document-Style Web Services
In document-style web services, the business module interface is always XML. If the business module interface is already defined via a schema, it's undesirable to redefine it through WSDL for a couple of reasons: First, this is a redundant step; more importantly, we'd have to change the WSDL with every change in the schema. The real solution is to use the XML schemas (XSD) for the business module as the API signature in the web services. Web services can continue to use WSDL to specify SOAP envelope-specific details such as transport protocol and endpoint identification. But the SOAP message merely wraps the XML business message and transmits it, without any knowledge of the schema used to format that message.
This fact brings up a critical architectural decision: Who owns the validation of incoming messages? The web services layer handles decryption, verification of the digital certificates, and authentication of the client. For RPC-style web services, it also validates the input message against the WSDL, because otherwise it would be impossible for the web services layer to map the message to the business data objects and their API. But in document-style web services, where the SOAP envelope has no knowledge of the message schema, how could the web services layer validate the message before handing it off to the business module? Validation of the incoming messages now falls within the purview of the business module. The business module need not know how the message got to it, but it needs to validate all incoming messages before processing them. This design is perfectly fine because the business module is the layer that attaches meaning to the schema of the incoming message. If the responsibility for validating the message lies with the business module, one central component in the architecture has both the responsibility and the authority to handle messages. From an architectural perspective, this is a characteristic of a good design.
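The division of responsibility described above might be sketched in Java as follows. This is an assumption-laden illustration, not a prescribed design: the names OwnershipSketch, OrderModule, and WebServiceLayer, the caller allow-list standing in for real authentication, and the string status values are all invented for the example. Schema validation uses the JDK's javax.xml.validation API.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Set;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class OwnershipSketch {
    // Hypothetical schema owned by the business module.
    static final String ORDER_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='sku' type='xs:string'/>"
      + "<xs:element name='quantity' type='xs:positiveInteger'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Business module: owns the schema and validates every incoming message. */
    static class OrderModule {
        private final Schema schema; // compiled once; Schema objects are thread-safe

        OrderModule(String xsd) {
            this.schema = compile(xsd);
        }

        private static Schema compile(String xsd) {
            try {
                return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                        .newSchema(new StreamSource(new StringReader(xsd)));
            } catch (SAXException e) {
                throw new IllegalArgumentException("unusable schema", e);
            }
        }

        String handle(String messageXml) {
            try {
                // A Validator is not thread-safe, so create one per message.
                schema.newValidator()
                      .validate(new StreamSource(new StringReader(messageXml)));
            } catch (SAXException e) {
                return "REJECTED: " + e.getMessage();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return "ACCEPTED"; // business processing would start here
        }
    }

    /** Web services layer: transport and security only; it never sees the schema. */
    static class WebServiceLayer {
        private final Set<String> knownCallers; // stand-in for real authentication
        private final OrderModule module;

        WebServiceLayer(Set<String> knownCallers, OrderModule module) {
            this.knownCallers = knownCallers;
            this.module = module;
        }

        String receive(String callerId, String unwrappedMessageXml) {
            if (!knownCallers.contains(callerId)) {
                return "UNAUTHORIZED";
            }
            return module.handle(unwrappedMessageXml);
        }
    }
}
```

Because OrderModule validates on every call to handle, the same contract is enforced whether a message arrives through the web services layer or through any other channel.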
It's important to realize that only the ownership of the validation step has changed hands, from the web services infrastructure layer to the business module; this plan has no performance impact on the overall processing time of the message. Business API validation is part of the business module and its ownership, and thus validation should reside there. The high-level architectural diagram in Figure 1 highlights the responsibilities handled by the web service layer and the business module layer.
Figure 1 Functionality and ownership of the web services infrastructure and business module layers.
Ownership of the XML Message Schemas
XML technologies and web services have enabled us to look at ownership aspects of the definition of the interfaces from different perspectives. All the XML technologies are geared toward complete platform and technology independence. Web services allow coarse-grained, loosely coupled, high-level business module interfaces to be exposed as services. These interfaces don't need to be defined as low-level technology data objects, but instead are suited for defining and controlling the business functionality that's exposed to the user community. Consequently, the process benefits greatly when these interfaces are defined by the business owners of the project, rather than by the engineering or IT department that implements the business functionality from the requirements document the business community provides.
Once the requirements-gathering phase is complete and the engineering team is handed a requirements document, the first step must be to translate the requirements objectively and define an XML schema for the messages. As business groups get more XML-savvy, these XML schemas could then be part of the requirements document itself, much like how UI screen mockups are determined during the requirements/analysis phase prior to any engineering work.
Here are the benefits of this approach:
Business groups rightly own the message definition that expresses business functionality.
Business has full control of the functionality that it exposes or controls in the current release and subsequent releases of the software.
Business groups objectively define the business functionality specified in the requirements document. This strategy aids development by the engineering/IT groups.
Business groups don't depend on the IT groups or the implementation of the business module to determine the functionalities to be exposed.
Business analysts choose naming conventions used in the interfaces. These conventions don't have to be based on low-level implementation artifacts or technology-specific terminologies.
Interfaces are defined prior to implementation! This is a good service-oriented architecture practice.
Loosely coupled systems (providers and consumers) can go about building their modules independently and in parallel.
Conclusion
Continuing our architectural discussion of using the web services layer as a pure packaging strategy of a business module, we evolved the architecture further in this article to discuss message interfaces. We've explored both RPC-style and document-style web services and identified issues with WSDL as the interface definition. Using the XSD to define the message interfaces provides separation of responsibility between the business module and the web services infrastructure layers. Further, it allows the enterprise to have its business module interfaces defined and managed by its business groups, where such responsibility belongs. The engineering group is responsible for the implementation of the business functionality and can focus on its task with clear business functionality definitions already provided. Web services and XML technologies are evolving enterprise processes to truly define ownership and responsibility within groups, without any constraints of technology, platform, or products used within the enterprise.
Our next article will delve into performance issues and measure web services overhead to determine its usability in production environments.