This chapter is from the book

Server Infrastructure

With an appropriate place to store XML data persistently, the next concern is distributing and manipulating this data. In modern Web application architectures, servers play critical roles in assembling, processing, and distributing information. Adding XML support to your server infrastructure mostly involves making sure that existing servers are XML-enabled, with perhaps the installation of a few XML-specific components. Most important, you must verify that XML capabilities meet the scalability and reliability demands of all server functions.

In general, there are three types of server components: data servers, application servers, and content servers. Data servers access, aggregate, and format data. Application servers execute business logic components and mediate distributed business processing. Content servers facilitate the acquisition of content, enhance its accessibility to users, and apply formatting. Figure 5-5 shows these different types of servers working together in a typical Web application environment. This type of server web provides the conduit for propagating XML documents within an enterprise and throughout the Internet.

While Figure 5-5 shows each server component as a distinct node, this arrangement isn't necessary. Server software may combine these components in different ways, and, in fact, different combinations lead to distinct product segments. Integration servers combine data server functions to aggregate information from multiple sources with application server functions to control the flow of business processes. Portal servers combine data server functions to access information from multiple sources with content server functions to filter this information based on user requirements. Personalization servers combine application server functions to calculate user needs with content server functions to customize their experiences dynamically. By understanding the roles of the three basic types of server functionality, you can evaluate whether such a combination suits your needs.

Figure 5-5 Types of Server Components

Data Servers

DBMSs inherently constrain the use of data. They have to choose a particular paradigm, such as relational or object. Relational DBMSs with normalized tables optimize the combination of data in different ways. Object DBMSs with associated instances optimize the traversal of information webs. Within a given paradigm, each individual database has a particular structure limiting the types of information it can store and the access patterns it supports. DBMSs do a wonderful job of managing data when a given database must support only a few types of applications and when each application relies on only a few databases. However, when a given database must support a wide variety of application types or a given application must rely on many different databases, satisfying these demands often taxes DBMSs to their limits. In such cases, an XML-enabled data server can improve flexibility and performance.

XML broadens the use of data. The ability to design special purpose data formats quickly encourages the combination of information managed in different databases. So while data servers have existed for some time, XML's emergence as a solution to information exchange problems has elevated their role. Data servers perform three major functions: (1) they unify the data access interface to simplify application development, (2) they aggregate data from different sources to deliver customized packages of information, and (3) they consolidate requests to DBMSs to improve performance. XML requires special support only in the first two functions. Because optimizing performance through consolidation strategies like data caching and connection pooling occurs internally to the data server, the use of XML as the format does not affect this function.

An XML-enabled data server supports XML as the unified data access format. When an application submits a request to the data server, the data server fulfills it with an XML document. Given the rise of XML messaging discussed in Chapter 4, the data server should probably support this interaction over SOAP, using an interface specified in WSDL. Merely retrieving ad hoc bits of data as XML documents that the application then has to translate into programming data structures doesn't add much benefit. Programmatic solutions such as ODBC and JDBC already satisfy this need. The more substantial benefit comes from defining synthetic XML documents that form customized packages of data suited to a particular purpose.

To deliver a synthetic XML document, the data server must have a mapping between the document type and the structures managed by backend DBMSs. A developer defines an XML DTD or Schema for the document type and then maps fields in the various database schemas to element and attribute types. The developer also defines the keys used to select the correct records for populating a document instance. At runtime, an application submits a request for a synthetic document type and the appropriate keys. The data server then looks up the mapping, constructs queries based on the mapping and the keys, and puts the results into an XML document. This results document is valid with respect to the specified DTD or Schema.
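The lookup-query-assemble sequence described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `customerSummary` document type, the table and column names, and the `build_document` helper are all hypothetical, and a single in-memory SQLite database stands in for what would normally be several backend DBMSs. The sketch also does not validate the result against a DTD or Schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping for one synthetic document type. Each field pairs an
# element name with a parameterized query; the key selects the records.
CUSTOMER_SUMMARY = {
    "root": "customerSummary",
    "key_attribute": "id",
    "fields": [
        ("name", "SELECT name FROM customers WHERE id = ?"),
        ("order", "SELECT item FROM orders WHERE customer_id = ? ORDER BY item"),
    ],
}

def build_document(conn, mapping, key):
    """Run the mapped queries for the key and place each result in an element."""
    root = ET.Element(mapping["root"], {mapping["key_attribute"]: str(key)})
    for element_name, query in mapping["fields"]:
        for (value,) in conn.execute(query, (key,)):
            ET.SubElement(root, element_name).text = str(value)
    return root

def demo():
    """Populate a stand-in backend and request the document for key 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (customer_id INTEGER, item TEXT);
        INSERT INTO customers VALUES (1, 'Ada');
        INSERT INTO orders VALUES (1, 'widget');
        INSERT INTO orders VALUES (1, 'gadget');
        INSERT INTO orders VALUES (2, 'other');
    """)
    doc = build_document(conn, CUSTOMER_SUMMARY, 1)
    conn.close()
    return ET.tostring(doc, encoding="unicode")
```

Keeping the mapping as data rather than code is the point of the design: adding a new synthetic document type means adding a new mapping entry, not writing new query logic.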

In some cases, a DBMS vendor may include some data server capabilities with its DBMS product. For instance, Oracle9i includes XML mapping capabilities. In cases where the need for a data server stems from a small set of homogeneous databases attempting to serve many different applications, this solution is sufficient. But when the need for a data server stems from a set of applications attempting to aggregate data across heterogeneous databases, you probably need a separate data server product.

Such products include eXcelon's eXtensible Information Server and Versant enJin, both of which are based on object persistence engines. Data servers require many of the capabilities of backend databases to provide high availability and transactional integrity. They use their own persistence engine as a staging area between applications and backend DBMSs. Therefore, most of the native XML store products discussed previously can also operate as XML data servers by adding features for synchronizing with backend databases. In fact, many vendors of these products are finding that this approach drives a substantial percentage of their sales. Conversely, data server products like eXcelon and enJin can operate as native XML stores, so distinctions between the two markets are blurring. When evaluating either type of product's suitability as a data server, focus on the facilities for mapping backend data to XML documents and the efficiency of performance optimization strategies like caching and pooling.

Application Servers

Application servers operate in the middle tier, applying business logic to data, then handing off the results for presentation. In this capacity, they have three primary reasons for working with XML documents.

  1. They may need to accept data as XML documents from data servers.

  2. They may need to provide business results as XML documents to content servers.

  3. They may have to exchange XML-formatted business messages with other application servers.

To support these operations, the application server can supply basic and advanced services.

Basic services include the execution of XML and XSLT processors, as well as a SOAP implementation. Whether it extracts data from XML documents, exchanges XML business documents, or produces XML business results, the application server needs the access and creation capabilities of an XML processor. Because many developers use XSLT for pre- and post-processing of documents, support for this standard should be part of the basic package. Interaction with XML-enabled data, application, and content servers almost certainly includes SOAP communication, so an implementation of the protocol is essential.
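As a rough illustration of the access and creation capabilities mentioned above, the following sketch extracts data from an incoming XML document and wraps a result document in a SOAP 1.1 envelope. The `order` and `orderTotal` vocabularies and both helper functions are assumptions made for the example. A production SOAP implementation handles much more (headers, faults, transport), so treat this as a picture of the message shape only.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_ENV)

def extract_total(order_xml):
    """Access: read item prices and quantities out of an incoming document."""
    order = ET.fromstring(order_xml)
    return sum(float(i.get("price")) * int(i.get("qty")) for i in order.iter("item"))

def wrap_in_soap(payload):
    """Creation: place a payload element inside a SOAP Envelope and Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    body.append(payload)
    return envelope

# Hypothetical incoming business document.
incoming = '<order><item price="2.50" qty="4"/><item price="1.00" qty="3"/></order>'

result = ET.Element("orderTotal")
result.text = f"{extract_total(incoming):.2f}"
message = ET.tostring(wrap_in_soap(result), encoding="unicode")
```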

Theoretically, because an application server can execute any code in a language it supports, providing basic services is simply a matter of downloading XML and XSLT processors plus a SOAP implementation, then installing them. Practically, assuring the performance and quality of execution requires the vendor at the very least to certify components for use with the application server and probably include the recommended packages in the product distribution. You want to make sure that the vendor has tested the particular components, can provide estimates of how much throughput these components can handle, and knows how to support their use with its application server. For J2EE application servers, most vendors recommend the Xerces XML processor, the Xalan XSLT processor, and either their own or a particular third-party SOAP implementation. Microsoft has its own XML processor, XSLT processor, and SOAP implementation for its application server products.

Advanced services tend to vary significantly across application servers and evolve rapidly over time. Therefore, it's more appropriate to focus on the categories of advanced services rather than particular instances. Most advanced services are delivered in the form of frameworks. There are abstraction frameworks and task frameworks. Abstraction frameworks give developers more flexibility to make future changes by performing operations at a higher level. Two excellent examples are Sun's Java API for XML Processing (JAXP) and Java API for XML Messaging (JAXM). Both of these frameworks provide high-level APIs for performing specific XML-related operations. By programming to these abstract APIs rather than the concrete APIs of specific components, developers make it possible to switch their XML processor or XML messaging protocol easily.
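The idea behind an abstraction framework can be shown with a small sketch. JAXP itself is a Java API; the Python code below only mirrors the principle, using two standard-library parsers as stand-ins for interchangeable XML processors. The `TitleReader` interface, its two implementations, and the `new_reader` factory are hypothetical names invented for this illustration.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod

class TitleReader(ABC):
    """Abstract API: application code depends only on this interface."""

    @abstractmethod
    def titles(self, xml_text):
        """Return the text of every <title> element in document order."""

class ElementTreeReader(TitleReader):
    """Concrete implementation backed by ElementTree."""

    def titles(self, xml_text):
        return [e.text for e in ET.fromstring(xml_text).iter("title")]

class MiniDomReader(TitleReader):
    """Concrete implementation backed by minidom."""

    def titles(self, xml_text):
        dom = xml.dom.minidom.parseString(xml_text)
        return [n.firstChild.data for n in dom.getElementsByTagName("title")]

READERS = {"etree": ElementTreeReader, "minidom": MiniDomReader}

def new_reader(name="etree"):
    """Factory: the only place that knows which concrete processor is in use."""
    return READERS[name]()
```

Application code that calls `new_reader().titles(...)` never names a concrete parser, so switching processors becomes a configuration change rather than a code change.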

Task frameworks provide additional functionality for building specific types of applications. Personalization is a good example of a task framework used to produce XML documents for content servers. These types of applications use metadata about user preferences and metadata about content topics to generate customized content. Because XML is a convenient format for both types of metadata, there is the opportunity to deliver a package that greatly simplifies the development of such applications. But perhaps the best XML-related example of such an application is B2B messaging. This type of application touches on a host of issues, from specifying the allowable flows of messages, to generating views of executing processes, to integrating with back-end systems. Providing all this functionality would be difficult for a single application development team. By using XML, vendors can deliver a widely applicable framework that puts such applications within the reach of more organizations. All the major application server vendors—including BEA, IBM, Microsoft, Oracle, and Sun—provide their own flavors of both personalization and B2B messaging frameworks.

Content Servers

Content servers combine data from DBMSs, results from business operations, and authored content into presentation formats suitable for different users. XML-based technologies improve every stage of the fulfillment pipeline. At the very end of the pipeline, they enable dynamic layouts that better fit each user's needs. In the middle of the pipeline, they make it easier to connect a user to the exact information he wants. At the beginning of the pipeline, they make it easier to acquire the library of content necessary to satisfy the user base. Most content servers focus on one or two aspects of this pipeline, so implementing a complete XML content strategy may require several types of content servers.

The most common use for XML in content servers is applying dynamic presentation to XML content. This process occurs as described in Chapter 3's discussion of using XSLT to generate pages in XML-based presentation languages such as HTML, VoiceXML, and WML. Based on variables, including the type of client device, the type of content, and the localization settings for the user, the content server selects an XSLT transform and applies it to the XML document. Because most Web servers have programming extensions that support XSLT, you won't need any additional server infrastructure if all you want is dynamic presentation.
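The selection step might look something like the following sketch, which picks the most specific stylesheet registered for a request and falls back to more general ones. The table contents, stylesheet file names, and the wildcard convention are all assumptions for illustration. Applying the chosen transform is left out because it requires an XSLT processor, which the Python standard library does not include.

```python
# Hypothetical stylesheet table keyed by (device, content type, locale).
# "*" acts as a wildcard; the file names are illustrative only.
STYLESHEETS = {
    ("phone", "news", "fr"): "news-wml-fr.xsl",
    ("phone", "*", "*"): "generic-wml.xsl",
    ("voice", "*", "*"): "generic-voicexml.xsl",
    ("*", "*", "*"): "default-html.xsl",
}

def select_stylesheet(device, content_type, locale, table=STYLESHEETS):
    """Return the most specific stylesheet registered for the request."""
    candidates = [
        (device, content_type, locale),  # exact match
        (device, content_type, "*"),     # any locale
        (device, "*", "*"),              # any content type for this device
        ("*", "*", "*"),                 # site-wide default
    ]
    for key in candidates:
        if key in table:
            return table[key]
    raise LookupError("no stylesheet registered for this request")
```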

Customizing layouts for users is only part of the content delivery equation. Users also need help finding the content that addresses their immediate needs. Traditional search engines suffer from the problem (raised in Chapter 1) of distinguishing between different contexts for the same word. With XML content, a search engine can use the element structure and attribute values to improve search precision. Using an XML-aware search engine helps maximize the benefits of an XML-based content strategy. Usually, employing such a product involves assigning a dedicated server or cluster of servers to perform searches that then refer users to the appropriate content. Such standalone solutions include DocSoft's extend XML and XML Global's GoXML Search. Of course, most of the CMS and native XML store products discussed previously can perform searches on XML document collections, but this approach works only if you store all the content you plan to search in one of these products.
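To see why element structure improves precision, consider a small sketch that compares a plain keyword match with a search restricted by element content. The catalog, its element names, and both search functions are hypothetical; commercial XML-aware search engines index far larger collections and do not work by scanning documents this way.

```python
import xml.etree.ElementTree as ET

# Hypothetical content collection in which "Java" appears in two contexts.
CATALOG = """<catalog>
  <doc id="d1"><topic>programming</topic><body>Java tutorial for servers</body></doc>
  <doc id="d2"><topic>travel</topic><body>Visiting Java and Bali</body></doc>
  <doc id="d3"><topic>programming</topic><body>Python basics</body></doc>
</catalog>"""

def keyword_search(root, word):
    """Traditional search: match the word anywhere in the document text."""
    word = word.lower()
    return [d.get("id") for d in root.iter("doc")
            if word in "".join(d.itertext()).lower()]

def structured_search(root, word, topic):
    """XML-aware search: match the word in <body> only when <topic> agrees."""
    word = word.lower()
    return [d.get("id") for d in root.iter("doc")
            if d.findtext("topic") == topic
            and word in d.findtext("body", default="").lower()]

root = ET.fromstring(CATALOG)
plain_hits = keyword_search(root, "java")
precise_hits = structured_search(root, "java", "programming")
```

Here the plain keyword match returns both the programming and the travel document, while the structured query returns only the programming one.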

XML-aware search engines leverage metadata at the element and attribute levels. However, metadata can also apply to entire collections of content. The foundation of the Semantic Web is the use of metadata to provide a conceptual map of an entire site or group of sites. Another W3C Recommendation, Resource Description Framework (RDF), provides a standardized XML vocabulary for describing the types of content offered, the relationships among content, and the conditions under which content might be relevant. Most site creators use an implicit information model in selecting and organizing content. RDF makes it possible to state this model explicitly. The availability of machine-readable models facilitates automated information retrieval, filtering, and visualization capabilities far beyond those of traditional search engines. The Semantic Web is in its early development, and much of the work is in the form of research and open source projects. However, in the near future, RDF may migrate into mainstream content infrastructure. Web servers will offer RDF descriptions. Search engines will use these descriptions as part of the search criteria. Authoring tools will generate these descriptions.
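A short sketch may make the notion of a machine-readable description more concrete. The RDF/XML fragment below describes two pages with Dublin Core properties, and the code reduces it to subject-predicate-object triples that a program can filter. The example.com URLs and the `triples` helper are assumptions, and the helper handles only a small subset of RDF/XML: literal-valued properties directly under `rdf:Description`.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical site description; the example.com resources are placeholders.
SITE_RDF = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://example.com/articles/servers">
    <dc:title>Server Infrastructure</dc:title>
    <dc:subject>XML</dc:subject>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.com/articles/gardening">
    <dc:title>Spring Planting</dc:title>
    <dc:subject>Gardening</dc:subject>
  </rdf:Description>
</rdf:RDF>"""

def triples(rdf_xml):
    """Flatten simple rdf:Description blocks into (subject, predicate, object)."""
    out = []
    for desc in ET.fromstring(rdf_xml).findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; joining the two
            # parts gives the full predicate URI.
            predicate = prop.tag[1:].replace("}", "")
            out.append((subject, predicate, prop.text))
    return out

# Filter on the explicit model rather than on words in the page text.
xml_pages = [s for s, p, o in triples(SITE_RDF)
             if p == DC_NS + "subject" and o == "XML"]
```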

In addition to making it easier to find content, XML also makes it easier to acquire content. Content can come from two sources: You can create it, or you can borrow it. When creating content, the ability of multiple authors to collaborate effectively greatly enhances productivity. Web Distributed Authoring and Versioning (WebDAV), a set of XML-based extensions to HTTP from the IETF, makes it possible for authors to work together to create, enhance, and maintain content. A WebDAV server manages contributions, tracks changes, and enforces permissions. A number of portal servers, including Microsoft's SharePoint Portal Server and Oracle9iAS Portal, use WebDAV to enable the collaborative editing of portal content. Common Web servers such as Apache and IIS also support WebDAV. Any client that speaks the WebDAV protocol can use these servers to collaborate on documents. Such clients include popular content authoring tools such as Adobe Acrobat and Microsoft Office. Taken to an extreme, WebDAV enables the replacement of traditional document management systems with a set of distributed WebDAV-capable servers. Oracle iFS and Xythos's Web File Server use this approach.
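The XML that a WebDAV client and server exchange can be sketched as follows. The code builds the body of a PROPFIND request and reads the properties out of a 207 Multi-Status response, using element names from the WebDAV specification (`propfind`, `prop`, `multistatus`, `response`, `href`, `propstat`, `status`). The sample response, the resource paths, and the two helper functions are invented for illustration, and no network request is actually sent.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"  # the WebDAV XML namespace
ET.register_namespace("D", DAV)
NS = {"D": DAV}

def propfind_body(*property_names):
    """Build the XML body of a PROPFIND request for the named DAV properties."""
    propfind = ET.Element(f"{{{DAV}}}propfind")
    prop = ET.SubElement(propfind, f"{{{DAV}}}prop")
    for name in property_names:
        ET.SubElement(prop, f"{{{DAV}}}{name}")
    return ET.tostring(propfind, encoding="unicode")

def parse_multistatus(xml_text):
    """Return {href: {property: value}} for properties reported with status 200."""
    result = {}
    for response in ET.fromstring(xml_text).findall("D:response", NS):
        href = response.findtext("D:href", namespaces=NS)
        props = {}
        for propstat in response.findall("D:propstat", NS):
            status = propstat.findtext("D:status", default="", namespaces=NS)
            prop = propstat.find("D:prop", NS)
            if " 200 " not in status or prop is None:
                continue  # skip properties the server could not return
            for p in prop:
                props[p.tag.split("}")[1]] = p.text
        result[href] = props
    return result

# Hypothetical server reply describing two shared documents.
SAMPLE_RESPONSE = """<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/docs/plan.xml</D:href>
    <D:propstat>
      <D:prop><D:displayname>Project plan</D:displayname></D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/docs/notes.xml</D:href>
    <D:propstat>
      <D:prop><D:displayname/></D:prop>
      <D:status>HTTP/1.1 404 Not Found</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>"""

request_xml = propfind_body("displayname", "getlastmodified")
listing = parse_multistatus(SAMPLE_RESPONSE)
```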

It is often more cost effective to borrow content from someone else than to generate it yourself. However, this type of syndication faces two problems. First, it is often difficult to fit third party content into an application because of differences in layout. XML solves this problem by giving both parties a format for exchanging information separate from presentation. The subscriber knows the structure of each publisher's content, so it can use XSLT to integrate content from different sources and apply its preferred layout. There is also the problem of how to negotiate subscriptions, track usage, and update information automatically. Information and Content Exchange (ICE) addresses these issues by providing a standard XML protocol for such interactions between subscribers and publishers. ICE support is available in a wide variety of products that generate and manage content, including Interwoven's OpenSyndicate, Oracle9i, and Vignette's Content Syndication Server.
