JP MORGENTHAL is Chief Technology Officer for XML Solutions Corporation. He has over fifteen years' experience designing and developing distributed applications. JP is also an internationally known industry analyst covering Internet technologies and is a sought-after speaker at industry events.
About the Series Editor
Dr. Goldfarb is the father of markup languages, a term he coined in 1970. He is the inventor of SGML, the International Standard on which both XML and HTML are based. You can find him on the Web at www.xmltimes.com.
"XML is the amazing new web standard for universal data interchange. With this book and XML you can integrate your applications without converting the code!"
Charles F. Goldfarb
Integrate your enterprise with XML and Java!
Enterprise Application Integration (EAI) links diverse applications, platforms, and operating systems so they work as one and deliver powerful business results seamlessly. Platform-independent Java is one powerful tool for building EAI applications, and XML adds the missing link: robust mechanisms to exchange data with non-Java applications. Now there's a complete, step-by-step guide to using Java and XML together to deliver enterprise integration solutions that work! Enterprise Application Integration With XML and Java covers all this, and more:
Whether you're a technical manager planning for enterprise application integration, or a Java developer tasked with delivering it, Enterprise Application Integration With XML and Java delivers the in-depth solutions and real-world expertise you need.
CD-ROM INCLUDED
The CD-ROM contains extensive source code from the book, plus a remarkable library of leading-edge software and trialware, including: Bluestone Visual-XML desktop XML development environment; IBM XML4J Java-based parser; and Push-technologies SpiritWAVE2 implementation of the Java Messaging Service!
1. Introduction.
About this Book. XML Basics. XML in the Business World. Getting Started.
I. BASICS OF EAI.
2. Building an EAI Infrastructure. Introduction. Building Virtual Applications. EAI Infrastructures. Requirements for Data Sharing. Requirements for Exchanging Data. Summary.
3. Methods of Processing XML. Introduction. Parsing XML. The Simple API for XML. W3C Document Object Model. Sidebar: Is DOM Too Heavyweight For EAI? Summary. Looking Ahead.
II. SHARING AND EXCHANGING DATA.
4. Sharing and Exchanging Data. Introduction. Summary.
6. Using XML with Relational Databases. Introduction. Modeling Spectrum. The Example. Extending the DTD. Modeling Datatypes. Modeling Relationships. The Example Step-by-Step. Conclusion. Listings. Mapping XML into Existing Schemas. Summary.
7. XML and Message-Oriented Middleware. Introduction. Summary.
8. XML and Directory Services. Introduction. Directory Services. Sidebar: Extending Directory Services with XML. Summary.
III. PROGRAMMING MODELS FOR EAI.
9. The Declarative Programming Model. Introduction. The Declarative Programming Model. The Declarative Programming Model XML Document Type. Summary.
10. Dynamic Applications. Introduction. MDSAX. SAX Filters. Summary.
11. Wrapping Up. Markup Language.
Appendix B: Document Object Model (Core) Level.
Appendix C: SAX Interfaces. Binding.
Index.
Enterprise Application Integration (EAI) is rapidly emerging as one of the leading initiatives for the computing industry in the early half of the new century. With the Y2K scare behind us, 2000 looks to be a year of rebirth for new computing initiatives with a strong focus on the use of technology for establishing tighter relationships with an organization's consumers and suppliers.
These relationships will be forged in many ways; for example, providing customized information for customers tailored to their specific interests and buying patterns, or including supply-chain in the automation of the overall procurement cycle. To accomplish these missions, developers are going to need a broad experience with many tools, products, operating systems, and hardware platforms. Experienced EAI development teams will include members with skills in networking, administration, project management, and software development for multiple platforms, all focused with the single goal of providing seamless integration of business systems.
The goal of EAI is not new; we have been doing it since we started distributing data away from the mainframe and onto front-end processors. However, with so much data trapped in so many different systems and data formats, companies are finding it extremely difficult and very expensive to open up and share this data with their trading partners.
Fundamentally, EAI is about developing systems that provide seamless business functionality. A key requirement for integration of systems to provide this seamless functionality is an ability to share and exchange data. Furthermore, this sharing and exchange occurs between systems that have little to no knowledge about each other's storage locations or formats.
For example, an accounting system may not have knowledge of the schema used for a sales system, thereby making it difficult for the accounting system to update sales order information after an invoice has been paid. Or, perhaps, a company needs to update its shipping and inventory systems simultaneously based upon the receipt of a new order. The data needs to be input into each of the existing systems in a manner that they understand. Because many of these systems were chosen or developed by individual departments with no thought of integrating with other departments inside or outside the company, this is a difficult task.
As we already mentioned, a key requirement of EAI is moving data from one system in one particular format, transforming it to be used by one or more other systems, and delivering the data to those other systems. It includes communications, data management, and business processing under the single, umbrella task of making disparate systems work as one.
The primary purpose of this book is to discuss the techniques and methodologies for integrating systems. Second, this book will illustrate how to build solutions based on these methodologies using Java and XML. Throughout this book, we will address the needs of EAI and discuss how XML can be used to solve some of the more complex issues surrounding EAI. Each of these solutions will be examined in technical depth by exercising the capabilities of the Java platform.
XML and Java are clearly the two most intriguing developments of the computer industry within the last ten years. Java is a programming language and a specification for a virtual machine that executes the binary modules produced by compiling programs written in that language. It also defines a consistent set of services that is available to all Java programs. This is a significant change from applications developed using traditional third-generation programming languages, which most often do not offer a consistent set of services, such as networking, file I/O, windowing, etc.
XML, on the other hand, is a specification that allows users to define their own markup languages. No fancy tricks or computer voodoo; XML is a simple definition that was forged by a group of industry leaders that recognized the importance of a truly open and well-defined, neutral data format.
Together these two bodies of work allow users to write applications that can process dynamically structured and unstructured data anywhere on a network where there is a Java virtual machine. In addition, for those places where a Java virtual machine is not available, such as when dealing with C++ or legacy applications, XML provides a method of moving data outside of the virtual machine in a way that is highly reusable by non-Java platforms. In effect, we end up with a ubiquitous programming environment and a ubiquitous data representation.
This book is the melding of these two technologies in a way that is not often examined. True, most of the early XML parsers and tools were written in Java, but this book goes beyond processing XML using the Java programming language.
This book is about integrating disparate systems using XML with advanced Java capabilities, such as:
When you finish this book, you will have a better understanding of how to build powerful Java applications to automate business processes, conduct electronic commerce, and share information effortlessly.
The Extensible Markup Language (XML) has received a lot of attention since the W3C officially blessed it in early 1998. If you ask knowledgeable people involved with the XML community about its popularity, you will get a diverse set of opinions concerning XML's quick climb to success. For example, one response would be that XML isn't really a new technology, just a new name for a subset of a proven technology called "SGML" (Standard Generalized Markup Language). The XML subset is optimized for the unmanaged heterogeneous networked environment of the Web.
The answer we like best, however, is that it is simple! HTML (Hypertext Markup Language), another derivative of SGML, was extremely successful as the presentation language for the World Wide Web because it was simple and because most people could learn to use it with ease. That does not necessarily mean that what people created was beautiful to look at, but it conveyed information that was important to its author and that is the most important part of the Web.
XML lives because those closest to the Web soon realized that forcing users to tie their presentation to their content was defeating. The Web is about presentation, but most times, the look changes without a corresponding change in the underlying messages being presented. XML provides a way to design content such that any presentation can be applied post-authoring and the messages are preserved in reusable documents.
For some, the last paragraph has significant importance, but when we attempted to use XML for this purpose, we soon found ourselves creating a vocabulary that very closely mimicked HTML. That is, we had to include too many visual hints in the markup language to get the effect we wanted in our presentation.
Still, XML had hit a nerve with us: it represented a way to encapsulate variable-length, free-form text with highly structured data under a single context. The start of real electronic knowledge-capture was ours to be had with a set of initiatives for simple processing.
Since our introduction to XML, we have put a significant amount of time and effort into analyzing this phenomenon. We know that XML is more than just a fad that became popular because it had an Internet label. The real benefit that XML provides to the industry is data interoperability. After all, we have been working on the problem of process interoperability for over ten years, but only in recent years has real data interoperability become a concern for the reasons we mentioned earlier in the introduction. Between the industry's advancement in distributed computing and the addition of XML to the toolkit, we finally had the ability to build complete solutions for exchanging and sharing data without requiring months of design and development for a single point-to-point exchange.
Before jumping into the technical end of the pool, let's take a brief moment to explore the business requirements that are driving XML's popularity and success. In this section, we will explore how XML is being used to solve complex, real-world business problems today.
Computers have been mostly helpful to companies for automating data processing and providing near instantaneous access to information about the business. We use the qualifier "near" here because it is theoretically possible, but hindered due to poor system and application designs. There is a certain instant gratification that can be brought about by the use of computers. They provide a sense of immediate feedback for any stimuli we provide them, including bad stimuli.
Companies are just starting to realize that they will never be finished building systems to run their businesses, because business is always changing. Therefore, the systems that support business need to change rapidly as well. This was not a widely held belief before the mid-1990s when many companies viewed their Information Systems (IS) departments as nasty expenses that were less expensive than using humans to perform the same tasks.
However, finally there has been a revolution within the last five years. Companies have realized that they can use the talent in their IS departments to provide them with systems that proactively watch market trends and conditions and help the business react in a positive direction. With this change comes a major paradigm shift. No longer will businesses need monolithic systems that perform a single task, such as accounting, human resources, or sales, but they will need modular components that can talk to each other and be aggregated into a larger component that understands the goals of the business. The more of these components that become available, the more intelligent the system becomes.
So it goes; we reached the end of a century and the end of a millennium. Companies were faced with the possibility of widespread system failure if they underestimated their risk from potential Y2K problems, and at the same time were forced to enter into a growing global electronic economy. The choices were minimal; take what we have and make it work with new applications. There were neither time nor resources available for rebuilding legacy systems, even if that were the preferred option. This alone has had sweeping consequences for the computer industry.
EAI exists today as a means of freeing the data trapped inside existing legacy systems for use by many other applications. EAI encompasses many fine-grained technologies, such as supply-chain integration, automated procurement, sales, customer service, distribution, routing, etc. Underlying each of these areas are technologies for data exchange and data sharing, such as file systems, distributed objects, Web, application servers, etc.
As we go deeper and deeper under the covers, it soon becomes clear that woven into this intricate tapestry is the need to move data from point A to point B in a secure and transacted manner with no guarantee that points A and B know anything about each other. Until now, most of the work of moving data from A to B was accomplished by having two teams of developers that were intimate with A and B get together and hash out a middle ground. Then each team would walk away to create its half of an application portal that would connect these two disparate systems. Some weeks later, when each team was finished, they would put the two halves together and run a test. All this cost between $30,000 and $250,000. Imagine the costs of making all production applications in the enterprise communicate this way!
It did not take long, based on the financial requirements cited above, to realize there had to be a better way. Some software companies began to provide software that simplified the process of building the portals between applications. These portal development tools, however, did not come cheaply, but they did lower the cost of each additional system being integrated. And, even with this software, developers intimate with the system being integrated were required to spend time making the innards of their application known to those trained in using the portal development tools.
We are now entering the next phase of this process. With agreement by the industry on XML as a language for representing hierarchically structured information, there is now a way for system experts who are intimate with the structures of their applications to describe their data without having to spend hours communicating how their application works and what their internal structures mean. XML can even be leveraged by portal development tools for help in speeding integration between systems.
Here are some simple scenarios that illustrate how XML can enhance this process significantly:
Company A has two systems, sales and customer service, handling various aspects of business processing, but they do not communicate. Due to the nature of business, a customer in the sales system will eventually become a user in the customer service system. This usually happens the first time a customer calls in with a particular problem with regard to the products sold to them.
There's nothing wrong with this process from the perspective of Company A except that customers are a bit irate that information on the product they purchased, such as the serial number, date, and cost, is not accessible to the customer service representative. Just having this information on file could change the whole experience of some calls from negative to positive.
Company A explores the option of purchasing a new integrated system that links customer service and sales, but decides that the costs are too high when training and data migrating are incorporated into the picture. The simple solution, and less costly one, is to provide a link between these two systems, such that all sales information is transferred to the customer service system when an order is shipped. This includes issuing compensating transactions when items are returned.
Clearly, the sales and customer service systems will never communicate directly. Instead, some business logic must be developed that takes all completed invoices and all authorized returns at day's end and packages them up in a single XML message to be delivered to a process running in customer service. The grammar for this document must be dictated by the customer service system because only it can decide what is the minimal set of information needed by other systems. Because it used XML, Company A was able to leverage the large body of tools now available on the market and did not have to spend time building a parser for a proprietary data format.
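The end-of-day packaging step described above can be sketched in a few lines of Java. This is only an illustration, not the book's implementation: it assumes the JAXP DOM and transformer classes that ship with current JDKs, and the class name `DailyBatchBuilder`, the `Entry` type, and the element and attribute names (`dailyBatch`, `invoice`, `return`, `customerId`, `serialNumber`, `amount`) are hypothetical. In practice the customer service system's grammar would dictate the vocabulary.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DailyBatchBuilder {

    // Hypothetical record of one completed invoice or authorized return.
    public static final class Entry {
        final String kind;          // becomes the element name: "invoice" or "return"
        final String customerId;
        final String serialNumber;
        final String amount;

        public Entry(String kind, String customerId, String serialNumber, String amount) {
            this.kind = kind;
            this.customerId = customerId;
            this.serialNumber = serialNumber;
            this.amount = amount;
        }
    }

    // Packages all of the day's entries into one XML message and returns it as text.
    public static String buildMessage(String date, List<Entry> entries) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("dailyBatch");   // assumed root element name
        root.setAttribute("date", date);
        doc.appendChild(root);

        for (Entry e : entries) {
            Element item = doc.createElement(e.kind);
            item.setAttribute("customerId", e.customerId);
            appendText(doc, item, "serialNumber", e.serialNumber);
            appendText(doc, item, "amount", e.amount);
            root.appendChild(item);
        }

        // Serialize the DOM tree with the identity transformer.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Adds <name>value</name> as a child of parent.
    private static void appendText(Document doc, Element parent, String name, String value) {
        Element child = doc.createElement(name);
        child.appendChild(doc.createTextNode(value));
        parent.appendChild(child);
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data for illustration only.
        List<Entry> today = Arrays.asList(
            new Entry("invoice", "C100", "SN-1", "199.00"),
            new Entry("return", "C200", "SN-2", "-49.50"));
        System.out.println(buildMessage("2000-01-31", today));
    }
}
```

The receiving process in customer service could read the same message with any standard XML parser, which is the point of the scenario: neither side writes a parser for a proprietary format.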
All companies have terabytes of data trapped inside their legacy applications. Even when the data is stored inside database management systems, it is not readily available because one must understand the schema and table relationships to make use of the data stored there. Unfortunately, it is this base of information that we now wish to use to foster decision support systems within the enterprise.
Luckily, all applications have a way to get at their data, even if it means reading the data right out of the screen memory buffer, a process known as "screen scraping." The question is, how can all this data be brought together in a way that is intelligible and manageable? That is, if new data types are continually being introduced, is there some way to organize the data for maximum reuse? The answer to these questions is yes!
The most likely reason for bringing these data types together is to provide a singular view of some business object; for example, the complete history of information for a customer or product. Inside most large organizations, this data can be found across many systems and stored in many different formats and data sources. XML provides an excellent vehicle for representing these new interim data objects.
If Company A wants to view the entire history of a product, inclusive of its sales since inception and broken down by geography and consumer; the sales plan for the product; and the biographies of the management team for the product, they would need a format that could support both structured data and free-form text. In addition, they would need a way to identify the information inside this new aggregate data object and some context for the information contained within it. Context is what allows Company A to differentiate between total sales figures in the Northeast and total sales figures for men between the ages of 25 and 34.
It is easy to see how XML could be used to represent the aggregate data set. The simplicity of the big picture, however, is that if the data source from which this data is extracted already generates XML, that data only needs to be grafted into the document as a child element of the root. Thus, XML simplifies the collection of data and provides a container for representing it.
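As a rough sketch of that grafting step, the following Java fragment parses each source's XML and imports its root element as a child of a new aggregate root. It assumes the standard JAXP and W3C DOM interfaces bundled with current JDKs; the class name `ProductHistoryAggregator`, the element names, and the sample values are invented for illustration and do not come from the book.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ProductHistoryAggregator {

    // Builds <rootName> and grafts each XML fragment's root element under it.
    public static Document aggregate(String rootName, String... fragments) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document aggregate = builder.newDocument();
        Element root = aggregate.createElement(rootName);
        aggregate.appendChild(root);

        for (String xml : fragments) {
            Document source = builder.parse(new InputSource(new StringReader(xml)));
            // A node belongs to the document that created it, so copy it
            // (deep = true) into the aggregate document before appending.
            Node imported = aggregate.importNode(source.getDocumentElement(), true);
            root.appendChild(imported);
        }
        return aggregate;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical output of two separate systems; the "region" attribute
        // supplies the kind of context discussed above.
        Document history = aggregate("productHistory",
            "<sales region=\"Northeast\"><total>1200</total></sales>",
            "<salesPlan>Free-form text describing the plan.</salesPlan>");

        Element first = (Element) history.getDocumentElement().getFirstChild();
        System.out.println(first.getNodeName() + " / " + first.getAttribute("region"));
    }
}
```

Structured figures and free-form text sit side by side under one root, and adding a new data source means appending one more child rather than redesigning the container.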
These two examples quickly illustrate what can be accomplished by learning how to integrate XML into your systems today. This book will show you how to do it using Java, but the principles can be applied to other programming environments just as easily.
Note: The book assumes a basic knowledge of Java programming and of XML. There are many books available on both subjects that can supply that knowledge. In this series, there are two books with XML tutorials: The XML Handbook™ by Goldfarb and Prescod, and XML by Example by Sean McGrath. The second edition of The XML Handbook also includes tutorials on namespaces, XLink, and other XML-based technologies discussed in this book.
The book begins with Part One, which covers basics of Enterprise Application Integration (EAI) and processing XML. It lays a very important foundation that will be used throughout the rest of the book. The part includes: