This section presents the architecture of our system for defining and implementing an XML data warehouse. Our system has been designed to integrate XML sources, using a data-warehousing approach. The data warehouse is defined as a set of XML materialized views.
The architecture depicted in Figure 16.1 is based on three main components:
The data warehouse specification module, which allows us to design the data warehouse
The data warehouse implementation module, which allows us to store XML data in a relational DBMS and manages data extraction and maintenance
The query manager module for querying the data warehouse
Figure 16.1. System Architecture
The Datawarehouse specification component allows us to design the content of the data warehouse. It provides a graphical editor that produces an XML document containing the data warehouse specification. This specification consists of information on the XML sources and the view specifications.
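To make the shape of such a specification more concrete, the following Python sketch parses a small specification document and extracts the source and view information. The element and attribute names (warehouse, source, view, pattern) and the example URLs are hypothetical, chosen only for illustration; the actual format produced by the editor is not described in this section.

```python
import xml.etree.ElementTree as ET

# Hypothetical specification document: all element and attribute names
# are illustrative assumptions, not the format of the actual editor.
SPEC = """
<warehouse name="demo">
  <sources>
    <source id="s1" url="http://example.org/books.xml"/>
    <source id="s2" url="http://example.org/authors.xml"/>
  </sources>
  <views>
    <view name="recent_books" source="s1">
      <pattern>/catalog/book</pattern>
    </view>
  </views>
</warehouse>
"""


def load_specification(text):
    """Return (sources, views) extracted from a specification document."""
    root = ET.fromstring(text)
    # Map each source identifier to the location of the XML source.
    sources = {s.get("id"): s.get("url") for s in root.iter("source")}
    # Each view records its name, the source it reads, and its pattern.
    views = [
        {
            "name": v.get("name"),
            "source": v.get("source"),
            "pattern": v.findtext("pattern"),
        }
        for v in root.iter("view")
    ]
    return sources, views


sources, views = load_specification(SPEC)
```

Here `sources` maps source identifiers to locations and `views` lists one dictionary per view specification, which is the information the other modules would need to read.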
The Datawarehouse implementation component is responsible for creating the relational database of the data warehouse. The XML data are stored in a relational DBMS to take advantage of the performance of this type of system. We distinguish two levels of data storage: (1) the Datawarehouse component stores the metadata (i.e., the patterns and the data describing the organization of the views) and (2) the XML data component stores the content of XML elements and attributes.
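The two storage levels can be sketched as two relational tables, one for view metadata and one for the content of elements and attributes. The sketch below uses SQLite from the Python standard library; the table and column names (view_meta, xml_node, and so on) are assumptions made for illustration and do not reproduce the schema of our system.

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_schema(conn):
    """Create a hypothetical two-level schema: metadata and XML content."""
    conn.executescript("""
        CREATE TABLE view_meta (
            view_id INTEGER PRIMARY KEY,
            name    TEXT,
            pattern TEXT
        );
        CREATE TABLE xml_node (
            node_id   INTEGER PRIMARY KEY,
            view_id   INTEGER REFERENCES view_meta(view_id),
            parent_id INTEGER,
            kind      TEXT,  -- 'element' or 'attribute'
            name      TEXT,
            value     TEXT
        );
    """)


def store(conn, view_id, elem, parent_id=None):
    """Store an element, its attributes, and its children recursively."""
    cur = conn.execute(
        "INSERT INTO xml_node (view_id, parent_id, kind, name, value) "
        "VALUES (?, ?, 'element', ?, ?)",
        (view_id, parent_id, elem.tag, (elem.text or "").strip()),
    )
    node_id = cur.lastrowid
    for attr_name, attr_value in elem.attrib.items():
        conn.execute(
            "INSERT INTO xml_node (view_id, parent_id, kind, name, value) "
            "VALUES (?, ?, 'attribute', ?, ?)",
            (view_id, node_id, attr_name, attr_value),
        )
    for child in elem:
        store(conn, view_id, child, node_id)
    return node_id


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO view_meta VALUES (1, 'books', '/catalog/book')")
store(conn, 1, ET.fromstring('<book id="1"><title>XML</title></book>'))
row_count = conn.execute("SELECT COUNT(*) FROM xml_node").fetchone()[0]
```

Keeping metadata and content in separate tables means that a view definition can be inspected or changed without scanning the stored XML data.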
The query manager is responsible for reconstructing XML documents from the relational data. In the future, we plan to use query-rewriting techniques (Manolescu et al. 2001) to translate an XML query posed on the data warehouse interface into an SQL query.
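As an illustration of the reconstruction step, the sketch below rebuilds an XML document from rows stored in a single node table. The table layout (xml_node with a parent_id column) and the sample rows are hypothetical and serve only to show the principle; this is not the query manager's actual algorithm.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# Hypothetical node table: one row per element or attribute.
conn.execute(
    "CREATE TABLE xml_node (node_id INTEGER PRIMARY KEY, parent_id INTEGER, "
    "kind TEXT, name TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO xml_node VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, "element", "book", ""),
        (2, 1, "attribute", "id", "1"),
        (3, 1, "element", "title", "XML"),
    ],
)


def rebuild(conn, node_id):
    """Rebuild the element stored under node_id, including its subtree."""
    name, value = conn.execute(
        "SELECT name, value FROM xml_node WHERE node_id = ?", (node_id,)
    ).fetchone()
    elem = ET.Element(name)
    elem.text = value or None
    # Fetch all children first so that recursion does not reuse the cursor.
    rows = conn.execute(
        "SELECT node_id, kind, name, value FROM xml_node "
        "WHERE parent_id = ? ORDER BY node_id",
        (node_id,),
    ).fetchall()
    for child_id, kind, child_name, child_value in rows:
        if kind == "attribute":
            elem.set(child_name, child_value)
        else:
            elem.append(rebuild(conn, child_id))
    return elem


doc = ET.tostring(rebuild(conn, 1), encoding="unicode")
```

Ordering children by node_id preserves document order only if identifiers were assigned in document order at load time, which is an assumption of this sketch.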