A detailed study of the impact of objects and type theory on the relational model of data, including a comprehensive proposal for type inheritance
"This is the first attempt to describe what object/relational means. If you're interested in object/relational technology, this is the book to read." --Rick van der Lans, independent consultant, author of Introduction to SQL, and past member of the Dutch committee responsible for developing the International SQL Standard
"This book is an excellent piece of work. It is very rare in computer science to come across a book that provides such a complete and precise theory that is systematically presented and compared to all of the other work in the area. Even those who find the conclusions controversial will admire this thoroughness." --Rick Cattell, ODMG Chair, author of Object Data Management and JDBC Database Access with Java, and co-editor of the Object Database Standard: ODMG 2.0
Foundation for Object/Relational Databases: The Third Manifesto is a proposal for the future direction of data and database management systems (DBMSs). It consists of a precise, formal definition of an abstract model of data, to be considered as a blueprint for the design of a DBMS and a database language. In particular, it provides a rock-solid foundation for integrating relational and object technologies, a foundation conspicuously lacking in current approaches to such integration.
The proposed foundation represents an evolutionary step, not a revolutionary one. It builds on Codd's relational model of data and on the research that resulted from that work. Most notably, it incorporates a precise and comprehensive specification for a method of defining data types, including a comprehensive model of type inheritance, to address a lack that has been observed by many authorities; thus, it also builds on research in the field of object orientation. With a sound footing in both camps of the object/relational divide, the Manifesto is offered as a firm foundation for true object/relational DBMSs.
The authors combine precision and thoroughness of exposition with the approachability that readers familiar with their previous publications will recognize and welcome. This book is essential reading for database students and professionals alike.
Hugh Darwen has been involved in software development since 1967 as an employee of IBM United Kingdom Ltd. He has been active in the relational database arena since 1978, and was one of the chief architects and developers of an IBM relational product called Business System 12--a product that faithfully embraced the principles of the relational model. His writings include notable contributions to Date's Relational Database Writings series (Addison-Wesley, 1990, 1992) and A Guide to the SQL Standard (4th edition, Addison-Wesley, 1997). He has been an active participant in the development of SQL international standards since 1988.
C.J. Date is an independent consultant, author, lecturer, and researcher specializing in relational database systems. He was one of the first persons to recognize and support Codd's pioneering work on the relational model. Mr. Date was also involved in technical planning for the IBM products SQL/DS and DB2. He is best known for his books, in particular An Introduction to Database Systems (6th edition, Addison-Wesley, 1996), which has sold well over half a million copies worldwide.
THE THIRD MANIFESTO: FOUNDATION FOR OBJECT / RELATIONAL DATABASES
by Hugh Darwen and C. J. Date
Note: A book with the same title by the same authors is due to be published by Addison-Wesley in mid 1998. The article that follows is basically a late draft of one chapter from that book, though it has been edited somewhat to make it more self-contained. A brief description of that book can be found at the end of this article. For more information, contact either of the authors (contact information is also given at the end of the article).
The Third Manifesto  is a detailed and rigorous proposal for the future of data and database management systems. The present article consists of an informal discussion of certain of the key technical ideas underlying the Manifesto, including in particular the idea that domains in the relational world and object classes in the object world are the same thing.
Copyright © 1998 Hugh Darwen and C. J. Date
There is much current interest in the database community in the possibility of integrating objects and relations. However (and despite the fact that several vendors have already announced -- in some cases, even released -- "object/relational" products), there is still some confusion over the question of the right way to perform that integration. Since part of the purpose of The Third Manifesto is to answer this very question, the idea of bringing the Manifesto to the attention of a wider audience than hitherto seems timely.
The Manifesto is meant as a foundation for the future of data and database management systems (DBMSs). Because of our twin aims in writing it of comprehensiveness and brevity, however, it is -- unfortunately but probably inevitably -- rather terse and not very easy to read; hence this introductory article (which might be characterized as "the view from 20,000 feet"). Our aim is to present some of the key technical ideas underlying the Manifesto in an informal manner, thereby paving the way for a proper understanding of the Manifesto itself. In particular, as already indicated, we would like to explain what we believe is the right way to integrate objects and relations. More precisely, we want to address the following question:
What concept in the relational world is the counterpart to the concept "object class" in the object world?
There are two equations that can be proposed as answers to this question:
1. domain = object class
2. relation = object class*
In what follows, we will argue strongly that the first of these equations is right and the second is wrong.
* More correctly, relvar = object class. See the section "Relations vs. Relvars," later.
WHAT PROBLEM ARE WE TRYING TO SOLVE?
Databases of the future will contain much more sophisticated kinds of data than current commercial ones typically do. For example, we might imagine a biological database that includes a BIRD relation like that shown in Fig. 1. Thus, what we want to do is extend -- dramatically -- the range of possible kinds of data that we can keep in our databases. Of course, we want to be able to manipulate that data, too; for example, we might want to find all birds whose migration route includes Italy:
SELECT NAME, DESCR, VIDEO
FROM BIRD
WHERE INCLUDES ( MIGR, COUNTRY ( 'Italy' ) ) ;
Note: We use SQL here for familiarity, though in fact the Manifesto expressly proscribes it (see the next section).
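As an illustration only, the query above can be mimicked in Python by treating BIRD as a set of rows whose MIGR column holds values of a user-defined type. The `Country` and `MigrationRoute` classes, the `includes` operator, the `route` helper, and the sample rows are all hypothetical; they are not taken from Fig. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Country:
    """Hypothetical user-defined type for country values."""
    name: str


@dataclass(frozen=True)
class MigrationRoute:
    """Hypothetical user-defined type; callers use its operators, not its representation."""
    stops: frozenset  # a frozenset of Country values

    def includes(self, country: Country) -> bool:
        """Counterpart of the INCLUDES operator used in the query."""
        return country in self.stops


def route(*names: str) -> MigrationRoute:
    """Helper (an assumption of this sketch) that builds a route from country names."""
    return MigrationRoute(frozenset(Country(n) for n in names))


# Invented sample rows, in the order (NAME, DESCR, VIDEO, MIGR).
BIRD = {
    ("Swallow", "small, forked tail", "swallow.mpg", route("Italy", "Egypt")),
    ("Puffin", "seabird, bright bill", "puffin.mpg", route("Iceland", "Norway")),
}

# SELECT NAME, DESCR, VIDEO FROM BIRD WHERE INCLUDES ( MIGR, COUNTRY ( 'Italy' ) )
result = {
    (name, descr, video)
    for (name, descr, video, migr) in BIRD
    if migr.includes(Country("Italy"))
}
```

The point of the sketch is only that the restriction condition is expressed through an operator of the column's type, so the query itself needs no knowledge of how a migration route is stored.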
Thus, the question becomes: How can we support new kinds of data within the relational framework? Note that we do take it as axiomatic that we want to stay in the relational framework! -- it would be unthinkable to walk away from nearly 30 years of solid relational R&D. We mustn't throw the baby out with the bathwater.
WHY THE THIRD MANIFESTO?
Before going any further, we should explain that "third" in our title. In fact, the Manifesto is the third in a series (of a kind). Its two predecessors are:
1. The Object-Oriented Database System Manifesto 
2. The Third Generation Database System Manifesto 
Like our own Manifesto, each of these documents offers a proposed basis for future DBMSs. However:
1. The first essentially ignores the relational model! In our opinion, this flaw is more than enough to rule it out immediately as a serious contender.
2. The second does agree that the relational model must not be ignored, but assumes that SQL (with all its faults) is an adequate realization of that model and hence an adequate foundation for the future. By contrast, we feel strongly that any attempt to move forward, if it's to stand the test of time, must reject SQL unequivocally. Our reasons for taking this position are many and varied, far too much so for us to spell them out in detail here; in any case, we've described them in depth in other places (see, e.g., references  and ), and readers are referred to those publications for the specifics.
A major thesis of The Third Manifesto is thus that we must get away from SQL and back to our relational roots. Of course, we do realize that SQL databases and applications are going to be with us for a very long time -- to think otherwise would be quite unrealistic. So we do have to pay some attention to the question of what to do about today's SQL legacy, and The Third Manifesto does include some proposals in this regard. Further details are beyond the scope of this article, however.
Without further preamble, let's take a look at some of the key technical aspects of our proposal.
RELATIONS vs. RELVARS
The first thing we have to do is clear up a confusion that goes back nearly 30 years. Consider the bill-of-materials relation shown in Fig. 2. As the figure indicates, every relation has two parts, a heading and a body; the heading is a set of column-name/domain-name pairs, the body is a set of rows that conform to that heading. For the relation in Fig. 2:
Now, there's a very important (though perhaps unusual) way of thinking about relations, and that's as follows. Given a relation R, the heading of R denotes a certain predicate (or truth-valued function), and each row in the body of R denotes a certain true proposition, obtained from that predicate by substituting certain domain values for that predicate's parameters ("instantiating the predicate"). In the case of the bill-of-materials example, the predicate is
part MAJOR_P# contains QTY of part MINOR_P#
(the three parameters are MAJOR_P#, QTY, and MINOR_P#, corresponding of course to the three columns of the relation), and the true propositions are
part P1 contains 2 of part P2
(obtained by substituting the domain values P1, 2, and P2);
part P1 contains 4 of part P3
(obtained by substituting the domain values P1, 4, and P3); and so on. In a nutshell:
It follows that:
Now we can get back to the main theme of the present section. Historically, there's been much confusion between relations per se (i.e., relation values) and relation variables. Suppose we say in some programming language:
DECLARE N INTEGER ...
N here isn't an integer per se, it's an integer variable whose values are integers per se -- different integers at different times. Likewise, if we say in SQL:
CREATE TABLE R ...
R here isn't a relation (or table) per se, it's a relation variable whose values are relations per se -- different relations at different times. And when we "update R" (e.g., by "inserting a new row"), what we're really doing is replacing the old relation value of R en bloc by an entirely new relation value. Of course, it's true that the old value and the new value are somewhat similar -- the new one just has one more row than the old one -- but conceptually they are different values.
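A minimal Python sketch of the two ideas in this section, under stated assumptions: a relation value is modeled as an immutable set of rows, each row is read as an instantiation of the bill-of-materials predicate, and the relvar is an ordinary variable that is assigned a new relation value on "insert". The `proposition` function and the third row are invented for illustration, and the fixed column order is a convenience of the sketch, not a property of relations.

```python
def proposition(row):
    """Instantiate the predicate 'part MAJOR_P# contains QTY of part MINOR_P#'."""
    major, qty, minor = row
    return f"part {major} contains {qty} of part {minor}"


# The relvar R: a variable whose current value is a relation (an immutable set of rows).
R = frozenset({("P1", 2, "P2"), ("P1", 4, "P3")})
old_value = R

# "Inserting a row" replaces the value of R en bloc by a new relation value.
# The row ("P2", 1, "P4") is invented for this sketch.
R = R | {("P2", 1, "P4")}

# Each row of the current value denotes a proposition taken to be true.
true_propositions = sorted(proposition(row) for row in R)
```

After the assignment, `old_value` still denotes the two-row relation; only the variable `R` has changed, which is the distinction between a relation and a relvar.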
Now, the trouble is that, very often, when people talk about relations, they really mean relation variables, not relations per se. This distinction -- or, rather, the fact that this distinction is usually not clearly made -- has been a rich source of confusion in the past. For example, the overall value of a given relation, like the overall value of a given domain, doesn't change over time, whereas of course the value of a relation variable certainly does. Despite this obvious difference, some people -- we suppress the names to protect the guilty -- have proposed that domains and relations (meaning relation variables) are really the same kind of thing! See the section "Relvars vs. Object Classes," later.
In The Third Manifesto, therefore, we've tried very hard to be clear on this point (and the same goes for the rest of the present article). Specifically, we've introduced the term relvar as a convenient shorthand for relation variable, and we've taken care to phrase our remarks in terms of relvars, not relations, when it's really relvars that we mean.
DOMAINS vs. OBJECT CLASSES
It's an unfortunate fact that most people have only a rather weak understanding of what domains are all about; typically they perceive them as just conceptual pools of values, from which columns in relations draw their actual values (to the extent they think about the concept at all, that is). This perception is accurate so far as it goes, but it doesn't go far enough. The fact is, a domain is really nothing more nor less than a data type -- possibly a simple system-defined data type like INTEGER or CHAR, more generally a user-defined data type like P# or QTY in the bill-of-materials example.
Now, it's important to understand that the data type concept includes the associated concept of the operators that can legally be applied to values of the type in question (values of that type can be operated upon solely by means of the operators defined for that type). For example, in the case of the system-defined INTEGER domain (or type -- we use the terms interchangeably):
Likewise, if we had a system that supported domains properly (but most of today's systems don't), then we would be able to define our own domains -- say the part number domain P#. And we would probably define operators "=", "<", and so on, for comparing two part numbers. However, we would probably not define operators "+", "*", and so on, which would mean that arithmetic on part numbers would not be supported.
Observe, therefore, that we distinguish very carefully between a data type per se and the representation or encoding of values of that type inside the system. For example, part numbers might be represented internally as character strings, but it doesn't follow that we can perform string operations on part numbers; we can perform such operations only if appropriate operators have been defined for the type. And (in general) the operators we define for a given domain will depend on that domain's intended meaning, not on the way values from that domain happen to be represented or encoded inside the system.
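The separation between a type and its representation can be sketched in Python as follows. This is only an illustration: the `PartNumber` class is a hypothetical stand-in for the P# domain, the choice of a string as the internal encoding is an assumption, and Python hides the `_code` attribute by convention rather than by enforcement.

```python
from functools import total_ordering


@total_ordering
class PartNumber:
    """Hypothetical P# domain: "=" and "<" are defined; arithmetic is not."""

    def __init__(self, code: str):
        self._code = code  # internal representation, not part of the type's interface

    def __eq__(self, other):
        if not isinstance(other, PartNumber):
            return NotImplemented
        return self._code == other._code

    def __lt__(self, other):
        if not isinstance(other, PartNumber):
            return NotImplemented
        return self._code < other._code

    def __hash__(self):
        return hash(self._code)


p1, p2 = PartNumber("P1"), PartNumber("P2")
comparison_ok = p1 < p2 and p1 != p2   # comparison operators were defined

try:
    p1 + p2                            # "+" was never defined for part numbers
    arithmetic_rejected = False
except TypeError:
    arithmetic_rejected = True
```

Although the values are stored as strings, string concatenation is unavailable on part numbers because no such operator was defined for the type.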
By now you might have realized that what we've been talking about is what's known in programming language circles as strong typing. Different writers have slightly different definitions for this term; as we use it, however, it means, among other things, that (a) everything has a type, and (b) whenever we try to perform an operation, the system checks that the operands are of the right type for the operation in question. And note carefully that -- as already indicated -- it's not just comparison operations that we're talking about here (despite the emphasis on comparisons in much of the database literature). E.g., suppose we're given the well-known suppliers-and-parts database, with relvars S (suppliers), P (parts), and SP (shipments), and consider the following expressions:
1. P.WEIGHT + SP.QTY /* part weight plus shipment quantity */
2. P.WEIGHT * SP.QTY /* part weight times shipment quantity */
The first of these expressions makes no sense, and the DBMS should therefore reject it. The second, on the other hand, does make sense -- it denotes the total weight for all parts involved in the shipment. So the operators we would define for weights and quantities would presumably include "*" but not "+".
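The two expressions can be imitated in Python with a pair of hypothetical types, `Weight` and `Quantity`, for which "*" is defined and "+" is not. The class names, field names, and sample values are assumptions made for this sketch; Python performs the check at run time, whereas a DBMS could do so earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical QTY domain."""
    units: int


@dataclass(frozen=True)
class Weight:
    """Hypothetical WEIGHT domain."""
    value: float

    def __mul__(self, other):
        # weight * quantity yields a total weight
        if isinstance(other, Quantity):
            return Weight(self.value * other.units)
        return NotImplemented


part_weight = Weight(12.0)       # invented sample values
shipment_qty = Quantity(300)

total = part_weight * shipment_qty     # accepted: "*" is defined for these operand types

try:
    part_weight + shipment_qty         # rejected: no "+" between a weight and a quantity
    sum_rejected = False
except TypeError:
    sum_rejected = True
```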
Observe now that so far we've said nothing at all about the nature of the values that can belong to a domain. In fact, those values can be anything at all! We tend to think of them as being very simple (numbers, strings, and so forth), but there's absolutely nothing in the relational model that requires them to be limited to such simple forms. Thus, we can have domains of sound recordings, domains of maps, domains of videos, domains of engineering drawings, domains of legal documents, domains of geometric objects (and so on, and so on). The only requirement is that (to say it one more time) the values in the domain must be manipulable solely by means of the operators defined for the domain in question.
The foregoing message is so important that we state it again in different words:
THE QUESTION AS TO WHAT DATA TYPES ARE SUPPORTED
IS ORTHOGONAL TO THE QUESTION OF SUPPORT FOR THE
RELATIONAL MODEL
To sum up, therefore: What we're saying is that, in the relational world, a domain is a data type, probably user-defined, of arbitrary internal complexity, whose values are manipulable solely by means of the operators defined for the type in question. Now, if we turn to the object-oriented (OO) world, we find that what is arguably the most fundamental OO concept of all, the object class, is a data type, probably user-defined, of arbitrary internal complexity, whose values are manipulable solely by means of the operators defined for the type in question ... In other words, domains and object classes are the same thing! Thus, we have here the key to integrating the two technologies-and, of course, this position is exactly what we espouse in The Third Manifesto. Indeed, we believe that a relational system that supported domains properly would be able to deal with all of those "problem" kinds of data that (it's often claimed) OO systems can handle and relational systems cannot: time-series data, biological data, financial data, engineering design data, office automation data, and so on. Accordingly, we also believe that a true "object/relational" system is nothing more than a true relational system -- which is to say, a system that supports the Relational Model, with all that that entails.
RELVARS vs. OBJECT CLASSES
In the previous section we equated object classes and domains. Many people, however, equate object classes and relvars instead (see reference  for an example). We now argue that this latter equation is a serious mistake. Indeed, the Manifesto includes a categorical statement to the effect that relvars are not domains.
Consider the following example. First, here's part of a simple object class definition, expressed in a hypothetical OO language (the keyword PUBLIC is meant to indicate that the specified items are "public instance variables"):
CREATE OBJECT CLASS EMP
PUBLIC ( EMP# CHAR(5),
And here's part of a simple relational -- or at least SQL -- table (relvar) definition:
CREATE TABLE EMP
It's very tempting to equate these two definitions! -- which is in effect what certain systems (both prototypes and commercial products) have already done. So let's take a closer look at this equation. More precisely, let's take the CREATE TABLE just shown, and let's consider a series of possible extensions that (some people would argue) make it more "OO"-like.
First, we allow column values to be tuples from some other relvar ("tuple" here being just another word for row, loosely speaking). In the example, we might replace the original CREATE TABLE by the following collection of definitions:
CREATE TABLE EMP
CREATE TABLE ACTIVITY
CREATE TABLE COMPANY
CREATE TABLE CITYSTATE
Fig. 3 shows the structure of relvar EMP at this point.
Explanation: Column HOBBY in relvar EMP is declared to be of type ACTIVITY. ACTIVITY in turn is a relvar of two columns, NAME and TEAM, where TEAM gives the number of players in the corresponding team -- for instance, a possible "activity" might be (Soccer,11). Each HOBBY value is thus actually a pair of values, a NAME value and a TEAM value (more precisely, it's a pair of values that currently appear as a row in relvar ACTIVITY). Note that we've already violated the dictum that relvars aren't domains!
Similarly, column WORKS_FOR in relvar EMP is declared to be of type COMPANY, and COMPANY is also a relvar of two columns, one of which is defined to be of type CITYSTATE, which is another two-column relvar, and so on. In other words, relvars ACTIVITY, COMPANY, and CITYSTATE are all considered to be types as well as relvars (as is relvar EMP itself, of course).
This first extension is thus roughly analogous to allowing objects to contain other objects, thereby supporting the concept sometimes known as a containment hierarchy.
Note: As an aside, we remark that we have characterized this first extension as "columns containing rows" because that is the way advocates of the "relvar = class" equation themselves characterize it. It would be more accurate, however, to characterize it as "columns containing pointers to rows" -- a point that we will be examining in a few moments. (In Fig. 3, therefore, we should really replace each of the three appearances of the term row by the term pointer to row.) Analogous remarks apply to the second extension also.
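The difference between "columns containing rows" and "columns containing pointers to rows" can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not a description of any particular product: rows are modeled as mutable dictionaries so that they can be shared by reference, and the new TEAM value of 7 is invented.

```python
# A row of the ACTIVITY relvar, modeled as a mutable record so it can be shared by reference.
soccer = {"NAME": "Soccer", "TEAM": 11}
ACTIVITY = [soccer]

# "Column containing a pointer to a row": HOBBY refers to the ACTIVITY row itself.
emp_by_pointer = {"EMP#": "E001", "HOBBY": soccer}

# "Column containing a row value": HOBBY holds its own (NAME, TEAM) pair.
emp_by_value = {"EMP#": "E001", "HOBBY": ("Soccer", 11)}

# An update to the ACTIVITY row ...
soccer["TEAM"] = 7

# ... is visible through the pointer, but leaves the value untouched.
team_seen_via_pointer = emp_by_pointer["HOBBY"]["TEAM"]
team_held_as_value = emp_by_value["HOBBY"][1]
```

With pointers, what the EMP row "contains" depends on the current state of another variable; with values, it does not.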
That second extension, then, is to add relation-valued columns. E.g., suppose employees can have an arbitrary number of hobbies, instead of just one (refer to Fig. 4):
CREATE TABLE EMP
Explanation: The HOBBIES value within any given row of relvar EMP is now (conceptually) a set of zero or more (NAME,TEAM) pairs -- i.e., rows -- from the ACTIVITY relvar. This second extension is thus roughly analogous to allowing objects to contain "aggregate" objects (a more complex version of the containment hierarchy).
The third extension is to permit relvars to have associated methods (i.e., operators). E.g.:
CREATE TABLE EMP
Explanation: RETIREMENT_BENEFITS is a method that takes a given EMP instance as its argument and produces a result of type NUMERIC. The code that implements the method is written in a language such as C.
The final extension is to permit the definition of subclasses. E.g. (refer to Fig. 5):
CREATE TABLE PERSON
CREATE TABLE EMP
Explanation: EMP now has three additional columns (SS#, BIRTHDATE, ADDRESS) inherited from PERSON. If PERSON had any methods, EMP would inherit those too.
Along with the definitional extensions sketched above, numerous manipulative extensions are required too, of course -- for instance:
( 'E001', 'Smith', $50000,
So much for a quick overview of how the "relvar = class" equation is realized in practice. What's wrong with it?
Well, first of all, a relvar is a variable and a class is a type; so how can they possibly be the same thing? (We showed in the section "Relations vs. Relvars" that relations and domains aren't the same thing; now we see that relvars and domains aren't the same thing either.)
The foregoing argument should be logically sufficient to stop the "relvar = class" idea dead in its tracks. However, there is more that can usefully be said on the subject, so let us agree to suspend disbelief a little longer ... Here are some more points to consider:
It follows that we're not really talking about the relational model any more. The fundamental data object isn't a relation containing values, it's a "relation" (actually not a proper relation at all) containing values and pointers.
Well, "class" EMP had just one method, RETIREMENT_BENEFITS, and that one clearly doesn't apply to V. In fact, it hardly seems reasonable that any methods that applied to "class" EMP would apply to V -- and there certainly aren't any others. So it looks as if (in general) no methods at all apply to the result of a projection; i.e., the result, whatever it is, isn't really a class at all. (We might say it's a class, but that doesn't make it one! -- it will have public instance variables and no methods, whereas we've already observed that a true class has methods and no public instance variables.)
In fact, it's clear that when people equate relvars and classes, it's specifically base relvars they're referring to -- they're forgetting about the derived ones. (Certainly the pointers discussed above point to rows in base relvars, not derived ones.) As we've argued elsewhere, to distinguish base and derived relvars in this way is a mistake of the highest order, because the question as to which relvars are base and which derived is, to a very large degree, arbitrary. For further discussion of this important issue, see that same paper.
A NOTE ON INHERITANCE
You might have noticed that we did briefly mention the possibility of inheritance in the previous section but not in the earlier section "Domains vs. Object Classes." And you might therefore have concluded that support for inheritance does constitute at least one point in favor of the "relvar = class" equation. Not so, however; we do indeed want to include inheritance as part of our "domain = class" approach, and thus (e.g.) be able to define domain CIRCLE as a "subdomain" of "superdomain" ELLIPSE. The problem is, however, there doesn't seem to be a clearly defined and generally agreed model of inheritance at the time of writing. As a consequence, The Third Manifesto includes conditional support for inheritance, along the lines of "if inheritance is supported, then it must be in accordance with some well defined and commonly agreed model." We do also offer some detailed proposals toward the definition of such a model.
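As a rough illustration of the "subdomain of superdomain" idea, the following Python sketch defines a hypothetical `Circle` type as a subtype of a hypothetical `Ellipse` type, so that operators defined for ellipses apply to circles as well. Python's class inheritance is used here purely as a stand-in; it is an assumption of the sketch and should not be read as the model of inheritance that the Manifesto proposes.

```python
import math


class Ellipse:
    """Hypothetical ELLIPSE domain with semiaxes a and b."""

    def __init__(self, a: float, b: float):
        self.a, self.b = a, b

    def area(self) -> float:
        """Operator defined for every ellipse."""
        return math.pi * self.a * self.b


class Circle(Ellipse):
    """Hypothetical CIRCLE subdomain: an ellipse whose semiaxes are equal."""

    def __init__(self, r: float):
        super().__init__(r, r)

    def radius(self) -> float:
        """Operator defined for circles only."""
        return self.a


c = Circle(2.0)
circle_area = c.area()          # inherited from the superdomain
circle_radius = c.radius()      # specific to the subdomain
```

Every circle is usable wherever an ellipse is expected, while `radius` is available only for values of the subdomain.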
We have discussed the question of integrating relational and object-oriented (OO) database concepts. In our opinion, OO contains exactly one unquestionably good idea: user-defined data types (which includes user-defined operators). It also contains one probably good idea: type inheritance (though the jury is still out on this one, to some extent). A key technical insight underlying The Third Manifesto is that these two ideas are completely orthogonal to the relational model. In other words, the relational model needs no extension, no correction, no subsumption -- and, above all, no perversion! -- in order for it to accommodate these orthogonal ideas.
To sum up, therefore: What we need is simply for the vendors to give us true relational DBMSs (and note that "true relational DBMSs" does not mean SQL systems) that include proper domain support. Indeed, an argument can be made that the whole reason OO systems (as opposed to "O/R" systems) look attractive is precisely the failure on the part of the SQL vendors to support the relational model adequately. But this fact shouldn't be seen as an argument for abandoning the relational model entirely (or at all!).
This article is an updated version of "Introducing The Third Manifesto" by Hugh Darwen and C. J. Date, which appeared in Database Programming & Design 8, No. 1 (January 1995); it appears here by permission of Miller Freeman Inc. We would also like to thank the many people who have reviewed drafts of The Third Manifesto and offered constructive criticism and helpful comments on those drafts.
1. Malcolm Atkinson et al. "The Object-Oriented Database System Manifesto." Proc. First International Conference on Deductive and Object-Oriented Databases, Kyoto, Japan (1989). New York, N.Y.: Elsevier Science (1990).
2. Hugh Darwen. "Adventures in Relationland." In C. J. Date and Hugh Darwen, Relational Database Writings 1985-1989. Reading, Mass.: Addison-Wesley (1990).
3. Hugh Darwen and C. J. Date. "The Third Manifesto." ACM SIGMOD Record 24, No. 1 (March 1995). Version Two of this document is due to be published in book form by Addison-Wesley in 1998.
4. C. J. Date. An Introduction to Database Systems (6th edition). Reading, Mass.: Addison-Wesley (1995).
5. C. J. Date. "Objects and Relations: Forty-Seven Points of Light." Data Base Newsletter 23, No. 5 (September/October 1995).
6. Won Kim. "On Marrying Relations and Objects: Relation-Centric and Object-Centric Perspectives." Data Base Newsletter 22, No. 6 (November/December 1994).
7. Michael Stonebraker et al. "Third Generation Database System Manifesto." ACM SIGMOD Record 19, No. 3 (September 1990).
THE THIRD MANIFESTO: A NOTE REGARDING THE BOOK-LENGTH VERSION
The book's full title is the same as that of the current article -- viz., The Third Manifesto: Foundation for Object/Relational Databases. And there's a subtitle too: a detailed study of the impact of objects and type theory on the relational model of data, including a comprehensive proposal for type inheritance. As noted in the abstract to the present article, The Third Manifesto is a detailed and rigorous proposal for the future of data and database management systems; it consists of a precise, formal definition of an abstract model of data, to be considered as a blueprint for the design of a DBMS and a database language. In particular, it provides a rock-solid foundation for integrating relational and object technologies, a foundation conspicuously lacking in current approaches to such integration.
The proposed foundation represents an evolutionary step, not a revolutionary one. It builds on Codd's relational model of data and on the research that sprang from Codd's work. Most notably, it incorporates a precise and comprehensive specification for a method of defining data types, including a comprehensive model of type inheritance, to address a lack that has been observed by many authorities; thus, it also builds on research in the field of object orientation. With a sound basis in both camps of the object/relational divide, therefore, the Manifesto is offered as a firm foundation for true object/relational DBMSs.
The book is arranged into four parts and a set of appendixes:
I. Preliminaries: Background and overview; objects and relations
II. Formal Specifications: The Manifesto proper; a new relational algebra; and a language called Tutorial D, a concrete realization of the ideas of the Manifesto
III. Informal Discussions and Explanations: A careful point-by-point examination and exposition of the Manifesto, with copious examples in Tutorial D
IV. Subtyping and Inheritance: A detailed and comprehensive proposal for a model of type inheritance, with (again) numerous examples
Appendixes: Annotated references and bibliography; comparisons with SQL3 and ODMG; database design considerations; and many other topics
The authors combine precision and thoroughness of exposition with the approachability that readers familiar with their previous publications will recognize and welcome. The book is essential reading for the database student or professional.
Author information: Hugh Darwen (firstname.lastname@example.org, +44 (0) 1926-464398 voice, +44 (0) 1926-410764) is a database systems specialist at IBM United Kingdom Limited; C. J. Date (+1 707/433-6523 voice, +1 707/433-7322 fax) is an independent author, lecturer, researcher, and consultant, specializing in relational database systems.
*** End *** End *** End ***
I. PRELIMINARIES.
1. Background and Overview.
What is The Third Manifesto?
Why did we write it?
Back to the relational future.
Some guiding principles.
Some crucial logical differences.
Topics deliberately omitted.
The Third Manifesto: A summary.
2. Objects and Relations.
What problem are we trying to solve?
Relations vs. relvars.
Domains vs. object classes.
Relvars vs. object classes.
A note on inheritance.
II. FORMAL SPECIFICATIONS.
3. The Third Manifesto.
RM Very Strong Suggestions.
OO Very Strong Suggestions.
4. A New Relational Algebra.
Motivation and justification.
◄REMOVE►, ◄RENAME►, and ◄COMPOSE►.
Treating operators as relations.
Transitive closure.
5. Tutorial D.
Types and expressions.
Relations and arrays.
Mapping the relational operations.
III. INFORMAL DISCUSSIONS AND EXPLANATIONS.
6. RM Prescriptions.
RM Prescription 1: Scalar types.
RM Prescription 2: Scalar values are typed.
RM Prescription 3: Scalar operators.
RM Prescription 4: Actual vs. possible representations.
RM Prescription 5: Expose possible representations.
RM Prescription 6: Type generator TUPLE.
RM Prescription 7: Type generator RELATION.
RM Prescription 8: Equality.
RM Prescription 9: Tuples.
RM Prescription 10: Relations.
RM Prescription 11: Scalar variables.
RM Prescription 12: Tuple variables.
RM Prescription 13: Relation variables (relvars).
RM Prescription 14: Real vs. virtual relvars.
RM Prescription 15: Candidate keys.
RM Prescription 16: Databases.
RM Prescription 17: Transactions.
RM Prescription 18: Relational algebra.
RM Prescription 19: Relvar names, relation selectors, and recursion.
RM Prescription 20: Relation-valued operators.
RM Prescription 21: Assignments.
RM Prescription 22: Comparisons.
RM Prescription 23: Integrity constraints.
RM Prescription 24: Relvar and database predicates.
RM Prescription 25: Catalog.
RM Prescription 26: Language design.
7. RM Proscriptions.
RM Proscription 1: No attribute ordering.
RM Proscription 2: No tuple ordering.
RM Proscription 3: No duplicate tuples.
RM Proscription 4: No nulls.
RM Proscription 5: No nullological mistakes.
RM Proscription 6: No internal-level constructs.
RM Proscription 7: No tuple-level operations.
RM Proscription 8: No composite attributes.
RM Proscription 9: No domain check override.
RM Proscription 10: Not SQL.
8. OO Prescriptions.
OO Prescription 1: Compile-time type checking.
OO Prescription 2: Single inheritance (conditional).
OO Prescription 3: Multiple inheritance (conditional).
OO Prescription 4: Computational completeness.
OO Prescription 5: Explicit transaction boundaries.
OO Prescription 6: Nested transactions.
OO Prescription 7: Aggregates and empty sets.
9. OO Proscriptions.
OO Proscription 1: Relvars are not domains.
OO Proscription 2: No object IDs.
10. RM Very Strong Suggestions.
RM Very Strong Suggestion 1: System keys.
RM Very Strong Suggestion 2: Foreign keys.
RM Very Strong Suggestion 3: Candidate key inference.
RM Very Strong Suggestion 4: Transition constraints.
RM Very Strong Suggestion 5: Quota queries.
RM Very Strong Suggestion 6: Generalized transitive closure.
RM Very Strong Suggestion 7: Tuple and relation parameters.
RM Very Strong Suggestion 8: Special (“default”) values.
RM Very Strong Suggestion 9: SQL migration.
11. OO Very Strong Suggestions.
OO Very Strong Suggestion 1: Type inheritance.
OO Very Strong Suggestion 2: Types and operators unbundled.
OO Very Strong Suggestion 3: Collection type generators.
OO Very Strong Suggestion 4: Conversions to/from relations.
OO Very Strong Suggestion 5: Single-level store.
IV. SUBTYPING AND INHERITANCE.
12. Preliminaries.
Toward a type inheritance model.
Single vs. multiple inheritance.
Scalars, tuples, and relations.
Summary.
13. Formal Specifications.
IM Proposals.
14. Informal Discussions and Explanations.
IM Proposal 1: Types are sets.
IM Proposal 2: Subtypes are subsets.
IM Proposal 3: “Subtype of” is reflexive.
IM Proposal 4: Proper subtypes.
IM Proposal 5: “Subtype of” is transitive.
IM Proposal 6: Immediate subtypes.
IM Proposal 7: Single inheritance only.
IM Proposal 8: Global root types.
IM Proposal 9: Type hierarchies.
IM Proposal 10: Subtypes can be proper subsets.
IM Proposal 11: Types disjoint unless one a subtype of the other.
IM Proposal 12: Scalar values (extended definition).
IM Proposal 13: Scalar variables (extended definition).
IM Proposal 14: Assignment with inheritance.
IM Proposal 15: Comparison with inheritance.
IM Proposal 16: Join etc. with inheritance.
IM Proposal 17: TREAT DOWN.
IM Proposal 18: TREAT UP.
IM Proposal 19: Logical operator IS_T(SX).
IM Proposal 20: Relational operator RX:IS_T(A).
IM Proposal 21: Logical operator IS_MS_T(SX).
IM Proposal 22: Relational operator RX:IS_MS_T(A).
IM Proposal 23: THE_ pseudovariables.
IM Proposal 24: Read-only operator inheritance and value substitutability.
IM Proposal 25: Read-only parameters to update operators.
IM Proposal 26: Update operator inheritance and variable substitutability.
What about specialization by constraint?
15. Multiple Inheritance.
The running example.
IM Proposals 1-26 revisited.
Many supertypes per subtype.
Least specific types unique.
Most specific types unique.
Comparison with multiple inheritance.
Operator inheritance.
16. Tuple and Relation Types.
Tuple and relation subtypes and supertypes.
IM Proposals 1-11 still apply.
Tuple and relation values (extended definitions).
Tuple and relation most specific types.
Tuple and relation variables (extended definitions).
Tuple and relation assignment.
Tuple and relation comparison.
Tuple and relation TREAT DOWN.
IM Proposals 18-26 revisited.
Appendixes.
Builtin relation operator invocations.
Free and bound range variable references.
Relation UPDATE and DELETE operators.
Examples.
Appendix B. The Database Design Dilemma.
Further considerations.
Appendix C. Specialization by Constraint.
A closer look.
The “3 out of 4” rule.
Can the idea be rescued?
Appendix D. Subtables and Supertables.
Some general observations.
The terminology is extremely bad.
The concept is not type inheritance.
Why?
Appendix E. A Comparison with SQL3.
RM Very Strong Suggestions.
OO Very Strong Suggestions.
IM Proposals (scalar types, single inheritance).
IM Proposals (scalar types, multiple inheritance).
IM Proposals (tuple and relation types).
History of the wrong equation in SQL3.
Appendix F. A Comparison with ODMG.
RM Very Strong Suggestions.
OO Very Strong Suggestions.
IM Proposals (scalar types, single inheritance).
IM Proposals (scalar types, multiple inheritance).
IM Proposals (tuple and relation types).
Appendix G. The Next 25 Years of the Relational Model?
Remarks on republication.
The Third Manifesto and SQL.
More on SQL.
Miscellaneous questions.
Appendix H. References and Bibliography.
The Third Manifesto is a detailed proposal for the future direction of data and database management systems (DBMSs). Like Codd's original papers on the relational model, it can be seen as an abstract blueprint for the design of a DBMS and the language interface to such a DBMS. In particular, it lays the foundation for what we believe is the logically correct approach to integrating relational and object technologies--a topic of considerable interest at the present time, given the recent appearance in the marketplace of several "object/relational" DBMS products (sometimes called universal servers). Perhaps we should add immediately that we do not regard the idea of integrating relational and object technologies as "just another fad," soon to be replaced by some other briefly fashionable idea. On the contrary, we think that object/relational systems are in everyone's future--a fact that makes it even more important to get the logical foundation right, of course, while we still have time to do so.
The first version of the Manifesto was published informally in early 1994 (though we had been thinking about the idea of such a document for several years prior to that time), and the first "official" version appeared in 1995. Since then we have presented the material in a variety of forms and forums and discussed it with numerous people--indeed, we continue to do so to this day--and we have refined and expanded the original document many, many times. We would like to stress, however, that those refinements and expansions have always been exactly that; nobody has ever shown us that we were completely on the wrong track, and development of the Manifesto has always proceeded in an evolutionary, not a revolutionary, manner. Now we feel it is time to make the material available in some more permanent form; hence the present book.
One reason we feel the time is ripe for wider dissemination of our ideas is as follows. As already indicated, we see a parallel between the Manifesto and Codd's original papers on the relational model; like those papers of Codd's, the Manifesto offers a foundation for what (we believe) the database systems of the future ought to look like. Also like those papers of Codd's, however, the Manifesto itself is, deliberately, fairly terse and not all that easy to read or understand. Would it not have been nice to have had a book that documented and explained and justified Codd's ideas, back at the beginning of the relational era? Well, here we are at the beginning of "the object/relational era," and--modesty aside--we believe this book can play a role analogous to that of that hypothetical relational book. To that end, we have been careful to include not only the formal specifications of the Manifesto itself (of course), but also a great deal of supporting and explanatory material and numerous detailed examples.
By the way, we should make it clear that our ideas do rest very firmly in the relational tradition. Indeed, we would like our Manifesto to be seen, in large part, as a definitive statement of just what the relational model itself consists of at the time of writing (for it too has undergone a certain amount of evolution over the years). Despite our remarks in the previous paragraph concerning "the object/relational era," therefore, the ideas expressed in the Manifesto must not be thought of as superseding those of the relational model, nor do they do so; rather, they use those ideas as a foundation and build on them. We believe strongly that the relational model is still highly relevant to database theory and practice and will remain so for the foreseeable future. Thus, we regard our Manifesto as being very much in the spirit of Codd's original work and continuing along the path he originally laid down. To repeat, we are talking evolution, not revolution.
There is another point to be made here, too. Given the current interest in object/relational systems, we can expect to see a flurry of books on such systems over the next few years. However, it is unlikely, if history is anything to go by, that those books will concern themselves very much with general principles or underlying theory; it is much more probable that they will be product-oriented, if not actually product-specific. The present book, by contrast, definitely is concerned with theoretical foundations rather than products; in other words, it allows you to gain a solid understanding of the underlying technology per se, thereby enabling you among other things to approach the task of evaluating commercial products from a position of conceptual strength.
While we are on the subject of commercial products, incidentally, we should make it clear that we ourselves have no particular commercial ax to grind. We regard ourselves as independent so far as the marketplace is concerned, and we are not trying to sell any particular product. The ax we do have to grind is that of logical correctness!--we want to do our best to ensure that the industry goes down the right path, not the wrong one.
And in that connection, we would like to mention another reason we feel the book is timely: namely, the fact that the SQL standards bodies, both national and international, are currently at work on a proposal called SQL3 that addresses some of the same issues as our Manifesto does. An appendix to the present book gives a detailed set of comparisons between our ideas and those of the current SQL3 proposal.
Note: Another body, the Object Database Management Group (ODMG), has also published a set of proposals that, again, address some of the same issues. Another appendix to this book therefore takes a look at the ODMG ideas as well.
Two more special features of the book are the following:
Finally, we should mention one further feature that we believe to be highly significant, and that is our proposal for a model of subtyping and inheritance. Many authorities have rightly observed that there is currently no consensus on any such model, and we offer our proposal for consideration in the light of this observation. Indeed, we believe we have some original--and, we also believe, logically sound and correct--thoughts to offer on this important subject. Part IV of the book (five chapters) is devoted to this topic.
The body of the book is divided into four major parts:
Part I sets the scene by explaining in general terms what the Manifesto is all about and why we wrote it. It also contains an informal overview of two approaches to building an object/relational system, one of which is (we claim) right and the other wrong. We recommend that you read both of these chapters fairly carefully before moving on to later parts of the book.
Part II is the most formal part. It consists of three chapters:
Note: Most of the material of these three chapters is provided primarily for purposes of reference; it is not necessary, and probably not even a good idea, to study it exhaustively, at least not on a first reading.
Part III is the real heart of the book. It consists of six chapters, one for each of the six sections of the Manifesto as defined in Part II. (Again, for the benefit of anyone who might have seen earlier drafts of the Manifesto, this part of the book consists essentially of a hugely expanded version of the informal commentary from those earlier drafts.) Each chapter discusses the relevant section of the Manifesto in considerable detail, with examples, and thereby explains the motivations and rationale behind the formal proposals of Part II (especially Chapter 3). Note, therefore, that the Manifesto itself serves as the organizing principle for this, the major part of the book.
Finally, Part IV does for subtyping and inheritance what Parts I, II, and III do for the Manifesto proper. It consists of five chapters. Chapter 12 gives an overall introduction to the topic; Chapter 13 gives formal definitions; and Chapter 14 gives informal explanations and discussions of the ideas underlying those formal definitions. Chapter 15 then extends the ideas of Chapters 12-14 to address multiple inheritance, and Chapter 16 then extends those ideas further to take tuple and relation types into account as well.
In addition to the foregoing, there are also several appendixes: one defining an alternative version of Tutorial D that is based on relational calculus instead of relational algebra, another discussing "subtables and supertables," another containing the text of an interview the present authors gave on the subject of the Manifesto in 1994, and so on. In particular, the SQL3 and ODMG comparisons can be found in this part of the book, as already mentioned. The final appendix, Appendix H, gives an annotated and consolidated list of references for the entire book.
Note: While we are on the subject of references to publications, we should explain that throughout the book such references take the form of numbers in square brackets. For example, the reference "[2]" refers to the second item in the list of references in Appendix H: namely, a paper by Malcolm P. Atkinson and O. Peter Buneman entitled "Types and Persistence in Database Programming Languages," published in ACM Computing Surveys, Volume 19, No. 2, in June 1987.
Finally, we should say a word about our use of terminology. It is our experience that many of the terms in widespread use in this field are subject to a variety of different interpretations, and that communication suffers badly as a result (examples seem superfluous--you can surely provide plenty of your own). While we have not deliberately used familiar terms in unfamiliar ways, therefore, we have nevertheless found it necessary to introduce our own terminology in certain places. We apologize if this fact causes you any unnecessary difficulties.
Who should read this book? Well, in at least one sense the book is definitely not self-contained--it does assume you are professionally interested in database technology and are therefore reasonably familiar with classical database theory and practice. However, we have tried to define and explain, as carefully as we could, any concepts that might be thought novel; in fact, we have done the same for several concepts that really should not be novel at all but do not seem to be as widely understood as they ought to be ("candidate key" is a case in point). Thus, we have tried to make the book suitable for both reference and tutorial purposes, and we have indicated clearly those portions of the book that are more formal in style and are provided primarily for reference.
Our intended audience is, therefore, just about anyone with a serious interest in database technology, including but not limited to the following:
First of all, we are delighted to be able to acknowledge all of the numerous friends and colleagues who, over the past several years, have given encouragement, participated in discussions, and offered comments (both written and oral) on various drafts of The Third Manifesto or portions thereof: John Andrews, Tanj Bennett, Charley Bontempo, Declan Brady, Bob Brown, Rick Cattell, Linda DeMichiel, Vincent Dupuis, Bryon Ehlmann, Mark Evans, Ron Fagin, Oris Friesen, Ric Gagliardi, Ray Gates, Mikhail Gilula, Zaid Holmin, Michael Jackson, Achim Jung, John Kneiling, Adrian Larner, Bruce Lindsay, David Livingstone, Albert Maier, Carl Mattocks, Nelson Mattos, David McGoveran, Roland Merrick, Serge Miranda, Jim Panttaja, Mary Panttaja, Fabian Pascal, Ron Ross, Arthur Ryman, Alan Sexton, Mike Sykes, Stephen Todd, Rick van der Lans, Anton Versteeg, and Fred Wright (and we apologize if we have inadvertently omitted anyone from this list). We would also like to acknowledge the many conference and seminar attendees, far too numerous to mention individually, who have expressed support for the ideas contained herein.
Second, we would like to thank our reviewers Charley Bontempo, Declan Brady, Rick Cattell, David Livingstone, and David McGoveran for their careful and constructive comments on the manuscript.
Third, we are--of course!--deeply indebted to our wives, Lindsay Darwen and Lindy Date, for their unfailing support throughout this project and so many others over the years.
Finally, we are, as always, grateful to our editor, Elydia Davis, and to the staff at Addison-Wesley for their assistance and their continually high standards of professionalism. It has been, as always, a pleasure to work with them.
Hugh Darwen adds: My gratitude to my colleague and friend, Chris Date, goes without saying. However, I would like to comment on something, significant to us, that you possibly haven't noticed. It concerns the book's attribution. In our previous joint productions our names have been linked by the preposition with, intended to distinguish the primary author from the contributing assistant. This time around we have thought it more appropriate to use the conjunction and, of whose commutativity we Relationlanders are especially conscious! We came to this conclusion despite the fact that, as usual, Chris has done the lion's share of the actual writing. That the writing so faithfully and agreeably records our joint thinking (often painstakingly wrought out) is therefore a source of great pleasure to me, especially in those cases where I can still identify the thinking in question as having arisen from ideas first placed into discussion by myself.
My own thinking has been molded, of course, with the aid of many valued mentors over the years, including Chris himself. Here I would like to single out just two other people for special mention: Adrian Larner for my relational thinking, and Nelson Mattos for my object-oriented thinking.
Chris Date adds: If Hugh feels he has learned from me over the years, I can assure you (and him) that I have most certainly learned a great deal from him!--a state of affairs for which I will always be grateful. As for the matter of the book's attribution, it is of course true that The Third Manifesto is a joint effort, but Hugh should really take the credit for being the original and prime mover on this project: It was he who came up with the idea of the Manifesto in the first place, and it was he who wrote the very first draft, early in 1994. Though I should immediately add that our thinking on the matters with which the Manifesto deals goes back very much further than that; in some respects, in fact, I think we could claim that it goes all the way back to the beginning of our respective careers in the database field.