Refactoring Databases: Evolutionary Database Design
- By Scott W. Ambler, Pramod J. Sadalage
- Published Mar 3, 2006 by Addison-Wesley Professional. Part of the Addison-Wesley Signature Series (Fowler) series.
- Copyright 2006
- Dimensions: 7x9-1/4
- Pages: 384
- Edition: 1st
- ISBN-10: 0-321-29353-3
- ISBN-13: 978-0-321-29353-4
Register your product to gain access to bonus material or receive a coupon.
Product Author Bios
Scott W. Ambler is a software process improvement (SPI) consultant living just north of Toronto. He is founder and practice leader of the Agile Modeling (AM) (www.agilemodeling.com), Agile Data (AD) (www.agiledata.org), Enterprise Unified Process (EUP) (www.enterpriseunifiedprocess.com), and Agile Unified Process (AUP) (www.ambysoft.com/unifiedprocess) methodologies. Scott is the (co-)author of several books, including Agile Modeling (John Wiley & Sons, 2002), Agile Database Techniques (John Wiley & Sons, 2003), The Object Primer, Third Edition (Cambridge University Press, 2004), The Enterprise Unified Process (Prentice Hall, 2005), and The Elements of UML 2.0 Style (Cambridge University Press, 2005). Scott is a contributing editor with Software Development magazine (www.sdmagazine.com) and has spoken and keynoted at a wide variety of international conferences, including Software Development, UML World, Object Expo, Java Expo, and Application Development. Scott graduated from the University of Toronto with a Master of Information Science. In his spare time Scott studies the Goju Ryu and Kobudo styles of karate.
Pramod J. Sadalage is a consultant for ThoughtWorks, an enterprise application development and integration company. He first pioneered the practices and processes of evolutionary database design and database refactoring in 1999 while working on a large J2EE application using the Extreme Programming (XP) methodology. Since then, Pramod has applied the practices and processes to many projects. Pramod writes and speaks about database administration on evolutionary projects, the adoption of evolutionary processes with regard to databases, and evolutionary practices’ impact upon database administration, in order to make it easy for everyone to use evolutionary design in regards to databases. When he is not working, you can find him spending time with his wife and daughter and trying to improve his running.
Refactoring has proven its value in a wide range of development projects—helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce powerful refactoring techniques specifically designed for database systems.
Ambler and Sadalage demonstrate how small changes to table structures, data, stored procedures, and triggers can significantly enhance virtually any database design—without changing semantics. You’ll learn how to evolve database schemas in step with source code—and become far more effective in projects relying on iterative, agile methodologies.
This comprehensive guide and reference helps you overcome the practical obstacles to refactoring real-world databases by covering every fundamental concept underlying database refactoring. Using start-to-finish examples, the authors walk you through refactoring simple standalone database applications as well as sophisticated multi-application scenarios. You’ll master every task involved in refactoring database schemas, and discover best practices for deploying refactorings in even the most complex production environments.
The second half of this book systematically covers five major categories of database refactorings. You’ll learn how to use refactoring to enhance database structure, data quality, and referential integrity; and how to refactor both architectures and methods. This book provides an extensive set of examples built with Oracle and Java and easily adaptable for other languages, such as C#, C++, or VB.NET, and other databases, such as DB2, SQL Server, MySQL, and Sybase.
Using this book’s techniques and examples, you can reduce waste, rework, risk, and cost—and build database systems capable of evolving smoothly, far into the future.
39 of 44 people found the following review helpful
Excellent refactoring reference and eye-opening book,
This review is from: Refactoring Databases: Evolutionary Database Design (Hardcover)This is an excellent book that, in my opinion, serves two purposes. First, it is a compendium of well thought-out ways to evolve a database design. Each refactoring includes descriptions of why you might make this change, tradeoffs to consider before making it, how to update the schema, how to migrate the data, and how applications that access the data will need to change. Some of the refactorings are simple ones that even the most change-resistant DBAs will have used in the past ("Add index"). Most others (such as "Merge tables" or "Replace LOB with Table") are ones many conventional thinking DBAs avoid, even to the detriment of the applications their databases support.
This brings me to the second purpose of this book. Many DBAs view their jobs as protectors of the data. While that is admirable, they sometimes forget that they are part of a software development team whose job is to provide value to the organization through the development of new (and enhancement of... Read more
13 of 13 people found the following review helpful
Some important ideas but much work remains to be done,
Amazon Verified Purchase(What's this?)
This review is from: Refactoring Databases: Evolutionary Database Design (Hardcover)In software development a 'refactoring' is a change that improves code quality without changing functionality. Refactoring helps keep an application maintainable over its life-cycle as requirements evolve, and is particularly of interest to those adopting modern 'agile' methodologies. This book comprises five general chapters on database refactoring - about 70 pages - followed by a 200 page catalog of various refactorings. The refactorings are classified as 'structural', 'data quality', 'referential integrity', 'architectural' and 'methods'. An additional chapter catalogs 'transformations', more on which in a moment. Each catalog entry uses a template including 'Motivation', 'Tradeoffs', 'Schema Update Mechanics', 'Data-Migration Mechanics' and 'Access Program Update Mechanics'. The 'Mechanics' sections include example code fragments for Oracle, JDBC and Hibernate.
Several of the structural refactorings are just simple database schema changes: rename/drop... Read more
18 of 21 people found the following review helpful
on the right path,
This review is from: Refactoring Databases: Evolutionary Database Design (Hardcover)This book is a much needed exploration on the subject. It tries to categorize those operations that developers and DBAs do on a database, who, for various reasons, must address a specific problem, need. I only gave it three stars because it is somewhat insufficient. It also doesn't make much of a distinction between a database in development and one in production. In the latter case, it is really difficult to make changes when there is already a data structure in place with data, being updated constantly by users, and plans to migrate to a different data model while a system is in production is really not for the faint of heart. What I am really loooking for is a thick book of bad designs that, for various reasons (unclear or evolving requirements, political), a database model presents itself with a bunch of problems, real problems and not just theoretical, and the ways a DBA and developers came about after a big pow-wow on how to solve it.
› See all 23 customer reviews...
Online Sample Chapter
Table of Contents
About the Authors xv
Chapter 1: Evolutionary Database Development 1
Chapter 2: Database Refactoring 13
Chapter 3: The Process of Database Refactoring 29
Chapter 4: Deploying into Production 49
Chapter 5: Database Refactoring Strategies 59
Chapter 6: Structural Refactorings 69
Chapter 7: Data Quality Refactorings 151
Chapter 8: Referential Integrity Refactorings 203
Chapter 9: Architectural Refactorings 231
Chapter 10: Method Refactorings 277
Chapter 11: Transformations 295
Appendix: The UML Data Modeling Notation 315
References and Recommended Reading 327
© Copyright Pearson Education. All rights reserved.
Evolutionary Database Design
Evolutionary, and often agile, software development methodologies, such as Extreme Programming (XP), Scrum, the Rational Unified Process (RUP), the Agile Unified Process (AUP), and Feature-Driven Development (FDD), have taken the information technology (IT) industry by storm over the past few years. For the sake of definition, an evolutionary method is one that is both iterative and incremental in nature, and an agile method is evolutionary and highly collaborative in nature. Furthermore, agile techniques such as refactoring, pair programming, Test-Driven Development (TDD), and Agile Model-Driven Development (AMDD) are also making headway into IT organizations. These methods and techniques have been developed and have evolved in a grassroots manner over the years, being honed in the software trenches, as it were, instead of formulated in ivory towers. In short, this evolutionary and agile stuff seems to work incredibly well in practice.
In the seminal book Refactoring, Martin Fowler describes a refactoring as a small change to your source code that improves its design without changing its semantics. In other words, you improve the quality of your work without breaking or adding anything. In the book, Martin discusses the idea that just as it is possible to refactor your application source code, it is also possible to refactor your database schema. However, he states that database refactoring is quite hard because of the significant levels of coupling associated with databases, and therefore he chose to leave it out of his book.
Since 1999 when Refactoring was published, the two of us have found ways to refactor database schemas. Initially, we worked separately, running into each other at conferences such as Software Development (http://www.sdexpo.com) and on mailing lists (http://www.agiledata.org/feedback.html). We discussed ideas, attended each other's conference tutorials and presentations, and quickly discovered that our ideas and techniques overlapped and were highly compatible with one another. So we joined forces to write this book, to share our experiences and techniques at evolving database schemas via refactoring.
The examples throughout the book are written in Java, Hibernate, and Oracle code. Virtually every database refactoring description includes code to modify the database schema itself, and for some of the more interesting refactorings, we show the effects they would have on Java application code. Because all databases are not created alike, we include discussions of alternative implementation strategies when important nuances exist between database products. In some instances we discuss alternative implementations of an aspect of a refactoring using Oracle-specific features such as the SE,T UNUSED or RENAME TO commands, and many of our code examples take advantage of Oracle's COMMENT ON feature. Other database products include other features that make database refactoring easier, and a good DBA will know how to take advantage of these things. Better yet, in the future database refactoring tools will do this for us. Furthermore, we have kept the Java code simple enough so that you should be able to convert it to C#, C++, or even Visual Basic with little problem at all.
Why Evolutionary Database Development?
Evolutionary database development is a concept whose time has come. Instead of trying to design your database schema up front early in the project, you instead build it up throughout the life of a project to reflect the changing requirements defined by your stakeholders. Like it or not, requirements change as your project progresses. Traditional approaches have denied this fundamental reality and have tried to "manage change," a euphemism for preventing change, through various means. Practitioners of modern development techniques instead choose to embrace change and follow techniques that enable them to evolve their work in step with evolving requirements. Programmers have adopted techniques such as TDD, refactoring, and AMDD and have built new development tools to make this easy. As we have done this, we have realized that we also need techniques and tools to support evolutionary database development.
Advantages to an evolutionary approach to database development include the following:
You minimize waste. An evolutionary, just-in-time (JIT) approach enables you to avoid the inevitable wastage inherent in serial techniques when requirements change. Any early investment in detailed requirements, architecture, and design artifacts is lost when a requirement is later found to be no longer needed. If you have the skills to do the work up front, clearly you must have the skills to do the same work JIT.
You avoid significant rework. As you will see in Chapter 1, "Evolutionary Database Development," you should still do some initial modeling up front to think major issues through, issues that could potentially lead to significant rework if identified late in the project; you just do not need to investigate the details early.
You always know that your system works. With an evolutionary approach, you regularly produce working software, even if it is just deployed into a demo environment, which works. When you have a new, working version of the system every week or two, you dramatically reduce your project's risk.
You always know that your database design is the highest quality possible. This is exactly what database refactoring is all about: improving your schema design a little bit at a time.
You work in a compatible manner with developers. Developers work in an evolutionary manner, and if data professionals want to be effective members of modern development teams, they also need to choose to work in an evolutionary manner.
You reduce the overall effort. By working in an evolutionary manner, you only do the work that you actually need today and no more.
There are also several disadvantages to evolutionary database development:
Cultural impediments exist. Many data professionals prefer to follow a serial approach to software development, often insisting that some form of detailed logical and physical data models be created and baselined before programming begins. Modern methodologies have abandoned this approach as being too inefficient and risky, thereby leaving many data professionals in the cold. Worse yet, many of the "thought leaders" in the data community are people who cut their teeth in the 1970s and 1980s but who missed the object revolution of the 1990s, and thereby missed gaining experience in evolutionary development. The world changed, but they did not seem to change with it. As you will learn in this book, it is not only possible for data professionals to work in an evolutionary, if not agile, manner, it is in fact a preferable way to work.
Learning curve. It takes time to learn these new techniques, and even longer if you also need to change a serial mindset into an evolutionary one.
Tool support is still evolving. When Refactoring was published in 1999, no tools supported the technique. Just a few years later, every single integrated development environment (IDE) has code-refactoring features built right in to it. At the time of this writing, there are no database refactoring tools in existence, although we do include all the code that you need to implement the refactorings by hand. Luckily, the Eclipse Data Tools Project (DTP) has indicated in their project prospectus the need to develop database-refactoring functionality in Eclipse, so it is only a matter of time before the tool vendors catch up.
Agility in a Nutshell
Although this is not specifically a book about agile software development, the fact is that database refactoring is a primary technique for agile developers. A process is considered agile when it conforms to the four values of the Agile Alliance (http://www.agilealliance.org). The values define preferences, not alternatives, encouraging a focus on certain areas but not eliminating others. In other words, whereas you should value the concepts on the right side, you should value the things on the left side even more. For example, processes and tools are important, but individuals and interactions are more important. The four agile values are as follows:
Individuals and interactions OVER processes and tools. The most important factors that you need to consider are the people and how they work together; if you do not get that right, the best tools and processes will not be of any use.
Working software OVER comprehensive documentation. The primary goal of software development is to create working software that meets the needs of its stakeholders. Documentation still has its place; written properly, it describes how and why a system is built, and how to work with the system.
Customer collaboration OVER contract negotiation. Only your customer can tell you what they want. Unfortunately, they are not good at thisthey likely do not have the skills to exactly specify the system, nor will they get it right at first, and worse yet they will likely change their minds as time goes on. Having a contract with your customers is important, but a contract is not a substitute for effective communication. Successful IT professionals work closely with their customers, they invest the effort to discover what their customers need, and they educate their customers along the way.
Responding to change OVER following a plan. As work progresses on your system, your stakeholders' understanding of what they want changes, the business environment changes, and so does the underlying technology. Change is a reality of software development, and as a result, your project plan and overall approach must reflect your changing environment if it is to be effective.
How to Read This Book
The majority of this book, Chapters 6 through 11, consists of reference material that describes each refactoring in detail. The first five chapters describe the fundamental ideas and techniques of evolutionary database development, and in particular, database refactoring. You should read these chapters in order:
Chapter 1, "Evolutionary Database Development," overviews the fundamentals of evolutionary development and the techniques that support it. It summarizes refactoring, database refactoring, database regression testing, evolutionary data modeling via an AMDD approach, configuration management of database assets, and the need for separate developer sandboxes.
Chapter 2, "Database Refactoring," explores in detail the concepts behind database refactoring and why it can be so hard to do in practice. It also works through a database-refactoring example in both a "simple" single-application environment as well as in a complex, multi-application environment.
Chapter 3, "The Process of Database Refactoring," describes in detail the steps required to refactor your database schema in both simple and complex environments. With single-application databases, you have much greater control over your environment, and as a result need to do far less work to refactor your schema. In multi-application environments, you need to support a transition period in which your database supports both the old and new schemas in parallel, enabling the application teams to update and deploy their code into production.
Chapter 4, "Deploying into Production," describes the process behind deploying database refactorings into production. This can prove particularly challenging in a multi-application environment because the changes of several teams must be merged and tested.
Chapter 5, "Database Refactoring Strategies," summarizes some of the "best practices" that we have discovered over the years when it comes to refactoring database schemas. We also float a couple of ideas that we have been meaning to try out but have not yet been able to do so.
About the Cover
Each book in the Martin Fowler Signature Series has a picture of a bridge on the front cover. This tradition reflects the fact that Martin's wife is a civil engineer, who at the time the book series started worked on horizontal projects such as bridges and tunnels. This bridge is the Burlington Bay James N. Allan Skyway in Southern Ontario, which crosses the mouth of Hamilton Harbor. At this site are three bridges: the two in the picture and the Eastport Drive lift bridge, not shown. This bridge system is significant for two reasons. Most importantly it shows an incremental approach to delivery. The lift bridge originally bore the traffic through the area, as did another bridge that collapsed in 1952 after being hit by a ship. The first span of the Skyway, the portion in the front with the metal supports above the roadway, opened in 1958 to replace the lost bridge. Because the Skyway is a major thoroughfare between Toronto to the north and Niagara Falls to the south, traffic soon exceeded capacity. The second span, the one without metal supports, opened in 1985 to support the new load. Incremental delivery makes good economic sense in both civil engineering and in software development. The second reason we used this picture is that Scott was raised in Burlington Ontarioin fact, he was born in Joseph Brant hospital, which is near the northern footing of the Skyway. Scott took the cover picture with a Nikon D70S.
Downloadable Sample Chapter
Download the Sample
Chapter related to this title.
Download the Foreword
file related to this title.
This product currently is not for sale.
Get access to thousands of books and training videos about technology, professional development and digital media from more than 40 leading publishers, including Addison-Wesley, Prentice Hall, Cisco Press, IBM Press, O'Reilly Media, Wrox, Apress, and many more. If you continue your subscription after your 30-day trial, you can receive 30% off a monthly subscription to the Safari Library for up to 12 months. That's a total savings of $199.