Berkeley DB Transactional Data Store Applications
- Why Transactions?
- Terminology
- Application Structure
- Opening the Environment
- Opening the Databases
- Recoverability and Deadlock Avoidance
- Atomicity
- Repeatable Reads
- Transactional Cursors
- Nested Transactions
- Environment Infrastructure
- Deadlock Detection
- Performing Checkpoints
- Database and Log File Archival Procedures
- Log File Removal
- Recovery Procedures
- Recovery and Filesystem Operations
- Berkeley DB Recoverability
- Transaction Throughput
It is difficult to write a useful tutorial on transactions that stays within the reasonable bounds of documentation; that is, without turning it into a book on transactional programming. We have two goals in this section: to familiarize readers with the transactional interfaces of Berkeley DB and to provide code building blocks that will be useful for creating applications.
We have not attempted to present this information using a real-world application. First, transactional applications are often complex and time-consuming to explain. Also, one of our goals is to give you an understanding of the wide variety of tools Berkeley DB makes available to you, and no single application would use most of the interfaces included in the Berkeley DB library. For these reasons, we have chosen to simply present the Berkeley DB data structures and programming solutions, using examples that differ from page to page. All the examples are included in a standalone program that you can examine, modify, and run, and from which you can extract code blocks for your own applications. Fragments of the program are presented throughout this section, and the complete text of the example program for IEEE/ANSI Std 1003.1 (POSIX) standard systems is included in the Berkeley DB distribution.
Why Transactions?
Perhaps the first question to answer is "Why transactions?" There are a number of reasons to include transactional support in your applications. The most common ones are the following:
Recoverability. Applications often need to ensure that no matter how the system or application fails, previously saved data is available the next time the application runs.
Deadlock avoidance. When multiple threads of control change the database at the same time, there is usually the possibility of deadlock; that is, each of the threads of control owns a resource another thread wants, so no thread is able to make forward progress because every thread is waiting for a resource held by another. Deadlocks are resolved by having one of the operations involved release the resources it controls so the other operations can proceed. (The operation releasing its resources usually just tries again later.) Transactions are necessary so that any changes that were already made to the database can be undone as part of releasing the held resources.
Atomicity. Applications often need to make multiple changes to one or more databases, but want to ensure that either all of the changes happen or none of them happens. Transactions guarantee that a group of changes is atomic; that is, if the application or system fails, either all of the changes to the databases will appear when the application next runs, or none of them will appear.
Repeatable reads. Applications sometimes need to ensure that while doing a group of operations on a database, the value returned as a result of a database retrieval doesn't change; that is, if you retrieve the same key more than once, the data item will be the same each time. Transactions guarantee this behavior.
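The deadlock and atomicity points above can be illustrated with a minimal sketch in plain C. It does not use the Berkeley DB API: the `store_t` and `txn_t` types, the `txn_begin`, `txn_commit`, and `txn_abort` helpers, and the `SIM_DEADLOCK` code are all hypothetical stand-ins invented for illustration. A simulated deadlock interrupts a two-step transfer after the first update, the partial change is undone from a snapshot, and the operation is retried.

```c
#include <assert.h>
#include <stddef.h>

#define NACCOUNTS 2
#define SIM_DEADLOCK (-1)  /* hypothetical stand-in for a deadlock error */

/* Toy in-memory "database": two account balances. */
typedef struct {
    long balance[NACCOUNTS];
} store_t;

/* Toy transaction: remembers the pre-transaction state so it can be undone. */
typedef struct {
    store_t *store;
    store_t snapshot;
} txn_t;

static void txn_begin(txn_t *t, store_t *s) { t->store = s; t->snapshot = *s; }
static void txn_commit(txn_t *t) { (void)t; /* changes are already in place */ }
static void txn_abort(txn_t *t) { *t->store = t->snapshot; }

/*
 * Move amount from account 0 to account 1 in two separate updates.
 * While *deadlocks_left is positive, a deadlock is simulated between
 * the two updates, leaving the store half-changed.
 */
static int transfer_once(store_t *s, long amount, int *deadlocks_left)
{
    s->balance[0] -= amount;
    if (*deadlocks_left > 0) {
        --*deadlocks_left;
        return SIM_DEADLOCK;
    }
    s->balance[1] += amount;
    return 0;
}

/*
 * Run the transfer inside a toy transaction, aborting and retrying on a
 * simulated deadlock. Returns the number of retries that were needed, or
 * -1 if the transfer still failed after max_retries retries. In either
 * case the store holds all of the changes or none of them.
 */
int transfer(store_t *s, long amount, int simulated_deadlocks, int max_retries)
{
    txn_t t;
    int attempt;

    assert(s != NULL && max_retries >= 0);
    for (attempt = 0; attempt <= max_retries; attempt++) {
        txn_begin(&t, s);
        if (transfer_once(s, amount, &simulated_deadlocks) == 0) {
            txn_commit(&t);
            return attempt;
        }
        txn_abort(&t);  /* undo the partial debit before trying again */
    }
    return -1;
}
```

For example, with a store initialized to balances of 100 and 0, `transfer(&s, 30, 2, 5)` aborts twice, succeeds on the third attempt, and leaves the balances at 70 and 30. The whole-store snapshot is only the simplest way to show the idea of undoing partial changes; it is not meant to describe how Berkeley DB itself implements abort.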