It’s a sad fact that we spend more time dealing with performance after an application is in production than during its design phase. Yet it’s usually the architecture and design of an application that limits its ultimate performance potential. A hasty design decision made early in the application’s lifecycle can require days or months of corrective action once the application is deployed.
In this article, we’ll review some of the key considerations for designing an application that works effectively with the Oracle database server and which can be expected to achieve optimal performance and scalability.
The Data Model
The performance of any database application is fundamentally constrained by its data model. The data modelmore than any other factordetermines how much work must be undertaken to satisfy a database request. Furthermore, the data model is the hardest aspect of an application to change once deployed. Even small changes to the data model will typically have ramifications through all layers of the application, and there can often be significant downtime when migrating data from the old model to the new model.
Data modeling is an enormous topic, but here are some general principles:
- Start with a normalized data model. Normalization involves eliminating redundancy and ensuring that all data is correctly, completely, and unambiguously represented.
- Use varying length character strings (VARCHAR2) in preference to fixed-length (CHAR) strings. Varying length strings use less storage, resulting in less table scan IO unless the data is truly fixed-length.
- For character data longer than 4000 bytes, choose one of the modern object Large Object (LOB) types, not the legacy LONG data type. If you want the data to be stored outside of the database (in the original files, for instance), then use the BFILE type. If the data contains only text, then use a CLOB; if binary, then use the BLOB type.
- Allowing columns to be NULL can have significant performance implications. NULLs aren’t usually included in indexes, so don’t make a column NULL if you might want to perform an indexed search to find the null values. On the other hand, NULLs take up less storage than a default value, and this can result in smaller tables that are faster to scan.
Subtypes and Super Types
Sometimes you will model an entity with subtypes. For instance, a PERSON entity might have subtypes of CUSTOMER and EMPLOYEE. During the physical modeling phase, you can create one, two, or three tables to implement the entity (see Figure 1). For most applications, two tables (CUSTOMER and EMPLOYEE, for instance) results in the most efficient SQL. The three-table option (PERSON, CUSTOMER, and EMPLOYEE tables, for instance) requires that a join be performed for almost all operations and should be avoided.
Figure 1 Options for modeling super types and subtypes.
The normalized data model may be a good starting point, but we often want to introduce redundant, repeating, or otherwise non-normalized structures into the physical model to get the best performance. For instance, we might:
- Replicate columns from one table in another to avoid joins.
- Create summary tables to avoid expensive aggregate queriespossibly using materialized views.
- Vertically partition a table so that long, infrequently accessed columns are stored in a separate table.
Remember, denormalization introduces the risk of data inconsistency, and the overhead of maintaining denormalized data can slow down transaction processing. Using triggers to maintain denormalization is a good idea because they centralize the logic.
Oracle’s partitioning option requires separate licensing, but offers many advantages:
- Some queries may be able to work on a subset of partitions. This partition elimination can reduce IO overhead.
- Some parallel operationsespecially DML operationscan be significantly faster when partitioning is available.
- Purging old data can sometimes be achieved by quickly dropping a partition instead of laboriously deleting thousands or millions of rows.
- Some forms of contentionhot blocks and latchescan be reduced by splitting up the table across the multiple segments of a partitioned object.