- Introduction
-
Table of Contents
- Microsoft SQL Server Defined
- Microsoft SQL Server Features
- Microsoft SQL Server Administration
-
Microsoft SQL Server Programming
- An Outline for Development
- Database
- Database Services
- Database Objects: Databases
- Database Objects: Tables
- Database Objects: Table Relationships
- Database Objects: Keys
- Database Objects: Constraints
- Database Objects: Data Types
- Database Objects: Views
- Database Objects: Stored Procedures
- Database Objects: Indexes
- Database Objects: User Defined Functions
- Database Objects: Triggers
- Database Design: Requirements, Entities, and Attributes
- Business Process Model Notation (BPMN) and the Data Professional
- Business Questions for Database Design, Part One
- Business Questions for Database Design, Part Two
- Database Design: Finalizing Requirements and Defining Relationships
- Database Design: Creating an Entity Relationship Diagram
- Database Design: The Logical ERD
- Database Design: Adjusting The Model
- Database Design: Normalizing the Model
- Creating The Physical Model
- Database Design: Changing Attributes to Columns
- Database Design: Creating The Physical Database
- Database Design Example: Curriculum Vitae
- NULLs
- The SQL Server Sample Databases
- The SQL Server Sample Databases: pubs
- The SQL Server Sample Databases: NorthWind
- The SQL Server Sample Databases: AdventureWorks
- The SQL Server Sample Databases: Adventureworks Derivatives
- UniversalDB: The Demo and Testing Database, Part 1
- UniversalDB: The Demo and Testing Database, Part 2
- UniversalDB: The Demo and Testing Database, Part 3
- UniversalDB: The Demo and Testing Database, Part 4
- Getting Started with Transact-SQL
- Transact-SQL: Data Definition Language (DDL) Basics
- Transact-SQL: Limiting Results
- Transact-SQL: More Operators
- Transact-SQL: Ordering and Aggregating Data
- Transact-SQL: Subqueries
- Transact-SQL: Joins
- Transact-SQL: Complex Joins - Building a View with Multiple JOINs
- Transact-SQL: Inserts, Updates, and Deletes
- An Introduction to the CLR in SQL Server 2005
- Design Elements Part 1: Programming Flow Overview, Code Format and Commenting your Code
- Design Elements Part 2: Controlling SQL's Scope
- Design Elements Part 3: Error Handling
- Design Elements Part 4: Variables
- Design Elements Part 5: Where Does The Code Live?
- Design Elements Part 6: Math Operators and Functions
- Design Elements Part 7: Statistical Functions
- Design Elements Part 8: Summarization Statistical Algorithms
- Design Elements Part 9:Representing Data with Statistical Algorithms
- Design Elements Part 10: Interpreting the Data—Regression
- Design Elements Part 11: String Manipulation
- Design Elements Part 12: Loops
- Design Elements Part 13: Recursion
- Design Elements Part 14: Arrays
- Design Elements Part 15: Event-Driven Programming Vs. Scheduled Processes
- Design Elements Part 16: Event-Driven Programming
- Design Elements Part 17: Program Flow
- Forming Queries Part 1: Design
- Forming Queries Part 2: Query Basics
- Forming Queries Part 3: Query Optimization
- Forming Queries Part 4: SET Options
- Forming Queries Part 5: Table Optimization Hints
- Using SQL Server Templates
- Transact-SQL Unit Testing
- Index Tuning Wizard
- Unicode and SQL Server
- SQL Server Development Tools
- The SQL Server Transact-SQL Debugger
- The Transact-SQL Debugger, Part 2
- Basic Troubleshooting for Transact-SQL Code
- An Introduction to Spatial Data in SQL Server 2008
- Performance Tuning
- Practical Applications
- Professional Development
- Application Architecture Assessments
- Business Intelligence
- Tips and Troubleshooting
- Additional Resources
Creating The Physical Model
Last updated Mar 28, 2003.
In the last few articles in this series, I’ve explained the process to develop a database design in an intelligent, systematic way. I've covered the process for gathering business requirements, the conversion of those requirements into logical statements, the creation of entities, attributes and relationships, and the creation of the logical Entity Relationship Diagram (ERD) a graphical tool showing this logical design.
The example database I’m designing tracks the hours consultants spend on projects. I've been careful to design the entities and attributes in accordance with the requirements document in fact, making sure the requirements reflect the data, rules and processes for the system forms the basis of everything else I’ll show you in this process. It’s the single-most important step.
After the initial design, I explained a problem with the relationship aspect of the diagram. I had to go back to the design committee for clarification, and once I did, I altered the design to properly reflect the relationship. While I was at it, I added a few attributes that were missing.
This design is what is delivered to the business community, along with the business rules, user documentation and so forth, in the form of that graphical Entity Relationship Diagram. That serves as a discussion vehicle, something that has simple symbols and text to rapidly convey what information the database will store, and the rules that will be enforced for optionality (whether something can, should or must exist) and cardinality (how data is related to each other). Once all of the teams agree on the logical layout of the data, the next step is to turn that diagram into a physical model for whatever platform the organization will use. In this series, I’ll use SQL Server, but if you pick another database system such as Oracle or XML files, the diagram will look different. This is the reason for separating the logical ERD from the physical design it allows the business and developers to create a program that can be implemented on multiple platforms, and the data professionals to focus on the best way to implement the actual platform in the most secure, best performing method.
The business community normally doesn't have much use for the physical ERD I’ll make next. That deliverable is used more by the database analysts, administrators and developers. Not every shop uses a graphical ERD for a physical model. In some cases you’ll go straight from the logical ERD graphic to Transact-SQL statements, and that’s OK. If I’m using an expensive tool for the data design, it often has the capability to simply add an option to show either the logical or physical ERD. I find that the larger the project, the more graphics I use, since I often need to coordinate with more people.
There are several ways to convert a logical design to a physical implementation, but the route I usually use looks something like this:
- Apply first normal form rules to the logical ERD
- Convert the entities into tables
- Convert the attributes into fields
- Create primary keys
- Apply second normal form rules to the logical ERD
- Create foreign keys
- Enforce business rules with constraints and defaults
- Apply third normal form rules to the logical ERD
- Add any platform components
While this might look like a short list, it's longer than you might think. Let's try it out, one step at a time.
Apply first normal form rules to the logical ERD
Recall that first normal form specifies separating out any repeating entities into multiple entities.
Convert the entities into tables
The logical model I have largely fits this rule, so the first part of the process is complete.
Convert the attributes into fields
So far, the logical diagram holds with a couple of small changes. For one, I’ll change any spaces in words to underlines. While SQL Server allows spaces in the names of many objects, I’d have to bracket [ ] them when I reference them in any T-SQL statements. That's inviting an error, and I would like to avoid errors where possible.
I’ve also changed the singular notation on the entities to plural. While this seems a minor change, people can get really concerned about it. My personal rule of thumb is that when I designing logical entities, I'm describing a concept. Therefore, Client is a singular entity, so I keep it singular. Hours and Skills don't stand alone, so those entities I keep plural. When I convert the logical entities to physical tables, I'm no longer describing concepts, but buckets. For that reason, I’ll stick with naming the tables in the plural. There are other schools of thought on this, but in the end the key is to be understandable and consistent.
I’m not done with the fields just yet, though. The other part of this process I’ll have to complete is setting data types, nullability and so on. I’ll complete that in a bit.
Create primary keys
At this point I’ll add my primary keys. In another article on this site I discussed primary keys at length, but for my purposes here I'll simplify the process for creating a primary key into just one rule:
Set a primary key that completely identifies each row in the table in a unique way.
It's a simple rule, and yet its application is a bit more complex. As you may recall from the earlier tutorial regarding keys, there are two possibilities. The first is that a current attribute fits the bill. This is called a "natural" key. In my case, the Staff Member, Client, and Project entities already have unique tracking numbers assigned to them, so they make great primary keys.
Just as often I’ll follow the second practice of setting the primary key to something that never occurs naturally in the database it’s just a number or series of characters that I make up to represent the actual values on that row, called a “surrogate” key. It’s like the way your employee number at work represents your name and hire date.
An important aspect to the simple rule from above is that the primary key shouldn't change, ever. While you can change a primary key later, it's painful. This brings up an issue with the choice for the primary key. If the natural key might change or may not exist, use a surrogate or artificial key. I’ll have to do that for the Hours and Skills entities, since the values I’m storing there are not unique, and may change over time.
Apply second normal form rules to the logical ERD
Second normal form says that I need to separate out any shared attributes into separate tables. I’ve largely accomplished that goal already, but an additional part of this form is that in separating these attributes into other tables, I need to set up the foreign keys. These are the non-unique values in one table that point to the parent or relating table.
Create foreign keys
In my diagram, I already have pointer attributes (which convert to foreign keys) in the Hours entity. Notice the Project Code and Staff Code attributes in the logical diagram. In the physical model I’ve renamed those attributes to fields called Project_Code and Staff_Code.
In addition, I need to create a few more foreign key fields. I’ve added Staff_Code to the Skills table, and Client_Code to the Projects table.
It's no accident that I’ve named the foreign keys the same as the primary keys in their parent tables. This is usually a good practice, but if it isn't possible I could preface the name with FK_ or some other naming convention. Many tools will “pick up” on the duplicate names and infer that you want them to relate to each other, and some are even intelligent enough to use the Primary/Foreign Key designations to help your users design reports, screens and so on. The easier you make the process on your users, the less they need you to be involved and the fewer errors you’ll face.
Apply third normal form rules to the logical ERD
Third normal form dictates that I have to make sure that all the fields in a table depend on their primary key. It's easier to see this with an example. Let's say that I had a table called Employees with the following fields:
- Employee_Code
- Name
- Address
- Phone
- Nationality
At first glance, this table might seem to conform to third normal form. But what about that Nationality field? While it is true that the employee can only have one nationality, it may violate third normal form because the nationality of an employee isn't dependent on their employee code. Nationality may need to be moved out to its own table and a relating table created to match up Nationality to Employee.
I examine my current design, and believe that I am in third normal form.
Enforce business rules with constraints and defaults
Next I need to apply some constraints to the diagram. In the tutorial regarding database objects I explained a few of the constraints you have available and how to apply them, but here's that short list again:
- Keys, primary and foreign
- Not Null
- Check
- Unique
This process is often the part that takes the longest, and requires the most care. Even in my simple database, this process requires a lot of work, so I'm going to table this part of this part of the tutorial for next week. I wanted to make sure I covered the entire process, but I want to balance that with giving this section its due.
Add any platform components
Here's where I set the fields to the data types, nullability, and other SQL Server-specific constructs. Since this part of the process is also quite lengthy, I'll cover it in the next part of the series with the constraints section.
There's more to do, and I haven't even started my indexes, storage strategy, and maintenance. Even after that, I still have to explain programming considerations such as views, stored procedures and the security. We’ll get to all that in the next few articles in this series.
