## Understanding Normalization

Normalization is simply a set of rules that will ultimately make your life easier when you're wearing your database administrator hat. It's the art of organizing your database in such a way that your tables are related where appropriate and flexible for future growth.

The sets of rules used in normalization are called *normal forms*. If
your database design follows the first set of rules, it's considered to be in the
*first normal form*. If the first three sets of rules of normalization are
followed, your database is said to be in the *third normal form*.

Throughout this hour, you'll learn about each rule in the first, second, and third normal forms and hopefully will follow them as you create your own applications. You'll be using an example set of tables for a students and courses database and taking it to the third normal form.

### Problems with the Flat Table

Before launching into the first normal form, you have to start with something
that needs to be fixed. In the case of a database, it's the *flat
table*. A flat table is like a spreadsheet—many, many columns. There are
no relationships between multiple tables; all the data you could possibly want
is right there in that flat table. This scenario is inefficient and consumes
more physical space on your hard drive than a normalized database.

In your students and courses database, assume you have the following fields in your flat table:

- `StudentName`: The name of the student.
- `CourseID1`: The ID of the first course taken by the student.
- `CourseDescription1`: The description of the first course taken by the student.
- `CourseInstructor1`: The instructor of the first course taken by the student.
- `CourseID2`: The ID of the second course taken by the student.
- `CourseDescription2`: The description of the second course taken by the student.
- `CourseInstructor2`: The instructor of the second course taken by the student.

Repeat the `CourseID`, `CourseDescription`, and `CourseInstructor` columns many more times to account for all the classes a student can take during their academic career.

With what you've learned so far, you should be able to identify the
first problem area: `CourseID`, `CourseDescription`, and
`CourseInstructor` columns are repeated groups.

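Eliminating redundancy is the first step in normalization, so next you'll take this flat table to first normal form. If your table remained in its flat format, you could have a lot of unclaimed space (empty course columns for students who take only a few courses) and a lot of space being used unnecessarily (the same course details stored again for every student who takes that course). That's not an efficient table design!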
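To make the cost of the flat layout concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for whatever database you use. The table name `flat_students` and the sample rows are invented for illustration, and only two course groups are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flat table: one row per student, with a repeated group of course columns.
# Only two groups are shown; a real flat table would need many more.
conn.execute("""
    CREATE TABLE flat_students (
        StudentName        TEXT,
        CourseID1          TEXT,
        CourseDescription1 TEXT,
        CourseInstructor1  TEXT,
        CourseID2          TEXT,
        CourseDescription2 TEXT,
        CourseInstructor2  TEXT
    )
""")

# Sample rows are hypothetical.
rows = [
    ("Ann", "C101", "Intro to Databases", "Dr. Lee", "C102", "Web Basics", "Dr. Kim"),
    ("Bob", "C101", "Intro to Databases", "Dr. Lee", None, None, None),
]
conn.executemany("INSERT INTO flat_students VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Unclaimed space: rows whose second course group is empty.
empty_slots = conn.execute(
    "SELECT COUNT(*) FROM flat_students WHERE CourseID2 IS NULL"
).fetchone()[0]

# Space used unnecessarily: the same description stored once per student.
duplicate_descriptions = conn.execute(
    "SELECT COUNT(*) FROM flat_students "
    "WHERE CourseDescription1 = 'Intro to Databases'"
).fetchone()[0]

print(empty_slots, duplicate_descriptions)
```

Every additional course a student might take would require three more columns on every row, whether or not a given student uses them.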

### First Normal Form

The rules for the first normal form include

- Eliminate repeating information.
- Create separate tables for related data.

If you think about the flat table design, with its many repeated sets of fields, you can identify two distinct topics: students and courses. Taking your students and courses database to the first normal form means that you create two tables: one for students and one for courses (the `students_courses` table), shown in Figure 3.9.

**Figure 3.9 Breaking the flat table
into two tables.**

Your two tables now represent a one-to-many relationship of one student to
many courses. Students can take as many courses as they wish and are not limited
to the number of
`CourseID`/`CourseDescription`/`CourseInstructor` groupings
that existed in the flat table.
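A rough sketch of this two-table layout follows, again using Python's `sqlite3` module. The figure is not reproduced here, so the exact column layout, the `students` table name, the composite primary key, and the sample data are assumptions based on the field names used in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (
        StudentID   INTEGER PRIMARY KEY,
        StudentName TEXT NOT NULL
    );
    -- One row per course a student takes, instead of one column group per course.
    CREATE TABLE students_courses (
        StudentID         INTEGER NOT NULL REFERENCES students(StudentID),
        CourseID          TEXT NOT NULL,
        CourseDescription TEXT,
        CourseInstructor  TEXT,
        PRIMARY KEY (StudentID, CourseID)
    );
""")

# Hypothetical sample data.
conn.execute("INSERT INTO students VALUES (1, 'Ann')")
conn.executemany(
    "INSERT INTO students_courses VALUES (?, ?, ?, ?)",
    [
        (1, "C101", "Intro to Databases", "Dr. Lee"),
        (1, "C102", "Web Basics", "Dr. Kim"),
        (1, "C103", "Networking", "Dr. Lee"),
    ],
)

# A third course needs only another row, not another set of columns.
course_count = conn.execute(
    "SELECT COUNT(*) FROM students_courses WHERE StudentID = 1"
).fetchone()[0]
print(course_count)
```

Note that the course description and instructor are still repeated for every student who takes a given course, which is what the next normal form addresses.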

The next step is to put the tables into second normal form.

### Second Normal Form

The rule for the second normal form is

No non-key attributes depend on a portion of the primary key.

In plain English, this means that if fields in your table are not entirely
related to a primary key, you have more work to do. In the students and courses
example, a course's description and instructor depend only on `CourseID`, not on
which student is taking the course. Fixing this means breaking out the courses
into their own table and modifying the `students_courses` table.

`CourseID`, `CourseDescription`, and `CourseInstructor` can become
a table called `courses` with a primary key of `CourseID`. The
`students_courses` table should then just contain two fields: `StudentID`
and `CourseID`. You can see this new design in Figure
3.10.

**Figure 3.10 Taking your tables to
second normal form.**
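The second normal form design described above might be sketched as follows with Python's `sqlite3` module. The `courses` and `students_courses` columns come from the text; the `students` table definition, the column types, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE students (
        StudentID   INTEGER PRIMARY KEY,
        StudentName TEXT NOT NULL
    );
    CREATE TABLE courses (
        CourseID          TEXT PRIMARY KEY,
        CourseDescription TEXT,
        CourseInstructor  TEXT
    );
    -- Mapping table: only the two keys remain.
    CREATE TABLE students_courses (
        StudentID INTEGER NOT NULL REFERENCES students(StudentID),
        CourseID  TEXT NOT NULL REFERENCES courses(CourseID),
        PRIMARY KEY (StudentID, CourseID)
    );
""")

# Hypothetical sample data; each course is described exactly once.
conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [("C101", "Intro to Databases", "Dr. Lee"), ("C102", "Web Basics", "Dr. Kim")],
)
conn.executemany(
    "INSERT INTO students_courses VALUES (?, ?)",
    [(1, "C101"), (1, "C102"), (2, "C101")],
)

# Join through the mapping table to list who takes what.
enrolled = conn.execute("""
    SELECT s.StudentName, c.CourseDescription
    FROM students_courses AS sc
    JOIN students AS s ON s.StudentID = sc.StudentID
    JOIN courses  AS c ON c.CourseID  = sc.CourseID
    ORDER BY s.StudentName, c.CourseID
""").fetchall()
print(enrolled)
```

Because the description lives only in `courses`, correcting it means changing one row, no matter how many students are enrolled.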

This structure should look familiar to you as a many-to-many relationship using an intermediary mapping table. The third normal form is the last form we'll look at, and you'll find it's just as simple to understand as the first two.

### Third Normal Form

The rule for the third normal form is

No attributes depend on other non-key attributes.

This rule simply means that you need to look at your tables and see whether any
remaining fields hold data that doesn't depend on a key and could be broken out
further. Think about removing repeated data and you'll find your answer: instructors.
Inevitably, an instructor will teach more than one class, so the same instructor
would be repeated across many rows of the `courses` table. However, `CourseInstructor`
is not a key of any sort. So, if you break out this information and create a
separate table purely for the sake of efficiency and maintenance (as shown in
Figure 3.11), that's the third normal form.

**Figure 3.11 Taking your tables to
third normal form.**
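One possible shape for the third normal form step is sketched below with Python's `sqlite3` module. The text does not name the new table or its columns, so `instructors`, `InstructorID`, and `InstructorName` are hypothetical, as is the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    -- Hypothetical table and column names for the broken-out instructor data.
    CREATE TABLE instructors (
        InstructorID   INTEGER PRIMARY KEY,
        InstructorName TEXT NOT NULL
    );
    CREATE TABLE courses (
        CourseID          TEXT PRIMARY KEY,
        CourseDescription TEXT,
        InstructorID      INTEGER REFERENCES instructors(InstructorID)
    );
""")

# One instructor teaching two courses (sample data is invented).
conn.execute("INSERT INTO instructors VALUES (1, 'Dr. Lee')")
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [("C101", "Intro to Databases", 1), ("C103", "Networking", 1)],
)

# The instructor's name is stored once, so a change touches a single row.
conn.execute(
    "UPDATE instructors SET InstructorName = 'Prof. Lee' WHERE InstructorID = 1"
)

names = [
    row[0]
    for row in conn.execute("""
        SELECT i.InstructorName
        FROM courses AS c
        JOIN instructors AS i ON i.InstructorID = c.InstructorID
        ORDER BY c.CourseID
    """)
]
print(names)
```

Both courses pick up the new name through the join, which is the maintenance benefit this step is after.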

Third normal form is usually adequate for removing redundancy and allowing for flexibility and growth. The next section will give you some pointers for the thought process of database design and where it fits in the overall design process of your application.