Database Design: Creating an Entity Relationship Diagram
Last updated May 13, 2011.
In the last few articles, I've explained the process of designing a database. I've detailed database objects and then shown you the process of defining entities. In this series I've also explained a bit about relationships.
As you follow this process, what you're really doing is creating a data model. A data model displays the entities, attributes and relationships involved with a store of data. It's important to realize that a data model is not a process flow diagram; there are no 'start here' and 'end here' boxes. It represents a static view of the data and its relationships. Data models also convey lots of information in a compact space, and are useful to developers, designers, architects and business professionals.
There are really three phases or types of models. The first is the conceptual model. This diagram shows only the basic entities and how they relate to each other, and is used by management or other high-level participants in the process. These people aren't interested in the details, just the nature of what information they need to keep. I'll cover this tool in another tutorial.
The second type of model is the logical diagram. This diagram contains more detail, and is a bit more specific to a Relational Database Management System (RDBMS). This is what I'll focus on in this tutorial — and I'll use the information from the business data requirements from the last few articles and map it to the symbols and processes used for a logical model.
This model is called an Entity Relationship Diagram, or ERD. The ERD is useful because of the amount of information it displays in a small space, and the relatively few symbols you have to learn to understand it. It's mostly composed of boxes and lines, with a circle here and there. From those primitive symbols you can quickly understand how a database should be laid out. The other advantage is that the logical ERD is not tied to a specific vendor — in other words, you could use this diagram to create a database for SQL Server, Oracle, MySQL or any other RDBMS. As such it's a useful skill to learn.
After you create a logical ERD, the next type of diagram is the physical ERD. This diagram is platform-dependent, and has a great deal of detail. After I flesh out the logical ERD, I'll morph it into my physical ERD, and from there, I can create my database.
I'd like to clear up the terminology a bit. An Entity Relationship Diagram (ERD) is the formal term for a Crow's Foot diagram. There are other styles of graphical diagrams to show the model of a database as well. I use the term ERD to describe any or all of these methods. While there are software programs to create these diagrams, what I'm describing is more a process than a result. You can read about other symbol notation here. I find that the Crow's Foot notation is a little easier to read on small pictures, but you can use any type you like, as long as your team agrees on the standard you want to use.
A word is also warranted here about the software that you can use to create these diagrams. There are packages specifically designed to work with an underlying database, which take you from the first drafts all the way through to the SQL statements needed to create the database. These packages can cost thousands of dollars to obtain and take weeks of training to fully use. There are also tools such as Microsoft's Visio that will do much the same thing for less money and with a shorter learning curve. You can also use any graphics program, if all you're after is the documentation part. All you really need is a pencil and paper to create the diagram, as long as you follow the agreed-on symbols.
Let's get to those symbols. I need to describe the entities, their attributes, and relationships between the entities. Because it's a graphic, the ERD conveys a lot of information in very little space.
You'll recall from previous articles that entities are the basic units that you work with, which normally represent a group of data elements you'll make into a table. Entities are normally discovered from the business requirements for a database as the nouns in the sentence. Entities will eventually become tables in your finished database.
Entities are represented by two shapes. A box like this one represents a 'parent' or owning entity:
Here's where I'll deviate a bit from the standard. Notice that the box above has square edges. That means this entity stands alone — nothing else is needed. We call this a "parent" entity.
If an entity must have some other entity to exist, like a child in real life, I use a rounded box to indicate that - like this one:
The name of the entity is written above the box. As I explain in the articles on the business requirements for this sample project, I can have a client that hasn't started the formal project yet. You'll also recall that I can't have a project without a client, so clients own projects. This simple difference in shape denotes whether an entity is a parent or child.
Attributes are also nouns in the business requirements document, but they are further descriptions of the entity. For instance, blue is a color of socks. In this case, color is an attribute of sock, and blue is the value of that particular sock's color.
Attributes aren't represented with a graphic; they are placed inside the box of their entity. So the Client entity from my diagram might look like this:
Relationships between entities are enforced with key fields between them, but simply having a key doesn't explain how (cardinality), how many (degree), whether a child is required (optionality), and other key information. An ERD helps solve this problem by including the key and graphically demonstrating these items.
The key for an entity is shown by drawing a line near the top of the box, and placing the key field above the line, like this:
You might recognize that I haven't covered the Client Code attribute yet, but I will. One of the great things about an ERD is that it forces you to think things out more iteratively, which might change your design.
If the key is a composite or multiple-value key, place all of the key's attribute names above the line.
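If it helps to see what one of these boxes actually records, here is a minimal Python sketch of an entity as a small data structure. It is only an illustration, not part of any modeling tool: the `Entity` class and the `Client Name` attribute are hypothetical names I made up for the example, while `Client Code` is the key attribute mentioned above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """One box on a logical ERD (hypothetical helper, for illustration only)."""
    name: str                      # written above the box
    key: List[str]                 # key attributes, placed above the line (assumed non-empty)
    attributes: List[str] = field(default_factory=list)  # placed below the line
    independent: bool = True       # True: square box (parent); False: rounded box (child)

    def is_composite_key(self) -> bool:
        # A composite key is made up of more than one attribute.
        return len(self.key) > 1

    def render(self) -> str:
        """Return a plain-text drawing of the box."""
        width = max(len(text) for text in self.key + self.attributes)
        corner = "+" if self.independent else "."   # "." stands in for a rounded corner
        border = corner + "-" * (width + 2) + corner
        divider = "|" + "-" * (width + 2) + "|"
        lines = [self.name, border]
        lines += [f"| {k.ljust(width)} |" for k in self.key]
        lines.append(divider)
        lines += [f"| {a.ljust(width)} |" for a in self.attributes]
        lines.append(border)
        return "\n".join(lines)


# "Client Name" is an assumed attribute, used only to fill the box.
client = Entity("Client", key=["Client Code"], attributes=["Client Name"])
rendered = client.render()
print(rendered)
```

A composite key would simply list more than one attribute in `key`, and all of them would be drawn above the divider line.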
The other parts of the relationship are shown in the ERD with a series of lines and shapes.
To show the relationship (cardinality), a simple line is drawn between the entities. If there is one child per parent, then a single end to the line is used.
To show a 'many' relationship on the child side, a 'crow's foot' shape is used:
This shows the degree of the relationship. It's also permissible to put a number on the line, representing a set number of items, or even a range.
To show the optionality of the relationship, one of two symbols is used. If children are optional, then an open O shape is used, like this:
If the value is mandatory, then you show that relationship with a cross-bar instead of the open circle, like this:
It's also important to know that the optionality symbols can be used on the parent or the child side of the line. That's not true of the crow's foot, since just as in nature, children have only one set of natural parents! If you find yourself with two entities where each one can relate to many of the other, don't panic. This is called a many-to-many relationship, and you can resolve it with the use of another entity. I'll cover that in a bit. For now, just draw them like they fall.
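The "another entity" that resolves this situation is commonly called an associative (or junction) entity: it sits between the two originals, and each original becomes the parent in a one-to-many relationship with it. The following Python sketch shows the idea under some assumptions. The `resolve_many_to_many` function, the `Employee` entity, and the `Project Code` and `Employee Code` keys are all hypothetical and do not come from the requirements in this series.

```python
from typing import Dict, List


def resolve_many_to_many(left: str, left_key: List[str],
                         right: str, right_key: List[str]) -> Dict[str, object]:
    """Replace a many-to-many relationship with an associative entity.

    The new entity is a child of both originals, and its key is a
    composite built from the keys of both parents.
    """
    link_name = f"{left} {right}"
    return {
        "entity": link_name,
        "key": left_key + right_key,   # composite key: both attribute lists combined
        "relationships": [
            # (parent, parent side, child, child side)
            (left, "one", link_name, "many"),
            (right, "one", link_name, "many"),
        ],
    }


# Hypothetical example: a project has many employees, and an employee
# works on many projects.
result = resolve_many_to_many("Project", ["Project Code"],
                              "Employee", ["Employee Code"])
print(result["entity"], result["key"])
for relationship in result["relationships"]:
    print(relationship)
```

After the change, no crow's foot points at either original entity; both crow's feet land on the new entity in the middle.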
Let's put all this together. Take a look at this diagram snippet and see if you can guess what it's trying to tell you:
How did you do? Here's the rundown:
- A client owns projects
- A client can have many projects
- You don't have to have a project to have a client
- You can't have a project without a client
- There are various attributes displayed for clients and projects
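Reading the diagram this way is mechanical enough that it can be sketched in code. The Python below is a rough illustration, assuming a hypothetical `Relationship` class whose three flags stand for the symbols on the line (crow's foot, open circle, cross-bar). It produces the relationship statements from the rundown above and uses deliberately naive pluralization.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Relationship:
    """One line between two entities (hypothetical helper, for illustration only)."""
    parent: str
    child: str
    child_many: bool       # True: crow's foot on the child end
    child_required: bool   # False: open circle on the child end (children are optional)
    parent_required: bool  # True: cross-bar on the parent end (a child needs a parent)

    def rundown(self) -> List[str]:
        """Translate the symbols on the line into plain sentences."""
        p, c = self.parent.lower(), self.child.lower()
        lines = [f"A {p} owns {c}s"]   # naive plural: just adds "s"
        lines.append(f"A {p} can have many {c}s" if self.child_many
                     else f"A {p} can have only one {c}")
        lines.append(f"A {p} must have at least one {c}" if self.child_required
                     else f"You don't have to have a {c} to have a {p}")
        lines.append(f"You can't have a {c} without a {p}" if self.parent_required
                     else f"A {c} can exist without a {p}")
        return lines


# The Client and Project relationship described in the rundown.
client_project = Relationship("Client", "Project", child_many=True,
                              child_required=False, parent_required=True)
statements = client_project.rundown()
for sentence in statements:
    print(f"- {sentence}")
```

Flipping any one flag changes exactly one sentence, which is a handy way to check that you are reading each symbol on the line correctly.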
While it's not quite a thousand words, it does represent a lot of data effectively. When the diagram gets quite large, you'll see that it represents the data model in a way that nothing else really can.
So is all this really necessary? Absolutely! As you can see, putting the model to paper fleshes out the concepts originally defined in the business requirements. This helps you ensure that the database design is well thought-out.
Diagramming the database also helps the developers who will write code against the data. It helps them understand the business rules, and what data they can provide to the users. Finally, diagramming the database is a form of documentation. Since you need to model the data anyway, the diagram makes that documentation even clearer. Just after the business requirements, I create an ERD to "talk to" what the database will do.
In future tutorials I'll explain other ways to represent a data design, especially from the developer's perspective. They don't always use an ERD approach, so it's good to understand what they use.