SQL Server Reference Guide

Database Design: Creating an Entity Relationship Diagram

Last updated Mar 28, 2003.

In the last few articles, I've explained the process of designing a database. I detailed database objects, and then showed you the process of defining entities. In this series I've also explained a bit about relationships.

As you follow this process, what you're really doing is creating a data model. A data model displays the entities, attributes and relationships involved with a store of data. It's important to realize that a data model is not a process flow diagram; there are no 'start here' and 'end here' boxes. It represents a static view of the data and its relationships. It also conveys a lot of information in a compact space, and is useful to developers, designers, architects and business professionals.

There are really three phases or types of models. The first is the conceptual model. This diagram shows only the basic entities and how they relate to each other, and is used by management or other high-level participants in the process. These people aren't interested in the details, just the nature of what information they need to keep. I'll cover this tool in another tutorial.

The second type of model is the logical diagram. This diagram contains more detail, and is a bit more specific to a Relational Database Management System (RDBMS). This is what I'll focus on in this tutorial — and I'll use the information from the business data requirements from the last few articles and map it to the symbols and processes used for a logical model.

This model is called an Entity Relationship Diagram, or ERD. The ERD is useful because of the amount of information it displays in a small space, and the relatively few symbols you have to learn to understand it. It's mostly composed of boxes and lines, with a circle here and there. From those primitive symbols you can quickly understand how a database should be laid out. The other advantage is that the logical ERD is not tied to a specific vendor — in other words, you could use this diagram to create a database for SQL Server, Oracle, MySQL or any other RDBMS. As such it's a useful skill to learn.

After you create a logical ERD, the next type of diagram is the physical ERD. This diagram is platform-dependent, and has a great deal of detail. After I flesh out the logical ERD, I'll morph it into my physical ERD, and from there, I can create my database.

I'd like to clear up the terminology a bit. The diagrams in this tutorial use Crow's Foot notation, which is one style of Entity Relationship Diagram (ERD). There are other styles of graphical diagrams to show the model of a database as well. I use the term ERD to describe any or all of these methods. While there are software programs to create these diagrams, what I'm describing is more a process than a result. You can read about other symbol notation here. I find that the Crow's Foot notation is a little easier to read on small pictures, but you can use any type you like, as long as your team agrees on the standard you want to use.

A word is also warranted here about the software that you can use to create these diagrams. There are packages specifically designed to work with an underlying database, which will take you from the first drafts all the way through to the SQL statements needed to create the database. These packages can cost thousands of dollars to obtain and take weeks of training to fully use. There are also tools such as Microsoft's Visio that will do much the same thing for less money and with a shorter learning curve. You can also use any graphics program, if all you're after is the documentation part. All you really need is a pencil and paper to create the diagram, as long as you follow the agreed-on symbols.

Let's get to those symbols. I need to describe the entities, their attributes, and the relationships between the entities. Because it's a graphic, the ERD conveys a lot of information in a small space.

You'll recall from previous articles that entities are the basic units that you work with, which normally represent a group of data elements you'll make into a table. Entities are normally discovered from the business requirements for a database as the nouns in the sentence. Entities will eventually become tables in your finished database.

Entities are represented by two shapes. A box like this one represents a 'parent' or owning entity:

Figure 1

Here's where I'll deviate a bit from the standard. Notice that the box above has square edges. That means this entity stands alone — nothing else is needed. We call this a "parent" entity.

If an entity must have some other entity to exist, like a child in real life, I use a rounded box to indicate that - like this one:

Figure 2

The name of the entity is written above the box. As I explain in the articles on the business requirements for this sample project, I can have a client that hasn't started the formal project yet. You'll also recall that I can't have a project without a client, so clients own projects. This simple difference in shape denotes whether an entity is a parent or child.

Attributes are also nouns in the business requirements document, but they are further descriptions of the entity. For instance, blue is a color of socks. In this case, color is an attribute of sock, and blue is the value of that particular sock's color.

Attributes aren't represented with a graphic; they are placed inside the box of their entity. So the Client entity from my diagram might look like this:

Figure 3

Relationships between entities are enforced with key fields between them, but simply having a key doesn't explain how (cardinality), how many (degree), whether a child is required (optionality), and other key information. An ERD helps solve this problem by including the key and graphically demonstrating these items.

The key for an entity is shown by drawing a line near the top of the box, and placing the key field above the line, like this:

Figure 4

You might recognize that I haven't covered the Client Code attribute yet, but I will. One of the great things about an ERD is that it forces you to think things out more iteratively, which might change your design.
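To see how the key above the line might carry through to a finished table, here is a minimal sketch. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for SQL Server, purely so it runs anywhere. The column `client_name` and the sample values are assumptions for illustration; only the Client Code key comes from the diagram.

```python
import sqlite3

# In-memory SQLite database, standing in for SQL Server (an assumption
# made so the sketch is self-contained).
conn = sqlite3.connect(":memory:")

# client_code is the key attribute (above the line in the ERD box).
# client_name is a hypothetical non-key attribute (below the line).
conn.execute(
    """
    CREATE TABLE client (
        client_code TEXT PRIMARY KEY,
        client_name TEXT NOT NULL
    )
    """
)
conn.execute("INSERT INTO client VALUES ('ACME', 'Acme Corp')")


def insert_duplicate_key():
    """Return True if a second row with the same key value is rejected."""
    try:
        conn.execute("INSERT INTO client VALUES ('ACME', 'Another Acme')")
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_rejected = insert_duplicate_key()
client_count = conn.execute("SELECT COUNT(*) FROM client").fetchone()[0]
```

The point of the sketch is only that the key uniquely identifies each client: a second row with the same code is refused, so the table still holds one row.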

If the key is a composite or multiple-value key, place all of the key's attribute names above the line.
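A composite key could be sketched like this, again using Python's `sqlite3` as a stand-in for SQL Server. The `project_phase` entity and its attributes are hypothetical and are not part of the sample project; they are here only to show two attributes acting together as the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical entity: both project_code and phase_number sit above the
# line in the ERD box, so together they form the key.
conn.execute(
    """
    CREATE TABLE project_phase (
        project_code TEXT NOT NULL,
        phase_number INTEGER NOT NULL,
        phase_name TEXT,
        PRIMARY KEY (project_code, phase_number)
    )
    """
)

# Each attribute may repeat on its own; only the pair must be unique.
rows = [("P100", 1, "Design"), ("P100", 2, "Build"), ("P200", 1, "Design")]
conn.executemany("INSERT INTO project_phase VALUES (?, ?, ?)", rows)

try:
    conn.execute("INSERT INTO project_phase VALUES ('P100', 1, 'Duplicate')")
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

phase_count = conn.execute("SELECT COUNT(*) FROM project_phase").fetchone()[0]
```

Neither attribute identifies a row by itself here; it is the combination that the database treats as the key.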

The other parts of the relationship are shown in the ERD with a series of lines and shapes.

To show the relationship (cardinality), a simple line is drawn between the entities. If there is one child per parent, then a single end to the line is used.

To show a 'many' relationship on the child side, a 'crow's foot' shape is used:

Figure 5

This shows the degree of the relationship. It's also permissible to put a number on the line, representing a set number of items, or even a range.

To show the optionality of the relationship, one of two symbols is used. If children are optional, then an open O shape is used, like this:

Figure 6

If the value is mandatory, then you show that relationship with a cross-bar instead of the open circle, like this:

Figure 7

It's also important to know that this symbol can be used on the parent or the child side of the line. That's not true of the crow's foot, since just as in nature, children have only one set of natural parents! If you find yourself with two entities where each can relate to many of the other, don't panic. This is called a many-to-many relationship, and you can resolve it with the use of another entity. I'll cover that in a bit. For now, just draw them as they fall.
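As a rough preview of what resolving a many-to-many relationship with another entity can look like, here is one common approach sketched with Python's `sqlite3` (a stand-in for SQL Server). The `staff` entity and the linking `project_staff` entity are hypothetical names invented for this illustration and are not part of the sample project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript(
    """
    CREATE TABLE project (project_code TEXT PRIMARY KEY);
    CREATE TABLE staff (staff_code TEXT PRIMARY KEY);

    -- The extra entity: each row links one project to one staff member,
    -- turning one many-to-many line into two one-to-many lines.
    CREATE TABLE project_staff (
        project_code TEXT NOT NULL REFERENCES project (project_code),
        staff_code TEXT NOT NULL REFERENCES staff (staff_code),
        PRIMARY KEY (project_code, staff_code)
    );

    INSERT INTO project VALUES ('P100'), ('P200');
    INSERT INTO staff VALUES ('S1'), ('S2');
    INSERT INTO project_staff VALUES ('P100', 'S1'), ('P100', 'S2'), ('P200', 'S1');
    """
)

# One project can have many staff...
staff_on_p100 = [row[0] for row in conn.execute(
    "SELECT staff_code FROM project_staff "
    "WHERE project_code = 'P100' ORDER BY staff_code"
)]

# ...and one staff member can work on many projects.
projects_for_s1 = [row[0] for row in conn.execute(
    "SELECT project_code FROM project_staff "
    "WHERE staff_code = 'S1' ORDER BY project_code"
)]
```

On the diagram, the linking entity would be drawn as a child of both of the original entities, with a crow's foot at its end of each line.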

Let's put all this together. Take a look at this diagram snippet and see if you can guess what it's trying to tell you:

Figure 8

How did you do? Here's the rundown:

  • A client owns projects
  • A client can have many projects
  • You don't have to have a project to have a client
  • You can't have a project without a client
  • There are various attributes displayed for clients and projects
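The rules in that rundown could eventually be enforced in a physical database along these lines. This is a minimal sketch using Python's `sqlite3` in place of SQL Server; the column names and sample rows are assumptions for illustration, not the attributes shown in the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript(
    """
    -- Parent entity: stands alone.
    CREATE TABLE client (
        client_code TEXT PRIMARY KEY,
        client_name TEXT NOT NULL
    );

    -- Child entity: NOT NULL plus the foreign key means a project
    -- cannot exist without a client (the mandatory cross-bar).
    CREATE TABLE project (
        project_code TEXT PRIMARY KEY,
        project_name TEXT NOT NULL,
        client_code TEXT NOT NULL REFERENCES client (client_code)
    );
    """
)

# A client can have many projects (the crow's foot)...
conn.execute("INSERT INTO client VALUES ('ACME', 'Acme Corp')")
conn.execute("INSERT INTO project VALUES ('P1', 'Website', 'ACME')")
conn.execute("INSERT INTO project VALUES ('P2', 'Intranet', 'ACME')")

# ...or none at all (the open circle).
conn.execute("INSERT INTO client VALUES ('NEWCO', 'New Co')")


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A project with no client, or with a client that doesn't exist, is refused.
orphan_rejected = rejected("INSERT INTO project VALUES ('P3', 'Orphan', NULL)")
unknown_client_rejected = rejected("INSERT INTO project VALUES ('P4', 'Ghost', 'NOPE')")

acme_projects = conn.execute(
    "SELECT COUNT(*) FROM project WHERE client_code = 'ACME'"
).fetchone()[0]
newco_projects = conn.execute(
    "SELECT COUNT(*) FROM project WHERE client_code = 'NEWCO'"
).fetchone()[0]
```

Nothing in the tables forces a client to have a project, which matches the optional side of the line; the constraints all sit on the child.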

While it's not quite a thousand words, it does represent a lot of data effectively. When the diagram gets quite large, you'll see that it represents the data model in a way that nothing else really can.

So is all this really necessary? Absolutely! As you can see, putting the model to paper fleshes out the concepts originally defined in the business requirements. This helps you ensure that the database design is well thought-out.

Diagramming the database also helps the developers who will write code against the data. It helps them understand the business rules, and what data they can provide to the users. Finally, diagramming the database is a form of documentation. Since you need to model anyway, the diagram makes that documentation even clearer. Just after the business requirements, I create an ERD to "talk to" what the database will do.

In future tutorials I'll explain other ways to represent a data design, especially from the developer's perspective. Developers don't always use an ERD approach, so it's good to understand what they use.