- Introduction
-
Table of Contents
- Microsoft SQL Server Defined
- Microsoft SQL Server Features
- Microsoft SQL Server Administration
-
Microsoft SQL Server Programming
- An Outline for Development
- Database
- Database Services
- Database Objects: Databases
- Database Objects: Tables
- Database Objects: Table Relationships
- Database Objects: Keys
- Database Objects: Constraints
- Database Objects: Data Types
- Database Objects: Views
- Database Objects: Stored Procedures
- Database Objects: Indexes
- Database Objects: User Defined Functions
- Database Objects: Triggers
- Database Design: Requirements, Entities, and Attributes
- Business Process Model Notation (BPMN) and the Data Professional
- Business Questions for Database Design, Part One
- Business Questions for Database Design, Part Two
- Database Design: Finalizing Requirements and Defining Relationships
- Database Design: Creating an Entity Relationship Diagram
- Database Design: The Logical ERD
- Database Design: Adjusting The Model
- Database Design: Normalizing the Model
- Creating The Physical Model
- Database Design: Changing Attributes to Columns
- Database Design: Creating The Physical Database
- Database Design Example: Curriculum Vitae
- NULLs
- The SQL Server Sample Databases
- The SQL Server Sample Databases: pubs
- The SQL Server Sample Databases: NorthWind
- The SQL Server Sample Databases: AdventureWorks
- The SQL Server Sample Databases: Adventureworks Derivatives
- UniversalDB: The Demo and Testing Database, Part 1
- UniversalDB: The Demo and Testing Database, Part 2
- UniversalDB: The Demo and Testing Database, Part 3
- UniversalDB: The Demo and Testing Database, Part 4
- Getting Started with Transact-SQL
- Transact-SQL: Data Definition Language (DDL) Basics
- Transact-SQL: Limiting Results
- Transact-SQL: More Operators
- Transact-SQL: Ordering and Aggregating Data
- Transact-SQL: Subqueries
- Transact-SQL: Joins
- Transact-SQL: Complex Joins - Building a View with Multiple JOINs
- Transact-SQL: Inserts, Updates, and Deletes
- An Introduction to the CLR in SQL Server 2005
- Design Elements Part 1: Programming Flow Overview, Code Format and Commenting your Code
- Design Elements Part 2: Controlling SQL's Scope
- Design Elements Part 3: Error Handling
- Design Elements Part 4: Variables
- Design Elements Part 5: Where Does The Code Live?
- Design Elements Part 6: Math Operators and Functions
- Design Elements Part 7: Statistical Functions
- Design Elements Part 8: Summarization Statistical Algorithms
- Design Elements Part 9:Representing Data with Statistical Algorithms
- Design Elements Part 10: Interpreting the Data—Regression
- Design Elements Part 11: String Manipulation
- Design Elements Part 12: Loops
- Design Elements Part 13: Recursion
- Design Elements Part 14: Arrays
- Design Elements Part 15: Event-Driven Programming Vs. Scheduled Processes
- Design Elements Part 16: Event-Driven Programming
- Design Elements Part 17: Program Flow
- Forming Queries Part 1: Design
- Forming Queries Part 2: Query Basics
- Forming Queries Part 3: Query Optimization
- Forming Queries Part 4: SET Options
- Forming Queries Part 5: Table Optimization Hints
- Using SQL Server Templates
- Transact-SQL Unit Testing
- Index Tuning Wizard
- Unicode and SQL Server
- SQL Server Development Tools
- The SQL Server Transact-SQL Debugger
- The Transact-SQL Debugger, Part 2
- Basic Troubleshooting for Transact-SQL Code
- An Introduction to Spatial Data in SQL Server 2008
- Performance Tuning
- Practical Applications
- Professional Development
- Application Architecture Assessments
- Business Intelligence
- Tips and Troubleshooting
- Additional Resources
Database Design Example: Curriculum Vitae
Last updated Mar 28, 2003.
I’ve explained the database design process in a series of articles that you can find here. In this tutorial, I’ll put that information to good use and design a database from the start of the process to the finish. To keep this example to a single article, I’ll not repeat the information from those previous articles, so make sure you look those over before reading this one.
I’ll use my standard process for creating this database:
- Define Nouns
- Refine and Group Nouns
- Remove Repeating Nouns
- Creating Joining Entities
- Define Data Types
- Apply Constraints
And I’ll add on the two parts that “bookend” that process — the requirements for the application and the physical statements for creating the database — normally referred to as the “Physical Model.”
Business Requirements
In the United States, whenever you look for a new job you create a “Resume.” In the simplest form of resume, you list out the jobs you’ve previously had, and your education.
In other countries, and indeed in some companies here in the U.S., There’s a larger document called a Curriculum Vitae, or CV. In my experience, a CV is longer and includes more material than a U.S. Resume. These are catching on in the U.S. because businesses and organizations use them for more than just granting a job — they are using them once you get a job to determine whether you fit for a certain promotion or other position. They can even be used by an editor to determine if you’re qualified to write a book, by a group to see if they would like to hear you present a topic to them and so on.
My “CV” is on line, and available to anyone who wants to look at it. It’s just static text and pictures right now, but I would actually like to be able to query the data and present it in multiple ways. For instance, I would like to be able to show, across all jobs, schools and so on, how much writing experience I have. Or perhaps I would like to show how long I’ve been formally teaching classes, or managing people, or working with databases, and so on. Right now you have to manually read each section to find that out.
Also, I would like to be able to hand that database over to a prospective employer to show them my database design skills and that I actually know how to do what they are looking for. In fact, in older times a “finish carpenter” (one who does the more detailed woodwork in a project) would build his own toolbox. As he showed up on the job site to apply for a job, the foreman could simply look at the result of the skill of that carpenter to see if they wanted to hire them. The carpenter quite literally “carried his resume” where everyone could see it. That’s the concept I’m going for in this project. Not only the data in my CV will represent me, but the process of building the CV itself.
What is NOT inside the scope of this project is the presentation. Right now I plan to keep that as the plain text and graphics on my website, but in the future that might go to excel, a dynamic web page, or any other construct that can read from a database. I might even make it a “web service” such that it can be polled from an API asking for all people familiar with Cloud Computing or databases.
Define Nouns
That’s the driving reason or “business requirements” for this database, and the next part of the process is to ask the “customer” (in this case, myself) what kind of information I want to capture. From the Resume I currently have, and from other resume’s and CV’s I find on the web I come up with the following general areas of information I would like to track:
- Organizations I’ve worked with and for
- The skills I’ve gained from those firms — and other locations like hobbies
- My education history — both formal public education and private classes
- Publications I’ve written or contributed to
- The certifications I have held or currently hold
- The locations of where I’ve worked — some organizations are interested in a global presence, and like to see where I’ve lived
Refine and Group Nouns
As a data professional I would begin to question the user on the details of those nouns from that broad list. I’ll start by teasing out the nouns I really want to track, and changing them to a singular tense — that’s just my convention, you can use whatever you like:
- Organization
- Skill
- Education
- Publication
- Certification
- Location
- Reference
Notice I’ve added one more — “Reference." The reason I added this will become clearer as I explain the design, but the background is that each of these areas may have one or more resources that point to them. For instance, the organization I worked at may have a web site, a balance sheet, and so on. I may have the names of people who know my work or teaching at a location, and so on. Those references were varied, and could be queried alone, so I decided to break them out into their own entity.
Next I need to add some detail to those nouns. I added attributes that I wanted to track about each noun, and then grouped them this way:
- Organization
- DateStart
- DateEnd
- OrganizationName
- OrganizationType
- OrganizationSize
- OrganizationIndustry
- Title(s)
- Skill(s)
- Education
- DateEarned
- Institution
- Publication
- DatePublished
- Location
- Topic
- Description
- Certification
- DateGranted
- GrantingOrganization
- Location
- StartDate
- EndDate
- City
- State
- Country
- Reference
- ReferenceType
- ReferenceDetail
Not too bad a start. You’ll notice in some areas I include a date field, and in others I have both a start and end date. When an item doesn’t “expire” (such as an education event like a college degree) I just need the one date. But when I want to know the duration of an event (such as the amount of time I’ve been writing programs) I’ll include both dates so that I can use some math to show how long I’ve had that skill or that I’ve lived in a certain area.
Remove Repeating Nouns
You’ll notice I have some “s’s” in my list. For instance, at one Organization I may have held several titles or titles. In fact, that happens everywhere I go. The “s” means that I need to break those entities into “child” entities pointing back to their “parent” entity:
- Organization
- DateStart
- DateEnd
- OrganizationName
- OrganizationType
- OrganizationSize
- OrganizationIndustry
- OrganizationDetail
- OrganizationTitles
- TitleName
- TitleDescription
- Skill
- SkillID
- StartDate
- EndDate
- SkillArea
- SkillName
- SkillDescription
- Education
- DateEarned
- Institution
- EducationDescription
- Publication
- DatePublished
- Location
- PublicationTopic
- PublicationDescription
- Location
- StartDate
- EndDate
- City
- State
- Country
- Reference
- ReferenceType
- ReferenceDetail
Notice the naming convention I use here. In a child entity, I include the parent name in the table. When I do that, the parent is in the singular, and the child is in the plural. Again, you don’t have to do that, it’s just the way I built my toolbox. :)
Creating Joining Entities
Now I need to join everything up. To do that, I need Primary and Foreign Keys, and I tend to use surrogate or artificial keys — numbers or sequences that don’t have anything to do with the data. In a small database like this, I could certainly find one or two fields in each table that uniquely identify a row and use those, but with a surrogate key I’m guaranteed that I have a unique number.
I also change the design slightly to have a few more attributes that are needed to handling things like putting the Certifications in with the Education entity.
I also have some “many to many” relationships — in specific, one skill or title may be present in many companies, and I may have held those titles multiple times. So I need to create some “tertiary joining tables." I’ve been working in text, but to show others what I’ve done it makes sense to include a graphical representation of my database design. For that I’ll use an Entity Relationship Diagram, or an ERD as I’ve explained previously.
Define Data Types
At this point I’ll switch to using an ERD, although in larger designs I actually start with an ERD for the entire design. It’s a pretty easy decision for most of the data types, with one exception. Some items seem to want a lot of “detail” or explanation, so a binary type field could hold an entire Microsoft Word document to a VARCHAR(MAX) could be used to hold a lot of text. But I chose against this for a couple of reasons, the most important one being that it doesn’t directly address the use of this data. This isn’t intended to be a life story or full narrative — it’s meant to show others what I have done to support what I can do.
The ERD shows the data types I’ve selected, along with the Keys, relationships, and refined columns:
Apply Constraints
My only constraints so far are the Primary and Foreign Keys, and the data types. Those enforce the fact that I can’t have a child entity without a parent.
Another constraint is that I have a data type for each value. That means you can’t put a description in a date field, or dates in number values. In addition, I have certain fields designated as NULL (meaning they don’t need a value) like the “End Date” in an Organization, and others set to not allow blank values.
Beyond that, I don’t have any other glaring needs for constraints, although I could add in CHECK or other constraints, I don’t think I need any at this time. I can always add them later if my initial tests show I need them.
The Final Script
I’ve decided to create this database in Microsoft’s SQL Azure — a SQL Server running in another location. That will help me develop a front-end for the data that can be used just about anywhere, and even become a service later on.
Because of that choice, I have a few constraints I need to consider. The size of this data is quite small, so I’ll keep it to a Web edition of only 1GB. I’ll also need a clustered index on each table, and a primary key — something I normally do anyway.
Here is the completed script — note that it will run on most any edition of SQL Server 2008, because of the DATE type it will not run on earlier versions successfully. If you change that to a DATETIME it will, however.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Organization](
[OrganizationID] [int] NOT NULL,
[DateStart] [date] NOT NULL,
[DateEnd] [date] NULL,
[OrganizationName] [varchar](100) NOT NULL,
[OrganizationSize] [int] NULL,
[OrganizationIndustry] [varchar](100) NULL,
[OrganizationDetail] [varchar](255) NULL,
CONSTRAINT [PK_Organization] PRIMARY KEY CLUSTERED
(
[OrganizationID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Location](
[LocationID] [int] NOT NULL,
[StartDate] [date] NOT NULL,
[EndDate] [date] NULL,
[City] [varchar](150) NULL,
[StateOrProvince] [varchar](200) NOT NULL,
[Country] [varchar](150) NOT NULL,
CONSTRAINT [PK_Location] PRIMARY KEY CLUSTERED
(
[LocationID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Education](
[EducationID] [int] NOT NULL,
[DateEarned] [date] NOT NULL,
[EducationTitle] [varchar](150) NULL,
[EducationType] [varchar](100) NOT NULL,
[GrantingInstitution] [varchar](200) NOT NULL,
[EducationDescription] [varchar](255) NULL,
CONSTRAINT [PK_Education] PRIMARY KEY CLUSTERED
(
[EducationID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Skill](
[SkillID] [int] NOT NULL,
[StartDate] [date] NOT NULL,
[EndDate] [date] NULL,
[SkillArea] [varchar](150) NOT NULL,
[SkillName] [varchar](150) NOT NULL,
[SkillDescription] [varchar](255) NULL,
CONSTRAINT [PK_Skill] PRIMARY KEY CLUSTERED
(
[SkillID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Reference](
[ReferenceID] [int] NOT NULL,
[ReferenceType] [varchar](150) NOT NULL,
[ReferenceDetail] [varchar](255) NOT NULL,
CONSTRAINT [PK_Reference] PRIMARY KEY CLUSTERED
(
[ReferenceID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Publication](
[PublicationID] [int] NOT NULL,
[PublicationLocation] [varchar](200) NULL,
[PublicationTopic] [varchar](255) NULL,
[PublicationDescription] [varchar](255) NULL,
CONSTRAINT [PK_Publication] PRIMARY KEY CLUSTERED
(
[PublicationID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Title](
[TitleID] [int] NOT NULL,
[TitleName] [varchar](100) NULL,
[TitleDescription] [varchar](255) NOT NULL,
CONSTRAINT [PK_Title] PRIMARY KEY CLUSTERED
(
[TitleID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[SkillsOrganizations](
[SkillsOrganizationID] [int] NOT NULL,
[SkillID] [int] NOT NULL,
[OrganizationID] [int] NOT NULL,
CONSTRAINT [PK_SkillsOrganizations] PRIMARY KEY CLUSTERED
(
[SkillsOrganizationID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[OrganizationsTitles](
[OrganizationsTitlesID] [int] NOT NULL,
[TitleID] [int] NOT NULL,
[OrganizationID] [int] NOT NULL,
CONSTRAINT [PK_OrganizationsTitles] PRIMARY KEY CLUSTERED
(
[OrganizationsTitlesID] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF)
)
GO
ALTER TABLE [dbo].[Location] ADD CONSTRAINT [DF_Location_Country] DEFAULT ('United States of America') FOR [Country]
GO
ALTER TABLE [dbo].[SkillsOrganizations] WITH CHECK ADD CONSTRAINT [FK_SkillsOrganizations_Organization] FOREIGN KEY([OrganizationID])
REFERENCES [dbo].[Organization] ([OrganizationID])
GO
ALTER TABLE [dbo].[SkillsOrganizations] CHECK CONSTRAINT [FK_SkillsOrganizations_Organization]
GO
ALTER TABLE [dbo].[SkillsOrganizations] WITH CHECK ADD CONSTRAINT [FK_SkillsOrganizations_Skill] FOREIGN KEY([SkillID])
REFERENCES [dbo].[Skill] ([SkillID])
GO
ALTER TABLE [dbo].[SkillsOrganizations] CHECK CONSTRAINT [FK_SkillsOrganizations_Skill]
GO
ALTER TABLE [dbo].[OrganizationsTitles] WITH CHECK ADD CONSTRAINT [FK_OrganizationsTitles_Organization] FOREIGN KEY([OrganizationID])
REFERENCES [dbo].[Organization] ([OrganizationID])
GO
ALTER TABLE [dbo].[OrganizationsTitles] CHECK CONSTRAINT [FK_OrganizationsTitles_Organization]
GO
ALTER TABLE [dbo].[OrganizationsTitles] WITH CHECK ADD CONSTRAINT [FK_OrganizationsTitles_Title] FOREIGN KEY([TitleID])
REFERENCES [dbo].[Title] ([TitleID])
GO
ALTER TABLE [dbo].[OrganizationsTitles] CHECK CONSTRAINT [FK_OrganizationsTitles_Title]
GO
Care and Feeding
I have yet to add Indexes, Stored Procedures, Views and other constructs, and to add my security and maintenance. Because I’m using SQL Azure, there’s less maintenance required (and available). I’ve discussed how to hook up a Microsoft Access front-end to a SQL Azure database, so I may start there, and then add a Web page, on-site code — perhaps even Business Intelligence displays for the data.

