- Making Virtual Spies—Creating Spy Net in SQL Server 2000
- Creating the SQL Spy Net Application Database
- Using the Data Definition Language (DDL) to Create Databases and Objects
- Bringing Our Data Model and Database Together
- Next Steps
Bringing Our Data Model and Database Together
We have covered a fair amount so far in this chapter, but we really have our application under way now. We have a database, and we have a data model. All we have to do now is implement the data model into our database.
The implementation of the data model will be relatively straightforward, and we will use both Enterprise Manager and Query Analyzer to help us achieve this.
In the rest of this chapter we will discuss some of the best practices that are actually in use out there in the real world. Although these best practices are recognized by many organizations, there are still many differing practices that individual organizations implement.
The good thing about standards is that they are always being refined and redeveloped. As we go through the process of developing (or implementing) our data model we will discuss one of the most common practicesnaming conventions.
Developing the First Database Table for the SQL Spy Net Application
Finally, I will stop talking and let you develop your first table, really truly, I promise!
As we go through the development of our tables, we will revisit our ERD and fill in the blanks.
Note - Normally you wouldn't begin coding until you had the initial design complete (barring minor changes), but I know that you must be really chomping at the bit to get the development underway.
Why do we not start coding before the design is complete? Well this normally ends up requiring us to do a lot of rework. We find that we create tables as we go, don't like the design or flow of the application, and end up having to drop and then re-create the tables. On a tight budget, this can end up chewing much of the budget without any real progress.
So I will break the golden rule and allow you to get on with the application development.
All righty then, let's get things under way.
Expand the SQLSpyNet database in Enterprise Manager and click the Tables folder. You will see a list of the tables currently in the database (only system tables at this stage though).
Right-click the Tables folder to open the pop-up menu that gives you the option to create a new table (see Figure 3.16). Doing this will launch the new table window. This is similar to Microsoft Access's new table editor, another new feature of SQL Server 2000. Within this editor we can create columns, primary keys, and indexes, and even provide descriptions of columns (another new feature).
When you implement a data model, you must start at the highest-level entity. This ensures that when we develop the relationships between the tables, the tables will be available. For example, we will start with our Person entity because this entity does not have a foreign key to any other entity. Other possible entities that we could start with are the ActivityType and the AddressType table.
Figure 3.17 shows the table with the columns defined as well as the primary key.
We are going to implement the PersonID as an identity column. An identity column has an automatically incrementing value. For example, the column will start at one, but when the next row is inserted, SQL Server 2000 will increment the row to two (if the identity increment is defined as one, the default). Every consecutive insert will increment the value in the column. Deleting a value will not decrease the column value.
What Can We Call Our Columns? - Before you actually decide on a name for your column, you should take into consideration some facts.
SQL Server 2000 restricts the characters that you can use when creating or defining a column in table. The first character in the name must include the character sets az, AZ, or _, @, or #. Any characters following on from this must be in the first rule's definition but can also include numbers and the $ sign. The column name must be a minimum of one character long or a maximum length of 128 characters long. Although spaces are allowed in column names, it is not recommended because you must use square brackets () to implement the column names.
The following options will allow us to define the structure of our Person table. We will create the columns and specify the primary key. Besides the limitations I mention in the Excursion sidebar, there are also other limitations that are enforced for your column definitions. SQL Server 2000 has a list of reserved words that should not be used as columns names; this list is included in the online help for SQL Server 2000.
A reserved word is a special term that SQL Server 2000 uses to describe something that has special meaning. For example, you cannot create an object in SQL Server 2000 by naming it with the reserved word CREATE. This is exactly the same as in programming languages such as Visual Basic or C++.
Note - You can implement column names that are defined from reserved words if you enclose them in square brackets; for example, [SELECT] is a perfectly legal column name. However, it pays to avoid this type of naming convention because some ODBC drivers do not interpret the square brackets as they should.
First, ensure the PersonID column Allow Nulls option is not checked. To specify an identity column, the column must not allow NULLs. See the Excursion on NULLs for more on my reasoning for this option.
Preventing NULLs in Our SQL Spy Net Application - We briefly discussed NULLs earlier and I gave you an abbreviated description, but just to refresh your memory, a NULL is an unknown value. A NULL is not equal to either a zero or an empty string (""). In fact, a NULL is not even thought of being equivalent to another NULL! When you compare a NULL to another value, a NULL is returned (though this behavior can be changed).
Many differing opinions exist on the roles of NULLs in database design, and I guess I will upset some people by saying this, but as I mentioned earlier I do not like NULLs!
When I design and develop an application, I try to reduce the amount of NULLs in the application as much as is possible. I believe that the extra overhead caused by checking for the existence of NULLs in a client application is tedious. For example, if you try to JOIN two tables together and the columns that you are joining on contain NULLs, you might not receive the results that you would expect. This would require you to change the type of JOIN you execute, causing extra resources to be used on the server.
One way to reduce NULLs in an application is to introduce default values. We will look at other ways shortly, but for now let's discuss Defaults.
Default values, which we specify for our table columns, allow a value to be automatically entered when data is inserted into the database. For example, if we have an Order Date column of data type datetime, we can specify that the column automatically gets today's date when a new row is inserted into the table.
We define a default with the keyword DEFAULT when we create our tables through code, but when designing a table in Enterprise Manager we enter a value in the Default Value option in the table designer. This ensures that a value is entered into a column even if the user does not enter one.
One other simple way to prevent NULLs is to explicitly specify that we do not want NULLs in our columns by defining them as NOT NULL. If data is inserted into a row and a NULL value is attempted to be inserted into a column that specifies NOT NULL, the insert will fail.
As we develop our application tables, we will explicitly state each column to either allow NULL values or enforce that NULLs are not allowed. This will help us ensure data integrity and prevent us from having data that is not valuable.
We can attempt to reduce NULLs as much as physically possible, but sometimes it is impossible to avoid them. This is especially true for a column that is not updated until a later stage, or if a value at time of entry is unknown.
So the last word on NULLs is, as much as I try to avoid them, they are actually an integral part of a RDBMS, and all you can do is try to reduce them as much as possible.
Note - SQL Server 2000 has by default NOT NULL specified on a column when created. This is where the Relational Database Management System (RDBMS) differs from the ANSI SQL-92 standard, which states that if NOT NULL isn't specified, NULL is the default. This can be changed so that SQL Server 2000 conforms to the standard, but it is better for us to explicitly state our intentions by specifying either NULL or NOT NULL on every column. That way, if we script our tables to implement into another RDBMS (urgghhh!), we will ensure that our intentions are retained.
Second, click the Columns property tab at the bottom of the screen (see Figure 3.17). Set the Identity option to yes. This will automatically populate the Identity Seed (beginning number) and the Identity Increment (value to increment by) with one. This will be fine for our requirements.
To define the primary key for the table, click the PersonID column. This will put an arrow next to the column name. You can then either click the key on the toolbar, or right-click and select Set Primary Key.
Note - This might not mean too much to you at the moment, but SQL Server 2000 creates a unique index on the column you define as a primary key automatically. We will discuss indexes in depth a little later in Chapter 11.
Next enter all the columns names and data types as shown in Figure 3.17.
Finally, click the save icon (the floppy disk icon) and enter the table name, which is Person. We have used the entity name here without defining it as a table like some organizations do. For example, I have seen many examples of companies using tblPerson, tPerson, or even PersonTable. There is little need for this. We know it is a table because it resides in the Tables folder. As we discussed earlier, keep it short and sweet.
You can now close the New Table window. The table you just created will be in the Tables folder (if you don't see it, right-click on the folder and select Refresh). You will see that under the type column the Person table is set to User. This shows that a database user has created the table (in this case sa) rather than it being a system-generated table.
Note - The User tables are sometimes known as base tables because this is where the user data is held. Be careful using the term base tables because some developers refer to top-level (highest-level) entities as base tables, which, as you can imagine, could create some confusion.
A Rose by Any Other Name Still Smells as Sweet - Many companies that I have worked for, and indeed many others around the globe, adopt some type of naming standard for all their database objects, including columns in a table. This is what is called in the industry as good practice and allows developers to easily recognize the context of the object.
Many developers/companies will also use Person_ID or PersonID as the name of the primary key column for the Person table. Whenever you see this throughout the databasethat is, in the Spy tableyou know that it refers to the Person table. This is a very good naming convention to adopt, but be careful. Ensure that anything that has the suffix ID is actually a primary key or foreign key column.
Keep your names simple and meaningful, drop unnecessary datatype definitions from the names, and try to avoid using reserved words.
These naming conventions not only apply to column definitions but also to all database objects.
That was the first database table. Simple, wasn't it? We will now create three more database tables through the user interface and the final four tables through Transact-SQL code, using the CREATE TABLE statement.
You can now create the AddressType table using the same method you used for the Person table. Figure 3.18 displays the columns and the data types as well as the primary key for the table.
The AddressType table is known as a lookup table. The few columns characterize this, as does its placement within our ERD. Lookup tables are usually placed on the peripheral of the ERD, as the AddressType table has been.
This table will be used to record the different types of addresses (postal, physical, and so on) that a person can have. The benefit of modeling in this way means if we want to add more address types (for example, a business address type) at a future date, we can do so without altering the data model.
Another benefit of using lookup tables is that they help us ensure the integrity of our data. If we have a lookup table, we know that every time we use a row with an ID of 5, for example, the row will contain a value of Home Address.
Lookup tables help to prevent people entering information that can be misconstrued due to spelling mistakes.
Exploring Data Integrity - Data integrity allows the data in our database to remain correct and consistent. If the data in the database does become inconsistent, data integrity is absent.
SQL Server 2000 supports and allows you to implement four main categories of data integrity in our application.
Entity integrity is the lowest level of integrity and is enforced by the primary key and related indexes. For example, if a person has a Social Security number of 123497859, no other person can have the same Social Security Number.
Domain integrity enables you to limit the values that a column can accept. For example, you can specify that an Annual Salary column must have a value greater than $10,000.00.
Referential integrity specifies that the values entered in a column exist in another related table. This is enforced through a foreign key. For example, if you enter a person and his address details (which are held in another table), referential integrity would ensure that the Address table must have a related person in the Person table.
User-defined integrity enables the developer to define specific business rules that do not fall into one of the other integrity types.
All these categories enable you to develop your own integrity rules.
There is one other way that we ensure the consistency of the data in our applicationthrough data types. Data types in SQL Server 2000 are similar to data types provided by many programming languages. We have strings (such as char and varchar), integers, money, dates, and many more, including a new type just released in SQL Server 2000 called SQL_variant. If you have done any Microsoft Visual Basic programming, this is similar to the variant type in Visual Basic, allowing many data different data type values to be stored in a column.
With data types we can specify that a column in a table will only accept values that are consistent with that type. For example, a Date of Birth column would be defined as a datetime type. This would prevent users from entering invalid values such as a string like 'unknown' in the column.
We will be implementing most of these categories of data integrity, so you will learn how to create each of these types as we go.
The third table that we will create is another lookup table. The Country table has the same characteristics as the AddressType table, allowing us to add new countries at a later date without having to change the data model. Follow the preceding methods to create this table. See Figure 3.19 for column, data types, and primary key definitions.
The fourth and final table that we will create through the user interface is the Address table. This is the table that captures the addresses that a person might have.
Once again you will use the same method to develop this table as the previous three. Figure 3.20 shows the columns and the data types, as well as the primary keys for the table.
Keys, Keys, and More Keys - Notice that the Address table is a little different in how it defines its primary key. The table has what is known as a composite primary key. The combination of the foreign keys from the other tables makes the primary key for the associative Address table. Why? It is guaranteed then that a person can have only one address type for one country, ensuring integrity of the data.
A composite primary key is a primary key for a table that is made up of more than one field. For example, in our Person, Country, and AddressType, the associative or junction table that combines these tables needs a foreign key to all three tables. Neither CountryID, AddressTypeID, nor PersonID alone can uniquely identify a row because a person can appear in the table more than once, and likewise with an address type and country. The only way to uniquely identify a row in the table is by combining the three foreign keys into the primary key for the Address table. This guarantees uniqueness for any given row.
Although this is a perfectly valid way to define the primary key for this table (and is the recommended method of relational theory), this table could also have another style of primary key. We could define an AddressID column and set it as an identity value. This could act as the primary key for the associative Address table. Many developers use this method for associative tables, and I might step on some toes here, but it does not ensure data integrity. This method would allow us to insert multiple records of the same address type for the same country and the same person. We could create a unique index on this table to prevent this, but there would be little point when the correct primary key definition would do it for us.
For the sake of data integrity and sticking to the rules of relational theory, we will develop the table to have a composite primary key. Although data updates are a little trickier, I believe that the benefits far outweigh the negatives. We will take a look at how to implement the structure in another way shortly.
Tip - If you hold down the Ctrl key you will be able to select more than one column at time.
Because the Address table has the primary key columns from the Person, AddressType, and Country tables, we call these foreign keys. This is the first step in creating a relationship between the tables. As we have entered the foreign key columns, we now need to explicitly define the relationships that these tables have.
3NF, 3NF, Wherefore Art Thou 3NF? - If you look at the Address table that we have just defined, you can see that the relation (table) is not actually in third normal form (3NF). A transitive dependency is in the table. See if you can spot it! I will give you the answer at the end of the chapter.
A transitive dependency is if the non-key attributes are not mutually independent. For example, if a dependency exists in our table, such as a ZIP code that is dependent on a city, that is a transitive dependency.
Diagramming Our Database to Build Relationships Between the Tables
Click the Diagrams folder, right-click, and select New Database Diagram. This will launch the Database Diagram Wizard (if it doesn't, don't worry because we are going to do this task without the wizard). Click Cancel. This will give you a blank canvas waiting for you to perform a task. Right-click the blank canvas, and select the Add Table option, as shown in Figure 3.21.
Note - The database diagram tool is a very cool tool! You can perform a multitude of tasks, including drawing diagrams, defining relationships between tables, modifying columns, and creating new tables. It is very simple to use, and most developers prefer to do everything here because it is all in one place.
This will launch the Add Table dialog window, which contains a list of tables in the database, including the system tables. This is shown in Figure 3.22.
Add the tables that we have just created (Address, AddressType, Country, and Person) to the diagram by selecting the table name and then clicking the Add button.
Tip - If you hold down the Ctrl key you will be able to select more than one table at time.
When our four tables have been added to the diagram, close the Add Tables dialog window.
You can move the new tables around on the database diagram canvas by dragging them. This enables you to arrange them so you can see them all together.
To create the relationships between the tables, we will start with the Person and Address tables. Click the key next to the PersonID column in the Person table and drag it to the key next to the PersonID in the Address table. This will launch the Create Relationship dialog window, as shown in Figure 3.23.
This window is very powerful and introduces one of the many new features to SQL Server 2000.
Let's start at the top. Remember that everything in a SQL Server 2000 database must be uniquely identifiable, including relationship names. As you can see, SQL Server 2000 automatically fills this name in for you. You do have the option to change the relationship name, but this naming convention is fine for our needs.
The default name (FK_Address_Person) describes the relationship well. FK stands for foreign key (the type of relationship we are implementing), the Address is the Address table (the many x side of the relationship), and consequently the Person is the Person table (the one [key] side of the relationship).
Next we can define the columns on which the relationships depend.
As you can see in the primary key side we have PersonID (the primary key of the Person table). In the foreign key side we have the PersonID (the foreign key in the Address table). If you had many more columns you could specify them in this box. As long as the two PersonID columns are shown, as they are in Figure 3.23, this will be fine.
Next in the dialog box is the Check Existing Data on Creation option. This will ensure that the data in the foreign key table will allow you to create a relationship between the tables. Because we have only just created these tables, we have no data in the tables; this option does not affect our tables. Still leave this option checked.
The Enforce Relationship for Replication option specifies that whenever the table is replicated (copied) to another database the referential integrity (foreign key constraint) will remain in place. Although this does not concern us at this time, our application can grow to a stage where this sort of integrity is required, so leave this option selected.
The Enforce Relationship for INSERTs and UPDATEs option is the whole reason why we are implementing a relationship. This ensures that when data is inserted or updated to the Address table a person actually does exist. Leave this option in place.
Finally, we have one of the new features that SQL Server 2000 implements. This is a feature that Microsoft Access developers have had for a long time, and SQL Server developers have had to admit that at least on one score Access had something that SQL Server didn't. Well no more!
The Cascade Update Related Fields option updates the foreign key value in the Address table whenever the primary key value in the Person table is updated. For example, if we had a person with an ID of 5 and they had a record in the Address table, if for some reason the primary key value of 5 was changed to 7, SQL Server would automatically update all records in the Address table. Pretty cool, huh?
The Cascade Delete Related Records option automatically deletes all records in the foreign key table when the primary key table record is deleted. For example, if we have a person with addresses in the Address table and they no longer need to be recorded in the Person table, deleting the Person record would remove all the address records. In relational theory, and before this option was implemented in SQL Server, you had to delete the child records first (foreign key records) before deleting the parent records. If you didn't, you would receive an error preventing you from proceeding with the delete.
Because we do not want to delete records if someone accidentally deletes a person, we will leave the Cascade Delete Related Records option unchecked and do likewise for the Cascade Update Related Fields option. Our Person table has an identity field defined for the primary key, and this cannot be manually changed (well not easily at least), so we will not require this option.
After you have the options set, click OK. SQL Server 2000 will draw a relationship between the tables. The infinity symbol (x) denotes the many side of the relationship, whereas the key denotes the one side. We have a one-to-many relationship now defined between our Person and Address tables, as shown in Figure 3.24.
Note - The asterisk (*) next to the table names in the diagram shows that the changes have not been made to the database, and will not be made until the diagram has been saved.
Now I want you to perform the same task for all the tables. Drag the primary key from the AddressType table to the foreign key of the same name in the Address table. Select the same options you did for the first relationship (except, of course, the column names).
Repeat the same task for the Country and Address tables. You should now have a relationship diagram that looks similar to the one in Figure 3.25.
If you now save the diagram by clicking the floppy disk icon, SQL Server 2000 will prompt you for a name for the database diagram. You can enter anything you would like here, but something relevant would be appropriate, for example Current ERD.
After you click OK, SQL Server 2000 will notify you that the tables will be saved to the database, as shown in Figure 3.26. You can save a text file of the tables affected by the changes. This is a very good feature once your application is in a stable state (not undergoing many changes) or when you want to keep track of changes. However, we do not require this, so just click Yes.
After you click Yes, SQL Server 2000 will perform the necessary database alterations to implement your changes. And that is it! You have now created a relational database. Congratulations! Once again your ability has shined through and you have gained another A+.
Creating Tables the Fun WayWith Code
Boy, you work hard, never giving yourself a break. Because we have implemented half of our database through the user interface, let's create the rest by using Transact-SQL code.
In this section you are introduced to the CREATE TABLE and ALTER TABLE Transact-SQL statements.
The CREATE TABLE syntax is very simple to understand (or at least for our tables it will be!). Once again though, when creating the tables, we should start with the parent table and work our way down to the children tables. This ensures that when we create the relationships the tables exist.
The CREATE TABLE statement for SQL Server 2000 allows us to specify many parameters, including but not limited to the filegroup to which the table belongs, the user who owns the table, the primary and foreign keys, and of course data types and column names.
Most of the parameters of the CREATE TABLE statement you will not need in your day-to-day administration of SQL Server 2000 because they are for more complex data storage issues or for replication and so forth that are not relevant to our application development so will not be covered in this book.
What is the most top-level table left of the four tables in our ERD? The ActivityType table. This table is a lookup table used to define the types of activities that a spy and a bad guy might be involved in. Like the other lookup tables, it allows us to easily add new activity types without having to change the data model.
Note - We are going to create our remaining tables with the CREATE TABLE statement. When we create these tables, as we did through the UI, we will only specify the primary key constraints, not the foreign key constraints. We will use an ALTER TABLE statement after the tables have been created to implement the relationships.
Let's create our ActivityType table. Start Query Analyzer and select the SQLSpyNet database from the drop-down list in the Menu bar.
Warning - Make sure you have the SQLSpyNet database selected. If you don't, SQL Server 2000 will create the table in whichever database you have selected.
Our ActivityType table will have two columns, ActivityTypeID and Description. Enter the code in Listing 3.3 into the database window.
Listing 3.3Creating the ActivityType Table Through Transact-SQL Code
1: CREATE TABLE ActivityType 2: ( 3: ActivityTypeID INT IDENTITY(1,1) NOT NULL 4: CONSTRAINT PK_ActivityTypeID 5: PRIMARY KEY CLUSTERED, 6: Description VARCHAR(50) NOT NULL 7: )
After you run this code the ActivityType table will be created in the SQLSpyNet database. Query Analyzer will return a message similar to the one shown in Figure 3.27.
The command(s) completed successfully.
Note - If you get a message different from this, check the syntax of the statement that you entered and try again.
You can now go back to Enterprise Manager (or the Object Browser in Query Analyzer) and refresh the Tables folder. You will see the ActivityType table in the folder list.
Note - Although our new table exists in the database, it does not exist in our diagram (Current ERD) that we have just drawn. For this table to exist in the diagram, we would need to add the table to the diagram, as we did with the other tables.
So what does the Transact-SQL code mean?
Line 1 specifies to Query Analyzer that we want to create a table and that its name is ActivityType.
The opening parenthesis on line 2 specifies that the column names will be defined inside the parentheses.
Line 3 defines our first column for the table. Its name is ActivityTypeID and it is of data type INT. It is also an identity column. The NOT NULL statement at the end specifies that the column cannot contain a NULL value because it is illegal for identity column and primary key columns to allow NULLs.
Line 4 supplies a name for the primary key constraint that we are deploying for the table. This can be left blank and allow the system to generate a name for us (similar to the foreign key constraint names we discussed earlier), but here we have specified that the name will be PK_ActivityTypeID (PK = primary key, and the column name).
Line 5 specifies that the primary key for the table is the column defined in line 2. How? Notice the comma (,)? This marks the end of a column definition. Because the comma follows the PRIMARY KEY CLUSTERED statement and is not separated from line 3 with a comma, the ActivityTypeID column becomes the primary key. The CLUSTERED keyword specifies that the physical storage of the data on disk will be in order of the primary key specified; that is, 1, 2, 3, 4, and so on. This is analogous to the Dewey decimal system at a library, where all books are sorted and stored based on a key value. If the CLUSTERED keyword is not used, SQL Server 2000 will create a clustered index for the primary key, providing that no other clustered indexes exist on the table. But more on this in Chapter 11.
Line 6 is the second column in our table with a name of Description. It is of a data type VARCHAR that has a maximum size of 50 characters. The Description column cannot contain a NULL value.
The closing parenthesis on line 7 closes the parenthesis opened on line 2.
With that basic code defined we will now create the remaining three tables. We will start with the Spy table. Enter the code in Listing 3.4 for the Spy table.
Listing 3.4The Create Spy Table Transact-SQL
1: CREATE TABLE Spy 2: ( 3: SpyID INT IDENTITY(1,1) NOT NULL 4: CONSTRAINT PK_SpyID 5: PRIMARY KEY CLUSTERED, 6: PersonID INT NOT NULL, 7: SpyNumber VARCHAR(10) NULL, 8: Alias VARCHAR(25) NULL, 9: DateCommencedWork DATETIME NOT NULL 10: DEFAULT (GETDATE()), 11: AnnualSalary MONEY NOT NULL 12: CONSTRAINT Check_AnnualSalary 13: CHECK (AnnualSalary >= 10000), 14: IsActive BIT NOT NULL 15: DEFAULT 1 16: )
This table definition is similar to the preceding table, but we introduce a couple of new parameters here.
The DEFAULT statement specifies that if a value is not supplied, SQL Server 2000 will automatically insert a value. So on line 10, the default value for the DateCommencedWork column will be today's date.
The GETDATE() function is a special SQL Server 2000 function that returns the server's current date and time. We will take a look at functions again in Chapter 6.
A DEFAULT value also exists for the IsActive column. This automatically is set to 1 if the user does not supply a value.
Note - A BIT data type is normally used for a true/false (Boolean) value. A BIT can only have either a 0 or 1 entered, meaning true (1) or false (0). The name of the column IsActive is the naming convention I use most often when referring to a Boolean value.
Line 13 introduces another type of data integrity. The CHECK constraint specifies that the value entered in the AnnualSalary column can be equal to but not less than $10,000.00 (a pretty small sum for an international spy!).
This is one of the benefits of the flexibility of SQL Server 2000. We can easily build in business rulesthe salary rule, for exampleand modify these at a later time if needed.
We will now create our BadGuy table. Enter the code in Listing 3.5 into Query Analyzer and let's begin.
Listing 3.5The Create BadGuy Table Transact-SQL
1: CREATE TABLE BadGuy 2: ( 3: BadGuyID INT IDENTITY(1,1) NOT NULL 4: CONSTRAINT PK_BadGuyID 5: PRIMARY KEY CLUSTERED, 6: PersonID INT NOT NULL, 7: KnownAs VARCHAR(25) NULL, 8: IsActive BIT NOT NULL 9: DEFAULT 1 10: )
This table is a bit simpler than the Spy table, and we have covered all the major points thus far. So let's create the fourth and final table, the Activity table.
Enter the code in Listing 3.6 into Query Analyzer.
Listing 3.6The Create Activity Table Transact-SQL
1: CREATE TABLE Activity 2: ( 3: ActivityID INT IDENTITY(1,1) NOT NULL 4: CONSTRAINT PK_ActivityID 5: PRIMARY KEY CLUSTERED, 6: SpyID INT NOT NULL, 7: BadGuyID INT NOT NULL, 8: ActivityTypeID INT NOT NULL, 9: IsPlanFoiled BIT NOT NULL 10: DEFAULT 0, 11: DatePlanAttempted DATETIME NOT NULL 12: DEFAULT (GETDATE()), 13: DatePlanFoiled DATETIME NULL 14: )
Note - You might have noticed that this table is a little different from the other associative entity (Address table) that we created earlier. This table has its own primary key, the ActivityID. If you remember, for the sake of data integrity we did not implement a primary key like this. Now we are actually going to. Why? This will allow you to use both types of primary keys and decide which one suits you best. We have to do a little more work on this table to ensure that our data integrity is maintained. We will need to create a unique index (like a primary key) to achieve this.
As I specified, we need to do a little more work on this table before it is complete. Enter the code in Listing 3.7 into Query Analyzer and run it
Listing 3.7The Create Unique Index Transact-SQL for the Activity Table
1: CREATE UNIQUE NONCLUSTERED INDEX IDX_Spy_BadGuy_ActType 2: ON Activity (SpyID, BadGuyID, ActivityTypeID)
Okay, so what does the code mean?
Line 1 defines that we want to create a unique index named IDX_Spy_BadGuy_ActType (index on the Spy, BadGuy, and ActivityType columns). This index must be nonclustered because when we created the primary key for the table earlier we specified that as being clustered. You are only allowed one clustered index per table because the clustered index states the way the data is stored on disk. If we had two clustered indexes, which one would we sort by?
Line 2 defines the table on which we want to create the index. In this case it is the Activity table. The bracketed column names are the list of column names that we want to define to be unique in the table. Why don't we include the primary key for the table? If we had the primary key in the list we would always have a unique value because the primary key is an identifier column. Every new row inserted has a new value. We might as well not have a unique index.
We also must create unique indexes on the PersonID in both the Spy and the BadGuy tables because the only way to define a one-to-one relationship is to have a unique index on the foreign key column in the table that has the foreign key. Why would we need to do this? A single spy can only come from one person, and a person cannot be more than one spy, so this is a one-to-one relationship (two no's, remember?). The same is true for the BadGuy table.
Enter the code in Listing 3.8, and the Spy table will be almost complete.
Listing 3.8The Create Unique Index Transact-SQL for the Spy Table
1: CREATE UNIQUE NONCLUSTERED INDEX IDX_Spy_Person 2: ON Spy (PersonID)
And enter the same again for the BadGuy table (see Listing 3.9).
Listing 3.9The Create Unique Index Transact-SQL for the BadGuy Table
1: CREATE UNIQUE NONCLUSTERED INDEX IDX_BadGuy_Person 2: ON BadGuy (PersonID)
After you have run this code you will have created all the tables for the database, as well as created some of the relationships and the indexes. You should now have eight user (or base) tables. You can check this in either Enterprise Manager or Query Analyzer through the Object Browser.
We still need to define the relationships between the tables, as we did for the Person and Address tables, before we are finished.
Let's define the relationships and get on with it! I am sure that if you looked at the ERD, you could probably define the relationships between the tables in the database-diagram tool. But that would be too easy. Instead, I am going to let you enter the ALTER TABLE syntax, which will allow us to create the relationships through Transact-SQL code.
The first relationship we will define is between the Spy and the Person tables.
If you enter the code in Listing 3.10 into Query Analyzer and run it, we will define our first relationship.
Listing 3.10Altering the Spy Table So We Can Specify the Relationship to the Person Table
1: ALTER TABLE Spy 2: ADD CONSTRAINT FK_Spy_Person 3: FOREIGN KEY (PersonID) 4: REFERENCES Person(PersonID)
What does the code mean?
Line 1 specifies to SQL Server 2000 that we are going to change (alter) the definition of the Spy table.
Line 2 explicitly states that we are going to add a new constraint to the Spy table name FK_Spy_Person (foreign key is between the Spy and Person tables).
Line 3 states the type of constraint we are defining. In this case it is a referential integrity (foreign key) constraint. The PersonID is the column in the Spy database that is the foreign key to another table.
Line 4 states that the foreign key we have just created will reference the PersonID column within the Person table.
And that's it. We have now created a relationship between the Person and Spy tables. This is what the database-diagram tool does on your behalf. Now you know why it is so very cool!
We will prove that this relationship has been created in a moment, but first let's enter the other relationship definitions.
The next one to enter is the BadGuy relationship. This is very similar to Spy relationship (virtually a cut and paste job!). See Listing 3.11.
Listing 3.11Altering the BadGuy Table So We Can Specify the Relationship to the Person Table
1: ALTER TABLE BadGuy 2: ADD CONSTRAINT FK_BadGuy_Person 3: FOREIGN KEY (PersonID) 4: REFERENCES Person(PersonID)
Next we will add the foreign key constraints to the Activity table.
Remember this table is linked to three other tables (Spy, BadGuy, and ActivityType), so we will have three foreign key constraints.
Enter the code in Listing 3.12 and run it.
Listing 3.12Altering the Activity Table So We Can Specify the Relationship to the Spy, BadGuy, and ActivityType Tables
1: ALTER TABLE Activity 2: ADD CONSTRAINT FK_ActivitySpy 3: FOREIGN KEY (SpyID) 4: REFERENCES Spy(SpyID), 5: CONSTRAINT FK_Activity_BadGuy 6: FOREIGN KEY (BadGuyID) 7: REFERENCES BadGuy(BadGuyID), 8: CONSTRAINT FK_Activity_ActivityType 9: FOREIGN KEY (ActivityTypeID) 10: REFERENCES ActivityType(ActivityTypeID)
The only key difference between this code and the preceding code is the multiple foreign key constraints. The keyword ADD is used only once, and commas separate the constraints.
We have now completed the data model successfully. Once again you have excelled and brought upon yourself a standard that others will find difficult to achieve. Congratulations! Another A+, you really are a star student!
Just to prove that the relationships did actually work and I am not congratulating you for no good reason, go back into the database diagram tool in Enterprise Manager.
Open the ERD you created earlier (mine was called Current ERD) by double-clicking the ERD icon.
Now add the new tables to the diagram by right-clicking and selecting the Add Table option. Once again arrange the tables so that you can see all eight tables at once.
The relationships are all in place and defined for you. The diagram you will see should look like Figure 3.28.
Notice the relationships between the Spy and Person tables and also between the BadGuy and the Person tables.
The key at either end signifies a one-to-one relationship. Without the unique index on the PersonID column, these relationships would have been incorrect.
Also you might notice how the ERD in SQL Server 2000 represents our ERD that we created earlier. This means that we have achieved our task of implementing our data model. So once again give yourselves a pat on the back!
Filling in the Blanks
We have created our database and our base tables. We have implemented referential integrity (relationships), but we haven't entered any data.
This is where you can be as creative as you like! I will show you how to enter the data in the tables and give you a list of values to insert into each table. But you can add and or remove from the lists until your heart is content. Just remember though, this is not all of the data we will insert; there will be more as we progress, but this is a starting point.
Like creation of tables, we start at the highest-level tables or parent tables. Why? With referential integrity we must have a record that exists in the parent table before we can insert a record into the child table. You can't have a child without a parent!
Let's insert some countries into the Country table. Select the Country table from the Tables folder in the SQLSpyNet database. Make sure you are using Enterprise Manager.
Right-click the table and select Open Table, and then the Return All Rows option. From now on we will refer to this action as open the tablename table in data view. For example: Open the Country table in data view.
When the table is open, click the first row of the table, which should be a blank row. Because the primary key for this table is an identity field (auto-incrementing) we do not need to enter values in the CountryID column. And in fact, SQL Server 2000 will return an error if you try to enter a value in this column. Identity columns make data entry for us very easy!
Select the Description column and type United States in the space. Move to the next row by either clicking with the mouse or pressing the Tab key. You should see the CountryID column automatically assigned a value of 1. See Figure 3.29 for verification.
Our identity column works as we expect, so let's enter some more countries into our Country table.
United StatesAlready entered
You now have a small list of countries. You can expand on this list if you want, but we will probably add more as we develop the application.
Let's do the same for the AddressType table. Once again open the table in data view. Because this table is similar to the Country table (has an identity key), enter the following address types in the Description field:
Address Type Description
Business Postal address
Business Physical address
Home Postal address
Home Physical address
We are now going to repeat the same task for the ActivityType table. Enter the following list into the table:
Activity Type Description
Take over the World!
Chaos and mayhem
Steal your favorite socks
I guess you can see that all these activities have a common theme, but all bad guys are alike, aren't they?
We have populated our lookup tables with data, but I haven't shown you how referential integrity works yet. To prove to you that the referential integrity constraints that we defined do work, we will attempt to enter an address into the Address table without first defining a person to relate the address to.
We are also going to prove the NOT NULL constraint that we defined on the PersonID column in the Address table.
If you remember back to when we developed the tables, we specified that the PersonID column in the Address table could not contain NULL values. We will test this constraint first, and then prove that the referential integrity constraint is doing what we expect as well.
Open the Address table in data view. Currently we do not have a person in the Person table, so we cannot put a value in the PersonID column; we will leave it empty. When you leave a column empty in SQL Server 2000 it defaults to being NULL.
Click in the PersonID column. Because we do not have a person, press Tab to go to the AddressTypeID. When we try to save this row to the database (by going to the next row in the table), SQL Server 2000 will attempt to put a NULL value in this column.
We entered the address types in the AddressType table previously. The AddressTypeID column will accept values that exist in the AddressTypeID column in the AddressType table.
We are going to supply this address with an address type of Home Physical Address. In my AddressType table this row has an AddressTypeID of 4. Your column might differ from this, depending on the order in which you entered your address types.
Tip - You can find out what ID your column has by opening the AddressType table in data view.
Next enter a value for the Country column. Because we are defining this address for New Zealand, we need the primary key for New Zealand from the Country table. In my Country table this has a value of 2; yours can differ depending on the order in which you entered your countries.
Press Tab to move to the Address1 column. Enter the value of 123 Hickory Street. Press Tab and enter a City of Wellington. Press Tab. Because in New Zealand we do not have the concept of states, we shall leave the State column empty.
Press Tab and move to the ZipCode column. The ZIP code for Wellington, New Zealand is 6001; enter this value in the ZipCode column.
When you leave this row, either by pressing Tab or by clicking the next row, SQL Server 2000 will try to save this row to the database. The integrity rules for the database will be checked, and if the data is valid it will be saved. Try to leave the row now.
If you have followed all the instructions thus far (and I am sure you have, you clever cookie you), and your table definitions are correct, SQL Server 2000 will return an error similar to that shown in Figure 3.30.
This is the first error we were expecting. This shows that our NOT NULL constraint that we defined on the PersonID column is firing correctly. Aren't we clever then?
If you click OK, we will now check the referential integrity constraint that we defined between the Address and Person tables.
Click in the PersonID column and enter the value 100. Because we have not entered any people yet, it doesn't really matter what value we enter here, unlike the Country column. Now try to move to the next row in the table.
Once again, SQL Server 2000 will try to save this row to the database, and because this value for the PersonID column violates referential integrity (because there is no parent record in the Person table with the value of 100) the insert will fail. You will see an error message similar to that shown in Figure 3.31.
The error notifies you that the value you are trying to insert conflicts with the foreign key constraint (in this case FK_Address_Person) that is defined on the tables. The statement has been terminated message reflects that this change will not be saved to the database.
What do we do to fix this? We define names of some people to enter in the database so we can create addresses for them.
Open the Person table in data view. Once again our PersonID column is an identity field, so you cannot enter a value in this field. Enter the following list of people into the Person table.
15 Jun 1963
03 May 1942
09 Sep 1941
Ace of What?
10 Aug 1954
Bon Von Hausen
12 Jun 1972
04 Mar 1969
31 Apr 1955
Note - Notice how I have set the dates for the DOB column to medium date format? In New Zealand we enter our dates day/month/year (a throwback from our English heritage). Many times I have come across a server using U.S. date format month/day/year (in New Zealand), which has caused nothing but grief. To help prevent any misunderstandings I prefer to use medium date format. When you leave the field, SQL Server will convert the date to short format, but we at least we know what we intended!
Once again this is not a complete list; we will add to it as we progress through our application development. Don't let this stop you from thinking of other people that you would like to add to this list.
Okay, we actually have some basic data in our base tables. We can now populate our Spy, BadGuy, Address, and Activities tables, but we are not going to because in the next chapter we are going to insert data into the remaining tables using Transact-SQL.
And once again you will see the flexibility of SQL Server 2000.
As you can see, it is quite tedious to enter data directly into the tables as we have. It is also quite difficult to ensure that the data we insert is correct, that is, it links to the correct values for the primary keys.
With Transact-SQL we can write some funky Transact-SQL statements that allow us to perform updates with relative ease. I am sure that you are up for the challenge! So let us finish here with a summary of what we have done so far.
Oops, I almost forgot. What is the transitive dependency in the Address table? This would be the City column, and perhaps State, ZipCode, and so on. These are all independent items that could exist as separate lookup tables. We could have a City table that was dependent on the Country, and just link the CityID into the Address table.
Why haven't we done this? Sometimes we need to understand the impact of having a highly normalized structure. To retrieve someone's address we must combine three or four tables, and this can have a large impact on performance. So we denormalize the structure a little to prevent the overhead of joining the tables every time we need an address. Well this, and when I originally performed the analysis for the book, I forgot to implement this design. Tut, tut.