Tables: The Primary Storage for Data
The table is the primary storage object for data in a relational database. In its simplest form, a table consists of row(s) and column(s), both of which hold the data. A table takes up physical space in a database and can be permanent or temporary.
A field, also called a column in a relational database, is part of a table that is assigned a specific data type. The data type determines what kind of data the column is allowed to hold. This enables the designer of the table to help maintain the integrity of the data.
Every database table must consist of at least one column. Columns are those elements within a table that hold specific types of data, such as a person's name or address. For example, a valid column in a customer table might be the customer's name. Figure 3.2 illustrates a column in a table.
Figure 3.2 An example of a column.
Generally, a column name must be one continuous string, and its maximum length depends on the SQL implementation. It is typical to use underscores in names to separate words. For example, a column for the customer's name can be named CUSTOMER_NAME instead of CUSTOMERNAME. This is normally done to increase the readability of database objects. You can use other naming conventions, such as camel case, to fit your specific preferences. Whichever you choose, it is important for a database development team to agree upon a standard naming convention and stick to it so that order is maintained within the development process.
The most common form of data stored within a column is string data. This data can be stored as either uppercase or lowercase for character-defined fields. The case that you use for data is simply a matter of preference, which should be based on how the data will be used. In many cases, data is stored in uppercase for simplicity and consistency. However, if data is stored in different case types throughout the database (uppercase, lowercase, and mixed case), functions can be applied to convert the data to either uppercase or lowercase if needed. These functions are covered in Hour 11, "Restructuring the Appearance of Data."
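As a rough illustration of converting case at query time, the following sketch uses Python's built-in sqlite3 module with an in-memory SQLite database. SQLite is only a stand-in here, and the table CUSTOMER_TBL and its sample names are hypothetical; the UPPER and LOWER functions themselves are widely available, but check your implementation's documentation.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")

# CUSTOMER_TBL and its rows are made-up sample data.
conn.execute("CREATE TABLE CUSTOMER_TBL (CUSTOMER_NAME VARCHAR(40))")
conn.executemany(
    "INSERT INTO CUSTOMER_TBL (CUSTOMER_NAME) VALUES (?)",
    [("fred smith",), ("JOE JONES",), ("Mike Brown",)],
)

# UPPER() and LOWER() convert the value in the query result only;
# the data stored in the table is left unchanged.
rows = conn.execute(
    "SELECT CUSTOMER_NAME, UPPER(CUSTOMER_NAME), LOWER(CUSTOMER_NAME) "
    "FROM CUSTOMER_TBL ORDER BY rowid"
).fetchall()

for stored, upper, lower in rows:
    print(stored, "|", upper, "|", lower)

conn.close()
```

Storing the data in one consistent case avoids having to apply these functions in every query that compares names.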
Columns also can be specified as NULL or NOT NULL. If a column is NOT NULL, a value must be entered for every row. If a column is specified as NULL, nothing has to be entered. NULL is different from an empty value, such as an empty string, and holds a special place in database design: a NULL value represents the lack of any data in the field.
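The difference between NULL, an empty string, and a NOT NULL column can be sketched as follows. This is a minimal example using Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in; the table CONTACT_TBL and its data are hypothetical. Behavior varies by implementation (Oracle, for instance, treats an empty character string as NULL), so treat this as an illustration rather than a rule for every database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CONTACT_TBL is a made-up table: the name is mandatory, the fax is optional.
conn.execute(
    "CREATE TABLE CONTACT_TBL ("
    " CONTACT_NAME VARCHAR(40) NOT NULL,"
    " CONTACT_FAX  VARCHAR(10))"  # no NOT NULL, so NULL is allowed
)

# A NULL fax is accepted. An empty string is also accepted, and it is a value.
conn.execute("INSERT INTO CONTACT_TBL VALUES ('FRED', NULL)")
conn.execute("INSERT INTO CONTACT_TBL VALUES ('JOE', '')")

# Leaving out the NOT NULL column is rejected by the database.
try:
    conn.execute("INSERT INTO CONTACT_TBL (CONTACT_FAX) VALUES ('5551234')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# IS NULL matches only the row with no data; = '' matches only the empty string.
null_count = conn.execute(
    "SELECT COUNT(*) FROM CONTACT_TBL WHERE CONTACT_FAX IS NULL"
).fetchone()[0]
empty_count = conn.execute(
    "SELECT COUNT(*) FROM CONTACT_TBL WHERE CONTACT_FAX = ''"
).fetchone()[0]

print(rejected, null_count, empty_count)
conn.close()
```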
A row is a record of data in a database table. For example, a row of data in a customer table might consist of a particular customer's identification number, name, address, phone number, and fax number. A row is composed of fields that contain data from one record in a table. A table can be empty, or it can contain anywhere from a single row to millions of rows of data, or records. Figure 3.3 illustrates a row within a table.
Figure 3.3 Example of a table row.
The CREATE TABLE Statement
The CREATE TABLE statement in SQL is used to create a table. Although the very act of creating a table is quite simple, much time and effort should be put into planning table structures before the actual execution of the CREATE TABLE statement. Carefully planning your table structure before implementation saves you from having to reconfigure things after they are in production.
Some elementary questions need to be answered when creating a table:
- What type of data will be entered into the table?
- What will be the table's name?
- What column(s) will compose the primary key?
- What names shall be given to the columns (fields)?
- What data type will be assigned to each column?
- What will be the allocated length for each column?
- Which columns in a table can be left as a null value?
After these questions are answered, the actual CREATE TABLE statement is simple.
The basic syntax to create a table is as follows:
CREATE TABLE table_name
( field1 data_type [ not null ],
  field2 data_type [ not null ],
  field3 data_type [ not null ],
  field4 data_type [ not null ],
  field5 data_type [ not null ] );
Note that a semicolon is the last character in the previous statement. Also, brackets indicate portions that are optional. Most SQL implementations have some character that terminates a statement or submits a statement to the database server. Oracle, Microsoft SQL Server, and MySQL use the semicolon. Although Transact-SQL, Microsoft SQL Server's dialect of SQL, does not require the semicolon for most statements, it is considered best practice to use it. This book uses the semicolon.
Create a table called EMPLOYEE_TBL in the following example using the syntax for MySQL:
CREATE TABLE EMPLOYEE_TBL
(EMP_ID        CHAR(9)        NOT NULL,
 EMP_NAME      VARCHAR(40)    NOT NULL,
 EMP_ST_ADDR   VARCHAR(20)    NOT NULL,
 EMP_CITY      VARCHAR(15)    NOT NULL,
 EMP_ST        CHAR(2)        NOT NULL,
 EMP_ZIP       INTEGER(5)     NOT NULL,
 EMP_PHONE     INTEGER(10)    NULL,
 EMP_PAGER     INTEGER(10)    NULL);
The following is the equivalent statement for both Microsoft SQL Server and Oracle:
CREATE TABLE EMPLOYEE_TBL
(EMP_ID        CHAR(9)        NOT NULL,
 EMP_NAME      VARCHAR(40)    NOT NULL,
 EMP_ST_ADDR   VARCHAR(20)    NOT NULL,
 EMP_CITY      VARCHAR(15)    NOT NULL,
 EMP_ST        CHAR(2)        NOT NULL,
 EMP_ZIP       INTEGER        NOT NULL,
 EMP_PHONE     INTEGER        NULL,
 EMP_PAGER     INTEGER        NULL);
Eight different columns make up this table. Notice the use of the underscore character to break the column names up into what appear to be separate words (EMPLOYEE ID is stored as EMP_ID). This technique makes table and column names more readable. Each column has been assigned a specific data type and length, and by using the NULL/NOT NULL constraint, you have specified which columns require values for every row of data in the table. The EMP_PHONE column is defined as NULL, meaning that NULL values are allowed in this column because there might be individuals without a telephone number. The information concerning each column is separated by a comma, with parentheses surrounding all columns (a left parenthesis before the first column and a right parenthesis following the information on the last column).
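To see which columns end up mandatory, you can create the table and then ask the database to describe it. The sketch below runs the SQL Server/Oracle form of the statement against an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is a stand-in chosen only because it needs no server, and PRAGMA table_info is SQLite-specific; other implementations expose this information through their own catalog views or commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same structure as EMPLOYEE_TBL in the text (form without display widths).
conn.execute("""
    CREATE TABLE EMPLOYEE_TBL
    (EMP_ID        CHAR(9)        NOT NULL,
     EMP_NAME      VARCHAR(40)    NOT NULL,
     EMP_ST_ADDR   VARCHAR(20)    NOT NULL,
     EMP_CITY      VARCHAR(15)    NOT NULL,
     EMP_ST        CHAR(2)        NOT NULL,
     EMP_ZIP       INTEGER        NOT NULL,
     EMP_PHONE     INTEGER        NULL,
     EMP_PAGER     INTEGER        NULL)
""")

# SQLite-specific: each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(EMPLOYEE_TBL)").fetchall()

required = [name for _, name, _, notnull, _, _ in columns if notnull]
optional = [name for _, name, _, notnull, _, _ in columns if not notnull]

print("Required:", required)
print("Optional:", optional)
conn.close()
```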
Each record, or row of data, in this table consists of the following:
EMP_ID, EMP_NAME, EMP_ST_ADDR, EMP_CITY, EMP_ST, EMP_ZIP, EMP_PHONE, EMP_PAGER
In this table, each field is a column. The column EMP_ID could consist of one employee's identification number or many employees' identification numbers, depending on the requirements of a database query or transaction.
When selecting names for objects, specifically tables and columns, make sure the name reflects the data that is to be stored. For example, a table pertaining to employee information could be named EMPLOYEE_TBL. Names for columns should follow the same logic. When storing an employee's phone number, an obvious name for that column would be PHONE_NUMBER.
The ALTER TABLE Command
You can modify a table after the table has been created by using the ALTER TABLE command. You can add column(s), drop column(s), change column definitions, add and drop constraints, and, in some implementations, modify table STORAGE values. The standard syntax for the ALTER TABLE command follows:
alter table table_name
    [modify] [column column_name] [datatype | null | not null] [restrict | cascade]
    [drop] [constraint constraint_name]
    [add] [column] column definition
Modifying Elements of a Table
The attributes of a column refer to the rules and behavior of data in a column. You can modify the attributes of a column with the ALTER TABLE command. The word attributes here refers to the following:
- The data type of a column
- The length, precision, or scale of a column
- Whether the column can contain NULL values
The following example uses the ALTER TABLE command on EMPLOYEE_TBL to modify the attributes of the column EMP_ID:
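ALTER TABLE EMPLOYEE_TBL MODIFY EMP_ID VARCHAR(10);

Table altered.
The column was originally defined as CHAR(9), a fixed-length character type. This statement changes it to VARCHAR (a varying-length character type) and increases the maximum length from 9 to 10. The MODIFY keyword is accepted by MySQL and Oracle; Microsoft SQL Server uses ALTER COLUMN instead.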
Adding Mandatory Columns to a Table
One of the basic rules for adding columns to an existing table is that the column you are adding cannot be defined as NOT NULL if data currently exists in the table. NOT NULL means that a column must contain some value for every row of data in the table. So, if you are adding a column defined as NOT NULL, you are contradicting the NOT NULL constraint right off the bat if the preexisting rows of data in the table do not have values for the new column.
There is, however, a way to add a mandatory column to a table:
- Add the column and define it as NULL. (The column does not have to contain a value.)
- Insert a value into the new column for every row of data in the table.
- Alter the table to change the column's attribute to NOT NULL.
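The steps above can be sketched as follows, using Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in. The column EMP_EMAIL, the sample rows, and the placeholder value 'UNKNOWN' are hypothetical. SQLite cannot change an existing column to NOT NULL, so the third step appears only as a comment in the MODIFY form used earlier in this hour; the sketch instead verifies that the step would be safe to run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_TBL ("
    " EMP_ID CHAR(9) NOT NULL,"
    " EMP_NAME VARCHAR(40) NOT NULL)"
)
# Made-up sample rows so that the table already holds data.
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?)",
    [("000000001", "FRED"), ("000000002", "JOE")],
)

# Adding the column as NOT NULL right away fails: the existing rows
# would have no value for it.
try:
    conn.execute("ALTER TABLE EMPLOYEE_TBL ADD COLUMN EMP_EMAIL VARCHAR(40) NOT NULL")
    direct_add_failed = False
except sqlite3.Error:
    direct_add_failed = True

# Step 1: add the column and allow NULL values.
conn.execute("ALTER TABLE EMPLOYEE_TBL ADD COLUMN EMP_EMAIL VARCHAR(40)")

# Step 2: give every existing row a value (a placeholder in this sketch).
conn.execute("UPDATE EMPLOYEE_TBL SET EMP_EMAIL = 'UNKNOWN'")

# Check: no NULL values remain, so a NOT NULL constraint can now hold.
remaining_nulls = conn.execute(
    "SELECT COUNT(*) FROM EMPLOYEE_TBL WHERE EMP_EMAIL IS NULL"
).fetchone()[0]

# Step 3 (not executed here; SQLite has no equivalent statement):
#   ALTER TABLE EMPLOYEE_TBL MODIFY EMP_EMAIL VARCHAR(40) NOT NULL;

print(direct_add_failed, remaining_nulls)
conn.close()
```

Many implementations also let you add a NOT NULL column in a single statement if you supply a DEFAULT value, so check your documentation before choosing an approach.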
Adding Auto-Incrementing Columns to a Table
Sometimes it is necessary to create a column that auto-increments itself to give a unique sequence number for a particular row. You could do this for many reasons, such as not having a natural key for the data, or wanting to use a unique sequence number to sort the data. Creating an auto-incrementing column is generally quite easy. In MySQL, the SERIAL keyword (an alias for BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE) produces a unique, automatically incremented value for each row in the table. Following is an example:
CREATE TABLE TEST_INCREMENT(
ID        SERIAL,
TEST_NAME VARCHAR(20));
In Microsoft SQL Server, you use the IDENTITY column property. The following is an example for the SQL Server implementation:
CREATE TABLE TEST_INCREMENT(
ID        INT IDENTITY(1,1) NOT NULL,
TEST_NAME VARCHAR(20));
Versions of Oracle before 12c do not provide a direct method for an auto-incrementing column (Oracle 12c and later support identity columns). However, there is one method using an object called a SEQUENCE and a TRIGGER that simulates the effect in Oracle. This technique is discussed when we talk about TRIGGERs in Hour 22, "Advanced SQL Topics."
Now we can insert values into the newly created table without specifying a value for our auto-incrementing column:
INSERT INTO TEST_INCREMENT(TEST_NAME)
VALUES ('FRED'),('JOE'),('MIKE'),('TED');

SELECT * FROM TEST_INCREMENT;

| ID | TEST_NAME |
|  1 | FRED      |
|  2 | JOE       |
|  3 | MIKE      |
|  4 | TED       |
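For comparison, here is a minimal sketch of the same idea in SQLite, run through Python's built-in sqlite3 module. SQLite is not one of the implementations covered in this hour; it is used only because it runs without a server. Its spelling of the feature, INTEGER PRIMARY KEY AUTOINCREMENT, is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite's counterpart to SERIAL (MySQL) or IDENTITY (SQL Server).
conn.execute(
    "CREATE TABLE TEST_INCREMENT ("
    " ID INTEGER PRIMARY KEY AUTOINCREMENT,"
    " TEST_NAME VARCHAR(20))"
)

# No value is supplied for ID; the database assigns it.
conn.executemany(
    "INSERT INTO TEST_INCREMENT (TEST_NAME) VALUES (?)",
    [("FRED",), ("JOE",), ("MIKE",), ("TED",)],
)

rows = conn.execute(
    "SELECT ID, TEST_NAME FROM TEST_INCREMENT ORDER BY ID"
).fetchall()

for row_id, name in rows:
    print(row_id, name)

conn.close()
```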
You need to consider many things when modifying existing columns of a table. Following are some common rules for modifying columns:
- The length of a column can be increased to the maximum length of the given data type.
- The length of a column can be decreased only if the largest value for that column in the table is less than or equal to the new length of the column.
- The number of digits for a number data type can always be increased.
- The number of digits for a number data type can be decreased only if the value with the most digits for that column is less than or equal to the new number of digits specified for the column.
- The number of decimal places for a number data type can either be increased or decreased.
- The data type of a column can normally be changed.
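The rule about decreasing a column's length suggests checking the data before you alter the table. The following sketch shows one way to do that with a query on the longest stored value. It uses Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in; the helper can_shrink and the sample rows are hypothetical. Note that SQLite itself does not enforce declared lengths, so the check only demonstrates the query you might run before altering a column in an implementation that does.

```python
import sqlite3


def can_shrink(conn, table, column, new_length):
    """Return True if every stored value fits within new_length characters."""
    # Identifiers cannot be bound as parameters, so they are interpolated;
    # pass only trusted table and column names to this helper.
    longest = conn.execute(
        f"SELECT MAX(LENGTH({column})) FROM {table}"
    ).fetchone()[0]
    # MAX() yields NULL (None) when the table is empty or the column is all NULL.
    return longest is None or longest <= new_length


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_TBL ("
    " EMP_ID CHAR(9) NOT NULL,"
    " EMP_CITY VARCHAR(15) NOT NULL)"
)
# Made-up sample rows; the longest city name has 12 characters.
conn.executemany(
    "INSERT INTO EMPLOYEE_TBL VALUES (?, ?)",
    [("000000001", "INDIANAPOLIS"), ("000000002", "GREENWOOD")],
)

fits_in_12 = can_shrink(conn, "EMPLOYEE_TBL", "EMP_CITY", 12)
fits_in_10 = can_shrink(conn, "EMPLOYEE_TBL", "EMP_CITY", 10)

print(fits_in_12, fits_in_10)
conn.close()
```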
Some implementations might restrict you from using certain ALTER TABLE options. For example, you might not be allowed to drop columns from a table. In that case, you have to drop the table itself and then rebuild the table with the desired columns. You could also run into problems by dropping a column in one table that is dependent on a column in another table or dropping a column that is referenced by a column in another table. Be sure to refer to your specific implementation documentation.
Creating a Table from an Existing Table
You can create a copy of an existing table using a combination of the CREATE TABLE statement and the SELECT statement. The new table has the same column definitions. You can select any or all columns. New columns that you create via functions or a combination of columns automatically assume the size necessary to hold the data. The basic syntax for creating a table from another table is as follows:
create table new_table_name as
select [ *|column1, column2 ]
from table_name
[ where ]
Notice some new keywords in the syntax, particularly the SELECT keyword. SELECT is the statement used to query the database and is discussed in more detail in Hour 7, "Introduction to the Database Query." However, it is important to know that you can create a table based on the results from a query.
Both MySQL and Oracle support the CREATE TABLE AS SELECT method of creating a table based on another table. Microsoft SQL Server, however, uses a different statement. For that database implementation, you use a SELECT ... INTO statement. This statement is used like this:
select [ *|column1, column2 ]
into new_table_name
from table_name
[ where ]
Here you'll examine some examples of using this method.
First, do a simple query to view the data in the PRODUCTS_TBL table:
select * from products_tbl;

PROD_ID    PROD_DESC                       COST
-----------------------------------------------
11235      WITCH COSTUME                  29.99
222        PLASTIC PUMPKIN 18 INCH         7.75
13         FALSE PARAFFIN TEETH             1.1
90         LIGHTED LANTERNS                14.5
15         ASSORTED COSTUMES                 10
9          CANDY CORN                      1.35
6          PUMPKIN CANDY                   1.45
87         PLASTIC SPIDERS                 1.05
119        ASSORTED MASKS                  4.95
Next, create a table called PRODUCTS_TMP based on the previous query:
create table products_tmp as
select * from products_tbl;

Table created.
In SQL Server, the same statement would be written as follows:
select *
into products_tmp
from products_tbl;

Table created.
Now if you run a query on the PRODUCTS_TMP table, your results appear the same as if you had selected data from the original table.
select * from products_tmp;

PROD_ID    PROD_DESC                       COST
-----------------------------------------------
11235      WITCH COSTUME                  29.99
222        PLASTIC PUMPKIN 18 INCH         7.75
13         FALSE PARAFFIN TEETH             1.1
90         LIGHTED LANTERNS                14.5
15         ASSORTED COSTUMES                 10
9          CANDY CORN                      1.35
6          PUMPKIN CANDY                   1.45
87         PLASTIC SPIDERS                 1.05
119        ASSORTED MASKS                  4.95
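The syntax also allows you to copy only some columns and only some rows. The sketch below shows both a full copy and a filtered copy, using Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in (SQLite happens to support the CREATE TABLE AS SELECT form). The column data types of PRODUCTS_TBL and the table name PRODUCTS_CHEAP are assumptions made for this example; the rows are taken from the listing above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column types are assumed; the text shows only the data in PRODUCTS_TBL.
conn.execute(
    "CREATE TABLE PRODUCTS_TBL ("
    " PROD_ID VARCHAR(10) NOT NULL,"
    " PROD_DESC VARCHAR(40) NOT NULL,"
    " COST DECIMAL(6,2) NOT NULL)"
)
products = [
    ("11235", "WITCH COSTUME", 29.99),
    ("222", "PLASTIC PUMPKIN 18 INCH", 7.75),
    ("13", "FALSE PARAFFIN TEETH", 1.1),
    ("90", "LIGHTED LANTERNS", 14.5),
    ("15", "ASSORTED COSTUMES", 10),
    ("9", "CANDY CORN", 1.35),
    ("6", "PUMPKIN CANDY", 1.45),
    ("87", "PLASTIC SPIDERS", 1.05),
    ("119", "ASSORTED MASKS", 4.95),
]
conn.executemany("INSERT INTO PRODUCTS_TBL VALUES (?, ?, ?)", products)

# Full copy: every column and every row.
conn.execute("CREATE TABLE PRODUCTS_TMP AS SELECT * FROM PRODUCTS_TBL")

# Partial copy: two columns, and only the rows that satisfy the WHERE clause.
# PRODUCTS_CHEAP is a hypothetical table name.
conn.execute(
    "CREATE TABLE PRODUCTS_CHEAP AS "
    "SELECT PROD_ID, PROD_DESC FROM PRODUCTS_TBL WHERE COST < 5"
)

original = conn.execute("SELECT * FROM PRODUCTS_TBL ORDER BY PROD_ID").fetchall()
copy = conn.execute("SELECT * FROM PRODUCTS_TMP ORDER BY PROD_ID").fetchall()
cheap_ids = [row[0] for row in conn.execute("SELECT PROD_ID FROM PRODUCTS_CHEAP")]

print(len(copy), sorted(cheap_ids))
conn.close()
```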
Dropping a table is actually one of the easiest things to do. When the RESTRICT option is used and the table is referenced by a view or constraint, the DROP statement returns an error. When the CASCADE option is used, the drop succeeds and all referencing views and constraints are dropped. The syntax to drop a table follows:
drop table table_name [ restrict | cascade ]
SQL Server does not support the CASCADE option. For that implementation, you must drop all objects that reference the table you are removing so that you do not leave an invalid object in your system.
In the following example, you drop the table that you just created:
drop table products_tmp;

Table dropped.
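The risk of leaving an invalid object behind can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in. SQLite accepts neither RESTRICT nor CASCADE and simply drops the table, so a view that referenced it stops working. The view name PRODUCTS_VW, the helper table_exists, and the column types are hypothetical, and the sqlite_master catalog is SQLite-specific.

```python
import sqlite3


def table_exists(conn, name):
    """Return True if a table with this name is listed in SQLite's catalog."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCTS_TMP ("
    " PROD_ID VARCHAR(10), PROD_DESC VARCHAR(40), COST DECIMAL(6,2))"
)
# A view that depends on the table about to be dropped.
conn.execute("CREATE VIEW PRODUCTS_VW AS SELECT PROD_ID, COST FROM PRODUCTS_TMP")

existed_before = table_exists(conn, "PRODUCTS_TMP")
conn.execute("DROP TABLE PRODUCTS_TMP")
exists_after = table_exists(conn, "PRODUCTS_TMP")

# The view definition is still stored, but querying it now fails
# because the table it references is gone.
try:
    conn.execute("SELECT * FROM PRODUCTS_VW").fetchall()
    view_still_works = True
except sqlite3.OperationalError:
    view_still_works = False

print(existed_before, exists_after, view_still_works)
conn.close()
```

In an implementation without CASCADE, dropping the dependent view first (or immediately afterward) keeps the schema free of broken objects.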