
Summarizing Data Results from a SQL Query

One of the most useful capabilities of a SQL query is summarizing data with aggregate functions, which compute values such as totals and averages. This lesson explains aggregate functions by example, giving you a solid foundation for writing more powerful database queries.

In this hour, you learn about SQL's aggregate functions, which let you summarize query results in a variety of useful ways.

The highlights of this hour include

  • What functions are

  • How functions are used

  • When to use functions

  • Using aggregate functions

  • Summarizing data with aggregate functions

  • Results from using functions

What Are Aggregate Functions?

Functions are keywords in SQL used to manipulate values within columns for output purposes. A function is always used in conjunction with a column name or expression. There are several types of functions in SQL; this hour covers aggregate functions. An aggregate function provides summary information for an SQL statement, such as counts, totals, and averages.

The aggregate functions discussed in this hour are

  • COUNT

  • SUM

  • MAX

  • MIN

  • AVG

The following queries show the data used for most of this hour's examples:

SELECT *
FROM PRODUCTS_TBL;
PROD_ID    PROD_DESC                      COST
---------- ------------------------------ ------
11235      WITCHES COSTUME                29.99
222        PLASTIC PUMPKIN 18 INCH         7.75
13         FALSE PARAFFIN TEETH            1.1
90         LIGHTED LANTERNS               14.5
15         ASSORTED COSTUMES              10
9          CANDY CORN                      1.35
6          PUMPKIN CANDY                   1.45
87         PLASTIC SPIDERS                 1.05
119        ASSORTED MASKS                  4.95
1234       KEY CHAIN                       5.95
2345       OAK BOOKSHELF                  59.99


11 rows selected.

Some employees do not have a pager number in the results of the following query:

SELECT EMP_ID, LAST_NAME, FIRST_NAME, PAGER
FROM EMPLOYEE_TBL;
EMP_ID    LAST_NAM FIRST_NA PAGER
--------- -------- -------- ----------
311549902 STEPHENS TINA
442346889 PLEW     LINDA
213764555 GLASS    BRANDON  3175709980
313782439 GLASS    JACOB    8887345678
220984332 WALLACE  MARIAH
443679012 SPURGEON TIFFANY


6 rows selected.

The COUNT Function

The COUNT function counts rows, or the values in a column that are not NULL. When used in a query, COUNT returns a numeric value. When COUNT is used with the DISTINCT keyword, only distinct values are counted. ALL (the opposite of DISTINCT) is the default, so it is not necessary to include ALL in the syntax; duplicate values are counted if DISTINCT is not specified. One other option is to use COUNT with an asterisk: COUNT(*) counts all the rows of a table, including duplicates, whether or not any column contains a NULL value.

The syntax for the COUNT function is as follows:

COUNT ( * | [ DISTINCT | ALL ] COLUMN NAME )

NOTE

The DISTINCT keyword cannot be used with COUNT(*), only with COUNT(column_name).

Example                                              Meaning
---------------------------------------------------  -----------------------------------------
SELECT COUNT(EMP_ID) FROM EMPLOYEE_PAY_TBL           Counts all employee IDs
SELECT COUNT(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL  Counts only the distinct SALARY values
SELECT COUNT(ALL SALARY) FROM EMPLOYEE_PAY_TBL       Counts all non-NULL SALARY values
SELECT COUNT(*) FROM EMPLOYEE_TBL                    Counts all rows of the EMPLOYEE_TBL table

COUNT(*) is used in the following example to get a count of all records in the EMPLOYEE_TBL table. There are six employees.

SELECT COUNT(*)
FROM EMPLOYEE_TBL;
COUNT(*)
----------
     6

COUNT(EMP_ID) is used in the next example to get a count of all the employee identifications that exist in the table. The returned count is the same as the last query because all employees have an identification number.

SELECT COUNT(EMP_ID)
FROM EMPLOYEE_TBL;
COUNT(EMP_ID)
-------------
      6

COUNT(PAGER) is used in the following example to get a count of all the employee records that have a pager number. Only two employees have pager numbers, so the four rows with a NULL pager are not counted.

SELECT COUNT(PAGER)
FROM EMPLOYEE_TBL;
COUNT(PAGER)
------------
      2
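The three COUNT queries above can be reproduced in one pass. The following is a minimal sketch that uses Python's built-in sqlite3 module with an in-memory database, which is not the SQL implementation the listings in this hour were run on. The column types, and the choice to store the blank pager entries as NULL, are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database (an assumption; the hour's listings use another implementation).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE_TBL "
    "(EMP_ID TEXT, LAST_NAME TEXT, FIRST_NAME TEXT, PAGER TEXT)"
)
employees = [
    ("311549902", "STEPHENS", "TINA", None),   # None is stored as NULL
    ("442346889", "PLEW", "LINDA", None),
    ("213764555", "GLASS", "BRANDON", "3175709980"),
    ("313782439", "GLASS", "JACOB", "8887345678"),
    ("220984332", "WALLACE", "MARIAH", None),
    ("443679012", "SPURGEON", "TIFFANY", None),
]
conn.executemany("INSERT INTO EMPLOYEE_TBL VALUES (?, ?, ?, ?)", employees)

# COUNT(*) counts rows; COUNT(column) counts only the non-NULL values in that column.
count_all, count_ids, count_pagers = conn.execute(
    "SELECT COUNT(*), COUNT(EMP_ID), COUNT(PAGER) FROM EMPLOYEE_TBL"
).fetchone()
print(count_all, count_ids, count_pagers)  # 6 6 2
```

Putting the three counts in one SELECT makes the effect of NULL values easy to see side by side.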

The ORDERS_TBL table, shown next, is used in the following COUNT example:

SELECT *
FROM ORDERS_TBL;
ORD_NUM    CUST_ID    PROD_ID       QTY   ORD_DATE_
---------- ---------- ------------- ----  -------------
56A901     232        11235            1  22-OCT-99
56A917     12         907            100  30-SEP-99
32A132     43         222             25  10-OCT-99
16C17      090        222              2  17-OCT-99
18D778     287        90              10  17-OCT-99
23E934     432        13              20  15-OCT-99
90C461     560        1234             2


7 rows selected.

This last example obtains a count of all distinct product identifications in the ORDERS_TBL table.

SELECT COUNT(DISTINCT(PROD_ID))
FROM ORDERS_TBL;
COUNT(DISTINCT(PROD_ID))
------------------------
            6

The PROD_ID 222 has two entries in the table, thus reducing the distinct values from 7 to 6.
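To see the effect of DISTINCT directly, the same column can be counted with and without it. This is a minimal sketch using Python's sqlite3 module and an in-memory database, which is an assumption and not the implementation used for the listings. Only the ORDERS_TBL columns needed for the comparison are created, and their types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of ORDERS_TBL: only the columns this comparison needs.
conn.execute("CREATE TABLE ORDERS_TBL (ORD_NUM TEXT, PROD_ID TEXT, QTY INTEGER)")
orders = [
    ("56A901", "11235", 1),
    ("56A917", "907", 100),
    ("32A132", "222", 25),
    ("16C17", "222", 2),     # second order for PROD_ID 222
    ("18D778", "90", 10),
    ("23E934", "13", 20),
    ("90C461", "1234", 2),
]
conn.executemany("INSERT INTO ORDERS_TBL VALUES (?, ?, ?)", orders)

# Without DISTINCT every non-NULL value counts; with DISTINCT duplicates count once.
all_ids, distinct_ids = conn.execute(
    "SELECT COUNT(PROD_ID), COUNT(DISTINCT PROD_ID) FROM ORDERS_TBL"
).fetchone()
print(all_ids, distinct_ids)  # 7 6
```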

NOTE

Because the COUNT function counts the rows, data types do not play a part. The rows can contain columns with any data type.

The SUM Function

The SUM function returns the total of the values in a column for a group of rows. The SUM function can also be used in conjunction with DISTINCT. When SUM is used with DISTINCT, each distinct value is totaled only once, which rarely serves a purpose: the total does not reflect all the data, because duplicate values are omitted.

The syntax for the SUM function is as follows:

SUM ([ DISTINCT ] COLUMN NAME)

NOTE

The value of an argument must be numeric to use the SUM function. The SUM function cannot be used on columns having a data type other than numeric, such as character or date.

Example                                            Meaning
-------------------------------------------------  ----------------------------
SELECT SUM(SALARY) FROM EMPLOYEE_PAY_TBL           Totals the salaries
SELECT SUM(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL  Totals the distinct salaries

In the following query, the sum, or total amount, of all cost values is being retrieved from the PRODUCTS_TBL table:

SELECT SUM(COST)
FROM PRODUCTS_TBL;
 SUM(COST)
----------
  138.08
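The warning about SUM with DISTINCT is easier to appreciate with numbers. The sketch below totals the QTY column of ORDERS_TBL both ways, using Python's sqlite3 module with an in-memory database (an assumption; it is not the implementation used for the listings). Only the columns needed are created, and their types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of ORDERS_TBL: order number and quantity only.
conn.execute("CREATE TABLE ORDERS_TBL (ORD_NUM TEXT, QTY INTEGER)")
orders = [
    ("56A901", 1), ("56A917", 100), ("32A132", 25), ("16C17", 2),
    ("18D778", 10), ("23E934", 20), ("90C461", 2),
]
conn.executemany("INSERT INTO ORDERS_TBL VALUES (?, ?)", orders)

# The quantity 2 appears on two orders, so DISTINCT drops one of them from the total.
total_qty, distinct_total = conn.execute(
    "SELECT SUM(QTY), SUM(DISTINCT QTY) FROM ORDERS_TBL"
).fetchone()
print(total_qty, distinct_total)  # 160 158
```

The distinct total is short by 2 because two different orders happen to share the same quantity, which is exactly the kind of silent omission the text cautions against.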

The AVG Function

The AVG function finds the average of the values in a column for a group of rows. When used with the DISTINCT keyword, the AVG function returns the average of the distinct values only. The syntax for the AVG function is as follows:

AVG ([ DISTINCT ] COLUMN NAME)

NOTE

The value of the argument must be numeric for the AVG function to work.

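Example                                            Meaning
-------------------------------------------------  -------------------------------------------
SELECT AVG(SALARY) FROM EMPLOYEE_PAY_TBL           Returns the average salary
SELECT AVG(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL  Returns the average of the distinct salaries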

The average value for all values in the PRODUCTS_TBL table's COST column is being retrieved in the following example:

SELECT AVG(COST)
FROM PRODUCTS_TBL;
 AVG(COST)
----------
12.5527273

NOTE

In some implementations, the results of your query may be truncated to the precision of the data type.

The next example uses two aggregate functions in the same query. Because some employees are paid hourly and others paid a salary, you want to retrieve the average value for both PAY_RATE and SALARY.

SELECT AVG(PAY_RATE), AVG(SALARY)
FROM EMPLOYEE_PAY_TBL;
AVG(PAY_RATE) AVG(SALARY)
------------- -----------
  13.5833333    30000
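Because each employee has either a pay rate or a salary, one of the two columns is NULL on every row, and AVG skips those NULL values instead of treating them as zero. The sketch below illustrates this with Python's sqlite3 module and an in-memory database (an assumption, not the implementation used for the listings). The contents of EMPLOYEE_PAY_TBL are not shown in this hour, so the pay rates and salaries here are hypothetical values chosen only so that the averages match the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced EMPLOYEE_PAY_TBL: the real table's rows are not shown in the text.
conn.execute("CREATE TABLE EMPLOYEE_PAY_TBL (EMP_ID TEXT, PAY_RATE REAL, SALARY REAL)")
pay_rows = [
    ("311549902", None, 40000),   # salaried: PAY_RATE is NULL
    ("442346889", 14.75, None),   # hourly: SALARY is NULL
    ("213764555", None, 30000),
    ("313782439", None, 20000),
    ("220984332", 11, None),
    ("443679012", 15, None),
]
conn.executemany("INSERT INTO EMPLOYEE_PAY_TBL VALUES (?, ?, ?)", pay_rows)

# AVG divides by the number of non-NULL values (3 here), not by the row count (6).
rows, rates, avg_rate, avg_salary = conn.execute(
    "SELECT COUNT(*), COUNT(PAY_RATE), AVG(PAY_RATE), AVG(SALARY) "
    "FROM EMPLOYEE_PAY_TBL"
).fetchone()
print(rows, rates, round(avg_rate, 7), avg_salary)  # 6 3 13.5833333 30000.0
```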

The MAX Function

The MAX function returns the maximum value of a column for a group of rows. NULL values are ignored when using the MAX function. The DISTINCT keyword is an option. However, because the maximum value for all the rows is the same as the maximum of the distinct values, DISTINCT has no effect. The syntax for the MAX function is as follows:

MAX([ DISTINCT ] COLUMN NAME)

Example                                            Meaning
-------------------------------------------------  -----------------------------------
SELECT MAX(SALARY) FROM EMPLOYEE_PAY_TBL           Returns the highest salary
SELECT MAX(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL  Returns the highest distinct salary

The following example returns the maximum value for the COST column in the PRODUCTS_TBL table:

SELECT MAX(COST)
FROM PRODUCTS_TBL;
 MAX(COST)
----------
   59.99

The MIN Function

The MIN function returns the minimum value of a column for a group of rows. NULL values are ignored when using the MIN function. The DISTINCT keyword is an option. However, because the minimum value for all rows is the same as the minimum of the distinct values, DISTINCT has no effect. The syntax for the MIN function is as follows:

MIN([ DISTINCT ] COLUMN NAME)

Example                                            Meaning
-------------------------------------------------  ----------------------------------
SELECT MIN(SALARY) FROM EMPLOYEE_PAY_TBL           Returns the lowest salary
SELECT MIN(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL  Returns the lowest distinct salary

The following example returns the minimum value for the COST column in the PRODUCTS_TBL table:

SELECT MIN(COST)
FROM PRODUCTS_TBL;
 MIN(COST)
----------
   1.05
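The claim that DISTINCT makes no difference to MAX and MIN can be checked against the PRODUCTS_TBL data. This is a minimal sketch using Python's sqlite3 module with an in-memory database (an assumption, not the implementation used for the listings); the column types are also assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCTS_TBL (PROD_ID TEXT, PROD_DESC TEXT, COST REAL)")
products = [
    ("11235", "WITCHES COSTUME", 29.99),
    ("222", "PLASTIC PUMPKIN 18 INCH", 7.75),
    ("13", "FALSE PARAFFIN TEETH", 1.1),
    ("90", "LIGHTED LANTERNS", 14.5),
    ("15", "ASSORTED COSTUMES", 10),
    ("9", "CANDY CORN", 1.35),
    ("6", "PUMPKIN CANDY", 1.45),
    ("87", "PLASTIC SPIDERS", 1.05),
    ("119", "ASSORTED MASKS", 4.95),
    ("1234", "KEY CHAIN", 5.95),
    ("2345", "OAK BOOKSHELF", 59.99),
]
conn.executemany("INSERT INTO PRODUCTS_TBL VALUES (?, ?, ?)", products)

# Removing duplicates cannot change the largest or smallest value,
# so each DISTINCT variant matches its plain counterpart.
extremes = conn.execute(
    "SELECT MAX(COST), MAX(DISTINCT COST), MIN(COST), MIN(DISTINCT COST) "
    "FROM PRODUCTS_TBL"
).fetchone()
print(extremes)  # (59.99, 59.99, 1.05, 1.05)
```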

CAUTION

Keep in mind that combining an aggregate function with the DISTINCT keyword may not return the results you expect. The purpose of aggregate functions is to return summarized data based on all rows of data in a table, whereas DISTINCT drops duplicate values before the summary is computed.

The final example combines aggregate functions with the use of arithmetic operators:

SELECT COUNT(ORD_NUM), SUM(QTY),
    SUM(QTY) / COUNT(ORD_NUM) AVG_QTY
FROM ORDERS_TBL;
COUNT(ORD_NUM)  SUM(QTY)  AVG_QTY
-------------- ---------- ----------
             7        160 22.857143

You have performed a count on all order numbers, figured the sum of all quantities ordered, and, by dividing the two figures, have derived the average quantity of an item per order. You also created a column alias for the computation—AVG_QTY.
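One caveat when dividing one aggregate by another is that the result depends on how the implementation divides integers. The sketch below runs the same computation with Python's sqlite3 module and an in-memory database, which is an assumption and not the implementation that produced the output above. SQLite truncates integer division, so the expression returns 22 unless one operand is made a real number; the AVG_QTY_REAL alias and the `1.0 *` workaround are illustrative additions, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of ORDERS_TBL: order number and quantity only.
conn.execute("CREATE TABLE ORDERS_TBL (ORD_NUM TEXT, QTY INTEGER)")
orders = [
    ("56A901", 1), ("56A917", 100), ("32A132", 25), ("16C17", 2),
    ("18D778", 10), ("23E934", 20), ("90C461", 2),
]
conn.executemany("INSERT INTO ORDERS_TBL VALUES (?, ?)", orders)

order_count, total_qty, int_div, real_div, avg_qty = conn.execute(
    "SELECT COUNT(ORD_NUM), SUM(QTY), "
    "       SUM(QTY) / COUNT(ORD_NUM) AVG_QTY, "             # integer / integer in SQLite
    "       1.0 * SUM(QTY) / COUNT(ORD_NUM) AVG_QTY_REAL, "  # forces real division
    "       AVG(QTY) "
    "FROM ORDERS_TBL"
).fetchone()
print(order_count, total_qty, int_div)          # 7 160 22
print(round(real_div, 6), round(avg_qty, 6))    # 22.857143 22.857143
```

Here AVG(QTY) agrees with the manual computation because no ORD_NUM or QTY value is NULL; if either column contained NULLs, the two approaches could differ.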
