
Summarizing Data Results from a Query in SQL

Aggregate functions can be useful and are quite simple to use. In this chapter from SQL in 24 Hours, Sams Teach Yourself, 6th Edition, you learn how to count values in columns, count rows of data in a table, get the maximum and minimum values for a column, figure the sum of the values in a column, and figure the average value for values in a column.

In this hour, you learn about SQL’s aggregate functions. Aggregate functions let you perform a variety of useful tasks, such as getting the highest total of a sale or counting the number of orders processed on a given day. The real power of aggregate functions is discussed in the next hour, when you tackle the GROUP BY clause.

Aggregate Functions

Functions are keywords in SQL used to manipulate values within columns for output purposes. A function is a command normally used with a column name or expression that processes the incoming data to produce a result. SQL contains several types of functions. This hour covers aggregate functions. An aggregate function provides summarization information for a SQL statement, such as counts, totals, and averages.

This hour discusses the following basic aggregate functions:

  • COUNT
  • SUM
  • MAX
  • MIN
  • AVG

The following query lists the employee information from the EMPLOYEES table. Note that some of the employees do not have data assigned in some of the columns. We use this data for most of this hour’s examples.

SELECT TOP 10 EMPLOYEEID,LASTNAME,
 CITY,STATE,PAYRATE,SALARY
 FROM EMPLOYEES;

EMPLOYEEID  LASTNAME          CITY               STATE      PAYRATE        SALARY
----------- ----------------- ------------------ ---------- -------------- --------
1           Iner              Red Dog            NULL                      54000.00
2           Denty             Errol              NH         22.24          NULL
3           Sabbah            Errol              NH         15.29          NULL
4           Loock             Errol              NH         12.88          NULL
5           Sacks             Errol              NH         23.61          NULL
6           Arcoraci          Alexandria         LA         24.79          NULL
7           Astin             Espanola           NM         18.03          NULL
8           Contreraz         Espanola           NM         NULL           60000.00
9           Capito            Espanola           NM         NULL           52000.00
10          Ellamar           Espanola           NM         15.64          NULL

(10 row(s) affected)

COUNT

You use the COUNT function to count the rows of a table or the values in a column that are not NULL. When used within a query, the COUNT function returns a numeric value. You can also use COUNT with the DISTINCT keyword to count only the distinct values in a column. ALL (the opposite of DISTINCT) is the default, so you do not need to include ALL in the syntax; duplicate values are counted unless DISTINCT is specified. One other option is to use COUNT with an asterisk. COUNT(*) counts all the rows of a table, including duplicates, regardless of whether any column contains a NULL value.

The syntax for the COUNT function follows:

COUNT ( * | [ DISTINCT | ALL ] COLUMN NAME )

This example counts all employee IDs:

SELECT COUNT(EMPLOYEEID) FROM EMPLOYEES

This example counts only the distinct SALARY values:

SELECT COUNT(DISTINCT SALARY) FROM EMPLOYEES

This example counts all non-NULL SALARY values, including duplicates:

SELECT COUNT(ALL SALARY) FROM EMPLOYEES

This final example counts all rows of the EMPLOYEES table:

SELECT COUNT(*) FROM EMPLOYEES

COUNT(*) is used in the following example to get a count of all records in the EMPLOYEES table. There are 5,611 employees.

SELECT COUNT(*)
FROM EMPLOYEES;
-----------
5611

(1 row(s) affected)

COUNT(EMPLOYEEID) is used in the next example to get a count of all the employee IDs that exist in the table. The returned count is the same as in the last query because every employee has an ID.

SELECT COUNT(EMPLOYEEID)
FROM EMPLOYEES;
-----------
5611

(1 row(s) affected)

COUNT([STATE]) is used in the following example to get a count of all the employee records that have a state assigned. Look at the difference between the two counts. The difference is the number of employees who have NULL in the STATE column.

SELECT COUNT([STATE])
FROM EMPLOYEES;
-----------
5147
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)
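
The difference between COUNT(*) and COUNT of a column can be reproduced on a small scale. The following is a minimal sketch, not the book's setup: it assumes Python's built-in sqlite3 module with an in-memory SQLite database in place of SQL Server, and a four-row EMPLOYEES table trimmed to three columns for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the SQL Server
# database used in this hour; the table holds only a few sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLOYEEID INTEGER, LASTNAME TEXT, STATE TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?)",
    [
        (1, "Iner", None),  # no state assigned
        (2, "Denty", "NH"),
        (6, "Arcoraci", "LA"),
        (7, "Astin", "NM"),
    ],
)

# COUNT(*) counts rows; COUNT(column) counts only the non-NULL values.
total_rows, ids, states = conn.execute(
    "SELECT COUNT(*), COUNT(EMPLOYEEID), COUNT(STATE) FROM EMPLOYEES"
).fetchone()

# The gap between the two counts is the number of rows with NULL in STATE.
missing_state = total_rows - states
print(total_rows, ids, states, missing_state)  # 4 4 3 1
```

Subtracting COUNT(STATE) from COUNT(*) gives the number of employees without a state, the same reasoning applied to the 5,611 and 5,147 counts above.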

The following examples obtain a count of all salary amounts and then all the distinct salary amounts in the EMPLOYEES table.

SELECT COUNT(SALARY )
FROM EMPLOYEES;
-----------
1359
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

SELECT COUNT(DISTINCT SALARY )
FROM EMPLOYEES;
-----------
45
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

The SALARY column had a lot of matching amounts, so the DISTINCT values make the counts drop dramatically.
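
To see the effect of DISTINCT on a count without the full table, here is a small sketch under stated assumptions: Python's built-in sqlite3 module with an in-memory SQLite database, and a five-row table in which employee 11 is a hypothetical row added so that one salary repeats.

```python
import sqlite3

# Assumption: in-memory SQLite in place of SQL Server; sample values only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLOYEEID INTEGER, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?)",
    [
        (1, 54000.0),
        (2, None),       # hourly employee, no salary
        (8, 60000.0),
        (9, 52000.0),
        (11, 52000.0),   # hypothetical row that repeats a salary
    ],
)

# COUNT(SALARY) skips the NULL; COUNT(DISTINCT SALARY) also collapses duplicates.
all_salaries, distinct_salaries = conn.execute(
    "SELECT COUNT(SALARY), COUNT(DISTINCT SALARY) FROM EMPLOYEES"
).fetchone()
print(all_salaries, distinct_salaries)  # 4 3
```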

SUM

The SUM function returns the total of the values in a column for a group of rows. You can also use the SUM function with DISTINCT. When you use SUM with DISTINCT, only the distinct values are totaled, which rarely serves a purpose: the total is not accurate because duplicate values are omitted.

The syntax for the SUM function follows:

SUM ([ DISTINCT ] COLUMN NAME)

This example totals the salaries:

SELECT SUM(SALARY) FROM EMPLOYEES

This example totals the distinct salaries:

SELECT SUM(DISTINCT SALARY) FROM EMPLOYEES

In the following query, the sum, or total amount, of all salary values is retrieved from the EMPLOYEES table:

SELECT SUM(SALARY)
FROM EMPLOYEES;
------------------------------
70791000.00
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

Observe the way DISTINCT in the following example changes the previous result by more than 68 million dollars, because each distinct salary amount is added only once. This is why DISTINCT is rarely useful with SUM.

SELECT SUM(DISTINCT SALARY)
FROM EMPLOYEES;
------------------------------
2340000.00
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)
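
The same effect shows up on a handful of rows. The sketch below is illustrative only: it assumes Python's built-in sqlite3 module with an in-memory SQLite database, and a small table in which employee 11 is a hypothetical row that duplicates a salary.

```python
import sqlite3

# Assumption: in-memory SQLite in place of SQL Server; sample values only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLOYEEID INTEGER, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?)",
    [
        (1, 54000.0),
        (2, None),       # NULL is ignored by SUM
        (8, 60000.0),
        (9, 52000.0),
        (11, 52000.0),   # hypothetical duplicate salary
    ],
)

total, distinct_total = conn.execute(
    "SELECT SUM(SALARY), SUM(DISTINCT SALARY) FROM EMPLOYEES"
).fetchone()

# The duplicate 52000 is added once instead of twice when DISTINCT is used.
shortfall = total - distinct_total
print(total, distinct_total, shortfall)
```

The shortfall equals the one salary that DISTINCT dropped, which is why the DISTINCT total understates the payroll.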

The following query demonstrates that although some aggregate functions require numeric data, the requirement concerns the values rather than the declared type of the column. Here, SUM over the ZIP column of the EMPLOYEES table shows that Oracle supports the implicit conversion of VARCHAR data to a numeric type:

SELECT SUM(ZIP)
FROM EMPLOYEES;
SUM(ZIP)
-----------
280891448

If the data can be converted implicitly, for example, the string '12345' to an integer, then you can use the aggregate function. When you use data that cannot be implicitly converted to a numeric type, such as the POSITION column, the query results in an error, as in the following example:

SELECT SUM(POSITION)
FROM EMPLOYEES;
Msg 8117, Level 16, State 1, Line 1
Operand data type varchar is invalid for sum operator.
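
Because implicit conversion differs from one database product to another, one option is to state the conversion explicitly with the standard CAST expression. The following is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory SQLite database; the ZIP values are hypothetical and are stored as text, as in the EMPLOYEES table.

```python
import sqlite3

# Assumption: in-memory SQLite in place of the databases used in this hour.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLOYEEID INTEGER, ZIP TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?)",
    [(2, "03579"), (6, "71301"), (7, "87532")],  # hypothetical ZIP codes
)

# CAST converts each character value to an integer before SUM adds them up,
# so the query does not depend on implicit conversion.
(zip_total,) = conn.execute(
    "SELECT SUM(CAST(ZIP AS INTEGER)) FROM EMPLOYEES"
).fetchone()
print(zip_total)  # 162412
```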

AVG

The AVG function finds the average value for a given group of rows. When used with the DISTINCT keyword, the AVG function returns the average of the distinct values. The syntax for the AVG function follows:

AVG ([ DISTINCT ] COLUMN NAME)

The average value for all values in the EMPLOYEES table’s SALARY column is retrieved in the following example:

SELECT AVG(SALARY)
FROM EMPLOYEES;
------------------------------
52090.507726
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)
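
The warning in the output matters for AVG: NULL values are left out of both the total and the count, so the average is taken over the salaried employees only. The sketch below illustrates this, assuming Python's built-in sqlite3 module with an in-memory SQLite database and a five-row sample table.

```python
import sqlite3

# Assumption: in-memory SQLite in place of SQL Server; sample values only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLOYEEID INTEGER, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?)",
    [(1, 54000.0), (2, None), (3, None), (8, 60000.0), (9, 52000.0)],
)

avg_salary, by_hand, over_all_rows = conn.execute(
    """
    SELECT AVG(SALARY),
           SUM(SALARY) / COUNT(SALARY),  -- divides by the non-NULL values only
           SUM(SALARY) / COUNT(*)        -- divides by every row, NULLs included
    FROM EMPLOYEES
    """
).fetchone()
print(avg_salary, by_hand, over_all_rows)
```

AVG(SALARY) matches SUM(SALARY) / COUNT(SALARY). Dividing by COUNT(*) instead would treat the hourly employees as if they earned a salary of zero.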

This example returns the average of the distinct salaries:

SELECT AVG(DISTINCT SALARY)
FROM EMPLOYEES;
------------------------------
52000.000000
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

The next example uses two aggregate functions in the same query. Because some employees are paid hourly and others are on salary, you want to retrieve the average value for both PAYRATE and SALARY.

SELECT AVG(PAYRATE) AS AVG_PAYRATE, AVG(SALARY) AS AVG_SALARY
FROM EMPLOYEES;
AVG_PAYRATE                    AVG_SALARY
------------------------------ ------------------------------
18.473012                      52090.507726
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

Notice how the aliases make the output more readable when a query returns multiple aggregate values. Also remember that an aggregate function can work on any numeric data, so you can perform calculations within the parentheses of the function. For example, if you need the average hourly rate of salaried employees to compare with the average rate of hourly employees, you could divide each salary by 2040, the number of hours this example uses for a working year, and write the following:

SELECT AVG(PAYRATE) AS AVG_PAYRATE, AVG(SALARY/2040) AS AVG_SALARY_RATE
FROM EMPLOYEES;
AVG_PAYRATE                    AVG_SALARY_RATE
------------------------------ ------------------------------
18.473012                      25.5345625
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)
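
A calculation inside an aggregate can be tried on a few rows. The following is a sketch under stated assumptions: Python's built-in sqlite3 module with an in-memory SQLite database, and four sample rows. It divides by 2040.0 rather than 2040 because SQLite and SQL Server truncate the result when both operands are integers; with a decimal SALARY column, as in this hour, either form works.

```python
import sqlite3

# Assumption: in-memory SQLite in place of SQL Server; sample values only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLOYEEID INTEGER, PAYRATE REAL, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?)",
    [
        (2, 22.24, None),      # hourly employees have no salary
        (3, 15.29, None),
        (1, None, 54000.0),    # salaried employees have no pay rate
        (8, None, 60000.0),
    ],
)

# The expression SALARY / 2040.0 is evaluated per row, then averaged.
avg_payrate, avg_salary_rate = conn.execute(
    """
    SELECT AVG(PAYRATE) AS AVG_PAYRATE,
           AVG(SALARY / 2040.0) AS AVG_SALARY_RATE
    FROM EMPLOYEES
    """
).fetchone()
print(round(avg_payrate, 3), round(avg_salary_rate, 3))
```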

MAX

The MAX function returns the maximum value from the values of a column in a group of rows. NULL values are ignored when using the MAX function. Using MAX with the DISTINCT keyword is an option. However, because the maximum value for all the rows is the same as the maximum of the distinct values, DISTINCT has no effect.

The syntax for the MAX function is

MAX([ DISTINCT ] COLUMN NAME)

The following example returns the highest SALARY in the EMPLOYEES table:

SELECT MAX(SALARY)
FROM EMPLOYEES;
------------------------------
74000.00
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

This example adds DISTINCT and returns the same highest salary:

SELECT MAX(DISTINCT SALARY)
FROM EMPLOYEES;
------------------------------
74000.00
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

You can also use aggregate functions such as MAX and MIN (covered in the next section) on character data. With character values, the collation of your database comes into play again. Most commonly, a database collation is set to a dictionary order, so the values are ranked accordingly. For example, say you perform a MAX on the CITY column of the EMPLOYEES table:

SELECT MAX(CITY) AS MAX_CITY
FROM EMPLOYEES;
MAX_CITY
------------------------------
Zwara

(1 row(s) affected)

In this instance, the function returned the largest value according to a dictionary ordering of the data in the column.
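
The points about MAX can be checked on a few rows. This is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory SQLite database; note that SQLite's default collation compares text byte by byte, which can differ from a dictionary-order collation when letter case or accents are involved.

```python
import sqlite3

# Assumption: in-memory SQLite in place of SQL Server; sample values only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLOYEEID INTEGER, CITY TEXT, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?, ?)",
    [
        (1, "Red Dog", 54000.0),
        (2, "Errol", None),
        (6, "Alexandria", None),
        (8, "Espanola", 60000.0),
        (9, "Espanola", 52000.0),
    ],
)

max_salary, max_distinct_salary, max_city = conn.execute(
    "SELECT MAX(SALARY), MAX(DISTINCT SALARY), MAX(CITY) FROM EMPLOYEES"
).fetchone()

# DISTINCT does not change the maximum; MAX on text returns the value
# that sorts last under the collation in use.
print(max_salary, max_distinct_salary, max_city)
```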

MIN

The MIN function returns the minimum value of a column for a group of rows. NULL values are ignored when using the MIN function. Using MIN with the DISTINCT keyword is an option. However, because the minimum value for all rows is the same as the minimum of the distinct values, DISTINCT has no effect.

The syntax for the MIN function is

MIN([ DISTINCT ] COLUMN NAME)

The following example returns the lowest SALARY in the EMPLOYEES table:

SELECT MIN(SALARY)
FROM EMPLOYEES;
------------------------------
30000.00
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

This example adds DISTINCT and returns the same lowest salary:

SELECT MIN(DISTINCT SALARY)
FROM EMPLOYEES;
------------------------------
30000.00
Warning: Null value is eliminated by an aggregate or other SET operation.

(1 row(s) affected)

As with the MAX function, the MIN function can work against character data and returns the minimum value according to the dictionary ordering of the data.

SELECT MIN(CITY) AS MIN_CITY
FROM EMPLOYEES;
MIN_CITY
------------------------------
 AFB MunicipalCharleston SC

(1 row(s) affected)
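
The value returned above begins with a space, which is a likely reason it sorts ahead of every city that starts with a letter. The sketch below illustrates that effect under stated assumptions: Python's built-in sqlite3 module, an in-memory SQLite database with its default byte-by-byte text comparison, and three sample CITY values.

```python
import sqlite3

# Assumption: in-memory SQLite in place of SQL Server; sample values only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (EMPLOYEEID INTEGER, CITY TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEES VALUES (?, ?)",
    [
        (6, "Alexandria"),
        (2, "Errol"),
        (12, " AFB MunicipalCharleston SC"),  # value with a leading space
    ],
)

# A space sorts before any letter, so the untrimmed value is the minimum.
# TRIM removes the leading space before the comparison is made.
min_city, min_trimmed_city = conn.execute(
    "SELECT MIN(CITY), MIN(TRIM(CITY)) FROM EMPLOYEES"
).fetchone()
print(repr(min_city), repr(min_trimmed_city))
```

Here the trimmed value still sorts first, but only because "AFB" precedes "Alexandria"; with other data, trimming can change which value MIN returns.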