Home > Articles > Data > SQL

  • Print
  • + Share This
This chapter is from the book

10.10 Grouping Sets

The GROUP BY clauses that have been described so far use the short notation for the specification of groupings. SQL also has a more extensive notation. So-called grouping sets specifications indicate on which expressions groupings must be performed.

Example 10.21. For each town, get the most recent date of birth.

With the shortest notation form, this statement looks as follows:

SELECT   TOWN, MIN(BIRTH_DATE)
FROM     PLAYERS
GROUP BY TOWN

When the extensive notation form is used, this similar formulation occurs:

SELECT   TOWN, MIN(BIRTH_DATE)
FROM     PLAYERS
GROUP BY GROUPING SETS ((TOWN))

The result of both statements is:

TOWN       MIN(BIRTH_DATE)
---------  ---------------
Stratford  1948-09-01
Midhurst   1963-06-22
Inglewood  1962-07-08
Plymouth   1963-10-01
Douglas    1963-05-14
Eltham     1964-12-28

Explanation: Behind the term GROUPING SETS, the grouping sets specifications can be found. Within such a specification, several groupings can be specified. Each grouping is placed between brackets and the whole should also be placed between brackets—hence, the double brackets.

The advantage of the extensive notation form is that it offers more ways to group data. Several groupings can be specified, among other things, and combinations of ROLLUP and CUBE can be used.

Example 10.22. For each town, get the number of players, and for each sex, get the number of players as well.

Because a grouping is needed on two different columns, this question cannot be formulated with one GROUP BY clause (in which the short notation form is used). A way to combine these two groupings in one statement is to use an UNION operator.

SELECT   CAST(NULL AS CHAR), TOWN, COUNT(*)
FROM     PLAYERS
GROUP BY TOWN
UNION
SELECT   SEX, CAST(NULL AS CHAR), COUNT(*)
FROM     PLAYERS
GROUP BY SEX
ORDER BY 2, 1

The result is:

SEX  TOWN       COUNT(*)
---  ---------  --------
?    Stratford         7
?    Midhurst          1
?    Inglewood         2
?    Plymouth          1
?    Douglas           1
?    Eltham            2
M    ?                 9
F    ?                 5

Explanation: The rows in this intermediate result in which the TOWN column has been filled came from the first select block. The rows in which TOWN is equal to NULL form the intermediate result of the second select block. In fact, these two rows form subtotals for each sex.

To simplify this type of statement, the grouping sets specification has been added to SQL. With this, one GROUP BY clause can be used to specify several groupings. The previous statement becomes:

SELECT   SEX, TOWN, COUNT(*)
FROM     PLAYERS
GROUP BY GROUPING SETS ((TOWN), (SEX))
ORDER BY 2, 1

Explanation: Behind the words GROUPING SETS, two groupings are now specified: (TOWN) and (SEX).

If one grouping consists of one expression, the brackets can be removed. So, GROUP BY GROUPING SETS ((TOWN), (SEX)) is equivalent to GROUP BY GROUPING SETS (TOWN, SEX).

Again, the GROUP BY clause, as we discussed in the previous chapters, is, in fact, a shortened notation for the one with grouping sets. Table 10.1 contains several examples of original formulations without grouping sets and their equivalents with grouping sets.

Table 10.1. Original GROUP BY Clauses and Their Equivalent Grouping Sets Specifications

Original Specification

Specification with Grouping Sets

GROUP BY A

GROUP BY GROUPING SETS ((A))

or

GROUP BY GROUPING SETS (A)

GROUP BY A, B

GROUP BY GROUPING SETS ((A, B))

GROUP BY YEAR(A), SUBSTR(B)

GROUP BY GROUPING SETS ((YEAR(A), SUBSTR(B)))

A special grouping is (). There is no expression between the brackets. In this case, all rows are placed in one group. We can calculate a grand total with that, for example.

Example 10.23. Find for each combination of sex-town the number of players, get for each sex the number of players, and get the total number of players in the entire table.

SELECT   SEX, TOWN, COUNT(*)
FROM     PLAYERS
GROUP BY GROUPING SETS ((SEX, TOWN), (SEX), ())
ORDER BY 1, 2

The result is:

SEX  TOWN       COUNT(*)
---  ---------  --------
M    Stratford         7
M    Inglewood         1
M    Douglas           1
M    ?                 9
F    Midhurst          1
F    Inglewood         1
F    Plymouth          1
F    Eltham            2
F    ?                 5
?    ?                14

Explanation: The last row contains the total number of players and is added because of the grouping ().

Example 10.24. Get for each team and for each player individually the number of matches played.

SELECT   TEAMNO, PLAYERNO, COUNT(*)
FROM     MATCHES
GROUP BY GROUPING SETS (TEAMNO, PLAYERNO)
ORDER BY 1, 2

The result is:

TEAMNO  PLAYERNO  COUNT(*)
------  --------  --------
     1         ?         8
     2         ?         5
     ?         2         1
     ?         6         3
     ?         8         2
     ?        27         1
     ?        44         1
     ?        57         1
     ?        83         1
     ?       104         1
     ?       112         2

Explanation: The first two rows in the result have been included because of the grouping on the TEAMNO column and the other rows because of the grouping on the PLAYERNO column.

This example clearly shows that brackets are important. The specification GROUPING SETS (TEAMNO, PLAYERNO) returns a different result than GROUPING SETS ((TEAMNO, PLAYERNO)). The second grouping sets specification results in one grouping on the combination of the two columns specified and the second grouping sets specification leads to two groupings.

Finally, here are a few abstract examples of certain GROUP BY clauses, including the groupings that are executed. E1, E2, and E3 stand for random expressions, and the symbol ∪ represents the union operator.

Table 10.2. The Relationship Between Grouping Sets Specifications and Groupings

GROUP BY Clause

Groupings

GROUP BY E1, E2, E3

[E1, E2, E3]

GROUP BY GROUPING SETS (())

[]

GROUP BY GROUPING SETS ((E1, E2, E3))

[E1, E2, E3]

GROUP BY GROUPING SETS (E1, E2, E3)

[E1] ∪ [E2] ∪ [E3]

GROUP BY GROUPING SETS ((E1), (E2), (E3))

[E1] ∪ [E2] ∪ [E3]

GROUP BY GROUPING SETS ((E1, E2), (E3))

[E1, E2] ∪ [E3]

GROUP BY GROUPING SETS ((E1, E2), E3)

[E1, E2] ∪ [E3]

GROUP BY GROUPING SETS ((E1, E2), (E3, E4))

[E1, E2] ∪ [E3, E4]

GROUP BY GROUPING SETS ((E1, (E2, E3)))

Not allowed

Exercise 10.24: Get the total number of penalties by using a grouping sets specification.

Exercise 10.25: Get for each combination of team number and player number the number of matches, give the number of matches for each team number, and find the total number of matches as well.

Exercise 10.26: Indicate which groupings must be specified for the following GROUP BY clauses:

  1. GROUP BY GROUPING SETS ((), (), (E1), (E2))
  2. GROUP BY GROUPING SETS (E1, (E2, E3),(E3, E4, E5))
  3. GROUP BY GROUPING SETS ((E1, E2), (), E3, (E2, E1))
  • + Share This
  • 🔖 Save To Your Account