- 10.1 Introduction
- 10.2 Grouping on One Column
- 10.3 Grouping on Two or More Columns
- 10.4 Grouping on Expressions
- 10.5 Grouping of NULL Values
- 10.6 General Rules for the GROUP BY Clause
- 10.7 Complex Examples with GROUP BY
- 10.8 Grouping with WITH ROLLUP
- 10.9 Grouping with WITH CUBE
- 10.10 Grouping Sets
- 10.11 Grouping with ROLLUP and CUBE
- 10.12 Combining Grouping Sets
- 10.13 Answers
10.11 Grouping with ROLLUP and CUBE
Section 10.8 describes the WITH ROLLUP specification. This specification cannot be used if the GROUP BY clause contains grouping sets specifications. In that case, an alternative specification must be used.
It often happens that data has to be aggregated on different levels. Example 10.23 is a clear example. For such a situation, a short notation form has been added, the ROLLUP. Imagine that E1 and E2 are two expressions. In that case, the specification GROUP BY ROLLUP (E1, E2) is equal to the specification GROUP BY GROUPING SETS ((E1, E2), ((E1), ()). So, ROLLUP does not offer extra functionality; it makes only the formulation of some GROUP BY clauses easier. This means that the SELECT statement in Example 10.23 can be simplified by using ROLLUP.
Example 10.25. Get for each combination of sex-town the number of players, get for each sex the number of players, and get the total number of players in the entire table.
SELECT SEX, TOWN, COUNT(*) FROM PLAYERS GROUP BY ROLLUP (SEX, TOWN) ORDER BY 1, 2
The result is (of course, equal to that of Example 10.23):
SEX TOWN COUNT(*) --- --------- -------- M Stratford 7 M Inglewood 1 M Douglas 1 M ? 9 F Midhurst 1 F Inglewood 1 F Plymouth 1 F Eltham 2 F ? 5 ? ? 14
Explanation: The term ROLLUP comes from the OLAP world. It is an operator that is supported by many OLAP tools. It indicates that data must be aggregated on different levels, beginning at the lowest level. That lowest level is, of course, specified at ROLLUP. In this example, it is formed by the combination of the SEX and TOWN columns. After that, the data is aggregated by sex and then the total.
Example 10.26. For each combination of sex-town-year of birth, get the number of players; for each combination of sex-town, get the number of players; for each sex, get the number of players; and, finally, get the total number of players.
SELECT ROW_NUMBER() OVER () AS SEQNO, SEX, TOWN, YEAR(BIRTH_DATE), COUNT(*) FROM PLAYERS GROUP BY ROLLUP (SEX, TOWN, YEAR(BIRTH_DATE)) ORDER BY 2, 3, 4
The result is:
SEQNO SEX TOWN YEAR(BIRTH_DATE) COUNT(*) ----- --- --------- ---------------- -------- 1 M Stratford 1948 1 2 M Stratford 1956 2 3 M Stratford 1963 2 4 M Stratford 1964 1 5 M Stratford 1971 1 6 M Stratford 7 7 M Inglewood 1963 1 8 M Inglewood 1 9 M Douglas 1963 1 10 M Douglas 1 11 M 9 12 F Midhurst 1963 1 13 F Midhurst 1 14 F Inglewood 1962 1 15 F Inglewood 1 16 F Plymouth 1963 1 17 F Plymouth 1 18 F Eltham 1964 1 19 F Eltham 1970 1 20 F Eltham 2 21 F 5 22 14
Explanation: The grouping [SEX, TOWN, YEAR(BIRTH_DATE)] returns the rows 1, 2, 3, 4, 5, 7, 9, 12, 14, 16, 18, and 19. The grouping [SEX, TOWN] results in the rows 6, 8, 10, 13, 15, 17, and 20. The grouping [SEX] leads up to the rows 11 and 21, and, finally, the grouping [] returns the last row.
By adding more brackets, certain aggregation levels can be skipped.
Example 10.27. For each combination of sex-town-year of birth, get the number of players; for each sex, get the number of players; and, finally, get the total number of players.
SELECT ROW_NUMBER() OVER () AS SEQNO, SEX, TOWN, YEAR(BIRTH_DATE), COUNT(*) FROM PLAYERS GROUP BY ROLLUP (SEX, (TOWN, YEAR(BIRTH_DATE))) ORDER BY 2, 3, 4
The result is:
SEQNO SEX TOWN YEAR(BIRTH_DATE) COUNT(*) ----- --- --------- ---------------- -------- 1 M Stratford 1948 1 2 M Stratford 1956 2 3 M Stratford 1963 2 4 M Stratford 1964 1 5 M Stratford 1971 1 6 M Inglewood 1963 1 7 M Douglas 1963 1 8 M 9 9 F Midhurst 1963 1 10 F Inglewood 1962 1 11 F Plymouth 1963 1 12 F Eltham 1964 1 13 F Eltham 1970 1 14 F 5 15 14
Explanation: Because the TOWN column is placed between brackets together with the expression YEAR(BIRTH_DATE), it is considered to be a group. The groupings that are performed because of this are, successively, [SEX, TOWN, YEAR(BIRTH_DATE)], [SEX], and []. The grouping [SEX, TOWN] is skipped.
By way of illustration, the specification ROLLUP ((SEX, TOWN), YEAR(BIRTH_DATE)) would lead to the groupings [SEX, TOWN, YEAR(BIRTH_DATE)], [SEX, TOWN], and []. Only the grouping on the SEX column is absent here. Another example: The specification ROLLUP ((SEX, TOWN), (YEAR(BIRTH_DATE), MONTH(BIRTH_DATE))) results in the following groupings: [SEX, TOWN, YEAR(BIRTH_DATE), MONTH(BIRTH_DATE)], [SEX, TOWN], and [].
Besides ROLLUP, SQL has a second specification to simplify long GROUP BY clauses: the CUBE. If E _{1} and E _{2} are two expressions, the specification GROUP BY CUBE (E _{1}, E _{2}, E _{3}) is equal to the specification GROUP BY GROUPING SETS ((E _{1}, E _{2}, E _{3}), (E _{1}, E _{2}), (E _{1}, E _{3}), (E _{2}, E _{3}), (E _{1}), (E _{2}), (E _{3}), ()).
Example 10.28. Get the number of players for each combination of sex-town, for each sex and for each town, and also get the total number of players in the entire table.
SELECT ROW_NUMBER() OVER () AS SEQNO, SEX, TOWN, COUNT(*) FROM PLAYERS GROUP BY CUBE (SEX, TOWN) ORDER BY 2, 3
The result is:
SEQNO SEX TOWN COUNT(*) ----- --- --------- -------- 1 M Stratford 7 2 M Inglewood 1 3 M Douglas 1 4 M 9 5 F Midhurst 1 6 F Inglewood 1 7 F Plymouth 1 8 F Eltham 2 9 F 5 10 Stratford 7 11 Midhurst 1 12 Inglewood 2 13 Plymouth 1 14 Douglas 1 15 Eltham 2 16 14
Explanation: Rows 1, 2, 3, 5, 6, 7, and 8 have been included because of the grouping [SEX, TOWN]. Rows 4 and 9 have been included because of the grouping [SEX]. Rows 10 up to and including 15 form the result of the grouping [TOWN]. Finally, row 16 forms the result of a total grouping.
The GROUPING function can also be used in combination with ROLLUP and CUBE.
As in Section 10.10, we show several other abstract examples of certain GROUP BY clauses in which ROLLUP and CUBE appear, including the groupings that are executed. Again, E _{1}, E _{2}, E _{3} and E _{4} represent random expressions, and the symbol ∪ represents the union operator.
Table 10.3. The Relationship Between Grouping Sets Specifications and Groupings
GROUP BY Clause |
Groupings |
GROUP BY ROLLUP (()) |
[] |
GROUP BY ROLLUP (E1) |
[E1] ∪ [] |
GROUP BY ROLLUP (E1, E2) |
[E1, E2] ∪ [E1] ∪ [] |
GROUP BY ROLLUP (E1, (E2, E3)) |
[E1, E2, E3] ∪ [E1] ∪ [] |
GROUP BY ROLLUP ((E1, E2), E3) |
[E1, E2, E3] ∪ [E1, E2] ∪ [] |
GROUP BY ROLLUP ((E1, E2), (E3, E4)) |
[E1, E2, E3, E4] ∪ [E1, E2] ∪ [] |
GROUP BY CUBE (()) |
[] |
GROUP BY CUBE (E1) |
[E1] ∪ [] |
GROUP BY CUBE (E1, E2) |
[E1, E2] ∪ [E1] ∪ [E2] ∪ [] |
GROUP BY CUBE (E1, E2, E3) |
[E1, E2, E3] ∪ [E1, E2] ∪ [E1, E3] ∪ [E2, E3] ∪ [E1] ∪ [E2] ∪ [E3] ∪ [] |
GROUP BY CUBE (E1, E2, E3, E4) |
[E1, E2, E3, E4] ∪ [E1, E2, E3] [E1, E2, E4] ∪ [E1, E3, E4] ∪ [E2, E3, E4] ∪ [E1, E2] ∪ [E1, E3] ∪ [E1, E4] ∪ [E2, E3] ∪ [E2, E4] ∪ [E3, E4] ∪ [E1] ∪ [E2] ∪ [E3] ∪ [E4] ∪ [] |
GROUP BY CUBE (E1, (E2, E3)) |
[E1, E2, E3] ∪ [E1] ∪ [E2, E3] ∪ [] |
GROUP BY CUBE ((E1, E2), (E3, E4)) |
[E1, E2, E3, E4] ∪ [E1, E2] ∪ [E3, E4] ∪ [] |
GROUP BY CUBE (E1, ()) |
[E1] ∪ [] |
Exercise 10.27: For each combination of team number-player number, get the number of matches, and also get the number of matches for each team and the total number of matches. In this result, include only those matches that have been won in this result.
Exercise 10.28: Execute a CUBE on the column town, sex, and team number after the two tables PLAYERS and TEAMS have been joined.