Home > Articles > Data > SQL

  • Print
  • + Share This
This chapter is from the book

10.11 Grouping with ROLLUP and CUBE

Section 10.8 describes the WITH ROLLUP specification. This specification cannot be used if the GROUP BY clause contains grouping sets specifications. In that case, an alternative specification must be used.

It often happens that data has to be aggregated on different levels. Example 10.23 is a clear example. For such a situation, a short notation form has been added, the ROLLUP. Imagine that E1 and E2 are two expressions. In that case, the specification GROUP BY ROLLUP (E1, E2) is equal to the specification GROUP BY GROUPING SETS ((E1, E2), ((E1), ()). So, ROLLUP does not offer extra functionality; it makes only the formulation of some GROUP BY clauses easier. This means that the SELECT statement in Example 10.23 can be simplified by using ROLLUP.

Example 10.25. Get for each combination of sex-town the number of players, get for each sex the number of players, and get the total number of players in the entire table.

SELECT   SEX, TOWN, COUNT(*)
FROM     PLAYERS
GROUP BY ROLLUP (SEX, TOWN)
ORDER BY 1, 2

The result is (of course, equal to that of Example 10.23):

SEX  TOWN       COUNT(*)
---  ---------  --------
M    Stratford         7
M    Inglewood         1
M    Douglas           1
M    ?                 9
F    Midhurst          1
F    Inglewood         1
F    Plymouth          1
F    Eltham            2
F    ?                 5
?    ?                14

Explanation: The term ROLLUP comes from the OLAP world. It is an operator that is supported by many OLAP tools. It indicates that data must be aggregated on different levels, beginning at the lowest level. That lowest level is, of course, specified at ROLLUP. In this example, it is formed by the combination of the SEX and TOWN columns. After that, the data is aggregated by sex and then the total.

Example 10.26. For each combination of sex-town-year of birth, get the number of players; for each combination of sex-town, get the number of players; for each sex, get the number of players; and, finally, get the total number of players.

SELECT   ROW_NUMBER() OVER () AS SEQNO,
         SEX, TOWN, YEAR(BIRTH_DATE), COUNT(*)
FROM     PLAYERS
GROUP BY ROLLUP (SEX, TOWN, YEAR(BIRTH_DATE))
ORDER BY 2, 3, 4

The result is:

SEQNO  SEX  TOWN       YEAR(BIRTH_DATE)  COUNT(*)
-----  ---  ---------  ----------------  --------
    1  M    Stratford              1948         1
    2  M    Stratford              1956         2
    3  M    Stratford              1963         2
    4  M    Stratford              1964         1
    5  M    Stratford              1971         1
    6  M    Stratford                           7
    7  M    Inglewood              1963         1
    8  M    Inglewood                           1
    9  M    Douglas                1963         1
   10  M    Douglas                             1
   11  M                                        9
   12  F    Midhurst               1963         1
   13  F    Midhurst                            1
   14  F    Inglewood              1962         1
   15  F    Inglewood                           1
   16  F    Plymouth               1963         1
   17  F    Plymouth                            1
   18  F    Eltham                 1964         1
   19  F    Eltham                 1970         1
   20  F    Eltham                              2
   21  F                                        5
   22                                          14

Explanation: The grouping [SEX, TOWN, YEAR(BIRTH_DATE)] returns the rows 1, 2, 3, 4, 5, 7, 9, 12, 14, 16, 18, and 19. The grouping [SEX, TOWN] results in the rows 6, 8, 10, 13, 15, 17, and 20. The grouping [SEX] leads up to the rows 11 and 21, and, finally, the grouping [] returns the last row.

By adding more brackets, certain aggregation levels can be skipped.

Example 10.27. For each combination of sex-town-year of birth, get the number of players; for each sex, get the number of players; and, finally, get the total number of players.

SELECT   ROW_NUMBER() OVER () AS SEQNO,
         SEX, TOWN, YEAR(BIRTH_DATE), COUNT(*)
FROM     PLAYERS
GROUP BY ROLLUP (SEX, (TOWN, YEAR(BIRTH_DATE)))
ORDER BY 2, 3, 4

The result is:

SEQNO  SEX  TOWN       YEAR(BIRTH_DATE)  COUNT(*)
-----  ---  ---------  ----------------  --------
    1  M    Stratford              1948         1
    2  M    Stratford              1956         2
    3  M    Stratford              1963         2
    4  M    Stratford              1964         1
    5  M    Stratford              1971         1
    6  M    Inglewood              1963         1
    7  M    Douglas                1963         1
    8  M                                        9
    9  F    Midhurst               1963         1
   10  F    Inglewood              1962         1
   11  F    Plymouth               1963         1
   12  F    Eltham                 1964         1
   13  F    Eltham                 1970         1
   14  F                                        5
   15                                          14

Explanation: Because the TOWN column is placed between brackets together with the expression YEAR(BIRTH_DATE), it is considered to be a group. The groupings that are performed because of this are, successively, [SEX, TOWN, YEAR(BIRTH_DATE)], [SEX], and []. The grouping [SEX, TOWN] is skipped.

By way of illustration, the specification ROLLUP ((SEX, TOWN), YEAR(BIRTH_DATE)) would lead to the groupings [SEX, TOWN, YEAR(BIRTH_DATE)], [SEX, TOWN], and []. Only the grouping on the SEX column is absent here. Another example: The specification ROLLUP ((SEX, TOWN), (YEAR(BIRTH_DATE), MONTH(BIRTH_DATE))) results in the following groupings: [SEX, TOWN, YEAR(BIRTH_DATE), MONTH(BIRTH_DATE)], [SEX, TOWN], and [].

Besides ROLLUP, SQL has a second specification to simplify long GROUP BY clauses: the CUBE. If E 1 and E 2 are two expressions, the specification GROUP BY CUBE (E 1, E 2, E 3) is equal to the specification GROUP BY GROUPING SETS ((E 1, E 2, E 3), (E 1, E 2), (E 1, E 3), (E 2, E 3), (E 1), (E 2), (E 3), ()).

Example 10.28. Get the number of players for each combination of sex-town, for each sex and for each town, and also get the total number of players in the entire table.

SELECT   ROW_NUMBER() OVER () AS SEQNO,
         SEX, TOWN, COUNT(*)
FROM     PLAYERS
GROUP BY CUBE (SEX, TOWN)
ORDER BY 2, 3

The result is:

SEQNO  SEX  TOWN       COUNT(*)
-----  ---  ---------  --------
    1  M    Stratford         7
    2  M    Inglewood         1
    3  M    Douglas           1
    4  M                      9
    5  F    Midhurst          1
    6  F    Inglewood         1
    7  F    Plymouth          1
    8  F    Eltham            2
    9  F                      5
   10       Stratford         7
   11       Midhurst          1
   12       Inglewood         2
   13       Plymouth          1
   14       Douglas           1
   15       Eltham            2
   16                        14

Explanation: Rows 1, 2, 3, 5, 6, 7, and 8 have been included because of the grouping [SEX, TOWN]. Rows 4 and 9 have been included because of the grouping [SEX]. Rows 10 up to and including 15 form the result of the grouping [TOWN]. Finally, row 16 forms the result of a total grouping.

The GROUPING function can also be used in combination with ROLLUP and CUBE.

As in Section 10.10, we show several other abstract examples of certain GROUP BY clauses in which ROLLUP and CUBE appear, including the groupings that are executed. Again, E 1, E 2, E 3 and E 4 represent random expressions, and the symbol ∪ represents the union operator.

Table 10.3. The Relationship Between Grouping Sets Specifications and Groupings

GROUP BY Clause

Groupings

GROUP BY ROLLUP (())

[]

GROUP BY ROLLUP (E1)

[E1][]

GROUP BY ROLLUP (E1, E2)

[E1, E2][E1][]

GROUP BY ROLLUP (E1, (E2, E3))

[E1, E2, E3][E1][]

GROUP BY ROLLUP ((E1, E2), E3)

[E1, E2, E3][E1, E2][]

GROUP BY ROLLUP ((E1, E2), (E3, E4))

[E1, E2, E3, E4][E1, E2][]

GROUP BY CUBE (())

[]

GROUP BY CUBE (E1)

[E1][]

GROUP BY CUBE (E1, E2)

[E1, E2][E1][E2][]

GROUP BY CUBE (E1, E2, E3)

[E1, E2, E3][E1, E2][E1, E3][E2, E3][E1][E2][E3][]

GROUP BY CUBE (E1, E2, E3, E4)

[E1, E2, E3, E4][E1, E2, E3] [E1, E2, E4][E1, E3, E4][E2, E3, E4][E1, E2][E1, E3][E1, E4][E2, E3][E2, E4][E3, E4][E1][E2][E3][E4][]

GROUP BY CUBE (E1, (E2, E3))

[E1, E2, E3][E1][E2, E3][]

GROUP BY CUBE ((E1, E2), (E3, E4))

[E1, E2, E3, E4][E1, E2][E3, E4][]

GROUP BY CUBE (E1, ())

[E1][]

Exercise 10.27: For each combination of team number-player number, get the number of matches, and also get the number of matches for each team and the total number of matches. In this result, include only those matches that have been won in this result.

Exercise 10.28: Execute a CUBE on the column town, sex, and team number after the two tables PLAYERS and TEAMS have been joined.

  • + Share This
  • 🔖 Save To Your Account