Home > Articles > Data > SQL Server

Transact-SQL Improvements with SQL Server 2008, Part 2

📄 Contents

  1. Working with Multiple Groupings of Data in a Single Command
  2. Passing Multiple Data Rows to a Stored Procedure with Table-Valued Parameters
  • Print
  • + Share This
Database administration expert Baya Dewald discusses improvements to Transact-SQL language in SQL Server 2008. He concludes this two-part series by discussing functions for grouping and table-valued parameters for passing multiple data rows to a stored procedure.
Like this article? We recommend

Like this article? We recommend

In this article, we'll continue our brief look at new and changed Transact-SQL functionality in SQL Server 2008. In part 1 of this series, we considered initializing and incrementing variables, merging multiple data sets into one, separating date and time values with new data types, and some new functions for dealing with date and time. Now we'll take a look at functions for grouping, as well as how to use table-valued parameters to pass multiple data rows to a stored procedure.

Working with Multiple Groupings of Data in a Single Command

The SELECT statement's GROUP BY clause has been extended to allow for a new GROUPING SETS option, which enables you to have multiple groupings of data in a single command. For example, a quick look at the Person.Address table in the AdventureWorks sample database reveals that persons reside in different cities within multiple states. The following query shows the number of persons in each combination of city and state/province:

SELECT b.Name AS StateProvince, a.City, COUNT(*) AS NumberOfAddresses
FROM Person.Address a
INNER JOIN Person.StateProvince b
ON a.StateProvinceID = b.StateProvinceID
GROUP BY b.Name, a.City
ORDER BY 3 DESC

The following table shows partial results of this query:

StateProvince

City

NumberOfAddresses

England

London

434

Seine (Paris)

Paris

398

Washington

Burien

215

California

Concord

214

Washington

Bellingham

213

Oregon

Beaverton

213

California

Chula Vista

207

However, this result doesn't produce subtotals by each city and each province. I can easily modify this query to include subtotals for each city name, province name, and a combination of city and state, simply by adding the GROUPING SETS clause:

SELECT b.Name AS StateProvince, a.City, COUNT(*) AS NumberOfAddresses
FROM Person.Address a
INNER JOIN Person.StateProvince b
ON a.StateProvinceID = b.StateProvinceID
GROUP BY GROUPING SETS (b.Name, a.City, (a.City, b.Name))
ORDER BY 3 DESC

Partial results:

StateProvince

City

NumberOfAddresses

California

NULL

4564

Washington

NULL

2636

England

NULL

1954

New South Wales

NULL

1588

British Columbia

NULL

1579

Oregon

NULL

1105

NULL

London

434

England

London

434

NULL

Paris

398

Seine (Paris)

Paris

398

To be fair, previous versions of SQL Server also allowed constructing a similar output, but doing so would require considerably more typing. The following query would return results identical to the one above:

SELECT b.Name AS StateProvince, NULL AS City, COUNT(*) AS NumberOfAddresses
 FROM Person.Address a
INNER JOIN Person.StateProvince b
ON a.StateProvinceID = b.StateProvinceID
GROUP BY b.Name
UNION ALL
SELECT NULL AS StateProvince, City, COUNT(*) AS NumberOfAddresses
 FROM Person.Address a
INNER JOIN Person.StateProvince b
ON a.StateProvinceID = b.StateProvinceID
GROUP BY City
UNION ALL
SELECT b.Name AS StateProvince, City, COUNT(*) AS NumberOfAddresses
 FROM Person.Address a
INNER JOIN Person.StateProvince b
ON a.StateProvinceID = b.StateProvinceID
GROUP BY b.Name, City
ORDER BY 3 DESC

In addition to saving the effort of typing a much longer statement, the GROUPING SETS clause provides considerably better performance than the equivalent statement using a UNION ALL construct. As the statistics report below shows, the SELECT command using the GROUPING SETS clause scans the Address table twice, whereas the statement using UNION ALL scans the same table three times:

Statistics report for GROUPING SETS:

Table 'Address'. Scan count 2, logical reads 245, physical reads 6,
 read-ahead reads 240, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.
Table 'StateProvince'. Scan count 2, logical reads 7, physical reads 2,
 read-ahead reads 0, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0,
 read-ahead reads 0, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.


Statistics report for UNION ALL:

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0,
 read-ahead reads 0, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.
Table 'Address'. Scan count 3, logical reads 461, physical reads 9,
 read-ahead reads 240, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.
Table 'StateProvince'. Scan count 2, logical reads 7, physical reads 2,
 read-ahead reads 0, lob logical reads 0, lob physical reads 0,
 lob read-ahead reads 0.

Transact-SQL now also includes ANSI-standard ROLLUP and CUBE operators, which are shorthand alternatives to GROUPING SETS. For example, the following query will produce subtotals for each state/province and a grand total for all addresses:

SELECT b.Name AS StateProvince, a.City, COUNT(*) AS NumberOfAddresses
FROM Person.Address a
INNER JOIN Person.StateProvince b
ON a.StateProvinceID = b.StateProvinceID
GROUP BY ROLLUP (b.name, a.City)
ORDER BY 1

Partial results:

StateProvince

City

NumberofAddresses

NULL

NULL

19614

Alabama

Birmingham

2

Alabama

Florence

1

Alabama

Huntsville

2

Alabama

Mobile

1

Alabama

Montgomery

1

Alabama

NULL

7

Alberta

Calgary

23

Alberta

Edmonton

2

Alberta

NULL

25

Arizona

Chandler

4

Arizona

Gilbert

1

Arizona

Lemon Grove

2

Arizona

Mesa

1

Arizona

Phoenix

5

Arizona

Scottsdale

2

Arizona

Surprise

1

Arizona

Tucson

2

Arizona

NULL

18

The next query will produce subtotals for each city, subtotals for each state, and the grand total for all addresses:

SELECT b.Name AS StateProvince, a.City, COUNT(*) AS NumberOfAddresses
FROM Person.Address a
INNER JOIN Person.StateProvince b
ON a.StateProvinceID = b.StateProvinceID
GROUP BY CUBE (b.name, a.City)
ORDER BY 1

The GROUPING_ID function allows us to identify the column(s) used to aggregate the results. To demonstrate, I'll limit the output to cities found in Alabama:

SELECT
      GROUPING_ID(b.Name, a.City) AS grouping_bits,
      b.Name AS StateProvince,
      a.City,
      COUNT(*) AS NumberOfAddresses
FROM Person.Address a
INNER JOIN Person.StateProvince b
ON a.StateProvinceID = b.StateProvinceID
WHERE b.Name = 'Alabama'
GROUP BY CUBE (b.name, a.City)
ORDER BY 1, 3

Results:

grouping_bits

StateProvince

City

NumberofAddresses

0

Alabama

Birmingham

2

0

Alabama

Florence

1

0

Alabama

Huntsville

2

0

Alabama

Mobile

1

0

Alabama

Montgomery

1

1

Alabama

NULL

7

2

NULL

Birmingham

2

2

NULL

Florence

1

2

NULL

Huntsville

2

2

NULL

Mobile

1

2

NULL

Montgomery

1

3

NULL

NULL

7

Notice that I passed the state column first to the GROUPING_ID function, followed by city. So the grouping bits column will have a value of 1 if results are aggregated by state, 2 if aggregated by city, and 3 (1+2) if aggregated by both. If addresses are displayed for a combination of city and state, we aren't aggregating by either column, so grouping bits will have a value of zero. Note also that the grouping bits column allowed for a very nice option for sorting the results (using the ORDER BY clause). Similarly, if the result set was large, we could materialize it—persist it as an indexed view or table—and build an index on the grouping bits column for faster data retrieval.

  • + Share This
  • 🔖 Save To Your Account