Sometimes it is necessary to rotate results so that columns are presented horizontally and rows are presented vertically. This is known as creating a PivotTable, creating a cross-tab report, or rotating data.
Assume there is a table Pivot that has one row per quarter. A SELECT of Pivot reports the quarters vertically:
These are the statements used to create the Pivot table and populate it with the data from the first table:
CREATE TABLE Pivot ( YEAR INT, Quarter INT, Amount FLOAT ); INSERT INTO Pivot VALUES (1990, 1, 1.1); INSERT INTO Pivot VALUES (1990, 2, 1.2); INSERT INTO Pivot VALUES (1990, 3, 1.3); INSERT INTO Pivot VALUES (1990, 4, 1.4); INSERT INTO Pivot VALUES (1991, 1, 2.1); INSERT INTO Pivot VALUES (1991, 2, 2.2); INSERT INTO Pivot VALUES (1991, 3, 2.3); INSERT INTO Pivot VALUES (1991, 4, 2.4);
This is the SELECT statement used to create the rotated results:
SELECT YEAR, SUM(CASE Quarter WHEN 1 THEN Amount ELSE 0 END) AS Q1, SUM(CASE Quarter WHEN 2 THEN Amount ELSE 0 END) AS Q2, SUM(CASE Quarter WHEN 3 THEN Amount ELSE 0 END) AS Q3, SUM(CASE Quarter WHEN 4 THEN Amount ELSE 0 END) AS Q4 FROM Pivot GROUP BY YEAR
This SELECT statement also handles a table in which there are multiple rows for each quarter. The GROUP BY combines all rows in Pivot for a given year into a single row in the output. When the grouping operation is being performed, the CASE functions in the SUM aggregates are applied in such a way that the Amount values for each quarter are added into the proper column in the result set and 0 is added to the result set columns for the other quarters.
If the results of this SELECT statement are used as input to a spreadsheet, it is easy for the spreadsheet to calculate a total for each year. When the SELECT is used from an application it may be easier to enhance the SELECT statement to calculate the yearly total. For example:
SELECT P1.*, (P1.Q1 + P1.Q2 + P1.Q3 + P1.Q4) AS YearTotal FROM (SELECT YEAR, SUM(CASE P.Quarter WHEN 1 THEN P.Amount ELSE 0 END) AS Q1, SUM(CASE P.Quarter WHEN 2 THEN P.Amount ELSE 0 END) AS Q2, SUM(CASE P.Quarter WHEN 3 THEN P.Amount ELSE 0 END) AS Q3, SUM(CASE P.Quarter WHEN 4 THEN P.Amount ELSE 0 END) AS Q4 FROM Pivot AS P GROUP BY P.Year) AS P1
The Pivot transformation makes a normalized data set into a less normalized but more compact version by pivoting the input data on a column value. For example, a normalized Orders data set that lists customer name, product, and quantity purchased typically has multiple rows for any customer who purchased multiple products, with each row for that customer showing order details for a different product. By pivoting the data set on the product column, the Pivot transformation can output a data set with a single row per customer. That single row lists all the purchases by the customer, with the product names shown as column names, and the quantity shown as a value in the product column. Because not every customer purchases every product, many columns may contain null values.
When a dataset is pivoted, input columns perform different roles in the pivoting process. A column can participate in the following ways:
- The column is passed through unchanged to the output. Because many input rows can result only in one output row, the transformation copies only the first input value for the column.
- The column acts as the key or part of the key that identifies a set of records.
- The column defines the pivot. The values in this column are associated with columns in the pivoted data set.
- The column contains values that are placed in the columns that the pivot creates.
The following diagram shows a data set before the data is pivoted on the Product column.
The following diagram shows a data set after the data has been pivoted on the Product column.
|Pivot Key Value||Ham||Soda||Milk||Beer||Chips|
To pivot data efficiently, which means creating as few records in the output dataset as possible, the input data must be sorted on the pivot column. If the data is not sorted, the Pivot transformation might generate multiple records for each value in the set key, which is the column that defines set membership. For example, if the dataset is pivoted on a Name column but the names are not sorted, the output dataset could have more than one row for each customer, because a pivot occurs every time that the value in Name changes.
The Pivot transformation uses the properties on its input and output columns to define the pivot operation.
The Pivot transformation has one input, one regular output, and one error output.
FROM table_reference [, table_reference ]* table_reference : non_join_table | joined_table | pivoted_table pivoted_table : table_reference PIVOT pivot_clause AS table_alias pivot_clause : ( aggregate_function ( value_column ) FOR column_name IN ( column_list ) )