Pivot table basics: rows to columns

from the Artful Common Queries page


From table tbl( class, member ), you want to cross-tabulate all classes with their members. In SQL terms, you aggregate members over classes. In MySQL:

SELECT class,GROUP_CONCAT(member)
FROM tbl
GROUP BY class;

With that simple query, you're halfway toward cross-tabulation, halfway to implementing a simple CUBE, and halfway to basic entity-attribute-value (EAV) logic. This is easier to see if we have two columns, rather than just one, to tabulate against the grouping column:

DROP TABLE IF EXISTS tbl;
CREATE TABLE tbl( id INT, colID INT, value CHAR(20) );
INSERT INTO tbl VALUES
  (1,1,'Sampo'),(1,2,'Kallinen'),(1,3,'Office Manager'),
  (2,1,'Jakko'),(2,2,'Salovaara'),(2,3,'Vice President');

To tabulate all colID and value values against all id values—that is, to write a reporting CUBE for the table—write a GROUP_CONCAT() instruction for each colID found in the table, then GROUP BY id:

SELECT 
  id, 
  GROUP_CONCAT(if(colID = 1, value, NULL)) AS 'First Name',
  GROUP_CONCAT(if(colID = 2, value, NULL)) AS 'Last Name',
  GROUP_CONCAT(if(colID = 3, value, NULL)) AS 'Job Title'
FROM tbl
GROUP BY id;
+------+------------+-----------+----------------+
| id   | First Name | Last Name | Title          |
+------+------------+-----------+----------------+
|    1 | Sampo      | Kallinen  | Office Manager |
|    2 | Jakko      | Salovaara | Vice President |
+------+------------+-----------+----------------+

Since Paul Spinks first taught a spreadsheet how to do that cross-tabulation trick in 1979, this has also been known as a pivot table: we pivot colID and value against ID.

(Of course for a proper EAV representation, we'd add an attributes table:

DROP TABLE IF EXISTS attrs;
CREATE TABLE attrs(colID INT,attr CHAR(12));
INSERT INTO attrs VALUES (1,'First Name'),(2,'Last Name'),(3,'Title');

and write a stored procedure to PREPARE the above query from table-based attribute names.)

More often, crosstab queries calculate. Here is a simple sales table:

DROP TABLE IF EXISTS sales;
CREATE TABLE Sales (empID INT, yr SMALLINT, sales DECIMAL(10,2));
INSERT sales VALUES
(1, 2005, 12000),(1, 2006, 18000),(1, 2007, 25000),
(2, 2005, 15000),(2, 2006, 6000),(3, 2006, 20000),(3, 2007, 24000);

In Microsoft SQL Server, CUBE/PIVOT syntax for horizontal and vertical totals is:

SELECT EmpId, [2005], [2006], [2007], [ALL]
FROM (
  SELECT
    CASE WHEN GROUPING(EmpId)=0 THEN CAST(EmpId AS CHAR(7)) ELSE 'ALL' END AS EmpId,
    CASE WHEN GROUPING(Yr)=0 THEN CAST(Yr AS CHAR(7)) ELSE 'ALL' END AS Yr,
    SUM(Sales) AS Sales
  FROM Sales
  GROUP BY EmpId, Yr WITH CUBE
) AS s
PIVOT( SUM(Sales) FOR Yr IN ([2005], [2006], [2007], [ALL]) ) AS p

In MySQL, it's a two-step:

1. In an inner query, write one aggregating expression per reporting column,
2. In an outer query, build the horizontal sums:

SELECT 
  IFNULL(empId,'Totals') AS EmpId,       -- outer query labels rollup row
  sums.2005, sums.2006, sums.2007,       -- and calculates horizontal sums
  sums.2005 + sums.2006 + sums.2007 AS Sums
FROM (                                   -- inner query groups by employee
  SELECT                                 -- with an expression for each column
    EmpID,
    SUM(IF(Yr=2005,sales,0)) As '2005',
    SUM(IF(Yr=2006,sales,0)) As '2006',
    SUM(IF(Yr=2007,sales,0)) As '2007'
  FROM Sales
  GROUP BY EmpID WITH ROLLUP
) AS sums;
+--------+----------+----------+----------+-----------+
| EmpId  | 2005     | 2006     | 2007     | Sums      |
+--------+----------+----------+----------+-----------+
| 1      | 12000.00 | 18000.00 | 25000.00 |  55000.00 |
| 2      | 15000.00 |  6000.00 |     0.00 |  21000.00 |
| 3      |     0.00 | 20000.00 | 24000.00 |  44000.00 |
| Totals | 27000.00 | 44000.00 | 49000.00 | 120000.00 |
+--------+----------+----------+----------+-----------+

That approach works beautifully for the common need to tabulate monthly amounts by year, say from an order history table orderhist(orderdate date,amount decimal(10,2)):

SELECT 
  IfNull(Year,'Totals') Year, 
  Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, `Dec`, 
  Qty AS Count, 
  Yrly as 'Yrly Total'
FROM (
  SELECT 
    year(orderdate) AS 'Year',
    Round(Sum(CASE WHEN Month(orderdate)= 1 THEN amount ELSE 0 END),2) AS Jan,
    Round(Sum(CASE WHEN Month(orderdate)= 2 THEN amount ELSE 0 END),2) AS Feb,
    Round(Sum(CASE WHEN Month(orderdate)= 3 THEN amount ELSE 0 END),2) AS Mar,
    Round(Sum(CASE WHEN Month(orderdate)= 4 THEN amount ELSE 0 END),2) AS Apr,
    Round(Sum(CASE WHEN Month(orderdate)= 5 THEN amount ELSE 0 END),2) AS May,
    Round(Sum(CASE WHEN Month(orderdate)= 6 THEN amount ELSE 0 END),2) AS Jun,
    Round(Sum(CASE WHEN Month(orderdate)= 7 THEN amount ELSE 0 END),2) AS Jul,
    Round(Sum(CASE WHEN Month(orderdate)= 8 THEN amount ELSE 0 END),2) AS Aug,
    Round(Sum(CASE WHEN Month(orderdate)= 9 THEN amount ELSE 0 END),2) AS Sep,
    Round(Sum(CASE WHEN Month(orderdate)=10 THEN amount ELSE 0 END),2) AS Oct,
    Round(Sum(CASE WHEN Month(orderdate)=11 THEN amount ELSE 0 END),2) AS Nov,
    Round(Sum(CASE WHEN Month(orderdate)=12 THEN amount ELSE 0 END),2) AS `Dec`,
    Count(*) AS Qty,
    Round(Sum(amount),2) AS Yrly
  FROM orderhist
  GROUP BY year 
  WITH ROLLUP
) AS sums ;

More likely the required sums need to be calculated across joins. Build them in using a good text editor. Here is the above query for the orders and orderdetails table in the Northwind database:

SELECT 
  IfNull(Year,'Totals') Year, 
  Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, `Dec`, 
  quantity AS Count, 
  Yrly as 'Yrly Total'
FROM (
  SELECT 
    year(orderdate) AS 'Year',
    Round(Sum(If( Month(o.orderdate)= 1, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Jan,
    Round(Sum(If( Month(o.orderdate)= 2, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Feb,
    Round(Sum(If( Month(o.orderdate)= 3, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Mar,
    Round(Sum(If( Month(o.orderdate)= 4, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Apr,
    Round(Sum(If( Month(o.orderdate)= 5, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS May,
    Round(Sum(If( Month(o.orderdate)= 6, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Jun,
    Round(Sum(If( Month(o.orderdate)= 7, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Jul,
    Round(Sum(If( Month(o.orderdate)= 8, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Aug,
    Round(Sum(If( Month(o.orderdate)= 9, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Sep,
    Round(Sum(If( Month(o.orderdate)=10, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Oct,
    Round(Sum(If( Month(o.orderdate)=11, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS Nov,
    Round(Sum(If( Month(o.orderdate)=12, (d.unitprice*d.quantity)-d.discount, 0 )), 2 ) AS `Dec`,
    Count(*) AS quantity,
    Round(Sum((unitprice*quantity)-discount),2) AS Yrly
  FROM orders o
  JOIN orderdetails d USING(orderID)
  GROUP BY year 
  WITH ROLLUP
) AS sums ; 

With many columns or subtotals, pivot tables get complicated, time-consuming and error-prone. Automation is needed. Oracle and Microsoft SQL Server have CUBE syntax to simplify the job, especially for big pivot tables. MySQL doesn't. See "Automate pivot table queries" for how to roll your own cube with MySQL.

Return to the Artful Common Queries page