MySQL/Pivot table
“透视表”或“交叉报表”(crosstab report),是对源数据表的选定的列作为“列标签”(Column label),该列的值将作为透视表中的列名。从而实现“行转列”。源数据表的其他列可以在透视表中保留(这称为“列标签”Column labels,即透视表的每行在该列有不同取值),也可以用于汇总计算(这称为“汇总值”Summation values)。
MySQL不像SQL Server直接提供了pivot和unpivot关键字,所以需要自行创建。
使用case的实现
编辑可分为手工写SQL语句的静态实现或者写存储程序的动态实现。
CREATE TABLE product_sales (
product_name VARCHAR(100),
store_location VARCHAR(50),
num_sales INT
);
INSERT INTO product_sales (product_name, store_location, num_sales) VALUES
('Chair', 'North', 55),
('Desk', 'Central', 120),
('Couch', 'Central', 78),
('Chair', 'South', 23),
('Chair', 'South', 10),
('Chair', 'North', 98),
('Desk', 'West', 61),
('Couch', 'North', 180),
('Chair', 'South', 14),
('Desk', 'North', 45),
('Chair', 'North', 87),
('Chair', 'Central', 34),
('Desk', 'South', 42),
('Couch', 'West', 58),
('Couch', 'Central', 27),
('Chair', 'South', 91),
('Chair', 'West', 82),
('Chair', 'North', 37),
('Desk', 'North', 68),
('Couch', 'Central', 54),
('Chair', 'South', 81),
('Desk', 'North', 25),
('Chair', 'North', 46),
('Chair', 'Central', 121),
('Desk', 'South', 85),
('Couch', 'North', 43),
('Desk', 'West', 10),
('Chair', 'North', 5),
('Chair', 'Central', 16),
('Desk', 'South', 9),
('Couch', 'West', 22),
('Couch', 'Central', 59),
('Chair', 'South', 76),
('Chair', 'West', 48),
('Chair', 'North', 19),
('Desk', 'North', 3),
('Couch', 'West', 63),
('Chair', 'South', 81),
('Desk', 'North', 85),
('Chair', 'North', 90),
('Chair', 'Central', 47),
('Desk', 'West', 63),
('Couch', 'North', 28);
SELECT * FROM product_sales;
#以下为静态实现:
SELECT
product_name,
SUM(CASE
WHEN store_location = 'North' THEN num_sales ELSE 0 END
) AS north,
SUM(CASE
WHEN store_location = 'Central' THEN num_sales ELSE 0 END
) AS central,
SUM(CASE
WHEN store_location = 'South' THEN num_sales ELSE 0 END
) AS south,
SUM(CASE
WHEN store_location = 'West' THEN num_sales ELSE 0 END
) AS west
FROM product_sales
GROUP BY product_name;
#以下为动态实现:
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT CONCAT(
'SUM(
CASE WHEN store_location = "', store_location, '" THEN num_sales ELSE 0 END)
AS ', store_location)
)
INTO @sql
FROM product_sales;
select @sql;
SET @sql = CONCAT('SELECT product_name, ', @sql,
' FROM product_sales GROUP BY product_name');
SELECT @sql;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
==不使用if或case的实现==不使用"if", "case", or "GROUP_CONCAT". Yes, there is use for this..."if" statements sometimes cause problems when used in combination.
The simple secret, and it's also why they work in almost all databases, is the following functions:
- sign (x) returns -1,0, +1 for values x < 0, x = 0, x > 0 respectively
- abs( sign( x) ) returns 0 if x = 0 else, 1 if x > 0 or x < 0
- 1-abs( sign( x) ) complement of the above, since this returns 1 only if x = 0
Quick example: sign(-1) = -1, abs( sign(-1) ) = 1, 1-abs( sign(-1) ) = 0
Data for full example:
CREATE TABLE exams ( pkey int(11) NOT NULL auto_increment, name varchar(15), exam int, score int, PRIMARY KEY (pkey) ); insert into exams (name,exam,score) values ('Bob',1,75); insert into exams (name,exam,score) values ('Bob',2,77); insert into exams (name,exam,score) values ('Bob',3,78); insert into exams (name,exam,score) values ('Bob',4,80); insert into exams (name,exam,score) values ('Sue',1,90); insert into exams (name,exam,score) values ('Sue',2,97); insert into exams (name,exam,score) values ('Sue',3,98); insert into exams (name,exam,score) values ('Sue',4,99); mysql> select * from exams; +------+------+------+-------+ | pkey | name | exam | score | +------+------+------+-------+ | 1 | Bob | 1 | 75 | | 2 | Bob | 2 | 77 | | 3 | Bob | 3 | 78 | | 4 | Bob | 4 | 80 | | 5 | Sue | 1 | 90 | | 6 | Sue | 2 | 97 | | 7 | Sue | 3 | 98 | | 8 | Sue | 4 | 99 | +------+------+------+-------+ 8 rows in set (0.00 sec) mysql> select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4 from exams group by name; +------+-------+-------+-------+-------+ | name | exam1 | exam2 | exam3 | exam4 | +------+-------+-------+-------+-------+ | Bob | 75 | 77 | 78 | 80 | | Sue | 90 | 97 | 98 | 99 | +------+-------+-------+-------+-------+ 2 rows in set (0.00 sec)
Note, the above pivot table was created with one select statement.
Let's decompose to make the trick clearer, for the second exam:
mysql> select name, score, exam, exam-2, sign(exam-2), abs(sign(exam-2)), 1-abs(sign(exam-2)), score*(1-abs(sign(exam-2))) as exam2 from exams; +------+-------+------+--------+--------------+-------------------+---------------------+-------+ | name | score | exam | exam-2 | sign(exam-2) | abs(sign(exam-2)) | 1-abs(sign(exam-2)) | exam2 | +------+-------+------+--------+--------------+-------------------+---------------------+-------+ | Bob | 75 | 1 | -1 | -1 | 1 | 0 | 0 | | Bob | 77 | 2 | 0 | 0 | 0 | 1 | 77 | | Bob | 78 | 3 | 1 | 1 | 1 | 0 | 0 | | Bob | 80 | 4 | 2 | 1 | 1 | 0 | 0 | | Sue | 90 | 1 | -1 | -1 | 1 | 0 | 0 | | Sue | 97 | 2 | 0 | 0 | 0 | 1 | 97 | | Sue | 98 | 3 | 1 | 1 | 1 | 0 | 0 | | Sue | 99 | 4 | 2 | 1 | 1 | 0 | 0 | +------+-------+------+--------+--------------+-------------------+---------------------+-------+ 8 rows in set (0.00 sec)
You may think IF's would be clean but WATCH OUT! Look what the following gives (INCORRECT !!):
mysql> select name, if(exam=1,score,null) as exam1, if(exam=2,score,null) as exam2, if(exam=3,score,null) as exam3, if(exam=4,score,null) as exam4 from exams group by name; +------+-------+-------+-------+-------+ | name | exam1 | exam2 | exam3 | exam4 | +------+-------+-------+-------+-------+ | Bob | 75 | NULL | NULL | NULL | | Sue | 90 | NULL | NULL | NULL | +------+-------+-------+-------+-------+ 2 rows in set (0.00 sec)
Note: the following does work - is all the maths necessary after all?
mysql> SELECT name, SUM(IF(exam=1,score,NULL)) AS exam1, SUM(IF(exam=2,score,NULL)) AS exam2, SUM(IF(exam=3,score,NULL)) AS exam3, SUM(IF(exam=4,score,0)) AS exam4 FROM exams GROUP BY name; +------+-------+-------+-------+-------+ | name | exam1 | exam2 | exam3 | exam4 | +------+-------+-------+-------+-------+ | Bob | 75 | 77 | 78 | 80 | | Sue | 90 | 97 | 98 | 99 | +------+-------+-------+-------+-------+ 2 rows in set (0.00 sec) mysql> select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) as delta_1_2, sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) as delta_2_3, sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as delta_3_4 from exams group by name; +------+-------+-------+-------+-------+-----------+-----------+-----------+ | name | exam1 | exam2 | exam3 | exam4 | delta_1_2 | delta_2_3 | delta_3_4 | +------+-------+-------+-------+-------+-----------+-----------+-----------+ | Bob | 75 | 77 | 78 | 80 | 2 | 1 | 2 | | Sue | 90 | 97 | 98 | 99 | 7 | 1 | 1 | +------+-------+-------+-------+-------+-----------+-----------+-----------+ 2 rows in set (0.00 sec)
Above delta_1_2 shows the difference between the first and second exams, with the numbers being positive because both Bob and Sue improved their score with each exam. Calculating the deltas here shows it's possible to compare two rows, not columns which is easily done with the standard SQL statements but rows in the original table.
mysql>select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) as delta_1_2, sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) as delta_2_3, sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as delta_3_4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) + sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) + sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as TotalIncPoints from exams group by name; +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+ | name | exam1 | exam2 | exam3 | exam4 | delta_1_2 | delta_2_3 | delta_3_4 | TotalIncPoints | +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+ | Bob | 75 | 77 | 78 | 80 | 2 | 1 | 2 | 5 | | Sue | 90 | 97 | 98 | 99 | 7 | 1 | 1 | 9 | +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+ 2 rows in set (0.00 sec)
TotalIncPoints shows the sum of the deltas.
select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) as delta_1_2, sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) as delta_2_3, sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as delta_3_4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) + sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) + sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as TotalIncPoints, (sum(score*(1-abs(sign(exam-1)))) + sum(score*(1-abs(sign(exam-2)))) + sum(score*(1-abs(sign(exam-3)))) + sum(score*(1-abs(sign(exam-4)))))/4 as AVG from exams group by name; +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+-------+ | name | exam1 | exam2 | exam3 | exam4 | delta_1_2 | delta_2_3 | delta_3_4 | TotalIncPoints | AVG | +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+-------+ | Bob | 75 | 77 | 78 | 80 | 2 | 1 | 2 | 5 | 77.50 | | Sue | 90 | 97 | 98 | 99 | 7 | 1 | 1 | 9 | 96.00 | +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+-------+ 2 rows in set (0.00 sec)
It's possible to combine Total Increasing Point TotalIncPoints with AVG. In fact, it's possible to combine all of the example cuts of the data into one SQL statement, which provides additional options for displaying data on your page
select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4, (sum(score*(1-abs(sign(exam-1)))) + sum(score*(1-abs(sign(exam-2)))))/2 as AVG1_2, (sum(score*(1-abs(sign(exam-2)))) + sum(score*(1-abs(sign(exam-3)))))/2 as AVG2_3, (sum(score*(1-abs(sign(exam-3)))) + sum(score*(1-abs(sign(exam-4)))))/2 as AVG3_4, (sum(score*(1-abs(sign(exam-1)))) + sum(score*(1-abs(sign(exam-2)))) + sum(score*(1-abs(sign(exam-3)))) + sum(score*(1-abs(sign(exam-4)))))/4 as AVG from exams group by name; +------+-------+-------+-------+-------+--------+--------+--------+-------+ | name | exam1 | exam2 | exam3 | exam4 | AVG1_2 | AVG2_3 | AVG3_4 | AVG | +------+-------+-------+-------+-------+--------+--------+--------+-------+ | Bob | 75 | 77 | 78 | 80 | 76.00 | 77.50 | 79.00 | 77.50 | | Sue | 90 | 97 | 98 | 99 | 93.50 | 97.50 | 98.50 | 96.00 | +------+-------+-------+-------+-------+--------+--------+--------+-------+ 2 rows in set (0.00 sec)
Exam scores are listing along with moving averages...again it's all with one select statement.
Good article on "Cross tabulations" or de-normalizing data to show stats: http://dev.mysql.com/tech-resources/articles/wizard/print_version.html
ADOdb (PHP) can generate pivot tables using PivotTableSQL()
.
For Perl, check DBIx-SQLCrosstab.