MySQL/Pivot table
「透視表」或「交叉報表」(crosstab report),是對源數據表的選定的列作為「列標籤」(Column label),該列的值將作為透視表中的列名。從而實現「行轉列」。源數據表的其他列可以在透視表中保留(這稱為「列標籤」Column labels,即透視表的每行在該列有不同取值),也可以用於匯總計算(這稱為「匯總值」Summation values)。
MySQL不像SQL Server直接提供了pivot和unpivot關鍵字,所以需要自行創建。
使用case的實現
編輯可分為手工寫SQL語句的靜態實現或者寫存儲程序的動態實現。
CREATE TABLE product_sales (
product_name VARCHAR(100),
store_location VARCHAR(50),
num_sales INT
);
INSERT INTO product_sales (product_name, store_location, num_sales) VALUES
('Chair', 'North', 55),
('Desk', 'Central', 120),
('Couch', 'Central', 78),
('Chair', 'South', 23),
('Chair', 'South', 10),
('Chair', 'North', 98),
('Desk', 'West', 61),
('Couch', 'North', 180),
('Chair', 'South', 14),
('Desk', 'North', 45),
('Chair', 'North', 87),
('Chair', 'Central', 34),
('Desk', 'South', 42),
('Couch', 'West', 58),
('Couch', 'Central', 27),
('Chair', 'South', 91),
('Chair', 'West', 82),
('Chair', 'North', 37),
('Desk', 'North', 68),
('Couch', 'Central', 54),
('Chair', 'South', 81),
('Desk', 'North', 25),
('Chair', 'North', 46),
('Chair', 'Central', 121),
('Desk', 'South', 85),
('Couch', 'North', 43),
('Desk', 'West', 10),
('Chair', 'North', 5),
('Chair', 'Central', 16),
('Desk', 'South', 9),
('Couch', 'West', 22),
('Couch', 'Central', 59),
('Chair', 'South', 76),
('Chair', 'West', 48),
('Chair', 'North', 19),
('Desk', 'North', 3),
('Couch', 'West', 63),
('Chair', 'South', 81),
('Desk', 'North', 85),
('Chair', 'North', 90),
('Chair', 'Central', 47),
('Desk', 'West', 63),
('Couch', 'North', 28);
SELECT * FROM product_sales;
#以下为静态实现:
SELECT
product_name,
SUM(CASE
WHEN store_location = 'North' THEN num_sales ELSE 0 END
) AS north,
SUM(CASE
WHEN store_location = 'Central' THEN num_sales ELSE 0 END
) AS central,
SUM(CASE
WHEN store_location = 'South' THEN num_sales ELSE 0 END
) AS south,
SUM(CASE
WHEN store_location = 'West' THEN num_sales ELSE 0 END
) AS west
FROM product_sales
GROUP BY product_name;
#以下为动态实现:
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT CONCAT(
'SUM(
CASE WHEN store_location = "', store_location, '" THEN num_sales ELSE 0 END)
AS ', store_location)
)
INTO @sql
FROM product_sales;
select @sql;
SET @sql = CONCAT('SELECT product_name, ', @sql,
' FROM product_sales GROUP BY product_name');
SELECT @sql;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
==不使用if或case的實現==不使用"if", "case", or "GROUP_CONCAT". Yes, there is use for this..."if" statements sometimes cause problems when used in combination.
The simple secret, and it's also why they work in almost all databases, is the following functions:
- sign (x) returns -1,0, +1 for values x < 0, x = 0, x > 0 respectively
- abs( sign( x) ) returns 0 if x = 0 else, 1 if x > 0 or x < 0
- 1-abs( sign( x) ) complement of the above, since this returns 1 only if x = 0
Quick example: sign(-1) = -1, abs( sign(-1) ) = 1, 1-abs( sign(-1) ) = 0
Data for full example:
CREATE TABLE exams ( pkey int(11) NOT NULL auto_increment, name varchar(15), exam int, score int, PRIMARY KEY (pkey) ); insert into exams (name,exam,score) values ('Bob',1,75); insert into exams (name,exam,score) values ('Bob',2,77); insert into exams (name,exam,score) values ('Bob',3,78); insert into exams (name,exam,score) values ('Bob',4,80); insert into exams (name,exam,score) values ('Sue',1,90); insert into exams (name,exam,score) values ('Sue',2,97); insert into exams (name,exam,score) values ('Sue',3,98); insert into exams (name,exam,score) values ('Sue',4,99); mysql> select * from exams; +------+------+------+-------+ | pkey | name | exam | score | +------+------+------+-------+ | 1 | Bob | 1 | 75 | | 2 | Bob | 2 | 77 | | 3 | Bob | 3 | 78 | | 4 | Bob | 4 | 80 | | 5 | Sue | 1 | 90 | | 6 | Sue | 2 | 97 | | 7 | Sue | 3 | 98 | | 8 | Sue | 4 | 99 | +------+------+------+-------+ 8 rows in set (0.00 sec) mysql> select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4 from exams group by name; +------+-------+-------+-------+-------+ | name | exam1 | exam2 | exam3 | exam4 | +------+-------+-------+-------+-------+ | Bob | 75 | 77 | 78 | 80 | | Sue | 90 | 97 | 98 | 99 | +------+-------+-------+-------+-------+ 2 rows in set (0.00 sec)
Note, the above pivot table was created with one select statement.
Let's decompose to make the trick clearer, for the second exam:
mysql> select name, score, exam, exam-2, sign(exam-2), abs(sign(exam-2)), 1-abs(sign(exam-2)), score*(1-abs(sign(exam-2))) as exam2 from exams; +------+-------+------+--------+--------------+-------------------+---------------------+-------+ | name | score | exam | exam-2 | sign(exam-2) | abs(sign(exam-2)) | 1-abs(sign(exam-2)) | exam2 | +------+-------+------+--------+--------------+-------------------+---------------------+-------+ | Bob | 75 | 1 | -1 | -1 | 1 | 0 | 0 | | Bob | 77 | 2 | 0 | 0 | 0 | 1 | 77 | | Bob | 78 | 3 | 1 | 1 | 1 | 0 | 0 | | Bob | 80 | 4 | 2 | 1 | 1 | 0 | 0 | | Sue | 90 | 1 | -1 | -1 | 1 | 0 | 0 | | Sue | 97 | 2 | 0 | 0 | 0 | 1 | 97 | | Sue | 98 | 3 | 1 | 1 | 1 | 0 | 0 | | Sue | 99 | 4 | 2 | 1 | 1 | 0 | 0 | +------+-------+------+--------+--------------+-------------------+---------------------+-------+ 8 rows in set (0.00 sec)
You may think IF's would be clean but WATCH OUT! Look what the following gives (INCORRECT !!):
mysql> select name, if(exam=1,score,null) as exam1, if(exam=2,score,null) as exam2, if(exam=3,score,null) as exam3, if(exam=4,score,null) as exam4 from exams group by name; +------+-------+-------+-------+-------+ | name | exam1 | exam2 | exam3 | exam4 | +------+-------+-------+-------+-------+ | Bob | 75 | NULL | NULL | NULL | | Sue | 90 | NULL | NULL | NULL | +------+-------+-------+-------+-------+ 2 rows in set (0.00 sec)
Note: the following does work - is all the maths necessary after all?
mysql> SELECT name, SUM(IF(exam=1,score,NULL)) AS exam1, SUM(IF(exam=2,score,NULL)) AS exam2, SUM(IF(exam=3,score,NULL)) AS exam3, SUM(IF(exam=4,score,0)) AS exam4 FROM exams GROUP BY name; +------+-------+-------+-------+-------+ | name | exam1 | exam2 | exam3 | exam4 | +------+-------+-------+-------+-------+ | Bob | 75 | 77 | 78 | 80 | | Sue | 90 | 97 | 98 | 99 | +------+-------+-------+-------+-------+ 2 rows in set (0.00 sec) mysql> select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) as delta_1_2, sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) as delta_2_3, sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as delta_3_4 from exams group by name; +------+-------+-------+-------+-------+-----------+-----------+-----------+ | name | exam1 | exam2 | exam3 | exam4 | delta_1_2 | delta_2_3 | delta_3_4 | +------+-------+-------+-------+-------+-----------+-----------+-----------+ | Bob | 75 | 77 | 78 | 80 | 2 | 1 | 2 | | Sue | 90 | 97 | 98 | 99 | 7 | 1 | 1 | +------+-------+-------+-------+-------+-----------+-----------+-----------+ 2 rows in set (0.00 sec)
Above delta_1_2 shows the difference between the first and second exams, with the numbers being positive because both Bob and Sue improved their score with each exam. Calculating the deltas here shows it's possible to compare two rows, not columns which is easily done with the standard SQL statements but rows in the original table.
mysql>select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) as delta_1_2, sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) as delta_2_3, sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as delta_3_4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) + sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) + sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as TotalIncPoints from exams group by name; +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+ | name | exam1 | exam2 | exam3 | exam4 | delta_1_2 | delta_2_3 | delta_3_4 | TotalIncPoints | +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+ | Bob | 75 | 77 | 78 | 80 | 2 | 1 | 2 | 5 | | Sue | 90 | 97 | 98 | 99 | 7 | 1 | 1 | 9 | +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+ 2 rows in set (0.00 sec)
TotalIncPoints shows the sum of the deltas.
select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) as delta_1_2, sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) as delta_2_3, sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as delta_3_4, sum(score*(1-abs(sign(exam- 2)))) - sum(score*(1-abs(sign(exam- 1)))) + sum(score*(1-abs(sign(exam- 3)))) - sum(score*(1-abs(sign(exam- 2)))) + sum(score*(1-abs(sign(exam- 4)))) - sum(score*(1-abs(sign(exam- 3)))) as TotalIncPoints, (sum(score*(1-abs(sign(exam-1)))) + sum(score*(1-abs(sign(exam-2)))) + sum(score*(1-abs(sign(exam-3)))) + sum(score*(1-abs(sign(exam-4)))))/4 as AVG from exams group by name; +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+-------+ | name | exam1 | exam2 | exam3 | exam4 | delta_1_2 | delta_2_3 | delta_3_4 | TotalIncPoints | AVG | +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+-------+ | Bob | 75 | 77 | 78 | 80 | 2 | 1 | 2 | 5 | 77.50 | | Sue | 90 | 97 | 98 | 99 | 7 | 1 | 1 | 9 | 96.00 | +------+-------+-------+-------+-------+-----------+-----------+-----------+----------------+-------+ 2 rows in set (0.00 sec)
It's possible to combine Total Increasing Point TotalIncPoints with AVG. In fact, it's possible to combine all of the example cuts of the data into one SQL statement, which provides additional options for displaying data on your page
select name, sum(score*(1-abs(sign(exam-1)))) as exam1, sum(score*(1-abs(sign(exam-2)))) as exam2, sum(score*(1-abs(sign(exam-3)))) as exam3, sum(score*(1-abs(sign(exam-4)))) as exam4, (sum(score*(1-abs(sign(exam-1)))) + sum(score*(1-abs(sign(exam-2)))))/2 as AVG1_2, (sum(score*(1-abs(sign(exam-2)))) + sum(score*(1-abs(sign(exam-3)))))/2 as AVG2_3, (sum(score*(1-abs(sign(exam-3)))) + sum(score*(1-abs(sign(exam-4)))))/2 as AVG3_4, (sum(score*(1-abs(sign(exam-1)))) + sum(score*(1-abs(sign(exam-2)))) + sum(score*(1-abs(sign(exam-3)))) + sum(score*(1-abs(sign(exam-4)))))/4 as AVG from exams group by name; +------+-------+-------+-------+-------+--------+--------+--------+-------+ | name | exam1 | exam2 | exam3 | exam4 | AVG1_2 | AVG2_3 | AVG3_4 | AVG | +------+-------+-------+-------+-------+--------+--------+--------+-------+ | Bob | 75 | 77 | 78 | 80 | 76.00 | 77.50 | 79.00 | 77.50 | | Sue | 90 | 97 | 98 | 99 | 93.50 | 97.50 | 98.50 | 96.00 | +------+-------+-------+-------+-------+--------+--------+--------+-------+ 2 rows in set (0.00 sec)
Exam scores are listing along with moving averages...again it's all with one select statement.
Good article on "Cross tabulations" or de-normalizing data to show stats: http://dev.mysql.com/tech-resources/articles/wizard/print_version.html
ADOdb (PHP) can generate pivot tables using PivotTableSQL()
.
For Perl, check DBIx-SQLCrosstab.