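One requirement was to report month-over-month (MoM) and year-over-year (YoY) values by month. Pulling the numbers by hand each time is tedious and error-prone, so I turned the data into a report and am sharing the SQL here as a template. The SQL took me a long time to write; after checking its output against reference data, I found no problems with it. Some notes follow.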
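Definitions of YoY and MoM
Monthly YoY: (2021-01 - 2020-01) / 2020-01
Monthly MoM: (2021-02 - 2021-01) / 2021-01
When computing YoY, the ORDER BY inside the lead window must sort on both the month and the year. With only one sort key, rows that tie on that key come back in no guaranteed order, so lead can pick up the wrong row. Tracking this down took me a long time.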
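To make the ordering point concrete, here is a minimal sketch on made-up numbers. The `monthly` CTE, its values, and the `same_month_last_year` alias are illustrative assumptions, not real data; `substr` with a 1-based start is used for portability.

```sql
-- Illustrative data only: one business line, two months in two years.
WITH monthly AS (
    SELECT 'demo' AS lag, '2020-01' AS j_month, 100 AS order_price
    UNION ALL SELECT 'demo', '2020-02', 200
    UNION ALL SELECT 'demo', '2021-01', 150
    UNION ALL SELECT 'demo', '2021-02', 220
)
SELECT lag, j_month, order_price
     -- Sort by month first, then year, both descending, so that the row
     -- after each month is the same month one year earlier.
     , lead(order_price, 1) OVER (
           PARTITION BY lag
           ORDER BY substr(j_month, 6, 2) DESC, substr(j_month, 1, 4) DESC
       ) AS same_month_last_year
FROM monthly;
```

Here 2021-02 is paired with 200 (2020-02) and 2021-01 with 100 (2020-01), giving YoY values of 0.1 and 0.5. The 2020-02 row picks up 150 from 2021-01, a different month, because it is the earliest year in its month group. Rows like that are only harmless when the outer query filters them out, as the year filter in the queries in this post does.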
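SQL approach for YoY and MoM
YoY: partition the lead window by business line and use lead to pull the measure up from the following row. The rows must be sorted so that the same month in different years sits together.
MoM: lead is used here as well, over months sorted in descending order, so the following row is the previous month. Nothing special to watch for.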
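As a rough sketch of the two lead calls side by side, the query below computes both ratios in one pass over made-up data. Note that the YoY window here is a variant, not the ordering used in this post: it partitions by business line and calendar month, so lead never crosses into a different month. The `monthly` and `with_prev` CTEs, their values, and the `prev_*` aliases are assumptions for illustration.

```sql
-- Illustrative data only; `monthly` stands in for the per-month aggregate.
WITH monthly AS (
    SELECT 'demo' AS lag, '2020-01' AS j_month, 100 AS order_price
    UNION ALL SELECT 'demo', '2020-02', 200
    UNION ALL SELECT 'demo', '2020-12', 120
    UNION ALL SELECT 'demo', '2021-01', 150
    UNION ALL SELECT 'demo', '2021-02', 220
),
with_prev AS (
    SELECT lag, j_month, order_price
         -- MoM: months descending, so the following row is the previous month
         , lead(order_price, 1) OVER (
               PARTITION BY lag
               ORDER BY j_month DESC
           ) AS prev_month_price
         -- YoY variant: one partition per (business line, calendar month)
         , lead(order_price, 1) OVER (
               PARTITION BY lag, substr(j_month, 6, 2)
               ORDER BY substr(j_month, 1, 4) DESC
           ) AS prev_year_price
    FROM monthly
)
SELECT lag, j_month
     , (order_price - prev_month_price) * 1.0 / prev_month_price AS relative
     , (order_price - prev_year_price) * 1.0 / prev_year_price AS basis
FROM with_prev
WHERE substr(j_month, 1, 4) = '2021';
```

For 2021-01 this gives relative = 0.25 (150 against 120 in 2020-12) and basis = 0.5 (150 against 100 in 2020-01). Both windows assume no month is missing: with gaps, lead returns the nearest earlier row that exists, which may not be the intended comparison period.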
In my first version I put YoY and MoM on the same row. The drawback of this layout is that it is not very readable.
SELECT lag, j_month, basis, relative
FROM (
    SELECT lag, j_month, order_price
         , (order_price - lead_price_basis) * 1.0 / lead_price_basis AS basis  -- YoY
         , relative
    FROM (
        SELECT lag, j_month, order_price, relative
             -- sort by month, then year, so the following row is the same month of the previous year
             , lead(order_price, 1, NULL) OVER (PARTITION BY lag ORDER BY substring(j_month, 6, 7) DESC, substring(j_month, 0, 4) DESC) AS lead_price_basis
        FROM (
            SELECT lag, j_month, order_price
                 , (order_price - lead_price_relative) * 1.0 / lead_price_relative AS relative  -- MoM
            FROM (
                SELECT lag, j_month, order_price
                     -- months in descending order, so the following row is the previous month
                     , lead(order_price, 1, NULL) OVER (PARTITION BY lag ORDER BY j_month DESC) AS lead_price_relative
                FROM (
                    SELECT concat(lag_country, bu) AS lag, j_month, order_price
                    FROM (
                        SELECT bu
                             , CASE
                                   WHEN to_country_id = 1 THEN '國內'
                                   WHEN to_country_id <> 1 THEN '國外'
                               END AS lag_country
                             , substring(d, 0, 7) AS j_month
                             , round(SUM(order_price), 0) AS order_price
                        FROM table
                        WHERE d >= '2019-01-01'
                          AND bu IN ('機票', '酒店', '度假')
                          AND d <> '4000-01-01'
                        GROUP BY bu, substring(d, 0, 7), CASE
                                   WHEN to_country_id = 1 THEN '國內'
                                   WHEN to_country_id <> 1 THEN '國外'
                               END
                    ) a
                    ORDER BY lag, j_month DESC
                ) b
                WHERE lag IS NOT NULL
            ) c
        ) d
    ) e
) f
WHERE substring(j_month, 0, 4) = '2021'
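Later I used UNION ALL to put YoY and MoM on separate rows and then pivoted rows into columns, one column per month, which gave the result below. I think this is a very good way to present the report.
Report output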
SELECT lag, j_lag
     , MAX(CASE WHEN substring(j_month, 6, 7) = '01' THEN relative END) AS m1
     , MAX(CASE WHEN substring(j_month, 6, 7) = '02' THEN relative END) AS m2
     , MAX(CASE WHEN substring(j_month, 6, 7) = '03' THEN relative END) AS m3
     , MAX(CASE WHEN substring(j_month, 6, 7) = '04' THEN relative END) AS m4
     , MAX(CASE WHEN substring(j_month, 6, 7) = '05' THEN relative END) AS m5
     , MAX(CASE WHEN substring(j_month, 6, 7) = '06' THEN relative END) AS m6
     , MAX(CASE WHEN substring(j_month, 6, 7) = '07' THEN relative END) AS m7
     , MAX(CASE WHEN substring(j_month, 6, 7) = '08' THEN relative END) AS m8
     , MAX(CASE WHEN substring(j_month, 6, 7) = '09' THEN relative END) AS m9
     , MAX(CASE WHEN substring(j_month, 6, 7) = '10' THEN relative END) AS m10
     , MAX(CASE WHEN substring(j_month, 6, 7) = '11' THEN relative END) AS m11
     , MAX(CASE WHEN substring(j_month, 6, 7) = '12' THEN relative END) AS m12
FROM (
    -- MoM rows, labelled '環比'
    SELECT lag, '環比' AS j_lag, j_month
         , (order_price - lead_price_relative) * 1.0 / lead_price_relative AS relative
    FROM (
        SELECT lag, j_month, order_price
             , lead(order_price, 1, NULL) OVER (PARTITION BY lag ORDER BY j_month DESC) AS lead_price_relative
        FROM (
            SELECT concat(lag_country, bu) AS lag, j_month, order_price
            FROM (
                SELECT bu
                     , CASE
                           WHEN to_country_id = 1 THEN '國內'
                           WHEN to_country_id <> 1 THEN '國外'
                       END AS lag_country
                     , substring(d, 0, 7) AS j_month
                     , round(SUM(order_price), 0) AS order_price
                FROM table
                WHERE d >= '2019-01-01'
                  AND bu IN ('機票', '酒店', '度假')
                  AND d <> '4000-01-01'
                GROUP BY bu, substring(d, 0, 7), CASE
                           WHEN to_country_id = 1 THEN '國內'
                           WHEN to_country_id <> 1 THEN '國外'
                       END
            ) a
        ) b
        WHERE lag IS NOT NULL
    ) c
    UNION ALL
    -- YoY rows, labelled '同比'
    SELECT lag, '同比', j_month
         , (order_price - lead_price_basis) * 1.0 / lead_price_basis
    FROM (
        SELECT lag, j_month, order_price
             , lead(order_price, 1, NULL) OVER (PARTITION BY lag ORDER BY substring(j_month, 6, 7) DESC, substring(j_month, 0, 4) DESC) AS lead_price_basis
        FROM (
            SELECT concat(lag_country, bu) AS lag, j_month, order_price
            FROM (
                SELECT bu
                     , CASE
                           WHEN to_country_id = 1 THEN '國內'
                           WHEN to_country_id <> 1 THEN '國外'
                       END AS lag_country
                     , substring(d, 0, 7) AS j_month
                     , round(SUM(order_price), 0) AS order_price
                FROM table
                WHERE d >= '2019-01-01'
                  AND bu IN ('機票', '酒店', '度假')
                  AND d <> '4000-01-01'
                GROUP BY bu, substring(d, 0, 7), CASE
                           WHEN to_country_id = 1 THEN '國內'
                           WHEN to_country_id <> 1 THEN '國外'
                       END
            ) a
        ) b
        WHERE lag IS NOT NULL
    ) c
) t  -- the derived table needs an alias
WHERE substring(j_month, 0, 4) = '2021'
GROUP BY lag, j_lag
ORDER BY substring(lag, 3, 4), substring(lag, 0, 2), j_lag
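We also need YoY values aggregated over several months. Adding such totals directly to the reports above is cumbersome, and the requirement changes often. So the sum of the measure for each year is turned into its own report, split by business type. This report is simple: compute each year separately, UNION ALL the results, then pivot rows into columns. With this table in place, multi-month YoY becomes easy to compute and there is no need to pull the data by hand each time. The report looks like this:
Report layout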
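The query behind this yearly report is not shown in the post. The following is only a minimal sketch of how such a report might be built, on made-up data. It uses a single GROUP BY on the year in place of the per-year UNION ALL described above, which should give the same rows; the `monthly` and `yearly` CTEs, the values, and the `y2020`/`y2021` column names are assumptions.

```sql
-- Illustrative data only; `monthly` stands in for the per-month aggregate.
WITH monthly AS (
    SELECT 'demo' AS lag, '2020-01' AS j_month, 100 AS order_price
    UNION ALL SELECT 'demo', '2020-02', 200
    UNION ALL SELECT 'demo', '2020-12', 120
    UNION ALL SELECT 'demo', '2021-01', 150
    UNION ALL SELECT 'demo', '2021-02', 220
),
yearly AS (
    -- one row per business line and year
    SELECT lag, substr(j_month, 1, 4) AS j_year, SUM(order_price) AS order_price
    FROM monthly
    GROUP BY lag, substr(j_month, 1, 4)
)
-- pivot: one column per year
SELECT lag
     , MAX(CASE WHEN j_year = '2020' THEN order_price END) AS y2020
     , MAX(CASE WHEN j_year = '2021' THEN order_price END) AS y2021
FROM yearly
GROUP BY lag;
```

On this data the result is one row: demo, 420, 370. To compare only a set of months across years, one option is to filter `monthly` to those months (for example `substr(j_month, 6, 2) IN ('01', '02')`) before summing, then compute (y2021 - y2020) / y2020 from the pivoted columns.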
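After finishing this task, I realized that SQL design matters as much as SQL optimization. If a query's result is not readable and needs a second round of processing, the whole workflow becomes very inefficient.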