SQL: Monthly Month-over-Month and Year-over-Year by Group

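I had a requirement to report month-over-month (MoM) and year-over-year (YoY) changes by month. Pulling the numbers by hand every time is tedious and error-prone, so I turned the data into a report. The SQL below is a template. It took me a long time to write, and after checking its output against the source data I found no discrepancies. A few points to note first: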

  1. Definitions of YoY and MoM
    Monthly YoY: (2021-01 - 2020-01) / 2020-01
    Monthly MoM: (2021-02 - 2021-01) / 2021-01

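  2. When computing YoY, the ORDER BY inside lead's OVER clause must sort by both the month and the year. If it names only one sort key (the month), rows that share a month come back in no defined order across years, and lead picks up the wrong row. This took me a long time to track down.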

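  3. The SQL approach for YoY and MoM
    YoY: partition by business line, then use lead to pull the next row's measure up onto the current row. The sort must place rows for the same month in different years next to each other.
    MoM: lead is used in the same way, with rows sorted by month in descending order. There is nothing special to watch for here.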
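
The two sort orders in point 3 can be illustrated outside the database. The following is a minimal Python sketch, not the author's implementation: the monthly figures are made up, and `lead_ratios` and `only_year` are hypothetical helpers that stand in for `lead(order_price, 1)` and the outer `WHERE` filter on the year.

```python
# Hypothetical monthly totals for one group; the figures are made up.
monthly = {
    "2020-01": 100.0, "2020-02": 120.0, "2020-12": 200.0,
    "2021-01": 150.0, "2021-02": 90.0,
}

def lead_ratios(rows):
    """Compare each (month, value) row with the row that follows it, like lead(value, 1)."""
    return {
        month: (value - next_value) / next_value
        for (month, value), (_, next_value) in zip(rows, rows[1:])
    }

def only_year(ratios, year):
    """Keep the months of one year, like the outer WHERE filter."""
    return {m: r for m, r in ratios.items() if m.startswith(year)}

# MoM: newest month first, so the following row is the previous month.
mom_rows = sorted(monthly.items(), key=lambda kv: kv[0], reverse=True)
mom = only_year(lead_ratios(mom_rows), "2021")

# YoY: sort by month part, then year part, both descending,
# so the following row is the same month one year earlier.
yoy_rows = sorted(monthly.items(), key=lambda kv: (kv[0][5:7], kv[0][:4]), reverse=True)
yoy = only_year(lead_ratios(yoy_rows), "2021")

print(mom)
print(yoy)
```

With these sample figures, 2021-01 has a MoM of -0.25 (against 2020-12) and a YoY of 0.5 (against 2020-01). Note that the approach relies on the neighbouring row actually being the intended month: the 2020 rows in this sample have gaps, and their ratios would be wrong if they were not filtered out.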

My first version put YoY and MoM on the same row. The drawback of this layout is that it is not very readable.

-- Combined version: MoM (relative) and YoY (basis) side by side for each month of 2021.
-- Both lead() calls assume each group has a row for every month involved (no gaps).
SELECT lag, j_month, basis, relative
FROM (
    SELECT lag, j_month, order_price
        -- YoY: (this month - same month last year) / same month last year
        , (order_price - lead_price_basis) * 1.0 / lead_price_basis AS basis
        , relative
    FROM (
        SELECT lag, j_month, order_price, relative
            -- Sort by month, then year, so the next row is the same month of the previous year
            , lead(order_price, 1, NULL) OVER (PARTITION BY lag ORDER BY substring(j_month, 6, 2) DESC, substring(j_month, 1, 4) DESC) AS lead_price_basis
        FROM (
            SELECT lag, j_month, order_price
                -- MoM: (this month - previous month) / previous month
                , (order_price - lead_price_relative) * 1.0 / lead_price_relative AS relative
            FROM (
                SELECT lag, j_month, order_price
                    -- Newest month first, so the next row is the previous month
                    , lead(order_price, 1, NULL) OVER (PARTITION BY lag ORDER BY j_month DESC) AS lead_price_relative
                FROM (
                    -- Group label = region + business line, e.g. '國內機票'
                    SELECT concat(lag_country, bu) AS lag, j_month, order_price
                    FROM (
                        SELECT bu
                            , CASE
                                WHEN to_country_id = 1 THEN '國內'   -- domestic
                                WHEN to_country_id <> 1 THEN '國外'  -- overseas
                            END AS lag_country
                            , substring(d, 1, 7) AS j_month          -- 'YYYY-MM'
                            , round(SUM(order_price), 0) AS order_price
                        FROM table  -- stands in for the real source table
                        WHERE d >= '2019-01-01'
                            AND bu IN ('機票', '酒店', '度假')  -- flights, hotels, vacations
                            AND d <> '4000-01-01'
                        GROUP BY bu, substring(d, 1, 7), CASE
                                WHEN to_country_id = 1 THEN '國內'
                                WHEN to_country_id <> 1 THEN '國外'
                            END
                    ) a
                ) b
                WHERE lag IS NOT NULL
            ) c
        ) d
    ) e
) f
WHERE substring(j_month, 1, 4) = '2021'

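Later I used UNION ALL to output YoY and MoM as separate rows, then pivoted the months from rows into columns, which gives the layout shown in the figure below. I find this a very good way to present the report.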


[Figure: the resulting report]
-- One row per group and metric, with the months of 2021 pivoted into columns m1..m12.
SELECT lag, j_lag,
    MAX(CASE WHEN substring(j_month, 6, 2) = '01' THEN relative END) AS m1,
    MAX(CASE WHEN substring(j_month, 6, 2) = '02' THEN relative END) AS m2,
    MAX(CASE WHEN substring(j_month, 6, 2) = '03' THEN relative END) AS m3,
    MAX(CASE WHEN substring(j_month, 6, 2) = '04' THEN relative END) AS m4,
    MAX(CASE WHEN substring(j_month, 6, 2) = '05' THEN relative END) AS m5,
    MAX(CASE WHEN substring(j_month, 6, 2) = '06' THEN relative END) AS m6,
    MAX(CASE WHEN substring(j_month, 6, 2) = '07' THEN relative END) AS m7,
    MAX(CASE WHEN substring(j_month, 6, 2) = '08' THEN relative END) AS m8,
    MAX(CASE WHEN substring(j_month, 6, 2) = '09' THEN relative END) AS m9,
    MAX(CASE WHEN substring(j_month, 6, 2) = '10' THEN relative END) AS m10,
    MAX(CASE WHEN substring(j_month, 6, 2) = '11' THEN relative END) AS m11,
    MAX(CASE WHEN substring(j_month, 6, 2) = '12' THEN relative END) AS m12
FROM (
    -- MoM branch; the label '環比' means month-over-month
    SELECT lag, '環比' AS j_lag, j_month
        , (order_price - lead_price_relative) * 1.0 / lead_price_relative AS relative
    FROM (
        SELECT lag, j_month, order_price
            , lead(order_price, 1, NULL) OVER (PARTITION BY lag ORDER BY j_month DESC) AS lead_price_relative
        FROM (
            SELECT concat(lag_country, bu) AS lag, j_month, order_price
            FROM (
                SELECT bu
                    , CASE
                        WHEN to_country_id = 1 THEN '國內'   -- domestic
                        WHEN to_country_id <> 1 THEN '國外'  -- overseas
                    END AS lag_country
                    , substring(d, 1, 7) AS j_month          -- 'YYYY-MM'
                    , round(SUM(order_price), 0) AS order_price
                FROM table  -- stands in for the real source table
                WHERE d >= '2019-01-01'
                    AND bu IN ('機票', '酒店', '度假')  -- flights, hotels, vacations
                    AND d <> '4000-01-01'
                GROUP BY bu, substring(d, 1, 7), CASE
                        WHEN to_country_id = 1 THEN '國內'
                        WHEN to_country_id <> 1 THEN '國外'
                    END
            ) a
        ) b
        WHERE lag IS NOT NULL
    ) c
    UNION ALL
    -- YoY branch; the label '同比' means year-over-year
    SELECT lag, '同比' AS j_lag, j_month
        , (order_price - lead_price_basis) * 1.0 / lead_price_basis AS relative
    FROM (
        SELECT lag, j_month, order_price
            -- Sort by month, then year, so the next row is the same month of the previous year
            , lead(order_price, 1, NULL) OVER (PARTITION BY lag ORDER BY substring(j_month, 6, 2) DESC, substring(j_month, 1, 4) DESC) AS lead_price_basis
        FROM (
            SELECT concat(lag_country, bu) AS lag, j_month, order_price
            FROM (
                SELECT bu
                    , CASE
                        WHEN to_country_id = 1 THEN '國內'
                        WHEN to_country_id <> 1 THEN '國外'
                    END AS lag_country
                    , substring(d, 1, 7) AS j_month
                    , round(SUM(order_price), 0) AS order_price
                FROM table
                WHERE d >= '2019-01-01'
                    AND bu IN ('機票', '酒店', '度假')
                    AND d <> '4000-01-01'
                GROUP BY bu, substring(d, 1, 7), CASE
                        WHEN to_country_id = 1 THEN '國內'
                        WHEN to_country_id <> 1 THEN '國外'
                    END
            ) a
        ) b
        WHERE lag IS NOT NULL
    ) c
) t
WHERE substring(j_month, 1, 4) = '2021'
GROUP BY lag, j_lag
-- Order by business line (characters 3-4 of lag), then region (characters 1-2), then metric
ORDER BY substring(lag, 3, 2), substring(lag, 1, 2), j_lag
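
The row-to-column step in the query above (one `MAX(CASE WHEN ...)` per month) may be easier to follow in isolation. The following Python sketch is an illustration under assumptions, not part of the original report: the rows are made up, the labels "A", "MoM" and "YoY" are placeholders, and `pivot_months` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical long-format rows: (group, metric, month, ratio). The values are made up.
rows = [
    ("A", "MoM", "2021-01", -0.25),
    ("A", "MoM", "2021-02", -0.40),
    ("A", "YoY", "2021-01", 0.50),
    ("A", "YoY", "2021-02", -0.25),
]

def pivot_months(rows):
    """Turn one row per month into one row per (group, metric) with columns m1..m12."""
    # Every output row starts with all twelve month columns empty (None, like SQL NULL).
    table = defaultdict(lambda: {f"m{i}": None for i in range(1, 13)})
    for group, metric, month, ratio in rows:
        column = f"m{int(month[5:7])}"  # '2021-02' -> 'm2'
        table[(group, metric)][column] = ratio
    return dict(table)

report = pivot_months(rows)
for key, columns in report.items():
    print(key, columns)
```

Months with no data stay empty, which matches what `MAX` over a `CASE` without an `ELSE` returns in SQL when no row matches.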

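We also need YoY figures over multi-month totals. Adding such totals directly to the reports above is awkward, and the requirement changes often. So I built a second report that holds sum(measure) for each year, broken down by business type. This one is simple: aggregate each year separately, combine the results with UNION ALL, and pivot rows to columns. With that table in place, a multi-month YoY is easy to compute, and there is no need to pull the data again each time. The report looks like this: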


[Figure: the summary report]
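
Computing a multi-month YoY from such a summary could look like the Python sketch below. The layout of the summary is an assumption (one total per year and month for a single business type), the figures are made up, and `multi_month_yoy` is a hypothetical helper.

```python
# Hypothetical summary for one business type: totals[year][month] = sum(measure).
# The figures are made up.
totals = {
    2020: {1: 100.0, 2: 120.0, 3: 80.0},
    2021: {1: 150.0, 2: 90.0, 3: 120.0},
}

def multi_month_yoy(totals, year, months):
    """YoY of the combined total over `months` in `year` versus the same months one year earlier."""
    current = sum(totals[year][m] for m in months)
    previous = sum(totals[year - 1][m] for m in months)
    return (current - previous) / previous

# First quarter of 2021 against the first quarter of 2020.
q1_yoy = multi_month_yoy(totals, 2021, [1, 2, 3])
print(q1_yoy)
```

The ratio is taken over the summed totals (360 against 300 here), not as an average of the monthly YoY values, which would generally give a different number.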

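Working through this requirement, I found that the design of a SQL query matters as much as its optimization. If the result is not readable as it stands and needs a second round of processing, the whole workflow becomes inefficient.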
