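The examples in this article use MySQL 5.7 as the database environment.
Duplicate data can arise for many reasons, such as bugs in the system, repeated submissions, or changed requirements (content that was once allowed to repeat no longer is). Rather than list every cause, this article works through a concrete example of how to resolve duplicate data.
Prepare the following data in the user table from another article, 《MySQL實戰》: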
mysql> select id,username,mobile from t_user;
+-------+----------+-------------+
| id    | username | mobile      |
+-------+----------+-------------+
| 10001 | user1    | 13900000001 |
| 10002 | user2    | NULL        |
| 10003 | user3    | NULL        |
| 10004 | user4    | NULL        |
| 10005 | user5    | NULL        |
| 10006 | user6    | 13900000001 |
+-------+----------+-------------+
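Now we need to check the user table for duplicated values in the phone number column mobile. Applying the aggregate function count() with group by on the mobile field finds them.
# Find the phone numbers that appear more than once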
mysql> select mobile,count(1) as c from t_user where mobile is not null group by mobile having c > 1;
+-------------+---+
| mobile      | c |
+-------------+---+
| 13900000001 | 2 |
+-------------+---+
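The grouped query only reports which phone numbers are duplicated, not which users hold them. As a minimal sketch (the alias d is chosen here for illustration), the grouped result can be joined back to t_user to list the affected rows side by side. With the sample data above this should return the rows for user1 and user6.

```sql
-- List every row whose mobile appears more than once,
-- ordered so that rows sharing a number sit next to each other.
select u.id, u.username, u.mobile
from t_user as u
join (
    select mobile
    from t_user
    where mobile is not null
    group by mobile
    having count(*) > 1
) as d on u.mobile = d.mobile
order by u.mobile, u.id;
```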
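Next, handle the duplicated phone numbers as required, for example by setting mobile to null on the records with the larger id.
We refine the SQL above step by step. Since the records with the larger id are the ones to change, we first need to find the smallest id for each duplicated phone number.
# min(id)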
mysql> select mobile,count(1) as c,min(id) as min_id from t_user where mobile is not null group by mobile having c > 1;
+-------------+---+--------+
| mobile      | c | min_id |
+-------------+---+--------+
| 13900000001 | 2 |  10001 |
+-------------+---+--------+
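Before changing any data, it can help to preview exactly which rows a cleanup based on min_id would touch. The query below is a sketch that reuses the same join condition and filter as a select, so nothing is modified. With the sample data it should return only the row for user6 (id 10006).

```sql
-- Preview: rows that share a duplicated mobile but are not the smallest id.
-- These are the rows whose mobile would be set to null.
select u.id, u.username, u.mobile, a.min_id
from t_user as u
join (
    select mobile, min(id) as min_id
    from t_user
    where mobile is not null
    group by mobile
    having count(*) > 1
) as a on u.mobile = a.mobile
where u.id > a.min_id;
```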
With the smallest id found, join t_user with the query result and run the update.
# update ... where id > ...
mysql> update t_user as u
-> join (
-> select mobile,count(1) as c,min(id) as min_id from t_user where mobile is not null group by mobile having c > 1
-> ) as a on u.mobile = a.mobile
-> set u.mobile = null
-> where u.id > a.min_id;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
The statement reports success. Finally, check whether the result is what we expected.
# Check whether any duplicated mobile values remain
mysql> select mobile,count(1) as c from t_user where mobile is not null group by mobile having c > 1;
Empty set (0.00 sec)
# Verify again by looking at the data directly
mysql> select id,username,mobile from t_user;
+-------+----------+-------------+
| id    | username | mobile      |
+-------+----------+-------------+
| 10001 | user1    | 13900000001 |
| 10002 | user2    | NULL        |
| 10003 | user3    | NULL        |
| 10004 | user4    | NULL        |
| 10005 | user5    | NULL        |
| 10006 | user6    | NULL        |
+-------+----------+-------------+
6 rows in set (0.00 sec)
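Once the existing duplicates are cleaned up, one possible way to keep them from coming back is a unique index on mobile. This is only a sketch under assumptions not stated in the article: that mobile should stay unique from now on, that t_user has no conflicting index already, and that the index name uk_mobile (hypothetical) is free.

```sql
-- Assumption: mobile must remain unique; uk_mobile is a hypothetical index name.
-- A unique index in MySQL still permits multiple NULL values in a nullable column,
-- so the rows whose mobile is null do not conflict with each other.
alter table t_user add unique index uk_mobile (mobile);
```

The statement fails with a duplicate-entry error if any duplicated non-null mobile still exists, so it also serves as a final check on the cleanup.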