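The examples in this article use MySQL 5.7 as the database environment.
Duplicate data can arise for many reasons, such as bugs in the system, repeated submissions, or a requirement change (content that used to allow duplicates no longer does). Rather than list every cause, this article works through a concrete example of how to find and clean up duplicate data.
Prepare the following data in the user table from the companion article 《MySQL实战》 ("MySQL in Practice"):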
mysql> select id,username,mobile from t_user;
+-------+----------+-------------+
| id    | username | mobile      |
+-------+----------+-------------+
| 10001 | user1    | 13900000001 |
| 10002 | user2    | NULL        |
| 10003 | user3    | NULL        |
| 10004 | user4    | NULL        |
| 10005 | user5    | NULL        |
| 10006 | user6    | 13900000001 |
+-------+----------+-------------+
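Now we need to find the rows in the user table whose mobile number (the mobile column) is duplicated. Grouping by mobile and counting each group with the aggregate function count() gives the result we need.
# Find the mobile numbers that appear more than once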
mysql> select mobile,count(1) as c from t_user where mobile is not null group by mobile having c > 1;
+-------------+---+
| mobile      | c |
+-------------+---+
| 13900000001 | 2 |
+-------------+---+
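The grouped query only returns the duplicated values, not the rows that hold them. As a minimal sketch, assuming the same t_user table and sample data shown above, joining the grouped result back to the table lists every affected row; the alias d is just an illustrative name.

```sql
-- List every row whose mobile value occurs more than once.
SELECT u.id, u.username, u.mobile
FROM t_user AS u
JOIN (
    -- Same duplicate check as above, keeping only the mobile value.
    SELECT mobile
    FROM t_user
    WHERE mobile IS NOT NULL
    GROUP BY mobile
    HAVING COUNT(*) > 1
) AS d ON u.mobile = d.mobile
ORDER BY u.mobile, u.id;
```

With the sample data this should return the two rows with ids 10001 and 10006, which makes it easier to decide which record to keep.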
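Next, handle the duplicated mobile numbers as required, for example by setting mobile to NULL on the records with the larger ids, so that only the record with the smallest id keeps the number.
We will refine the SQL above step by step. Since the records with the larger ids are the ones to change, we first need the smallest id in each group of duplicates.
# min(id)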
mysql> select mobile,count(1) as c,min(id) as min_id from t_user where mobile is not null group by mobile having c > 1;
+-------------+---+--------+
| mobile      | c | min_id |
+-------------+---+--------+
| 13900000001 | 2 |  10001 |
+-------------+---+--------+
With the smallest id found, join t_user to that query result and run the update.
# update ... where id > ...
mysql> update t_user as u
-> join (
-> select mobile,count(1) as c,min(id) as min_id from t_user where mobile is not null group by mobile having c > 1
-> ) as a on u.mobile = a.mobile
-> set u.mobile = null
-> where u.id > a.min_id;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
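Because this UPDATE rewrites data in place, a more cautious variant is to preview the affected rows first and then run the change inside a transaction. The following is only a sketch under assumptions: it assumes t_user is an InnoDB table (transactions have no effect on MyISAM) and that the preview is run before the UPDATE.

```sql
-- Preview: rows whose mobile would be set to NULL
-- (same join and filter as the UPDATE).
SELECT u.id, u.username, u.mobile, a.min_id
FROM t_user AS u
JOIN (
    SELECT mobile, MIN(id) AS min_id
    FROM t_user
    WHERE mobile IS NOT NULL
    GROUP BY mobile
    HAVING COUNT(*) > 1
) AS a ON u.mobile = a.mobile
WHERE u.id > a.min_id;

-- Assumption: t_user uses InnoDB, so the change can be rolled back.
START TRANSACTION;

UPDATE t_user AS u
JOIN (
    SELECT mobile, MIN(id) AS min_id
    FROM t_user
    WHERE mobile IS NOT NULL
    GROUP BY mobile
    HAVING COUNT(*) > 1
) AS a ON u.mobile = a.mobile
SET u.mobile = NULL          -- qualified with u because both sides have a mobile column
WHERE u.id > a.min_id;

-- Re-run the duplicate check here, then choose one of:
COMMIT;       -- keep the change
-- ROLLBACK;  -- or undo it
```

With the sample data, the preview should return only the row with id 10006, matching the single row the UPDATE changes.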
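The UPDATE reports success, with one row matched and changed. Finally, check that the result is what we expected.
# Check whether any duplicated mobile values remain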
mysql> select mobile,count(1) as c from t_user where mobile is not null group by mobile having c > 1;
Empty set (0.00 sec)
# Verify again by looking at the data directly
mysql> select id,username,mobile from t_user;
+-------+----------+-------------+
| id    | username | mobile      |
+-------+----------+-------------+
| 10001 | user1    | 13900000001 |
| 10002 | user2    | NULL        |
| 10003 | user3    | NULL        |
| 10004 | user4    | NULL        |
| 10005 | user5    | NULL        |
| 10006 | user6    | NULL        |
+-------+----------+-------------+
6 rows in set (0.00 sec)
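Cleaning up existing rows does not stop new duplicates from being written later. If the requirement is now that mobile numbers must not repeat, one possible follow-up is a unique index on mobile. This is a sketch, not part of the original schema: the index name uk_t_user_mobile is hypothetical, and it assumes the cleanup above has already removed every duplicate.

```sql
-- Hypothetical index name; fails with a duplicate-entry error
-- if any duplicated mobile values still exist.
ALTER TABLE t_user
    ADD UNIQUE INDEX uk_t_user_mobile (mobile);

-- Confirm that the index was created.
SHOW INDEX FROM t_user;
```

In MySQL a unique index on a nullable column still allows multiple NULL values, so the rows whose mobile was set to NULL do not conflict with each other.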