Group Replication
[TOC]
About Group Replication
- Asynchronous replication
- Semisynchronous replication
- Group Replication
Group Replication is group-based replication, not synchronous replication, although it is eventually synchronized. More precisely, transactions are delivered to all group members in the same order, but they are not applied in lockstep: once a transaction is accepted for commit, each member commits it at its own pace.
Asynchronous Replication
<img src="https://dev.mysql.com/doc/refman/5.7/en/images/async-replication-diagram.png" style="zoom:50%" />
Semisynchronous Replication
<img src="https://dev.mysql.com/doc/refman/5.7/en/images/semisync-replication-diagram.png" style="zoom:50%" />
Group Replication
<img src="https://dev.mysql.com/doc/refman/5.7/en/images/gr-replication-diagram.png" style="zoom:50%" />
This diagram does not include details such as the Paxos messages.
Group Replication Requirements
- InnoDB
- Primary Key (a query to find offending tables follows this list)
- IPv4 (IPv6 is not supported)
- Network Performance
- log_bin = binlog
- log_slave_updates = ON
- binlog_format = ROW
- GTID
- master_info_repository = TABLE, relay_log_info_repository = TABLE
- transaction_write_set_extraction = XXHASH64
- Multi-threaded Appliers (slave_parallel_type, slave_parallel_workers, slave_preserve_commit_order)
- The root@localhost account must exist; INSTALL PLUGIN uses root@localhost to create and check _gr_user@localhost
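A quick way to spot tables that would violate the primary-key requirement (a minimal sketch; adjust the schema filter to your environment):
-- Find base tables without a PRIMARY KEY; writes to them will fail under Group Replication
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON  c.table_schema = t.table_schema
  AND c.table_name = t.table_name
  AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND c.constraint_name IS NULL;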
Group Replication Limitations
- binlog_checksum = NONE
- Gap Locks: the certification process does not take gap locks into account, because gap-lock information is not available outside InnoDB; see Gap Locks for details
- Unless your application relies on REPEATABLE READ, READ COMMITTED is recommended: InnoDB takes no gap locks under READ COMMITTED, so InnoDB's own conflict detection works together with Group Replication's distributed conflict detection
- Table Locks and Named Locks: the certification process does not take table locks (Section 13.3.5, "LOCK TABLES and UNLOCK TABLES Syntax") or named locks (GET_LOCK()) into account
- Savepoints Not Supported (supported as of 5.7.19)
- SERIALIZABLE Isolation Level is not supported in multi-primary mode
- Concurrent DDL versus DML Operations: in multi-primary mode, executing DDL and DML statements against the same object concurrently on different members can produce undetected conflicts and inconsistent data
- Cascading Foreign Key Constraints are not supported in multi-primary mode: cascading operations triggered by foreign keys can cause conflicts that go undetected and leave members with inconsistent data. Single-primary mode is not affected.
- Very Large Transactions: an individual transaction whose GTID contents are so large that they cannot be copied between group members over the network within roughly 5 seconds can cause failures in group communication. To avoid this, limit transaction size as much as possible; for example, split files loaded with LOAD DATA INFILE into smaller chunks.
FAQ
-
What is the maximum number of members in a MySQL Group Replication group?
Nine.
-
How do the members of a MySQL Group Replication group communicate?
Over peer-to-peer TCP connections configured with group_replication_local_address. The address must be reachable by all members and is used only for internal member-to-member communication and message passing within the group.
-
What is the group_replication_bootstrap_group option for?
It makes a member create the Group Replication group and act as its initial seed; the second member to join asks the bootstrapping member to dynamically reconfigure the group so that it can be added.
It is used in two situations:
- when creating and initializing a Group Replication group
- when shutting down and restarting the entire group
-
How do I configure the recovery process?
The group recovery channel can be pre-configured with a CHANGE MASTER TO statement.
-
Can Group Replication scale out the write load?
Yes, to a degree: writes can be scaled out by spreading <u>conflict-free</u> transactions across different members of the group.
-
For the same workload, does it need more CPU and bandwidth than plain replication?
Yes. Members interact with each other constantly, and a larger group needs more bandwidth than a smaller one (9 > 3); keeping the group synchronized and passing messages also costs additional CPU and memory.
-
Can it be deployed across a WAN?
Yes, but the network between members must be reliable. Good performance needs low latency and high bandwidth; packet loss causes retransmissions and higher end-to-end latency, so both throughput and latency suffer.
To make Paxos cope better with high-latency, low-bandwidth networks, messages are compressed and several local transactions can be bundled into a single packet (Section 17.9.7.2, "Message Compression"); see the sketch below.
When the round-trip time (RTT) between members is 2 seconds or more you may run into problems, because the built-in failure detection mechanism can be triggered incorrectly.
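Message compression is controlled per member; a minimal sketch with an illustrative threshold:
# Compress group messages larger than this many bytes (default 1000000; 0 disables compression)
SET GLOBAL group_replication_compression_threshold = 131072;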
-
Can a member rejoin the group automatically after a temporary connection problem?
That depends on the connection problem:
- if the problem is transient and the reconnection happens quickly enough that the failure detector never notices it, the member is not removed
- if the problem lasts long enough for the failure detector to notice it, the member is removed; once connectivity is restored it must be rejoined manually (or by a script)
-
When is a member removed from the group?
When a member becomes unresponsive (crash or network problem), the failure is detected and the other members remove it from the group configuration, creating a new configuration without it.
-
What happens when a member lags noticeably?
There is no policy that automatically removes a lagging member from the group: <u>find the cause of the lag and fix it, or remove the member from the group</u>; otherwise flow control kicks in and the whole group slows down. (Flow control is configurable.)
-
When a problem is suspected in the group, is a particular member responsible for triggering the reconfiguration?
No.
Any member can suspect a problem. All members must (automatically) agree that a given member has failed; one member then triggers the reconfiguration that expels the faulty member, but which member does this cannot be configured.
Can I use Group Replication for sharding?
How do I use Group Replication with SELinux?
How do I use Group Replication with iptables?
How do I recover the relay log for a replication channel used by a group member?
A Separate Communication Mechanism
GR reuses the slave channel infrastructure, but only its applier thread, to execute Binlog Events; the channel is not used to transport Binlog Events.
It uses neither the asynchronous-replication path for Binlog Events nor the MySQL service port. Instead it opens a dedicated TCP port, and the Group Replication plugins of the MySQL servers connect to each other pairwise through that port.
Multi-Threaded Execution of Binlog Events
The GR plugin automatically creates a channel named group_replication_applier to execute the Binlog Events it receives; when a server joins the group, the plugin automatically starts the applier thread of that channel.
-- Manually control the applier thread of this channel
START SLAVE SQL_THREAD FOR CHANNEL 'group_replication_applier';
STOP SLAVE SQL_THREAD FOR CHANNEL 'group_replication_applier';
Primary-Key-Based Parallel Execution
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
SET GLOBAL slave_parallel_workers = N;
SET GLOBAL slave_preserve_commit_order = ON;
GR's LOGICAL_CLOCK differs from the algorithm used by asynchronous replication: GR derives the logical timestamps for its concurrency decisions from primary keys, which allows far more parallelism than the lock-based logical timestamps used by asynchronous replication
Characteristics of primary-key-based parallel apply
- If two transactions update the same row they must be applied in order; otherwise they can run concurrently
- DDL cannot run concurrently with any transaction: it must wait for all preceding transactions to finish, and the transactions behind it must wait for the DDL to finish
Why configure slave_preserve_commit_order
During parallel apply, two transactions are scheduled concurrently whenever they satisfy the rules above, regardless of whether their Binlog Events came from the same session; transactions from the same session may therefore run in parallel, and a later transaction could be committed before an earlier one. To guarantee that transactions from the same session are committed in order, this parameter must be set, so that the commit order on the applier matches the order on the source MySQL server
Advantages of the Paxos Protocol
- No split brain [questionable: the original master-slave setup could split-brain, P363] ???
- Good redundancy: the binlog is guaranteed to have been replicated to more than half of the members, so no data is lost as long as no more than half of the members fail at the same time
- Guarantees that as long as a transaction's Binlog Events have not reached more than half of the members, the local member will not write them to its binlog file or commit the transaction; a crashed server therefore never holds data that the online group does not have, and after restarting it can rejoin the group without any special handling
Service Modes
- Single-primary mode (the default)
- Multi-primary mode
-- Enable multi-primary mode
SET GLOBAL group_replication_single_primary_mode = OFF;
To use multi-primary mode, set this variable to OFF before joining the group. The mode cannot be switched online: every member must leave the group, the group must be re-bootstrapped in the desired mode, and the other members then rejoin, as outlined below.
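A rough outline of the switch for a three-member group, assuming the multi-primary settings described later in this document (run on every member unless noted):
-- On every member: leave the group and change the mode
STOP GROUP_REPLICATION;
SET GLOBAL group_replication_single_primary_mode = OFF;
SET GLOBAL group_replication_enforce_update_everywhere_checks = ON;
-- On one member only: bootstrap the group in the new mode
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;
-- On the remaining members: rejoin
START GROUP_REPLICATION;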
Single-Primary Mode
- Primary Member
- Secondary Member
Automatic Election && Failover
- The member that bootstraps the group is automatically elected Primary Member
- Failover: group_replication_member_weight (added in 5.7.20) is a single-primary-mode weight; the member with the higher weight becomes the new primary, and if weights are equal the UUIDs of all online members are sorted and the smallest becomes the new Primary Member. Replication carries on as usual, but note that clients must look up the Primary Member's UUID and then connect to the new primary
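For example, a member can be made the preferred failover target by raising its weight (5.7.20+; the value is illustrative):
# Higher weight wins the primary election on failover (default 50, range 0-100)
SET GLOBAL group_replication_member_weight = 80;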
# Query the Primary Member's UUID from any member
show global status like 'group_replication_primary_member';
or
SELECT * FROM performance_schema.global_status WHERE variable_name = 'group_replication_primary_member';
Automatic Read/Write Switching
A member joins in read-only mode; only when it is elected Primary Member does it switch to read-write
SET GLOBAL super_read_only = 1;
SET GLOBAL super_read_only = 0;
Drawbacks
- After a failover, clients have to use the UUID to work out which member is the Primary Member
Multi-Primary Mode
Auto-Increment Columns
-- The stock MySQL auto-increment variables
SET GLOBAL auto_increment_offset = N;
SET GLOBAL auto_increment_increment = N;
# Group Replication auto-increment step for the group; defaults to 7, the maximum group size is 9
SET GLOBAL group_replication_auto_increment_increment = N;
Notes:
a. If the server-ids are 1, 2, 3 no extra configuration is needed; if they are not, configure auto_increment_increment and auto_increment_offset explicitly. When they are not configured, group_replication_auto_increment_increment and the server-id are automatically applied to auto_increment_increment and auto_increment_offset respectively
b. Set auto_increment_increment to at least the number of members, preferably more, so that nodes can be added later; changing the increment when you scale out is awkward. A sketch follows.
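A minimal sketch for a three-member group that leaves head-room for future nodes (values are illustrative):
-- Run on each member, giving every member a distinct offset (1, 2, 3, ...)
SET GLOBAL auto_increment_increment = 7;   -- at least the planned maximum group size
SET GLOBAL auto_increment_offset = 2;      -- unique per member, e.g. its position in the group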
Advantages
- When one member fails, only part of the connections are affected, so the impact on applications is small
- When shutting a MySQL node down, connections can first be drained smoothly to other machines, so there is no momentary outage
- Better performance [to be evaluated] ???
Drawbacks
- The auto-increment step must be larger than the number of members, to avoid trouble when scaling out later
- The SERIALIZABLE isolation level is not supported (a single node implements it with locks)
- Foreign key cascading operations are not supported
# When TRUE, statements that hit the two unsupported cases above raise an error; in single-primary mode it must be OFF
group_replication_enforce_update_everywhere_checks = TRUE
- Concurrent DDL execution
Multi-primary replication detects conflicting transactions through certification and rolls them back. In 5.7 DDL is not atomic and cannot be rolled back, so GR performs no conflict detection for DDL; if a DDL statement and a conflicting statement run on different members, the data can become inconsistent.
<u>Before executing DDL, the potentially conflicting transactions must therefore be routed to a single server</u>
- DBAs must watch out for split brain during maintenance ???
When performing maintenance on node s3, take s3 out of DNS first; while running stop group_replication, beware that DNS caching may let applications with long or short connections keep connecting to this instance. (Ordinary users are not affected; mind the user privileges.)
# As a super user, check read_only and run stop group_replication: super_read_only goes from ON to OFF while read_only stays ON
localhost.(none)>show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | ON |
| super_read_only | ON |
| tx_read_only | OFF |
+------------------+-------+
4 rows in set (0.00 sec)
localhost.(none)>stop group_replication;
Query OK, 0 rows affected (8.15 sec)
localhost.(none)>show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | ON |
| super_read_only | OFF |
| tx_read_only | OFF |
+------------------+-------+
4 rows in set (0.00 sec)
# Test as an ordinary user
mysql -utest_user -ptest_user -hlocalhost -S /tmp/mysql6666.sock test
mysql> insert into t_n values(1,'test1');
ERROR 1290 (HY000): The MySQL server is running with the --read-only option so it cannot execute this statement
# Test as a super user
mysqlha_login.sh 6666
localhost.(none)>use test;
Database changed
localhost.test>insert into t_n values(1,'test1');
Query OK, 1 row affected (0.00 sec)
# How to rejoin Group Replication: see "Group member transactions inconsistent" below
Thoughts on Multi-Primary Mode
Write requests can be spread across several members
Control DDL: when a DDL statement has to run, first move all write requests to the same MySQL server [this is somewhat complex to implement]
-
A compromise: run multi-primary mode as if it were single-primary
- Compared with single-primary mode, this removes failover-driven primary switching
- It avoids the DDL conflict problem and prevents split brain
- One GR group can serve several applications with disjoint data sets, so there are no conflicts
-- Session 1: A Member
BEGIN
INSERT INTO t1 VALUES(1);
-- Session 2: B Member
TRUNCATE t1;
-- Session 1:
COMMIT;
----> the two sessions' transactions can end up being applied in different orders on different members
-- SESSION 1:
INSERT INTO t1 VALUES(1);
TRUNCATE t1;
-- SESSION 2:
TRUNCATE t1;
INSERT INTO t1 VALUES(1);
Configuring Group Replication
Required Configuration
Replication User
SET SQL_LOG_BIN=0;
CREATE USER rpl_user@'%' IDENTIFIED BY 'rpl_pass';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;
This has to be configured separately on every node; if the account was already created when MySQL was initialized, this step can be skipped
Which member a joiner replicates from is chosen at random by the Group Replication plugin, so the user configured for group_replication_recovery must exist on every member
If SET SQL_LOG_BIN = 0 was not used, the member is considered to hold transactions that are not part of the group, START GROUP_REPLICATION; fails, and you then need set global group_replication_allow_local_disjoint_gtids_join=ON;
my.cnf Configuration
# my.cnf
server_id = 1
log_bin = binlog
relay_log = relay-log
gtid_mode = ON
enforce_gtid_consistency = ON
binlog_format = ROW
transaction-isolation = READ-COMMITTED
binlog_checksum = NONE
master_info_repository = TABLE
relay_log_info_repository = TABLE
log_slave_updates = ON
slave_parallel_type = LOGICAL_CLOCK
slave_parallel_workers = 8
slave_preserve_commit_order = ON
# Group Replication
plugin-load = group_replication.so
transaction_write_set_extraction = XXHASH64
loose-group_replication_group_name = 93f19c6c-6447-11e7-9323-cb37b2d517f3
loose-group_replication_start_on_boot = OFF
loose-group_replication_local_address = 'db1:3306'
loose-group_replication_group_seeds = 'db2:3306,db3:3306'
group_replication_ip_whitelist = '10.0.0.0/8'
loose-group_replication_bootstrap_group = OFF
# loose-group_replication_single_primary_mode = OFF # Turn off single-primary mode
# loose-group_replication_enforce_update_everywhere_checks = ON # Multi-Primary Mode (static)
loose-group_replication_transaction_size_limit = 52428800 # 5.7.19 Configures the maximum transaction size in bytes which the group accepts
# loose-group_replication_unreachable_majority_timeout
report_host = '<hostname>'
report_port = 3306
loose-group_replication_flow_control_applier_threshold = 250000
loose-group_replication_flow_control_certifier_threshold = 250000
Configuring MGR on an Already-Initialized Server
# For Group Replication
transaction-isolation = READ-COMMITTED
binlog_checksum = NONE
master_info_repository = TABLE
relay_log_info_repository = TABLE
plugin-load = group_replication.so
transaction_write_set_extraction = XXHASH64
loose-group_replication_group_name = 33984520-8709-11e7-b883-782bcb105915
loose-group_replication_start_on_boot = OFF
loose-group_replication_local_address = '10.13.2.29:9999'
loose-group_replication_group_seeds = '10.13.2.29:9999,10.77.16.197:9999,10.73.25.178:9999'
loose-group_replication_bootstrap_group = OFF
# loose-group_replication_enforce_update_everywhere_checks = ON # Multi-Primary Mode
loose-group_replication_transaction_size_limit = 52428800 # 5.7.19 Configures the maximum transaction size in bytes which the group accepts
group_replication_ip_whitelist = '10.0.0.0/8'
# loose-group_replication_single_primary_mode = OFF
report_host = '10.13.2.29'
report_port = 8888
Configuration notes:
Enable the binlog and the relay log
Enable GTIDs
Use ROW binlog format
Disable binlog_checksum (MySQL 5.7.17 does not support Binlog Events carrying checksums)
To use multi-source replication (multiple channels), the slave channel state must be stored in system tables
Enable parallel replication
-
Enable primary-key information collection
GR needs the server layer to collect the primary keys of the rows each transaction modifies; they are stored as hashes, and two hash algorithms are supported: XXHASH64 and MURMUR32. The default is transaction_write_set_extraction = OFF, so with Group Replication every table must have a primary key, otherwise updates fail
<u>All members of a group must be configured with the same HASH algorithm</u>
plugin-load = 'group_replication.so' is equivalent to running INSTALL PLUGIN group_replication SONAME "group_replication.so";
group_replication_group_name = <uuid> sets the Group Replication name; one can be generated with select uuid();
group_replication_start_on_boot = OFF keeps Group Replication from starting automatically when MySQL starts
group_replication_local_address = <ip:port> sets the address this member listens on for Group Replication; members talk to each other through this port. If the members are not all on one machine, do not use 127.0.0.1; use each member's internal IP and port
group_replication_group_seeds = <ip:port,ip:port...> sets the seed members. A joining member contacts them to ask the group to reconfigure itself so it can be admitted; the list does not have to contain every member
- When several servers join at the same time, make sure the seeds point at members that are already in the group, not at servers that are still joining; adding multiple members while the group is being created is not supported
- The donor is picked at random from the current view. When several members join, the same server is almost never chosen twice in a row; if a connection fails, a new donor is selected automatically until the retry limit is reached (group_replication_recovery_retry_count = 10, dynamic), after which the recovery thread aborts with an error
- Recovery does not sleep after every donor connection attempt, only after all possible donors have been tried without success (group_replication_recovery_reconnect_interval = 60 seconds, dynamic); see the sketch after these notes
Enhanced Automatic Donor Switchover
- Purged data: if data required for recovery has already been purged on the chosen donor, an error is raised; recovery detects it and selects a new donor
- Duplicated data: if the joining member already contains data that conflicts with the chosen donor, recovery raises an error; this can indicate errant transactions on the new member. One could argue recovery should simply fail instead of switching donors, but in a heterogeneous group some members may share the conflicting transactions while others do not, so when the error occurs recovery is allowed to pick another donor
- Other errors: if any recovery thread fails (receiver or applier), an error is raised and recovery switches to a new donor within the group
For certain persistent or transient failures, recovery automatically retries against the same or a new donor
- group_replication_ip_whitelist = <ip,net,...> <u>always set the whitelist explicitly</u>. If it is not configured it defaults to AUTOMATIC, which derives the private addresses and subnets of the local interfaces; connections from 127.0.0.1 are always allowed. Group Replication must be stopped before changing the whitelist
- group_replication_bootstrap_group = OFF; when ON it tells the Group Replication plugin that this is the first member of the group and it should bootstrap it; switch it back to OFF after bootstrapping
Use it only when initializing Group Replication or when bringing the group back after it has gone down entirely; configuring it on several servers can create a split brain by hand
- Prefixing group_replication variables with "loose-" allows them to be written into my.cnf even before the plugin is installed
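The donor retry behaviour mentioned in the notes above can be tuned at runtime; a sketch showing the default values:
# Connection attempts before recovery gives up (default 10)
SET GLOBAL group_replication_recovery_retry_count = 10;
# Seconds to sleep after all possible donors have been tried (default 60)
SET GLOBAL group_replication_recovery_reconnect_interval = 60;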
Loading the Group Replication Plugin
Installing the plugin
# Load the plugin
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
# Start group replication
START GROUP_REPLICATION;
> Joins this MySQL server to an existing Group Replication group, or bootstraps it as the group's first member
# Stop group replication
STOP GROUP_REPLICATION;
> Removes this MySQL server from its Group Replication group
Bootstrapping the Group
INSTALL PLUGIN group_replication SONAME "group_replication.so";
SET GLOBAL group_replication_group_name = "93f19c6c-6447-11e7-9323-cb37b2d517f3"; # can be generated with select uuid();
SET GLOBAL group_replication_local_address = "dbaone:7777";
SET GLOBAL group_replication_bootstrap_group = ON ;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF ;
Adding a New Member
INSTALL PLUGIN group_replication SONAME "group_replication.so";
SET GLOBAL group_replication_group_name = "93f19c6c-6447-11e7-9323-cb37b2d517f3"; # can be generated with select uuid();
SET GLOBAL group_replication_local_address = "127.0.0.1:7777";
SET GLOBAL group_replication_group_seeds = "db2:7777";
change master to master_user = 'replica',master_password='eHnNCaQE3ND' for channel 'group_replication_recovery';
START GROUP_REPLICATION;
Note: if these settings are already in my.cnf, some of the bootstrap and join steps above are unnecessary
Bootstrap:
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;
Adding a new member:
change master to master_user = 'replica',master_password='eHnNCaQE3ND' for channel 'group_replication_recovery';
START GROUP_REPLICATION;
Notes:
- When a new member joins, it first copies from another member the data the group produced before it joined; this data cannot be transferred over the Group Replication communication protocol, so the asynchronous replication channel group_replication_recovery is used
- The binlogs must still exist; otherwise the new member's data has to be restored to a recent point in time before it can join the group ??? (to be tested)
- Which member it replicates from is chosen at random by the Group Replication plugin, so the user configured for group_replication_recovery must exist on every member
- Before starting group_replication_recovery, Group Replication automatically configures MASTER_HOST and MASTER_PORT for it
- When a member joins, it receives configuration information from the other members, including hostname and port; these come from the read-only global variables hostname and port. If the hostname cannot be resolved to an IP, or the network uses address translation, the group_replication_recovery channel cannot work properly
- Put the hostname-to-IP mappings of all members in /etc/hosts
- Configure MySQL report_host and report_port; Group Replication uses these values in preference to hostname and port (a quick check follows)
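A quick sanity check of the values a member will advertise to the group for recovery (a sketch, not an official procedure):
# These read-only variables are what the other members will use to reach this server
SELECT @@hostname, @@port, @@report_host, @@report_port;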
Group Replication Network Partitions
When several nodes fail unexpectedly, the quorum can be lost because a majority of the members has effectively dropped out of the group.
For example, in a five-member group, if three members suddenly go silent at the same time, the majority rule is broken and no quorum can be reached: the remaining two members cannot tell whether the other three servers crashed or whether the network split them into a separate partition, so the group cannot reconfigure itself.
If a member leaves voluntarily, it tells the group to reconfigure: a departing member informs the others that it is leaving, they reconfigure the group so that membership stays consistent, and the quorum is recalculated. With five members, if three leave one after another, the total drops from 5 to 2, yet the quorum is preserved at every step.
Detecting a Partition
Normally every member of the group can query performance_schema.replication_group_members and see the state of all members; each member's state is agreed on by all members in the view.
If the network is partitioned, members that cannot be reached show up in the table with the state UNREACHABLE; this is determined by the local failure detector built into Group Replication.
As such, lets assume that there is a group with these 5 servers in it:
- Server s1 with member identifier 199b2df7-4aaf-11e6-bb16-28b2bd168d07
- Server s2 with member identifier 199bb88e-4aaf-11e6-babe-28b2bd168d07
- Server s3 with member identifier 1999b9fb-4aaf-11e6-bb54-28b2bd168d07
- Server s4 with member identifier 19ab72fc-4aaf-11e6-bb51-28b2bd168d07
- Server s5 with member identifier 19b33846-4aaf-11e6-ba81-28b2bd168d07
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | 127.0.0.1 | 13002 | ONLINE |
| group_replication_applier | 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | 127.0.0.1 | 13001 | ONLINE |
| group_replication_applier | 199bb88e-4aaf-11e6-babe-28b2bd168d07 | 127.0.0.1 | 13000 | ONLINE |
| group_replication_applier | 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | 127.0.0.1 | 13003 | ONLINE |
| group_replication_applier | 19b33846-4aaf-11e6-ba81-28b2bd168d07 | 127.0.0.1 | 13004 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
5 rows in set (0,00 sec)
# When three members (s3, s4, s5) become unreachable at the same time, viewed from s1
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | 127.0.0.1 | 13002 | UNREACHABLE |
| group_replication_applier | 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | 127.0.0.1 | 13001 | ONLINE |
| group_replication_applier | 199bb88e-4aaf-11e6-babe-28b2bd168d07 | 127.0.0.1 | 13000 | ONLINE |
| group_replication_applier | 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | 127.0.0.1 | 13003 | UNREACHABLE |
| group_replication_applier | 19b33846-4aaf-11e6-ba81-28b2bd168d07 | 127.0.0.1 | 13004 | UNREACHABLE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
5 rows in set (0,00 sec)
Recovering from a Partition
Options
- Stop group replication on s1 and s2 (or shut the instances down completely), find out why the other three members stopped, and then restart group replication (or the servers)
- group_replication_force_members forces the listed members to form the group, expelling the other members (<u>use with great care to avoid split brain</u>)
Forcing the Group Membership
<img src="https://dev.mysql.com/doc/refman/5.7/en/images/gr-majority-lost-to-stable-group.png" style="zoom:50%" />
Assume only s1 and s2 are online while s3, s4 and s5 left unexpectedly and are down; s1 and s2 can then be forced to form a new membership.
group_replication_force_members is a last resort: use it very carefully and only when the loss of a majority makes quorum impossible. If abused it can cause <u>split brain and block the entire system</u>.
<u>Make absolutely sure s3, s4 and s5 are down and unreachable; if those three members are merely partitioned off (and, being the majority, may still form a group), forcing s1 and s2 into a new group creates a man-made split brain.</u>
# Viewed from s1
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 1999b9fb-4aaf-11e6-bb54-28b2bd168d07 | 127.0.0.1 | 13002 | UNREACHABLE |
| group_replication_applier | 199b2df7-4aaf-11e6-bb16-28b2bd168d07 | 127.0.0.1 | 13001 | ONLINE |
| group_replication_applier | 199bb88e-4aaf-11e6-babe-28b2bd168d07 | 127.0.0.1 | 13000 | ONLINE |
| group_replication_applier | 19ab72fc-4aaf-11e6-bb51-28b2bd168d07 | 127.0.0.1 | 13003 | UNREACHABLE |
| group_replication_applier | 19b33846-4aaf-11e6-ba81-28b2bd168d07 | 127.0.0.1 | 13004 | UNREACHABLE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
5 rows in set (0,00 sec)
# Check the Group Replication communication addresses (group_replication_local_address) of s1 and s2
mysql> SELECT @@group_replication_local_address;
+-----------------------------------+
| @@group_replication_local_address |
+-----------------------------------+
| 127.0.0.1:10000 |
+-----------------------------------+
1 row in set (0,00 sec)
mysql> SELECT @@group_replication_local_address;
+-----------------------------------+
| @@group_replication_local_address |
+-----------------------------------+
| 127.0.0.1:10001 |
+-----------------------------------+
1 row in set (0,00 sec)
# Force the new group membership on s1 (127.0.0.1 is just an example; in a real environment use the values of group_replication_local_address)
SET GLOBAL group_replication_force_members="127.0.0.1:10000,127.0.0.1:10001";
# Check s1
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | b5ffe505-4ab6-11e6-b04b-28b2bd168d07 | 127.0.0.1 | 13000 | ONLINE |
| group_replication_applier | b60907e7-4ab6-11e6-afb7-28b2bd168d07 | 127.0.0.1 | 13001 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0,00 sec)
# Check s2
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | b5ffe505-4ab6-11e6-b04b-28b2bd168d07 | 127.0.0.1 | 13000 | ONLINE |
| group_replication_applier | b60907e7-4ab6-11e6-afb7-28b2bd168d07 | 127.0.0.1 | 13001 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2 rows in set (0,00 sec)
Multi-Primary Mode
Switching to multi-primary mode cannot be done online.
my.cnf settings that differ from single-primary mode
loose-group_replication_single_primary_mode = OFF
loose-group_replication_enforce_update_everywhere_checks = ON # cannot be changed after START GROUP_REPLICATION
auto_increment_increment =
auto_increment_offset =
If auto_increment_increment and auto_increment_offset are not configured, auto_increment_offset falls back to server_id and auto_increment_increment falls back to group_replication_auto_increment_increment (default 7)
In multi-primary mode the status variable group_replication_primary_member is empty
localhost.(none)>show global status like 'group_replication_primary_member';
+----------------------------------+-------+
| Variable_name | Value |
+----------------------------------+-------+
| group_replication_primary_member | |
+----------------------------------+-------+
1 row in set (0.00 sec)
read_only and super_read_only are both OFF
localhost.(none)>show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | OFF |
| super_read_only | OFF |
| tx_read_only | OFF |
+------------------+-------+
4 rows in set (0.00 sec)
Monitoring
# Basic information about the group members; visible from any member
SELECT * FROM performance_schema.replication_group_members
There are five member states:
* OFFLINE: Group Replication is not running on this member
* RECOVERING: right after Group Replication starts, the member is set to RECOVERING while it copies the data produced before it joined
* ONLINE: once recovery finishes, the state becomes ONLINE and the member starts serving traffic
* ERROR: when a local error leaves Group Replication unable to work, the member's state becomes ERROR (healthy members may not see the ERROR member)
* UNREACHABLE: when the network fails or another member goes down, that member appears as UNREACHABLE to the others
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64 | 6666 | ONLINE |
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29 | 6666 | ONLINE |
| group_replication_applier | e99a9946-6619-11e7-9b07-70e28406ebea | 10.77.16.197 | 6666 | ONLINE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
3 rows in set (0.00 sec)
# Detailed information about the local member; each member can only see its own details
SELECT * FROM performance_schema.replication_group_member_stats # related to multi-primary conflict detection
localhost.(none)>SELECT CHANNEL_NAME,VIEW_ID,MEMBER_ID FROM performance_schema.replication_group_member_stats ;
+---------------------------+----------------------+--------------------------------------+
| CHANNEL_NAME | VIEW_ID | MEMBER_ID |
+---------------------------+----------------------+--------------------------------------+
| group_replication_applier | 14997660180761984:11 | e99a9946-6619-11e7-9b07-70e28406ebea |
+---------------------------+----------------------+--------------------------------------+
1 row in set (0.00 sec)
# When a new member joins, the data the group produced before it joined is first copied over through the asynchronous channel group_replication_recovery; the state of this channel is monitored through this table, like any other asynchronous replication channel
# Group Replication channels are not shown in SHOW SLAVE STATUS
SELECT * FROM performance_schema.replication_connection_status
localhost.(none)>SELECT CHANNEL_NAME,GROUP_NAME,THREAD_ID,SERVICE_STATE FROM performance_schema.replication_connection_status;
+----------------------------+--------------------------------------+-----------+---------------+
| CHANNEL_NAME | GROUP_NAME | THREAD_ID | SERVICE_STATE |
+----------------------------+--------------------------------------+-----------+---------------+
| group_replication_applier | 93f19c6c-6447-11e7-9323-cb37b2d517f3 | NULL | ON |
| group_replication_recovery | | NULL | OFF |
+----------------------------+--------------------------------------+-----------+---------------+
# Group Replication executes Binlog Events through the group_replication_applier channel; like other asynchronous channels, it can be monitored through this table
SELECT * FROM performance_schema.replication_applier_status
+----------------------------+---------------+-----------------+----------------------------+
| CHANNEL_NAME | SERVICE_STATE | REMAINING_DELAY | COUNT_TRANSACTIONS_RETRIES |
+----------------------------+---------------+-----------------+----------------------------+
| group_replication_applier | ON | NULL | 0 |
| group_replication_recovery | OFF | NULL | 0 |
+----------------------------+---------------+-----------------+----------------------------+
2 rows in set (0.00 sec)
# Status of the Group Replication threads
SELECT * FROM performance_schema.threads
* thread/group_rpl/THD_applier_module_receiver
* thread/group_rpl/THD_certifier_broadcast
* thread/group_rpl/THD_recovery
SELECT * FROM mysql.`slave_master_info`
SELECT * FROM mysql.`slave_relay_log_info`
replication-related views
SELECT * FROM performance_schema.replication_group_members;
SELECT * FROM performance_schema.replication_group_member_stats;
SELECT * FROM performance_schema.replication_applier_configuration;
SELECT * FROM performance_schema.replication_applier_status;
SELECT * FROM performance_schema.replication_applier_status_by_coordinator;
SELECT * FROM performance_schema.replication_applier_status_by_worker;
SELECT * FROM performance_schema.replication_connection_configuration;
SELECT * FROM performance_schema.replication_connection_status;
References:
mysql-group-replication-monitoring
Official documentation: group-replication-monitoring
Group Replication Internals
MGR Transaction Execution Flow
A transaction passes through:
- the local transaction control module
- the inter-member communication module
- the global transaction certification module
- the remote transaction applier module
Flow Control
MySQL Group Replication: understanding Flow Control
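Flow control throttles the group when the certification or applier queues on any member grow past a threshold. A hedged sketch of tuning it at runtime; the threshold values simply mirror the my.cnf above and are not a recommendation:
SET GLOBAL group_replication_flow_control_mode = 'QUOTA';                -- default; 'DISABLED' turns flow control off
SET GLOBAL group_replication_flow_control_applier_threshold = 250000;    -- queued transactions in the applier before throttling
SET GLOBAL group_replication_flow_control_certifier_threshold = 250000;  -- queued transactions waiting for certification before throttling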
Troubleshooting
1. Handling network interruptions / partitions
See Quest for Better Replication in MySQL: Galera vs. Group Replication
When a member is expelled from Group Replication because of a network problem, manual intervention is required: it does not rejoin automatically once the network recovers (PXC can).
The error log shows:
2017-07-14T11:58:57.208677+08:00 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2017-07-14T11:58:57.273764+08:00 0 [Note] Plugin group_replication reported: 'getstart group_id 6c6c3761'
2017-07-14T11:58:57.274693+08:00 0 [Note] Plugin group_replication reported: 'getstart group_id 6c6c3761'
2017-07-14T11:59:00.531055+08:00 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'
2017-07-14T11:59:00.531314+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-07-14T11:59:00.531338+08:00 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'
2017-07-14T11:59:00.531365+08:00 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-07-14T11:59:00.531365+08:00 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-07-14T11:59:00.531401+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
State of the affected node:
SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64 | 6666 | ERROR |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
1 row in set (0.00 sec)
Recovery steps:
localhost.(none)>start group_replication;
ERROR 3093 (HY000): The START GROUP_REPLICATION command failed since the group is already running.
localhost.(none)>stop group_replication;
Query OK, 0 rows affected (11.35 sec)
localhost.(none)>start group_replication;
Query OK, 0 rows affected (5.53 sec)
localhost.(none)>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64 | 6666 | ONLINE |
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29 | 6666 | ONLINE |
| group_replication_applier | e99a9946-6619-11e7-9b07-70e28406ebea | 10.77.16.197 | 6666 | ONLINE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
3 rows in set (0.00 sec)
2. Group member transactions inconsistent
A member that was isolated and holds transactions that are not part of the group cannot join by default; <u>if you know the data is in fact consistent it can be forced in; use with caution</u>
localhost.(none)>stop group_replication;
localhost.(none)>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29 | 6666 | OFFLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)
localhost.test>create table t2 (id int primary key ,name varchar(10));
Query OK, 0 rows affected (0.01 sec)
localhost.test>show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | OFF |
| super_read_only | OFF |
| tx_read_only | OFF |
+------------------+-------+
4 rows in set (0.00 sec)
localhost.test>start group_replication;
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.
error log
2017-07-14T15:00:14.966294+08:00 0 [Note] Plugin group_replication reported: 'A new primary was elected, enabled conflict detection until the new primary applies all relay logs'
2017-07-14T15:00:14.966366+08:00 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 5860d02e-4b55-11e7-8721-40f2e91dc960:1-788,
93f19c6c-6447-11e7-9323-cb37b2d517f3:1-16,
c745233b-6614-11e7-a738-40f2e91dc960:1 > Group transactions: 5860d02e-4b55-11e7-8721-40f2e91dc960:1-788,
93f19c6c-6447-11e7-9323-cb37b2d517f3:1-16'
2017-07-14T15:00:14.966383+08:00 0 [ERROR] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.'
2017-07-14T15:00:14.966392+08:00 0 [Note] Plugin group_replication reported: 'To force this member into the group you can use the group_replication_allow_local_disjoint_gtids_join option'
2017-07-14T15:00:14.966473+08:00 88181 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2017-07-14T15:00:14.968087+08:00 0 [Note] Plugin group_replication reported: 'getstart group_id 6c6c3761'
2017-07-14T15:00:18.464227+08:00 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate'
2017-07-14T15:00:18.464840+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-07-14T15:00:18.464875+08:00 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit'
2017-07-14T15:00:18.465367+08:00 0 [Note] Plugin group_replication reported: 'Exiting xcom thread'
2017-07-14T15:00:18.465382+08:00 0 [Note] Plugin group_replication reported: 'new state x_start'
2017-07-14T15:00:23.486593+08:00 88181 [Note] Plugin group_replication reported: 'auto_increment_increment is reset to 1'
2017-07-14T15:00:23.486642+08:00 88181 [Note] Plugin group_replication reported: 'auto_increment_offset is reset to 1'
2017-07-14T15:00:23.486838+08:00 88211 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed
2017-07-14T15:00:23.487244+08:00 88208 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'
Forcing the join; make absolutely sure it will not lead to data inconsistency, and proceed with caution
localhost.test>SET GLOBAL group_replication_allow_local_disjoint_gtids_join = ON ;
Query OK, 0 rows affected (0.00 sec)
localhost.test>start group_replication;
Query OK, 0 rows affected (2.26 sec)
localhost.test>SET GLOBAL group_replication_allow_local_disjoint_gtids_join = OFF ;
Query OK, 0 rows affected (0.00 sec)
localhost.test>show variables like '%read_only%';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| innodb_read_only | OFF |
| read_only | ON |
| super_read_only | ON |
| tx_read_only | OFF |
+------------------+-------+
4 rows in set (0.00 sec)
3. Large transactions cause a member to be expelled or a primary switch
The test below reproduces the problem; it exists in 5.7.17 and 5.7.18, and the fix in 5.7.19 adds group_replication_transaction_size_limit to cap transaction size
#84785
# Single-primary mode, 5.7.18
CREATE TABLE `kafkaoffset_api_log` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`developer` varchar(20) NOT NULL DEFAULT '' ,
ctime timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'request api time',
PRIMARY KEY (`id`),
KEY idx_time(ctime,developer)
);
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 1 row affected (0.01 sec)
Records: 1 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 8 rows affected (0.01 sec)
Records: 8 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 16 rows affected (0.00 sec)
Records: 16 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 32 rows affected (0.01 sec)
Records: 32 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 64 rows affected (0.00 sec)
Records: 64 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 128 rows affected (0.00 sec)
Records: 128 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 256 rows affected (0.02 sec)
Records: 256 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 512 rows affected (0.02 sec)
Records: 512 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 1024 rows affected (0.05 sec)
Records: 1024 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 2048 rows affected (0.08 sec)
Records: 2048 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 4096 rows affected (0.17 sec)
Records: 4096 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 8192 rows affected (0.33 sec)
Records: 8192 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 16384 rows affected (0.62 sec)
Records: 16384 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 32768 rows affected (1.09 sec)
Records: 32768 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 65536 rows affected (2.29 sec)
Records: 65536 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 131072 rows affected (5.00 sec)
Records: 131072 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 262144 rows affected (8.44 sec)
Records: 262144 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select devloper,now() from kafkaoffset_api_log;
ERROR 1054 (42S22): Unknown column 'devloper' in 'field list'
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 524288 rows affected (15.79 sec)
Records: 524288 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
Query OK, 1048576 rows affected (36.33 sec)
Records: 1048576 Duplicates: 0 Warnings: 0
localhost.test>insert into kafkaoffset_api_log(developer,ctime) select developer,now() from kafkaoffset_api_log;
ERROR 3101 (HY000): Plugin instructed the server to rollback the current transaction.
# The affected member
localhost.test>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64 | 6666 | ERROR |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)
# The other members
localhost.test>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29 | 6666 | ONLINE |
| group_replication_applier | e99a9946-6619-11e7-9b07-70e28406ebea | 10.77.16.197 | 6666 | ONLINE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
2 rows in set (0.00 sec)
Applying such a large transaction costs enough memory and network bandwidth that failure detection fired and the member became unreachable.
Solution: 5.7.19 introduces group_replication_transaction_size_limit; the default of 0 means no limit, so set a value that is reasonable for the MGR workload. For example:
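A minimal sketch (5.7.19+; the value mirrors the my.cnf above and is illustrative):
# Reject transactions larger than ~50 MB instead of destabilizing the group (0 = no limit)
SET GLOBAL group_replication_transaction_size_limit = 52428800;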
# Rejoin the member
localhost.test>stop group_replication ;
Query OK, 0 rows affected (5.67 sec)
localhost.test>start group_replication;
Query OK, 0 rows affected (2.40 sec)
localhost.test>SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
| group_replication_applier | 817b0415-661a-11e7-842a-782bcb479deb | 10.55.28.64 | 6666 | ONLINE |
| group_replication_applier | c745233b-6614-11e7-a738-40f2e91dc960 | 10.13.2.29 | 6666 | ONLINE |
| group_replication_applier | e99a9946-6619-11e7-9b07-70e28406ebea | 10.77.16.197 | 6666 | ONLINE |
+---------------------------+--------------------------------------+--------------+-------------+--------------+
3 rows in set (0.00 sec)
- Tables without a primary key
localhost.test>create table t1(id int,name varchar(10));
Query OK, 0 rows affected (0.02 sec)
localhost.test>insert into t1 values(1,'test1');
ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.
localhost.test>alter table t1 add primary key(id);
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
localhost.test>insert into t1 values(1,'test1');
Query OK, 1 row affected (0.00 sec)
error log
2017-08-22T19:19:38.693424+08:00 4521 [ERROR] Plugin group_replication reported: 'Table t1 does not have any PRIMARY KEY. This is not compatible with Group Replication'
MySQL Change log
1. 5.7.20
group_replication_member_weight, default 50, range [0-100]: <u>single-primary mode</u> gains a weight parameter. Previously the member with the smallest uuid was elected as the new primary; now the member with the highest weight wins, and the smallest uuid is only used as a tie-breaker
STOP GROUP_REPLICATION: previously a member could still accept transactions while group replication was being stopped; now super_read_only takes effect immediately, which better protects cluster consistency
STOP GROUP_REPLICATION: previously MGR communication stopped but the asynchronous channels kept running; from 5.7.20 the asynchronous channels are stopped as well
group_replication_force_members gains a restriction: it can only be used when a majority of the members is unreachable
-
server_uuid cannot be the same as group_replication_group_name
?
Automation Compatibility
Monitoring
- group_replication_single_primary_mode having different values across nodes
- in single-primary mode, moving the database domain name / IP to the right node when the primary changes
mysql> show variables like 'group_replication_single_primary_mode';
+---------------------------------------+-------+
| Variable_name | Value |
+---------------------------------------+-------+
| group_replication_single_primary_mode | ON |
+---------------------------------------+-------+
1 row in set (0.00 sec)
- Checking for inconsistent group membership via performance_schema.replication_group_members (see the query below)
- ?
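A simple membership check that can be scripted into monitoring (a sketch; alerting thresholds are up to you):
-- Any member whose state is not ONLINE deserves attention
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
FROM performance_schema.replication_group_members
WHERE MEMBER_STATE <> 'ONLINE';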
Backup
mb backup
innobackupex backup
Scaling Out
scale out from a slave (stop group_replication)
scale out with innobackupex
References
MySQL High Availability with Group Replication - 宋利兵, IT大咖說