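Preface
I have been reading the Raft paper on and off for a long time, and etcd's raft implementation on and off for just as long. When I came back to it recently, I found I had forgotten many of the details again, so I had better take some simple notes. A poor memory combined with a reluctance to write things down is a recipe for disaster.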
progress
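progress is the per-follower state that the leader maintains. A follower's progress is in one of three states: probe, replicate, or snapshot. The internal state machine transitions as follows: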
                                +--------------------------------------------------------+
                                |                  send snapshot                         |
                                |                                                        |
                      +---------+----------+                                  +----------v---------+
                  +--->       probe        |                                  |      snapshot      |
                  |   |  max inflight = 1  <----------------------------------+  max inflight = 0  |
                  |   +---------+----------+                                  +--------------------+
                  |             |            1. snapshot success
                  |             |               (next=snapshot.index + 1)
                  |             |            2. snapshot failure
                  |             |               (no change)
                  |             |            3. receives msgAppResp(rej=false&&index>lastsnap.index)
                  |             |               (match=m.index,next=match+1)
    receives msgAppResp(rej=true)
    (next=match+1)|             |
                  |             |
                  |             |
                  |             |   receives msgAppResp(rej=false&&index>match)
                  |             |   (match=m.index,next=match+1)
                  |             |
                  |             |
                  |             |
                  |   +---------v----------+
                  |   |     replicate      |
                  +---+  max inflight = n  |
                      +--------------------+
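To make the transitions easier to remember, here is a minimal Go sketch of the state machine in the diagram. It is not etcd's actual code: the `Progress` struct, its fields, and the method names `SendSnapshot`, `SnapshotDone`, `AppResp`, and `MaxInflight` are all hypothetical names chosen for illustration, and cases the diagram does not draw (for example a rejection while in probe) are simply left as no-ops.

```go
package main

import "fmt"

// StateType names the three progress states from the diagram.
type StateType int

const (
	StateProbe StateType = iota
	StateReplicate
	StateSnapshot
)

func (s StateType) String() string {
	return [...]string{"probe", "replicate", "snapshot"}[s]
}

// Progress is a simplified, hypothetical view of what the leader tracks
// for one follower.
type Progress struct {
	State           StateType
	Match, Next     uint64
	PendingSnapshot uint64 // index of the snapshot in flight (lastsnap.index)
}

// MaxInflight returns the in-flight message limit shown in each box,
// where n is the configured limit for the replicate state.
func (p *Progress) MaxInflight(n int) int {
	switch p.State {
	case StateProbe:
		return 1
	case StateSnapshot:
		return 0
	default:
		return n
	}
}

// SendSnapshot moves the follower into the snapshot state.
func (p *Progress) SendSnapshot(snapIndex uint64) {
	p.State = StateSnapshot
	p.PendingSnapshot = snapIndex
}

// SnapshotDone handles cases 1 and 2: go back to probe, and on success
// set next = snapshot.index + 1 (on failure nothing else changes).
func (p *Progress) SnapshotDone(ok bool) {
	if p.State != StateSnapshot {
		return
	}
	if ok {
		p.Next = p.PendingSnapshot + 1
	}
	p.PendingSnapshot = 0
	p.State = StateProbe
}

// AppResp handles a msgAppResp carrying (rej, index).
func (p *Progress) AppResp(rej bool, index uint64) {
	if rej {
		// Only the replicate -> probe edge is drawn in the diagram.
		if p.State == StateReplicate {
			p.State = StateProbe
			p.Next = p.Match + 1
		}
		return
	}
	switch p.State {
	case StateSnapshot: // case 3: the follower is already past the snapshot
		if index > p.PendingSnapshot {
			p.Match, p.Next = index, index+1
			p.PendingSnapshot = 0
			p.State = StateProbe
		}
	case StateProbe: // probe -> replicate
		if index > p.Match {
			p.Match, p.Next = index, index+1
			p.State = StateReplicate
		}
	case StateReplicate: // stay in replicate, advance match
		if index > p.Match {
			p.Match = index
			if p.Next < index+1 {
				p.Next = index + 1
			}
		}
	}
}

func main() {
	p := &Progress{State: StateProbe, Next: 1}
	p.AppResp(false, 5) // accepted append: probe -> replicate
	fmt.Println(p.State, p.Match, p.Next)
	p.AppResp(true, 0) // rejected append: replicate -> probe
	fmt.Println(p.State, p.Match, p.Next)
}
```

The in-flight limits explain the shape of the diagram: in probe the leader sends one message at a time until it learns where the follower's log stands, in replicate it can pipeline up to n messages, and in snapshot it sends no appends at all.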
A small difference in Raft membership change
etcd's implementation
The key invariant that membership changes happen one node at a time is preserved, but in our implementation the membership change takes effect when its entry is applied, not when it is added to the log (so the entry is committed under the old membership instead of the new)
What the Raft paper says
Once a given server adds the latest configuration entry to its log, it uses the latest configuration in its log, regardless of whether the entry is committed.... This means that the leader will use the rules of C_old,new to determine when the log entry for C_old,new is committed
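The practical effect of this difference shows up in which membership decides whether the configuration entry itself is committed. Below is a rough Go sketch under simplifying assumptions: a single node is added to a 3-node cluster, quorums are simple majorities, and the paper's joint-consensus case is not modeled. The `Policy` type and the `votersForCommit` helper are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// quorum returns the majority size for a cluster of n voters.
func quorum(n int) int { return n/2 + 1 }

// Policy says when a configuration entry starts to govern decisions.
type Policy int

const (
	OnAppend Policy = iota // paper: effective once the entry is in the log
	OnApply                // etcd: effective only once the entry is applied
)

// votersForCommit returns how many voters count when deciding whether
// the configuration entry itself is committed.
func votersForCommit(p Policy, oldVoters, newVoters int) int {
	if p == OnAppend {
		return newVoters // the new membership already applies
	}
	return oldVoters // the old membership is still in effect
}

func main() {
	oldVoters, newVoters := 3, 4 // adding one node to a 3-node cluster
	for _, p := range []Policy{OnAppend, OnApply} {
		n := votersForCommit(p, oldVoters, newVoters)
		fmt.Printf("policy=%d voters=%d quorum=%d\n", p, n, quorum(n))
	}
}
```

Under these assumptions the entry needs 3 acknowledgements out of 4 when the configuration takes effect on append, but only 2 out of 3 when it takes effect on apply, which is what "committed under the old membership instead of the new" means.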