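HA concepts:
Availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair. The usual target is five to six nines (99.999% to 99.9999%).
Availability can be improved by shortening the repair time.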
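To make the formula concrete, here is a minimal sketch that computes availability and converts it into expected downtime per year. The function names and the figures in the comments (1000 hours between failures, 4 hours versus 4 seconds to recover) are illustrative assumptions, not measurements from these notes.

```python
def availability(mtbf: float, mttr: float) -> float:
    """Availability = MTBF / (MTBF + MTTR); both values in the same time unit."""
    return mtbf / (mtbf + mttr)


def yearly_downtime_minutes(avail: float) -> float:
    """Expected downtime, in minutes, over a 365-day year."""
    return (1.0 - avail) * 365 * 24 * 60


# Hypothetical node that fails once every 1000 hours.
# Manual repair taking 4 hours versus an automatic failover taking about 4 seconds:
manual = availability(1000, 4)
failover = availability(1000, 4 / 3600)
# failover is much closer to 1.0 than manual; only the repair time changed.
```

Five nines corresponds to roughly 5.26 minutes of downtime per year and six nines to about 32 seconds, which is why the repair step has to be automated rather than done by hand.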
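How to shorten the repair time: provide a standby host and implement failover.
1. Move the IP address, i.e. a floating IP
2. Move the service
In some cases session-tracking information and data must be moved as well:
A. Synchronization based on rsync + inotify
B. Shared storage
How does the standby host know that the primary node is unavailable?
The primary node periodically sends heartbeat messages to the standby node.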
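The heartbeat idea can be sketched as a small timeout check on the standby side. This is a generic illustration, not how any particular cluster stack implements it; the class name and the choice of three tolerated misses are assumptions.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Standby-side view of the primary, driven by heartbeat arrival times."""
    interval: float          # seconds between expected heartbeats
    max_missed: int = 3      # consecutive misses tolerated (assumed value)
    last_seen: float = 0.0   # timestamp of the most recent heartbeat

    def record(self, now: float) -> None:
        """Call whenever a heartbeat arrives."""
        self.last_seen = now

    def primary_is_down(self, now: float) -> bool:
        """True once the silence exceeds max_missed heartbeat intervals."""
        return now - self.last_seen > self.interval * self.max_missed
```

In a real deployment `now` would come from a monotonic clock such as `time.monotonic()`, so that wall-clock adjustments cannot trigger a false failover.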
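Notes:
SAN: block-level interface (it looks like a disk). If two back-end hosts mount the same SAN volume and operate on the same file at the same moment, the file system may be corrupted.
NAS: file-system-level interface.
When the back-end hosts share SAN storage and one host fails, a fencing device is normally used to make sure the failed host really cannot keep working, so that it does not contend for the storage if it recovers on its own.
STONITH (Shoot The Other Node In The Head): cut the failed node's power.
The hosts all reach the shared storage through a switch, so disabling the failed host's storage-facing port on the switch achieves the same result.
Failback: the standby host may be much less powerful, so once the primary node is repaired it should take the resources back immediately.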
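HA Cluster implementations:
VRRP implementation: keepalived
Full AIS-style HA clusters: RHCS (cman), heartbeat, corosync
VRRP: Virtual Router Redundancy Protocol
What is VRRP?
VRRP groups several physical routers into one or more virtual routers. Each virtual router has its own identifier (VRID), and the physical devices inside it are divided into a master and backups. The master continuously sends advertisements (heartbeats) carrying its priority to the backups. A backup uses these advertisements to judge whether the master has failed, and uses the priority to decide whether to take over the master's resources; whether it actually does so depends on whether it runs in preemptive or non-preemptive mode.
Preemptive: a node whose priority is higher than the current master's takes over.
Non-preemptive: a node with a higher priority does not take over as long as the current master is working normally.
VRRP only moves the IP address.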
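The backup's decision can be sketched as follows. The master-down timer follows the VRRPv2 definition in RFC 3768 (three advertisement intervals plus a priority-dependent skew); the function names and the simplified two-node view are assumptions made for illustration.

```python
def master_down_interval(priority: int, advert_int: float = 1.0) -> float:
    """Seconds of silence after which a backup declares the master dead (VRRPv2)."""
    skew_time = (256 - priority) / 256   # higher priority means a shorter wait
    return 3 * advert_int + skew_time


def backup_should_take_over(my_priority: int, master_priority: int,
                            master_alive: bool, preempt: bool = True) -> bool:
    """Decide whether this node should become master."""
    if not master_alive:
        return True                      # no master left: take over in either mode
    # Master is healthy: only a preemptive node with higher priority takes over.
    return preempt and my_priority > master_priority
```

With `advert_int 1` and priority 98 the timer is about 3.6 seconds, so a backup that hears nothing promotes itself a few seconds after starting.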
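Terminology:
Virtual router: the logical router that clients see
Virtual router identifier: VRID (1-255)
Physical router:
Master: the active device
Backup: a standby device
Priority: decides which device becomes master
VIP: virtual IP
VMAC: virtual MAC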
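The VRID and the virtual MAC are directly related: for IPv4, VRRP reserves the address 00:00:5E:00:01:XX, where XX is the VRID in hexadecimal. A minimal sketch (the function name is an assumption):

```python
def vrrp_virtual_mac(vrid: int) -> str:
    """Return the IPv4 VRRP virtual MAC address for a given VRID."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    return f"00:00:5e:00:01:{vrid:02x}"
```

For example, VRID 51 maps to 00:00:5e:00:01:33. Note that keepalived only uses the virtual MAC when `use_vmac` is configured; otherwise the VIP sits on the interface's real MAC and gratuitous ARP is used to update neighbours after a failover.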
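Keepalived:
A software implementation of VRRP that runs only on Linux and provides IP address failover;
generates ipvs rules from its configuration;
monitors the health of the back-end hosts;
was originally designed to make LVS highly available.
Components:
VRRP stack, ipvs wrapper, checkers, configuration file parser, I/O multiplexer, memory management component.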
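Prerequisites for configuring an HA cluster:
1. Time is synchronized across all nodes
2. iptables and SELinux do not get in the way
3. Nodes can reach each other by host name, e.g. through the local hosts file (not required for keepalived)
4. Nodes can communicate over ssh with key-based authentication (not required)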
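Installing and configuring keepalived:
On CentOS 6.4 and later it is provided by the base repository.
Program environment:
Main configuration file: /etc/keepalived/keepalived.conf
GLOBAL CONFIGURATION
Global definitions
Static routes and addresses
VRRPD CONFIGURATION
VRRP synchronization groups
VRRP instances; each vrrp_instance corresponds to one virtual router
LVS CONFIGURATION
The virtual servers (vs) and real servers (rs) of the ipvs cluster
Configuration file example:
Global configuration section: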
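global_defs {
    notification_email {
        root@localhost
        # recipient address
    }
    notification_email_from Alexandre.Cassen@firewall.loc    # sender address
    smtp_server 192.168.200.1    # mail server address
    smtp_connect_timeout 30    # timeout for connecting to the mail server, in seconds
    router_id LVS_DEVEL    # identifier of this physical router
    vrrp_mcast_group4 224.0.0.18    # IPv4 multicast group for advertisements; any address in 224.0.0.0-239.255.255.255 (224.0.0.18 is the default)
}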
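Virtual router section:
vrrp_instance VI_1 {    # define a virtual router instance
    state MASTER    # initial state of this node
    interface eth0    # interface used to send heartbeats (advertisements)
    virtual_router_id 51    # ID of the virtual router (VRID)
    priority 100    # priority
    advert_int 1    # advertisement interval, in seconds
    authentication {    # authentication
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {    # VIP
        192.168.200.16/24 dev eno16777736    # interface the VIP is added to; append "label <name>" to put it on an interface alias
    }
}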
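Other available directives:
nopreempt: non-preemptive mode (the instance's initial state must be BACKUP for this to take effect)
preempt_delay 300: in preemptive mode, how long (in seconds) a node waits after coming online before it triggers a new election
Notification scripts:
notify_master: script run when this node becomes the master
notify_backup: script run when this node becomes a backup
notify_fault: script run when this node enters the fault state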
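A notification script can be any executable. The sketch below is one possible shape, assuming a hypothetical script saved as /etc/keepalived/notify.py and wired in with lines such as `notify_master "/etc/keepalived/notify.py master"` inside the vrrp_instance block. It only prints a message; sending mail or writing to a log would be added where the comment indicates.

```python
#!/usr/bin/env python3
import socket
import sys
from datetime import datetime

VALID_STATES = ("master", "backup", "fault")


def build_message(state: str, host: str, when: datetime) -> str:
    """Format a one-line description of a VRRP state transition."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown state: {state}")
    return f"{when:%Y-%m-%d %H:%M:%S} {host}: VRRP transition, now in {state.upper()} state"


def main(argv: list) -> int:
    """Entry point; expects the new state as the only argument."""
    if len(argv) != 2 or argv[1] not in VALID_STATES:
        print(f"usage: {argv[0]} {{master|backup|fault}}", file=sys.stderr)
        return 1
    # Replace print() with mail delivery or logging as needed.
    print(build_message(argv[1], socket.gethostname(), datetime.now()))
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Passing the state as an argument lets one script serve all three directives, so the master and backup nodes can share an identical file.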
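Practice: IP address failover with a VRRP master/backup setup:
Environment:
Time is synchronized across the nodes, and iptables and SELinux do not get in the way
Node1 (CentOS 7) 172.18.20.7 MASTER
Node2 (CentOS 7) 172.18.20.8 BACKUP
VIP 172.18.20.100/16
vrrp_mcast_group4 224.0.0.100
Configuring node1: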
[root@localhost keepalived]# cat keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node1
    vrrp_mcast_group4 224.0.0.100
}
vrrp_instance myroute {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ck2384
    }
    virtual_ipaddress {
        172.18.20.100/16 dev ens33
    }
}
Configuring node2:
[root@localhost keepalived]# cat keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node2
    vrrp_mcast_group4 224.0.0.100
}
vrrp_instance myroute {
    state BACKUP
    interface eno16777736
    virtual_router_id 51
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass ck2384
    }
    virtual_ipaddress {
        172.18.20.100/16 dev eno16777736
    }
}
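The two vrrp_instance blocks must agree on the instance name, virtual_router_id, authentication and VIP, and differ only in state, interface and priority. One way to keep them consistent is to render both from a single template. The sketch below is an illustration of that idea, not part of keepalived; the helper name `render_instance` is an assumption, while the values are taken from the two listings above.

```python
TEMPLATE = """\
vrrp_instance {name} {{
    state {state}
    interface {interface}
    virtual_router_id {vrid}
    priority {priority}
    advert_int 1
    authentication {{
        auth_type PASS
        auth_pass {auth_pass}
    }}
    virtual_ipaddress {{
        {vip} dev {interface}
    }}
}}
"""

# Settings that must be identical on every node of the virtual router.
SHARED = {"name": "myroute", "vrid": 51, "auth_pass": "ck2384",
          "vip": "172.18.20.100/16"}


def render_instance(state: str, interface: str, priority: int) -> str:
    """Render one node's vrrp_instance block from the shared settings."""
    return TEMPLATE.format(state=state, interface=interface,
                           priority=priority, **SHARED)


node1 = render_instance("MASTER", "ens33", 100)
node2 = render_instance("BACKUP", "eno16777736", 98)
```

The `router_id` in global_defs also differs per node and would need the same treatment if the whole file were generated.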
Test:
1. Start node2 first. With no master on the network, it moves from BACKUP to MASTER and configures the VIP.
[root@node2 ~]# systemctl status keepalived
● keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2017-05-30 15:00:36 CST; 6s ago
Process: 5004 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 5005 (keepalived)
CGroup: /system.slice/keepalived.service
├─5005 /usr/sbin/keepalived -D
├─5006 /usr/sbin/keepalived -D
└─5007 /usr/sbin/keepalived -D
May 30 15:00:36 node2 Keepalived_vrrp[5007]: Opening file '/etc/keepalived/keepalived.conf'.
May 30 15:00:36 node2 Keepalived_vrrp[5007]: Configuration is using : 62909 Bytes
May 30 15:00:36 node2 Keepalived_vrrp[5007]: Using LinkWatch kernel netlink reflector...
May 30 15:00:36 node2 Keepalived_vrrp[5007]: VRRP_Instance(myroute) Entering BACKUP STATE
May 30 15:00:36 node2 Keepalived_vrrp[5007]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
May 30 15:00:40 node2 Keepalived_vrrp[5007]: VRRP_Instance(myroute) Transition to MASTER STATE
May 30 15:00:41 node2 Keepalived_vrrp[5007]: VRRP_Instance(myroute) Entering MASTER STATE
May 30 15:00:41 node2 Keepalived_vrrp[5007]: VRRP_Instance(myroute) setting protocol VIPs.
May 30 15:00:41 node2 Keepalived_vrrp[5007]: VRRP_Instance(myroute) Sending gratuitous ARPs on eno16777736 for 172.18.20.100
May 30 15:00:41 node2 Keepalived_healthcheckers[5006]: Netlink reflector reports IP 172.18.20.100 added
[root@node2 ~]# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:cf:ba:90 brd ff:ff:ff:ff:ff:ff
inet 172.18.20.8/16 brd 172.18.255.255 scope global eno16777736
valid_lft forever preferred_lft forever
inet 172.18.20.100/16 scope global secondary eno16777736
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fecf:ba90/64 scope link
valid_lft forever preferred_lft forever
2. Start node1. Its priority is higher, so it forces a new election and the VIP ends up on node1, the master.
[root@node1 ~]# systemctl start keepalived
[root@node1 ~]# systemctl status keepalived
● keepalived.service - LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2017-05-30 03:02:18 EDT; 4s ago
Process: 49646 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 49647 (keepalived)
CGroup: /system.slice/keepalived.service
├─49647 /usr/sbin/keepalived -D
├─49648 /usr/sbin/keepalived -D
└─49649 /usr/sbin/keepalived -D
May 30 03:02:18 node1 Keepalived_vrrp[49649]: Opening file '/etc/keepalived/keepalived.conf'.
May 30 03:02:18 node1 Keepalived_vrrp[49649]: Configuration is using : 62887 Bytes
May 30 03:02:18 node1 Keepalived_vrrp[49649]: Using LinkWatch kernel netlink reflector...
May 30 03:02:18 node1 Keepalived_vrrp[49649]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
May 30 03:02:19 node1 Keepalived_vrrp[49649]: VRRP_Instance(myroute) Transition to MASTER STATE
May 30 03:02:19 node1 Keepalived_vrrp[49649]: VRRP_Instance(myroute) Received lower prio advert, forcing new election
May 30 03:02:20 node1 Keepalived_vrrp[49649]: VRRP_Instance(myroute) Entering MASTER STATE
May 30 03:02:20 node1 Keepalived_vrrp[49649]: VRRP_Instance(myroute) setting protocol VIPs.
May 30 03:02:20 node1 Keepalived_vrrp[49649]: VRRP_Instance(myroute) Sending gratuitous ARPs on ens33 for 172.18.20.100
May 30 03:02:20 node1 Keepalived_healthcheckers[49648]: Netlink reflector reports IP 172.18.20.100 added
[root@node1 ~]# ip a l
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:e4:53:15 brd ff:ff:ff:ff:ff:ff
inet 172.18.20.7/16 brd 172.18.255.255 scope global ens33
valid_lft forever preferred_lft forever
inet 172.18.252.52/16 brd 172.18.255.255 scope global secondary dynamic ens33
valid_lft 71521sec preferred_lft 71521sec
inet 172.18.20.100/16 scope global secondary ens33
valid_lft forever preferred_lft forever
inet6 fe80::87aa:10b2:67e3:70d0/64 scope link
valid_lft forever preferred_lft forever
3. Capture packets on node2; the master's advertisements are visible.
[root@node2 ~]# tcpdump -i eno16777736 -nn host 224.0.0.100
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno16777736, link-type EN10MB (Ethernet), capture size 65535 bytes
15:04:21.217999 IP 172.18.20.7 > 224.0.0.100: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
15:04:22.221621 IP 172.18.20.7 > 224.0.0.100: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
15:04:23.227296 IP 172.18.20.7 > 224.0.0.100: VRRPv2, Advertisement, vrid 51, prio 100, authtype simple, intvl 1s, length 20
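The fields tcpdump prints (vrid, prio, authtype, intvl, length 20) map directly onto the VRRPv2 packet layout from RFC 3768: an 8-byte header, one 4-byte address per VIP, and 8 bytes of authentication data. The sketch below decodes such a payload. The sample bytes are hand-built to match the capture above rather than taken from a real packet, and the checksum is left at zero and not verified.

```python
import socket
import struct


def parse_vrrpv2(payload: bytes) -> dict:
    """Decode the VRRPv2 portion of an advertisement (the bytes after the IP header)."""
    if len(payload) < 8:
        raise ValueError("payload shorter than the VRRPv2 header")
    ver_type, vrid, priority, count, auth_type, advert_int, checksum = \
        struct.unpack("!BBBBBBH", payload[:8])
    version, ptype = ver_type >> 4, ver_type & 0x0F
    if version != 2:
        raise ValueError(f"not a VRRPv2 packet (version {version})")
    addrs_end = 8 + 4 * count
    if len(payload) < addrs_end + 8:
        raise ValueError("payload truncated")
    addresses = [socket.inet_ntoa(payload[i:i + 4]) for i in range(8, addrs_end, 4)]
    return {
        "version": version,
        "type": ptype,                  # 1 = advertisement
        "vrid": vrid,
        "priority": priority,
        "auth_type": auth_type,         # 1 = simple text password
        "advert_int": advert_int,
        "checksum": checksum,
        "addresses": addresses,
        "auth_data": payload[addrs_end:addrs_end + 8],
    }


# Hand-built sample: version 2 / type 1, vrid 51, prio 100, one VIP, simple auth, 1 s interval.
sample = (bytes([0x21, 51, 100, 1, 1, 1, 0, 0])
          + socket.inet_aton("172.18.20.100")
          + b"ck2384\x00\x00")
info = parse_vrrpv2(sample)
```

The simple-auth password travels in clear text in the last 8 bytes, which is why `auth_type PASS` guards against misconfiguration rather than against an attacker on the same segment.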