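MHA (Master High Availability) is one of the more mature and widely used high-availability solutions for MySQL. When a MySQL master fails, MHA performs the failover quickly and automatically while keeping master and slave data as consistent as possible.
MHA overview
MHA consists of two parts: MHA Manager (the management node) and MHA Node (the data node).
MHA Manager tools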
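| Tool | Description |
|---|---|
| masterha_check_ssh | Checks passwordless SSH connectivity |
| masterha_check_repl | Checks MySQL replication status |
| masterha_check_status | Checks whether MHA is running |
| masterha_manager | Starts MHA |
| masterha_master_monitor | Detects whether the master has failed |
| masterha_master_switch | Performs a manual failover |
| masterha_conf_host | Manually adds server entries |
| masterha_secondary_check | Checks the master over TCP connections opened from remote servers |
| masterha_stop | Stops MHA |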
Deployment
Environment
master 192.168.1.100 primary master
slave 192.168.1.107 candidate (standby) master
slave 192.168.1.108
manager 192.168.1.101
1. Set up master-slave replication between the databases
Output of show slave status\G on the two slaves, mysql-107 first and then mysql-108:
```sql
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.100
Master_User: slave
Master_Port: 3306
Connect_Retry: 30
Master_Log_File: mysql-bin.000011
Read_Master_Log_Pos: 154
Relay_Log_File: mysql-107-relay-bin.000003
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000011
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.100
Master_User: slave
Master_Port: 3306
Connect_Retry: 30
Master_Log_File: mysql-bin.000011
Read_Master_Log_Pos: 154
Relay_Log_File: mysql-108-relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000011
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
```
The candidate master must also have binary logging (binlog) enabled.
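Before handing the cluster to MHA, it helps to confirm on each slave that both replication threads are running. The following is a minimal sketch and not part of MHA: the function name `replication_healthy` is made up for this example, and it only inspects the `Slave_IO_Running` and `Slave_SQL_Running` fields of `show slave status\G` output read from standard input.

```shell
# Succeed (exit status 0) only if both replication threads report "Yes".
# Reads the output of `show slave status\G` from standard input.
replication_healthy() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "", $2) }            # trim trailing whitespace from the value
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Hypothetical usage on a slave (credentials are placeholders):
#   mysql -uroot -p -e 'show slave status\G' | replication_healthy && echo "replication OK"
```

Because the function only looks at text, it can also be fed a saved copy of the status output when checking a host after the fact.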
2. Configure the hosts file
```shell
192.168.1.100 mysql-100
192.168.1.107 mysql-107
192.168.1.108 mysql-108
192.168.1.101 manager
```
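The same four entries have to be present on every host. Below is a small sketch of adding them idempotently, so that running it twice does not duplicate lines. `add_host_entry` and `add_mha_hosts` are names invented here, and the target file is passed as a parameter so the functions can be tried on a scratch file before touching /etc/hosts.

```shell
# Append "IP NAME" to FILE unless NAME is already listed there.
add_host_entry() {
    # $1 = IP address, $2 = host name, $3 = hosts file
    if ! awk -v name="$2" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }
         ' "$3" 2>/dev/null; then
        printf '%s %s\n' "$1" "$2" >> "$3"
    fi
}

# Add the four hosts used in this walkthrough.
add_mha_hosts() {
    add_host_entry 192.168.1.100 mysql-100 "$1"
    add_host_entry 192.168.1.107 mysql-107 "$1"
    add_host_entry 192.168.1.108 mysql-108 "$1"
    add_host_entry 192.168.1.101 manager   "$1"
}

# Usage (as root on each host): add_mha_hosts /etc/hosts
```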
3. Configure passwordless SSH
```shell
[root@101 ~]# ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/root/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
SHA256:yaE5BRw4flvDpcRmyVbtBpBjLW8xlOvxZ41VwCQNLzk root@101
The key's randomart image is:
+---[DSA 1024]----+
| oo+.Booo=o. |
| o ..& B .+o .|
| . . Bo* *E . .|
| . .==o= oo .|
| .+oS+ + + |
| .. . . + .|
| o |
| |
| |
+----[SHA256]-----+
[root@101 ~]# ssh-copy-id root@192.168.1.100
[root@101 ~]# ssh-copy-id root@192.168.1.107
[root@101 ~]# ssh-copy-id root@192.168.1.108
```
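The session above only distributes the manager's key. The `masterha_check_ssh` output in step 9 shows that MHA also tests connections between the MySQL nodes themselves (107 to 108 and 108 to 107), so each node needs its own key pair copied to the others. The sketch below only enumerates the source/target pairs that need an `ssh-copy-id`; `ssh_pairs` is a name made up for this example. On systems with newer OpenSSH releases, where DSA keys are disabled by default, `ssh-keygen -t rsa` or `-t ed25519` would be the likely substitute for `-t dsa`.

```shell
# Print every ordered "source target" pair among the given hosts.
ssh_pairs() {
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] || printf '%s %s\n' "$src" "$dst"
        done
    done
}

# Example: list what has to be run among the three MySQL nodes.
#   ssh_pairs 192.168.1.100 192.168.1.107 192.168.1.108 |
#       while read -r src dst; do echo "on $src: ssh-copy-id root@$dst"; done
```

For three nodes this yields six pairs, which matches the all-to-all pattern that `masterha_check_ssh` exercises.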
4. Install the dependencies on all nodes
```shell
[root@101 ~]# wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo   # add the Aliyun EPEL repository
[root@101 ~]# yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes   # install the dependencies
```
5. Install the node and manager packages (you can also download the tar.gz and build from source)
```shell
[root@101 ~]# wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@101 ~]# wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
[root@101 ~]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm   # install node first; the manager package depends on it
[root@101 ~]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
```
The manager package only needs to be installed on the manager host.
6. Create the MHA administrative account and grant privileges
```sql
grant all privileges on *.* to 'mha'@'192.168.1.%' identified by 'Xk321';
```
7. Edit the MHA configuration file
The sample scripts and configuration are copied from the extracted source tree, here under /root/mha4mysql-manager-0.58:
```shell
[root@101 mha4mysql-manager-0.58]# cp -ra /root/mha4mysql-manager-0.58/samples/scripts/ /usr/local/bin
[root@101 mha4mysql-manager-0.58]# cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin
[root@101 scripts]# cd /usr/local/bin/scripts/
[root@101 scripts]# cp master_ip_online_change /usr/local/bin/
[root@101 scripts]# cp send_report /usr/local/
[root@101 scripts]# mkdir /etc/masterha
[root@101 scripts]# cp /root/mha4mysql-manager-0.58/samples/conf/app1.cnf /etc/masterha/
```
```shell
[root@101 ~]# cat /etc/masterha/app1.cnf
[server default]
# MHA manager log file
manager_log=/var/log/masterha/app1/manager.log
# MHA working directory
manager_workdir=/var/log/masterha/app1
# directory on the master that holds its binary logs
master_binlog_dir=/data/mysql/
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
# password of the MHA administrative user
password=Xk321
# interval in seconds between health-check pings to the master
ping_interval=1
remote_workdir=/tmp
# password of the replication user
repl_password=Xk321
# replication user
repl_user=slave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.1.107 -s 192.168.1.108
# user for remote SSH logins
ssh_user=root
# MHA administrative user
user=mha
[server1]
hostname=192.168.1.100
port=3306
# server section label
[server2]
# preferred as the new master when the current master fails
candidate_master=1
check_repl_delay=0
# host address
hostname=192.168.1.107
# port
port=3306
[server3]
hostname=192.168.1.108
port=3306
```
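A quick way to see which hosts a configuration file actually hands to MHA is to list its `[serverN]` sections. This is useful after a failover too, since the `--remove_dead_master_conf` option used in step 11 removes the dead master's section from the file. The helper below is a sketch with an invented name, `mha_list_servers`; it assumes one `hostname=` line per section and tolerates a trailing `#` comment.

```shell
# Print "section hostname" for every [serverN] block in an MHA config file.
mha_list_servers() {
    awk '
        /^\[server[0-9]+\]/ { sec = $1; next }   # remember the current server section
        /^\[/               { sec = ""; next }   # any other section, e.g. [server default]
        sec != "" && /^hostname[ \t]*=/ {
            sub(/^hostname[ \t]*=[ \t]*/, "")    # drop the key
            sub(/[ \t]*#.*$/, "")                # drop a trailing comment, if any
            print sec, $0
        }
    ' "$1"
}

# Usage on the manager host: mha_list_servers /etc/masterha/app1.cnf
```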
8. Configure master_ip_failover
```perl
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);
######################################################################
my $vip   = '192.168.1.111';    # virtual IP that follows the master
my $brdc  = '192.168.1.255';
my $ifdev = 'ens192';           # network interface that carries the VIP
my $key   = '1';                # alias index, giving ens192:1
my $ssh_start_vip = "/sbin/ifconfig ${ifdev}:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig ${ifdev}:$key down";
my $exit_code = 0;
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
############################################################################
# MHA 0.58 also passes --new_master_user and --new_master_password. They are not
# declared here, so Getopt::Long reports them as unknown options and carries on.
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    if ( $command eq "stop" || $command eq "stopssh" ) {
        # Called when the old master is being shut down: remove the VIP from it.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        # Called once the new master is ready: bring the VIP up on it.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
```
9. Run the checks
```shell
[root@101 ~]# masterha_check_ssh -conf=/etc/masterha/app1.cnf   # check passwordless SSH connectivity
Fri May 7 21:45:37 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri May 7 21:45:37 2021 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Fri May 7 21:45:37 2021 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Fri May 7 21:45:37 2021 - [info] Starting SSH connection tests..
Fri May 7 21:45:38 2021 - [debug]
Fri May 7 21:45:37 2021 - [debug] Connecting via SSH from root@192.168.1.107(192.168.1.107:22) to root@192.168.1.108(192.168.1.108:22)..
Fri May 7 21:45:37 2021 - [debug] ok.
Fri May 7 21:45:38 2021 - [debug]
Fri May 7 21:45:38 2021 - [debug] Connecting via SSH from root@192.168.1.108(192.168.1.108:22) to root@192.168.1.107(192.168.1.107:22)..
Fri May 7 21:45:38 2021 - [debug] ok.
Fri May 7 21:45:38 2021 - [info] All SSH connection tests passed successfully.
......
Thu May 6 23:00:06 2021 - [info] All SSH connection tests passed successfully.
[root@101 mha]# masterha_check_repl -conf=/etc/masterha/app1.cnf   # check replication status
......
mysql-100(192.168.1.100:3306) (current master)
+--mysql-107(192.168.1.107:3306)
+--mysql-108(192.168.1.108:3306)
......
MySQL Replication Health is OK.
```
10. Manually create the VIP on the master
```shell
[root@100 ~]# ifconfig ens192:1 192.168.1.111
```
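To confirm where the VIP currently lives, before and after a failover, the interface listing can be checked for the address. A minimal sketch follows; `has_vip` is a made-up helper that reads `ifconfig` or `ip addr` output on standard input and assumes the `inet ADDRESS ...` line format printed by the CentOS 7 tools used in this walkthrough.

```shell
# Succeed if the given IPv4 address appears on an "inet" line of stdin.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" {
            split($2, addr, "/")          # "ip addr" prints ADDRESS/PREFIX
            if (addr[1] == vip) found = 1
        }
        END { exit !found }
    '
}

# Usage on a MySQL node:
#   ifconfig ens192:1 | has_vip 192.168.1.111 && echo "VIP is here"
```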
11. Start MHA
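```shell
[root@101 mha]# mkdir -p /var/log/masterha/app1   # the log directory must exist before the redirect below can write to it
[root@101 mha]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
```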
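Once the manager is running in the background, `masterha_check_status` reports its state and the host it currently treats as the master, in a line such as `app1 (pid:1297) is running(0:PING_OK), master:192.168.1.100`. The helper below is a sketch that extracts the master from that line; `mha_current_master` is an invented name, and any line that does not match this format is treated as "not running".

```shell
# Print the master reported by masterha_check_status (read from stdin).
# Returns non-zero when the line does not report a running manager.
mha_current_master() {
    master=$(sed -n 's/.*is running(0:PING_OK), master:\([^ ]*\).*/\1/p')
    [ -n "$master" ] && printf '%s\n' "$master"
}

# Usage on the manager host:
#   masterha_check_status --conf=/etc/masterha/app1.cnf | mha_current_master
```

A monitoring script could call this periodically and alert when it returns non-zero, which is also what happens after a failover because the manager exits.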
12. Test failover
```shell
[root@101 ~]# masterha_check_status -conf=/etc/masterha/app1.cnf
app1 (pid:1297) is running(0:PING_OK), master:192.168.1.100
# stop MySQL on the master
[root@100 ~]# /etc/init.d/mysqld stop
Shutting down MySQL............ SUCCESS!
```
On mysql-108, check which master it now replicates from:
```sql
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.107
Master_User: slave
Master_Port: 3306
Connect_Retry: 30
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 154
Relay_Log_File: mysql-108-relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
```
On mysql-107, check the VIP:
```shell
[root@mysql-107 ~]# ifconfig ens192:1
ens192:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.111 netmask 255.255.255.0 broadcast 192.168.1.255
ether 00:0c:29:8f:ef:9a txqueuelen 1000 (Ethernet)
```
The VIP has moved over to mysql-107.
Check the manager log:
[root@101 ~]# tail -f /var/log/masterha/app1/manager.log
IN SCRIPT TEST====/sbin/ifconfig ens192:1 down==/sbin/ifconfig ens192:1 192.168.1.111===
Checking the Status of the script.. OK
Fri May 7 21:30:02 2021 - [info] OK.
Fri May 7 21:30:02 2021 - [warning] shutdown_script is not defined.
Fri May 7 21:30:02 2021 - [info] Set master ping interval 1 seconds.
Fri May 7 21:30:02 2021 - [info] Set secondary check script: /usr/local/bin/masterha_secondary_check -s 192.168.1.107 -s 192.168.1.108
Fri May 7 21:30:02 2021 - [info] Starting ping health check on 192.168.1.100(192.168.1.100:3306)..
Fri May 7 21:30:02 2021 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
Fri May 7 21:30:19 2021 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Fri May 7 21:30:19 2021 - [info] Executing secondary network check script: /usr/local/bin/masterha_secondary_check -s 192.168.1.107 -s 192.168.1.108 --user=root --master_host=192.168.1.100 --master_ip=192.168.1.100 --master_port=3306 --master_user=mha --master_password=Xk321 --ping_type=SELECT
Fri May 7 21:30:19 2021 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql/ --output_file=/tmp/save_binary_logs_test --manager_version=0.58 --binlog_prefix=mysql-bin
Fri May 7 21:30:19 2021 - [info] HealthCheck: SSH to 192.168.1.100 is reachable.
Monitoring server 192.168.1.107 is reachable, Master is not reachable from 192.168.1.107. OK.
Monitoring server 192.168.1.108 is reachable, Master is not reachable from 192.168.1.108. OK.
Fri May 7 21:30:19 2021 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Fri May 7 21:30:20 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.1.100' (111))
Fri May 7 21:30:20 2021 - [warning] Connection failed 2 time(s)..
Fri May 7 21:30:21 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.1.100' (111))
Fri May 7 21:30:21 2021 - [warning] Connection failed 3 time(s)..
Fri May 7 21:30:22 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.1.100' (111))
Fri May 7 21:30:22 2021 - [warning] Connection failed 4 time(s)..
Fri May 7 21:30:22 2021 - [warning] Master is not reachable from health checker!
Fri May 7 21:30:22 2021 - [warning] Master 192.168.1.100(192.168.1.100:3306) is not reachable!
Fri May 7 21:30:22 2021 - [warning] SSH is reachable.
Fri May 7 21:30:22 2021 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Fri May 7 21:30:22 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri May 7 21:30:22 2021 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Fri May 7 21:30:22 2021 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Fri May 7 21:30:23 2021 - [info] GTID failover mode = 0
Fri May 7 21:30:23 2021 - [info] Dead Servers:
Fri May 7 21:30:23 2021 - [info] 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:23 2021 - [info] Alive Servers:
Fri May 7 21:30:23 2021 - [info] 192.168.1.107(192.168.1.107:3306)
Fri May 7 21:30:23 2021 - [info] 192.168.1.108(192.168.1.108:3306)
Fri May 7 21:30:23 2021 - [info] Alive Slaves:
Fri May 7 21:30:23 2021 - [info] 192.168.1.107(192.168.1.107:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:23 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:23 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri May 7 21:30:23 2021 - [info] 192.168.1.108(192.168.1.108:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:23 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:23 2021 - [info] Checking slave configurations..
Fri May 7 21:30:23 2021 - [info] read_only=1 is not set on slave 192.168.1.107(192.168.1.107:3306).
Fri May 7 21:30:23 2021 - [warning] relay_log_purge=0 is not set on slave 192.168.1.107(192.168.1.107:3306).
Fri May 7 21:30:23 2021 - [info] read_only=1 is not set on slave 192.168.1.108(192.168.1.108:3306).
Fri May 7 21:30:23 2021 - [warning] relay_log_purge=0 is not set on slave 192.168.1.108(192.168.1.108:3306).
Fri May 7 21:30:23 2021 - [info] Checking replication filtering settings..
Fri May 7 21:30:23 2021 - [info] Replication filtering check ok.
Fri May 7 21:30:23 2021 - [info] Master is down!
Fri May 7 21:30:23 2021 - [info] Terminating monitoring script.
Fri May 7 21:30:23 2021 - [info] Got exit code 20 (Master dead).
Fri May 7 21:30:23 2021 - [info] MHA::MasterFailover version 0.58.
Fri May 7 21:30:23 2021 - [info] Starting master failover.
Fri May 7 21:30:23 2021 - [info]
Fri May 7 21:30:23 2021 - [info] * Phase 1: Configuration Check Phase..
Fri May 7 21:30:23 2021 - [info]
Fri May 7 21:30:24 2021 - [info] GTID failover mode = 0
Fri May 7 21:30:24 2021 - [info] Dead Servers:
Fri May 7 21:30:24 2021 - [info] 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:24 2021 - [info] Checking master reachability via MySQL(double check)...
Fri May 7 21:30:24 2021 - [info] ok.
Fri May 7 21:30:24 2021 - [info] Alive Servers:
Fri May 7 21:30:24 2021 - [info] 192.168.1.107(192.168.1.107:3306)
Fri May 7 21:30:24 2021 - [info] 192.168.1.108(192.168.1.108:3306)
Fri May 7 21:30:24 2021 - [info] Alive Slaves:
Fri May 7 21:30:24 2021 - [info] 192.168.1.107(192.168.1.107:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:24 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:24 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri May 7 21:30:24 2021 - [info] 192.168.1.108(192.168.1.108:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:24 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:24 2021 - [info] Starting Non-GTID based failover.
Fri May 7 21:30:24 2021 - [info]
Fri May 7 21:30:24 2021 - [info] ** Phase 1: Configuration Check Phase completed.
Fri May 7 21:30:24 2021 - [info]
Fri May 7 21:30:24 2021 - [info] * Phase 2: Dead Master Shutdown Phase..
Fri May 7 21:30:24 2021 - [info]
Fri May 7 21:30:24 2021 - [info] Forcing shutdown so that applications never connect to the current master..
Fri May 7 21:30:24 2021 - [info] Executing master IP deactivation script:
Fri May 7 21:30:24 2021 - [info] /usr/local/bin/master_ip_failover --orig_master_host=192.168.1.100 --orig_master_ip=192.168.1.100 --orig_master_port=3306 --command=stopssh --ssh_user=root
IN SCRIPT TEST====/sbin/ifconfig ens192:1 down==/sbin/ifconfig ens192:1 192.168.1.111===
Disabling the VIP on old master: 192.168.1.100
Fri May 7 21:30:24 2021 - [info] done.
Fri May 7 21:30:24 2021 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Fri May 7 21:30:24 2021 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Fri May 7 21:30:24 2021 - [info]
Fri May 7 21:30:24 2021 - [info] * Phase 3: Master Recovery Phase..
Fri May 7 21:30:24 2021 - [info]
Fri May 7 21:30:24 2021 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Fri May 7 21:30:24 2021 - [info]
Fri May 7 21:30:24 2021 - [info] The latest binary log file/position on all slaves is mysql-bin.000035:154
Fri May 7 21:30:24 2021 - [info] Latest slaves (Slaves that received relay log files to the latest):
Fri May 7 21:30:24 2021 - [info] 192.168.1.107(192.168.1.107:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:24 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:24 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri May 7 21:30:24 2021 - [info] 192.168.1.108(192.168.1.108:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:24 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:24 2021 - [info] The oldest binary log file/position on all slaves is mysql-bin.000035:154
Fri May 7 21:30:24 2021 - [info] Oldest slaves:
Fri May 7 21:30:24 2021 - [info] 192.168.1.107(192.168.1.107:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:24 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:24 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri May 7 21:30:24 2021 - [info] 192.168.1.108(192.168.1.108:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:24 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:24 2021 - [info]
Fri May 7 21:30:24 2021 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Fri May 7 21:30:24 2021 - [info]
Fri May 7 21:30:24 2021 - [info] Fetching dead master's binary logs..
Fri May 7 21:30:24 2021 - [info] Executing command on the dead master 192.168.1.100(192.168.1.100:3306): save_binary_logs --command=save --start_file=mysql-bin.000035 --start_pos=154 --binlog_dir=/data/mysql/ --output_file=/tmp/saved_master_binlog_from_192.168.1.100_3306_20210507213023.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58
Creating /tmp if not exists.. ok.
Concat binary/relay logs from mysql-bin.000035 pos 154 to mysql-bin.000035 EOF into /tmp/saved_master_binlog_from_192.168.1.100_3306_20210507213023.binlog ..
Binlog Checksum enabled
Dumping binlog format description event, from position 0 to 154.. ok.
No need to dump effective binlog data from /data/mysql//mysql-bin.000035 (pos starts 154, filesize 154). Skipping.
Binlog Checksum enabled
/tmp/saved_master_binlog_from_192.168.1.100_3306_20210507213023.binlog has no effective data events.
Event not exists.
Fri May 7 21:30:25 2021 - [info] Additional events were not found from the orig master. No need to save.
Fri May 7 21:30:25 2021 - [info]
Fri May 7 21:30:25 2021 - [info] * Phase 3.3: Determining New Master Phase..
Fri May 7 21:30:25 2021 - [info]
Fri May 7 21:30:25 2021 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Fri May 7 21:30:25 2021 - [info] All slaves received relay logs to the same position. No need to resync each other.
Fri May 7 21:30:25 2021 - [info] Searching new master from slaves..
Fri May 7 21:30:25 2021 - [info] Candidate masters from the configuration file:
Fri May 7 21:30:25 2021 - [info] 192.168.1.107(192.168.1.107:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled
Fri May 7 21:30:25 2021 - [info] Replicating from 192.168.1.100(192.168.1.100:3306)
Fri May 7 21:30:25 2021 - [info] Primary candidate for the new Master (candidate_master is set)
Fri May 7 21:30:25 2021 - [info] Non-candidate masters:
Fri May 7 21:30:25 2021 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Fri May 7 21:30:25 2021 - [info] New master is 192.168.1.107(192.168.1.107:3306)
Fri May 7 21:30:25 2021 - [info] Starting master failover..
Fri May 7 21:30:25 2021 - [info]
From:
192.168.1.100(192.168.1.100:3306) (current master)
+--192.168.1.107(192.168.1.107:3306)
+--192.168.1.108(192.168.1.108:3306)
To:
192.168.1.107(192.168.1.107:3306) (new master)
+--192.168.1.108(192.168.1.108:3306)
Fri May 7 21:30:25 2021 - [info]
Fri May 7 21:30:25 2021 - [info] * Phase 3.4: New Master Diff Log Generation Phase..
Fri May 7 21:30:25 2021 - [info]
Fri May 7 21:30:25 2021 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Fri May 7 21:30:25 2021 - [info]
Fri May 7 21:30:25 2021 - [info] * Phase 3.5: Master Log Apply Phase..
Fri May 7 21:30:25 2021 - [info]
Fri May 7 21:30:25 2021 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Fri May 7 21:30:25 2021 - [info] Starting recovery on 192.168.1.107(192.168.1.107:3306)..
Fri May 7 21:30:25 2021 - [info] This server has all relay logs. Waiting all logs to be applied..
Fri May 7 21:30:25 2021 - [info] done.
Fri May 7 21:30:25 2021 - [info] All relay logs were successfully applied.
Fri May 7 21:30:25 2021 - [info] Getting new master's binlog name and position..
Fri May 7 21:30:25 2021 - [info] mysql-bin.000007:154
Fri May 7 21:30:25 2021 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.1.107', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000007', MASTER_LOG_POS=154, MASTER_USER='slave', MASTER_PASSWORD='xxx';
Fri May 7 21:30:25 2021 - [info] Executing master IP activate script:
Fri May 7 21:30:25 2021 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=192.168.1.100 --orig_master_ip=192.168.1.100 --orig_master_port=3306 --new_master_host=192.168.1.107 --new_master_ip=192.168.1.107 --new_master_port=3306 --new_master_user='mha' --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password
IN SCRIPT TEST====/sbin/ifconfig ens192:1 down==/sbin/ifconfig ens192:1 192.168.1.111===
----- Failover Report -----
app1: MySQL Master failover 192.168.1.100(192.168.1.100:3306) to 192.168.1.107(192.168.1.107:3306) succeeded
Master 192.168.1.100(192.168.1.100:3306) is down!
Check MHA Manager logs at 101:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.1.100(192.168.1.100:3306)
The latest slave 192.168.1.107(192.168.1.107:3306) has all relay logs for recovery.
Selected 192.168.1.107(192.168.1.107:3306) as a new master.
192.168.1.107(192.168.1.107:3306): OK: Applying all logs succeeded.
192.168.1.107(192.168.1.107:3306): OK: Activated master IP address.
192.168.1.108(192.168.1.108:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.1.108(192.168.1.108:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.1.107(192.168.1.107:3306)
192.168.1.107(192.168.1.107:3306): Resetting slave info succeeded.
Master failover to 192.168.1.107(192.168.1.107:3306) completed successfully.
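In an MHA high-availability environment, the MHA service stops once the master has gone down and the failover has completed. To restore the service, rejoin the failed master to the cluster as a slave of the new master, and then restart MHA.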
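One way to rejoin the old master, sketched under assumptions: during failover the manager log records the `CHANGE MASTER TO` statement that the other slaves should use (the "Statement should be:" line in the log above). The helper below, with the invented name `mha_change_master_stmt`, pulls the most recent such statement out of a log file. The password is masked as `xxx` in the log and has to be replaced by hand, and whether the logged binlog position suits the recovered host depends on what it had written before it failed, so treat the output as a starting point rather than a finished recovery procedure.

```shell
# Print the most recent CHANGE MASTER TO statement recorded in an MHA manager log.
mha_change_master_stmt() {
    sed -n 's/.*Statement should be: \(CHANGE MASTER TO .*;\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Usage on the manager host:
#   mha_change_master_stmt /var/log/masterha/app1/manager.log
```

After running the statement and `start slave` on the recovered host, its `[server1]` section would need to be added back to app1.cnf, because `--remove_dead_master_conf` removes it on failover, and `masterha_manager` can then be started again as in step 11.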