MHA高可用
1、MHA软件介绍
原理:当Master出现故障时,它可以自动将最新数据的Slave提升为新的Master,然后将所有其他的Slave重新指向新的Master。
MHA能够在较短的时间内实现自动故障检测和故障转移,通常在10-30秒以内;在复制框架中,MHA能够很好地解决复制过程中的数据一致性问题,由于不需要在现有的replication中添加额外的服务器,仅需要一个manager节点,而一个Manager能管理多套复制,所以能大大地节约服务器的数量;另外,安装简单,无性能损耗,以及不需要修改现有的复制部署也是它的优势之处。
1 2 3 4 5 6 7 从库不更新binlog,只更新relay-log,这个可以解决的,可以配置让他更新binlog,加参数 Rvs就是阿里云上的数据库,ecs是云主机 Heartbeat+DRBD 也是做高可用的 心跳检测 MHA的manager最好不要安装在主库上,就怕断电了,mha不能工作,安装在3台从库上可以,或者重新找一台新机器安装MHA(MHA就是一个监控软件。监控主库心跳的)
工作流程
1)把宕机的master二进制日志保存下来。
2)找到binlog位置点最新的slave。
3)在binlog位置点最新的slave上用relay log(差异日志)修复其它slave。
4)将宕机的master上保存下来的二进制日志恢复到含有最新位置点的slave上。
5)将含有最新位置点binlog所在的slave提升为master。
6)将其它slave重新指向新提升的master,并开启主从复制。
1 2 3 4 5 6 7 8 keepalived有局限性,他怎么知道哪台机器的数据最新? MHA做完切换,运维无感知,需要有一个通知,MHA自带一个通知工具 问题: 1、binlog的保存位置在哪 (谁被提升为主库就发缺失数据送给谁) 2、down机的主库修复好之后,如何加入集群? 3、down机的主库修复好之后,加入集群的角色是什么?
2、MHA架构图
MHA的工具
MHA node的工具
1 2 3 4 5 [root@dbe1 bin]# 11 -rwxrwxr-x1 root root 17639 Mar 23 2018 apply_diff relay_logs -rwxrwxr-x 1 root root 4807 Mar 23 2018 filter mysqlbinlog -rwxrwxr-x 1 root root 8337 Mar 23 2018 purge_relay_logs -rwxrwxr-x 1 root root 7525 Mar 23 2018 save binary logs
MHA manager
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 [root@dbe1 bin]# 11 -rwxrwxr-x 1 root root 1995 Mar 232018 masterha_check_repl -rwxrwxr-x 1 root root 1779 Mar 232018 masterha_check_ssh -rwxrwxr-x 1 root root 1865 Mar 232018 masterha_check_status -rwxrwxr-x 1 root root 3201 Mar 232018 masterha_conf_host -rwxrwxr-x 1 root root 2517 Mar 232018 masterha_manager -rwxrwxr-x 1 root root 2165 Mar 232018 masterha_master_monitor -rwxrwxr-x 1 root root 2373 Mar 232018 masterha_master_switch -rwxrwxr-x 1 root root 5172 Mar 232018 masterha_secondary_check -rwxrwxr-x 1 root root 1739 Mar 232018 masterha_stop -rwxrwxr-x 1 root root 3648 Mar 23 2018 master_ip_failover -rwxrwxr-x 1 root root 9870 Mar 23 2018 master_ip_online_change -rwxrwxr-x 1 root root 11867 Mar 23 2018 power_manager -rwxrwxr-x 1 root root 1360 Mar 23 2818 send_report
MHA优点总结
1)Masterfailover and slave promotion can be done very quickly
自动故障转移快
2)Mastercrash does not result in data inconsistency
主库崩溃不存在数据一致性问题
3)Noneed to modify current MySQL settings (MHA works with regular MySQL)
不需要对当前mysql环境做重大修改
4)Noneed to increase lots of servers
不需要添加额外的服务器(仅一台manager就可管理上百个replication)
5)Noperformance penalty
性能优秀,可工作在半同步复制和异步复制,当监控mysql状态时,仅需要每隔N秒向master发送ping包(默认3秒),所以对性能无影响。你可以理解为MHA的性能和简单的主从复制框架性能一样。
6)Works with any storage engine
只要replication支持的存储引擎,MHA都支持,不会局限于innodb
3、部署MHA
环境准备
主机名
ip
角色
应用
db01
10.0.0.51 / 172.16.1.51
暂时的主库、MHA客户端
mysql 5.6、MHA node
db02
10.0.0.52 / 172.16.1.52
暂时的从库、MHA客户端
mysql 5.6、MHA node
db03
10.0.0.53 / 172.16.1.53
暂时的从库、MHA客户端
mysql 5.6、MHA node
db04
10.0.0.54 / 172.16.1.54
暂时的从库、MHA客户端
mysql 5.6、MHA node
lb01
10.0.0.5 / 172.16.1.5
MHA的管理端
MHA manager、mariadb
1、MHA先决条件,需要先部署主从
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 1、主库开启binlog,主库不开 2、主库和从库server_id不同,从库之间可以相同 3、主库要创建主从复制用户,从库可以不创建 1、主库、从库也要开启binlog 2、主库、从库server_id不同,从库之间也不能相同 3、主库、从库必须创建主从复制用户 4、关闭只读 1、主库配置文件 [root@db01 ~]# vim /etc/my.cnf [root@db01 ~]# vim /etc/my.cnf [mysqld] basedir=/app/mysql-5.6.50 datadir=/app/mysql-5.6.50/data log_bin=mysql-bin server_id=1 lower_case_table=1 innodb_data_file_path=ibdata1:76M;ibdata2:50M:autoextend binlog_format=row slow_query_log=1 slow_query_log_file=/app/mysql-5.6.50/data/db01-slow.log long_query_time=0.05 log_queries_not_using_indexes min_examined_row_limit=100 skip_name_resolve relay_log_purge = 0 [client] [mysql] prompt="\\u@\\h:\\d > " [root@db01 ~]# /etc/init.d/mysqld restart 创建用户 grant replication slave on *.* to rep@'172.16.1.5%' identified by '123' ; 2、从库配置文件 [root@db02 data]# vim /etc/my.cnf [mysqld] basedir=/app/mysql-5.6.50 datadir=/app/mysql-5.6.50/data log_bin=mysql-bin server_id=2 lower_case_table=1 binlog_format=row socket=/tmp/mysql.sock skip-name-resolve relay_log_purge = 0 [client] socket=/tmp/mysql.sock [mysql] prompt="\\u@\\h:\\d > " [root@db02 data]# /etc/init.d/mysqld restart 创建用户 grant replication slave on *.* to rep@'172.16.1.5%' identified by '123' ; [root@db03 data]# vim /etc/my.cnf [mysqld] basedir=/app/mysql-5.6.50 datadir=/app/mysql-5.6.50/data log_bin=mysql-bin server_id=3 lower_case_table=1 binlog_format=row socket=/tmp/mysql.sock skip-name-resolve relay_log_purge = 0 [client] socket=/tmp/mysql.sock [mysql] prompt="\\u@\\h:\\d > " [root@db03 data]# /etc/init.d/mysqld restart 创建用户 grant replication slave on *.* to rep@'172.16.1.5%' identified by '123' ; [root@db04 data]# vim /etc/my.cnf [mysqld] basedir=/app/mysql-5.6.50 datadir=/app/mysql-5.6.50/data log_bin=mysql-bin server_id=4 lower_case_table=1 binlog_format=row socket=/tmp/mysql.sock skip-name-resolve relay_log_purge = 0 [client] socket=/tmp/mysql.sock [mysql] prompt="\\u@\\h:\\d > " [root@db04 data]# /etc/init.d/mysqld restart 创建用户 grant replication slave on *.* to rep@'172.16.1.5%' identified by '123' ; 3、#3台从库设置只读,(在命令行临时设置),不能在配置文件设置,(mha切换的时候会重启mysql,让只读重启失效,临时设置重启会失效) root@localhost:(none) > set global read_only=1; 4、#4台数据库关闭relay-log自动删除功能(临时生效) root@localhost:(none) > set global relay_log_purge = 0; [root@db01 ~]# vim /etc/my.cnf [mysqld] relay_log_purge=0 [root@db02 ~]# vim /etc/my.cnf [mysqld] relay_log_purge=0 [root@db03 ~]# vim /etc/my.cnf [mysqld] relay_log_purge=0 [root@db04 ~]# vim /etc/my.cnf [mysqld] relay_log_purge=0
2、安装node、manager、ssh免密
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 wget http://test.driverzeng.com/MySQL_plugins/mha4mysql-manager-0.56-0.el6.noarch.rpm wget http://test.driverzeng.com/MySQL_plugins/mha4mysql-node-0.56-0.el6.noarch.rpm 1、推送node包在各台机器 [root@db01 ~]# for i in 5 52 53 54;do scp mha4mysql-node-0.56-0.el7.centos.noarch.rpm 172.16.1.$i :/root;done 2、5台机器安装node [root@db03 ~]# yum -y localinstall mha4mysql-node-0.56-0.el6.noarch.rpm 3、(lb01) mha管理机安装manager [root@lb01 ~]# yum -y localinstall mha4mysql-manager-0.56-0.el6.noarch.rpm [root@lb01 ~]# rpm -qa |grep mha mha4mysql-manager-0.56-0.el6.noarch mha4mysql-node-0.56-0.el6.noarch 4、4台机器做软连接 [root@db01 ~]# ln -s /app/mysql-5.6.50/bin/mysqlbinlog /usr/bin/mysqlbin;ln -s /app/mysql-5.6.50/bin/mysql /usr/bin/mysql 5、5台机器做免密,自己也要给自己做免密 ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa >/dev/null 2>&1 [root@db01 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@172.16.1.5 [root@db01 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@172.16.1.51 [root@db01 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@172.16.1.52 [root@db01 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@172.16.1.53 [root@db01 ~]# ssh-copy-id -i ~/.ssh/id_dsa.pub root@172.16.1.54 [root@db01 ~]# for i in 5 51 52 53 54;do ssh root@172.16.1.$i "ifconfig|awk 'NR==2{print $2 }'" ;done 输出信息就免密成功 inet 10.0.0.5 netmask 255.255.255.0 broadcast 10.0.0.255 inet 10.0.0.51 netmask 255.255.255.0 broadcast 10.0.0.255 inet 10.0.0.52 netmask 255.255.255.0 broadcast 10.0.0.255 inet 10.0.0.53 netmask 255.255.255.0 broadcast 10.0.0.255 inet 10.0.0.54 netmask 255.255.255.0 broadcast 10.0.0.255
3、配置MHA
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 1、创建MHA配置文件存放目录 [root@lb01 ~]# mkdir /etc/mha 2、编写配置文件 [root@lb01 ~]# vim /etc/mha/app1.cnf [server default] manager_log=/etc/mha/logs/manager.log manager_workdir=/etc/mha/app1 master_binlog_dir=/app/mysql-5.6.50/data user=mha password=mha ping_interval=2 repl_password=123 repl_user=rep ssh_user=root ssh_port=22 [server1] hostname=172.16.1.51 port=3306 [server2] hostname=172.16.1.52 port=3306 [server3] hostname=172.16.1.53 port=3306 [server4] hostname=172.16.1.54 port=3306 3、db01主库创建mha的管理用户,其他从库做了主从复制,会把这个用户复制过去 grant all on *.* to mha@'172.16.1.%' identified by '123' ; 4、创建工作目录、日志目录 [root@lb01 ~]# mkdir /etc/mha/{logs,app1} [root@lb01 ~]# ll /etc/mha/ drwxr-xr-x 2 root root 6 Aug 27 20:20 app1 -rw-r--r-- 1 root root 449 Aug 27 20:17 app1.cnf drwxr-xr-x 2 root root 6 Aug 27 20:20 logs 4、检测MHA的SSH免密 [root@lb01 ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf 5、每次启动前先检测mha的主从复制 [root@lb01 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
检测免密结果显示成功
检测mha的主从复制OK
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 7、启动MHA [root@lb01 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /etc/mha/logs/manager.log 2>&1 & masterha_manager: 启动命令 --conf=/etc/mha/app1.cnf: 指定配置文件 --remove_dead_master_conf:做完切换后,从配置文件中摘除宕机的主库 --ignore_last_failover: 忽略上一次切换(切换完了,不生产锁文件) 1).MHA在做一次切换后,会生成一个锁文件(app1.failover.complete)在工作目录,8个小时之内,无法做第二次切换 生成锁文件的目的(下一次切换的时候会看一下这个锁文件是否存在,锁文件存在不切换,这个锁文件会存在8小时) 2).MHA切换完成后,会自动结束MHA的进程 [root@lb01 ~]# ll /etc/mha/app1 -rw-r--r-- 1 root root 0 Aug 28 11:25 app1.failover.complete 1).在所有从库相同时,MHA会选择配置文件中server标签id 最小的作为主库 8、启动后检查 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf 显示结果就OK: app1 (pid:18582) is running(0:PING_OK), master:172.16.1.51
4、使用systemd管理MHA
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 1、停止mha [root@lb01 ~]# masterha_stop --conf=/etc/mha/app1.cnf 2、使用systemctl管理 [root@lb01 ~]# vim /usr/lib/systemd/system/mha.service [Unit] Description=MHA After=network.target sshd-keygen.service Wants=sshd-keygen.service [Service] Type=sample ExecStart=/usr/bin/masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover > /etc/mha/logs/manager.log ExecStop=/usr/bin/masterha_stop --conf=/etc/mha/app1.cnf [Install] WantedBy=multi-user.target [root@lb01 ~]# systemctl daemon-reload [root@lb01 ~]# systemctl start mha [root@lb01 ~]# systemctl enable mha 3、启动后检查 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:19243) is running(0:PING_OK), master:172.16.1.51
4、MHA日志分析
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 [root@lb01 ~]# tailf -100 /etc/mha/logs/manager.log GTID failover mode = 0 Starting Non-GTID based failover. master_ip_failover_script is not set . Skipping invalidating dead master IP address. Candidate masters from the configuration file: Non-candidate masters: candidate_master=1 // 设立太子,但是如果太子落后其他机器数据超过100M,就废储 check_repl_delay=0 // 关闭对太子落后的检测 All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.1.54' , MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000003' , MASTER_LOG_POS=878597, MASTER_USER='rep' , MASTER_PASSWORD='xxx' ;
MHA日志文件里面的相关工具命令
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 save_binary_logs --command =save --start_file=mysql-bin.000001 --start_pos=878597 --binlog_dir=/app/mysql/data --output_file=/var/tmp/saved_master_binlog_from_172.16.1.52_3306_20240827154042.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 pply_diff_relay_logs --command =apply --slave_user='mha' --slave_host=172.16.1.54 --slave_ip=172.16.1.54 --slave_port=3306 -- apply_files=/var/tmp/saved_master_binlog_from_172.16.1.52_3306_20240827154042.binlog --workdir=/var/tmp --target_version=5.7.42-log --timestamp=20240827154042 --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58 --slave_pass=xxx
MHA日志文件里面的binlog的路径
1 2 3 4 5 6 7 8 9 10 11 12 13 [root@db02 ~]# ll /var/tmp/saved_master_binlog_from_172.16.1.52_3306_20240827154042.binlog -rw-r--r-- 1 root root 177 Aug 27 15:40 /var/tmp/saved_master_binlog_from_172.16.1.52_3306_20240827154042.binlog [root@mha-manager ~]# ll /etc/mha/app1/saved_master_binlog_from_172.16.1.52_3306_20240827154042.binlog -rw-r--r-- 1 root root 177 Aug 27 15:40 /etc/mha/app1/saved_master_binlog_from_172.16.1.52_3306_20240827154042.binlog [root@db04 ~]# ll /var/tmp/saved_master_binlog_from_172.16.1.52_3306_20240827154042.binlog -rw-r--r-- 1 root root 177 Aug 27 15:40 /var/tmp/saved_master_binlog_from_172.16.1.52_3306_20240827154042.binlog
5、MHA集群的恢复
1、宕掉db01主库
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 1、停止mysql [root@db01 ~]# /etc/init.d/mysqld stop 2、db02 root@localhost:(none) > show slave status\G Empty set (0.00 sec) 3、db03 root@localhost:(none) > show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.16.1.52 Master_User: rep Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000008 Read_Master_Log_Pos: 120 Relay_Log_File: db03-relay-bin.000002 Relay_Log_Pos: 283 Relay_Master_Log_File: mysql-bin.000008 Slave_IO_Running: Yes Slave_SQL_Running: Yes 4、db04 root@localhost:(none) > show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.16.1.52 Master_User: rep Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000008 Read_Master_Log_Pos: 120 Relay_Log_File: db04-relay-bin.000002 Relay_Log_Pos: 283 Relay_Master_Log_File: mysql-bin.000008 Slave_IO_Running: Yes Slave_SQL_Running: Yes 5、查看lb01MHA管理机器的配置文件,已摘除db01的server标签
2、修复宕机的主库,并加入集群
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 1、#修复宕机的主库(db01) [root@db01 ~]# /etc/init.d/mysqld start 2、#找到最后一个change master语句,到宕机的主库执行 [root@lb01 ~]# grep -i 'change master to' /etc/mha/logs/manager.log|tail -1 CHANGE MASTER TO MASTER_HOST='172.16.1.52' , MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000008' , MASTER_LOG_POS=120, MASTER_USER='rep' , MASTER_PASSWORD='123' ; 3、#db01执行change msater 语句,启动主从复制 root@localhost:(none) > CHANGE MASTER TO MASTER_HOST='172.16.1.52' , MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000008' , MASTER_LOG_POS=120, MASTER_USER='rep' , MASTER_PASSWORD='123' ; root@localhost:(none) > start slave; root@localhost:(none) > show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.16.1.52 Master_User: rep Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000008 Read_Master_Log_Pos: 120 Relay_Log_File: db01-relay-bin.000002 Relay_Log_Pos: 283 Relay_Master_Log_File: mysql-bin.000008 Slave_IO_Running: Yes Slave_SQL_Running: Yes 4、#MHA修改配置文件,将db01的信息加入mha配置文件 [root@lb01 ~]# vim /etc/mha/app1.cnf ..... [server1] hostname=172.16.1.51 port=3306 ...... 5、#启动mha [root@lb01 ~]# systemctl start mha 6、检查谁是主库,会报错,需要安装mysql命令 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf 7、安装mysql命令 安装mysql源 [root@lb01 ~]# rpm -ivh https://dev.mysql.com/get/mysql84-community-release-el7-1.noarch.rpm [root@lb01 ~]# vim /etc/yum.repos.d/mysql-community.repo [mysql56-community] name=MySQL 5.6 Community Server baseurl=http://repo.mysql.com/yum/mysql-5.6-community/el/7/$basearch enabled=1 gpgcheck=0 [root@lb01 ~]# yum -y install mysql 8、再次执行 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:27835) is running(0:PING_OK), master:172.16.1.51
6、写脚本恢复集群,放到mha机器
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:27835) is running(0:PING_OK), master:172.16.1.51 2、写脚本恢复,放到mha机器 [root@lb01 ~]# vim recover.sh . /etc/init.d/functions log_file='/etc/mha/logs/manager.log' mha_conf='/etc/mha/app1.cnf' repl_pass='123' mha_user=$(awk -F= '/^user/{print $2}' ${mha_conf} ) mha_password=$(awk -F= '/^password/{print $2}' ${mha_conf} ) dowm_master=$(sed -nr 's#^Master (.*)\(.*\).*!$#\1#gp' ${log_file} |tail -1) change_master=$(grep -i 'change master to' ${log_file} |tail -1|awk -F: '{print $4}' |sed "s#xxx#${repl_pass} #g" ) echo "正在修复主库....." ssh ${dowm_master} '/etc/init.d/mysqld start' while true ;do mysqladmin -u${mha_user} -p${mha_password} -h${dowm_master} ping &>/dev/null if [ $? -eq 0 ];then mysql -u${mha_user} -p${mha_password} -h${dowm_master} -e "${change_master} ;start slave" echo "主库修复完成....." break fi done echo "正在补全MHA配置文件..." cat > ${mha_conf} << EOF [server default] manager_log=/etc/mha/logs/manager.log manager_workdir=/etc/mha/app1 master_binlog_dir=/app/mysql-5.6.50/data master_ip_failover_script=/etc/mha/app1/master_ip_failover password=${mha_password} ping_interval=2 repl_password=123 repl_user=rep ssh_port=22 ssh_user=root user=${mha_user} [server1] hostname=172.16.1.51 port=3306 [server2] hostname=172.16.1.52 port=3306 [server3] hostname=172.16.1.53 port=3306 [server4] hostname=172.16.1.54 port=3306 #[binlog1] #(这个配置了,就永远不会提升为主库) #no_master=1 #hostname=172.16.1.5 #master_binlog_dir=/data/mysql/binlog EOF echo "正在启动MHA....." systemctl start mha while true ;do masterha_check_status --conf=${mha_conf} &>/dev/null if [ $? -eq 0 ];then action "MHA manager启动" /bin/true break fi done 3、#执行脚本 sh -x recover.sh 1、vip的漂移 2、配合zabbix的自愈模式,要等切换完才可以执行脚本 上一次宕机的主库,和这一次的新主库,都要记录一下,再判断这个日志里面,这一次宕机的主库,是不是上一次宕机的主库 要在脚本前面判断上一次切换是否完成,就是看报告是否出来
测试:根据数据最新的提升为主库
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 主库:db01 从库:db02、db03、db04 1、#当不知道谁是主库时,执行这条命令 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:27835) is running(0:PING_OK), master:172.16.1.51 2、#db01执行脚本 [root@db01 ~]# sh insert.sh 3、#想让db03数据最新,得让其他2个库的relaylog落后 (db02) root@localhost:(none) > stop slave io_thread; (db04) root@localhost:(none) > stop slave io_thread; 检查数据是否落后 select * from prod.prod4、#停止主库db01,停止脚本 [root@db01 ~]# /etc/init.d/mysqld stop [root@db01 ~]# sh insert.sh ^C 5、#db02、db04检查主从状态,已经变成主库 root@localhost:(none) > show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.16.1.53 6、#执行脚本,将宕掉的db01加入集群,查看集群状态,集群的主库变成db03 [root@lb01 ~]# sh recover.sh [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:31397) is running(0:PING_OK), master:172.16.1.53 脚本里面的:start slave 会把sql、io线程起来
测试:设置储君参数(将某台数据库的数据一直落后),选主时,该数据库被选为主库
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 candidate_master=1 check_repl_delay=0 1、修改mha配置文件 [root@lb01 ~]# vim /etc/mha/app1.cnf ..... [server2] candidate_master=1 check_repl_delay=0 hostname=172.16.1.52 port=3306 ..... 2、重启mha [root@lb01 ~]# systemctl restart mha [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:31957) is running(0:PING_OK), master:172.16.1.53 3、主库db03执行写入数据脚本 [root@db03 ~]# sh insert.sh 4、设置db02数据落后 root@localhost:(none) > stop slave io_thread; 此时,db02的数据肯定落后 5、停止db03数据库,停止脚本 [root@db03 ~]# /etc/init.d/mysqld stop [root@db03 ~]# sh insert.sh ^C 6、查看db01、db04主从状态,主库强制变成了db02 root@localhost:(none) > show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.16.1.52
7、 MHA的vip漂移
VIP漂移的两种方式
1)通过keepalived的方式,管理虚拟IP的漂移
2)通过MHA自带脚本方式,管理虚拟IP的漂移
1 2 3 4 5 6 7 8 9 10 11 12 13 1、keepalived如何可以找到最新的从库 2、要写4个脚本,每台机器放一个,切换的时候容易出现脑裂 3、需要等待MHA切换完成再漂移VIP 4、多个脚本同时执行,需要去检测不同机器上slave的复制情况 5、网上的文档就2台之间漂移,不会做4台,如果2台切换,如何保证切到的数据最新呢(做半同步可以解决) 6、半同步阻塞数据写入,影响主库性能(这个机器不提供服务就可以解决) 7、互相切换的2台主库升配置,(51 51升配,53 54不升配置) 6、7 这2点就是在砸钱 mha可以找到最新的从库替换为主库
配置MHA + master_ip_failover (MHA自带脚本方式的VIP漂移)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 1、更改MHA的配置文件 [root@lb01 ~]# vim /etc/mha/app1/master_ip_failover master_ip_failover_script=/etc/mha/app1/master_ip_failover 2、下载写好的将脚本放到工作目录 [root@lb01 ~]# wget http://test.driverzeng.com/MySQL_File/master_ip_failover my $vip = '172.16.1.55/24' ; my $key = '1' ; my $ssh_start_vip = "/sbin/ifconfig eth1:$key $vip " ; my $ssh_stop_vip = "/sbin/ifconfig eth1:$key down" ; [root@lb01 ~]# mv master_ip_failover /etc/mha/app1 3、检测MHA状态 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:32555) is running(0:PING_OK), master:172.16.1.52 3、手动在主库上绑定VIP [root@db02 ~]# ifconfig eth1:1 172.16.1.55/24 4、给脚本执行权限 [root@lb01 ~]# chmod +x /etc/mha/app1/master_ip_failover 5、重启mha [root@lb01 ~]# systemctl stop mha [root@lb01 ~]# systemctl start mha 1.权限问题 2、语法问题 3、格式问题 dos2nuix /etc/mha/app1/master_ip_failover [root@lb01 ~]# dos2unix /etc/mha/app1/master_ip_failover dos2unix: converting file /etc/mha/app1/master_ip_failover to Unix format ... 5、重启mha [root@lb01 ~]# systemctl stop mha [root@lb01 ~]# systemctl start mha 6、把添加的配置写到脚本里面,以后脚本有什么改动,就要写在脚本里 MHA能够找到最新数据的从库
VIP漂移切换测试
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 既然有了VIP,那么以后程序就往VIP里面写数据,现在要更改写数据的脚本 1、检测目前谁是主库 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:51306) is running(0:PING_OK), master:172.16.1.51 2、先实时追踪日志 [root@lb01 ~]# tailf /etc/mha/logs/manager.log 3、每个机器上都修改写入数据的脚本 [root@db01 ~]# vim insert.sh master_ip=172.16.1.55 4、执行脚本 [root@db02 ~]# sh insert.sh 5、停止当前的主库 [root@db01 ~]# /etc/init.d/mysqld stop 6、可看到写入数据的程序开始报错 ERROR 2003 (HY000): Can't connect to MySQL server on ' 172.16.1.55' (111) Warning: Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can' t connect to MySQL server on '172.16.1.55' (111)Warning: Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can't connect to MySQL server on ' 172.16.1.55' (111) Warning: Using a password on the command line interface can be insecure. ERROR 2003 (HY000): Can' t connect to MySQL server on '172.16.1.55' (111)7、看到日志开始有数据输出,开始将VIP漂移到新主库 8、查看db01的写入数据脚本开始恢复数据写入,说明数据写入到新主库db02 [root@lb01 ~]# systemctl start mha [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:55464) is running(0:PING_OK), master:172.16.1.52 [root@db02 ~]# ifconfig ...... eth1:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 172.16.1.55 netmask 255.255.255.0 broadcast
8、MYSQL binlog实时同步
1 2 3 4 5 6 7 1、同步到rsync服务端的哪个目录 2、MHA凭什么去rsync服务端的backup目录下找 (在mha的配置文件里面配置) 3、主库变成从库了,怎么办(每台从库都安装sersync,但不能全部起) 这个方案可以实施,但是非常麻烦,就怕mysql很吃内存,就怕给rsync杀了,但是rsync很轻量级
使用mysqlbinlog同步binlog
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 1、MHA 配置binlog server,需要修改配置文件,同步是往MHA管理机同步 [root@lb01 ~]# vim /etc/mha/app1.cnf .... [binlog1] hostname=172.16.1.5 master_binlog_dir=/data/mysql/binlog 2、在脚本里面更新上这个配置 2、创建binlog目录 [root@lb01 ~]# mkdir -p /data/mysql/binlog 3、查看主库是谁 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:36583) is running(0:PING_OK), master:172.16.1.51 4、一定要先进入目录再,实时同步binlog [root@lb01 ~]# cd /data/mysql/binlog/ [root@lb01 binlog]# mysqlbinlog -R --host=172.16.1.51 --user=mha --password=123 --raw --stop-never mysql-bin.000001 & 输入回车 [1] 60923 5、查看172.16.1.51的binlog是否同步过来 [root@lb01 binlog]# pwd /data/mysql/binlog [root@lb01 binlog]# ll total 12016 -rw-rw---- 1 root root 440 Aug 29 18:20 mysql-bin.000001 -rw-rw---- 1 root root 143 Aug 29 18:20 mysql-bin.000002 6、验证是否实时同步 172.16.1.51 db01执行脚本 [root@db01 ~]# sh insert.sh [root@lb01 binlog]# ll /data/mysql/binlog/ total 12016 -rw-rw---- 1 root root 440 Aug 29 18:20 mysql-bin.000001 -rw-rw---- 1 root root 143 Aug 29 18:20 mysql-bin.000002 [root@db01 ~]# mysqladmin -uroot -p123 flush-log 7、当db02变成从库时,就不能用上条命令去拉取binlog,所以host要改成vip,就比较方便 [root@lb01 binlog]# kill %1 或者killall mysqlbinlog [root@lb01 ~]# cd /data/mysql/binlog/ [root@lb01 binlog]# mysqlbinlog -R --host=172.16.1.55 --user=mha --password=123 --raw --stop-never mysql-bin.000001 & 输入回车 [1] 60923
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 [root@lb01 ~]# vim /root/failover_status new_master=$(sed -nr 's#^app1.*failover (.*)\(.*\) to (.*)\(.*\) succeeded$#\2#gp' /etc/mha/logs/manager.log|tail -1) old_master=$(sed -nr 's#^app1.*failover (.*)\(.*\) to (.*)\(.*\) succeeded$#\1#gp' /etc/mha/logs/manager.log|tail -1) [root@lb01 ~]# vim vip-recover.sh . /etc/init.d/functions . /root/failover_status log_file='/etc/mha/logs/manager.log' mha_conf='/etc/mha/app1.cnf' repl_pass='123' mha_user=$(awk -F= '/^user/{print $2}' ${mha_conf} ) mha_password=$(awk -F= '/^password/{print $2}' ${mha_conf} ) dowm_master=$(sed -nr 's#^Master (.*)\(.*\).*!$#\1#gp' ${log_file} |tail -1) while true ;do if [ ${dowm_master} == ${old_master} ];then change_master=$(grep -i 'change master to' ${log_file} |tail -1|awk -F: '{print $4}' |sed "s#xxx#${repl_pass} #g" ) echo "正在修复主库....." ssh ${dowm_master} '/etc/init.d/mysqld start' while true ;do mysqladmin -u${mha_user} -p${mha_password} -h${dowm_master} ping &>/dev/null if [ $? -eq 0 ];then mysql -u${mha_user} -p${mha_password} -h${dowm_master} -e "${change_master} ;start slave" echo "主库修复完成....." break fi done echo "正在补全MHA配置文件..." cat > ${mha_conf} << EOF [server default] manager_log=/etc/mha/logs/manager.log manager_workdir=/etc/mha/app1 master_binlog_dir=/app/mysql-5.6.50/data master_ip_failover_script=/etc/mha/app1/master_ip_failover password=${mha_password} ping_interval=2 repl_password=123 repl_user=rep ssh_port=22 ssh_user=root user=${mha_user} [server1] hostname=172.16.1.51 port=3306 [server2] hostname=172.16.1.52 port=3306 [server3] hostname=172.16.1.53 port=3306 [server4] hostname=172.16.1.54 port=3306 [binlog1] #(这个配置了,就永远不会提升为主库) #no_master=1 hostname=172.16.1.5 master_binlog_dir=/data/mysql/binlog EOF echo "正在启动MHA....." systemctl start mha while true ;do masterha_check_status --conf=${mha_conf} &>/dev/null if [ $? -eq 0 ];then action "MHA manager启动" /bin/true exit 0 fi done else continue fi done
9、基于GTID的主从复制
1 2 3 4 5 [root@lb01 ~]# systemctl stop mha 因为停主库的时候防止MHA切换 因为MHA啥样的主从复制都可以结合,不需要对当前的主从复制做任何改动,现在搭建GTID的主从复制,看看MHA是否结合兼容所有的主从复制
什么是GTID
1 2 3 4 GTID:事务提交做的主从复制(全局事务标识符) 标识符是由UUID + TID组成 UUID:主库的身份证号 TID:事务提交号
GTID特性
(1).支持多线程复制:事实上是针对每个database开启相应的独立线程,即每个库有一个单独的(sqlthread).
(2).支持启用GTID,在配置主从复制,传统的方式里,你需要找到binlog和POS点,然后change master to指向.
在mysql5.6里,无须再知道binlog和POS点,只需要知道master的IP/端口/账号密码即可,因为同步复制是自动的,MySQL通过内部机制GTID自动找点同步.
(3).基于Row复制只保存改变的列,大大节省Disk Space/Network resources和Memory usage.
(4).支持把Master 和Slave的相关信息记录在Table中原来是记录在文件里,记录在表里,增强可用性
(5).支持延迟复制
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 1、连接进去关闭之前的主从复制 stop slave;reset slave all;show slave status\G vim /etc/my.cnf [mysqld] server_id=1 log-bin=mysql-bin vim /etc/my.cnf [mysqld] server_id=2 log-bin=mysql-bin vim /etc/my.cnf [mysqld] server_id=3 log-bin=mysql-bin vim /etc/my.cnf [mysqld] server_id=4 log-bin=mysql-bin grant replication slave on *.* to rep@'172.16.1.5%' identified by '123' ; mysql> show master status; +------------------+----------+ | File | Position | +------------------+----------+ | mysql-bin.000012 | 154 | +------------------+----------+ change master to master_host='172.16.1.51' , master_user='rep' , master_password='123' , master_log_file='mysql-bin.000012' , master_log_pos=154; start slave;
基于GTID的主从复制
先决条件
1)主库和从库都要开启binlog
2)主库和从库server-id不同
3)要有主从复制用户
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 1、连接进去关闭之前的主从复制 stop slave;reset slave all;show slave status\G 2、修改主库配置文件 vim /etc/my.cnf [mysqld] server_id=1 log-bin=mysql-bin 3、修改从库配置文件 vim /etc/my.cnf [mysqld] server_id=2 log-bin=mysql-bin vim /etc/my.cnf [mysqld] server_id=3 log-bin=mysql-bin vim /etc/my.cnf [mysqld] server_id=4 log-bin=mysql-bin 4、在主库上创建主从复制用户 grant replication slave on *.* to rep@'172.16.1.%' identified by '123' ; 基于GTID的主从复制不用去记位置点,他会根据事务去找位置点 5、从库执行change master语句 change master to master_host='172.16.1.51' , master_user='rep' , master_password='123' , master_auto_position=1; mysql> change master to master_host='172.16.1.51' , master_user='rep' , master_password='123' , master_auto_position=1; ERROR 1777 (HY000): CHANGE MASTER TO MASTER_AUTO_POSITION = 1 cannot be executed because @@GLOBAL.GTID_MODE = OFF. (#GLOBAL.GTID_MODE = OFF GTID的模式没有开) root@localhost:(none) > show variables like '%gtid%' ; +---------------------------------+-----------+ | Variable_name | Value | +---------------------------------+-----------+ | binlog_gtid_simple_recovery | ON | | enforce_gtid_consistency | OFF | | gtid_executed | 1000 | | gtid_mode | OFF |#GTID的模式没有开 | gtid_next | AUTOMATIC | | gtid_owned | | | gtid_purged | | | simplified_binlog_gtid_recovery | OFF | +---------------------------------+-----------+ 6、4台MYSQL机器 [mysqld]标签 配置文件加入这几个配置 [root@db01 ~]# vim /etc/my.cnf [root@db02 ~]# vim /etc/my.cnf [root@db03 ~]# vim /etc/my.cnf [root@db04 ~]# vim /etc/my.cnf ..... gtid_mode=ON enforce_gtid_consistency log-slave-updates log_bin=mysql-bin [root@db01 ~]# vim /etc/my.cnf gtid_mode=ON enforce_gtid_consistency
[root@db02 ~]# tailf -100 /app/mysql-5.6.50/data/db02.err
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 7、重启 [root@db02 ~]# /etc/init.d/mysqld restart 8、连接进去从库,从库执行change master语句 change master to master_host='172.16.1.51' , master_user='rep' , master_password='123' , master_auto_position=1; 9、开启主从复制,并查看状态 start slave;show slave status\G Auto_Position: 1 10、MHA切换主从的时候脚本里面会执行change master语句,那个脚本里面的也是执行master_auto_position=1; 11、主库创建一个库 root@localhost:(none) > create database gtid; root@localhost:(none) > show master status; +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------+ | mysql-bin.000048 | 221998 | | | a45796f8-61ca-11ef-808a-000c29688028:1-1058, df4b2d8a-54d3-11ef-ac00-000c29d088a4:1-564 | +------------------+----------+--------------+------------------+-----------------------------------------------------------------------------------------+ 12、从库查看状态 *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.16.1.51 Master_User: rep Master_Port: 3306 ....... Retrieved_Gtid_Set: df4b2d8a-54d3-11ef-ac00-000c29d088a4:563-564 Executed_Gtid_Set: a45796f8-61ca-11ef-808a-000c29688028:1-1058
给基于GTID的主从复制绑定VIP,实现VIP漂移
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 1、查看之前绑定的VIP在哪,发现在db02上面,需要解绑vip [root@db02 ~]# ifconfig eth1:1 down 2、我想让db01作为主库,需要在主库上面绑定VIP [root@db01 ~]# ifconfig eth1:1 172.16.1.55/24 3、先检查拉取binlog的命令是否存在,如果没有启动,就启动 [root@lb01 binlog]# jobs [2]+ Running mysqlbinlog -R --host=172.16.1.55 --user=mha --password=123 --raw --stop-never mysql-bin.000001 & 4、MHA管理机检测repl [root@lb01 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf OK 5、启动mha,检测当前的主库 [root@lb01 ~]# systemctl start mha [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:76089) is running(0:PING_OK), master:172.16.1.51 6、测试,db01执行脚本 [root@db01 ~]# sh insert.sh 7、实时追踪mha日志文件 [root@lb01 ~]# tailf /etc/mha/logs/manager.log 8、等待一会,停止db01数据库 [root@db01 ~]# /etc/init.d/mysqld stop 9、写入数据的脚本开始执行错误,等待一会,脚本恢复正常写入,说明vip漂移到db02机器
报错解决
报错1、down掉主库,vip切换了,想要将宕掉的主库加入集群,检测repl是否健康,出现报错
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 1、先检查mysqlbinlog是否运行,如果没有运行就先运行 [root@db02 ~]# jobs [1]+ Running mysqlbinlog -R --host=172.16.1.55 --user=mha --password=123 --raw --stop-never mysql-bin.000001 & 2、检查vip是否存在在某个数据库上面,如果在,那么有vip的将作为主库,如果不存在,则给主库绑定vip 3、检查主从复制是否正常,如果每个机器的主从复制都不正常,从库需要重新执行change master 语句 stop slave; change master to master_host='172.16.1.51' , master_user='rep' , master_password='123' , master_auto_position=1; stop slave; start slave;show slave status\G 4、查看主从复制恢复正常之后,检查健康状态,如果检查状态没问题,就启动mha
报错2:基于传统复制的基础上做了GTID主从复制之后,当主从出现数据不一致的现象,需要对主库进行全备,但是全备的binlog里面包含传统的主从复制binlog,导入数据时报错,SQL线程报错
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 1、从库关闭主从复制,关闭自动查找位置点 root@localhost:(none) > STOP SLAVE; root@localhost:(none) > change master to master_auto_position=0; 2、主库打点全备,在恢复时关闭GTID参数 [root@db02 ~]# mysqldump -uroot -p123 -A -R --triggers --master-data=1 --single-transaction --set-gtid-purged=off|gzip > /tmp/full.sql.gz 3、主库将全备的数据发给其他从库 [root@db02 ~]# for i in 51 53 54 ;do scp /tmp/full.sql.gz 172.16.1.$i :/tmp;done 4、从库导入全备数据 [root@db01 ~]# zcat /tmp/full.sql.gz |mysql -uroot -p123 [root@db03 ~]# zcat /tmp/full.sql.gz |mysql -uroot -p123 [root@db04 ~]# zcat /tmp/full.sql.gz |mysql -uroot -p123 5、查看打点的位置点 [root@db01 ~]# zcat /tmp/full.sql.gz |head -25 CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001' , MASTER_LOG_POS=151; 6、所有从库执行change master语句,重启主从复制 change master to master_user='rep' , master_host='172.16.1.52' , master_password='123' , master_log_file='mysql-bin.000001' , master_log_pos=151, master_port=3306; start slave;show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.16.1.52 Master_User: rep Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 151 Relay_Log_File: db01-relay-bin.000002 Relay_Log_Pos: 314 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes change master to master_host='172.16.1.52' , master_user='rep' , master_password='123' , master_auto_position=1; stop slave; start slave;show slave status\G
10、Mysql的读写分离
1 2 3 4 读写分离软件 1.mysql-proxy 2.Atlas 360写的 3.MyCat (重量级,比较占资源)
Atlas简介
Atlas是由 Qihoo 360公司Web平台部基础架构团队开发维护的一个基于MySQL协议的数据中间层项目。它在MySQL官方推出的MySQL-Proxy 0.8.2版本的基础上,修改了大量bug,添加了很多功能特性。它在MySQL官方推出的MySQL-Proxy 0.8.2版本的基础上,修改了大量bug,添加了很多功能特性。
Atlas主要功能
1.读写分离
2.从库负载均衡
3.IP过滤
4.自动分表(公司数据量比较大,表是会被逻辑拆分的,水平拆分还在垂直拆分)
5.DBA可平滑上下线DB
6.自动摘除宕机的DB
Atlas相对于官方MySQL-Proxy的优势
1.将主流程中所有Lua代码用C重写,Lua仅用于管理接口
2.重写网络模型、线程模型
3.实现了真正意义上的连接池
4.优化了锁机制,性能提高数十倍
1 2 3 4 5 6 7 8 阿里云不能绑定VIP,但是有一个弹性公网IP vim 1.per system ('echo 123' ); system ('ifconfig' ); chmode +x 1.prel
11、安装配置Atlas(独立的机器)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 1、查看当前主从信息 [root@lb01 binlog]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:125895) is running(0:PING_OK), master:172.16.1.51 2、下载atlas [root@lb01 ~]# wget http://test.driverzeng.com/MySQL_plugins/Atlas-2.2.1.el6.x86_64.rpm 3、安装atlas [root@lb01 ~]# rpm -ivh Atlas-2.2.1.el6.x86_64.rpm 4、查看安装完的程序相关文件 [root@lb01 ~]# ll /usr/local/mysql-proxy/ drwxr-xr-x 2 root root 75 Aug 28 15:57 bin drwxr-xr-x 2 root root 22 Aug 28 15:57 conf drwxr-xr-x 3 root root 331 Aug 28 15:57 lib drwxr-xr-x 2 root root 6 Dec 17 2014 log 5、加密密码 [root@lb01 ~]# cd /usr/local/mysql-proxy/bin [root@lb01 bin]# ./encrypt 123 3yb5jEku5h4= 6、修改配置文件 [root@lb01 ~]# vim /usr/local/mysql-proxy/conf/test.cnf [mysql-proxy] admin-username = user admin-password = pwd proxy-backend-addresses = 172.16.1.55:3306 proxy-read-only-backend-addresses = 172.16.1.52:3306,172.16.1.53:3306,172.16.1.54:3306 pwds = mha:3yb5jEku5h4=,root:3yb5jEku5h4= daemon = true keepalive = true event-threads = 8 log-level = error log-path = /usr/local/mysql-proxy/log sql-log = ON sql-log-slow = 10 instance = test proxy-address = 0.0.0.0:3306 admin-address = 0.0.0.0:2345 charset = utf8 client-ips = 127.0.0.1, 172.16.1 7、启动 [root@lb01 ~]# /usr/local/mysql-proxy/bin/mysql-proxyd test start OK: MySQL-Proxy of test is started 8、停止 [root@lb01 ~]# /usr/local/mysql-proxy/bin/mysql-proxyd test stop OK: MySQL-Proxy of test is stopped 9、如果想要让实例名一样,就修改配置文件 [root@lb01 ~]# vim /usr/local/mysql-proxy/conf/test.cnf 42行: instance = app1 [root@lb01 binlog]# mv /usr/local/mysql-proxy/conf/{test ,app1}.cnf 10、启动 [root@lb01 binlog]# /usr/local/mysql-proxy/bin/mysql-proxyd app1 start OK: MySQL-Proxy of app1 is started [root@lb01 binlog]# netstat -lntup Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:2345 0.0.0.0:* LISTEN 127355/mysql-proxy tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 127355/mysql-proxy =========================记录一下 admin-username = user admin-password = pwd admin-address = 0.0.0.0:2345 ================================
Atlas管理接口使用
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 1、连接管理接口登录 [root@lb01 ~]# mysql -uuser -ppwd -h127.0.0.1 -P2345 MySQL [(none)]> show databases; ERROR 1105 (07000): use 'SELECT * FROM help' to see the supported commands MySQL [(none)]> select * from help ; +----------------------------+---------------------------------------------------------+ | command | description | +----------------------------+---------------------------------------------------------+ | SELECT * FROM help | shows this help |#查看帮助 | SELECT * FROM backends | lists the backends and their state |#查看后端数据库 | SET OFFLINE $backend_id | set offline 2; | | SET ONLINE $backend_id | set online 2; |#平滑上线数据库 | ADD MASTER $backend | example: "add master 172.16.1.88:3306;" , ... |#添加一个主库 | ADD SLAVE $backend | example: "add slave 172.16.1.77:3306;" , ... |#添加一个从库 | REMOVE BACKEND $backend_id | example: "remove backend 2;" , ... |#删除后端数据库 | SELECT * FROM clients | select * from clients |#查看可连接的客户端 | ADD CLIENT $client | example: "add client 10.0.0;" , ... |#添加一个允许连接的客户端 | REMOVE CLIENT $client | example: "remove client 10.0.0" , ... |#删除可连接的客户端 | SELECT * FROM pwds | select * from pwds; |#查看在配置文件里面设置的用户和加密的密码 | ADD PWD $pwd | example: "add pwd user:raw_password" , ... |#添加一个明文的密码用户 | ADD ENPWD $pwd | example: "add enpwd user:encrypted_password" , ... |#添加一个加密后的用户密码 | REMOVE PWD $pwd | example: "remove pwd user" , ... |#删除用户 | SAVE CONFIG | save the backends to config file |#直接保持配置文件 | SELECT VERSION | display the version of Atlas |#查看atlas版本号 +----------------------------+---------------------------------------------------------+ MySQL [(none)]> SELECT * FROM backends; +-------------+------------------+-------+------+ | backend_ndx | address | state | type | +-------------+------------------+-------+------+ | 1 | 172.16.1.51:3306 | up | rw | | 2 | 172.16.1.52:3306 | up | ro | | 3 | 172.16.1.53:3306 | up | ro | | 4 | 172.16.1.54:3306 | up | ro | +-------------+------------------+-------+------+
通过altas代理写入数据
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 atlas作为代理,给(VIP)172.16.1.55代理,所有atlas的配置文件里面需要改成代理的vip 1、修改配置文件,将vip添加 [root@lb01 ~]# vim /usr/local/mysql-proxy/conf/app1.cnf 12行 proxy-backend-addresses = 172.16.1.55:3306 proxy-read-only-backend-addresses = 172.16.1.52:3306,172.16.1.53:3306,172.16.1.54:3306 2、重启 [root@lb01 ~]# /usr/local/mysql-proxy/bin/mysql-proxyd app1 restart 3、将写入数据的脚本里面的ip修改成代理ip [root@db01 ~]# vim insert.sh master_ip=172.16.1.5 4、检查主从状态,现在主库是51 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:125895) is running(0:PING_OK), master:172.16.1.51 5、执行脚本,查看是否能够写入数据,并检查数据 root@localhost:(none) > select count(*) from prod.prod; +----------+ | count(*) | +----------+ | 28 | +----------+ 1 row in set (0.00 sec)
12、多台机器部署Atlas
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 1、停止lb01机器上面的atlas服务 [root@lb01 ~]# /usr/local/mysql-proxy/bin/mysql-proxyd app1 stop 2、将atlas包、配置文件发送给其他4台数据库 [root@lb01 ~]# for i in 51 52 53 54;do scp Atlas-2.2.1.el6.x86_64.rpm 172.16.1.${i} :/root;done [root@lb01 ~]# for i in 51 52 53 54;do scp /usr/local/mysql-proxy/conf/app1.cnf 172.16.1.${i} :/usr/local/mysql-proxy/conf/;done 3、4个库安装atlas [root@lb01 ~]# for i in 51 52 53 54;do ssh root@172.16.1.${i} "rpm -ivh Atlas-2.2.1.el6.x86_64.rpm" ;done 4、查看配置文件是否收到 [root@db01 ~]# ll /usr/local/mysql-proxy/conf/ total 8 -rw-r--r-- 1 root root 2832 Aug 31 14:30 app1.cnf -rw-r--r-- 1 root root 2810 Dec 17 2014 test.cnf [root@db01 ~]# rpm -aq |grep Atlas Atlas-2.2.1-1.x86_64 5、目前3306端口数据库在运行,所以改mysql端口不太行,所以先更改atlas端口,改成3307 ,如果有生产环境,这个端口会提前配置好,他就是3306,改数据库的端口就好 [root@db01 ~]# vim /usr/local/mysql-proxy/conf/app1.cnf proxy-address = 0.0.0.0:3307 sed -i 's#proxy-address = 0.0.0.0:3306#proxy-address = 0.0.0.0:3307#g' /usr/local/mysql-proxy/conf/app1.cnf 5、4台数据库启动atlas,并且检查端口 2345 3307端口 /usr/local/mysql-proxy/bin/mysql-proxyd app1 start [root@db04 ~]# netstat -lntup 6、atlas需要使用systemd管理,不然没法加入开机自启动 7、更改程序写数据的脚本,还得加参数-P3307,往atlas里面写数据 [root@db01 ~]# vim insert.sh master_ip=172.16.1.55 mysql_cli="mysql -uroot -p123 -h${master_ip} -P3307" 8、执行脚本 [root@db01 ~]# sh insert.sh Warning: Using a password on the command line interface can be insecure. Warning: Using a password on the command line interface can be insecure.
atlas结合MHA脚本使用
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 1、检查MHA是否启动 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:125895) is running(0:PING_OK), master:172.16.1.51 2、在主从服务器上编写脚本 [root@db01 ~]# vim atlas_mha.sh mha_log="/etc/mha/logs/manager.log" down_master=$(grep '(current master)' ${mha_log} |awk -F '(' '{print $1}' |tail -1) new_master=$(grep '(new master)' ${mha_log} |awk -F '(' '{print $1}' |tail -1) atlas_user='user' atlas_password='pwd' atlas_port='2345' atlas_conn='mysql -u${atlas_user} -p${atlas_password} -P${atlas_port}' for atlas_ip in 51 52 53 54;do down_master_backend_id=$(${atlas_conn} -h172.16.1.${atlas_ip} -e 'select * from backends' | grep "${new_master} " |awk '{print $1}' ) ${atlas_conn} -h172.16.1.${atlas_ip} -e 'remove backend ${down_master_backend_id}' ${atlas_conn} -h172.16.1.${atlas_ip} -e 'add slave ${down_master}:3306;save config' done 3、脚本结合MHA一起使用,修改MHAvip漂移的脚本 在mha机器上修改 [root@lb01 ~]# vim /etc/mha/app1/master_ip_failover sub start_vip () { `ssh $ssh_user \@$new_master_host \" $ssh_start_vip \"`; `/bin/sh -x \"/root/atlas_mha.sh\"`; }
测试
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 1、检查主从,并且检查vip在主库上面 [root@lb01 ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:125895) is running(0:PING_OK), master:172.16.1.51 [root@db01 ~]# ip a 2、执行写入数据的脚本 [root@db01 ~]# sh insert.sh 3、先实时追踪日志 [root@lb01 ~]# tailf /etc/mha/logs/manager.log 4、准备切换,停止主库数据库 [root@db01 ~]# /etc/init.d/mysqld stop change master to master_host='172.16.1.52' , master_user='rep' , master_password='123' , master_auto_position=1; start slave;show slave status\G STOP SLAVE;RESET SLAVE ALL;change master to master_host='172.16.1.52' , master_user='rep' , master_password='123' , master_auto_position=1;start slave;show slave status\G