Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 INFO topology_recovery: Running 1 PostFailoverProcesses hooks Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 INFO topology_recovery: Running PostFailoverProcesses hook 1 of 1: /root/postfailover.sh Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 INFO CommandRun(/root/postfailover.sh,[]) Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 INFO CommandRun/running: bash /tmp/orchestrator-process-cmd-542186076 Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 DEBUG outdated keys: [mysql3:3306] Apr 04 11:30:50 mysql1 orchestrator[4453]: 2018-04-04 11:30:50 DEBUG Discovered host: mysql3:3306, master: mysql2:3306, version: 5.6.39-log in 0.009s (Backend: 0.006s, Instance: 0.004s) ... Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO topology_recovery: done running PostFailoverProcesses hooks Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO topology_recovery: Waiting for 0 postponed functions Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 DEBUG PostponedFunctionsContainer: waiting on 0 postponed functions Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 DEBUG PostponedFunctionsContainer: done waiting Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO topology_recovery: Executed 0 postponed functions Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 DEBUG ChangeMasterTo: will attempt changing master on mysql1:3306 to mysql2:3306, mysql-bin.000013:976 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO ChangeMasterTo: Changed master on mysql1:3306 to: mysql2:3306, mysql-bin.000013:976. GTID: false Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 DEBUG ChangeMasterTo: will attempt changing master credentials on mysql1:3306 Apr 04 11:30:55 mysql1 orchestrator[4453]: 2018-04-04 11:30:55 INFO ChangeMasterTo: Changed master credentials on mysql1:3306 Apr 04 11:30:55 mysql1 orchestrator[4453]: [martini] Completed 200 OK in 5.515153338sI've opened an issue to suggest a new hook is added, and the fix is already in one of the dev branches of Orchestrator. It also makes sense to add the old master to the readers hostgroup after the promotion. The problem is there is currently no way to have Orchestrator start the slave threads on the old master. I opened another issue to explore options to deal with this. Finally, we need to take care of setting the old master back ONLINE in ProxySQL hosts table (remember we have to set it to OFFLINE_SOFT for the purpose of master switch).
(admin@127.0.0.1) [(none)]>select * from runtime_mysql_servers; +--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment | +--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | 0 | mysql1 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | | | 1 | mysql1 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | | | 1 | mysql2 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | | | 1 | mysql3 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | | +--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+I have the master in both writer and reader hostgroups, so if you wanted to use read/write splitting the master is still able to service read traffic. We need to develop two scripts that will be run as Orchestrator hooks for pre and post failover tasks, and configure Orchestrator to invoke them at the appropriate time.
"ApplyMySQLPromotionAfterMasterFailover": true, "PreGracefulTakeoverProcesses": [ "/tmp/prefailover.sh" ], "PostMasterFailoverProcesses": [ "/tmp/postfailover.sh"Make sure to have Orchestrator refresh the configuration by either running `service orchestrator reload` or call /api/reload-configuration after you make the changes.
#!/bin/bash
# Variable exposed by Orchestrator
OldMaster=$ORC_FAILED_HOST
PROXYSQL_HOST="mysql3"
# stop accepting connections to old master
(
echo 'UPDATE mysql_servers SET STATUS="OFFLINE_SOFT" WHERE hostname="'"$OldMaster"'";'
echo "LOAD MYSQL SERVERS TO RUNTIME;"
) | mysql -vvv -uivan -p**** -h ${PROXYSQL_HOST} -P6032
# wait while connections are still active and we are in the grace period
CONNUSED=`mysql -uivan -p**** -h ${PROXYSQL_HOST} -P6032 -e 'SELECT IFNULL(SUM(ConnUsed),0) FROM stats_mysql_connection_pool WHERE status="OFFLINE_SOFT" AND srv_host="'"$OldMaster"'"' -B -N 2> /dev/null`
TRIES=0
while [ $CONNUSED -ne 0 -a $TRIES -ne 20 ]
do
CONNUSED=`mysql -uivan -p**** -h ${PROXYSQL_HOST} -P6032 -e 'SELECT IFNULL(SUM(ConnUsed),0) FROM stats_mysql_connection_pool WHERE status="OFFLINE_SOFT" AND srv_host="'"$OldMaster"'"' -B -N 2> /dev/null`
TRIES=$(($TRIES+1))
if [ $CONNUSED -ne "0" ]; then
sleep 0.05
fi
done
#!/bin/bash
OldMaster=$ORC_FAILED_HOST
NewMaster=$ORC_SUCCESSOR_HOST
PROXYSQL_HOST="mysql3"
# remove old master from writers hostgroup
(
echo 'DELETE FROM mysql_servers WHERE hostgroup_id=0 AND hostname="'"$OldMaster"'";'
echo 'LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;'
) | mysql -vvv -uivan -p**** -h ${PROXYSQL_HOST} -P6032
# promote the new master by adding to the writers hostgroup
(
echo 'INSERT INTO mysql_servers(hostgroup_id,hostname,port,status) values (0, "'"$NewMaster"'", 3306, "ONLINE");'
echo 'LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;'
) | mysql -vvv -uivan -p**** -h ${PROXYSQL_HOST} -P6032
# if graceful then set old master ONLINE in read hostgroup and start replication
if [ "$ORC_COMMAND" == "graceful-master-takeover" ]
then
(
echo 'UPDATE mysql_servers SET status="ONLINE" WHERE hostgroup_id=1 AND hostname="'"$OldMaster"'";'
echo 'LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;'
) | mysql -vvv -uivan -p**** -h ${PROXYSQL_HOST} -P6032
# start replication threads on the old master
# not working for now as hook is fired too soon https://github.com/github/orchestrator/issues/453
#mysql -vvv -h$OldMaster -uorchestrator -p**** -e'start slave;'
fi
[root@mysql1 ~]# sysbench --report-interval=1 --db-driver=mysql --num-threads=4 --max-requests=0 --max-time=2000 --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-user=app --mysql-password=**** --mysql-host=mysql3 --mysql-port=6033 --mysql-db=test --oltp-read-only=off run WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options. WARNING: --num-threads is deprecated, use --threads instead WARNING: --max-time is deprecated, use --time instead sysbench 1.0.9 (using system LuaJIT 2.0.4) Running the test with following options: Number of threads: 4 Report intermediate results every 1 second(s) Initializing random number generator from current time Initializing worker threads... Threads started! [ 1s ] thds: 4 tps: 116.62 qps: 2367.35 (r/w/o: 1661.63/467.49/238.23) lat (ms,95%): 57.87 err/s: 0.00 reconn/s: 0.00 [ 2s ] thds: 4 tps: 78.99 qps: 1580.79 (r/w/o: 1105.85/313.96/160.98) lat (ms,95%): 80.03 err/s: 0.00 reconn/s: 0.00 [ 3s ] thds: 4 tps: 78.01 qps: 1569.30 (r/w/o: 1097.21/312.06/160.03) lat (ms,95%): 81.48 err/s: 0.00 reconn/s: 0.00 [ 4s ] thds: 4 tps: 118.03 qps: 2360.57 (r/w/o: 1656.40/465.11/239.06) lat (ms,95%): 42.61 err/s: 0.00 reconn/s: 0.00 [ 5s ] thds: 4 tps: 111.96 qps: 2242.21 (r/w/o: 1573.45/439.85/228.92) lat (ms,95%): 43.39 err/s: 0.00 reconn/s: 0.00 [ 6s ] thds: 4 tps: 83.92 qps: 1635.45 (r/w/o: 1133.92/334.68/166.84) lat (ms,95%): 77.19 err/s: 0.00 reconn/s: 0.00 [ 7s ] thds: 4 tps: 92.03 qps: 1878.63 (r/w/o: 1325.44/367.12/186.06) lat (ms,95%): 66.84 err/s: 0.00 reconn/s: 0.00 [ 8s ] thds: 4 tps: 81.08 qps: 1604.53 (r/w/o: 1117.07/322.31/165.16) lat (ms,95%): 70.55 err/s: 0.00 reconn/s: 0.00 [ 9s ] thds: 4 tps: 90.99 qps: 1822.79 (r/w/o: 1276.85/363.96/181.98) lat (ms,95%): 69.29 err/s: 0.00 reconn/s: 0.00 [ 10s ] thds: 4 tps: 100.95 qps: 2018.09 (r/w/o: 1408.37/405.82/203.91) lat (ms,95%): 47.47 err/s: 0.00 reconn/s: 0.00 [ 11s ] thds: 4 tps: 102.47 qps: 2059.27 (r/w/o: 1448.45/402.90/207.92) lat (ms,95%): 66.84 err/s: 0.00 reconn/s: 0.00 [ 12s ] thds: 4 tps: 79.44 qps: 1579.83 (r/w/o: 1103.16/316.77/159.89) lat (ms,95%): 89.16 err/s: 0.00 reconn/s: 0.00 [ 13s ] thds: 4 tps: 69.80 qps: 1419.90 (r/w/o: 993.13/285.18/141.59) lat (ms,95%): 108.68 err/s: 0.00 reconn/s: 0.00 [ 14s ] thds: 4 tps: 53.05 qps: 1067.93 (r/w/o: 756.66/204.18/107.09) lat (ms,95%): 127.81 err/s: 0.00 reconn/s: 0.00 [ 15s ] thds: 4 tps: 77.99 qps: 1556.87 (r/w/o: 1081.91/317.97/156.99) lat (ms,95%): 108.68 err/s: 0.00 reconn/s: 0.00 [ 16s ] thds: 4 tps: 87.18 qps: 1704.50 (r/w/o: 1189.44/339.70/175.36) lat (ms,95%): 75.82 err/s: 0.00 reconn/s: 0.00 [ 17s ] thds: 4 tps: 75.99 qps: 1559.79 (r/w/o: 1099.85/303.96/155.98) lat (ms,95%): 71.83 err/s: 0.00 reconn/s: 0.00 [ 18s ] thds: 4 tps: 62.97 qps: 1255.37 (r/w/o: 878.56/247.88/128.93) lat (ms,95%): 63.32 err/s: 0.00 reconn/s: 0.00 [ 19s ] thds: 4 tps: 77.03 qps: 1491.55 (r/w/o: 1034.38/305.11/152.06) lat (ms,95%): 73.13 err/s: 0.00 reconn/s: 0.00 [ 20s ] thds: 4 tps: 82.90 qps: 1703.04 (r/w/o: 1195.62/335.61/171.80) lat (ms,95%): 68.05 err/s: 0.00 reconn/s: 0.00Can you spot the switchover? I cannot! It was done around second 17 via drag and dropping one of the slaves to the master position in Orchestrator GUI.
We've seen how ProxySQL and Orchestrator can be integrated by leveraging the hooks functionality of Orchestrator to make changes in ProxySQL configuration. This is a better approach than having ProxySQL decide who the master is based on the value of read_only variable alone (by using mysql_replication_hostgroups table). The hooks will handle topology changes regardless of whether it's a response to a failure OR a graceful takeover. We also performed a graceful master switchover with ProxySQL and Orchestrator via the GUI that completed in under 1 second, without returning any errors to the application. The only missing piece to fully automate a graceful master switchover is starting the slave threads automatically after the old master is set as a slave, but I expect this will be solved very soon either via a new hook or a new Orchestrator parameter.
Looking to optimize your MySQL use?