When SHOW SLAVE STATUS lies
Over-the-Top Tales from the Trenches.
Motto: Bringing order to the chaos of every day DBA life.
So you have got your nice MySQL Master-Slave replication pair setup. Everything is sweet, then the master dies/restarts or you have a slightly extended network outage.
Your monitoring software (in our case Avail) fires off a page and you are rudely interrupted from reading the Pythian blog.
These real world interruptions, what can I say… it pays the bills.
Anyway being the rounded DBA or enlightened DBA as Babette would say, you are capable of handling any type of database. You log into the machine and check out why the slave threw an error or if your monitoring is slow, why the slave is lagging by 2 hours.
You run SHOW SLAVE STATUS\G
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: x.x.x.x
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000480
Read_Master_Log_Pos: 690470773
Relay_Log_File: db2-relay-bin.000028
Relay_Log_Pos: 683977007
Relay_Master_Log_File: mysql-bin.000480
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB: avail,avail
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error: Could not parse relay log event entry. The possible reasons are: the master’s binary log is corrupted
(you can check this by running ‘mysqlbinlog’ on the binary log), the slave’s relay log is corrupted
(you can check this by running ‘mysqlbinlog’ on the relay log), a network problem,
or a bug in the master’s or slave’s MySQL code. If you want to check the master’s binary log or slave’s relay log,
you will be able to know their names by issuing ‘SHOW SLAVE STATUS’ on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 126
Relay_Log_Space: 690471192
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
1 row in set (0.00 sec)
The bits which are important are in red.
Now the normal procedure for this kind of error is to get the Exec_Master_Log_Pos and tell the slave to restart from there. Using a command like: (more…)
