Overview of Transaction Logging in MySQL
Nov 6, 2008 / By Keith Murphy
I was doing some research over the weekend on how transactions work “under the hood” in MySQL. I thought it might be enlightening if I wrote about what I found.
The database system must resolve the situation where, for whatever reason, the database server stops working in the middle of a transaction. Perhaps the power goes out, the hard drive crashes, or the CPU disappears in a cloud of smoke. Maybe the MySQL Server itself crashes for some reason. What does the MySQL Server do when operation resumes?
Potentially, there are transactions in an inconsistent state. How are inconsistent transactions resolved? They are resolved through a recovery process involving log files that are called transactional logs. There are two kinds of transactional logs: redo logs and undo logs.
Redo logs are used to apply changes that were made in memory but not flushed to the permanent table records (for example, InnoDB’s permanent table records are the ibdata files.) This situation would typically happen after a MySQL Server crash.Â Logging is done this way because it allows faster database operation.
This might be seem counter-intuitive at first look. Instead of writing to one file (the permanent table records file) you are writing to two different files (the redo log file and the permanent table records file). The difference is that the writes to the database table file(s) are random in nature while the writes to the redo log file will be sequential, which is usually going to be much faster. You perform the much-faster writes to the redo log as needed and then perform the slower writes to the table files periodically when there is time. Thus, the system actually operates faster writing to both files rather than only one file. This applying of the redo logs occurs automatically on startup of the MySQL server after a crash. Very similar processes are used by other database platforms.
In addition to the redo log there must also be undo logs. When a database user starts a transaction and executes some commands, the database does not know if the user will end the transaction with a
COMMIT or a
ROLLBACK command. Ending withÂ a
COMMIT means all the changes made in the course of the transaction have to be preserved (fulfilling the Durable aspect of ACID). If the transaction gets interrupted for some reason, such as the MySQL daemon crashing, the client disconnecting before sending a
COMMIT, or the user issuing a
ROLLBACK command, then all changes made by the transaction need to be undone.
If the server crashed, the redo log files are applied first, on start up. This puts the database in a consistent state. Now the database server will need to roll back the transactions which were not committed but had already made changes to the database. Undo logs are used for this. As an example, if you are running a transaction that adds a million rows and the server crashes after eight hundred thousand inserts are performed, the server will first use the redo log to get the database server into a consistent stand than then will perform a rollback of the eight hundred thousand inserts using the undo logs. For InnoDB, this undo information is stored in the ibdata file(s). For Falcon the undo log information is stored in the serial log file.
Transactional logs have some common characteristics. The log records each contain Log Sequence Numbers (LSN) which are unique IDs for each record. There is a also a Previous Log Sequence Number (Prev LSN), which is a link to the last log record. A Transaction ID number is used to reference the database transaction generating the log record.Â Finally, checkpoints are entries in the transactional logs that are used to speed up the processing of those logs when they are needed. Checkpoints contain a list of transactions that are open when the checkpoint was created. This is used by the server to determine how far back into the log it must go to begin processing.
(A small side note. Some people get confused thinking the binary log files are used in this process. That is not true. The binary logs are used only for replication purposes and for manual data recover situations, such as when a backup is restored and you need to “catch the server up” using the binary logs. The server does not use these files in the transactional recovery process.)
That completes this general overview of the transactional logs in MySQL (or any modern RDBMS really). I hope you find it useful.