"If you insert and delete rows in smallish batches at about the same rate in the table, the purge thread can start to lag behind and the table can grow bigger and bigger because of all the “dead” rows, making everything disk-bound and very slow In such a case, throttle new row operations, and allocate more resources to the purge thread by tuning the innodb_max_purge_lag system variable." --InnoDB Multi-Versioning
A classic scenario for such issue is an application taking track of the activity on the network that requires to write a huge number of small inserts into the database. From my tests, I have seen an incredible number of entries in the history list in 5.5 that were not present in the previous InnoDB version and that are not present again in 5.6. The point is that it could happen to have so many transactions, doing INSERT, UPDATE, or DELETE, that the History and un-flushed undo log grow too much. To prevent issues, we should tune the value of the innodb_max_purge_lag in order to allow InnoDB to complete the PURGE operations. Innodb_max_purge_lag is the highest number we want to have in history list, above which Innodb will start to apply an indiscriminate delay in pushing the operations. The formula is quite simple: [code] ((purge_lag/innodb_max_purge_lag)×10)–5 milliseconds. [/code] Or following the code we have: [code] float ratio = (float) trx_sys->rseg_history_len/ srv_max_purge_lag; ((ratio - .5) * 10000); [/code] If we have a History list of 1200 and have set innodb_max_purge_lag to 1000, the result should be: [code] ((1200/1000)X10)-5= 7 ms delay for operation. following the manual ((1200/100) -5) * 10000 = 7000 <-- Microsecond following the code [/code] All match and delay will be 7 ms. Also, the max limit in previous versions for the delay was of 4295 seconds!! In 5.5/5.6, we see a different way of managing the max number of seconds: [code] if (delay > srv_max_purge_lag_delay) { delay = srv_max_purge_lag_delay; } [/code] Srv_max_purge_lag_delay max value is 10 seconds. So in the worse case scenario, the max delay will be 10 seconds. In the past, we were used to see the History list going up and down (relatively) fast, so the force delay was playing its role efficiently. At the same time, we knew that all operations in the Main threads where slowed down, so the forced delay was something we had to leave with or worse things could happen, like the need to perform an emergency page flush from the buffer pool to allow REDO to recover space. But something has changed... For the better, obviously, but we must be careful. It's better because now the purge thread works independently and can scale, and pending undo flushes does not slow down the other operations. Also in 5.6, MySQL could be more or less aggressive in relation to the History list to purge. But these operations remain something we should monitor and tune, for two main reasons: - Space taken by undo log segments is still an issue, and now that the number is increased, it could be even worse. - Setting a wrong value for innodb_max_purge_lag could kill our performance.Running Very Long Transaction If you’re running very long transaction, be it even SELECT, Innodb will be unable to purge records for changes which are done after this transaction has started, in default REPEATABLE-READ isolation mode. This means very long transactions are very bad causing a lot of garbage to be accommodated in the database. It is not limited to undo slots. When we’re speaking about Long Transactions the time is a bad measure. Having transaction in read only database open for weeks does no harm, however if database has very high update rate, say 10K+ rows are modified every second even 5 minute transaction may be considered long as it will be enough to accumulate about 3 million of row changes. (Peter Z.)
But what can really harm your the system is the delay defined by the purge lag to improve the flushing. Assume we define it to innodb_max_purge_lag=200,000, and we do have the number of pending flush as for the above, 359585. Doing calculation as for the previous formula: [code] ((359585/200000)X10)-5= 12.97925 ms delay for operation. [/code] Hey that's not too bad. I will delay only 12.97925 ms to operation/Insert to help the purge. But what is unclear is what an operation is for the delay, or rather, where does the delay really apply? Ready? Are you sitting comfortably? 2) row0mysql.c [code] /*******************************************************************//** Delays an INSERT, DELETE or UPDATE operation if the purge is lagging. */ static void row_mysql_delay_if_needed(void) /*===========================*/ { if (srv_dml_needed_delay) { os_thread_sleep(srv_dml_needed_delay); } } [/code] 3)os0thread.c [code] /*****************************************************************//** The thread sleeps at least the time given in microseconds. */ UNIV_INTERN void os_thread_sleep( /*============*/ ulint tm) /*!< in: time in microseconds */ { #ifdef __WIN__ Sleep((DWORD) tm / 1000); #else struct timeval t; t.tv_sec = tm / 1000000; t.tv_usec = tm % 1000000; select(0, NULL, NULL, NULL, &t); #endif } [/code] Do you get it? The delay is per ROW. So assume that you have a system checking connections status and traffic, collecting statistics every minute for your 100000 connected users, each user generating at least 8 inserts, plus a consolidation operation to get on average 10 minutes for each insert. Each insert per user takes 0.002 second, and all traffic is managed by 300 threads. 100000 x 8 = 800000/300 = 2666 insert for transactions x 0.002ms each = 5.3sec to run all the operations. Now, what if we apply the delay given we have History list quite high? I have to add the 12ms to the 0.002, which will give us 0.122 sec. This gives us 325 seconds (5.4 minutes) for each thread!!!! Do not forget the consolidation process, which needs to run every 10 minutes. So it has to process (1 reads for user per minute x 100000 users) x 10 minutes and split in 10 threads doing it by user id set, assuming each read per user will take 0.001 ms (given already in memory) and 0.002 ms for write. Without delay, we will have = 1,000,000 / 10 = 100,000 x 0.003 = 300sec (5 minutes) for thread. With delay, it will be 0.122 = 12200 seconds (203 minutes) for each operation. Last but not least, the consolidation will collide with the inserts, causing possible increase of the delay because of the REPEATBLE_READ. Another possible issue is that the consolidation will cause pages to remain in a dirty state for too long, possibly causing serious issues in the REDO log in case we need to free space. I did push some numbers a little bit just to make the scenario more clear. Now, just to remove some possible doubt:The new mechanism for the purge is much more flexible and is able to scale. The fact that it's now separate from the main thread reduces a lot of the negative effects. Nevertheless, the capacity that the undo log purge now has also presents possible risks. Huge pending flushes means possible huge space on disk and/or huge delay. The delay behavior is not new and was already present MySQL 5.0, but the new capacity brings it to a higher level of risk, especially in 5.5. In 5.6 the purging thread, is much more efficient, and getting a huge history list was difficult, but when I did get it I saw the same behavior. Whenever we need to tune the purge lag settings, the value needs to be set not in relation to the history list, but in relation to the maximum acceptable delay in the inserts as a cumulative value.
Ready to optimize your MySQL Database for the future?