In this blog entry, I will discuss strategies and techniques to resolve ‘log file sync’ waits. This entry is intended to show an approach based upon scientific principles, not necessarily a step-by-step guide. Let’s first understand the role LGWR plays in implementing the commit mechanism.
Commit Mechanism and LGWR Internals
At commit time, a process creates a redo record (containing commit opcodes) and copies that redo record into the log buffer. Then, that process signals LGWR to write the contents of the log buffer. LGWR writes from the log buffer to the log file and then signals the user process back, completing the commit. A commit is considered successful only after the LGWR write is successful.
Of course, there are minor deviations from this general concept, such as latching, commits from a PL/SQL block, or IMU-based commit generation. But the general philosophy remains the same.
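To make the handshake concrete, here is a minimal Python sketch of the general concept described above. It is a toy model under loud assumptions, not Oracle's implementation: the class `ToyLogWriter`, its attributes, and the record format are all hypothetical names, Python lists stand in for the log buffer and the log file, and `threading.Semaphore` stands in for whatever signalling the operating system actually provides.

```python
import threading


class ToyLogWriter:
    """Toy model of the commit handshake. Hypothetical names; not Oracle code."""

    def __init__(self):
        self.log_buffer = []   # stands in for the redo log buffer
        self.log_file = []     # stands in for the online redo log file
        self.buffer_lock = threading.Lock()        # crude stand-in for latching
        self.lgwr_wakeup = threading.Semaphore(0)  # user processes post LGWR here
        self.stopping = False
        self.lgwr = threading.Thread(target=self._lgwr_loop, daemon=True)
        self.lgwr.start()

    def commit(self, session_id):
        """Called by a 'user process': copy redo, post LGWR, wait for the write."""
        done = threading.Semaphore(0)              # LGWR posts back on this
        record = f"commit record for session {session_id}"
        with self.buffer_lock:
            self.log_buffer.append((record, done))
        self.lgwr_wakeup.release()                 # signal LGWR to write
        done.acquire()                             # analogue of the 'log file sync' wait
        return True

    def _lgwr_loop(self):
        while True:
            self.lgwr_wakeup.acquire()             # sleep until posted
            if self.stopping:
                break
            with self.buffer_lock:
                batch, self.log_buffer = self.log_buffer, []
            for record, _ in batch:
                self.log_file.append(record)       # the "write" to the log file
            for _, done in batch:
                done.release()                     # signal each waiting committer

    def close(self):
        """Stop the toy LGWR thread once no commits are in flight."""
        self.stopping = True
        self.lgwr_wakeup.release()
        self.lgwr.join()


if __name__ == "__main__":
    writer = ToyLogWriter()
    sessions = [threading.Thread(target=writer.commit, args=(i,)) for i in range(4)]
    for s in sessions:
        s.start()
    for s in sessions:
        s.join()
    print(len(writer.log_file), "commit records written")
    writer.close()
```

The point to notice is that `commit` does not return until the writer thread has appended the record and posted back. The time spent blocked in `done.acquire()` is the toy equivalent of the wait this entry is about, and it grows with anything that slows the writer down.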
Signals, Semaphores and LGWR
The following section introduces the internal workings of the commit and LGWR interaction on Unix platforms. There are minor implementation differences among Unix flavours and on platforms like NT/XP, for example the use of post-wait drivers instead of semaphores. This section will introduce, but not necessarily dive deep into, the internals. I used truss to trace LGWR and a user process. The command is:
truss -rall -wall -fall -vall -d -o /tmp/truss.log -p 22459
(A word of caution: don’t truss LGWR or any other background process unless it is absolutely necessary. Doing so can cause performance issues or, worse, shut down the database.)