
    All about “log file sync” wait event in Oracle Database

    What is a ‘log file sync’ wait?

    When a user session commits, all redo records generated by that session’s transaction need to be flushed out from memory to the redo logfile to ensure changes to the database made by that transaction become permanent.

    At the time of commit, the user session posts LGWR to write the log buffer (containing the current unwritten redo, including this session’s redo records) to the redo log file. Once LGWR knows that its write requests have completed, it posts the user session to notify it. The user session waits on ‘log file sync’ until LGWR posts it back to confirm that all the redo it generated has made it safely onto disk.

    The time between the user session posting the LGWR and the LGWR posting the user after the write has completed is the wait time for ‘log file sync’ that the user session will show.
    Note that in 11.2 and higher, LGWR may dynamically switch from the default post/wait mode to a polling mode, in which it maintains its writing progress in an in-memory structure and sessions waiting on ‘log file sync’ periodically check that structure (i.e. poll) to see whether LGWR has progressed far enough that the redo records representing their transactions have made it to disk. In that case the wait time spans from posting LGWR until the session sees that sufficient progress has been made.
    NOTE: if a sync is ongoing, other sessions that want to commit (and thus flush log information) will also wait for LGWR to sync and will also wait on ‘log file sync’.
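
    The post/wait handshake described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyLogWriter`, `commit`, `redo_position` and `write_seconds` are hypothetical names invented for illustration, the "write" is a simple sleep, and nothing here reflects how LGWR is actually implemented. It shows that the time a session spends in `commit` runs from posting the writer until the writer reports that the session's redo is on disk, and that a session arriving during an ongoing write waits as well.

```python
import threading
import time


class ToyLogWriter:
    """Toy stand-in for LGWR (hypothetical model, not Oracle internals)."""

    def __init__(self, write_seconds=0.01):
        self.write_seconds = write_seconds  # simulated duration of one redo write
        self.requested = 0                  # highest redo position a session asked to flush
        self.on_disk = 0                    # highest redo position known to be written
        self.stopping = False
        self.cond = threading.Condition()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            with self.cond:
                # Sleep until some session posts a flush request (or we are stopped).
                while self.requested <= self.on_disk and not self.stopping:
                    self.cond.wait()
                if self.stopping:
                    return
                target = self.requested
            time.sleep(self.write_seconds)  # the simulated write to the redo log
            with self.cond:
                self.on_disk = target
                self.cond.notify_all()      # post back every waiting session

    def commit(self, redo_position):
        """Post the writer, wait for the post back, return the time waited in seconds."""
        start = time.monotonic()
        with self.cond:
            self.requested = max(self.requested, redo_position)
            self.cond.notify_all()          # post the writer
            while self.on_disk < redo_position:
                self.cond.wait()            # the toy equivalent of 'log file sync'
        return time.monotonic() - start

    def stop(self):
        with self.cond:
            self.stopping = True
            self.cond.notify_all()
        self.thread.join()


if __name__ == "__main__":
    writer = ToyLogWriter(write_seconds=0.01)
    waited = writer.commit(redo_position=5)
    print(f"session waited {waited * 1000:.1f} ms for its redo to reach disk")
    writer.stop()
```

    In this model the polling mode would correspond to sessions re-reading `on_disk` on a timer instead of sleeping on the condition variable until they are notified.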

    What should be collected for initial diagnosis of ‘log file sync’ waits?

    To initially analyze ‘log file sync’ waits, the following information is helpful:
    • AWR report from a comparable time frame and period where the time waited for ‘log file sync’ is “acceptable”, to use as a baseline of reasonable performance for comparison
    • AWR report from when “excessive” ‘log file sync’ waits are occurring
    Note: Each of the two reports should cover between 10 and 30 minutes.
    • LGWR trace file (including LGnn traces in 12.2 and higher)
    The LGWR trace file shows warning messages for periods when ‘redo writing’ times may be high.
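
    As a rough aid for scanning an LGWR trace file, the sketch below pulls out slow-write warnings. It assumes the warnings look like `Warning: log write elapsed time 550ms, size 2KB`; that wording is an assumption and should be checked against your own trace files, since it may differ between versions. The function name `slow_log_writes` and the sample lines are hypothetical and for illustration only.

```python
import re

# Assumed warning format; verify it against your own LGWR trace files.
WARNING_RE = re.compile(r"Warning: log write elapsed time (\d+)ms, size (\d+)KB")


def slow_log_writes(trace_lines, min_elapsed_ms=0):
    """Return (elapsed_ms, size_kb) for each warning at or above min_elapsed_ms."""
    hits = []
    for line in trace_lines:
        match = WARNING_RE.search(line)
        if match is None:
            continue  # not a slow-write warning
        elapsed_ms = int(match.group(1))
        size_kb = int(match.group(2))
        if elapsed_ms >= min_elapsed_ms:
            hits.append((elapsed_ms, size_kb))
    return hits


# Illustrative lines only, not taken from a real system.
sample_lines = [
    "*** 2011-10-26 10:14:41.718",
    "Warning: log write elapsed time 21130ms, size 1KB",
    "some unrelated trace line",
    "Warning: log write elapsed time 550ms, size 2KB",
]

for elapsed_ms, size_kb in slow_log_writes(sample_lines, min_elapsed_ms=1000):
    print(f"slow redo write: {elapsed_ms} ms for {size_kb} KB")
```

    A long elapsed time for a small write size points at the IO path rather than at the volume of redo being written.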

    What causes high waits for ‘log file sync’?

    Waits for the ‘log file sync’ event can occur at any stage between a user process posting LGWR to write redo information and the user process waking up to receive the post (or polling to see) that LGWR has written the redo as requested. That span includes LGWR writing the redo from the log buffer to disk (the local redo logs and, optionally, propagation to remote standby databases in SYNC mode) and LGWR posting back the user process.
    For more information see:
    Document:34592.1 WAITEVENT: “log file sync”

    The most common causes are:
    • Issues affecting LGWR’s IO Performance
    • Excessive Application Commits
    Details of these causes and how to troubleshoot them are outlined below:

    Issues affecting LGWR’s IO Performance

    The primary question we are looking to answer here is “Is LGWR slow in writing to disk?”. The following steps can help determine whether this is the case.
    Compare the average wait time for ‘log file sync’ to the average wait time for ‘log file parallel write’.


    The wait event ‘log file parallel write’ is waited for by LGWR while the actual write operation to the redo log is occurring. The duration of the event shows the time waited for the IO portion of the operation to complete. For more information on ‘log file parallel write’ see:
    Document:34583.1 WAITEVENT: “log file parallel write” Reference Note

    Looking at this event in conjunction with “log file sync” shows how much of the sync operation is spent on IO and also, by inference, how much processing time is spent on the CPU.

    [Example AWR excerpt not reproduced here: it shows high wait times for both ‘log file sync’ and ‘log file parallel write’.]

    If the proportion of the ‘log file sync’ time spent on ‘log file parallel write’ is high, then most of the wait time is due to IO (waiting for the redo to be written), and the IO performance of LGWR should be examined. As a rule of thumb, an average time for ‘log file parallel write’ over 20 milliseconds suggests a problem with the IO subsystem (the typical time may be much smaller on more modern storage systems with lots of disk caching and/or non-moving parts, e.g. SSD, NVRAM, etc.).
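
    The comparison can be worked through with a few lines of arithmetic. The sketch below is illustrative only: `WaitStats`, `io_fraction` and `write_looks_slow` are hypothetical helpers, and the figures are made up rather than taken from a real AWR report. It divides total time waited by the number of waits for each event, then expresses the average ‘log file parallel write’ time as a fraction of the average ‘log file sync’ time and applies the 20 ms rule of thumb.

```python
from dataclasses import dataclass


@dataclass
class WaitStats:
    """Totals for one wait event, e.g. as read from an AWR report."""
    total_waits: int
    time_waited_ms: float

    @property
    def avg_ms(self) -> float:
        # Average wait in milliseconds; 0.0 if the event was never waited on.
        return self.time_waited_ms / self.total_waits if self.total_waits else 0.0


def io_fraction(sync: WaitStats, write: WaitStats) -> float:
    """Average 'log file parallel write' as a fraction of average 'log file sync'."""
    if sync.avg_ms == 0:
        return 0.0
    return write.avg_ms / sync.avg_ms


def write_looks_slow(write: WaitStats, threshold_ms: float = 20.0) -> bool:
    """Apply the 20 ms rule of thumb; lower the threshold for modern storage."""
    return write.avg_ms > threshold_ms


# Made-up figures for illustration.
log_file_sync = WaitStats(total_waits=1000, time_waited_ms=30000.0)
log_file_parallel_write = WaitStats(total_waits=1000, time_waited_ms=25000.0)

fraction = io_fraction(log_file_sync, log_file_parallel_write)
print(f"avg sync {log_file_sync.avg_ms:.1f} ms, "
      f"avg write {log_file_parallel_write.avg_ms:.1f} ms, "
      f"IO share {fraction:.0%}, "
      f"slow write: {write_looks_slow(log_file_parallel_write)}")
```

    With these figures the averages are 30 ms and 25 ms, so roughly five sixths of the sync time is spent on the write itself, and the write average is above the 20 ms guideline. What counts as a "high" proportion is a judgment call; the source gives no fixed cut-off.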

    Recommendations

    • Work with the system administrator to examine the file systems / logical volumes where the redologs are located with a view to improving the performance of IO.


    • Avoid placing redo logfiles on older or less sophisticated RAID implementations that must calculate parity and write to multiple disks (such as RAID-5 or RAID-6) and that have little front-end caching or buffering, or no dedicated CPU resources, to mask that overhead.


    • Avoid placing redo logs on older generations of Solid State Disk (SSD) technologies.
    Although SSD write performance is generally good on average, SSDs may suffer write peaks that sharply increase some ‘log file sync’ waits, which may result in choppy performance or even transient database hangs. (This should be tested, as there are cases where performance is still acceptable on SSD despite the uneven IO response times.)
    Oracle Engineered Systems (Exadata, SuperCluster and Oracle Database Appliance) have been optimized to leverage SSDs and newer related technologies more effectively.


    • Look for other processes that may be writing to the same disk location or general IO paths and ensure that the storage systems have sufficient bandwidth to cope with the required IO traffic activity. If they do not then consider adding / modernizing the storage to increase the load it can handle or rebalance the existing IO activity as much as possible across what is currently available.

    References:

    • Troubleshooting: ‘Log file sync’ Waits (Doc ID 1376916.1)
    • Alternative and Specialised Options as to How to Avoid Waiting for Redo Log Synchronization (Doc ID 857576.1)
    • Using 4k Redo Logs on Flash, 4k-Disk and SSD-based Storage (Doc ID 1681266.1)
