SQL server folks may find the "non-yielding scheduler" warning familiar and it is usually (if not always) associated with a stack dump. The "non-yielding scheduler" message comes with warnings like the following:
Process %ld:%ld:%ld (0x%lx) Worker 0x%p appears to be non-yielding on Scheduler %ld. Thread creation time: %time. Approx Thread CPU Used: kernel 0x ms, user 0x ms. Process Utilization 0x%. System Idle 0x%. Interval: 0x ms.
This means that there is a background thread that keeps running and making other worker threads yielding back, in time. There are many causes of the "non-yielding scheduler" error. Memory dump analysis is necessary to determine the cause. Look up the error in MS Support Advanced Search and you will find plenty of KBs dedicated to it. You may also find (not all cases) errors 17883, 17884 logged in the event viewer, containing information like this:
Event ID: 17884 Description: New queries assigned to process on Node 0x have not been picked up by a worker thread in the last 0x seconds. Blocking or long-running queries can contribute to this condition, and may degrade client response time.
Event ID: 17883 Description: Process Worker appears to be non-yielding on Scheduler . Thread creation time: . Approx Thread CPU Used: kernel ms, user ms. Process Utilization %. System Idle %. Interval: ms.
There is a must-read whitepaper describing SQL Server Scheduling and how to interpret and diagnose Errors 17883, 17884, 17887, and 17888; please look here: How To Diagnose and Correct Errors 17883, 17884, 17887, and 17888.
Recently, we had a client having same issue. The client runs a busy online business with thousands of connected users; databases sometimes include more than half a million tables. We had a healthy counters and server didn't seem to suffer from memory or I/O bottlenecks. We did suspect, however, that it is something to do with SQL Server internals, as we also received fatal exceptions pertaining to SPIDs <50 (system processes).
We opened a case with MS support after the dumps continued to happen and their reply was: After analyzing the dumps, we see that the thread has non-yielded while updating index usage stats. These stats are returned through DMV sys.dm_db_index_usage_stats. According to MS, the issue is a known bug that has been corrected in SQL Server 2008 but isn't expected to be resolved in SQL Server 2005.
Enable trace flag 2330 to prevent the update of data in DMV in sys.dm_db_index_usage_stats. This DMV retrieves information about index usage; it does not, however, generate this information itself, but rather just returns this info from a cache inside SQL Server. The impact of disabling this job was nothing to the server operations.
It is worthwhile to note that sys.dm_db_index_operational_stats can report similar statistics as sys.dm_db_index_physical_stats. Craig Freedman wrote a good blog on the differences between these views.
[Image comparing sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats data points]
Microsoft released a "FAST PUBLISH" KB about this issue which mentions the following scenario:
sys.dm_db_index_usage_stats DMV.This aligns with what we had with the affected server (more than half a million tables). The KB mentions these workarounds:
Note: make sure that the trace is enabled by running DBCC TRACESTATUS(-1). You can also look in the SQL Server error log and note whether the trace flags are mentioned:
Time Server -T1222Time Server -T845Time Server -T2330If you face similar issues, you should install the latest service pack and cumulative update. You can try trace T2330 as well, and if you still get errors, open a case with Microsoft Support to analyze the dumps. Hope this helps!
Ready to future-proof your SQL Server investment?