Preventing runaway queries using Resource Governor

Nov 22, 2013 / By Utsab Chattopadhyay

Tags: , , , , ,

Section A (Theory)

One of the typical reasons for performance issues is runaway queries executed by end users on a busy production database. This problem is fairly common in OLAP or shared systems (OLTP/OLAP) where business users begin executing complex (and often poorly written) queries, leaving their office for hours, and expecting their data to be ready when they return. Although these problems are apparently very easy to resolve by killing the offending query, in reality it is not that straightforward. We cannot just kill those queries without some explicit business approval, as we do not understand the business impact of killing the offending queries halfway through.

The traditional way to handle this problem is to educate users about when they can execute big queries, and also to review and approve each query before they could be executed on production. However, this approach is very expensive and prone to failure from human error and/or business needs. I found a huge benefit from using Resource Governor to handle such situations. Please refer to http://technet.microsoft.com/en-us/library/bb934084(v=sql.105).aspx to learn more about Resource Governor.

At this point, let us focus on how we can use Resource Governor to handle this situation efficiently. All of these runaway queries are typically generated from SQL Server Management Studio, and ideally no one should connect to Production OLTP Systems directly using Management Studio. Therefore, we can establish a business rule stating that any queries coming from SQL Server Management Studio will be deprioritized if SQL Server is busy serving anything else (i.e.  OLTP Applications, etc.) Once this rule is established and approved, we can use Resource Governor to design and implement a solution. These steps are described in Section B (Demo) of this post.

Ensure that you have a thorough plan while implementing Resource Governor on Production using the steps from Section B (Demo). First, you need to be absolutely sure that no business critical query is connecting to production using SQL Server Management Studio. You also need to identify what the optimum resource constraint is for your environment. For example, in the demo I will maximize CPU and memory at 25%, but this may not be the best number for your environment. Lastly, you should explain the possible impact after implementation to all stakeholders (i.e. their ad hoc queries will run fine if the system is not busy serving OLTP Applications. However, it will get deprioritized and performance will be degraded once the system is busy.)

Section B (Demo)

In this section, we will configure Resource Governor on a SQL Server 2008 R2 Enterprise Edition. Please use the following steps to complete the configuration:

  1. Make sure you are connected to correct SQL Server Instance – A very basic, but vital step while working on production.RG_1
  2.  Define Resource Pool with the parameters identified beforehand.RG_2
  3.  Reconfigure Resource GovernorRG_3
  4. Create a workload group with the parameters identified beforehand. Basically, all connections coming from SQL Server Management Studio will be handled through this group.

    RG_4
  5.  Reconfigure Resource Governor

    RG_3
  6. Create a Classifier Function

    RG_6
  7.  Configure Resource Governor to use Classifier Function created in Step#6RG_7
  8. Reconfigure Resource GovernorRG_3
  9. Now verify all the changes are successful:RG_9After this point, any queries coming through SQL Server Management Studio (ad hoc query) will be screened by Resource Governor and will be deprioritized if needed, ensuring vital application does not experience performance issues from runaway queries.You can check which SPID is using which Resource Group by using the query below to ensure that the configuration is working as expected. Please note that in this case, Resource Governor classified SPID 54 and 55 under GROUP_ADHOC_QUERY_SS which means these two SP will not be able to take more than 25% of CPU and memory of the system (as per the configuration we defined here) irrespective of how much demanding query they are running.RG_LastSection C (Appendix)

    Please find the codes used in Section B (Demo) for your reference:

    select @@SERVERNAME

    Create RESOURCE POOL POOL_ADHOC_QUERY
    WITH
    (MIN_CPU_PERCENT = 1, MAX_CPU_PERCENT=25,MIN_MEMORY_PERCENT = 1, MAX_MEMORY_PERCENT = 25);

    GO

    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO

    CREATE WORKLOAD GROUP [GROUP_ADHOC_QUERY_SSMS] WITH(group_max_requests=0,importance=Low,request_max_cpu_time_sec=600,
    request_max_memory_grant_percent=25,request_memory_grant_timeout_sec=0,max_dop=0) USING [POOL_ADHOC_QUERY]
    GO

    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO
    USE master;
    GO
    CREATE FUNCTION RG_Classifier() RETURNS SYSNAME WITH SCHEMABINDING
    AS
    BEGIN
    DECLARE @workload_group sysname;
    IF (APP_NAME() LIKE ‘%Microsoft SQL Server Management Studio – Query%’)
    SET @workload_group = ‘GROUP_ADHOC_QUERY_SSMS’;
    RETURN @workload_group;
    END;

    ALTER RESOURCE GOVERNOR
    WITH (CLASSIFIER_FUNCTION=dbo.RG_Classifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE

    ALTER RESOURCE GOVERNOR RECONFIGURE;
    GO

    USE master
    SELECT * FROM sys.resource_governor_resource_pools
    SELECT * FROM sys.resource_governor_workload_groups
    GO

    SELECT s.group_id, CAST(g.name as nvarchar(20)), s.session_id, s.login_time, CAST(s.host_name as nvarchar(20)), CAST(s.program_name AS nvarchar(20))
    FROM sys.dm_exec_sessions s
    INNER JOIN sys.dm_resource_governor_workload_groups g
    ON g.group_id = s.group_id
    ORDER BY g.name
    GO

     

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>