Distribution Cleanup job and xp_cmdshell

Mar 24, 2008 / By Jon Stairs

Tags: ,

Recently I encountered a SQL Server 2005 Replication setup where the following error message was appearing about every 5 minutes in the SQL Server error logs.

Replication-@rowcount_only parameter must be the value 0,1, or 2. 0=7.0 compatible checksum. 1=only check rowcou: agent distribution@rowcount_only parameter must be the value 0,1, or 2. 0=7.0 compatible checksum. 1=only  scheduled for retry. Could not clean up the distribution transaction tables.

The error message was coming from the failing distribution cleanup job. Examining the job history I was able to determine the real error message.

Executed as user: PROD\SQLAgent. Could not remove directory '\\dummy\Replication\unc\DUMMY_TRANSLOG_TRANSLOG\20080205230108\'. Check the security context of xp_cmdshell and close other processes that may be accessing the directory. [SQLSTATE 42000] (Error 20015). The step failed.

To provide some background information on the Distribution Cleanup Job. When a server is configured to be a Distributer a job is created titled ‘Distribution clean up: distribution’. The job has one step and it calls a stored proc called dbo.sp_MSdistribution_cleanup where it will clear out records from the MSrepl_commands and MSrepl_transactions tables. The cleanup procedure also deletes any expired snapshots stored in the Replication folder.

The first step in debugging was to confirm if xp_cmdshell was enabled. I opened SQL Server Surface Area Configuration and selected “Surface Area Configuration for Features”, selected the xp_cmdshell and checked the enable box. Xp_cmdshell is disabled by default in SQL Server 2005 for security reasons.

The second part of debugging was to make sure that the PROD\SQLAgent account had proper permissions to run xp_cmdshell statements. Only members of the sysadmin role has permission to create jobs with steps that run xp_cmdshell, if PROD\SQLAgent is not a member of sysadmin role, the job could be given permission to run xp_cmdshell by using a proxy account.

The final part of debugging was to ensure that proper permissions were set to delete the specified folder. The stored procedure dbo.sp_MSdistribution uses the unc path, not the local path to the folder. I configured the SQL Server service account to have full control on the replication share. There is a distinction between the permissions set on the folder and the network share.

The reason I set the SQL Server service account instead of PROD\SQLAgent to have full control on the share is because when an xp_cmdshell command it runs on the os as the SQL Server service account not the user issuing the command. To prove this I have run the following commands

C:\>echo %USERNAME%
pythian

C:\>osql -E
1> xp_cmdshell 'echo %USERNAME% > c:\whoami.txt'
2> go
 output

 -------------------------------------------------------------------------------

        ------------------------------------------------------------------------

        ------------------------------------------------------------------------

        --------------------------------
 NULL

(1 row affected)
1> exit

C:\>more whoami.txt
NETWORK SERVICE

As we can see in the previous test that the xp_cmdshell is run under the NETWORK SERVICE account instead of the pythian user which I was logged in as.

After making these changes the distribution clean up job was able to run successfully and the errors stopped occurring in the Error logs.

7 Responses to “Distribution Cleanup job and xp_cmdshell”

  • Peter Cwik says:

    Holy &*(&(^! I have been killing myself with this one! Make sure the share has permissions!?! That was the answer!! Thank You!!!!!!!!!!!!!!!!!

  • Eugene Chow says:

    This solves my problem with the permission on the UNC share. Everything else was Ok. Thanks again.

  • jd says:

    Permissions on the share worked for me too! Thanks.

  • Chris says:

    This is what I was looking for. I didn’t know that xp_cmdshell runs as Network Service.

    Thank you for your help.

  • Adrian says:

    My whoami.txt reads %USERNAME% :-/

    I can successfully create a delete folders using the EXEC xp_cmdshell so I’m at a loss to understand why the cleanup job fails

  • Nathan says:

    Assigning full control permissions over the UNC share to the SQL Server / SQL Server Agent service account solved the problem for me as well.

  • Bruce James says:

    I had this same problem on a cluster. I verified the accounts and permissions were as they should be then restarted SQL Agent and that resolved it.

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>