Troubleshooting Transactional Replication Latency Using Agent Statistics
Troubleshooting latency issues is a multi-step approach. The first step is to identify which agent is slow.
- Log Reader Agent (Publisher to Distributor)
- Distribution Agent (Distributor to Subscriber)
Note: Each thread will have their own buffer with 40k in size. Here, we need to look at the wait time to understand where the bottleneck exists.For example, if you notice that the wait time for the Reader thread is high, then it essentially means your Writer thread is slow since the Reader thread is waiting for Writer to release the buffer. Similarly, if you notice high wait times for your Writer thread then your Reader thread is slow. The easiest way to decode this is, HIGH wait time on Reader thread = Writer thread is slow ( thread which writes the commands to distribution database) HIGH Wait time on Writer thread = Reader thread is slow ( thread which scans the transaction log)– State = 1 means stats after batch commit
–Work = cumulative time spent by the agent since restart - idle time ( seconds)
–Idle = Time spent waiting to call sp_replcmds
–Reader fetch = Time to do execute sp_replcmds
Wait = Time spent waiting on writer to release buffer
–Writer write = Time spent writing commands into distribution database
Wait = Time spent waiting on reader to populate buffer
Distribution Agent Statistics
<stats state="1" work="154" idle="351464"> <reader fetch="144" wait="11"/> <writer write="12" wait="338"/> <sincelaststats elapsedtime="305" work="10" cmds="81262" cmdspersec="8041.000000"><reader fetch="0" wait="9"/><writer write="10" wait="0"/></sincelaststats></stats> – State =1 means stats after a batch commit – Work = cumulative time spend by the agent since restart - idle time – Idle = Time spend waiting to call sp_msget_repl_commands – Reader fetch = Time to do execute sp_msget_repl_commands Wait = Time spent waiting on writer to release buffer. – Writer write = Time spend writing commands into distribution database Wait = Time spent waiting on reader to populate buffer. Similar to log reader agent, the decoding of wait time is same way we did for log reader agent. HIGH wait time on Reader thread = Writer thread is slow ( thread which writes the subscriber database using batched RPC Calls) HIGH wait time on Writer thread = Reader thread is slow ( thread which takes the pending commands from Distribution database) Distributor Writer Thread Slow Scenario We would be able to understand this concept better by looking at sample statistics. In the example below, I explicitly started the transaction on Subscriber table to simulate blocking on the subscriber side, making the writer thread of the distribution agent wait and build up latency. This is how the stats looked.- <stats state="1" work="755" idle="354505">
- <reader fetch="153" wait="604"/>
- <writer write="613" wait="346"/>
- <sincelaststats elapsedtime="636" work="515" cmds="45033" cmdspersec="87.000000"><reader fetch="0" wait="515"/><writer write="515" wait="0"/></sincelaststats></stats>
On this page
Share this
Share this
More resources
Learn more about Pythian by reading the following blogs and articles.
Three Cs to secure against leaks like Capital One

Three Cs to secure against leaks like Capital One
Jul 30, 2019 12:00:00 AM
3
min read
Disable Lock Escalation in SQL Server
Disable Lock Escalation in SQL Server
Apr 10, 2015 12:00:00 AM
1
min read
An Open Letter to Larry Ellison on AWR and ASH Licensing
An Open Letter to Larry Ellison on AWR and ASH Licensing
Jun 29, 2007 12:00:00 AM
4
min read
Ready to unlock value from your data?
With Pythian, you can accomplish your data transformation goals and more.