On April 4th, 2012, Percona released Xtrabackup 2.0 to GA along with a new streaming feature called xbstream. This new tool allowed for compression and parallelism of streaming backups when running xtrabackup or innobackupex, without having to stream using tar, then pipe to gzip or pigz, then pipe to netcat or socat to get the backup to the recipient server. This simplified the command structure a great deal, and xbstream quickly became the preferred way of streaming backups from an origin server to its destination.

In recent months we’ve had discussions internally as to whether xbstream would be a better way of streaming large amounts of data between servers for use cases outside of xtrabackup. And which is better, socat or netcat? So I decided to put this to the test.

To test this, I created two m5.xlarge EC2 instances, as this instance type provides an “up to 10 gigabit” level of network performance. I put both instances in the same availability zone to reduce the chance of poor networking skewing my results. I then installed Percona XtraDB Server 5.6 and Xtrabackup 2.4.9, and created a simple database with a data set size of 90GB.

For my first test I took a streaming backup of the entire data set using both the xbstream and tar streaming methods. Compression was not used, so that the streaming methods could be evaluated equally. Both socat and netcat were evaluated.
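The commands in this post show only the sending side of each transfer. For reference, here is a minimal sketch of what the receiving side on 172.31.55.250 might look like. The function names and the destination directory argument are hypothetical, and the listen syntax depends on which netcat variant is installed (some need -l -p 10001), so treat this as an assumption rather than the exact commands used in the test.

```shell
#!/bin/sh
# Receiver-side sketches: run ONE of these on the destination host
# before starting the matching sender command.
# PORT matches the sender commands in this post; the destination
# directory passed as $1 is a placeholder chosen by the caller.
PORT=10001

# Receive an xbstream stream via netcat and unpack it into directory $1.
recv_xbstream_nc() {
    nc -l "$PORT" | xbstream -x -C "$1"
}

# Same, but listening with socat (-u = one direction only: network to stdout).
recv_xbstream_socat() {
    socat -u "TCP-LISTEN:${PORT},reuseaddr" stdio | xbstream -x -C "$1"
}

# Receive a tar stream. The -i flag (ignore zero blocks) is needed to
# unpack archives produced by innobackupex --stream=tar.
recv_tar_socat() {
    socat -u "TCP-LISTEN:${PORT},reuseaddr" stdio | tar -xif - -C "$1"
}
```

For example, recv_xbstream_socat /data/restore would wait for a connection on port 10001 and unpack the incoming stream into /data/restore (a made-up path).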
[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=1 ./ | nc 172.31.55.250 10001
171228 15:11:13 innobackupex: Starting the backup operation
.....
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 15:25:22 completed OK!
real 14m9.385s
user 3m27.392s
sys 3m34.420s

[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=2 ./ | nc 172.31.55.250 10001
171228 15:38:50 innobackupex: Starting the backup operation
.....
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 15:50:42 completed OK!
real 11m51.915s
user 3m31.808s
sys 3m34.740s

[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=4 ./ | nc 172.31.55.250 10001
171228 15:38:50 innobackupex: Starting the backup operation
.....
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 16:07:28 completed OK!
real 11m51.923s
user 3m27.836s
sys 3m30.088s

[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=1 ./ | socat -u stdio TCP:172.31.55.250:10001
171228 16:13:51 innobackupex: Starting the backup operation
.......
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 16:26:55 completed OK!
real 13m3.911s
user 3m8.208s
sys 2m35.160s

[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=2 ./ | socat -u stdio TCP:172.31.55.250:10001
171228 16:28:16 innobackupex: Starting the backup operation
.....
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 16:40:08 completed OK!
real 11m51.984s
user 3m8.148s
sys 2m28.860s

[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=4 ./ | socat -u stdio TCP:172.31.55.250:10001
171228 16:44:54 innobackupex: Starting the backup operation
.......
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 16:56:46 completed OK!
real 11m51.916s
user 3m7.460s
sys 2m24.968s

[root@ip-172-31-54-219 ~]# time innobackupex --stream=tar --parallel=1 ./ | nc 172.31.55.250 10001
171228 17:02:26 innobackupex: Starting the backup operation
.......
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 17:16:09 completed OK!
real 13m42.910s
user 3m19.696s
sys 3m47.672s

[root@ip-172-31-54-219 ~]# time innobackupex --stream=tar --parallel=1 ./ | socat -u stdio TCP:172.31.55.250:10001
171228 17:19:59 innobackupex: Starting the backup operation
......
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 17:33:03 completed OK!
real 13m3.940s
user 2m59.468s
sys 2m29.388s
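Here is a summary of the output noted above, in seconds (real time, rounded to the nearest second).

xbstream + netcat: 849 (1 thread), 712 (2 threads), 712 (4 threads)
xbstream + socat: 784 (1 thread), 712 (2 threads), 712 (4 threads)
tar + netcat: 823
tar + socat: 784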
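The figures in seconds come straight from the "real" lines reported by time. A minimal sketch of that conversion is below. The helper name to_seconds is hypothetical, and it assumes the XmY.YYYs format seen in the transcripts (no hours component).

```shell
#!/bin/sh
# Convert a "real" value as printed by the shell's time keyword
# (for example 14m9.385s) into seconds.
# to_seconds is a hypothetical helper name, not part of any tool used here.
to_seconds() {
    # Split on the letters "m" and "s": field 1 is minutes, field 2 is seconds.
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

to_seconds 14m9.385s     # xbstream + netcat, 1 thread
to_seconds 11m51.915s    # xbstream + netcat, 2 threads
```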
You’ll notice that the xbstream method outperformed the tar method once we started introducing parallel threads. You may also note that performance gains ended after 2 threads were in use; this is likely because we hit a networking bottleneck. Another interesting thing to note is that with a single thread socat outperformed netcat, but with multiple threads they were about equal.

So what does this mean for moving data outside of xtrabackup / innobackupex? For my next test I decided to focus on just the large data files that I created in the test schema directory, the main reason being that xbstream can handle files but not directories, and cannot act recursively. First I used xbstream and then tried again using tar. Again, compression was not used so we could look at just the streaming method. Both netcat and socat were evaluated.
[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 1 ./t* | nc 172.31.55.250 10001
real 12m25.439s
user 0m20.928s
sys 3m43.492s

[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 2 ./t* | nc 172.31.55.250 10001
real 12m28.086s
user 0m22.996s
sys 3m50.972s

[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 4 ./t* | nc 172.31.55.250 10001
real 13m15.775s
user 0m21.460s
sys 3m50.336s

[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 1 ./t* | socat -u stdio TCP:172.31.55.250:10001
real 11m47.781s
user 0m17.132s
sys 2m38.168s

[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 2 ./t* | socat -u stdio TCP:172.31.55.250:10001
real 11m47.707s
user 0m15.816s
sys 2m22.884s

[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 4 ./t* | socat -u stdio TCP:172.31.55.250:10001
real 11m47.805s
user 0m16.796s
sys 2m36.588s

[root@ip-172-31-54-219 streamtest]# time tar -cf - ./t* | nc 172.31.55.250 10001
real 11m47.942s
user 0m5.260s
sys 2m32.048s

[root@ip-172-31-54-219 streamtest]# time tar -cf - ./t* | socat -u stdio TCP:172.31.55.250:10001
real 11m47.914s
user 0m4.860s
sys 1m37.632s
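Here is a summary of the output noted above, in seconds (real time, rounded to the nearest second).

xbstream + netcat: 745 (1 thread), 748 (2 threads), 796 (4 threads)
xbstream + socat: 708 (1 thread), 708 (2 threads), 708 (4 threads)
tar + netcat: 708
tar + socat: 708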
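The earlier observation that gains stopped after 2 threads can be sanity-checked with some rough arithmetic on the first test. The sketch below assumes the backup stream was about the size of the 90 GB data set (reading the stated size as gigabytes, with 1 GB = 10^9 bytes), which is an approximation and not a measured figure. The helper name throughput_mbit is hypothetical.

```shell
#!/bin/sh
# Rough throughput in megabits per second for a transfer of
# $1 bytes that took $2 seconds.
# throughput_mbit is a hypothetical helper name.
throughput_mbit() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.0f\n", bytes * 8 / secs / 1000000 }'
}

# Assumption: the stream is about the size of the 90 GB data set,
# counting 1 GB as 10^9 bytes.
DATA_BYTES=90000000000

throughput_mbit "$DATA_BYTES" 849.385    # first test, xbstream + netcat, 1 thread
throughput_mbit "$DATA_BYTES" 711.915    # first test, xbstream + netcat, 2 threads
```

Under those assumptions the multi-threaded runs work out to roughly 1 Gbit/s, well short of the “up to 10 gigabit” figure, which would be consistent with a ceiling somewhere other than the streaming tool itself.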
When working with xtrabackup / innobackupex, it looks like xbstream and socat is the way to go. If you’re streaming backups and are not taking advantage of multiple threads, you should consider it.

For large data copies from one server to another, it looks like you’re safe using xbstream or tar, so long as the combination of xbstream and netcat is avoided. Considering that xbstream will not work with directories or act recursively, it may just be easier to stick with tar.