Backup and data streaming with xbstream, tar, socat, and netcat

On April 4th, 2012, Percona released Xtrabackup 2.0 into GA along with a new streaming feature called xbstream. This new tool allowed for compression and parallelism of streaming backups when running xtrabackup or innobackupex, without having to stream using tar, then pipe to gzip or pigz, then pipe to netcat or socat to send the backup to the recipient server. This simplified the command structure a great deal, and xbstream fast became the preferred way of streaming backups from an origin server to its destination.

In recent months we’ve had discussions internally as to whether xbstream would be a better way of streaming large amounts of data between servers for use cases outside of xtrabackup. And which is better, socat or netcat? So I decided to put this to the test.

To test this I created two m5.xlarge EC2 instances, as this instance type provides an “up to 10 gigabit” level of network performance. I also put both instances in the same availability zone to reduce the chance of poor networking skewing my results. Once this was done I installed Percona XtraDB Server 5.6 and Xtrabackup 2.4.9, and created a simple database with a data set size of 90 GB.

For my first test I ran a streaming backup of the entire data set using both the xbstream and tar streaming methods. Compression was not used, so as to evaluate the streaming methods equally. Both socat and netcat were evaluated.
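The commands in this post show only the sending side. As a rough sketch of what the recipient server might run, the functions below listen with socat and unpack the incoming stream. The function names, the port argument, and the target directory are hypothetical and not taken from the tests; the sketch also assumes socat, xbstream, and GNU tar are installed on the receiver.

```shell
# Hypothetical receiver-side helpers (names are illustrative, not from the tests).

# Listen on TCP port $1 and unpack an xbstream stream into directory $2.
recv_xbstream() {
  mkdir -p "$2" &&
    socat -u "TCP-LISTEN:$1,reuseaddr" stdio | xbstream -x -C "$2"
}

# Listen on TCP port $1 and unpack a tar stream into directory $2.
# The -i flag is needed for tar streams produced by innobackupex --stream=tar.
recv_tar() {
  mkdir -p "$2" &&
    socat -u "TCP-LISTEN:$1,reuseaddr" stdio | tar -xif - -C "$2"
}
```

The receiver would be started first, for example `recv_xbstream 10001 /data/restore` (an assumed path), and the sender command would then be run against that port.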
In the streaming backup results, the xbstream method outperformed the tar method once parallel threads were introduced. Performance gains ended after 2 threads were in use, likely because we hit a networking bottleneck. Another interesting thing to note is that with a single thread socat outperformed netcat, but with multiple threads they were about equal.

So what does this mean for moving data outside of xtrabackup / innobackupex? For my second test I focused on just the large data files that I created in the test schema directory, the main reason being that xbstream can handle files but not directories, and cannot act recursively. First I used xbstream and then tried again using tar. Again, compression was not used so we could look at just the streaming method. Both netcat and socat were evaluated.
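The networking-bottleneck guess can be sanity-checked with a little arithmetic. The sketch below assumes the data set is exactly 90 GB (decimal gigabytes) and that the whole set crossed the wire in the 711.915 seconds reported for the --parallel=2 netcat run; both are approximations, and the function name is hypothetical.

```shell
# Rough throughput estimate: $1 = data size in GB (decimal), $2 = elapsed seconds.
# Assumes the full data size was transferred in the elapsed time.
throughput() {
  awk -v gb="$1" -v s="$2" \
    'BEGIN { printf "%.1f MB/s, %.2f Gbit/s\n", gb * 1000 / s, gb * 8 / s }'
}

throughput 90 711.915
```

Under those assumptions the plateau works out to roughly 1 Gbit/s, which is well below the “up to 10 gigabit” ceiling. That would be consistent with a sustained network cap rather than a limit in the streaming tools, though the exact figure depends on the true data size.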
XBSTREAM / NETCAT TESTS
[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=1 ./ | nc 172.31.55.250 10001
171228 15:11:13 innobackupex: Starting the backup operation
.....
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 15:25:22 completed OK!
real 14m9.385s
user 3m27.392s
sys 3m34.420s
[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=2 ./ | nc 172.31.55.250 10001
171228 15:38:50 innobackupex: Starting the backup operation
.....
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 15:50:42 completed OK!
real 11m51.915s
user 3m31.808s
sys 3m34.740s
[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=4 ./ | nc 172.31.55.250 10001
171228 15:38:50 innobackupex: Starting the backup operation
.....
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 16:07:28 completed OK!
real 11m51.923s
user 3m27.836s
sys 3m30.088s
XBSTREAM / SOCAT TESTS
[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=1 ./ | socat -u stdio TCP:172.31.55.250:10001
171228 16:13:51 innobackupex: Starting the backup operation
.......
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 16:26:55 completed OK!
real 13m3.911s
user 3m8.208s
sys 2m35.160s
[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=2 ./ | socat -u stdio TCP:172.31.55.250:10001
171228 16:28:16 innobackupex: Starting the backup operation
.....
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 16:40:08 completed OK!
real 11m51.984s
user 3m8.148s
sys 2m28.860s
[root@ip-172-31-54-219 ~]# time innobackupex --stream=xbstream --parallel=4 ./ | socat -u stdio TCP:172.31.55.250:10001
171228 16:44:54 innobackupex: Starting the backup operation
.......
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 16:56:46 completed OK!
real 11m51.916s
user 3m7.460s
sys 2m24.968s
TAR / NETCAT TEST
[root@ip-172-31-54-219 ~]# time innobackupex --stream=tar --parallel=1 ./ | nc 172.31.55.250 10001
171228 17:02:26 innobackupex: Starting the backup operation
.......
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 17:16:09 completed OK!
real 13m42.910s
user 3m19.696s
sys 3m47.672s
TAR / SOCAT TEST
[root@ip-172-31-54-219 ~]# time innobackupex --stream=tar --parallel=1 ./ | socat -u stdio TCP:172.31.55.250:10001
171228 17:19:59 innobackupex: Starting the backup operation
......
xtrabackup: Transaction log of lsn (119373249297) to (119373249297) was copied.
171228 17:33:03 completed OK!
real 13m3.940s
user 2m59.468s
sys 2m29.388s
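Here is a summary of the real (wall-clock) times from the output above, in seconds.

Stream method / transport   Threads   Real time (s)
xbstream / netcat           1         849.385
xbstream / netcat           2         711.915
xbstream / netcat           4         711.923
xbstream / socat            1         783.911
xbstream / socat            2         711.984
xbstream / socat            4         711.916
tar / netcat                1         822.910
tar / socat                 1         783.940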
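For reference, the conversion from the `real` values printed by `time` to seconds is just minutes times 60 plus seconds. A minimal sketch, with a hypothetical helper name, assuming input in the `NmS.SSSs` form shown in the transcripts:

```shell
# Convert a `time` value such as "14m9.385s" into seconds.
to_seconds() {
  # Split on "m" and "s": field 1 is the minutes, field 2 is the seconds.
  printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

to_seconds 14m9.385s
to_seconds 11m51.915s
```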

XBSTREAM / NETCAT TESTS
[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 1 ./t* | nc 172.31.55.250 10001
real 12m25.439s
user 0m20.928s
sys 3m43.492s
[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 2 ./t* | nc 172.31.55.250 10001
real 12m28.086s
user 0m22.996s
sys 3m50.972s
[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 4 ./t* | nc 172.31.55.250 10001
real 13m15.775s
user 0m21.460s
sys 3m50.336s
XBSTREAM / SOCAT TESTS
[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 1 ./t* | socat -u stdio TCP:172.31.55.250:10001
real 11m47.781s
user 0m17.132s
sys 2m38.168s
[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 2 ./t* | socat -u stdio TCP:172.31.55.250:10001
real 11m47.707s
user 0m15.816s
sys 2m22.884s
[root@ip-172-31-54-219 streamtest]# time xbstream -c -p 4 ./t* | socat -u stdio TCP:172.31.55.250:10001
real 11m47.805s
user 0m16.796s
sys 2m36.588s
TAR / NETCAT TEST
[root@ip-172-31-54-219 streamtest]# time tar -cf - ./t* | nc 172.31.55.250 10001
real 11m47.942s
user 0m5.260s
sys 2m32.048s
TAR / SOCAT TEST
[root@ip-172-31-54-219 streamtest]# time tar -cf - ./t* | socat -u stdio TCP:172.31.55.250:10001
real 11m47.914s
user 0m4.860s
sys 1m37.632s
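Here is a summary of the real (wall-clock) times from the output above, in seconds.

Stream method / transport   Threads   Real time (s)
xbstream / netcat           1         745.439
xbstream / netcat           2         748.086
xbstream / netcat           4         795.775
xbstream / socat            1         707.781
xbstream / socat            2         707.707
xbstream / socat            4         707.805
tar / netcat                1         707.942
tar / socat                 1         707.914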