Minimal Networking Knowledge Recommended for DBAs
In a recent post, I summarized what I consider to be the minimal topics one should be familiar with in order to do technical work with databases. One thing I left out, apart from its implicit inclusion in "Hardware Basics" and "Operating Systems Basics", is networking. In this post, I'll go through what I consider the minimal networking knowledge you need when working with databases.
Protocol concepts
While there are some niche network protocols in use (in my experience, when working with mainframes, for example), database practitioners may well spend their whole lives dealing mostly or only with TCP/IP, so that's where I suggest focusing your learning efforts. A great guide to the basics is "TCP/IP Illustrated, Vol 1: The Protocols". While a bit old, the fundamentals are still relevant, and the book is very clearly written and packed with tcpdump captures supporting the explanations. A good way to learn from it, and to find out what has changed since it was written, is to go to the RFC index and look up the RFCs the book mentions for the protocol you're studying. If more recent RFCs are available, they'll be mentioned in the index with an "obsoleted-by" note.
This is a vast field, and not all of it is of immediate relevance to a database practitioner, so I'm providing the following list of what I consider the core networking knowledge you'll need to work with (and primarily to troubleshoot) databases:
- TCP's three-way handshake
- TCP connection termination
- The MSL and its relationship with connection termination and establishment
- Slow start
- Flow Control
- How ARP works
- How DNS works
- The MTU and Path MTU
- Routing, with emphasis on routing errors and their meaning (for example, what's the difference between no route to host and a timeout?)
- Basics of UDP and what it does not provide when compared with TCP (if you're troubleshooting something that uses it, like Galera)
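To make the routing-errors item above more concrete, here is a minimal Python sketch that attempts a TCP connection and reports which kind of failure came back. The function name `classify_connect` and the returned labels are my own, chosen for illustration, and the comments describe the usual causes rather than the only possible ones.

```python
import errno
import socket


def classify_connect(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and describe the outcome in plain words.

    Hypothetical helper for illustration; the labels are not standard names.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            # The three-way handshake completed.
            return "connected"
    except socket.gaierror:
        # The name could not be resolved, so no packet was sent to the target.
        return "name resolution failed"
    except ConnectionRefusedError:
        # Something answered the SYN with a reset: the host is typically
        # reachable, but nothing is listening on that port (or a firewall
        # actively rejects the connection).
        return "connection refused"
    except socket.timeout:
        # No answer arrived in time: packets are often being dropped silently
        # somewhere along the path, or the host is down.
        return "timed out"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            # The local routing table, a failed ARP lookup, or an ICMP
            # "unreachable" message from a router says there is no path.
            return "no route to host"
        return f"other error: {exc}"
```

On a machine where nothing listens on port 3306, `classify_connect("127.0.0.1", 3306)` would normally return "connection refused", which already tells you the host itself is reachable.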
Networking tools
I'll mention tools available on GNU/Linux here because that's the most popular OS by far for the databases I currently work with, but once you know the protocol basics, you're just one quick Internet search away from discovering which tool you must use on another OS (for example, entering "How to find out my IP in Windows" in Google gets me instructions on how to use ipconfig on the very first result returned). If you're not on GNU/Linux but you are working on another OS that's part of the Unix family tree, there are two useful resources that can help you:
You'll need to become familiar with a few commands to find out which IP address the host you're working on has, how to change it or add/remove an alias, and how to modify the routing table or the firewall rules. For this, I recommend reading the man pages for the following commands:
- ip (you may find a lot of older resources online mentioning ifconfig, which is what I learned to use, but ip is what should be used now)
- netstat or ss
- You'll read how the three-way handshake works.
- You'll want to go and establish a TCP connection while capturing traffic, then analyze the traffic and see if it matches what the protocol says should happen.
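The last step above, establishing a TCP connection while capturing traffic, can be sketched in Python as follows. This is only one way to generate the traffic, under the assumption that both ends run on the loopback interface. The function name `run_local_exchange` is hypothetical, and the tcpdump invocation in the comment assumes a Linux host where the loopback interface is called `lo`.

```python
import socket
import threading
from typing import Tuple


def run_local_exchange(payload: bytes = b"ping", port: int = 0) -> Tuple[int, bytes]:
    """Open a TCP connection over loopback, echo one message, and close it.

    With a fixed port (say 5000), a capture can be started beforehand in
    another terminal, for example:  tcpdump -i lo -n tcp port 5000
    Passing port=0 lets the OS pick any free port instead.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", port))
    server.listen(1)
    actual_port = server.getsockname()[1]

    def serve_once() -> None:
        # accept() returns once the three-way handshake has completed.
        conn, _ = server.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(data)
        # Leaving the "with" block closes the server side of the connection.

    worker = threading.Thread(target=serve_once)
    worker.start()

    # create_connection() triggers the handshake from the client side.
    with socket.create_connection(("127.0.0.1", actual_port), timeout=5) as client:
        client.sendall(payload)
        reply = client.recv(1024)
    # Leaving the "with" block closes the client side of the connection.

    worker.join()
    server.close()
    return actual_port, reply
```

Running `run_local_exchange(port=5000)` while the capture is active should show the handshake segments, the two data segments, and the connection termination, which you can then compare with what the protocol description says should happen.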