Technical insights into CockroachDB
I recently had the pleasure of discussing CockroachDB on the Datascape Podcast with Chris Presley, and I wanted to supplement that episode with a bit more technical information about this database.
A deeper look at consensus
CockroachDB uses the Raft consensus algorithm to guarantee data consistency (as long as system clocks are synchronized with NTP and clock offset is bounded), but it does not handle the whole data set as a single Raft group. Instead, it uses Multi-Raft, with one Raft group per range (horizontal scaling is achieved by splitting data into ranges). Each group designates one node as the Leaseholder, which is the node that accepts writes and can serve reads. To optimize reads, Raft is bypassed for them. Put simply, this works because the Leaseholder is the only node in a group accepting writes, so it has the most up-to-date data. There are excellent online resources for learning more about consensus and consistency guarantees in CockroachDB. Here are some starting points:
- Jepsen's test results, both in Kyle Kingsbury's post describing them and in Cockroach Labs' blog.
- How CockroachDB does distributed, atomic transactions.
- Living without atomic clocks.
- Consensus, made thrive.
- The Secret Life of Data's interactive explanation of Raft.
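
To make the range and Leaseholder idea described above more concrete, here is a minimal toy sketch in Python. It is not CockroachDB code: the `Range` and `Cluster` classes, the node names, and the split point are all hypothetical, and "consensus" is reduced to appending to an in-memory log. It only illustrates the routing idea: a key maps to exactly one range, writes for that range go through its own log, and reads are answered directly by the Leaseholder.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Range:
    """Hypothetical model of one range and its own (toy) Raft group."""
    start_key: str        # inclusive lower bound of the key span
    replicas: list        # names of the nodes holding a copy
    leaseholder: str      # replica that accepts writes and serves reads
    data: dict = field(default_factory=dict)
    raft_log: list = field(default_factory=list)

    def write(self, key, value):
        # Assumption: replication is modelled as a plain log append.
        # A real Raft group would wait for a quorum of replicas.
        self.raft_log.append((key, value))
        self.data[key] = value
        return self.leaseholder

    def read(self, key):
        # Reads skip the log entirely and are served by the Leaseholder.
        return self.leaseholder, self.data.get(key)


class Cluster:
    """Routes each key to the range whose span contains it."""

    def __init__(self, ranges):
        self.ranges = sorted(ranges, key=lambda r: r.start_key)
        self.starts = [r.start_key for r in self.ranges]

    def range_for(self, key):
        # The last range whose start_key is <= key owns the key.
        return self.ranges[bisect_right(self.starts, key) - 1]

    def put(self, key, value):
        return self.range_for(key).write(key, value)

    def get(self, key):
        return self.range_for(key).read(key)


if __name__ == "__main__":
    cluster = Cluster([
        Range("", ["n1", "n2", "n3"], leaseholder="n1"),
        Range("m", ["n2", "n3", "n4"], leaseholder="n3"),
    ])
    cluster.put("apple", 1)
    cluster.put("zebra", 2)
    print(cluster.get("apple"))
    print(cluster.get("zebra"))
```

In this sketch the two ranges keep independent logs, so a write to one range never touches the other. That independence is the property that lets the key space be split further as the data grows.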