Cassandra backups using nodetool
Cassandra nodetool provides several types of commands to manage your Cassandra cluster. See my previous posts for an orientation to Cassandra nodetool and using nodetool to get Cassandra information. My colleague has provided an in-depth analysis of backup strategies in Cassandra that you can review to learn more about ways to minimize storage cost and time-to-recovery, and to maximize performance. Below I will cover the nodetool commands used in scripting these best practices for managing Cassandra full and incremental backups.
Snapshots
The basic way to back up Cassandra is to take a snapshot. Since sstables are immutable, and since the snapshot command flushes data from memory before taking the snapshot, this will provide a complete backup. Use nodetool snapshot to take a snapshot of sstables. You can specify a particular keyspace as an optional argument to the command, like nodetool snapshot keyspace1. This will produce a snapshot for each table in the keyspace, as shown in this sample output from nodetool listsnapshots:
Snapshot Details:
Snapshot name  Keyspace name  Column family name  True size  Size on disk
1528233451291  keyspace1      standard1           1.81 MiB   1.81 MiB
1528233451291  keyspace1      counter1            0 bytes    864 bytes
The first column is the snapshot name, to refer to the snapshot in other nodetool backup commands. You can also specify tables in the snapshot command. The output at the end of the list of snapshots (for example, Total TrueDiskSpaceUsed: 5.42 MiB) shows, as the name suggests, the actual size of the snapshot files, as calculated using the walkFileTree Java method. Verify this by adding up the files within each snapshots directory under your data directory keyspace/tablename (e.g., du -sh /var/lib/cassandra/data/keyspace1/standard1*/snapshots).
To make the snapshots more human readable, you can tag them. Running nodetool snapshot -t 2018June05b_DC1C1_keyspace1 keyspace1 results in a more obvious snapshot name as shown in this output from nodetool listsnapshots:
2018June05b_DC1C1_keyspace1  keyspace1  standard1  1.81 MiB  1.81 MiB
2018June05b_DC1C1_keyspace1  keyspace1  counter1   0 bytes   864 bytes
However, if you try to use a snapshot name that exists, you'll get an ugly error:
error: Snapshot 2018June05b_DC1C1_keyspace1 already exists.
-- StackTrace --
java.io.IOException: Snapshot 2018June05b_DC1C1_keyspace1 already exists.
...
The default snapshot name is already a timestamp (number of milliseconds since the Unix epoch), but it's a little hard to read. You could get the best of both worlds by doing something like (depending on your operating system): nodetool snapshot -t keyspace1_$(date +"%s") keyspace1. I like how the results of listsnapshots sort that way, too. In any case, with inevitable snapshot automation, the human-readable factor becomes largely irrelevant.
You may also see snapshots in this listing that you didn't take explicitly. By default, auto_snapshot is turned on in the cassandra.yaml configuration file, causing a snapshot to be taken anytime a table is truncated or dropped. This is an important safety feature, and it's recommended that you leave it enabled. Here's an example of a snapshot created when a table is truncated:
cqlsh> truncate keyspace1.standard1;
root@DC1C1:/# nodetool listsnapshots
Snapshot Details:
Snapshot name                      Keyspace name  Column family name  True size  Size on disk
truncated-1528291995840-standard1  keyspace1      standard1           3.57 MiB   3.57 MiB
To save disk space (or cost), you will eventually want to delete snapshots. Use nodetool clearsnapshot with the -t flag and the snapshot name (recommended, to avoid deleting all snapshots). Specifying -- and the keyspace name will additionally filter the deletion to the keyspace specified. For example, nodetool clearsnapshot -t 1528233451291 -- keyspace1 will remove just the two snapshots with that name listed earlier, as reported in this sample output:
Requested clearing snapshot(s) for [keyspace1] with snapshot name
Note that if you forget the -t flag or the -- you will get undesired results. Without the -t flag, the command will not read the snapshot name, and without the -- delimiter, you will end up deleting all snapshots for the keyspace. Check syntax carefully.
The sstables are not tied to any particular instance of Cassandra or server, so you can pass them around as needed. (For example, you may need to populate a test server.) If you put an sstable in your data directory and run nodetool refresh, it will load into Cassandra. Here's a simple demonstration:
cqlsh> truncate keyspace1.standard1;
cp /var/lib/cassandra/data/keyspace1/standard1-60a1a450690111e8823fa55ed562cd82/snapshots/keyspace1_1528236376/* /var/lib/cassandra/data/keyspace1/standard1-60a1a450690111e8823fa55ed562cd82/
cqlsh> select * from keyspace1.standard1 limit 1;
 key | C0 | C1 | C2 | C3 | C4
-----+----+----+----+----+----
(0 rows)
nodetool refresh keyspace1 standard1
cqlsh> select count(*) from keyspace1.standard1;
 count
 7425
This simple command has obvious implications for your backup and restore automation.
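The snapshot, restore, and clear steps covered in this post could be scripted roughly as follows. This is a minimal sketch, not a production tool: the function names and the NODETOOL and DATA_DIR variables are hypothetical, it assumes the default data directory used in this post, and it relies only on the nodetool subcommands and flags shown above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, NODETOOL, and DATA_DIR are assumptions for illustration.

NODETOOL="${NODETOOL:-nodetool}"                  # set NODETOOL=echo for a dry run
DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"   # data directory used in this post

# Build a sortable tag such as keyspace1_1528236376 (keyspace + epoch seconds).
snapshot_tag() {
  local keyspace="$1"
  printf '%s_%s\n' "$keyspace" "$(date +%s)"
}

# Take a tagged snapshot of one keyspace; print the tag on stdout for later use.
take_snapshot() {
  local keyspace="$1" tag
  tag="$(snapshot_tag "$keyspace")"
  "$NODETOOL" snapshot -t "$tag" "$keyspace" >&2 && printf '%s\n' "$tag"
}

# Copy a snapshot's files back into the table directory, then load them.
restore_snapshot() {
  local keyspace="$1" table="$2" tag="$3" table_dir snap_dir
  # Table directories are named <table>-<id>; use the one holding the snapshot.
  for table_dir in "$DATA_DIR/$keyspace/$table"-*/; do
    snap_dir="${table_dir}snapshots/$tag"
    if [ -d "$snap_dir" ]; then
      cp "$snap_dir"/* "$table_dir" || return 1
      "$NODETOOL" refresh "$keyspace" "$table"
      return
    fi
  done
  echo "snapshot $tag not found for $keyspace.$table" >&2
  return 1
}

# Remove one named snapshot for one keyspace (both -t and -- are required).
clear_snapshot() {
  local keyspace="$1" tag="$2"
  "$NODETOOL" clearsnapshot -t "$tag" -- "$keyspace"
}
```

For example, tag=$(take_snapshot keyspace1) followed later by restore_snapshot keyspace1 standard1 "$tag" would mirror the manual steps above. Setting NODETOOL=echo prints the nodetool commands instead of running them, which is a cheap way to check the -t and -- syntax before touching real snapshots.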