How to restore MySQL data folder using persistent disk snapshots (GCP)

There are already many tools for MySQL to perform backups and consistent restores, but this time I’d like to elaborate on how to take advantage of the kindness of the cloud, which gives us Persistent Disk Snapshots (PDS) to restore our data set in a consistent way.
In this entry, I'll describe in detail how to restore a backup of our MySQL dataset on a Compute Engine VM instance using persistent disk snapshots, with code examples that can be used as part of an automation strategy. This method lets us switch between datasets in less than a minute, as opposed to the several hours required to perform a restore via xtrabackup, or even longer via mysqldump.
Considerations
I’ve been working on Google Cloud Platform (GCP) and chose a simple master/slave replication scenario: two VM instances running Percona Server 5.7, with xtrabackup installed on the slave.
Prior to restoring from a snapshot, we need a backup PDS. I won’t explain how to create one here and will assume a backup PDS already exists.
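For reference, a backup PDS can be taken from the slave's data disk with something along these lines (the disk name and zone below are placeholders, not taken from this setup; the snapshot name includes "bkp" so the lookup in the steps further down can find it):

```shell
# Placeholder disk name and zone; snapshot names carry a "bkp" prefix
gcloud compute disks snapshot my-slave-data-disk \
    --snapshot-names bkp-$(date +%Y%m%d-%H%M) \
    --zone us-central1-a
```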
Regarding permissions, I created a service account for this, included it as part of the Compute Instance Admin (v1) role, and gave it Service Account User permissions. You can find how to create this type of account here.
With the service account created, it’s time to add it to our gcloud tool in the VM instance with the gcloud auth activate-service-account command.
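For example (the account e-mail and key file path below are placeholders for your own service account):

```shell
# Placeholder service account e-mail and key file path
gcloud auth activate-service-account my-sa@my-project.iam.gserviceaccount.com \
    --key-file /path/to/key.json
```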
The idea
We want to restore a backup PDS from our collection and mount it in a folder on our VM. We’ll then stop MySQL, point it at that new data folder, and start it again. With this approach, the new data folder is available in a few seconds (less than a minute) and MySQL is back up and running with a stable, consistent set of data.
Steps
- Create folders if they don't exist
mkdir -p /data-a  # This will do nothing since /data-a already exists
mkdir -p /data-b
- Select the target folder to mount the restored snapshot
FILE_COUNT=$(ls /data-a | wc -l)
if [ ${FILE_COUNT} -eq 0 ]; then
    NEW_PART_NAME='data-a'
    NEW_PART_PATH='/dev/sdc1'
    OLD_PART_NAME='data-b'
    OLD_PART_PATH='/dev/sdb1'
else
    NEW_PART_NAME='data-b'
    NEW_PART_PATH='/dev/sdb1'
    OLD_PART_NAME='data-a'
    OLD_PART_PATH='/dev/sdc1'
fi
- Gather info for snapshot restoration to disk
VM_NAME=$(hostname)
SNAP_NAME=$(gcloud compute snapshots list --sort-by ~NAME | grep bkp | head -2 | tail -1 | awk '{print $1}')
SNAP_ZONE=$(gcloud compute snapshots list --sort-by ~NAME | grep bkp | head -2 | tail -1 | awk '{print $3}' | cut -d'/' -f 1)
NEW_DISK_NAME=${VM_NAME}-${NEW_PART_NAME}
OLD_DISK_NAME=${VM_NAME}-${OLD_PART_NAME}
- Restore snapshot to disk
gcloud compute disks create ${NEW_DISK_NAME} \
    --source-snapshot ${SNAP_NAME} \
    --zone ${SNAP_ZONE}
- Attach disk to the current VM
gcloud compute instances attach-disk ${VM_NAME} \
    --disk ${NEW_DISK_NAME} \
    --zone ${SNAP_ZONE}
- Mount the created disk
mount ${NEW_PART_PATH} /${NEW_PART_NAME}
- Ensure proper permissions in the new data folder
chown -R mysql:mysql /${NEW_PART_NAME}
- Stop MySQL
systemctl stop mysql
- Point the datadir variable in my.cnf to the new data folder
MY_CNF_FILE='/etc/my.cnf'  # Adjust to your configuration file path
REPLACEMENT="datadir=/${NEW_PART_NAME}/mysql"
sed -i "s|datadir=.*|${REPLACEMENT}|g" ${MY_CNF_FILE}
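After the replacement, the [mysqld] section should point at the freshly mounted folder; for example, assuming the snapshot was mounted at /data-b:

```
[mysqld]
datadir=/data-b/mysql
```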
- Start MySQL service back again
systemctl start mysql
- Edit fstab to make the partition changes permanent
sed -i "s|${OLD_PART_PATH}|${NEW_PART_PATH}|g" /etc/fstab
sed -i "s|${OLD_PART_NAME}|${NEW_PART_NAME}|g" /etc/fstab
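After both substitutions, the relevant /etc/fstab line should reference the new device and mount point; for example (filesystem type and options assumed, not taken from this setup):

```
/dev/sdb1  /data-b  ext4  defaults  0  0
```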
- Unmount the unused partition
sleep 5
umount -l ${OLD_PART_PATH}
- Detach disk
gcloud compute instances detach-disk ${VM_NAME} \
    --disk ${OLD_DISK_NAME} \
    --zone ${SNAP_ZONE}
- Remove disk
gcloud compute disks delete ${OLD_DISK_NAME} \
    --zone ${SNAP_ZONE} \
    --quiet
- Remove the data dir
rm -rf /${OLD_PART_NAME}
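Finally, it's worth verifying that MySQL actually came up on the restored dataset. A quick check, assuming a working local client login, could be:

```shell
# Confirm the service is running and that datadir points at the new folder
systemctl status mysql --no-pager
mysql -N -e "SELECT @@datadir"  # should print the new data folder, e.g. /data-b/mysql
```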
Conclusions
Working in the cloud brings many advantages and flexibility. It’s worth extending that flexibility to our restoration process, which gives us a huge cut in restore time when it comes to serious data sets.
In addition, creating and keeping a decent number of PDSs in our GCP collection instead of compressed backups seems friendlier for management (GCP has a nice GUI, in my personal opinion) and easier to maintain (you can remove them, sort them by date or size, you name it).