Blog | Pythian

Using gh-ost for online schema changes

Written by Valerie Parham-Thompson | Oct 21, 2016 4:00:00 AM

The pt-online-schema-change tool has been a workhorse for years, allowing MySQL DBAs to alter tables with minimal impact to application workload, and before MySQL had native online alter capabilities. Although consistently grateful for the tool’s existence, I never liked the messiness of having to use and clean up triggers, and most DBAs have a horror story or two to tell about using any of the online alter methods.

When Github's online schema transmogrifer (gh-ost; https://github.com/github/gh-ost) was released, then, I jumped on the chance to test it out. Following are my testing notes.

Testing Methodology and Key Benefits

I tried out the flags available to run gh-ost from different locations in a replication setup, both functionally and under load. The load applied was via mysqlslap, using a combination of reads and writes, with multiple threads for concurrency.

This is to be considered minimal load testing, but it allowed me to observe gh-ost throttling itself, to see the wait time for cutover, and to use the operational controls. The main two benefits I saw in gh-ost were the avoidance of triggers and having operational controls to pause or cleanly stop the change process.

Getting Started: Requirements and Installation

The nicely verbose output is the best way to understand what the tool is doing; I've pasted a sample output below. I also cover the operational controls in a section below. Don't miss reading the helpful docs but here are some highlights:

  • Execution Location: You can run gh-ost from the master or a replica. The preferred way is from a replica, to avoid extra load on the master.
  • Replication Mode: Row-based replication (RBR) is required on the node you are running gh-ost from.
  • Primary Key: Any table being altered requires a primary key.
  • Concurrency: It is functionally possible to run multiple gh-ost changes at once.

Installation Steps

The installation process is simple:

  1. Visit the download page at https://github.com/github/gh-ost/releases/latest
  2. to get the link for the latest version.
  3. wget https://github.com/github/gh-ost/releases/download/v[VERSION]/gh-ost-binary-linux-[VERSION].tar.gz
  4. tar -xvf gh-ost-binary-linux-[VERSION].tar.gz

I tested on bare-bones (Centos7) servers and did not find any other software was required to be installed.

Database Configuration and Permissions

Just a few changes to make on the database:

1. Required Permissions

Master: grant ALTER, CREATE, DELETE, DROP, INDEX, INSERT, LOCK TABLES, SELECT, TRIGGER, UPDATE on *.* to ghost@'192.168.56%' identified by password 'xxx'; grant super, replication slave on *.* to ghost@'192.168.56%';

2. Binary Log Configuration

Add this setting on the node you will run gh-ost on. It's not explicitly needed if running from a replica because gh-ost will set it for you, but note this is required.

Slave: set global binlog_format=ROW;

You do need to set this if running on a master, otherwise you will receive an error message:

2016-09-01 16:07:29 FATAL 192.168.56.68:3306 has STATEMENT binlog_format, but I'm too scared to change it to ROW because it has replicas. Bailing out.

Understanding Internal Mechanics and File Structure

Notes about the files created

You'll see some tables are created during the process:

  • _t1_ghc: The record of changes (not data change capture).
  • _t1_gho: The "ghost" table that is an initial copy and then receives changes from binlogs.
  • _t1_del: The cutover table.

Unless you specify otherwise, _t1_del will remain after the process is complete as a backup.

Practical Setup Scenarios

Sample statements for various setups are below. A few brief notes on usage:

  • Drop the --execute flag in the below statements to do a dry run.
  • Using --exact-rowcount causes a select count(*) and doesn't help with accuracy. In a federated setup, this can crash the server. I'd avoid this flag unless tested well.

1. Running on a Replica (Primary Method)

This is the desired method to avoid load on the master. The tool will detect the master from a show slave status statement.

Bash
./gh-ost \  --max-load=Threads_running=25 \  --critical-load=Threads_running=1000 \  --chunk-size=1000 \  --throttle-control-replicas="192.168.56.144" \  --max-lag-millis=1500 \  --user="ghost" \  --password="ghost" \  --host=192.168.56.144 \  --database="mysqlslap" \  --table="t1" \  --verbose \  --alter="add column whatever3 varchar(50)" \  --switch-to-rbr \  --allow-master-master \  --cut-over=default \  --default-retries=120 \  --panic-flag-file=/tmp/ghost.panic.flag \  --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \  --initially-drop-ghost-table \  --execute  

2. Testing on a Replica

Verify the process without affecting the master or other replicas.

Bash
./gh-ost \  --test-on-replica \  --max-load=Threads_running=25 \  --critical-load=Threads_running=1000 \  --chunk-size=1000 \  --throttle-control-replicas="192.168.56.20" \  --max-lag-millis=1500 \  --user="ghost" \  --password="ghost" \  --host=192.168.56.20 \  --database="mysqlslap" \  --table="t1" \  --verbose \  --alter="add column charcol6 varchar(25)" \  --switch-to-rbr \  --cut-over=default \  --default-retries=120 \  --panic-flag-file=/tmp/ghost.panic.flag \  --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \  --execute 

3. Migrate on Replica Only

Used for specific migration strategies like the "shell game" method.

Bash
./gh-ost \  --migrate-on-replica \  --max-load=Threads_running=25 \  --critical-load=Threads_running=1000 \  --chunk-size=1000 \  --throttle-control-replicas="192.168.56.20" \  --max-lag-millis=1500 \  --user="ghost" \  --password="ghost" \  --host=192.168.56.20 \  --database="mysqlslap" \  --table="t1" \  --verbose \  --alter="add column charcol6 varchar(25)" \  --switch-to-rbr \  --cut-over=default \  --default-retries=120 \  --panic-flag-file=/tmp/ghost.panic.flag \  --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \  --execute 

4. Running from the Master

Bash
./gh-ost \  --allow-on-master \  --max-load=Threads_running=25 \  --critical-load=Threads_running=1000 \  --chunk-size=1000 \  --throttle-control-replicas="192.168.56.144" \  --max-lag-millis=1500 \  --user="ghost" \  --password="ghost" \  --host=192.168.56.145 \  --database="mysqlslap" \  --table="t1" \  --verbose \  --alter="add column whatever6 varchar(50)" \  --cut-over=default \  --default-retries=120 \  --switch-to-rbr \  --panic-flag-file=/tmp/ghost.panic.flag \  --postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \  --execute  

MySQL Database Consulting Services

Ready to optimize your MySQL Database for the future?