Select Page

Life made easier: Facebook’s Online Schema Change for MySQL

Author: Patrick Galbraith | 10 min read | July 27, 2011

So often, you have a task where you need to perform an alteration of a huge table while not wanting to affect the operation of your website. With an alter table, MySQL will internally lock the table in question, create a temporary table with the new table definition, copy the date, then drop the current table, then rename the temporary table to the current table, then unlock tables– not acceptable for a running website!

One way of being able to alter a table without affecting the database (read website application) is using a replicated architecture, particularly having a dual master setup, making the change on the inactive master, then switching, lather, rinse repeat. What if you don’t have a replicated architecture but need some way of making this alteration? Facebook, a company who I’ve heard has a fair amount of data, would be a likely place to have problems and challenges that we all face as developers and database administrators — by a thousand-fold! They have a huge number of tables spread out over multitudes of servers and needed the ability to alter these tables online. Hence, the Online Schema Change for MySQL came about.

Description

What is the Online Schema Change for MySQL? It’s a tool Facebook wrote, a PHP library, that has methods and a simple API for being able to perform DDL changes to a database table without having any down-time or having to use replication to perform these alterations. This is done in several phases:

  • Copy Phase — copy data to the table with the new table definition
  • Capture changes — through the use of triggers and a “delta” table to record those those change
  • Replay changes — replay changes from the delta table to the new table
  • Cutover — lock the original table, delta table, new table, replay changes one more time, rename original table as an old table, new table to original table, unlock tables

These steps have to happen as atomically as possible and this tool has done a great job of ensuring this atomicity. For more intricate details about this tool, Mark Callaghan has a good post at: http://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932 What I wanted to achieve for this post is to share my experience using this tool. I decided a good test would be to run a perl script that inserts a lot of data into a table and alter the table using Online Schema Change tool while the insert tool runs.

Writing a PHP script to use OSC

First, of course, you need to get the PHP library: http://bazaar.launchpad.net/~mysqlatfacebook/mysqlatfacebook/tools/annotate/head:/osc/OnlineSchemaChange.php Download this PHP library. What I did was look at the source to see how to use it. The summary of how to use it can best be described in my alteration script I wrote, called interestingly alter.php. I will comment on each part of the script inline:

#!/usr/bin/env php

<?php

First of all, you want to import the library

require_once('./OnlineSchemaChange.php');

These flags set first of all the ability to run this tool on versions other than 5.1.47 (they’ve only tested with 5.1.47 and 5.0.84), which I was using Maria 5.3.0-MariaDB-alpha. I tried to also OR OSC_FLAGS_FORCE_CLEANUP, but either I’m doing something wrong or it’s not working properly, so I have deal with cleaning up the delta table afterward:

$flags = OSC_FLAGS_ACCEPT_VERSION; # | OSC_FLAGS_FORCE_CLEANUP;

/* you have to set this otherwise get warnings */
date_default_timezone_set('UTC');

Next, an OnlineSchemaChange object is instantiated. This is the object handle you will run your alterations with. Upon instantiation, you provide:

  • Database user
  • Database password
  • Schema name
  • Table you will be altering
  • Alteration statement (DDL statement)
  • Output folder. This is where data is selected into and from. Very importantly (!) you must make this readable and writeable by both MySQL and the user you are running the Online Schema Change tool for. What I did was created a directory, /tmp/outdir, and made the permissions 777.
  • Batchsize load — how many records to select at a time to the outfile.
  • amount of time a transaction runs before the tool quits
  • Log file directory — this is where the logs for the tool will be written to. I created this directory in /home/ec2-user/logs
      /* instantiate */
      $osc = new OnlineSchemaChange('/tmp/mysql.sock',
                              'username',
                              's3kr1t',
                              'test',
                              't1',
                              'alter table t1 add column col5 varchar(128)',
                              /* output folder - MySQL user must be able to
                              write to this directory! */
                              '/tmp/output/',
                              $flags,
                              500000, # batchsize load
                              500, # batchsize replay
                              30, # long xact time
                              '/home/ec2-user/logs/'# long logfile folder
                        );
       /* run execute */
       $osc->execute();
?>

Then you run execute(). This method runs some initial cleanup of the DDL statement as well as all the other methods, some including but not limited to (this code is not in the alter.php script but in OnlineSchemaChange.php)

      $this->init();
      $this->createCopyTable();
      $this->alterCopyTable();
      $this->createDeltasTable();
      $this->createTriggers();
      $this->startSnapshotXact();
      $this->selectTableIntoOutfile();
      $this->dropNCIndexes();
      $this->loadCopyTable();
      $this->replayChanges(false);
      $this->recreateNCIndexes();
      $this->replayChanges(false);
      $this->swapTables();
      $this->cleanup();

I created a table with one integer column and 4 varchar(128) columns:

CREATE TABLE `t1` (
  `id` int(11) NOT NULL DEFAULT '0',
  `col1` varchar(128) NOT NULL DEFAULT '',
  `col2` varchar(128) NOT NULL DEFAULT '',
  `col3` varchar(128) NOT NULL DEFAULT '',
  `col4` varchar(128) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

Testing OSC

The insert script is very simple, it simply inserts random characters into every column in a big loop, albeit I utilize micro-sleep to space it out a bit so the insert occurs every millisecond and ensures I have the tool take its time somewhat while testing the alterations:

for my $table ('t1') {
    my $insert = "insert into $table (id, col1, col2, col3, col4) values (?, ?, ?, ?, ?)";
    my $sth = $dbh->prepare($insert);

    for my $i ($row->[0] .. ($row->[0] + 1000000)) {
        my @chars = grep !/[0O1Iil]/, 0..9, 'A'..'Z', 'a'..'z';
        my $random_chars= join '', map { $chars[rand @chars] } 0 .. 512;

        my $col1 = substr($random_chars, 0, 127);
        my $col2 = substr($random_chars, 128, 255);
        my $col3 = substr($random_chars, 256, 384);
        my $col4 = substr($random_chars, 385, 512);

        $sth->execute($i, $col1, $col2, $col3, $col4);
        usleep(1000);
    }
}

Next, using screen, I started the insert script, letting it run for some time to accumulate data so the table is somewhat full of data, then I ran the tool:

[ec2-user@ip-10-200-xx-xx ~]$ ./alter.php

When completed after about 10 minutes, I checked to see if it did its job:

MariaDB [test]> show create table t1G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` int(11) NOT NULL DEFAULT '0',
  `col1` varchar(128) NOT NULL DEFAULT '',
  `col2` varchar(128) NOT NULL DEFAULT '',
  `col3` varchar(128) NOT NULL DEFAULT '',
  `col4` varchar(128) NOT NULL DEFAULT '',
   `col5` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Great! This is wonderful! However, I checked the other screen window:

DBD::mysql::st execute failed: Table 'test.t1' doesn't exist at ./insert.pl line 43.

Hmm, one statement failed. What does this mean exactly? I suspect the last phase is what causes this. I would have assumed the lock on all three tables during the renaming of the current to old, new to current would prevent this. My script doesn’t try to attempt to retry an SQL statement if it fails, but one caveat of using this tool is that whatever application is running against the table, that it might want to wait and retry the statement. I added debug code to my insert tool and verified what insert statement fails, and verified that the failed statement did not get executed:

DBD::mysql::st execute failed: Table 'test.t1' doesn't exist at ./insert.pl line 43.
ERROR: insert of 1050374, tubohRKzZ3JHAFASRCYogfTaB4MjbDTLXM9yHbbNPVtQarghJvcR2LFy4oTwcoAEb3qSvsLFJDnKtxJuDk9v6yKJzFX92LotuRKY3VLexL3NF56GacznFuLAp5xachu, pRNh6zzeQR8pzZaNUX2sBovAEnMf4gkZbH3sUXVmJx37Y7boJbBUpw4YS9C4dnVMULME7DkEhn5ss7wcmcmuhg2Eej6LDNWdqEbcskJdj4RZHm9T946uycw5YUJ2VotvPy9N3AwMVG7fwBHUmDCqnnBdPD2xGeP8SMfnEtPPBe4pW2hpr6HmYnrFN9YFAtKW2DJzbobvBnMBXEo7WG4DayaHbYSNJKymHBqcUkFVQNeszNYK8UYgRLH499gwRu3... failled

MariaDB [test]> select * from t1 where id = 1050374;
Empty set (0.00 sec)

A simple conceptual code snippet shows the essence of how an API would deal with this:

            my $try_again = 1;
            while ($try_again) {
            if ($sth->execute($i, $col1, $col2, $col3, $col4)) {
                $try_again = 0;
            }
            else {
                print "ERROR: " . $DBI::errstr . " tried to insert record id $in";
                usleep(2000);
            }
        }

This is very simplistic, but upon introducing this, the script was able to insert the data. Of course, there is much more complexity in a real-world application, and other errors could exist that cause a loop that attempts to “try again” to run endlessly. The next thing I wanted to try is another alter, this time to drop the column just created:

...
$osc = new OnlineSchemaChange('/tmp/mysql.sock',
                             ...
                              'alter table t1 drop column col5',
                             ...
                        );

, however, I had an error. The PHP update.php script would finish quickly and upon reviewing the log, I would see:

[ec2-user@ip-10-200-xx-xx ~]$ view logs/__osclog_t1.err

--OSC info: time=2011-07-27T14:59:34+00:00, db=test, table=t1, flags=400
--ALTER=ALTER TABLE __osc_new_t1 drop column col5
Exception: errno:-1054, error:-Loading copy table failed. SQL:-LOAD DATA INFILE '/tmp/output/__osc_tbl_t1.1' INTO TABLE __osc_new_t1(`id`, `col1`,`col2`,`col3`,`col4`,`col5`).Unknown column 'col5' in 'field list'

I’m not sure why this is the case. It seems to me that the insert statement is still assuming that the dropped column col5, exists. I’d be glad to know if this is operator error, if I’m missing something. Now to try adding an index with

'alter table t1 add index col1 (col1)'

as the alter statement, and sure enough:

MariaDB [test]> show create table t1G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` int(11) NOT NULL DEFAULT '0',
  `col1` varchar(128) NOT NULL DEFAULT '',
  `col2` varchar(128) NOT NULL DEFAULT '',
  `col3` varchar(128) NOT NULL DEFAULT '',
  `col4` varchar(128) NOT NULL DEFAULT '',
  `col5` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `col1` (`col1`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Worked in that case.

Summary

The Facebook Online Schema Change tool is excellent, although there are few things you need to know about it, which I’ve pointed out:

  • Make sure to create the directories it needs, particularly ‘output’ directory which needs to be read/writeable by MySQL and the user you run the alteration script you write as.
  • You do have to be aware that your application will need to be able to handle the cut-over phase for when the renaming of tables results in the table you expect to be there isn’t.
  • You will need to remove the old table
    drop table __osc_old_t1;
  • Alter table drop column has an issue and I could not get it to work.
  • When using this tool, you need to be cognizant that it utilizes triggers, which you would want to run these alterations outside of replication.
  • Make sure to add
    date_default_timezone_set('UTC');

    to your code or you will get warnings that annoy you to add it

Again, I might have made a mistake, but can’t find anything to point to a particular mistake. My assessment with this tool is that is it excellent if you need to make alterations and need to not have major down-time do do so. It would have made my life much easier if I had used it in a number of projects! Thank you Facebook Team!

Subscribe to Our Blog

Never miss a post! Stay up to date with the latest database, application and analytics tips and news. Delivered in a handy bi-weekly update straight to your inbox. You can unsubscribe at any time.