Is it possible to run OPTIMIZE TABLE without having replication lag/downtime?

I had a table with 100,000,000 records and 500GB of data. I have been backing up a lot of the older records into a backup DB and deleting them from main DB. However the disk space hasn’t reduced, and I noticed the data_free has grown a lot for that table.

My understanding is I need to run OPTIMIZE TABLE to reduce the disk size, however I have read this causes replication lag. I am using mysql 5.7 InnoDB.

So my question is, can I run OPTIMIZE TABLE without causing replication lag? For example running OPTIMIZE TABLE on master such as:

OPTIMIZE NO_WRITE_TO_BINLOG TABLE tblname;

Then run the same command on the slaves one by one. Would that work? Are there some risks in doing that? Or is there any other way?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

At my company we use Percona’s free tool pt-online-schema-change.

It doesn’t literally do an OPTIMIZE TABLE, but for InnoDB tables, any table-copy operation will accomplish the same result. That is, it makes a new InnoDB tablespace, copies all the rows to that tablespace, and rebuilds all the indexes for that table. The new tablespace will be a defragmented version of the original tablespace.

Any alter will work, you don’t have to change anything in the table. I use the no-op ALTER TABLE <name> FORCE.

The advantage of pt-online-schema-change is that while it’s working, you can continue to read and write the table. It only needs a brief metadata lock to create triggers as it starts, and another brief metadata lock at the end to swap the new table for the old.

If you use OPTIMIZE TABLE, this causes long replication lag, because it won’t start running on the replica until after it’s finished on the source.

Whereas with pt-online-schema-change, it starts running the table-copy immediately, and this continues along with other concurrent transactions, and when it’s done on the source, it’s only a moment until it’s also done on the replica.

It actually takes longer than OPTIMIZE TABLE, but since it doesn’t prevent you from using the table, that doesn’t matter as much.

Method 2

I ended up making the tests on my local by setting up a replication environment.

It seems possible to run OPTIMIZE TABLE tblname; without causing any downtime or replication lag.

You need to run OPTIMIZE NO_WRITE_TO_BINLOG TABLE tblname; on master, to avoid writing to the bin logs and replicating the query to the slaves.

Then you have to run OPTIMIZE TABLE tblname; individually in every slave.

Here is more detailed explanation of what happens: https://dev.mysql.com/doc/refman/5.7/en/optimize-table.html#optimize-table-innodb-details

It says:

an exclusive table lock is only taken briefly during the prepare phase
and the commit phase of the operation.

So there is almost no lock time.

There are edge cases to worry about that could cause downtime (due to table lock caused by copy method over online DDL), some of those are listed in the link above.

Another thing to consider is disk space. With InnoDB I observed it recreates the table. So if the contents of your table add up to 100GB, you would need at least an extra 100GB of free space to run the command successfully.

As Bill suggested it may be a safer alternative to use the pt-online-schema-change, however if you cant use it, with careful operation seems no replication lag and no downtime is possible.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x