Can't kill in progress DDL statement?

1,209 views
Skip to first unread message

Doug Barth

unread,
Aug 12, 2013, 4:42:01 PM8/12/13
to codersh...@googlegroups.com
Hi,

I'm running Percona XtraDB Cluster, and in my testing, I'm finding that I'm unable to kill a connection if a DDL statement (in my case, adding a column to a large table) is in progress. Here's the versions of the server I'm running.

ii  percona-xtradb-cluster-client-5.5 5.5.31-23.7.5-438.lucid                         Percona Server database client binaries
ii  percona-xtradb-cluster-common-5.5 5.5.31-23.7.5-438.lucid                         Percona Server database common files (e.g. /etc/mysql/my.cnf)
ii  percona-xtradb-cluster-galera-2.x 152.lucid                                       Galera components of Percona XtraDB Cluster
ii  percona-xtradb-cluster-server-5.5 5.5.31-23.7.5-438.lucid                         Percona Server database server binaries

If I try to kill the connection using "KILL <thread id>" from another MySQL connection, I get the following error: ERROR 1095 (HY000): You are not owner of thread 757080. If I try to cancel the query from within the same connection (ie. pressing ^C in the MySQL client console), I get the prompt back but the query continues in the background.

Is this expected behavior? Since I'm also seeing long running DDL statements locking the cluster to writes, what is the recommended way to recover if someone accidentally starts a DDL statement and we can't wait for it to finish?

Alex Yurchenko

unread,
Aug 20, 2013, 7:42:38 AM8/20/13
to codersh...@googlegroups.com
I would assume it is. Just think of all the other nodes which are
executing the same DDL: you kill it on one node, but the rest go on - a
great way to get database inconsistency - and it is not trivial to
recover from that. (no, just killing that statement individually on each
node won't work, it gotta be a synchronous operation)

> Since I'm also seeing long running DDL
> statements locking the cluster to writes, what is the recommended way
> to
> recover if someone accidentally starts a DDL statement and we can't
> wait
> for it to finish?

Good question. At the moment I can't think of anything. In my opinion it
would be easier (and more productive) to support concurrent DDLs rather
than killing them.

Michael Taggart

unread,
Aug 26, 2013, 6:19:52 PM8/26/13
to codersh...@googlegroups.com
Hi Doug,

DDL statements are replicated quite nicely in Galera/XtraDB Cluster (I currently run XtraDB Cluster). However, the one you have to be the most careful with is an ALTER. That is because many times ALTERs take quite a bit of time and in Galera they become a blocking process for the whole cluster as every node has to execute them at the exact same time.

The correct way to handle any table ALTER is to use pt-online-schema-change which is available from the percona toolkit. (Ubuntu: apt-get install percona-toolkit). Here's how you would run adding a column with it:

pt-online-schema-change --alter "ADD COLUMN foo INT(11)" D=foo_db,t=foo_table

What the tool does is create a perfect copy of the structure of foo_db.foo_table and then run the ALTER on the empty table with no data. It then adds triggers for any new rows created from this point on and begins dumping old data from foo_db.foo_table to it and then renames the old table, renames the new table to foo_db.foo_table and then subsequently drops the old table.

There are a ton of options. You can read all about them here: http://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-change.html

Also, note that this process takes much longer than ALTER statements on a vanilla copy of MySQL as all the old data needs to be imported to the new one before the process is complete. For small tables it's not a big deal, but big tables can take some time.

Our workflow for working with our cluster is to do CREATE and DROP statements direct to the cluster and all ALTERs run through pt-online-schema-change.

Hope this helps!

Mike
Reply all
Reply to author
Forward
0 new messages