Fwd: Re: [codership-team] How to recover a crashed and damaged node?

Jan 22, 2015, 12:07:40 PM
to stefan.mich...@googlemail.com, codersh...@googlegroups.com
When you restarted the node, you probably bootstrapped it as a new cluster by setting wsrep_cluster_address to gcomm:// (an address with no peers listed).
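
To make the distinction concrete, here is a minimal sketch that checks whether a wsrep_cluster_address value lists any peers. The helper name is_bootstrap_address and the 192.0.2.x addresses are illustrative assumptions, not part of Galera or of this thread.

```python
def is_bootstrap_address(cluster_address: str) -> bool:
    """Return True if the gcomm:// address lists no peers (starts a new cluster)."""
    scheme, sep, rest = cluster_address.strip().partition("://")
    if scheme != "gcomm" or not sep:
        raise ValueError(f"not a gcomm address: {cluster_address!r}")
    # Drop any ?option=value suffix before looking for peer hosts.
    peers = rest.split("?", 1)[0]
    return not any(p.strip() for p in peers.split(","))


# Hypothetical values: an empty list bootstraps, a peer list joins.
print(is_bootstrap_address("gcomm://"))
print(is_bootstrap_address("gcomm://192.0.2.11,192.0.2.12"))
```

A node that should rejoin an existing cluster needs the second form, with at least one reachable peer in the list.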

Shut it down and re-add it to the existing cluster. Because it now has a new cluster UUID, you will probably need to do a full SST.

If all else fails, you do not need to remove Galera; just wipe the data directory, run mysql_install_db, and redo the SST. Of course, that only applies if you do not care about the data on that node.
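
The last-resort path can be laid out as an ordered plan. This is only a dry-run sketch: it prints commands and executes nothing. The service name "mysql", the system user "mysql", and moving the old directory aside (instead of deleting it outright) are assumptions; /var/lib/mysql is the path mentioned in the question.

```python
import shlex


def rebuild_plan(datadir="/var/lib/mysql", service="mysql", user="mysql"):
    """Return the rebuild steps as shell command strings; nothing is executed."""
    d = shlex.quote(datadir)
    return [
        f"service {service} stop",
        # Assumption: keep the old data aside rather than deleting it.
        f"mv {d} {d}.broken",
        f"mkdir -p {d}",
        f"chown {user}:{user} {d}",
        f"mysql_install_db --user={user} --datadir={d}",
        # wsrep_cluster_address must list existing nodes before this step,
        # otherwise the node bootstraps a new cluster again instead of doing SST.
        f"service {service} start",
    ]


for step in rebuild_plan():
    print(step)
```

Review each command against your own distribution and init system before running anything by hand.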

On 2015-01-22 09:14, S. Guenther wrote:

Hello,
 
We are running a 3-node cluster with Galera 23.2.7.
Today one of the nodes crashed leaving the following messages in /var/log/mysql/error.log:
 
150122 13:08:03 [ERROR] Slave SQL: Column 0 of table 'test.t1' cannot be converted from type 'varchar(50)' to type 'varchar(50)', Error_code: 1677
150122 13:08:03 [Warning] WSREP: RBR event 2 Write_rows apply warning: 3, 37929
150122 13:08:03 [ERROR] WSREP: Failed to apply trx: source: 3c231ee7-4d3b-11e3-a851-b218289b8b52 version: 2 local: 0 state: APPLYING flags: 1 conn_id: 250163 trx_id: 13748700 seqnos (l: 35416, g: 37929, s: 37928, d: 37907, ts: 1421928483920236322)
150122 13:08:03 [ERROR] WSREP: Failed to apply app buffer: seqno: 37929, status: WSREP_FATAL
         at galera/src/replicator_smm.cpp:apply_wscoll():52
         at galera/src/replicator_smm.cpp:apply_trx_ws():118
150122 13:08:03 [ERROR] WSREP: Node consistency compromized, aborting...
150122 13:08:03 [Note] WSREP: Closing send monitor...
150122 13:08:03 [Note] WSREP: Closed send monitor.
150122 13:08:03 [Note] WSREP: gcomm: terminating thread
150122 13:08:03 [Note] WSREP: gcomm: joining thread
150122 13:08:03 [Note] WSREP: gcomm: closing backend
....
150122 13:08:03 [Note] WSREP: /usr/sbin/mysqld: Terminated.
Aborted (core dumped)
 
After a restart of mysqld, this node has a different cluster_conf_id and state_uuid compared to the other two nodes.
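
A comparison like this can be scripted once the status values have been collected from each node, for example with SHOW STATUS LIKE 'wsrep_cluster%'. The sketch below assumes the values are already in plain dictionaries; the node names and UUIDs are made-up placeholders, and find_diverged_nodes is a hypothetical helper.

```python
from collections import Counter


def find_diverged_nodes(status_by_node):
    """Return nodes whose wsrep_cluster_state_uuid differs from the majority."""
    uuids = {
        node: status["wsrep_cluster_state_uuid"]
        for node, status in status_by_node.items()
    }
    majority_uuid, _ = Counter(uuids.values()).most_common(1)[0]
    return sorted(node for node, uuid in uuids.items() if uuid != majority_uuid)


# Placeholder data, not taken from the cluster in this thread.
sample = {
    "node1": {"wsrep_cluster_state_uuid": "aaaaaaaa-0000-0000-0000-000000000001"},
    "node2": {"wsrep_cluster_state_uuid": "aaaaaaaa-0000-0000-0000-000000000001"},
    "node3": {"wsrep_cluster_state_uuid": "bbbbbbbb-0000-0000-0000-000000000002"},
}
print(find_diverged_nodes(sample))
```

A node reported here is no longer part of the same cluster as the others and would need to rejoin it.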
 
Additionally, three more tables are damaged:
 
150122 13:57:05 [ERROR] /usr/sbin/mysqld: Table './mysql/proc' is marked as crashed and should be repaired
150122 13:57:05 [Warning] Checking table:   './mysql/proc'
150122 13:57:10 [ERROR] /usr/sbin/mysqld: Table './ccs/FTPGROUP' is marked as crashed and should be repaired
150122 13:57:10 [Warning] Checking table:   './ccs/FTPGROUP'
150122 13:57:10 [ERROR] /usr/sbin/mysqld: Table './ccs/FTPUSER' is marked as crashed and should be repaired
150122 13:57:10 [Warning] Checking table:   './ccs/FTPUSER'
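
As a rough aid, the crashed tables can be pulled out of the error log and turned into REPAIR TABLE statements. This sketch assumes the log line format shown above and that the tables use an engine that supports REPAIR TABLE (the "marked as crashed" message is typical of MyISAM). The function names are hypothetical.

```python
import re

# Matches e.g.: Table './ccs/FTPUSER' is marked as crashed and should be repaired
CRASHED = re.compile(r"Table '\./([^/']+)/([^']+)' is marked as crashed")


def crashed_tables(log_lines):
    """Return unique (database, table) pairs in the order they first appear."""
    seen = []
    for line in log_lines:
        match = CRASHED.search(line)
        if match and match.groups() not in seen:
            seen.append(match.groups())
    return seen


def repair_statements(tables):
    """Build one REPAIR TABLE statement per (database, table) pair."""
    return [f"REPAIR TABLE `{db}`.`{tbl}`;" for db, tbl in tables]


log = [
    "150122 13:57:05 [ERROR] /usr/sbin/mysqld: Table './mysql/proc' is marked as crashed and should be repaired",
    "150122 13:57:05 [Warning] Checking table:   './mysql/proc'",
    "150122 13:57:10 [ERROR] /usr/sbin/mysqld: Table './ccs/FTPGROUP' is marked as crashed and should be repaired",
    "150122 13:57:10 [ERROR] /usr/sbin/mysqld: Table './ccs/FTPUSER' is marked as crashed and should be repaired",
]
for statement in repair_statements(crashed_tables(log)):
    print(statement)
```

If the node is rebuilt from a full state transfer anyway, repairing these tables first is unnecessary.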
 
Does it make sense to try to repair this node, or should I uninstall Galera, remove /var/lib/mysql, and start again from scratch?
 
Thanks for any hints and suggestions,
 
Stefan

 
