mysqld got signal 11 ;

gmv...@gmail.com

May 3, 2016, 10:51:53 AM

to Percona Discussion

Hi there,

I'm having issues with one of my MySQL nodes: mysqld died with signal 11.

It has happened twice already in the last 4 days, and I only just activated this cluster.

I took the binaries from Percona, and it runs on CentOS 7.

percona-release-0.1-3.noarch

Percona-XtraDB-Cluster-shared-56-5.6.28-25.14.1.el7.x86_64

Percona-XtraDB-Cluster-server-56-5.6.28-25.14.1.el7.x86_64

Percona-XtraDB-Cluster-galera-3-3.14-1.rhel7.x86_64

Percona-XtraDB-Cluster-client-56-5.6.28-25.14.1.el7.x86_64

percona-xtrabackup-2.3.4-1.el7.x86_64

Percona-XtraDB-Cluster-56-5.6.28-25.14.1.el7.x86_64

I'd appreciate any help from the community in resolving this issue ASAP.

See the MySQL error log below.

2016-05-03 17:06:15 63421 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 12253348)

2016-05-03 17:06:15 63421 [Warning] WSREP: RBR event 4 Update_rows apply warning: 1615, 12253348

2016-05-03 17:06:15 63421 [Warning] WSREP: Failed to apply app buffer: seqno: 12253348, status: 1

at galera/src/trx_handle.cpp:apply():351

Retrying 2th time

2016-05-03 17:06:15 63421 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 12253348)

2016-05-03 17:06:15 63421 [Warning] WSREP: RBR event 4 Update_rows apply warning: 1615, 12253348

2016-05-03 17:06:15 63421 [Warning] WSREP: Failed to apply app buffer: seqno: 12253348, status: 1

at galera/src/trx_handle.cpp:apply():351

Retrying 3th time

2016-05-03 17:06:15 63421 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 12253348)

2016-05-03 17:06:15 63421 [Warning] WSREP: RBR event 4 Update_rows apply warning: 1615, 12253348

2016-05-03 17:06:15 63421 [Warning] WSREP: Failed to apply app buffer: seqno: 12253348, status: 1

at galera/src/trx_handle.cpp:apply():351

Retrying 4th time

2016-05-03 17:06:15 63421 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 12253348)

2016-05-03 17:06:15 63421 [Warning] WSREP: RBR event 4 Update_rows apply warning: 1615, 12253348

2016-05-03 17:06:15 63421 [Warning] WSREP: failed to replay trx: source: baeb3a39-10e7-11e6-af04-0779ee3ed09e version: 3 local: 1 state: REPLAYING flags: 1 conn_id: 741428 trx_id: 193526419 seqnos (l: 3516738, g: 12253348, s: 12253327, d: 12253286, ts: 1571301573680706)

2016-05-03 17:06:15 63421 [Warning] WSREP: Failed to apply trx 12253348 4 times

2016-05-03 17:06:15 63421 [ERROR] WSREP: trx_replay failed for: 6, schema: bmby, query: (null)

2016-05-03 17:06:15 63421 [ERROR] Aborting

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 736010

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741449

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 736126

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741456

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 739694

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741448

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741457

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741451

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741453

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741439

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741450

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 737398

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 735237

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 738846

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741167

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 740577

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741459

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741454

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 736206

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741441

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 740962

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 730915

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741455

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 738549

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 740834

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741445

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 739844

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741460

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741447

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741461

2016-05-03 17:06:17 63421 [Note] WSREP: killing local connection: 741452

2016-05-03 17:06:17 63421 [Note] WSREP: Closing send monitor...

2016-05-03 17:06:17 63421 [Note] WSREP: Closed send monitor.

2016-05-03 17:06:17 63421 [Note] WSREP: gcomm: terminating thread

2016-05-03 17:06:17 63421 [Note] WSREP: gcomm: joining thread

2016-05-03 17:06:17 63421 [Note] WSREP: gcomm: closing backend

2016-05-03 17:06:17 63421 [Note] WSREP: view(view_id(NON_PRIM,0538159b,21) memb {

baeb3a39,0

} joined {

} left {

} partitioned {

0538159b,0

b2b6b94d,0

})

2016-05-03 17:06:17 63421 [Note] WSREP: view((empty))

2016-05-03 17:06:17 63421 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

2016-05-03 17:06:17 63421 [Note] WSREP: gcomm: closed

2016-05-03 17:06:17 63421 [Note] WSREP: Flow-control interval: [16, 16]

2016-05-03 17:06:17 63421 [Note] WSREP: Received NON-PRIMARY.

2016-05-03 17:06:17 63421 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 12253393)

2016-05-03 17:06:17 63421 [Note] WSREP: Received self-leave message.

2016-05-03 17:06:17 63421 [Note] WSREP: Flow-control interval: [0, 0]

2016-05-03 17:06:17 63421 [Note] WSREP: Received SELF-LEAVE. Closing connection.

2016-05-03 17:06:17 63421 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 12253393)

2016-05-03 17:06:17 63421 [Note] WSREP: RECV thread exiting 0: Success

2016-05-03 17:06:17 63421 [Note] WSREP: recv_thread() joined.

2016-05-03 17:06:17 63421 [Note] WSREP: Closing replication queue.

2016-05-03 17:06:17 63421 [Note] WSREP: Closing slave action queue.

2016-05-03 17:06:17 63421 [Note] WSREP: Service disconnected.

2016-05-03 17:06:17 63421 [Note] WSREP: rollbacker thread exiting

2016-05-03 17:06:18 63421 [Note] WSREP: Some threads may fail to exit.

2016-05-03 17:06:18 63421 [Note] Binlog end

14:06:18 UTC - mysqld got signal 11 ;

This could be because you hit a bug. It is also possible that this binary

or one of the libraries it was linked against is corrupt, improperly built,

or misconfigured. This error can also be caused by malfunctioning hardware.

We will try our best to scrape up some info that will hopefully help

diagnose the problem, but since we have already crashed,

something is definitely wrong and this may fail.

Please help us make Percona XtraDB Cluster better by reporting any

bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=8388608

read_buffer_size=131072

max_used_connections=98

max_threads=602

thread_count=21

connection_count=13

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 248004 K bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x5d542e50

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 7f2947923d30 thread_stack 0x40000

/usr/sbin/mysqld(my_print_stacktrace+0x3b)[0x9087ab]

/usr/sbin/mysqld(handle_fatal_signal+0x471)[0x67a431]

/usr/lib64/libpthread.so.0(+0xf100)[0x7f37c0538100]

/usr/sbin/mysqld[0x69f81a]

/usr/sbin/mysqld[0x6a0631]

/usr/sbin/mysqld[0x6a08ae]

/usr/sbin/mysqld[0x68ce29]

/usr/sbin/mysqld[0x68c31c]

/usr/sbin/mysqld(_Z16acl_authenticateP3THDj+0x1e8)[0x6a0aa8]

/usr/sbin/mysqld[0x6d0c83]

/usr/sbin/mysqld(_Z16login_connectionP3THD+0x51)[0x6d2831]

/usr/sbin/mysqld(_Z22thd_prepare_connectionP3THD+0x24)[0x6d3094]

/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x153)[0x6d33e3]

/usr/sbin/mysqld(handle_one_connection+0x40)[0x6d3610]

/usr/sbin/mysqld(pfs_spawn_thread+0x143)[0x946823]

/usr/lib64/libpthread.so.0(+0x7dc5)[0x7f37c0530dc5]

/usr/lib64/libc.so.6(clone+0x6d)[0x7f37be78228d]

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort.

Query (0): is an invalid pointer

Connection ID (thread ID): 741496

Status: NOT_KILLED

You may download the Percona XtraDB Cluster operations manual by visiting

http://www.percona.com/software/percona-xtradb-cluster/. You may find information

in the manual which will help you identify the cause of the crash.

160503 17:06:19 mysqld_safe Number of processes running now: 0

160503 17:06:19 mysqld_safe WSREP: not restarting wsrep node automatically

160503 17:06:19 mysqld_safe mysqld from pid file /run/mysqld/mysql.pid ended

Thanks, Serge.

krunal....@percona.com

May 3, 2016, 11:44:36 PM

to Percona Discussion

Hi,

I see you are hitting the bug described here:

https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1327763

It suggests that you are using prepared statements, which have a bug in older revisions.

We have fixed the issue, and the fix will be part of the next PXC 5.6 release.

In the meantime, it would be good if you could avoid PREPARE, or avoid conflicts between prepared statements and the applier.

You can read more about the issue in the bug report. I have pasted the commit message, which includes the details.

Regards,

Krunal

gmv...@gmail.com

May 4, 2016, 10:11:14 AM

to Percona Discussion

Hi

It looks like the fix you are talking about is not relevant here. I have checked with the customer, and they don't use PREPARE in their PHP application.

What else could it be?

My cluster is running on a single node for now.

As I understand it, you don't have an ETA for the next release, and the current codebase is still under development.

Thanks, Serge.

gmv...@gmail.com

May 4, 2016, 10:11:14 AM

to Percona Discussion

Hi Krunal,

Thank you for the reply.

I'm not sure I can avoid PREPARE or the conflicts. The customer is running their own application, which is very hard to change.

Do you have an ETA for when the fixed version will be available?

Is it possible to obtain the fix and recompile the current cluster?

I need a fix or workaround ASAP :(

Thanks, Serge.


krunal....@percona.com

May 4, 2016, 11:12:11 PM

to Percona Discussion

Hi,

Maybe you should re-check with the customer.

grep "1615" ./include/mysqld_error.h

#define ER_NEED_REPREPARE 1615

The error code in the log clearly indicates a PREPARE problem, and there are some 3 bugs that were reported independently by members in the past.

All of them boiled down to the same cause.
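To check which error codes the applier is failing with on a given node, a short script can scan the error log for the "BF applier failed to open_and_lock_tables" warning. This is only a rough sketch: the function name and the lookup table are made up for illustration, and the only entry in the table is the 1615 mapping from mysqld_error.h.

```python
import re
from collections import Counter

# Illustrative lookup table. Only 1615 is taken from mysqld_error.h;
# add other codes yourself after checking them in the header.
ERROR_NAMES = {1615: "ER_NEED_REPREPARE"}

# Matches the applier warning format seen in the error log.
APPLIER_RE = re.compile(r"BF applier failed to open_and_lock_tables: (\d+)")


def summarize_applier_errors(log_text):
    """Count applier failures per error code and attach a name where known."""
    counts = Counter(int(m.group(1)) for m in APPLIER_RE.finditer(log_text))
    return {
        code: (ERROR_NAMES.get(code, "unknown"), n)
        for code, n in counts.items()
    }


# One warning line copied from the log, repeated for the four attempts.
sample = (
    "2016-05-03 17:06:15 63421 [Warning] WSREP: BF applier failed to "
    "open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 "
    "conflict_state: 5 seqno: 12253348)\n"
) * 4

print(summarize_applier_errors(sample))
```

In practice you would pass in the contents of the node's error log instead of the sample string.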

So I think there is a good chance the customer is using PREPARE. If not directly, then some internal application is causing it.
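One way to check this without reading the application code is to look at the server's prepared statement counters. The sketch below assumes you have already fetched (name, value) rows from SHOW GLOBAL STATUS on the node. Com_stmt_prepare and Com_prepare_sql are standard MySQL status variables, but the helper name and the sample values are made up for illustration.

```python
# Standard MySQL status counters: Com_stmt_prepare is incremented by both
# the prepared statement API and the SQL PREPARE statement, while
# Com_prepare_sql counts only the SQL PREPARE statement.
PREPARE_COUNTERS = ("Com_stmt_prepare", "Com_prepare_sql")


def prepared_statement_usage(status_rows):
    """Return the prepare counters that are non-zero.

    status_rows is an iterable of (name, value) pairs, shaped like the
    result set of SHOW GLOBAL STATUS.
    """
    status = {name: int(value) for name, value in status_rows}
    return {
        name: status[name]
        for name in PREPARE_COUNTERS
        if status.get(name, 0) > 0
    }


# Made-up values for illustration only, not taken from the affected node.
rows = [
    ("Com_stmt_prepare", "1824"),
    ("Com_prepare_sql", "0"),
    ("Com_stmt_execute", "1824"),
]

print(prepared_statement_usage(rows))
```

A non-zero Com_stmt_prepare with Com_prepare_sql at zero would suggest that a client library is preparing statements through the API even though nobody wrote PREPARE in SQL. Keep in mind that these counters reset when mysqld restarts, so check them on a node that has been serving the application for a while.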