Galera crash, [ERROR] mysqld got signal 11

1,412 views
Skip to first unread message

Pierre Schwartz

unread,
Apr 24, 2017, 12:13:54 PM4/24/17
to codership
Hi everyone,

I am facing a crash on my 3 nodes galera cluster (working on debian 8)  : 

Suddenly one node stops working (once a week or one in 2 weeks and not always the same node). The 2 other nodes continue working well, there is no data loss nor global unavailability

Here is the syslog message during the crash. 
The memory equation is ok, the server has much more that 1 GB free memory even at crashtime.
Could it be a misconfiguration (on mysql ? on debian ?)

Anyone has already faced this kind of problem ?
Thanks for your help

Apr 24 17:02:45 mysqld[7085]: 170424 17:02:45 [ERROR] mysqld got signal 11 ;
Apr 24 17:02:45 mysqld[7085]: This could be because you hit a bug. It is also possible that this binary
Apr 24 17:02:45 mysqld[7085]: or one of the libraries it was linked against is corrupt, improperly built,
Apr 24 17:02:45 mysqld[7085]: or misconfigured. This error can also be caused by malfunctioning hardware.
Apr 24 17:02:45 mysqld[7085]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Apr 24 17:02:45 mysqld[7085]: We will try our best to scrape up some info that will hopefully help
Apr 24 17:02:45 mysqld[7085]: diagnose the problem, but since we have already crashed,
Apr 24 17:02:45 mysqld[7085]: something is definitely wrong and this may fail.
Apr 24 17:02:45 mysqld[7085]: Server version: 10.1.22-MariaDB-1~jessie
Apr 24 17:02:45 mysqld[7085]: key_buffer_size=16777216
Apr 24 17:02:45 mysqld[7085]: read_buffer_size=2097152
Apr 24 17:02:45 mysqld[7085]: max_used_connections=111
Apr 24 17:02:45 mysqld[7085]: max_threads=153
Apr 24 17:02:45 mysqld[7085]: thread_count=5
Apr 24 17:02:45 mysqld[7085]: It is possible that mysqld could use up to
Apr 24 17:02:45 mysqld[7085]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 959537 K  bytes of memory
Apr 24 17:02:45 mysqld[7085]: Hope that's ok; if not, decrease some variables in the equation.
Apr 24 17:02:45 mysqld[7085]: Thread pointer: 0x7ff210049008
Apr 24 17:02:45 mysqld[7085]: Attempting backtrace. You can use the following information to find out
Apr 24 17:02:45 mysqld[7085]: where mysqld died. If you see no messages after this, something went
Apr 24 17:02:45 mysqld[7085]: terribly wrong...
Apr 24 17:02:45 mysqld[7085]: stack_bottom = 0x7ff23372c218 thread_stack 0x48400
Apr 24 17:02:45 mysqld[7085]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7ff2331583ee]
Apr 24 17:02:45 mysqld[7085]: /usr/sbin/mysqld(handle_fatal_signal+0x2fd)[0x7ff232c9241d]
Apr 24 17:02:45 mysqld[7085]: /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7ff2322ba890]
Apr 24 17:02:45 mysqld[7085]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8f)[0x7ff232d4820f]
Apr 24 17:02:45 mysqld[7085]: /usr/sbin/mysqld(binlog_background_thread+0x292)[0x7ff232d48822]
Apr 24 17:02:45 mysqld[7085]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8064)[0x7ff2322b3064]
Apr 24 17:02:45 mysqld[7085]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7ff23045d62d]
Apr 24 17:02:45 mysqld[7085]: Trying to get some variables.
Apr 24 17:02:45 mysqld[7085]: Some pointers may be invalid and cause the dump to abort.
Apr 24 17:02:45 mysqld[7085]: Query (0x0):
Apr 24 17:02:45 mysqld[7085]: Connection ID (thread ID): 3
Apr 24 17:02:45 mysqld[7085]: Status: NOT_KILLED
Apr 24 17:02:45 mysqld[7085]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
Apr 24 17:02:45 mysqld[7085]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Apr 24 17:02:45 mysqld[7085]: information that should help you find out what is causing the crash.
Apr 24 17:02:45 systemd[1]: mariadb.service: main process exited, code=killed, status=11/SEGV
Apr 24 17:02:45 systemd[1]: Unit mariadb.service entered failed state.
Apr 24 17:02:51 systemd[1]: mariadb.service holdoff time over, scheduling restart.
Apr 24 17:02:51 systemd[1]: Stopping MariaDB database server...
Apr 24 17:02:51 systemd[1]: Starting MariaDB database server...
Apr 24 17:02:51 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] Using unique option prefix 'key_buffer' is error-prone and can break in the future. Please use the full name 'key_buffer_size' instead.
Apr 24 17:02:51 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] /usr/sbin/mysqld (mysqld 10.1.22-MariaDB-1~jessie) starting as process 6767 ...
Apr 24 17:02:56 mysqld[6538]: Segmentation fault
Apr 24 17:02:56 mysqld[6538]: WSREP: Failed to start mysqld for wsrep recovery: '2017-04-24 17:02:51 139714435332032 [Note] InnoDB: Using mutexes to ref count buffer pool pages
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] InnoDB: The InnoDB memory heap is disabled
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] InnoDB: Compressed tables use zlib 1.2.8
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] InnoDB: Using Linux native AIO
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] InnoDB: Using SSE crc32 instructions
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] InnoDB: Initializing buffer pool, size = 512.0M
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:51 139714435332032 [Note] InnoDB: Completed initialization of buffer pool
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:52 139714435332032 [Note] InnoDB: Highest supported file format is Barracuda.
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:52 139714435332032 [Note] InnoDB: Starting crash recovery from checkpoint LSN=67523500291
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:52 139714435332032 [Note] InnoDB: Restoring possible half-written data pages from the doublewrite buffer...
Apr 24 17:02:56 mysqld[6538]: InnoDB: Transaction 67688453 was in the XA prepared state.
Apr 24 17:02:56 mysqld[6538]: InnoDB: 3 transaction(s) which must be rolled back or cleaned up
Apr 24 17:02:56 mysqld[6538]: InnoDB: in total 2 row operations to undo
Apr 24 17:02:56 mysqld[6538]: InnoDB: Trx id counter is 67688960
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:54 139714435332032 [Note] InnoDB: Starting final batch to recover 2664 pages from redo log
Apr 24 17:02:56 mysqld[6538]: InnoDB: Last MySQL binlog file position 0 104857662, file name /var/log/mysql/mariadb-bin.000576
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:55 139714435332032 [Note] InnoDB: 128 rollback segment(s) are active.
Apr 24 17:02:56 mysqld[6538]: InnoDB: Starting in background the rollback of uncommitted transactions
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:55 7f118b3fe700  InnoDB: Rolling back trx with id 67688459, 1 rows to undo
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:55 139714435332032 [Note] InnoDB: Waiting for purge to start
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:55 139713327392512 [Note] InnoDB: Rollback of trx with id 67688459 completed
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:55 7f118b3fe700  InnoDB: Rolling back trx with id 67688456, 1 rows to undo
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:55 139713327392512 [Note] InnoDB: Rollback of trx with id 67688456 completed
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:55 7f118b3fe700  InnoDB: Rollback of non-prepared transactions completed
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:55 139714435332032 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.35-80.0 started; log sequence number 67532540295
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 139714435332032 [Warning] InnoDB: Skipping buffer pool dump/restore during wsrep recovery.
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 139714435332032 [Note] Plugin 'FEEDBACK' is disabled.
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 139714435332032 [Note] Recovering after a crash using /var/log/mysql/mariadb-bin
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 139714435332032 [Note] Starting crash recovery...
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 7f11cd49b7c0  InnoDB: Starting recovery for XA transactions...
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 7f11cd49b7c0  InnoDB: Transaction 67688453 in prepared state after recovery
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 7f11cd49b7c0  InnoDB: Transaction contains changes to 1 rows
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 7f11cd49b7c0  InnoDB: 1 transactions in prepared state after recovery
Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 139714435332032 [Note] Found 1 prepared transaction(s) in InnoDB
Apr 24 17:02:56 mysqld[6538]: 170424 17:02:56 [ERROR] mysqld got signal 11 ;
Apr 24 17:02:56 mysqld[6538]: This could be because you hit a bug. It is also possible that this binary
Apr 24 17:02:56 mysqld[6538]: or one of the libraries it was linked against is corrupt, improperly built,
Apr 24 17:02:56 mysqld[6538]: or misconfigured. This error can also be caused by malfunctioning hardware.
Apr 24 17:02:56 mysqld[6538]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Apr 24 17:02:56 mysqld[6538]: We will try our best to scrape up some info that will hopefully help
Apr 24 17:02:56 mysqld[6538]: diagnose the problem, but since we have already crashed,
Apr 24 17:02:56 mysqld[6538]: something is definitely wrong and this may fail.
Apr 24 17:02:56 mysqld[6538]: Server version: 10.1.22-MariaDB-1~jessie
Apr 24 17:02:56 mysqld[6538]: key_buffer_size=16777216
Apr 24 17:02:56 mysqld[6538]: read_buffer_size=2097152
Apr 24 17:02:56 mysqld[6538]: max_used_connections=0
Apr 24 17:02:56 mysqld[6538]: max_threads=153
Apr 24 17:02:56 mysqld[6538]: thread_count=0
Apr 24 17:02:56 mysqld[6538]: It is possible that mysqld could use up to
Apr 24 17:02:56 mysqld[6538]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 959537 K  bytes of memory
Apr 24 17:02:56 mysqld[6538]: Hope that's ok; if not, decrease some variables in the equation.
Apr 24 17:02:56 mysqld[6538]: Thread pointer: 0x0
Apr 24 17:02:56 mysqld[6538]: Attempting backtrace. You can use the following information to find out
Apr 24 17:02:56 mysqld[6538]: where mysqld died. If you see no messages after this, something went
Apr 24 17:02:56 mysqld[6538]: terribly wrong...
Apr 24 17:02:56 mysqld[6538]: stack_bottom = 0x0 thread_stack 0x48400
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x7f11ccdfd3ee]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(handle_fatal_signal+0x2fd)[0x7f11cc93741d]
Apr 24 17:02:56 mysqld[6538]: /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890)[0x7f11cbf5f890]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(wsrep_on+0x18)[0x7f11cc8d2bd8]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(+0x6fcdb1)[0x7f11cca8cdb1]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(+0x5a8cff)[0x7f11cc938cff]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(_Z24plugin_foreach_with_maskP3THDPFcS0_P13st_plugin_intPvEijS3_+0x2bb)[0x7f11cc7ba8fb]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(_Z10ha_recoverP7st_hash+0xc6)[0x7f11cc93ab36]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG7recoverEP11st_log_infoPKcP11st_io_cacheP28Format_description_log_eventb+0x4df)[0x7f11cc9edeef]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG18do_binlog_recoveryEPKcb+0x340)[0x7f11cc9ee4a0]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG4openEPKc+0x30)[0x7f11cc9f1b50]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(+0x38b535)[0x7f11cc71b535]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(_Z11mysqld_mainiPPc+0xa3b)[0x7f11cc71f28b]
Apr 24 17:02:56 mysqld[6538]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f11ca03bb45]
Apr 24 17:02:56 mysqld[6538]: /usr/sbin/mysqld(+0x382fdd)[0x7f11cc712fdd]
Apr 24 17:02:56 mysqld[6538]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Apr 24 17:02:56 mysqld[6538]: information that should help you find out what is causing the crash.'
Apr 24 17:02:56 systemd[1]: mariadb.service: control process exited, code=exited status=1
Apr 24 17:02:56 systemd[1]: Failed to start MariaDB database server.
Apr 24 17:02:56 systemd[1]: Unit mariadb.service entered failed state.

Jörg Brühe

unread,
Apr 24, 2017, 2:09:10 PM4/24/17
to codership, Pierre Schwartz
Hi Pierre, all!


On 24.04.2017 17:55, Pierre Schwartz wrote:
> Hi everyone,
>
> I am facing a crash on my 3 nodes galera cluster (working on debian 8) :
>
> Suddenly one node stops working (once a week or one in 2 weeks and not
> always the same node). The 2 other nodes continue working well, there is
> no data loss nor global unavailability
>
> Here is the syslog message during the crash.
> The memory equation is ok, the server has much more that 1 GB free
> memory even at crashtime.
> Could it be a misconfiguration (on mysql ? on debian ?)

From the backtrace, it doesn't look like a configuration issue.
Rather, it seems your application(s) is/are trying an XA transaction,
and that is incompatible with Galera:

>
> [[...]]
> Apr 24 17:02:45 mysqld[7085]:
> /usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG13mark_xid_doneEmb+0x8f)[0x7ff232d4820f]

The function "mark_xid_done()" is dealing with binlog segments that
contain information about an "external" or "XA" transaction; AIUI, it
should not be called in a Galera cluster because there should be no
segment with an "XA" transaction.

> [[...]]
> Apr 24 17:02:56 mysqld[6538]: InnoDB: Transaction 67688453 was in the XA
> prepared state.

To me, this looks like an error condition.

> [[...]]
> Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 7f11cd49b7c0 InnoDB:
> Starting recovery for XA transactions...
> Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 7f11cd49b7c0 InnoDB:
> Transaction 67688453 in prepared state after recovery
> Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 7f11cd49b7c0 InnoDB:
> Transaction contains changes to 1 rows
> Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 7f11cd49b7c0 InnoDB:
> 1 transactions in prepared state after recovery
> Apr 24 17:02:56 mysqld[6538]: 2017-04-24 17:02:56 139714435332032 [Note]
> Found 1 prepared transaction(s) in InnoDB
> Apr 24 17:02:56 mysqld[6538]: 170424 17:02:56 [ERROR] mysqld got signal 11 ;

And this looks like a crash when handling that XA transaction.

You should look into your applications and find the misbehaving one.


HTH,
Jörg

--
Joerg Bruehe, Senior MySQL Support Engineer, joerg....@fromdual.com
FromDual GmbH, Rebenweg 6, CH - 8610 Uster; phone +41 44 500 58 26
Geschäftsführer: Oliver Sennhauser
Handelsregister-Eintrag: CH-020.4.044.539-3

Reply all
Reply to author
Forward
0 new messages