SELECT queries hitting unexpected deadlock error

1,258 views
Skip to first unread message

Munazir

unread,
Jun 29, 2015, 1:08:30 PM6/29/15
to codersh...@googlegroups.com

Hi,

Recently we started using MySQL Galera Cluster with 3 nodes as full multi-master (read and writes to all 3 nodes). Below is version for MySQL wsrep and Galera.

On Centos 6.6 with below version

wsrep_provider_version       | 3.10(r8182fa6)

MySQmysql-wsrep-server-5.6-5.6.23-25.10.el6.x86_64
galera-3-25.3.10-2.el6.x86_64

We noticed Deadlock error below is some queries with Deadlock

SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction (SQL: update `sessions` set `payload` = YTo0OntzOjY6Il90b2tlbiI7czo0MDoiVTVhYzN6d1VjNmxlYXNmSU9VRDBqMHp2cEtIZDNnRFBaR1FMR0YyWSI7czo5OiJfcHJldmlvdXMiO2E6MTp7czozOiJ1cmwiO3M6NDA6Imh0dHA6Ly92aXNhLm11c2FuZWQuZ292LnNhL2F1dGgvcmVnaXN0ZXIiO31zOjU6ImZsYXNoIjthOjI6e3M6Mzoib2xkIjthOjA6e31zOjM6Im5ldyI7YTowOnt9fXM6OToiX3NmMl9tZXRhIjthOjM6e3M6MToidSI7aToxNDM1NTg3MDUxO3M6MToiYyI7aToxNDM1NTg2ODgyO3M6MToibCI7czoxOiIwIjt9fQ==, `last_activity` = 1435587051 where `id` = 151bebdaf8ea15fa18b44843618a1dfd0131dba2

'SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction (SQL: select * from `users` where `users`.`deleted_at` is null and `id_number` = 1031223868 limit 1)'

'SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction (SQL: delete from `sessions` where `id` = 068470f0a9c481c8b5b6d13cf6e8bc1f24c94fb5)'

'SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction (SQL: update `sessions` set `payload` = YTo0OntzOjY6Il90b2tlbiI7czo0MDoiaUR3aFVLMlZxYzBVZ1ZXUzhLWW5wTW9rS2xjeGJhVnVnazU5enB1RiI7czo5OiJfcHJldmlvdXMiO2E6MTp7czozOiJ1cmwiO3M6NDA6Imh0dHA6Ly92aXNhLm11c2FuZWQuZ292LnNhL2F1dGgvcmVnaXN0ZXIiO31zOjU6ImZsYXNoIjthOjI6e3M6Mzoib2xkIjthOjA6e31zOjM6Im5ldyI7YTowOnt9fXM6OToiX3NmMl9tZXRhIjthOjM6e3M6MToidSI7aToxNDM1NTczMzgzO3M6MToiYyI7aToxNDM1NTczMzY0O3M6MToibCI7czoxOiIwIjt9fQ==, `last_activity` = 1435573383 where `id` = eed98d6f798915365c867923bc7cdfebd475f49f)'


Please update me what steps I can take to get rid of this Deadlock.

We are using HaProxy for Loadbalancing with mysqlcluster check.

Regards
Munazir

alexey.y...@galeracluster.com

unread,
Jun 29, 2015, 2:06:38 PM6/29/15
to Munazir, codersh...@googlegroups.com
Hi,

Deadlocks like these should be expected in a multi-master operation,
especially using TCP-level connection balancer (like haproxy). Setting
wsrep_retry_autocommit to a higher value may help in case of autocommit
queries, but otherwise transactions must simply be retried.

Regards,
Alex

Munazir

unread,
Jun 29, 2015, 4:04:50 PM6/29/15
to codersh...@googlegroups.com, mdmu...@gmail.com
Hi Alex,

Thanks for your quick reply.

I already put wsrep_retry_autocommit=4, now i change it to 6.

What is best value you suggest to be safe from Deadlock.

Regards
Munazir

alexey.y...@galeracluster.com

unread,
Jun 29, 2015, 4:17:37 PM6/29/15
to Munazir, codersh...@googlegroups.com
The higher the value the higher is the probability that the *autocommit*
query will pass. But in no case it guarantees 100%.

The real fix must be done at the application level changing the query
patterns.

Munazir

unread,
Jun 29, 2015, 8:03:14 PM6/29/15
to codersh...@googlegroups.com, f.alo...@tamkeen.it, mdmu...@gmail.com
Dear Alex,

After increase value to 6 on all 3 nodes. One of the node i.e. MySQLND01 get stop with below logs in .err


*** Priority TRANSACTION:

TRANSACTION 510056, ACTIVE 0 sec inserting

mysql tables in use 1, locked 1

1 lock struct(s), heap size 360, 0 row lock(s)

MySQL thread id 2, OS thread handle 0x7f2255c7e700, query id 34691 Write_rows_log_event::write_row(220947)

*** Victim TRANSACTION:

TRANSACTION 510055, ACTIVE 0 sec, thread declared inside InnoDB 4548

mysql tables in use 1, locked 1

24 lock struct(s), heap size 2936, 474 row lock(s), undo log entries 1

MySQL thread id 2710, OS thread handle 0x7f2254802700, query id 34689 10.60.32.81 dlv_admin query end

delete from `sessions` where `last_activity` <= '1435609593'

*** WAITING FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 100 page no 99 n bits 96 index `PRIMARY` of table `mol_dlvdb`.`sessions` trx id 510055 lock_mode X

Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0

0: len 8; hex 73757072656d756d; asc supremum;;

Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0

0: len 30; hex 333730323530343032343761613230343561303436616665343761323166; asc 37025040247aa2045a046afe47a21f; (total 40 bytes);

1: len 6; hex 00000007c7f8; asc ;;

2: len 7; hex f2000001900110; asc ;;

3: len 30; hex 59546f7a4f6e747a4f6a5936496c39306232746c62694937637a6f304d44; asc YTozOntzOjY6Il90b2tlbiI7czo0MD; (total 264 bytes);

4: len 4; hex d591ad70; asc p;;

Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0

0: len 30; hex 326436613336383266616261306563653462373133623432333434663363; asc 2d6a3682faba0ece4b713b42344f3c; (total 40 bytes);

1: len 6; hex 00000007c5c6; asc ;;

2: len 7; hex 4d00000163095b; asc M c [;;

3: len 30; hex 59546f314f6e747a4f6a6336496d316c63334e685a3255694f3245364d44; asc YTo1OntzOjc6Im1lc3NhZ2UiO2E6MD; (total 380 bytes);

4: len 4; hex d591ad23; asc #;;



2015-06-29 23:41:33 31983 [Note] WSREP: cluster conflict due to high priority abort for threads:

2015-06-29 23:41:33 31983 [Note] WSREP: Winning thread:

THD: 2, mode: applier, state: executing, conflict: no conflict, seqno: 220947

SQL: (null)

2015-06-29 23:41:33 31983 [Note] WSREP: Victim thread:

THD: 2710, mode: local, state: committing, conflict: no conflict, seqno: 220948

SQL: delete from `sessions` where `last_activity` <= '1435609593'

2015-06-29 23:41:33 31983 [Note] WSREP: BF kill (1, seqno: 220947), victim: (2710) trx: 510055

2015-06-29 23:41:33 31983 [Note] WSREP: Aborting query: delete from `sessions` where `last_activity` <= '1435609593'

2015-06-29 23:41:33 31983 [Note] WSREP: kill trx QUERY_COMMITTING for 510055

2015-06-29 23:41:33 31983 [Note] WSREP: thd 2710 seqno 220948 BF aborted by provider, will replay

2015-06-29 23:41:33 31983 [Note] WSREP: replaying increased: 1, thd: 2710

2015-06-29 23:41:33 31983 [Note] WSREP: commit failed for reason: 4 2710 delete from `sessions` where `last_activity` <= '1435609593'

2015-06-29 23:41:33 31983 [Note] WSREP: conflict state: 4

2015-06-29 23:41:33 31983 [Note] WSREP: cleanup transaction for LOCAL_STATE: delete from `sessions` where `last_activity` <= '1435609593'

2015-06-29 23:41:33 31983 [Note] WSREP: replay trx: delete from `sessions` where `last_activity` <= '1435609593' -1

2015-06-29 23:41:33 31983 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 220948)

2015-06-29 23:41:33 31983 [Warning] WSREP: RBR event 3 Delete_rows apply warning: 1615, 220948

2015-06-29 23:41:33 31983 [Warning] WSREP: Failed to apply app buffer: seqno: 220948, status: 1

at galera/src/trx_handle.cpp:apply():351

Retrying 2th time

2015-06-29 23:41:33 31983 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 220948)

2015-06-29 23:41:33 31983 [Warning] WSREP: RBR event 3 Delete_rows apply warning: 1615, 220948

2015-06-29 23:41:33 31983 [Warning] WSREP: Failed to apply app buffer: seqno: 220948, status: 1

at galera/src/trx_handle.cpp:apply():351

Retrying 3th time

2015-06-29 23:41:33 31983 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 220948)

2015-06-29 23:41:33 31983 [Warning] WSREP: RBR event 3 Delete_rows apply warning: 1615, 220948

2015-06-29 23:41:33 31983 [Warning] WSREP: Failed to apply app buffer: seqno: 220948, status: 1

at galera/src/trx_handle.cpp:apply():351


Retrying 4th time

2015-06-29 23:41:33 31983 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 220948)

2015-06-29 23:41:33 31983 [Warning] WSREP: RBR event 3 Delete_rows apply warning: 1615, 220948

2015-06-29 23:41:33 31983 [Warning] WSREP: failed to replay trx: source: 8b0b9912-1e99-11e5-ae06-9b0024a89ce3 version: 3 local: 1 state: REPLAYING flags: 1 conn_id: 2710 trx_id: 510055 seqnos (l: 6830, g: 220948, s: 220946, d: 220932, ts: 985313412716699)

2015-06-29 23:41:33 31983 [Warning] WSREP: Failed to apply trx 220948 4 times

2015-06-29 23:41:33 31983 [ERROR] WSREP: trx_replay failed for: 6, query: void

2015-06-29 23:41:33 31983 [ERROR] Aborting

2015-06-29 23:41:35 31983 [Note] WSREP: cleanup transaction for LOCAL_STATE: select * from `sessions` where `id` = '0f8fc6ee26fd34d93feb29800269922b1b7d93f2' limit 1

2015-06-29 23:41:35 31983 [Note] WSREP: killing local connection: 2712

2015-06-29 23:41:35 31983 [Note] WSREP: waiting for client connections to close: 4

2015-06-29 23:41:35 31983 [Note] WSREP: Closing send monitor...

2015-06-29 23:41:35 31983 [Note] WSREP: Closed send monitor.

2015-06-29 23:41:35 31983 [Note] WSREP: gcomm: terminating thread

2015-06-29 23:41:35 31983 [Note] WSREP: gcomm: joining thread

2015-06-29 23:41:35 31983 [Note] WSREP: gcomm: closing backend

2015-06-29 23:41:35 31983 [Note] WSREP: view(view_id(NON_PRIM,1b7d2027,85) memb {

8b0b9912,0

} joined {

} left {

} partitioned {

1b7d2027,0

b0cbf49c,0

})

2015-06-29 23:41:35 31983 [Note] WSREP: view((empty))

2015-06-29 23:41:35 31983 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

2015-06-29 23:41:35 31983 [Note] WSREP: gcomm: closed

2015-06-29 23:41:35 31983 [Note] WSREP: Flow-control interval: [16, 16]

2015-06-29 23:41:35 31983 [Note] WSREP: Received NON-PRIMARY.

2015-06-29 23:41:35 31983 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 220956)

2015-06-29 23:41:35 31983 [Note] WSREP: Received self-leave message.

2015-06-29 23:41:35 31983 [Note] WSREP: Flow-control interval: [0, 0]

2015-06-29 23:41:35 31983 [Note] WSREP: Received SELF-LEAVE. Closing connection.

2015-06-29 23:41:35 31983 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 220956)

2015-06-29 23:41:35 31983 [Note] WSREP: RECV thread exiting 0: Success


2015-06-29 23:41:35 31983 [Note] WSREP: recv_thread() joined.

2015-06-29 23:41:35 31983 [Note] WSREP: Closing replication queue.

2015-06-29 23:41:35 31983 [Note] WSREP: Closing slave action queue.

2015-06-29 23:41:35 31983 [Note] WSREP: Service disconnected.

2015-06-29 23:41:35 31983 [Note] WSREP: closing wsrep thread 1

2015-06-29 23:41:35 31983 [Note] WSREP: closing wsrep thread 2

2015-06-29 23:41:35 31983 [Note] WSREP: WSREP rollback thread wakes for signal

2015-06-29 23:41:35 31983 [Note] WSREP: WSREP rollback thread has empty abort queue

2015-06-29 23:41:35 31983 [Note] WSREP: rollbacker thread exiting

2015-06-29 23:41:35 31983 [Note] WSREP: avoiding thread re-use for applier, thd: 1

2015-06-29 23:41:36 31983 [Note] WSREP: Some threads may fail to exit.

2015-06-29 23:41:36 31983 [Note] Binlog end

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'partition'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'BLACKHOLE'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'ARCHIVE'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_TABLESPACES'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN_COLS'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_FOREIGN'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_FIELDS'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_COLUMNS'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_INDEXES'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_TABLESTATS'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_SYS_TABLES'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_FT_INDEX_TABLE'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_FT_INDEX_CACHE'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_FT_CONFIG'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_FT_BEING_DELETED'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_FT_DELETED'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_FT_DEFAULT_STOPWORD'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_METRICS'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_BUFFER_POOL_STATS'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE_LRU'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_BUFFER_PAGE'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX_RESET'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_CMP_PER_INDEX'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_CMPMEM_RESET'


2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_CMPMEM'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_CMP_RESET'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_CMP'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_LOCK_WAITS'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_LOCKS'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'INNODB_TRX'

2015-06-29 23:41:36 31983 [Note] Shutting down plugin 'InnoDB'

2015-06-29 23:41:36 31983 [Note] InnoDB: FTS optimize thread exiting.

2015-06-29 23:41:36 31983 [Note] InnoDB: Starting shutdown...

20:41:38 UTC - mysqld got signal 11 ;

This could be because you hit a bug. It is also possible that this binary

or one of the libraries it was linked against is corrupt, improperly built,

or misconfigured. This error can also be caused by malfunctioning hardware.

We will try our best to scrape up some info that will hopefully help

diagnose the problem, but since we have already crashed,

something is definitely wrong and this may fail.

key_buffer_size=268435456

read_buffer_size=104857600

max_used_connections=4

max_threads=500

thread_count=3

connection_count=3

It is possible that mysqld could use up to

key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 51725237 K bytes of memory

Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x91efa10

Attempting backtrace. You can use the following information to find out

where mysqld died. If you see no messages after this, something went

terribly wrong...

stack_bottom = 7f22ac450df8 thread_stack 0x40000

/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8fc4a5]

/usr/sbin/mysqld(handle_fatal_signal+0x494)[0x67d854]

/lib64/libpthread.so.0[0x394340f710]

/usr/sbin/mysqld[0x69bd9b]

/usr/sbin/mysqld[0x69cace]

/usr/sbin/mysqld[0x69cdb6]

/usr/sbin/mysqld[0x68f874]

/usr/sbin/mysqld(_Z16acl_authenticateP3THDj+0x1f7)[0x6a87b7]

/usr/sbin/mysqld[0x6cc66a]

/usr/sbin/mysqld(_Z16login_connectionP3THD+0x45)[0x6cc895]

/usr/sbin/mysqld(_Z22thd_prepare_connectionP3THD+0x24)[0x6cc924]

/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x139)[0x6cd619]

/usr/sbin/mysqld(handle_one_connection+0x47)[0x6cd7d7]

/usr/sbin/mysqld(pfs_spawn_thread+0x12a)[0xb1fe7a]

/lib64/libpthread.so.0[0x39434079d1]

/lib64/libc.so.6(clone+0x6d)[0x39430e89dd]

Trying to get some variables.

Some pointers may be invalid and cause the dump to abort.

Query (0): is an invalid pointer

Connection ID (thread ID): 2714

Status: NOT_KILLED


2015-06-29 23:57:41 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).

150629 23:57:44 mysqld_safe WSREP: Recovered position c8e1f7ef-0ce4-11e5-89a6-f3af952dc915:220947

2015-06-29 23:57:45 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.

2015-06-29 23:57:45 0 [Warning] Using unique option prefix thread_cache instead of thread_cache_size is deprecated and will be removed in a future release. Please use the full name instead.

2015-06-29 23:57:45 0 [Warning] option 'thread_cache_size': unsigned value 268435456 adjusted to 16384

2015-06-29 23:57:45 0 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.

2015-06-29 23:57:45 0 [Note] WSREP: wsrep_start_position var submitted: 'c8e1f7ef-0ce4-11e5-89a6-f3af952dc915:220947'

2015-06-29 23:57:45 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).


Then we started the node. Please can you update us, why this error came.

How to solve this so that in future this error not occur.

Sorry for taking your time.

Regards
Munazir
Reply all
Reply to author
Forward
0 new messages