ERROR 1213 (40001) at line 1: Deadlock found when trying to get lock; try restarting transaction
#/bin/bash
# Demonstrate Galera deadlock
# Parameters: a PK value. Use different values for different clients
# How to reproduce:
# 1. Create database and the tables:
# create database demoApp;
# create table demoApp.cnf_p(pk int primary key);
# create table demoApp.cnf_c(pk int primary key, ppk int not null, constraint cnf_c_fk foreign key(ppk) references cnf_p(pk));
# 2. Run the script on different machines with different parameter.
pk=$1
if [ -z "$pk" ]; then echo "Give me a primary key (any integer)"; exit 2; fi
schema="demoApp"
mysql_exec="mysql -u root -p*** -A -Bse" # !!! change this to yours
ins_p="insert into $schema.cnf_p(pk) values($pk)"
ins_c="insert into $schema.cnf_c(pk, ppk) values($pk+1, $pk)"
del_p="delete from $schema.cnf_p where pk = $pk"
del_c="delete from $schema.cnf_c where pk = $pk+1"
ins_sql="set autocommit=off; $ins_p; $ins_c; commit;"
del_sql="set autocommit=off; $del_c; $del_p; commit;"
echo "Insert: $ins_sql"
echo "Delete: $del_sql"
while true; do
$($mysql_exec "$del_sql")
$($mysql_exec "$ins_sql")
done
*** Priority TRANSACTION:
TRANSACTION 8938085, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
3 lock struct(s), heap size 1184, 1 row lock(s), undo log entries 2
MySQL thread id 1, OS thread handle 0x7fbcfa845700, query id 1869 Write_rows_log_event::write_row(5334304)
*** Victim TRANSACTION:
TRANSACTION 8938084, ACTIVE (PREPARED) 0 sec flushing log
9 lock struct(s), heap size 1184, 6 row lock(s), undo log entries 2
MySQL thread id 101, OS thread handle 0x7fbcfa75e700, query id 1867 localhost root init
commit
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 515 page no 4 n bits 72 index `cnf_c_fk` of table `demoApp`.`cnf_c` trx table locks 3 total table locks 2 trx id 8938084 lock mode S lock hold time 0 wait time before grant 0
2016-05-13 14:45:03 140449633556224 [Note] WSREP: cluster conflict due to high priority abort for threads:
2016-05-13 14:45:03 140449633556224 [Note] WSREP: Winning thread:
THD: 1, mode: applier, state: executing, conflict: no conflict, seqno: 5334304
SQL: (null)
2016-05-13 14:45:03 140449633556224 [Note] WSREP: Victim thread:
THD: 101, mode: local commit, state: executing, conflict: no conflict, seqno: 5334303
SQL: commit
2016-05-13 14:45:03 140449633556224 [Note] WSREP: BF kill (1, seqno: 5334304), victim: (101) trx: 8938084
2016-05-13 14:45:03 140449633556224 [Note] WSREP: Aborting query: commit
2016-05-13 14:45:03 140449633556224 [Note] WSREP: withdraw for BF trx: 8938084, state: 0
2016-05-13 14:45:03 140449633556224 [Note] WSREP: kill trx QUERY_EXEC for 8938084
2016-05-13 14:45:03 140449633556224 [Note] WSREP: kill query for: 101
2016-05-13 14:45:03 140449632610048 [Note] WSREP: client rollback due to BF abort for (101), query: commit
2016-05-13 14:45:03 140449632610048 [Note] WSREP: abort in exec query state, avoiding autocommit
2016-05-13 14:45:03 140449632610048 [Note] WSREP: BF Aborted, thd: 101 is_AC: 0, retry: 0 - 1 SQL: commit
On Monday, May 16, 2016 at 8:52:02 PM UTC+3, Alexei Olkhovskii wrote:Hi,Do you guys have an idea why this is happening?Below is a simple script to delete/insert into parent/child tables. When I run two copies of the script (each instance working on its own set of values), the operation pretty often fails with:ERROR 1213 (40001) at line 1: Deadlock found when trying to get lock; try restarting transaction
In general, Galera exercises rather safe logic when checking for potential conflicts in FK parent table access. This often causes questions, and we have worked out a design for more optimal certification procedure for FK parent table access.
In your test case specifically, if test is run with separate pk value pairs, then the conflicts will be caused by innodb GAP locks, which DELETE statement gains. If you populate more rows in the table and run the test with non adjacent pk pairs, then no conflicts should happen.
Thanks for test case and nice script!
start transaction;
delete from demoApp.cnf_c where pk=20;
delete from demoApp.cnf_p where pk=20; -- places gap lock on the FK index interval pk < 20
start transaction;
insert into demoApp.cnf_p values(10);
insert into demoApp.cnf_c values(10, 10); -- blocks on the gap lock
The original lock was:
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 515 page no 4 n bits 72 index `cnf_c_fk` of table `demoApp`.`cnf_c` trx table locks 3 total table locks 2 trx id 8938084 lock mode S lock hold time 0 wait time before grant 0
Some bit of a mystery is remaining.