trx conflict for key - LOCAL_CERT_FAILURES


JonK

unread,
May 25, 2018, 4:10:02 AM5/25/18
to codership
Hi, I'm getting LOCAL_CERT_FAILURES (LCF) errors in my MariaDB 10.1.32 + Galera 25.3.23(r3789) cluster:

2018-05-23 12:48:31 140270557226752 [Note] WSREP: trx conflict for key (1,FLAT8)859567db 91c26cb1: source: 377e9f36-5c04-11e8-8a20-77a19b389572 version: 3 local: 1 state: CERTIFYING flags: 1 conn_id: 2393076 trx_id: 95945523 seqnos (l: 215847, g: 22661545, s: 22661543, d: 22661525, ts: 35265156913652919) <--X--> source: 1bcd7fb6-5c07-11e8-89d0-ef9b351c12b0 version: 3 local: 0 state: COMMITTED flags: 1 conn_id: 2400360 trx_id: 75585097 seqnos (l: 215846, g: 22661544, s: 22661543, d: 22661543, ts: 35582694130759059)
2018-05-23 12:48:31 140270557226752 [Note] WSREP: cluster conflict due to certification failure for threads:
2018-05-23 12:48:31 140270557226752 [Note] WSREP: Victim thread:
   THD: 2393076, mode: local, state: executing, conflict: cert failure, seqno: 22661545
   SQL: commit
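The seqnos in the log line above tell the story: the victim last saw seqno d: 22661525, while the winner committed at g: 22661544, inside the victim's certification window, on an overlapping key. A toy sketch of Galera's first-committer-wins check (illustrative only, not the real implementation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WriteSet:
    seqno: int          # global seqno (g: in the log)
    depends_on: int     # last seqno seen at replication time (d: in the log)
    keys: frozenset     # keys the transaction modified

def certify(ws, committed):
    """First-committer-wins: ws fails certification if any write set that
    committed after ws.depends_on (but before ws itself) touched one of
    ws's keys."""
    for other in committed:
        if ws.depends_on < other.seqno < ws.seqno and ws.keys & other.keys:
            return False
    return True

# Numbers taken from the log line above, same conflicting key:
key = frozenset({"(1,FLAT8)859567db"})
winner = WriteSet(seqno=22661544, depends_on=22661543, keys=key)
victim = WriteSet(seqno=22661545, depends_on=22661525, keys=key)

print(certify(victim, [winner]))  # False: winner landed inside victim's window
```

A transaction that had already seen the winner's seqno (depends_on >= 22661544) would pass this check even with the same key.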


I have looked at the winner and victim transactions in the binary and general logs, and these are the statements that cause the error (two INSERTs with different values on the same table):

Winner: insert into appl_traza_aux_m (DS_PROCS_CREA_REG, FX_CREA_REG, CD_ACCION, ID_USU, DS_CRITICIDAD, DS_TRAZA, CD_LOGIN_EXTNO, IN_VISUA_CAU) values ('Creacion', '2018-05-24 13:53:39', 998, 12660, 'INFO', 'Logout realizado correctamente', 'X07984A', 1)
Victim: insert into appl_traza_aux_m (DS_PROCS_CREA_REG, FX_CREA_REG, CD_ACCION, ID_USU, DS_CRITICIDAD, DS_TRAZA, CD_LOGIN_EXTNO, IN_VISUA_CAU) values ('Creacion', '2018-05-24 13:53:39', 998, 56850, 'INFO', 'Logout realizado correctamente', 'S02326G', 1)
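Note that although the two INSERTs write entirely different rows, both reference the same FK parent row (cd_accion = 998). A toy model of how overlapping key sets can make otherwise-unrelated inserts collide during certification (the key-extraction scheme and ids here are invented for the example, not Galera's real key format):

```python
def write_set_keys(table, pk, fk_parents):
    """Hypothetical key extraction: one key for the new row's PK plus
    one key per FK parent row the insert references."""
    keys = {(table, "pk", pk)}
    for parent_table, parent_pk in fk_parents:
        keys.add((parent_table, "row", parent_pk))
    return keys

# Two inserts into appl_traza_aux_m with different (made-up) auto-inc ids,
# but both referencing cd_accion = 998 in appl_tipo_accion_t:
winner = write_set_keys("appl_traza_aux_m", 49181399, [("appl_tipo_accion_t", 998)])
victim = write_set_keys("appl_traza_aux_m", 49181400, [("appl_tipo_accion_t", 998)])

print(winner & victim)  # the shared FK parent key is the only overlap
```

If the conflict key in the log corresponds to that parent row, the two inserts can fail certification against each other even though their own rows never touch.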


CREATE TABLE `appl_traza_aux_m` (
  `id` bigint(12) NOT NULL AUTO_INCREMENT,
  `ds_criticidad` varchar(5) NOT NULL,
  `cd_accion` smallint(3) NOT NULL,
  `id_usu` int(8) DEFAULT NULL,
  `cd_login_extno` varchar(50) DEFAULT NULL,
  `ds_traza` text NOT NULL,
  `in_visua_cau` tinyint(4) NOT NULL DEFAULT '0',
  `fx_crea_reg` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `ds_procs_crea_reg` varchar(50) NOT NULL,
  `fx_modif_reg` timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
  `ds_procs_modif_reg` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `fk_traza_aux_m_tipo_accion` (`cd_accion`),
  CONSTRAINT `_fk_traza_aux_m_tipo_accion` FOREIGN KEY (`cd_accion`) REFERENCES `appl_tipo_accion_t` (`cd_accion`)
) ENGINE=InnoDB AUTO_INCREMENT=49181399 DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4



It seems it may be related to the AUTO_INCREMENT column.
innodb_autoinc_lock_mode=2 and wsrep_auto_increment_control=ON, but it looks as if, due to some error, the same AUTO_INCREMENT value is being generated on different nodes.
Is this a known error or bug?
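For reference, with wsrep_auto_increment_control=ON each node draws from its own arithmetic sequence (offset + n * increment), so the sequences are pairwise disjoint and duplicate ids should be impossible as long as the settings are actually in effect on every node. A quick simulation of the 5-node case:

```python
def node_ids(offset, increment, count):
    """AUTO_INCREMENT values one node hands out with
    auto_increment_offset=offset and auto_increment_increment=increment."""
    return [offset + i * increment for i in range(count)]

nodes = 5  # auto_increment_increment=5, offsets 1..5
sequences = {offset: node_ids(offset, nodes, 4) for offset in range(1, nodes + 1)}
for offset, ids in sequences.items():
    print(f"node {offset}: {ids}")

# The sequences never intersect, so two nodes cannot generate the same id.
all_ids = [i for ids in sequences.values() for i in ids]
assert len(all_ids) == len(set(all_ids))
```

So if a genuine duplicate-id collision were happening, it would point at mismatched increment/offset values on some node rather than at this mechanism itself.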

I have tried wsrep_slave_threads=1 and wsrep_slave_fk_checks=OFF, but the error persists.
Any ideas?

Thanks in advance,
Jon.

Wagner Bianchi

unread,
May 29, 2018, 9:03:52 AM5/29/18
to JonK, codership
Hello JonK,

How many nodes does your cluster have? What are the current values of auto_increment_increment and auto_increment_offset on all nodes? Are you able to isolate this case so you can run better insert tests? I also see probable slowness around the auto-inc, as InnoDB must work out the next id (effectively MAX(id)+1) and then append the row your application is sending to the table. The FK can hurt insert performance here as well; you can try removing the FK and checking performance again. About wsrep_slave_fk_checks: IIRC it applies to the Galera applier threads during certification, not to the InnoDB tables per se. I don't want to say "try with FOREIGN_KEY_CHECKS=0"; it's better to remove the FK, provided your application can enforce referential integrity itself and no ORM is in use.


--
Wagner Bianchi, RDBA Team Lead @ MariaDB RDBA Team
MariaDB Corporation | t: +55.31.98654.9510 | Skype: wbianchijr

--
You received this message because you are subscribed to the Google Groups "codership" group.
To unsubscribe from this group and stop receiving emails from it, send an email to codership-team+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

JonK

unread,
May 29, 2018, 2:43:34 PM5/29/18
to codership

Hello Wagner, here is my cluster:

nodes in cluster = 5
wsrep_auto_increment_control=ON, auto_increment_increment=5, auto_increment_offset=1-5

Yesterday I dropped the FK (wsrep_slave_fk_checks=OFF didn't help much) and the error disappeared. The table has about 50M rows, but there are few INSERTs per second, maybe 15 or 20.
I will not recreate the FK because I am going to partition the table, but I am worried about not being able to use FKs with Galera.


I will try to reproduce the error in another environment...

Thanks and regards,
Jon.

Wagner Bianchi

unread,
May 29, 2018, 6:33:36 PM5/29/18
to JonK, codership
Hello JonK, glad that worked.

I would say you *can* use FKs with Galera; the issue is that they bring some overhead from the internal row locking InnoDB must do to guarantee referential integrity. Most of the time, as in this scenario, you need to balance the alternatives you have at hand. You also need to be careful running without an FK, because a node can leave the cluster if the data become inconsistent. So make sure it's the best choice after analyzing it well.

If you have more different results when isolating the case and testing it, let us know, cheers!


--
Wagner Bianchi, RDBA Team Lead @ MariaDB RDBA Team
MariaDB Corporation | t: +55.31.98654.9510 | Skype: wbianchijr
