4-5 node outages because of "Slave SQL: Could not execute Write_rows event on table __TABLE_NAME__; Duplicate entry ..."

Rumen Palov

Jan 7, 2016, 3:49:15 AM
to codership
Hello all,

Over the last few weeks we have had 4-5 node outages because of "Could not execute Write_rows event on table __TABLE_NAME__ ; Duplicate entry '37399596-1' for key 'sc_session_recno', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 295, Error_code: 1062".

This happened sometimes on the node that accepts writes and sometimes on a node that applies them through the wsrep applier.

Below is an excerpt from the logs:

2015-12-21 18:44:56 36547 [ERROR] Slave SQL: Could not execute Write_rows event on table __TABLE_NAME__; Duplicate entry '37399596-1' for key 'sc_session_recno', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 295, Error_code: 1062
2015-12-21 18:44:56 36547 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 650613357
2015-12-21 18:44:56 36547 [Warning] WSREP: Failed to apply app buffer: seqno: 650613357, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
2015-12-21 18:44:56 36547 [ERROR] Slave SQL: Could not execute Write_rows event on table __TABLE_NAME__; Duplicate entry '37399596-1' for key 'sc_session_recno', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 295, Error_code: 1062
2015-12-21 18:44:56 36547 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 650613357
2015-12-21 18:44:56 36547 [Warning] WSREP: Failed to apply app buffer: seqno: 650613357, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
2015-12-21 18:44:56 36547 [ERROR] Slave SQL: Could not execute Write_rows event on table __TABLE_NAME__; Duplicate entry '37399596-1' for key 'sc_session_recno', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 295, Error_code: 1062
2015-12-21 18:44:56 36547 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 650613357
2015-12-21 18:44:56 36547 [Warning] WSREP: Failed to apply app buffer: seqno: 650613357, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
2015-12-21 18:44:56 36547 [ERROR] Slave SQL: Could not execute Write_rows event on table __TABLE_NAME__ ; Duplicate entry '37399596-1' for key 'sc_session_recno', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 295, Error_code: 1062
2015-12-21 18:44:56 36547 [Warning] WSREP: RBR event 3 Write_rows apply warning: 121, 650613357
2015-12-21 18:44:56 36547 [ERROR] WSREP: Failed to apply trx: source: a17173f3-1543-11e5-8c7e-b388f2f1afa6 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 238940589 trx_id: 4530496561 seqnos (l: 455444765, g: 650613357, s: 650613285, d: 650613160, ts: 16138176279975892)
2015-12-21 18:44:56 36547 [ERROR] WSREP: Failed to apply trx 650613357 4 times
2015-12-21 18:44:56 36547 [ERROR] WSREP: Node consistency compromized, aborting...
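
For anyone triaging a similar log, here is a minimal Python sketch that counts the duplicate-key applier errors and collects the sequence numbers of transactions that failed to apply. The regular expressions are written only against the message shapes visible in this excerpt; the function and variable names are my own, and other server versions may word these lines differently.

```python
import re
from collections import Counter

# Matches the applier error as it appears in the excerpt above.
# Group names (table, entry, key) are illustrative, not part of any MySQL/Galera API.
DUP_RE = re.compile(
    r"Could not execute Write_rows event on table (?P<table>[^\s;]+)\s*; "
    r"Duplicate entry '(?P<entry>[^']*)' for key '(?P<key>[^']*)'"
)

# Matches the final "Failed to apply trx <seqno> <n> times" line.
FAILED_TRX_RE = re.compile(
    r"WSREP: Failed to apply trx (?P<seqno>\d+) (?P<tries>\d+) times"
)


def summarize(log_text):
    """Return (duplicate-entry counts, list of failed seqnos) for a log excerpt."""
    dups = Counter(
        # strip() tolerates an entry value that was wrapped onto the next line
        (m.group("table"), m.group("key"), m.group("entry").strip())
        for m in DUP_RE.finditer(log_text)
    )
    failed = [int(m.group("seqno")) for m in FAILED_TRX_RE.finditer(log_text)]
    return dups, failed
```

Feeding it the contents of the error log gives one counter entry per (table, key, value) combination, which makes it easier to see whether every outage involves the same key or a different one each time.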



We are using FreeBSD 9.2 and "5.6.16-log MySQL Community Server (GPL), wsrep_25.5.r4064". It is a 3-node cluster, and mostly db2 accepts writes. The log above is from db1.

Does anyone have a clue what could cause this behavior?

Cheers
Rumen

Philip Stoev

Jan 7, 2016, 4:01:36 AM
to Rumen Palov, codersh...@googlegroups.com
Hello,

It seems to me that the version you are using is rather old; you may wish to
upgrade both the MySQL server and the Galera library, as issues related to
the error message you are getting have been fixed in the past.

If that does not help, you may wish to check your application for any of the
following:
* concurrent DDL happening on the table, especially under
wsrep_OSU_method=RSU
* triggers, foreign keys, complex stored procedures (e.g. handlers,
transactions or rollback within stored procedure) on the table in question
* use of wsrep_on=OFF and other dynamic changes of Galera configuration
while the cluster is running
* statements such as CREATE ... SELECT, REPLACE ... SELECT, TRUNCATE,
ADD/DROP PARTITION, etc.
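
As a rough aid for the statement-level items in the checklist above, the following Python sketch flags SQL text that matches those patterns, for example when scanning statements pulled from an application or a general query log. It is a heuristic under my own assumptions: the pattern table and the function name are hypothetical, it does not parse SQL (keywords inside string literals can trigger false positives), and it cannot detect triggers, foreign keys, or stored procedures.

```python
import re

# Labels follow the checklist above; the regexes are illustrative, not exhaustive.
RISKY_PATTERNS = {
    "CREATE ... SELECT": re.compile(
        r"^\s*CREATE\s+(TEMPORARY\s+)?TABLE\b.*\bSELECT\b", re.I | re.S),
    "REPLACE ... SELECT": re.compile(
        r"^\s*REPLACE\b.*\bSELECT\b", re.I | re.S),
    "TRUNCATE": re.compile(r"^\s*TRUNCATE\b", re.I),
    "ADD/DROP PARTITION": re.compile(
        r"^\s*ALTER\s+TABLE\b.*\b(ADD|DROP)\s+PARTITION\b", re.I | re.S),
    "wsrep_on=OFF": re.compile(
        r"^\s*SET\b.*\bwsrep_on\s*=\s*(OFF|0)\b", re.I | re.S),
    "wsrep_OSU_method=RSU": re.compile(
        r"^\s*SET\b.*\bwsrep_OSU_method\s*=\s*'?RSU'?", re.I | re.S),
}


def flag_statement(sql):
    """Return the checklist labels whose pattern matches this one SQL statement."""
    return [label for label, pattern in RISKY_PATTERNS.items() if pattern.search(sql)]
```

Running `flag_statement` over each statement and keeping the non-empty results gives a short list of candidates to review by hand against the table named in the error.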

If you need additional information in order to identify the offending query
or transaction, you can look at the GRA_*.log files in the data directory.
Each contains the binary log fragment from the transaction that failed to
apply.
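
To find the fragment that belongs to a given outage, it may help to list the GRA_*.log files ordered by modification time and compare them with the timestamp in the error log. The sketch below does only that; the data directory path is whatever your server uses, the function name is hypothetical, and decoding the file contents is left out.

```python
import glob
import os


def list_gra_logs(datadir):
    """Return (path, mtime, size) for each GRA_*.log in datadir, oldest first."""
    entries = []
    for path in glob.glob(os.path.join(datadir, "GRA_*.log")):
        st = os.stat(path)
        entries.append((path, st.st_mtime, st.st_size))
    # Sort by modification time so files line up with error-log timestamps.
    return sorted(entries, key=lambda entry: entry[1])
```

The last entries returned are the most recent failures, so the file written at the time of the "Failed to apply trx" message is the one to inspect first.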

Philip Stoev