Percona XtraDB cluster and Tungsten Replicator

Christian Candia

Apr 15, 2015, 4:45:42 PM

to codersh...@googlegroups.com

I have two Percona XtraDB clusters with 3 nodes each. Both clusters have the same database schema but serve different populations of users, so most tables hold different data. However, some tables contain configuration parameters that are updated regularly. These tables need to be kept in sync on both clusters, so I'm trying to use Tungsten Replicator to sync them between the clusters in a master-slave configuration.

When I configure Tungsten Replicator on the master (one of the nodes of the first cluster), the other nodes in the same cluster stop working. The logs on these servers show the following messages:

2015-04-14 19:25:52 7315 [ERROR] Slave SQL: Could not execute Update_rows event on table tungsten_test.heartbeat; Can't find record in 'heartbeat', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 259, Error_code: 1032

2015-04-14 19:25:52 7315 [Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 3013

2015-04-14 19:25:52 7315 [ERROR] WSREP: Failed to apply trx: source: 9c5c9da3-e2d4-11e4-a876-6e4d039de4c0 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 179 trx_id: 11048 seqnos (l: 4487, g: 3013, s: 3012, d: 3010, ts: 6855749659175)

2015-04-14 19:25:52 7315 [ERROR] WSREP: Failed to apply trx 3013 4 times

2015-04-14 19:25:52 7315 [ERROR] WSREP: Node consistency compromized, aborting...

I found a reference to this error message in an old presentation: http://www.percona.com/live/mysql-conference-2013/sites/default/files/slides/mysql-multi-master-state-of-art-2013-04-24_0.pdf, which mentions this is a Galera bug, but I can't find out whether the bug has been fixed already or whether there is a workaround.

Can anybody shed some light on the subject?

Thanks,

Christian

alexey.y...@galeracluster.com

Apr 15, 2015, 11:54:31 PM

to Christian Candia, codersh...@googlegroups.com

I don't think that anybody has looked in depth at what is really happening
there, and what exactly the issue is. You could start by comparing the
tables to see why the row that is present on one node is absent on the others,
what sort of SQL Tungsten uses to update the tables, and so on. I suspect
that statement-based binlogging is used there.
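
The table comparison suggested above can be sketched roughly as follows. This is only an illustration, not something from the thread: it assumes the rows of the heartbeat table have already been fetched from each node (for example with SELECT * FROM tungsten_test.heartbeat) and loaded into dictionaries keyed by the primary key. The function name diff_rows and the sample data are hypothetical. The binlog format in use can be checked separately on each node with SHOW VARIABLES LIKE 'binlog_format'.

```python
from typing import Dict, Hashable, List, Tuple

Row = Tuple  # one table row as a tuple of column values


def diff_rows(node_a: Dict[Hashable, Row],
              node_b: Dict[Hashable, Row]) -> Dict[str, List[Hashable]]:
    """Compare two row sets keyed by primary key and list the differences."""
    return {
        # keys that exist on node A but not on node B
        "missing_on_b": sorted(k for k in node_a if k not in node_b),
        # keys that exist on node B but not on node A
        "missing_on_a": sorted(k for k in node_b if k not in node_a),
        # keys present on both nodes whose column values differ
        "changed": sorted(k for k in node_a
                          if k in node_b and node_a[k] != node_b[k]),
    }


if __name__ == "__main__":
    # Hypothetical data: the heartbeat row exists on node 1 only.
    node1 = {1: (1, None, None, 1429205060, None, None, 0, None)}
    node2 = {}
    print(diff_rows(node1, node2))
```

A row listed under missing_on_b would be consistent with the HA_ERR_KEY_NOT_FOUND error: the node applying the Update_rows event has no row to update.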

Alexis Guajardo

Apr 16, 2015, 2:44:44 AM

to alexey.y...@galeracluster.com, Christian Candia, codersh...@googlegroups.com

What's the table structure for tungsten_test.heartbeat?

--
You received this message because you are subscribed to the Google Groups "codership" group.
To unsubscribe from this group and stop receiving emails from it, send an email to codership-team+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Christian Candia

Apr 16, 2015, 11:47:13 AM

to Alexis Guajardo, alexey.y...@galeracluster.com, codersh...@googlegroups.com

Here is the definition of the table:

CREATE TABLE `heartbeat` (
  `id` bigint(20) NOT NULL,
  `seqno` bigint(20) DEFAULT NULL,
  `eventid` varchar(128) DEFAULT NULL,
  `source_tstamp` timestamp NULL DEFAULT NULL,
  `target_tstamp` timestamp NULL DEFAULT NULL,
  `lag_millis` bigint(20) DEFAULT NULL,
  `salt` bigint(20) DEFAULT NULL,
  `name` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Thanks!

Christian

Alexis Guajardo

Apr 16, 2015, 12:07:57 PM

to Christian Candia, alexey.y...@galeracluster.com, codersh...@googlegroups.com

I would think that your inconsistencies are caused by an error applying the initial transaction, which could be due to a primary key conflict, since id relies on the application for uniqueness.

There are most likely GRA_*.log files in your data directories that contain the failed transactions (see http://www.percona.com/blog/2012/12/19/percona-xtradb-cluster-pxc-what-about-gra_-log-files/).

You can compare those events to your binlog events to see how and why the data is missing (see http://dev.mysql.com/doc/refman/5.6/en/mysqlbinlog-hexdump.htm).

Probably the 'fix' for this would be to add an auto_increment column (which requires changing the primary key as well); this should be transparent to the application.

Christian Candia

Apr 16, 2015, 1:59:16 PM

to Alexis Guajardo, alexey.y...@galeracluster.com, codersh...@googlegroups.com

The GRA file shows:

# at 202

#150415 15:20:02 server id 10 end_log_pos 170 Table_map: `tungsten_imedbono3`.`heartbeat` mapped to number 420
ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 69, event_type: 31
WARNING: The range of printed events ends with a row event or a table map event that does not have the STMT_END_F flag set. This might be because the last statement was not fully written to the log, or because you are using a --stop-position or --stop-datetime that refers to an event in the middle of a statement. The event(s) from the partial statement have not been written to output.

And the binlog on the master for the update event is:

# at 298
#150416 17:24:21 server id 10  end_log_pos 391 CRC32 0xa7fedeed
# Position  Timestamp   Type   Master ID   Size   Master Pos   Flags
#   12a 45 f0 2f 55   1f   0a 00 00 00   5d 00 00 00   87 01 00 00   00 00
#   13d a4 01 00 00 00 00 01 00  02 00 08 ff ff b6 01 00 |................|
#   14d 00 00 00 00 00 00 55 2f  f0 44 00 00 00 00 00 00 |......U..D......|
#   15d 00 00 36 01 00 00 00 00  00 00 00 55 2f f0 45 01 |..6........U..E.|
#   16d 00 00 00 00 00 00 00 0d  00 4d 41 53 54 45 52 5f |.........MASTER.|
#   17d 4f 4e 4c 49 4e 45 ed de  fe a7     |ONLINE....|
# Update_rows: table id 420 flags: STMT_END_F

'/*!*/;

### UPDATE `tungsten_test`.`heartbeat`
### WHERE
###   @1=1
###   @2=NULL
###   @3=NULL
###   @4=1429205060
###   @5=NULL
###   @6=NULL
###   @7=0
###   @8=NULL
### SET
###   @1=1
###   @2=NULL
###   @3=NULL
###   @4=1429205061
###   @5=NULL
###   @6=NULL
###   @7=1
###   @8='MASTER_ONLINE'

Thanks,

Christian
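
For readers following along, the decoded event above can be split into its before image (the WHERE part) and its after image (the SET part) to see which row the applying node has to find. The sketch below is an illustration only, not part of the original exchange. It assumes the "###" lines come from mysqlbinlog run with the --verbose option, and the names parse_update and changed_columns are hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches decoded column lines such as "###   @4=1429205060".
COLUMN = re.compile(r"^###\s+@(\d+)=(.*)$")

Image = Dict[int, str]  # column position -> value as printed by mysqlbinlog


def parse_update(lines: List[str]) -> Tuple[Image, Image]:
    """Split one decoded UPDATE event into its WHERE and SET images."""
    before: Image = {}
    after: Image = {}
    current: Optional[Image] = None
    for line in lines:
        text = line.strip()
        if text == "### WHERE":
            current = before
        elif text == "### SET":
            current = after
        else:
            match = COLUMN.match(text)
            if match and current is not None:
                current[int(match.group(1))] = match.group(2)
    return before, after


def changed_columns(before: Image, after: Image) -> Dict[int, Tuple[Optional[str], str]]:
    """Return the columns whose value differs between the two images."""
    return {i: (before.get(i), after[i])
            for i in sorted(after) if before.get(i) != after[i]}


if __name__ == "__main__":
    # Abbreviated copy of the event quoted in this message.
    event = """### UPDATE `tungsten_test`.`heartbeat`
### WHERE
###   @1=1
###   @4=1429205060
###   @7=0
###   @8=NULL
### SET
###   @1=1
###   @4=1429205061
###   @7=1
###   @8='MASTER_ONLINE'""".splitlines()
    before, after = parse_update(event)
    print("row to find:", before)
    print("changes:", changed_columns(before, after))
```

With row-based events, the applying node has to locate the row described by the before image. Error 1032 (HA_ERR_KEY_NOT_FOUND) means no such row was found there, so the row with @1=1 (the id column) is the one to look for on the failing nodes.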

Neil Armitage

Apr 16, 2015, 3:48:37 PM

to Christian Candia, Alexis Guajardo, alexey.y...@galeracluster.com, codersh...@googlegroups.com

I seem to remember that the CREATE TABLE for heartbeat doesn't get replicated when the table is created (binary logging is switched off in that session, I think), but the inserts/updates do.

It's been a while since I looked at it, but there was a reason for it within the replicator.

suppor...@gmail.com

Oct 19, 2015, 2:28:04 PM

to codership

I am facing the same problem. During the Tungsten Replicator installation, the tungsten_* database was created and synchronized without any errors. Immediately after the replicator service started, I got:

"[ERROR] Slave SQL: Could not execute Update_rows event on table tungsten_test.heartbeat; Can't find record in 'heartbeat', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 259, Error_code: 1032".

Did you manage to find the reason for this error?
