Replicating multiple MySQL databases using Debezium

Eder F. Freitas

Jan 11, 2018, 6:17:30 AM
to debezium
Hi,

I would like to know whether it is possible to use the Debezium and JDBC sink connectors for database replication without using native DB replication techniques. For example, I have two identical databases (same structure and data) running on different MySQL servers, and I insert into two different tables with an FK constraint between them: Order and OrderItems. On the source, I must insert the Order first and then the OrderItems. How will this be executed by the JDBC sink connector? Will it execute the parent inserts first and then the children?
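To make the scenario concrete, here is a sketch of the schema in question (table and column names are assumed, not taken from any real setup):

```sql
-- Sketch of the FK scenario (names are hypothetical):
CREATE TABLE `Order` (
    id INT PRIMARY KEY
);
CREATE TABLE OrderItems (
    id       INT PRIMARY KEY,
    order_id INT,
    FOREIGN KEY (order_id) REFERENCES `Order` (id)
);

-- On the source, the parent row must exist before the child row:
INSERT INTO `Order` VALUES (1);
INSERT INTO OrderItems VALUES (1, 1);  -- fails with an FK error if Order 1 is missing
```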

Since Debezium doesn't guarantee ordering across tables (these are two different tables, so the events are sent to different Kafka topics), the JDBC sink connector might execute them in a different order, which would result in a foreign key violation.

What do you think about my situation: is it a good idea to use Debezium and the JDBC sink connector for replication? Is it safe? Or should I consider another solution?

René Kerner

Jan 11, 2018, 11:18:25 AM
to debezium

Hi Eder!

First:

Don't use foreign keys... ;-)
The era of DBAs and of building your business rules inside the DB may be over, at least for big installations.

If you still need and/or want to use them, you have some options:
- use a single topic with a single partition, with the messages for all tables in that topic (if your load is light enough that you can afford FKs, it won't hurt Kafka either); everything is then strictly ordered
- use "SET foreign_key_checks = 0;" on the sink side to skip FK checks for your JDBC connection(s); compare https://dev.mysql.com/doc/refman/5.7/en/create-table-foreign-keys.html, section "Foreign Keys and Other MySQL Statements"
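For reference, disabling the checks per session looks like this. This is only a sketch: `foreign_key_checks` is a session variable, so it affects only the connection that sets it, and with MySQL Connector/J it could presumably also be set via the JDBC URL's `sessionVariables` parameter.

```sql
-- Sketch: run once per sink-side JDBC session before applying change events.
SET foreign_key_checks = 0;
-- ... apply INSERTs/UPDATEs/DELETEs in whatever order the events arrive ...
SET foreign_key_checks = 1;  -- optionally re-enable once the backlog is applied
```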

I hope that answer helps.

Best regards :)

Gunnar Morling

Jan 11, 2018, 12:23:31 PM
to debezium
Hey,


On Thursday, January 11, 2018 at 5:18:25 PM UTC+1, René Kerner wrote:

Hi Eder!

First:

Don't use foreign keys... ;-)
The era of DBAs and of building your business rules inside the DB may be over, at least for big installations.

I'll have to disagree on that one. At least in an operational DB, FKs are very helpful for ensuring data integrity on a fundamental level; they shouldn't be dismissed easily. That being said, honoring them isn't easy on the sink side in a streaming scenario.

If you still need and/or want to use them, you have some options:
- use a single topic with a single partition, with the messages for all tables in that topic (if your load is light enough that you can afford FKs, it won't hurt Kafka either); everything is then strictly ordered

Note that this will only work if the data was written to the binlog in the "correct" order, which is not necessarily the case if the source DB uses deferred FK checking (at TX commit), which I think is possible on Postgres, for instance.
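To illustrate the deferred-checking case: on Postgres, a `DEFERRABLE` constraint is checked at commit time, so within a transaction a child row may legitimately precede its parent (and thus appear first in the change stream). A sketch with hypothetical table names:

```sql
-- Postgres sketch: FK check is deferred until COMMIT.
CREATE TABLE orders (id int PRIMARY KEY);
CREATE TABLE order_items (
    id       int PRIMARY KEY,
    order_id int REFERENCES orders (id) DEFERRABLE INITIALLY DEFERRED
);

BEGIN;
INSERT INTO order_items VALUES (1, 1);  -- parent row doesn't exist yet
INSERT INTO orders VALUES (1);
COMMIT;                                 -- the FK check passes only here
```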
 
- use "SET foreign_key_checks = 0;" on the sink side to skip FK checks for your JDBC connection(s); compare https://dev.mysql.com/doc/refman/5.7/en/create-table-foreign-keys.html, section "Foreign Keys and Other MySQL Statements"

Doesn't this counter the OP's requirement, though? I'd only see this working if changes happen in intervals, so you can turn FK checking back on on the sink side after all pending change events have been consumed.

That said, we talked at some point about allowing users to define an order of tables during the snapshotting phase. This would let a user select an order which matches the FK relationships (leaving circular relationships out of the picture), or we could even analyze the relationships automatically and snapshot the tables in order. This won't help, though, when events from the binlog are processed after the initial snapshot. I still think that ordering feature would make sense for snapshotting, but we haven't gotten there yet. Any help is welcome, of course :)

--Gunnar

Eder F. Freitas

Jan 11, 2018, 1:05:47 PM
to debezium
Hi guys,

Thank you for the tips.

How can I configure Debezium to write the events to a single partition? I'm going to test René's recommendations :D

Jiri Pechanec

Jan 11, 2018, 10:08:07 PM
to debezium
Hi,

this is a little bit complicated, as we do not provide a custom naming strategy for topics. You can use an SMT that routes the events to a single topic. For this you can either write your own SMT or abuse RegexRouter (https://cwiki.apache.org/confluence/display/KAFKA/KIP-66%3A+Single+Message+Transforms+for+Kafka+Connect) in such a way that the result is a constant topic name.
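A RegexRouter configuration along these lines should do it. This is a sketch: the transform alias `route` and the target topic name `all-changes` are made up, and the target topic would need to be created with a single partition to get strict ordering.

```properties
# Hypothetical excerpt from the Debezium source connector configuration:
# match every topic name and replace it with a constant, so all change
# events land in one topic.
transforms=route
transforms.route.type=org.apache.kafka.connect.transforms.RegexRouter
transforms.route.regex=.*
transforms.route.replacement=all-changes
```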

J.

Gunnar Morling

Jan 12, 2018, 3:42:57 AM
to debezium
Our own topic router SMT should work, too: http://debezium.io/docs/configuration/topic-routing/.
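Configured analogously to the RegexRouter variant, that could look something like the sketch below. The SMT class and option names are taken from the linked docs as I recall them, and `all-changes` is again a made-up topic name; note that Debezium's router may also add a discriminating field to the record key so rows from different physical tables don't collide, so check the docs before relying on this.

```properties
# Sketch using Debezium's own topic-routing SMT (names assumed from the docs above):
transforms=route
transforms.route.type=io.debezium.transforms.ByLogicalTableRouter
transforms.route.topic.regex=.*
transforms.route.topic.replacement=all-changes
```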