Running a second instance of debezium on the same database and kafka cluster


Vera van Mondfrans

May 6, 2021, 5:28:19 AM
to debezium

Hi,

I'm trying to get to a situation where Debezium can take a snapshot without putting too much load on my normal consumers.

My idea was to have a second instance of the mysql-connector create a snapshot and push it into a different set of topics than the normal connector uses, so that two sets of consumers process data at the same time. This way changes are processed by the normal consumers without the delay caused by the snapshot data.

For context, processing a full snapshot takes our system nearly 3 days (we have a lot of data...).

So far I can get the second instance of Debezium to run within my Strimzi Connect cluster, and I also got it to run in a different cluster. But with either solution I'm finding that Debezium copies the binlog offsets from the instance I already had running, and doesn't start a new snapshot.

I'd like suggestions on how to proceed: how can I have Debezium take a full snapshot in parallel to my normal processes?

jiri.p...@gmail.com

May 6, 2021, 11:29:53 AM
to debezium
Hi,


Second, what you are proposing is fine, but I'd recommend that you use a different connector name. That way there would be no issue with offsets.
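
To make that concrete, here is a rough sketch (not taken from this thread) of how a second connector definition could be derived from the first. It assumes that Kafka Connect stores source offsets under the connector name, which is why a new name should start from a fresh snapshot. It also assumes the second connector needs its own `database.server.id`, `database.server.name` and `database.history.kafka.topic`. All host names, IDs and topic names below are made up, and `snapshot_connector` is a hypothetical helper.

```python
import json

# Hypothetical definition of the connector that is already streaming.
# Every host name, ID and topic name here is invented for illustration.
existing = {
    "name": "inventory-connector",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "dbz",
        "database.server.id": "184054",
        "database.server.name": "inventory",
        "database.history.kafka.bootstrap.servers": "kafka:9092",
        "database.history.kafka.topic": "schema-changes.inventory",
    },
}


def snapshot_connector(base, suffix, server_id):
    """Return a second connector definition that shares no identity with `base`."""
    cfg = dict(base["config"])
    # Assumption: each connector must present a unique ID to MySQL.
    cfg["database.server.id"] = str(server_id)
    # The logical server name is also the topic prefix, so this yields new topics.
    cfg["database.server.name"] = f'{cfg["database.server.name"]}-{suffix}'
    # Assumption: the schema history topic should not be shared between connectors.
    cfg["database.history.kafka.topic"] = f'{cfg["database.history.kafka.topic"]}-{suffix}'
    # "initial" snapshots only when no offsets exist for this connector name.
    cfg["snapshot.mode"] = "initial"
    # A new connector name means no previously stored offsets are picked up.
    return {"name": f'{base["name"]}-{suffix}', "config": cfg}


second = snapshot_connector(existing, "resync", 184055)
print(json.dumps(second, indent=2))
```

The printed JSON has the shape the Kafka Connect REST API accepts for creating a connector. With Strimzi the same key/value pairs would go into a `KafkaConnector` resource instead.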

J.

Vera van Mondfrans

May 7, 2021, 7:27:38 AM
to debe...@googlegroups.com, Filip de Waard
Hi Jiri,

Thanks for the response! Having looked at the signaling, this seems really useful for our resynchronization when something goes wrong.

I was wondering if there's also a way to signal that a snapshot should be sent to a different topic than the normal messages. Being able to have multiple consumers run our sync in parallel is essential for managing our performance. Our bottleneck right now isn't Debezium, but rather our consumers, which need too much time per message to keep up, so we end up with a lag of millions of messages.

If not, having two instances of Debezium running will just be the way to go :)

Regards,
Vera


jiri.p...@gmail.com

May 10, 2021, 1:52:28 AM
to debezium
Hi,

You can't, but I can imagine you applying something like https://debezium.io/documentation/reference/1.5/configuration/content-based-routing.html to route to a different set of topics based on snapshot/streaming origin.
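
As an untested sketch of what that routing could look like, the snippet below adds the content-based router to a connector config. The Groovy expression is an assumption: it relies on the `source.snapshot` field being `'true'` or `'last'` for snapshot records, and on a `null` result leaving the record on its original topic. Records without a `source` block (heartbeats, for instance) would probably need extra guarding. The `.snapshot` topic suffix and the `with_snapshot_routing` helper are hypothetical, and the scripting SMTs need a Groovy JSR 223 implementation on the Connect plugin path.

```python
import json

# Assumed Groovy expression: send snapshot records to "<original topic>.snapshot"
# and return null for everything else (null = keep the original topic).
TOPIC_EXPRESSION = (
    "(value.source.snapshot == 'true' || value.source.snapshot == 'last')"
    " ? topic + '.snapshot' : null"
)


def with_snapshot_routing(config):
    """Return a copy of a connector config with the content-based router added.

    Assumes the config does not define any other transforms yet.
    """
    routed = dict(config)
    routed["transforms"] = "route"
    routed["transforms.route.type"] = "io.debezium.transforms.ContentBasedRouter"
    routed["transforms.route.language"] = "jsr223.groovy"
    routed["transforms.route.topic.expression"] = TOPIC_EXPRESSION
    return routed


# Hypothetical minimal connector config, only to show the result.
base_config = {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.server.name": "inventory",
}
routed_config = with_snapshot_routing(base_config)
print(json.dumps(routed_config, indent=2))
```

With this in place a single connector would write snapshot rows and streamed changes to separate topics, so each could get its own consumer group.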

J.
