Scaling out embedded debezium engine


Miko Duenas

Oct 10, 2018, 11:00:53 PM10/10/18
to debezium
Hi,

I was prototyping the embedded Debezium engine to stream changes from a single table to Kafka.  It works great for a single instance, but I'm wondering how the application can be scaled out horizontally, since both instances would receive the same events from their respective embedded engines.  Whitelisting/blacklisting is not an option since I only have a single table to monitor.  Is there a recommended approach to this problem?  Unfortunately I am unable to deploy connectors to our Kafka instance due to it being a strictly controlled environment.

Thanks,
Mik

Jiri Pechanec

Oct 11, 2018, 12:26:48 AM10/11/18
to debezium
Hi Mik,

I am not sure this is technically feasible, as it would defeat the purpose of in-order processing. If you scale horizontally, then you cannot guarantee the order of events. One question: are you doing any expensive calculations, or only forwarding messages to Kafka? If the latter, then with partitioning enabled you'll get horizontally scaled streaming out of the box.
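To illustrate what Jiri means by getting scale-out from partitioning: change events for a table row carry the row's primary key as the Kafka record key, and records with the same key always map to the same partition, so per-key ordering survives even though consumers of different partitions run in parallel. A minimal sketch of the invariant (a plain hash stands in for Kafka's actual murmur2 partitioner; names are illustrative):

```python
# Keyed partitioning sketch: identical keys always map to the same
# partition, so updates to one row stay ordered while different rows
# can be consumed in parallel from different partitions.
NUM_PARTITIONS = 6

def partition_for(key: bytes) -> int:
    # Deterministic stand-in for Kafka's murmur2(key) % num_partitions.
    return sum(key) % NUM_PARTITIONS

# Two updates to the same row carry the same key, so they land on the
# same partition and their relative order is preserved there.
updates = [b"row-42", b"row-7", b"row-42"]
partitions = [partition_for(k) for k in updates]
assert partitions[0] == partitions[2]
```

The downstream consumer group can then scale up to one consumer per partition without losing per-row ordering.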

J.

Gunnar Morling

Oct 11, 2018, 3:40:16 AM10/11/18
to debezium
Hi,

It'd be interesting to learn a bit more about what you'd like to achieve. Why are you trying to run two instances of the embedded engine capturing the same table? Is it to avoid a lag building up between changes in the source DB and events emitted by the embedded engine? If so, it'd be interesting to do some profiling to see where time is spent.

By definition, consuming the change events of a single table is a serial operation, so it cannot really be parallelised. Theoretically one might think of altering the connector so that it processes only a subset of keys, so e.g. one connector instance would read keys 1,3,5... and another one 2,4,6... . This might help to speed things up if processing the events from the DB and converting them into Kafka Connect records is the bottleneck. But you'd lose total order across all events (which might be acceptable, esp. if you intend to partition the topic anyway), and we'd also have two processes reading the server log. All in all I'm not sure it's worth the effort.
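The key-subset idea above is hypothetical (no such option exists in the connector), but its partitioning rule is simple to sketch: every instance still reads the full change stream, and instance i forwards only events whose key falls in its residue class. Total order across keys is lost; per-key order is kept:

```python
# Hypothetical key-subset filter for N engine instances. Each instance
# sees the whole stream but handles only keys congruent to its own id.
def accepts(instance_id: int, num_instances: int, key: int) -> bool:
    return key % num_instances == instance_id

keys = [1, 2, 3, 4, 5, 6]
instance_0 = [k for k in keys if accepts(0, 2, k)]  # even keys: 2, 4, 6
instance_1 = [k for k in keys if accepts(1, 2, k)]  # odd keys: 1, 3, 5

# Every key is handled by exactly one instance.
assert sorted(instance_0 + instance_1) == keys
```

Note the cost Gunnar points out holds regardless of the filter: both instances still read and decode the entire log, so this only helps if post-read processing is the bottleneck.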

But then I still might just not quite get what the exact problem here is.

--Gunnar

Jose Duenas

Oct 11, 2018, 8:17:08 AM10/11/18
to debe...@googlegroups.com
Hi Gunnar/Jiri,

Thank you for the response.  Let me further clarify my question.  

As I scale out my microservice horizontally, the embedded Debezium engines in each instance will all process the binlog.  As Jiri pointed out, this is not good from a message ordering and duplication perspective.  However, I am wondering what the ideal approach would be to make the embedded engines more fault tolerant in this setup.  Is it sound to designate a single active Debezium engine among the microservice instances and automatically designate a new active embedded engine in case the original microservice instance crashes?

Thanks,
Jose

Gunnar Morling

Oct 11, 2018, 9:20:15 AM10/11/18
to debezium
> Is it sound to designate a single active debezium engine among the microservices instances and automatically designate a new active embedded engine in case the original microservice instance crashes?

That sounds reasonable to me. In fact, you might even consider extracting a dedicated service just for running the embedded engine, of which you then run only a single instance. Provided of course you are on Kubernetes or similar, so you can detect if it crashes for some reason and automatically restart it. You might also find this blog post interesting:


It describes an approach where they essentially read the same binlog twice, using two connectors, and then have a deduplication component in place which filters out any repeated events. That way the overall availability is increased, as one instance will keep going even if the other fails. The price of course is increased complexity and duplicated binlog reading. It depends on your requirements whether a downtime of the connector until it's restarted is acceptable or not.
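The deduplication component in that approach boils down to tracking the highest binlog position already emitted and dropping anything at or below it. A minimal sketch of the idea (field names and the single merged stream are illustrative; a real setup would consume two live streams and use actual binlog coordinates):

```python
# Two redundant readers emit the same ordered change events, each
# tagged with its binlog position. The filter keeps an event only if
# its position advances past the last one seen, so repeats from the
# second reader are dropped.
class Deduplicator:
    def __init__(self):
        self.last_position = -1

    def accept(self, event: dict) -> bool:
        pos = event["position"]
        if pos <= self.last_position:
            return False  # already emitted by the other reader
        self.last_position = pos
        return True

dedup = Deduplicator()
stream_a = [{"position": 1, "op": "c"}, {"position": 2, "op": "u"}]
stream_b = [{"position": 1, "op": "c"}, {"position": 3, "op": "d"}]

merged = [e for e in stream_a + stream_b if dedup.accept(e)]
assert [e["position"] for e in merged] == [1, 2, 3]
```

The watermark itself would need to live somewhere durable (or in the downstream topic) so deduplication survives a restart of the filter.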

In any case I'm curious about what approach you'll end up using. In case you're interested to share your experiences e.g. in a post on the Debezium blog, let us know and we can set something up.

--Gunnar

Jiri Pechanec

Oct 11, 2018, 11:50:50 PM10/11/18
to debezium
Hi,

just do not forget that in the case of two instances (active/passive) you need shared storage for the offsets, and you need to guarantee exclusive access to it in case multiple instances are running in parallel. If you have a single instance that you just restart, there is no such requirement. The former approach has lower recovery latency, but the latter would be much easier to implement.
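To make the shared-storage requirement concrete, here is a small sketch of the active/passive handover: whichever instance holds an exclusive lock may read and advance the shared offset, while the standby is refused. The in-process flag below merely stands in for a real mutual-exclusion mechanism (a database row lock, `flock` on shared storage, a Kubernetes lease, etc.), and all names are illustrative:

```python
# Shared offset store guarded by exclusive access: only the active
# engine instance may commit offsets; on failover the standby acquires
# the lock and resumes from the last committed offset.
class SharedOffsetStore:
    def __init__(self):
        self._offset = 0
        self._owner = None  # instance currently holding exclusive access

    def try_acquire(self, instance_id: str) -> bool:
        if self._owner is None or self._owner == instance_id:
            self._owner = instance_id
            return True
        return False  # another instance is active; stay passive

    def release(self, instance_id: str) -> None:
        if self._owner == instance_id:
            self._owner = None  # standby may now take over

    def commit(self, instance_id: str, offset: int) -> None:
        assert self._owner == instance_id, "only the active instance commits"
        self._offset = offset

store = SharedOffsetStore()
assert store.try_acquire("engine-a")      # engine-a becomes active
assert not store.try_acquire("engine-b")  # engine-b stays passive
store.commit("engine-a", 42)
store.release("engine-a")                 # engine-a shuts down or crashes
assert store.try_acquire("engine-b")      # failover: engine-b resumes at 42
```

Note that with a real crash (no clean `release`) the lock must expire or be fenced somehow, which is exactly the complexity Jiri's single-restartable-instance alternative avoids.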

J.