Using embedded Debezium in production


Ori Popowski

Aug 6, 2018, 2:44:34 AM
to debezium
Hi,

I understand that an embedded Debezium does not have the same
guarantees as a Kafka Connect deployment. However, as I see it,
if I run an embedded Debezium on Kubernetes with offset storage
on an EBS volume and a very tight healthcheck, then in the worst
case there will be a few moments of downtime and some duplicated
messages. Is that correct?

Can the above configuration handle the throughput and speed
of a medium-traffic MySQL database?

If not, is it possible to scale out Debezium and have several
workers reading from the same binlog?

Does embedded Debezium expose a replication lag metric for
PostgreSQL and MySQL? This is the most essential metric as I
see it.

Thanks!

Gunnar Morling

Aug 7, 2018, 3:37:11 AM
to debezium
Hi Ori,


On Monday, August 6, 2018 at 08:44:34 UTC+2, Ori Popowski wrote:
Hi,

I understand that an embedded Debezium does not have the same
guarantees as a Kafka Connect deployment. However, as I see it,
if I run an embedded Debezium on Kubernetes with offset storage
on an EBS volume and a very tight healthcheck, then in the worst
case there will be a few moments of downtime and some duplicated
messages. Is that correct?

Yes, essentially you must make sure to store processed offsets reliably and be prepared to re-read some events after a restart (which could happen with a Kafka Connect-style deployment, too, though).
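For reference, here's a minimal sketch of such a setup using the embedded engine (0.8-era EmbeddedEngine API) with Kafka Connect's FileOffsetBackingStore pointing at the EBS-backed volume. Hostnames, paths and credentials are made up for illustration:

import java.util.concurrent.Executors;
import org.apache.kafka.connect.source.SourceRecord;
import io.debezium.config.Configuration;
import io.debezium.embedded.EmbeddedEngine;

public class EmbeddedExample {
    public static void main(String[] args) {
        Configuration config = Configuration.create()
                .with("name", "my-mysql-engine")
                .with("connector.class", "io.debezium.connector.mysql.MySqlConnector")
                // offsets go to a file on the EBS-backed volume and are flushed periodically
                .with("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore")
                .with("offset.storage.file.filename", "/data/offsets.dat")
                .with("offset.flush.interval.ms", 10000)
                .with("database.hostname", "mysql.example.com")
                .with("database.port", 3306)
                .with("database.user", "debezium")
                .with("database.password", "dbz")
                .with("database.server.id", 85744)
                .with("database.server.name", "my-app-db")
                // the MySQL connector also needs a database history store
                .with("database.history", "io.debezium.relational.history.FileDatabaseHistory")
                .with("database.history.file.filename", "/data/dbhistory.dat")
                .build();

        EmbeddedEngine engine = EmbeddedEngine.create()
                .using(config)
                .notifying((SourceRecord record) -> {
                    // hand each change event to your sink here
                    System.out.println(record);
                })
                .build();

        Executors.newSingleThreadExecutor().execute(engine);
    }
}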

Can the above configuration handle the throughput and speed
of a medium-traffic MySQL database?

Yes, this should work. There should be no real difference from the non-embedded performance, really. Unless your offset storage is super slow, of course :)

If not, is it possible to scale out Debezium and have several
workers reading from the same binlog?

I'd start with a single instance and see whether you run into any issues.

Does embedded Debezium expose a replication lag metric for
PostgreSQL and MySQL? This is the most essential metric as I
see it.

There's monitoring (via JMX) for the MySQL connector only atm: https://debezium.io/docs/connectors/mysql/#monitoring. It exposes a "SecondsBehindMaster" parameter which represents the event lag. It's high up on the agenda to provide equivalent monitoring for the other connectors, too (in fact it should be one of the next things we get to once the two new connectors, Oracle and SQL Server, have stabilized a bit).
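As a rough illustration, one way to read that attribute in-process via JMX (the MBean name follows the pattern from the monitoring docs; "my-app-db" stands in for whatever database.server.name you configured):

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class LagCheck {
    public static void main(String[] args) throws Exception {
        MBeanServer mbeans = ManagementFactory.getPlatformMBeanServer();
        // binlog reader metrics of the MySQL connector; "my-app-db" is the
        // database.server.name from the connector configuration (illustrative)
        ObjectName binlogMetrics = new ObjectName(
                "debezium.mysql:type=connector-metrics,context=binlog,server=my-app-db");
        Number lag = (Number) mbeans.getAttribute(binlogMetrics, "SecondsBehindMaster");
        System.out.println("SecondsBehindMaster = " + lag);
    }
}

With the embedded engine this runs in the same JVM as the connector; for an external check you'd attach via a remote JMX connection instead.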
 

Thanks!

Hth,

--Gunnar
 

Ori Popowski

Aug 7, 2018, 3:54:43 AM
to debezium

Thanks very much for the detailed response :)

I'd start with a single instance and see whether you run into any issues.

The real question is whether multiple consumers can read from different offsets simultaneously, without reading the same events (something like one reading the odd offsets and the other the even ones; sorry, I'm not familiar with the MySQL binlog).

There's monitoring (via JMX) for the MySQL connector only atm: https://debezium.io/docs/connectors/mysql/#monitoring. It exposes a "SecondsBehindMaster" parameter which represents the event lag. It's high up on the agenda to provide equivalent monitoring for the other connectors, too (in fact it should be one of the next things we get to once the two new connectors, Oracle and SQL Server, have stabilized a bit).

I think there's a problem with the SecondsBehindMaster parameter:

Gunnar Morling

Aug 7, 2018, 4:22:26 AM
to debezium


On Tuesday, August 7, 2018 at 09:54:43 UTC+2, Ori Popowski wrote:

Thanks very much for the detailed response :)

I'd start with a single instance and see whether you run into any issues.

 
The real question is whether multiple consumers can read from different offsets simultaneously, without reading the same events (something like one reading the odd offsets and the other the even ones; sorry, I'm not familiar with the MySQL binlog).

What you can do is have multiple instances that read the events from different tables by means of whitelist/blacklist filters (e.g. instance 1 reads tables A and B and instance 2 reads C and D). This filtering is done on the client side in the case of the MySQL connector, though, i.e. both instances would still receive all events from the binlog but then only process the subset of events they are interested in.

So this will provide a bit of load distribution amongst the connectors (as events not matching the filters are discarded early on, and all the conversion logic into Kafka events is skipped), but it'll mean there are two connections to the DB reading the binlog. I can't really comment on the impact of this on the server, but I know multiple Debezium users work with multiple connector instances connected to the same DB that way without problems.
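To make that concrete, a sketch of how two instances could split the tables, assuming the 0.8-era table.whitelist property with fully-qualified <db>.<table> names (the table names are invented for illustration):

// instance 1: only processes tables A and B
Configuration instance1 = Configuration.create()
        .with("connector.class", "io.debezium.connector.mysql.MySqlConnector")
        .with("database.server.name", "my-app-db")
        .with("database.server.id", 85744)   // each binlog client needs its own server id
        .with("table.whitelist", "inventory.table_a,inventory.table_b")
        // ... hostname, credentials, offset/history storage as before ...
        .build();

// instance 2: only processes tables C and D
Configuration instance2 = Configuration.create()
        .with("connector.class", "io.debezium.connector.mysql.MySqlConnector")
        .with("database.server.name", "my-app-db")
        .with("database.server.id", 85745)
        .with("table.whitelist", "inventory.table_c,inventory.table_d")
        // ... remaining settings as before, with separate offset/history files ...
        .build();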
 
There's monitoring (via JMX) for the MySQL connector only atm: https://debezium.io/docs/connectors/mysql/#monitoring. It exposes a "SecondsBehindMaster" parameter which represents the event lag. It's high up on the agenda to provide equivalent monitoring for the other connectors, too (in fact it should be one of the next things we get to once the two new connectors, Oracle and SQL Server, have stabilized a bit).

I think there's a problem with the SecondsBehindMaster parameter:

Interesting, let's see what Shyiko says about this. To me it seems this metric is designed in a way that it's only meaningful if there has been at least one event after (re-)starting the log reader. It's surprising, though, that you see an actual value in this very case; I'd have expected it to be -1 until the first event has been processed.