SQLServer Debezium Connector Latency

Niels Berglund

Jul 11, 2021, 11:56:02 AM
to debe...@googlegroups.com
Hi all!

At the company I am working for, we have started looking at CDC in MS SQL Server and Debezium for Outbox-style publishing of events to Kafka.

We run highly transactional databases in hosting locations all over the world, and we want to be able to produce near real-time events based on transactions in the db. Right now we are using a home-grown solution based on SQLCLR, but for various reasons we want to move away from that.

That is where Debezium comes into play. The problem we are running into is that we see an almost constant latency of ~2 seconds between an "event" appearing in an underlying CDC change table and it being available in the Kafka topic. This is on a local machine with Kafka and Debezium running in a Docker Container, and SQL Server (2019) installed on the local machine. So, I assume network "stuff" should not come into play.

We are aware that MS SQL CDC has quite a few settings that impact performance, but regardless of settings we see this 2-second delay. When publishing directly from a Kafka client to a topic, the event appears within milliseconds.

So my question is: what "knobs and levers" can be turned on the Debezium connector to lower the latency? It doesn't seem to matter what "poll.interval.ms" is set to in the connector configuration; the latency sits at around 2 seconds. We are running Confluent Kafka 6.2 and the Debezium SQL Server connector 1.7.0.

If anyone has any ideas ...

Thanks!

Niels



Chris Cranford

Jul 13, 2021, 10:35:24 AM
to debe...@googlegroups.com, Niels Berglund
Hi Niels -

The Debezium SQL Server connector does use a polling mechanism to get changes from the CDC tables in the database, but this polling mechanism should be quite fast. Have you checked and measured the time between the DML event in the database and when the event appears in the SQL Server CDC tables? Could that be where this 2-second latency comes into play?
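
As a rough way to see where the time goes, the sketch below splits the end-to-end latency of a single change event into stages. It assumes the consumer sees the raw Debezium envelope (no SMT stripping it), whose `ts_ms` field is the time the connector processed the event and whose `source.ts_ms` field is the time of the change in the database. The function name `latency_breakdown` and the sample timestamps are made up for illustration. Note that the first stage includes the SQL Server CDC capture delay as well as the connector's polling, and that all numbers are only meaningful if the clocks involved agree.

```python
def latency_breakdown(envelope: dict, kafka_timestamp_ms: int) -> dict:
    """Split the latency of one change event into stages (milliseconds).

    envelope           -- the Debezium change event value, assumed to carry
                          `ts_ms` and `source.ts_ms` (epoch milliseconds)
    kafka_timestamp_ms -- the timestamp of the Kafka record (epoch milliseconds)
    """
    source_ts = envelope["source"]["ts_ms"]  # change time reported by the database
    connector_ts = envelope["ts_ms"]         # time the connector processed the event
    return {
        # Includes CDC capture delay plus the connector's polling delay.
        "db_to_connector_ms": connector_ts - source_ts,
        # Time from connector processing to the Kafka record timestamp.
        "connector_to_kafka_ms": kafka_timestamp_ms - connector_ts,
        "total_ms": kafka_timestamp_ms - source_ts,
    }


if __name__ == "__main__":
    # Invented sample values, not measurements from this thread.
    sample = {"ts_ms": 1626191681300, "source": {"ts_ms": 1626191681000}}
    print(latency_breakdown(sample, kafka_timestamp_ms=1626191681315))
```

If `db_to_connector_ms` dominates, the delay is on the CDC/polling side; if `connector_to_kafka_ms` dominates, it is on the producer side.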

CC

Niels Berglund

Jul 13, 2021, 11:54:41 AM
to Chris Cranford, debe...@googlegroups.com
Hi Chris!

Thanks for your answer! I have managed to get the latency between the DML event in the CDC-enabled table and when it appears in the system CDC table down to ~300 ms. What confuses me is that the time between when it appears in the CDC table and when it gets timestamped in Kafka is consistently around 2 seconds.

Thanks!

Niels

Gunnar Morling

Jul 13, 2021, 11:58:17 AM
to debezium
You could try reducing the poll.interval.ms setting (https://debezium.io/documentation/reference/connectors/sqlserver#sqlserver-property-poll-interval-ms). You should experiment a bit with the setting to find out what polling interval you can go for without consuming too many resources.
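
For experimenting, a minimal sketch of a connector registration payload with an explicit `poll.interval.ms` might look like the following. The property names are those of the Debezium 1.x SQL Server connector as far as I know; the connector name, host, credentials, database, table, and topic values are placeholders, and the helper `build_connector_config` is hypothetical.

```python
import json


def build_connector_config(poll_interval_ms: int) -> dict:
    """Build a Kafka Connect registration payload with the given poll interval."""
    if poll_interval_ms <= 0:
        raise ValueError("poll.interval.ms must be a positive integer")
    return {
        "name": "outbox-connector",  # placeholder connector name
        "config": {
            "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
            # All connection values below are placeholders.
            "database.hostname": "localhost",
            "database.port": "1433",
            "database.user": "debezium",
            "database.password": "change-me",
            "database.dbname": "testdb",
            "database.server.name": "server1",
            "table.include.list": "dbo.outbox",
            "database.history.kafka.bootstrap.servers": "localhost:9092",
            "database.history.kafka.topic": "schema-changes.testdb",
            # The setting under discussion: how long the connector waits
            # between polls for new change events, in milliseconds.
            "poll.interval.ms": str(poll_interval_ms),
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be POSTed to Kafka Connect's /connectors endpoint.
    print(json.dumps(build_connector_config(100), indent=2))
```

Trying a few values (for example 500, 100, 50) while watching CPU on both the connector and the database gives a feel for the trade-off.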

--Gunnar

Niels Berglund

Jul 16, 2021, 8:48:59 AM
to debe...@googlegroups.com
Thanks Gunnar! I apologize for the late reply - a bit hectic here in SA right now :(

Anyway, I'll play around with the settings and see what comes out of it.

Thanks again!

Niels

Niels Berglund

Jul 20, 2021, 12:43:02 AM
to debe...@googlegroups.com
Hi guys,

The latency issue I mentioned earlier - I must have done something wrong, because now when I check I see an average latency of ~15 ms between the record appearing in the CDC system table and being timestamped in the Kafka topic.

Thanks again!

Niels

Chris Cranford

Jul 20, 2021, 12:58:56 PM
to debe...@googlegroups.com, Niels Berglund
I'm glad it's resolved, Niels.

CC