SQL Server / Debezium Server Questions on polling intervals and schema changes

Adam Whitmore

Apr 28, 2021, 9:25:13 AM

to debezium

Hi, I have a few questions related to running Debezium Server against SQL Server:

1. Is it correct that the poll.interval.ms determines how often the connector checks change tables for records with an LSN greater than the connector's max LSN? What I'm getting at here is, it seems that SQL Server CDC has a batch cycle for populating the change tables, and the SQL Server Debezium connector has a cycle on which it polls for records in the change table. Just want to make sure I understand that correctly.

2. Is it correct that these are the most important factors in understanding total latency?

- SQL Server CDC parameters: maxscans, maxtrans, and pollinginterval

- Connector config: poll.interval.ms, max.queue.size, max.batch.size, and snapshot.fetch.size

So I would expect latency to be (very) roughly equivalent to pollinginterval (MSSQL) + poll.interval.ms (Connector) given a situation where there is no backlog on either the SQL Server or Connector side?

3. From step 2 of Offline Schema Updates, how do you know when the Debezium connector has streamed all unstreamed change event records? Is this accomplished by observing the MilliSecondsSinceLastEvent streaming metric to make sure there's no ongoing activity, or is there a procedure to compare max LSNs between the capture instance and the connector's offset storage? I'm trying to determine how I can be sure that it's safe to delete an old capture instance when a new one is created after a schema change.
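To make the estimate in question 2 concrete, here is a minimal sketch of the arithmetic. The values (a 5-second CDC capture job `pollinginterval` and a 500 ms `poll.interval.ms`) are illustrative assumptions, not measurements, and `LatencyEstimate` is a hypothetical helper name. The "typical" figure additionally assumes that changes arrive at uniformly random points in both cycles and that there is no backlog on either side.

```java
// Rough latency arithmetic for the no-backlog case. All inputs are illustrative.
class LatencyEstimate {

    // Worst case: a change just misses the CDC capture scan and then
    // just misses the connector's poll of the change table.
    static long worstCaseMs(long cdcPollingIntervalSeconds, long connectorPollIntervalMs) {
        // pollinginterval (SQL Server CDC) is in seconds, poll.interval.ms is in milliseconds
        return cdcPollingIntervalSeconds * 1000 + connectorPollIntervalMs;
    }

    // Typical case under the uniform-arrival assumption: on average a change
    // waits half of each cycle, so half of the worst case.
    static long typicalMs(long cdcPollingIntervalSeconds, long connectorPollIntervalMs) {
        return worstCaseMs(cdcPollingIntervalSeconds, connectorPollIntervalMs) / 2;
    }

    public static void main(String[] args) {
        long cdcPollingIntervalSeconds = 5;  // assumed value
        long connectorPollIntervalMs = 500;  // assumed value
        System.out.println("worst case ms: " + worstCaseMs(cdcPollingIntervalSeconds, connectorPollIntervalMs));
        System.out.println("typical ms:    " + typicalMs(cdcPollingIntervalSeconds, connectorPollIntervalMs));
    }
}
```

This ignores processing time in the capture job, the connector queue, and the sink, so it is only a lower-level bound on what the two polling cycles contribute.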

Thanks!

Adam

jiri.p...@gmail.com

May 6, 2021, 2:10:28 AM

to debezium

Hi,

1. Yes, you understand this correctly

2. Yes, but I think this is closer to the maximum latency; the median latency should be roughly half of that.

3. Ideally, you should check the maximum LSN available in the database change tables, then monitor the offsets topic and see the last LSN safely committed in Kafka.
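The comparison in point 3 could be sketched roughly as follows. This assumes the committed offset carries the LSN as colon-separated hex (for example a `commit_lsn` value like `00000025:00000d98:00a2`) and that the database side is fetched separately as a hex string, for instance from `MAX(__$start_lsn)` on the old capture instance's change table. The class and method names are hypothetical, and the sample LSNs are made up for illustration.

```java
import java.util.Locale;

// Compares a connector-side LSN with a database-side LSN, both given as text.
class LsnCompare {

    // Normalizes "00000025:00000d98:00a2" or "0x0000002500000D9800A2"
    // to 20 lower-case hex digits (a 10-byte LSN).
    static String normalize(String lsn) {
        String hex = lsn.trim().toLowerCase(Locale.ROOT);
        if (hex.startsWith("0x")) {
            hex = hex.substring(2);
        }
        hex = hex.replace(":", "");
        if (!hex.matches("[0-9a-f]{20}")) {
            throw new IllegalArgumentException("Not a 10-byte LSN: " + lsn);
        }
        return hex;
    }

    // True when the committed LSN has reached or passed the highest LSN in the
    // change table. Equal-length lower-case hex strings sort in numeric order.
    static boolean caughtUp(String committedLsn, String maxChangeTableLsn) {
        return normalize(committedLsn).compareTo(normalize(maxChangeTableLsn)) >= 0;
    }

    public static void main(String[] args) {
        String maxInChangeTable = "0x0000002500000D9800A2"; // made-up sample value
        System.out.println(caughtUp("00000025:00000d98:00a2", maxInChangeTable));
        System.out.println(caughtUp("00000025:00000d98:009e", maxInChangeTable));
    }
}
```

For the check to mean anything, writes to the source table have to be stopped first, otherwise the maximum LSN in the change table keeps moving.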

J.

Adam Whitmore

May 7, 2021, 8:52:34 AM

to debezium

Thanks Jiri, that helps. Is there a standard approach to reading the offsets.dat file in the case where we're storing offsets in a file rather than a Kafka topic when running Debezium Server?

Thanks,

Adam

jiri.p...@gmail.com

May 10, 2021, 1:53:40 AM

to debezium

Hi,

No, nothing apart from the code in org.apache.kafka.connect.storage.FileOffsetBackingStore.load().
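As a rough sketch of what reading the file by hand could look like: this assumes the file is a Java-serialized map of byte[] keys to byte[] values and that both hold UTF-8 JSON text, which is worth confirming against `load()` in your Kafka version. `OffsetFileDump` and `writeSample` are hypothetical names, and the sample key and value are made up so the example runs without a real offsets file.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Dumps a file-based offset store as text. Only use on files you trust,
// because it relies on Java deserialization.
class OffsetFileDump {

    // Reads the serialized map and decodes every key and value as UTF-8 text.
    @SuppressWarnings("unchecked")
    static Map<String, String> read(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            Object obj = in.readObject();
            if (!(obj instanceof Map)) {
                throw new IOException("Expected a Map but found " + String.valueOf(obj));
            }
            Map<String, String> decoded = new LinkedHashMap<>();
            for (Map.Entry<byte[], byte[]> entry : ((Map<byte[], byte[]>) obj).entrySet()) {
                decoded.put(text(entry.getKey()), text(entry.getValue()));
            }
            return decoded;
        }
    }

    private static String text(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }

    // Writes a made-up sample in the assumed layout, for demonstration only.
    static void writeSample(Path file) throws IOException {
        Map<byte[], byte[]> sample = new HashMap<>();
        sample.put(
            "[\"demo\",{\"server\":\"server1\"}]".getBytes(StandardCharsets.UTF_8),
            "{\"commit_lsn\":\"00000025:00000d98:00a2\"}".getBytes(StandardCharsets.UTF_8));
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(sample);
        }
    }

    public static void main(String[] args) throws Exception {
        Path file;
        if (args.length > 0) {
            file = Paths.get(args[0]); // path to a real offsets file
        } else {
            file = Files.createTempFile("offsets", ".dat");
            writeSample(file);
        }
        read(file).forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

Run it with the path to the offsets file as the only argument; without an argument it just round-trips the made-up sample.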

J.
