Hi, I have a few questions related to running Debezium Server against SQL Server:
1. Is it correct that poll.interval.ms determines how often the connector checks the change tables for records with an LSN greater than the connector's max LSN? What I'm getting at is that SQL Server CDC seems to have a batch cycle for populating the change tables, and the Debezium SQL Server connector has its own cycle on which it polls the change tables for records. I just want to make sure I understand that correctly.
2. Is it correct that these are the most important factors in understanding total latency?
- SQL Server CDC parameters: maxscans, maxtrans, and pollinginterval
- Connector config: poll.interval.ms, max.queue.size, max.batch.size
So, with no backlog on either the SQL Server or the connector side, would I be right to expect latency to be (very) roughly pollinginterval (MSSQL) + poll.interval.ms (connector)?
3. In step 2 of Offline Schema Updates, how do you know when the Debezium connector has streamed all unstreamed change event records? Is this done by observing the MilliSecondsSinceLastEvent streaming metric to make sure there's no ongoing activity, or is there a procedure for comparing max LSNs between the capture instance and the connector's offset storage? I'm trying to determine how I can be sure it's safe to delete an old capture instance once a new one has been created after a schema change.
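
To make question 3 more concrete, here is a rough sketch of the kind of LSN comparison I was imagining. It is only my guess at an approach, not a documented Debezium procedure. It assumes the connector offset stores commit_lsn as colon-separated hex (for example 00000027:00000ac0:0002), and that I separately read the highest __$start_lsn from the old capture instance's change table as a 0x-prefixed binary(10) value. The function names parse_lsn and connector_caught_up are my own, made up for illustration.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN string to an integer so two LSNs can be ordered.

    Accepts the colon-separated hex form (assumed offset format) or the
    0x-prefixed hex form that SQL Server tools print for binary(10).
    """
    cleaned = text.strip().lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    cleaned = cleaned.replace(":", "")
    # An LSN is 10 bytes, i.e. 20 hex digits (8 + 8 + 4).
    if len(cleaned) != 20:
        raise ValueError(f"unexpected LSN length: {text!r}")
    return int(cleaned, 16)


def connector_caught_up(offset_commit_lsn: str, capture_max_lsn: str) -> bool:
    """True if the connector's committed LSN is at or past the capture instance's max LSN.

    Assumption: this ordering is sufficient evidence that the old capture
    instance has been fully streamed. That is the point I would like confirmed.
    """
    return parse_lsn(offset_commit_lsn) >= parse_lsn(capture_max_lsn)


if __name__ == "__main__":
    # Hypothetical values, not taken from a real system.
    print(connector_caught_up("00000027:00000ac0:0002", "0x0000002700000AC00002"))  # True
    print(connector_caught_up("00000027:00000ab0:0001", "0x0000002700000AC00002"))  # False
```

If a check like this is valid, I would run it after stopping writes to the source table and only drop the old capture instance once it returns True.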