Incremental snapshot - Connection closed for BIG tables via SIGNAL


Vinoth Kumar

Nov 14, 2025, 4:06:06 AM
to debezium
Hi Chris & Community Team,

We are running an incremental snapshot via the signal table on a table with almost 2 billion rows. After fetching roughly 15 to 25% of the data (~10 hours in), the snapshot fails.
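For reference, this is the kind of signal insert we use to kick off the snapshot (the signal table name here is illustrative; the payload format follows the Debezium signaling documentation):

    -- trigger an incremental snapshot of TRECS.NAMEADDRESS
    INSERT INTO TRECS.DEBEZIUM_SIGNAL (id, type, data)
    VALUES ('ad-hoc-snapshot-1',
            'execute-snapshot',
            '{"data-collections": ["TRECS.NAMEADDRESS"], "type": "incremental"}');
    COMMIT;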

Observation:

The error occurs mainly during periods of high archive-log generation (roughly 40G per hour). In short, we get a connection-closed error about 90 minutes after the last snapshot window was closed in the signal table.

Also, when we restart the connector, it does not appear to resume from where it stopped/failed: there are no new inserts into the signal table and no DB session fetching the data.

The connector itself keeps running without issues and continues updating the heartbeat table, though probably with some delay.

Also, log mining falls far behind during these high-generation periods, but after some time it catches up by mining through all the archives.

We made the changes below, but we are still stuck with the error shown further down.

  • Earlier, we got the connection-closed error after 10 minutes, so we set SQLNET.EXPIRE_TIME=0 (unlimited). After that, the error appeared after 90 minutes instead.
  • Changed the TCP keepalive settings on the pods running the connector, as specified in kafkaconnect.yaml (values below).

 

The last few snapshot-window entries from the SIGNAL table:
397e1f09-967c-4b37-b239-e9fcdf8f210e-open              snapshot-window-open               {"openWindowTimestamp": "2025-11-13T17:48:50.327206867Z"}

397e1f09-967c-4b37-b239-e9fcdf8f210e-close             snapshot-window-close               {"openWindowTimestamp": "2025-11-13T17:48:50.327206867Z", "closeWindowTimestamp": "2025-11-13T17:48:50.606823768Z"}

05c160fb-a2cf-42e2-acc7-cb528fde1dd3-open             snapshot-window-open               {"openWindowTimestamp": "2025-11-13T17:38:01.910867525Z"}

05c160fb-a2cf-42e2-acc7-cb528fde1dd3-close            snapshot-window-close               {"openWindowTimestamp": "2025-11-13T17:38:01.910867525Z", "closeWindowTimestamp": "2025-11-13T17:38:02.232327802Z"}

003ca5f6-7353-474d-a7a7-bfea544ce025-open            snapshot-window-open               {"openWindowTimestamp": "2025-11-13T17:31:55.097950177Z"}

003ca5f6-7353-474d-a7a7-bfea544ce025-close           snapshot-window-close               {"openWindowTimestamp": "2025-11-13T17:31:55.097950177Z", "closeWindowTimestamp": "2025-11-13T17:31:55.440593014Z"}

9ac69436-ea85-4dca-a775-03178b43e20e-open         snapshot-window-open               {"openWindowTimestamp": "2025-11-13T17:30:47.540310590Z"}

9ac69436-ea85-4dca-a775-03178b43e20e-close        snapshot-window-close               {"openWindowTimestamp": "2025-11-13T17:30:47.540310590Z", "closeWindowTimestamp": "2025-11-13T17:30:47.850399530Z"}

 

Keepalive values from kafkaconnect.yaml:

securityContext:
  sysctls:
    - name: net.ipv4.tcp_keepalive_time
      value: "60"
    - name: net.ipv4.tcp_keepalive_intvl
      value: "30"
    - name: net.ipv4.tcp_keepalive_probes
      value: "5"
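(One way to confirm these are applied inside the running Connect pod; command illustrative, pod name taken from the logs below:)

    # read the kernel keepalive settings from inside the Connect container
    kubectl exec kafka-connect-dba-nltrecp-connect-0 -- \
      cat /proc/sys/net/ipv4/tcp_keepalive_time \
          /proc/sys/net/ipv4/tcp_keepalive_intvl \
          /proc/sys/net/ipv4/tcp_keepalive_probes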

ERROR (roughly 90 minutes after the last snapshot-window close):

method:"processSignal",

  • @timestamp:"2025-11-13T19:03:25.002Z",
  • logger_name:"io.debezium.pipeline.signal.SignalProcessor",
  • source_host:"kafka-connect-dba-nltrecp-connect-0",
  • exception:

{

    • exception_class:"io.debezium.DebeziumException",
    • exception_message:"Database error while executing incremental snapshot for table 'CollectionId{id=NLTRECP.TRECS.NAMEADDRESS, additionalCondition=, surrogateKey=}'",
    • stacktrace:"io.debezium.DebeziumException: Database error while executing incremental snapshot for table 'CollectionId{id=NLTRECP.TRECS.NAMEADDRESS, additionalCondition=, surrogateKey=}'
      at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.readChunk(AbstractIncrementalSnapshotChangeEventSource.java:349)
      at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.closeWindow(AbstractIncrementalSnapshotChangeEventSource.java:118)
      at io.debezium.pipeline.signal.actions.snapshotting.CloseIncrementalSnapshotWindow.arrived(CloseIncrementalSnapshotWindow.java:27)
      at io.debezium.pipeline.signal.SignalProcessor.processSignal(SignalProcessor.java:186)
      at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
      at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:720)
      at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
      at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:276)
      at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
      at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179)
      at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
      at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
      at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
      at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
      at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
      at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
      at io.debezium.pipeline.signal.SignalProcessor.lambda$processSourceSignal$4(SignalProcessor.java:150)
      at io.debezium.pipeline.signal.SignalProcessor.executeWithSemaphore(SignalProcessor.java:160)
      at io.debezium.pipeline.signal.SignalProcessor.processSourceSignal(SignalProcessor.java:144)
      at io.debezium.pipeline.EventDispatcher$2.changeRecord(EventDispatcher.java:319)
      at io.debezium.relational.RelationalChangeRecordEmitter.emitCreateRecord(RelationalChangeRecordEmitter.java:79)
      at io.debezium.relational.RelationalChangeRecordEmitter.emitChangeRecords(RelationalChangeRecordEmitter.java:47)
      at io.debezium.pipeline.EventDispatcher.dispatchDataChangeEvent(EventDispatcher.java:299)
      at io.debezium.connector.oracle.logminer.buffered.BufferedLogMinerStreamingChangeEventSource.lambda$handleCommitEvent$0(BufferedLogMinerStreamingChangeEventSource.java:479)
      at io.debezium.connector.oracle.logminer.TransactionCommitConsumer.dispatchChangeEvent(TransactionCommitConsumer.java:512)
      at io.debezium.connector.oracle.logminer.TransactionCommitConsumer.accept(TransactionCommitConsumer.java:132)
      at io.debezium.connector.oracle.logminer.buffered.BufferedLogMinerStreamingChangeEventSource.lambda$handleCommitEvent$1(BufferedLogMinerStreamingChangeEventSource.java:491)
      at io.debezium.connector.oracle.logminer.buffered.memory.MemoryLogMinerTransactionCache.forEachEvent(MemoryLogMinerTransactionCache.java:89)
      at io.debezium.connector.oracle.logminer.buffered.memory.MemoryLogMinerTransactionCache.forEachEvent(MemoryLogMinerTransactionCache.java:27)
      at io.debezium.connector.oracle.logminer.buffered.BufferedLogMinerStreamingChangeEventSource.handleCommitEvent(BufferedLogMinerStreamingChangeEventSource.java:486)
      at io.debezium.connector.oracle.logminer.AbstractLogMinerStreamingChangeEventSource.processEvent(AbstractLogMinerStreamingChangeEventSource.java:478)
      at io.debezium.connector.oracle.logminer.AbstractLogMinerStreamingChangeEventSource.executeAndProcessQuery(AbstractLogMinerStreamingChangeEventSource.java:404)
      at io.debezium.connector.oracle.logminer.buffered.BufferedLogMinerStreamingChangeEventSource.process(BufferedLogMinerStreamingChangeEventSource.java:243)
      at io.debezium.connector.oracle.logminer.buffered.BufferedLogMinerStreamingChangeEventSource.executeLogMiningStreaming(BufferedLogMinerStreamingChangeEventSource.java:156)
      at io.debezium.connector.oracle.logminer.AbstractLogMinerStreamingChangeEventSource.execute(AbstractLogMinerStreamingChangeEventSource.java:212)
      at io.debezium.connector.oracle.logminer.AbstractLogMinerStreamingChangeEventSource.execute(AbstractLogMinerStreamingChangeEventSource.java:88)
      at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:329)
      at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:207)
      at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:147)
      at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
      at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
      at java.base/java.lang.Thread.run(Thread.java:840)
      Caused by: java.sql.SQLRecoverableException: Closed Connection
      at oracle.jdbc.driver.OracleStatement.ensureOpen(OracleStatement.java:5007)
      at oracle.jdbc.driver.OracleStatement.setQueryTimeout(OracleStatement.java:3793)
      at oracle.jdbc.driver.OracleStatementWrapper.setQueryTimeout(OracleStatementWrapper.java:292)
      at io.debezium.jdbc.JdbcConnection.prepareUpdate(JdbcConnection.java:787)
      at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.emitWindowOpen(SignalBasedIncrementalSnapshotChangeEventSource.java:70)
      at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.readChunk(AbstractIncrementalSnapshotChangeEventSource.java:262)
      ... 44 more"

line_number: "197"
message: "Action snapshot-window-close failed. The signal SignalRecord{id='397e1f09-967c-4b37-b239-e9fcdf8f210e-close', type='snapshot-window-close', data='{\"openWindowTimestamp\": \"2025-11-13T17:48:50.327206867Z\", \"closeWindowTimestamp\": \"2025-11-13T17:48:50.606823768Z\"}', additionalData={}} may not have been processed."

 Thanks,

Vinoth

Chris Cranford

Nov 15, 2025, 1:54:39 AM
to debe...@googlegroups.com
Hi Vinoth -

Which Debezium version is this? There have been some changes lately, so we need to know the version. Meanwhile, you could also try enabling keep-alive on the JDBC connection by setting the following driver properties:

    oracle.net.keepAlive         Enables TCP keep-alive (default: false)
    oracle.net.TCP_KEEPIDLE      Time a connection may remain idle before keep-alive probes are sent (default: -1)
    oracle.net.TCP_KEEPINTERVAL  Frequency at which keep-alive probes are retransmitted (default: -1)
    oracle.net.TCP_KEEPCOUNT     Number of keep-alive probes sent before the connection is dropped (default: -1)

Can you also please share all of your internal.* and log.mining.* configuration options? It should not take the connector 90+ minutes to return, even with 40G of data generated, unless your redo/archive logs are on the same physical disks as your database files.

-cc

Chris Cranford

Nov 15, 2025, 1:55:40 AM
to debe...@googlegroups.com
My apologies, I should have mentioned that the JDBC connection properties can be set by prefixing them with driver.:

    driver.oracle.net.keepAlive=true

Be sure to retain the exact case specified by Oracle.
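Putting it together, the full set in the connector config would look something like this (values illustrative, mirroring your pod-level sysctls):

    # JDBC-level TCP keep-alive, passed through to the Oracle driver
    driver.oracle.net.keepAlive=true
    driver.oracle.net.TCP_KEEPIDLE=60
    driver.oracle.net.TCP_KEEPINTERVAL=30
    driver.oracle.net.TCP_KEEPCOUNT=5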

Thanks,
-cc

Vinoth Kumar

Nov 16, 2025, 11:15:41 PM
to debezium
Hi Chris,

We are using the latest Debezium version, 3.3.1, and we have already enabled the keepalive params in kafkaconnect.yaml (we are not sure whether they are taking effect, although the values are reflected on the pod). The requested values were already shared earlier in this thread and in the attachment.

Also, can you please confirm where we need to set the param driver.oracle.net.keepAlive=true?

One more observation: per the Sumo Logic logs, the connector is unable to process the snapshot-window-close signal, even though the close entry is visible in the SIGNAL table. We are not sure whether the existing session was killed and the connector is unable to reinitiate/resume, or whether connectivity between the DB and the connector was lost. Could you please assist with your thoughts?

Chris Cranford

Nov 20, 2025, 10:27:42 AM
to debe...@googlegroups.com
Hi,

The driver.oracle.net.keepAlive property and the related ones are to be set in the connector configuration. As for your last question, we'd need to see the logs to understand what you mean.
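For example, if the connector is deployed as a Strimzi KafkaConnector resource, they go under spec.config (a minimal sketch; resource and cluster names illustrative):

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaConnector
    metadata:
      name: oracle-connector
      labels:
        strimzi.io/cluster: kafka-connect-dba-nltrecp
    spec:
      class: io.debezium.connector.oracle.OracleConnector
      config:
        # Oracle JDBC keep-alive, forwarded to the driver via the driver. prefix
        driver.oracle.net.keepAlive: "true"
        driver.oracle.net.TCP_KEEPIDLE: "60"
        driver.oracle.net.TCP_KEEPINTERVAL: "30"
        driver.oracle.net.TCP_KEEPCOUNT: "5"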

Thanks
-cc