Snapshot Performance Observation – Debezium Connector v2.5


Ramesh K

Sep 17, 2025, 3:34:34 AM
to cran...@gmail.com, debe...@googlegroups.com

Hi,

I deployed Debezium connector v2.5 with snapshot.mode=initial. The snapshot completed quickly for most tables, but performance significantly slowed for one specific table — emitting only ~50 records over an extended period.

Upon investigation, I noticed this table has a composite primary key spanning three columns, which may be contributing to the slowdown. It appears that table structure and key complexity can impact snapshot throughput, especially during incremental snapshotting.

Let me know if you’ve encountered similar behavior or have recommendations for optimizing performance in such cases.

What else could be the reason?

The logs and config are attached below.

Best,
Ramesh.


config and logs.txt

jiri.p...@gmail.com

Sep 17, 2025, 6:38:50 AM
to debezium
Hi,

We had a similar problem with incremental snapshots, and that should be fixed in Debezium 3.x. I am not aware of any performance issues with initial snapshots that would relate to a composite key.

Jiri

Chris Cranford

Sep 17, 2025, 6:48:53 AM
to debezium
Hi Ramesh -

During the initial snapshot, there should be little to no performance impact from a table with a composite key, because this process issues a simple SELECT and iterates over all the rows one by one. However, during incremental snapshots, a composite key can most definitely impact performance. This is because the database optimizer may decide not to use the primary key index due to the predicate conditions, which makes the chunk-based query execute slower as it falls back to a full table scan to determine each chunk of rows.

Which specific database source are you working with, and is there a single column in the table that is considered unique that you could use as a surrogate key instead?
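For reference, when an incremental snapshot is triggered via the signal table, the signal's `data` payload can name a surrogate key so the chunk queries use that column instead of the composite primary key. A minimal sketch of the payload, where the table and column names are placeholders:

```json
{
  "data-collections": ["MYSCHEMA.MYTABLE"],
  "type": "incremental",
  "surrogate-key": "MY_UNIQUE_COL"
}
```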

-cc

Chris Cranford

Sep 17, 2025, 6:49:44 AM
to debezium
Looks like Jiri and I cross posted, please disregard.

Ramesh K

Sep 17, 2025, 7:31:04 AM
to debe...@googlegroups.com
Hi

I’m using Oracle 11g with the Debezium connector in snapshot.mode=initial, and the snapshot completed quickly for most tables. However, one table — — is noticeably slower.

Here’s what I’ve found so far about the table:

  • it has a composite primary key,

  • no single-column unique key exists,

  • it stores large payloads in the TEXT field.

Do any of these have an effect?

Appreciate any suggestions you might have to optimise the snapshot.

Best regards,
Ramesh.


--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/debezium/5a9671ff-79ed-4368-93de-b843570982e1%40gmail.com.

Chris Cranford

Sep 17, 2025, 11:47:29 AM
to debe...@googlegroups.com
Hi Ramesh -

The initial snapshot on Oracle tables that use CLOB columns is significantly slower due to how Oracle stores and retrieves LOB (Large Object) data. BLOB and CLOB column values are stored "out-of-line", meaning the actual LOB data is stored in separate database blocks from the main table row. The table row only contains a LOB locator (pointer) that references where the actual LOB data is stored.

When the initial snapshot is executed, Oracle must read the main table row that contains the LOB locator, make an additional IO call to retrieve the actual LOB content from the separate storage, and finally assemble the data from the row and the out-of-line storage. So reads of LOB data require additional block IO operations beyond the normal table scan, and if the table has many rows, this multiplies the IO operations substantially. In addition, LOB data is typically much larger and will consume the vast majority of the network bandwidth as result set data is received by Debezium. The larger the LOB data, the longer it takes to serialize the row or rows to the client. Lastly, LOB column data requires substantially more memory for buffering the result set, and if your fetch size is too large, this can lead to frequent garbage collection cycles that will impact the JVM's performance.
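If memory pressure from large LOB rows is the bottleneck, lowering the snapshot fetch size is one knob to try. A sketch of the connector setting, where the value is only an illustrative assumption and not a recommendation:

```json
{
  "snapshot.fetch.size": "1000"
}
```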

Hope that helps explain why you see slower performance with LOB columns.

-cc

Ramesh K

Sep 17, 2025, 12:28:54 PM
to debe...@googlegroups.com
Hi Chris,


Thank you for the detailed explanation.

Could you please outline the exact corrective steps? Our initial snapshot is hanging on one of our tables, and each time we restart the connector with snapshot.mode=initial, it reruns the full snapshot for all tables.

How can we allow the snapshot for that specific table to complete normally? Should we change field types, or is there an alternative approach?

Additionally, if we temporarily remove that table from the connector’s include list, what steps should we follow to prevent the other tables from being resnapshotted on restart?


Regards,
Ramesh


Chris Cranford

Sep 17, 2025, 12:46:43 PM
to debe...@googlegroups.com
Hi Ramesh -

How important is the CLOB column when it comes to CDC? If the CLOB column is not really relevant, perhaps you could use `column.exclude.list` and exclude that specific column for the table. In this case, the column would not be part of the SELECT statement and the snapshot should run faster. However, if the column data is important, then you will need to pick between initial and incremental snapshots. Outside of excluding the column, I'm afraid there is little Debezium can do to improve the performance of fetching your row data and the CLOB data from external storage.
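As a sketch, the exclusion is a fully-qualified column name in the connector configuration; the schema, table, and column names below are placeholders:

```json
{
  "column.exclude.list": "MYSCHEMA.MYTABLE.MY_TEXT_COL"
}
```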

-cc

Ramesh K

Sep 18, 2025, 4:20:33 AM
to debe...@googlegroups.com, jiri.p...@gmail.com
Hi Chris/Jiri,

The problematic table looks like the snippet below.

image.png

I'm running the connector in snapshot.mode=initial with 11 tables. Snapshots for 10 tables have completed (row counts match the database), but the 11th table is still hanging, and I never saw an explicit “snapshot complete” log for each finished table. During a connector restart, it performs full snapshots for all 11 tables.

My goal: skip snapshotting that last problematic table and start CDC for all 10, without triggering a full resnapshot on connector or cluster restart.

Questions:

  1. Excluding the problematic table

    • Should I use the Debezium signal table, or

    • remove it from table.include.list?

  2. Restart behavior

    • If the problematic table remains in include.list, will a connector or full cluster restart resume only that table’s snapshot?

    • If it’s removed, will Debezium skip all snapshots and go straight to CDC for the other 10?

  3. Snapshot tracking

    • Does Debezium track “snapshot complete” on a per-table basis (so it can resume individually), or only when all tables have finished?


Regards,
Ramesh.

Chris Cranford

Sep 18, 2025, 11:39:06 PM
to debe...@googlegroups.com
Hi Ramesh

I apologize for the misunderstanding; when you said TEXT before, I assumed you meant CLOB. Unfortunately, LONG data types are not supported by Debezium [1].

Now, that should not cause the connector to hang. Can you please do me a favor: include only this specific table in your include list, enable DEBUG logging for `io.debezium`, and start the connector. Let it run for a bit, and then please share the logs.

Thanks,
-cc

[1]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-binary-character-lob-types

mohd fasil

Sep 19, 2025, 12:15:48 AM
to debe...@googlegroups.com
Hi Chris,

I have already shared the logs; please check the first email.

And how do we exclude this specific column, which has a LONG value?

Best regards,
Mohammed Fasil.

image.png

Ramesh K

Sep 19, 2025, 12:45:40 AM
to debe...@googlegroups.com
Thanks Fasil.


Chris, can you please outline the next steps?

Thanks,
Ramesh

Chris Cranford

Sep 19, 2025, 12:59:53 AM
to debe...@googlegroups.com
Hi -

The log in the first email does not have DEBUG enabled, nor is it the full and complete log. If you could please set up a second connector that captures changes only for the table with the LONG column, with debug logging enabled, and share the complete log, that would be great.

Thanks,
-cc

Ramesh K

Sep 19, 2025, 6:11:05 AM
to debe...@googlegroups.com
Hi Chris,

Logs are attached below.

How can we exclude one LONG column from the snapshot? What config should we use?

I appreciate your help.

Thanks.


logs.txt

Ramesh K

Sep 22, 2025, 7:54:40 AM
to debe...@googlegroups.com
Hi Chris,

Thanks, I included the config "column.exclude.list" for the LONG column and, yes, performance increased.

I have one more question: while the connector is running in initial mode and there are no live changes in the database, how can we know that the snapshot completed? Even though the record counts in the database and the topic are in sync, the logs still show snapshot completed = false (screenshot attached below). How can we identify that the snapshot is complete?
image.png

Regards,
Ramesh

Chris Cranford

Sep 22, 2025, 8:09:38 AM
to debe...@googlegroups.com
Hi Ramesh -

The more reliable way is to set `heartbeat.interval.ms` to a value greater than `0`. This way, when the snapshot finishes, a heartbeat event is emitted to Kafka and the offsets will reflect `snapshot=false` as you would expect. Otherwise, this only happens when the first streaming event is emitted.
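A minimal sketch of that setting in the connector configuration, where the interval value is just an example:

```json
{
  "heartbeat.interval.ms": "5000"
}
```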

-cc

Ramesh K

Sep 22, 2025, 11:37:50 AM
to debe...@googlegroups.com
Hi Chris,

We’ve set heartbeat.interval.ms = 6000, and after the initial snapshot completes (even though no explicit log message confirms it), we verified completion by matching the record count between the database and Kafka topics.

However, we’ve observed that when the connector is restarted with snapshot.mode=initial and no live changes occurring in the database, the snapshot is re-triggered for the same existing data.
Q1: How can we prevent this re-snapshot from happening upon connector restart?

Additionally, as we prepare for production deployment, where there will be both existing data and ongoing CDC activity:
Q2: What is the recommended snapshot mode to use during startup so that it captures both the existing data and continues with live CDC seamlessly?

Looking forward to your guidance.


Regards,
Ramesh


Chris Cranford

Sep 22, 2025, 11:06:53 PM
to debe...@googlegroups.com
Hi Ramesh -

You mention there is no explicit log message confirming the snapshot completed, so are you not seeing logs like this:

    Snapshot step 7 - Snapshotting data   
    ...
    Snapshot - Final stage
    Snapshot completed
    Snapshot ended with SnapshotResult [status=COMPLETED, offset=OracleOffsetContext [scn=..., commit_scn=[], lcr_position=null]]

If you are not seeing "Snapshot completed" and "Snapshot ended" in the logs, then I would argue the snapshot isn't finishing properly. Can you provide the full connector logs?

As for your Q2, the default `snapshot.mode` of `initial` is sufficient for this purpose in most cases. With Oracle, users may elect to use `no_data` instead and rely on incremental snapshots when the historical data is large and taking an initial snapshot would exceed what the database's undo retention allows; in that case the flashback query's SCN ages out of the undo retention space and the snapshot fails. This typically results in an ORA-01555 error, which we discuss in the FAQ [1] under "What's the cause for ORA-01555 and how to handle it?"
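A sketch of that alternative setup, assuming a signal table named MYSCHEMA.DEBEZIUM_SIGNAL (a placeholder name):

```json
{
  "snapshot.mode": "no_data",
  "signal.data.collection": "MYSCHEMA.DEBEZIUM_SIGNAL"
}
```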

Thanks,
-cc

[1]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-frequently-asked-questions

Ramesh K

Sep 23, 2025, 2:24:58 AM
to debe...@googlegroups.com
Hi Chris,

Thanks.

The connector was down for a few hours, and when we restarted it, it fails with this error:

    Online REDO LOG files or archive log files do not contain the offset scn 12337276408659. Please perform a new snapshot.
        at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:166)

image.png

What steps should be taken, please?

Regards,
Ramesh.

Ramesh K

Sep 23, 2025, 5:28:58 AM
to debe...@googlegroups.com
Hi,

I tried the suggestion below:

    Consider removing the offsets/history topic and redeploy the connector, taking a new snapshot.

but I still get the same error:

    Online REDO LOG files or archive log files do not contain the offset scn 12337276408659. Please perform a new snapshot.
        at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:166)

Please suggest further steps.


Chris Cranford

Sep 23, 2025, 7:51:05 AM
to debe...@googlegroups.com
Hi Ramesh -

In this case, you need to remove the offsets for the connector, delete its schema history topic, and restart the connector to re-take a full snapshot.
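In distributed Kafka Connect, clearing the offsets typically means publishing a tombstone (null-value) record to the Connect offsets topic, keyed by the connector name and the connector's `topic.prefix`. A sketch of that record key, where the connector and server names are placeholders and the record value must be null:

```json
["my-oracle-connector", {"server": "myserver"}]
```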

-cc

Chris Cranford

Sep 23, 2025, 8:06:30 AM
to debe...@googlegroups.com
Hi Ramesh -

How are you deleting the connector's offsets? The fact that the same error appears with the same SCN value would indicate the offsets were likely not cleared properly.

-cc