Adding New Tables Without Defined PK During Ongoing Snapshot

38 views
Skip to first unread message

Ramesh K

unread,
Sep 23, 2025, 8:03:14 AM (9 days ago) Sep 23
to debezium

Hi 

For debezium 2.5 and oracle.

I want to include some new tables in my Debezium Oracle connector, but they don’t have a defined primary key. When I asked the DBA, they gave me one column that’s unique and told me to treat it as a surrogate key.

My connector is currently running in snapshot.mode=initial, and the snapshot is still in progress.

How can I safely add these new tables to the connector while the snapshot is running?

Chris Cranford

unread,
Sep 23, 2025, 8:40:34 AM (9 days ago) Sep 23
to debe...@googlegroups.com
Hi Ramesh -

If you reconfigure the connector and add it to the `table.include.list`, the connector will be stopped & restarted, restarting the snapshot from the beginning. If you don't want to interrupt the existing snapshot, you'll need to let it finish and then use incremental snapshots paired with reconfiguration. The choice is really entirely up to you on what makes the most sense.

-cc
--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/debezium/617bc7b3-56aa-40e0-9288-a4d60ffab109n%40googlegroups.com.

Ramesh K

unread,
Sep 23, 2025, 8:55:35 AM (9 days ago) Sep 23
to debe...@googlegroups.com
Hi Chris,

Thanks 

with respect to my current case i will let finish the snapshot and what parameters to include in config
like 
incremental.snapshot.chunk.size
signal data collection 

and anything else.

and can you please tell if the signal format is correct.
INSERT INTO DBZUSER.DEBEZIUM_SIGNAL (ID, TYPE, DATA) 
VALUES (
  'ad-hoc-snapshot-23', 
  'execute-snapshot', 
  '{
    "data-collections": ["ISMFTEST.TORRENT.PSTA"],
    "type": "incremental",
    "surrogate-key": "STAGEID"
  }'

Thanks.

Regards,
Ramesh


Ramesh K

unread,
Sep 23, 2025, 9:08:57 AM (9 days ago) Sep 23
to debe...@googlegroups.com
Hey Chris,

One more question.

If I understand correctly, in a production environment where live streaming is happening on the database side, and we want to add new tables to the connector configuration — what’s the right way to do that?

Regards,
Ramesh


Chris Cranford

unread,
Sep 23, 2025, 9:25:12 AM (9 days ago) Sep 23
to debe...@googlegroups.com
Hi Ramesh,

The configuration options and the signal are correct, but if you've set `schema.history.internal.store.only.captured.tables.ddl` as `true` in your connector configuration, you need to be sure to follow one of two procedures [1] [2] depending on whether schema changes have recently been applied.

Thanks,
-cc

[1]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-capturing-data-from-tables-not-captured-by-the-initial-snapshot-no-schema-change
[2]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-capturing-data-from-new-tables-with-schema-changes

Chris Cranford

unread,
Sep 23, 2025, 9:26:54 AM (9 days ago) Sep 23
to debe...@googlegroups.com
Hi,

See my previous email, I shared two links for the procedure if `schema.history.internal.store.only.captured.tables.ddl` is set to `true`.  If you've left this set to `false`, which is rarely the case with Oracle, then it's really as simple as adding the table to the include list and triggering your incremental snapshot. There's no special procedure needed given the connector is already tracking the schema for all tables.

Thanks,
-cc

Ramesh K

unread,
Sep 23, 2025, 12:16:40 PM (9 days ago) Sep 23
to debe...@googlegroups.com

Hi Chris,

Yes, we are using schema.history.internal.store.only.captured.tables.ddl = true.
what may be drawbacks of using false because it seems to be straightforward while adding new tables.

I had a question regarding incremental snapshots through the signal table — how does it work when the table doesn’t have a primary key defined?

The DBA mentioned that one of the columns is unique. In this case, should the DBA alter the table to define that column as a primary key? Or is there a way to handle this through connector configuration without modifying the table schema


Regards,
Ramesh

You received this message because you are subscribed to a topic in the Google Groups "debezium" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/debezium/eprO8bnk-es/unsubscribe.
To unsubscribe from this group and all its topics, send an email to debezium+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/debezium/5015a351-7dad-406f-9089-034595611caf%40gmail.com.

Ramesh K

unread,
Sep 23, 2025, 12:28:17 PM (8 days ago) Sep 23
to debe...@googlegroups.com
I forgot to mention — I simply want to add new keyless tables to an existing connector and perform a full snapshot on them.

Please give me some steps.

Chris Cranford

unread,
Sep 24, 2025, 12:03:11 AM (8 days ago) Sep 24
to debe...@googlegroups.com
Hi Ramesh -

The drawback is that every DDL change is tracked and serialized to the schema history topic. If you have lots of DDL changes on non-captured tables, this can eventually lead to the history topic growing very large and schema recovery on connector startup will become slower depending on the number of recorded schema changes. On small databases with infrequent DDL changes, this may not be an issue.

For the signal table, a primary key or unique index is necessary. The only requirement is that the table be defined with 3 columns and that the first column is called "ID" if you're using the insert/delete watermark strategy [1].

Lastly, for keyless tables, this is there there are some very strict limitations. Incremental snapshots require some sort of key, which can be defined as either:

    - Explicit primary key or unique index on the table
    - Configuration specifies an entry in `message.key.columns` for the table in question
    - The incremental snapshot signal can specify a single column to act as a surrogate key

If you cannot satisfy one of these requirements, incremental snapshots are not possible and you'll have to settle of using ad-hoc blocking snapshots for the keyless table. But regardless of whether you're using incremental or blocking snapshots, the same prerequisites I outlined below in a prior message exist in terms of making sure the table's schema has been captured before sending the signal if you've set `schema.history.internal.store.only.captured.tables.ddl` as `true` and the table's schema hasn't yet been tracked by the connector.

Hope that helps.
-cc

[1]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-property-incremental-snapshot-watermarking-strategy
Reply all
Reply to author
Forward
0 new messages