Adding New Tables Without Defined PK During Ongoing Snapshot


Ramesh K

Sep 23, 2025, 8:03:14 AM
to debezium

Hi,

This is for Debezium 2.5 and Oracle.

I want to include some new tables in my Debezium Oracle connector, but they don’t have a defined primary key. When I asked the DBA, they gave me one column that’s unique and told me to treat it as a surrogate key.

My connector is currently running in snapshot.mode=initial, and the snapshot is still in progress.

How can I safely add these new tables to the connector while the snapshot is running?

Chris Cranford

Sep 23, 2025, 8:40:34 AM
to debe...@googlegroups.com
Hi Ramesh -

If you reconfigure the connector and add it to the `table.include.list`, the connector will be stopped & restarted, restarting the snapshot from the beginning. If you don't want to interrupt the existing snapshot, you'll need to let it finish and then use incremental snapshots paired with reconfiguration. The choice is really entirely up to you on what makes the most sense.
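
For reference, the reconfiguration itself is just an update to `table.include.list`; a fragment with placeholder table names (everything else in the connector config stays as-is):

```json
{
  "table.include.list": "TORRENT.EXISTING_TABLE,TORRENT.PSTA",
  "signal.data.collection": "DBZUSER.DEBEZIUM_SIGNAL"
}
```

`signal.data.collection` is only needed for the incremental-snapshot route; the signaling table name here is an assumption.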

-cc
--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/debezium/617bc7b3-56aa-40e0-9288-a4d60ffab109n%40googlegroups.com.

Ramesh K

Sep 23, 2025, 8:55:35 AM
to debe...@googlegroups.com
Hi Chris,

Thanks.

For my current case, I will let the snapshot finish. After that, which parameters should I include in the config? For example:
incremental.snapshot.chunk.size
signal.data.collection

Is there anything else?

Also, can you please tell me if the signal format below is correct?
INSERT INTO DBZUSER.DEBEZIUM_SIGNAL (ID, TYPE, DATA)
VALUES (
  'ad-hoc-snapshot-23',
  'execute-snapshot',
  '{
    "data-collections": ["ISMFTEST.TORRENT.PSTA"],
    "type": "incremental",
    "surrogate-key": "STAGEID"
  }'
);

Thanks.

Regards,
Ramesh


Ramesh K

Sep 23, 2025, 9:08:57 AM
to debe...@googlegroups.com
Hey Chris,

One more question.

If I understand correctly, in a production environment where the connector is actively streaming changes from the database, what's the right way to add new tables to the connector configuration?

Regards,
Ramesh


Chris Cranford

Sep 23, 2025, 9:25:12 AM
to debe...@googlegroups.com
Hi Ramesh,

The configuration options and the signal are correct, but if you've set `schema.history.internal.store.only.captured.tables.ddl` as `true` in your connector configuration, you need to be sure to follow one of two procedures [1] [2] depending on whether schema changes have recently been applied.

Thanks,
-cc

[1]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-capturing-data-from-tables-not-captured-by-the-initial-snapshot-no-schema-change
[2]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-capturing-data-from-new-tables-with-schema-changes

Chris Cranford

Sep 23, 2025, 9:26:54 AM
to debe...@googlegroups.com
Hi,

See my previous email, I shared two links for the procedure if `schema.history.internal.store.only.captured.tables.ddl` is set to `true`.  If you've left this set to `false`, which is rarely the case with Oracle, then it's really as simple as adding the table to the include list and triggering your incremental snapshot. There's no special procedure needed given the connector is already tracking the schema for all tables.

Thanks,
-cc

Ramesh K

Sep 23, 2025, 12:16:40 PM
to debe...@googlegroups.com

Hi Chris,

Yes, we are using schema.history.internal.store.only.captured.tables.ddl = true.
What are the drawbacks of using false? It seems more straightforward when adding new tables.

I also had a question about incremental snapshots through the signal table: how do they work when the table doesn't have a primary key defined?

The DBA mentioned that one of the columns is unique. In this case, should the DBA alter the table to define that column as the primary key? Or is there a way to handle this through connector configuration without modifying the table schema?


Regards,
Ramesh


Ramesh K

Sep 23, 2025, 12:28:17 PM
to debe...@googlegroups.com
I forgot to mention — I simply want to add new keyless tables to an existing connector and perform a full snapshot on them.

Please give me some steps.

Chris Cranford

Sep 24, 2025, 12:03:11 AM
to debe...@googlegroups.com
Hi Ramesh -

The drawback is that every DDL change is tracked and serialized to the schema history topic. If you have lots of DDL changes on non-captured tables, this can eventually lead to the history topic growing very large and schema recovery on connector startup will become slower depending on the number of recorded schema changes. On small databases with infrequent DDL changes, this may not be an issue.

For the signal table, a primary key or unique index is necessary. The only requirement is that the table be defined with 3 columns and that the first column is called "ID" if you're using the insert/delete watermark strategy [1].
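
For reference, a minimal Oracle signaling table matching that layout might look like this sketch (the schema and table name are assumptions; the column sizes follow the suggested layout in the Debezium docs):

```sql
-- Sketch of a signaling table; adjust schema/name to your environment.
CREATE TABLE DBZUSER.DEBEZIUM_SIGNAL (
  ID   VARCHAR2(42) PRIMARY KEY,   -- unique signal identifier
  TYPE VARCHAR2(32) NOT NULL,      -- signal type, e.g. execute-snapshot
  DATA VARCHAR2(2048)              -- JSON payload of the signal
);
```

The connector is then pointed at it with `signal.data.collection` set to `DBZUSER.DEBEZIUM_SIGNAL`.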

Lastly, for keyless tables, there are some very strict limitations. Incremental snapshots require some sort of key, which can be defined as either:

    - Explicit primary key or unique index on the table
    - Configuration specifies an entry in `message.key.columns` for the table in question
    - The incremental snapshot signal can specify a single column to act as a surrogate key

If you cannot satisfy one of these requirements, incremental snapshots are not possible and you'll have to settle for using ad-hoc blocking snapshots for the keyless table. But regardless of whether you're using incremental or blocking snapshots, the same prerequisites I outlined in a prior message still apply: if you've set `schema.history.internal.store.only.captured.tables.ddl` to `true` and the table's schema hasn't yet been tracked by the connector, you must make sure the table's schema has been captured before sending the signal.
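
As a sketch of the `message.key.columns` option, an entry for the table discussed in this thread might look like the following; the fully-qualified name format and the column are assumptions, so adjust to your schema:

```
"message.key.columns": "TORRENT.PSTA:STAGEID"
```

Multiple tables can be listed, separated by semicolons, each mapping a fully-qualified table name to one or more key columns.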

Hope that helps.
-cc

[1]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-property-incremental-snapshot-watermarking-strategy
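
For completeness, an ad-hoc blocking snapshot uses the same `execute-snapshot` signal with `"type": "blocking"`; a sketch reusing the names from earlier in the thread (the signal ID is hypothetical):

```sql
-- Hypothetical ID; signal and data-collection names as used earlier.
INSERT INTO DBZUSER.DEBEZIUM_SIGNAL (ID, TYPE, DATA)
VALUES (
  'ad-hoc-blocking-1',
  'execute-snapshot',
  '{"data-collections": ["ISMFTEST.TORRENT.PSTA"], "type": "blocking"}'
);
```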

Ramesh K

Oct 21, 2025, 5:00:41 AM
to debe...@googlegroups.com

Hi Chris, thanks for the response.

As we prepare to move to the production database, I had a quick question: If we need to add new tables later on, what’s the recommended approach?

Should we follow the method outlined in the Debezium Connector for Oracle documentation (based on schema change detection), or is there a better alternative for production environments?

Also, if we do follow that method, how can we ensure there’s no data loss during the period when the connector is paused and the new tables are being added?

Appreciate your guidance on this!

Regards,
Ramesh



Chris Cranford

Oct 21, 2025, 6:18:32 AM
to debe...@googlegroups.com
Hi Ramesh -

Those are the procedures that are available. They depend on what your `schema.history.internal.store.only.captured.tables.ddl` is configured as along with whether or not there have been any recent DDL changes to any of the captured tables (including the one you want to add). When you use the recovery procedure, there won't be any data loss. The connector simply re-primes the schema history topic and resumes streaming where it left off. That's why the important caveat with this procedure is that there cannot be DDL changes on any captured table (including the one you want to add) between the offset position and when the connector re-primes the schema history topic. If schema changes were made and you were unaware, you will likely face issues with capturing the older events as the schema structure will no longer match the transaction log events, and you'll need to re-take a snapshot to recover.

Hope that helps.
-cc

Ramesh K

Oct 21, 2025, 6:50:51 AM
to debe...@googlegroups.com
Hi Chris,

Thanks again for the detailed explanation — that really helped clarify the process.

In our case, there won’t be any schema changes on the captured tables, including the ones we plan to add later.

Given that, could you please confirm: in the production environment, when adding new tables, is it recommended to set schema.history.internal.store.only.captured.tables.ddl to true or false, and then follow the recovery process?

Just want to make sure we’re aligned with best practices and avoid any schema-history issues during the process.

Appreciate your time and guidance!


Chris Cranford

Oct 21, 2025, 10:53:17 PM
to debe...@googlegroups.com
Hi Ramesh -

Most Oracle users typically set `schema.history.internal.store.only.captured.tables.ddl` as `true`, as this keeps the schema history topic reasonably small over its lifetime. So in this case and given that you would have no schema changes at play, the best solution would be to use the `snapshot.mode` set to `schema_only_recovery` (up to Debezium 2.7) or `recovery` (Debezium 3.0+) when you add the table to the `table.include.list`. On restart, the connector re-primes the schema history topic before resuming capturing changes. After that if you need the historical data for the newly added table, use incremental or blocking snapshots.
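
Assuming Debezium 2.5, that reconfiguration would look roughly like this fragment (table names are placeholders; the rest of the config stays unchanged):

```json
{
  "snapshot.mode": "schema_only_recovery",
  "table.include.list": "TORRENT.EXISTING_TABLE,TORRENT.PSTA"
}
```

After the recovery restart completes, check the docs for your version on whether to switch `snapshot.mode` back to its previous value.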

Does that make sense?

-cc

Ramesh K

Oct 22, 2025, 3:04 AM
to debe...@googlegroups.com
Hi Chris,

Thanks.

To add a new table, do we have to follow the process that involves deleting the schema history topic (the same procedure mentioned on the Debezium doc page), or can we also do it without deleting the schema history topic?

Appreciate your response.

Thanks,
Ramesh



Chris Cranford

Oct 22, 2025, 3:16 AM
to debe...@googlegroups.com
Hi Ramesh -

We recommend you remove the schema history topic because the connector is going to push new events for every table you capture. It's the perfect time to reset the topic's contents to reduce the history topic recovery time on future connector restarts; however, it's not mandatory.
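
If you do choose to reset it, a sketch with the standard Kafka CLI (the bootstrap server and topic name are placeholders, and the connector should be stopped first):

```shell
# Placeholders: adjust the bootstrap server, and set the topic name to
# whatever schema.history.internal.kafka.topic is configured as.
kafka-topics.sh --bootstrap-server localhost:9092 \
  --delete --topic schema-history.oracle
```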

-cc

Ramesh K

Oct 22, 2025, 3:27 AM
to debe...@googlegroups.com

Hi Chris,

I followed these steps to add a new table:

  1. Paused the connector

  2. Added the new table to the configuration

  3. Changed the snapshot mode to schema_only_recovery

  4. Restarted the connector

However, I don’t see the schema for the new table being captured in the schema history topic. Could you help me understand what might be missing or if there’s an additional step required?

I am worried because in production I don't want to delete schema history topics for adding new tables.

Thanks,

Ramesh







Chris Cranford

Oct 22, 2025, 3:52 AM
to debe...@googlegroups.com
Hi Ramesh -

Can you please share the full connector logs after the connector restarted?

Thanks,
-cc