Clarification on Risk of Using Oracle LogMiner for CDC

Timo Schmidt

Oct 16, 2025, 4:41:08 AM
to debezium

Dear Debezium Community,

  We are currently evaluating the use of Oracle LogMiner for Change Data Capture (CDC). According to Oracle’s documentation, LogMiner is primarily intended for debugging purposes, to extract information from redo logs for troubleshooting, and is not recommended for CDC use cases.  

Could you please provide your perspective on this? Specifically:

  • How does Debezium address the risks or limitations associated with using LogMiner for CDC?

  • Are there recommended configurations or mitigations that improve reliability and align with Oracle’s guidance?

Your insights will help us better understand the operational risk and ensure we are following best practices when integrating Debezium with Oracle.

Thank you in advance for your support.

Timo

Chris Cranford

Oct 16, 2025, 8:29:26 PM
to debe...@googlegroups.com
Hi Timo -

Oracle does not recommend using LogMiner mainly because they sell a competing commercial product and prefer to promote that instead.

As I have indicated in the past, LogMiner does have its own set of limitations, but GoldenGate (Oracle's commercial product) has similar ones as well. There are some hurdles that GoldenGate can overcome that standalone LogMiner may not, because Oracle chooses to expose certain data types through GoldenGate but not LogMiner; however, there are also data types that LogMiner supports and GoldenGate does not. The most important thing is to look at which data types are not supported from the outset [1]. There are a myriad of data types supported by LogMiner [2]; however, Debezium may not necessarily cover all of them, e.g. LONG/LONG RAW and VARRAY, because these are rarely used and there has been no demand from the community for such column types.

As for recommended configurations, there are a few database-level settings that can really benefit Debezium (a rough SQL sketch of all three follows the list).

1. Avoid enabling supplemental logging for the entire database; prefer to enable it selectively for the tables you need for CDC. This reduces transaction log volume.

2. Make sure the redo logs are sized so that you stay close to Oracle's recommended 5-6 log switches per hour. In peak periods where a batch job may be loading data, exceeding this for a small window is a non-issue, but if your database is generating 30 log switches per hour for 20 out of 24 hours a day, it's beneficial to have the DBA recalibrate the log sizes to reduce the switch frequency. When a switch occurs, CDC tools like LogMiner need to recalibrate internal state, and these extra steps introduce back pressure when trying to remain caught up with the most recent changes in the transaction logs.

3. If you intend to capture changes on very large tables, you may need to increase the database parameter undo_retention to allow the snapshot to complete. This is because the connector relies on Oracle flashback queries to take a consistent initial snapshot. If the snapshot cannot remain consistent from start to finish, Oracle may throw an ORA-01555 error and you will need to restart the snapshot. If undo_retention cannot be changed, then you should plan to use incremental snapshots rather than an initial snapshot if you need historical data at the outset.
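
To make those three points concrete, here is a rough SQL sketch. The schema/table names and the undo_retention value are placeholders, so treat this as an outline to adapt with your DBA rather than a drop-in script:

    -- 1. Keep database-level supplemental logging minimal, and enable
    --    (ALL) COLUMNS only on the tables you actually capture.
    ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;
    ALTER TABLE inventory.customers ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

    -- 2. Check how many log switches happen per hour; if this sits well above
    --    5-6 for most of the day, the redo logs are likely undersized.
    SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour, COUNT(*) AS switches
      FROM v$log_history
     GROUP BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
     ORDER BY hour;

    -- 3. Raise undo retention (in seconds) so a flashback-based initial
    --    snapshot of a large table can finish without ORA-01555.
    ALTER SYSTEM SET undo_retention = 10800;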

There are a few connector configuration options we recommend setting as well:

    - log.mining.query.filter.mode to `in` while avoiding the use of regex in your `table.include.list`
    - query.fetch.size if you have medium to high streaming volume
    - snapshot.fetch.size if you have tables with large amounts of rows
    - consider adjusting log.mining.batch.size.* settings
    - consider setting schema.history.internal.store.only.captured.tables.ddl = true

If you have any questions, feel free to review the documentation [3], reach out to us on Zulip [4], or ask here on the mailing list.

Thanks,
-cc

[1]: https://docs.oracle.com/en/database/oracle/oracle-database/21/sutil/oracle-logminer-utility.html#GUID-8A4F98EC-C233-4471-BFF9-9FB35EF5AD0D
[2]: https://docs.oracle.com/en/database/oracle/oracle-database/21/sutil/oracle-logminer-utility.html#GUID-BA995486-041E-4C83-83EA-D7BC2A866DE3
[3]: https://debezium.io/documentation/reference/stable/connectors/oracle.html
[4]: https://debezium.zulipchat.com/#narrow/stream/302529-users


Timo Schmidt

Oct 17, 2025, 12:05:25 AM
to debe...@googlegroups.com

Hi Chris,

Thank you very much for your detailed and insightful response. That really helps clarify the context around Oracle’s recommendation and Debezium’s practical guidance.

As a follow-up, I have two additional questions from a functional and operational perspective:

  1. Risk Mitigation in Production:
    Beyond configuration-level adjustments, are there particular operational practices or architectural patterns that Debezium users have found effective in mitigating the inherent risks or limitations of LogMiner (e.g., data-type gaps, or potential inconsistencies)?
    We’re especially interested in how teams ensure CDC reliability and recoverability under production workloads.

  2. 30-character Identifier Limitation:
    Oracle’s 30-character limit for table and column names in LogMiner poses a challenge in our environment, where renaming is not feasible. How can we replicate such tables? Are there any known approaches or best practices within the Debezium community to work around this limitation?

Thanks again for your time and support; your insights are greatly appreciated.

Best regards,
Timo


Chris Cranford

Oct 19, 2025, 10:47:04 PM
to debe...@googlegroups.com
Hi Timo -

What many of our users have found helps manage CDC risk is to have an observability stack that captures the connector's JMX metrics and sends notifications on abnormalities. This helps with troubleshooting connector issues while the data in question may still be available and accessible on the server, which is extremely important because transaction log data is not something that can be retained indefinitely. In addition to JMX metrics monitoring, having tooling that inspects logs for WARN or ERROR messages is also important. There are several configuration options designed to turn hard failures into warnings so the connector does not fault, and having visibility into those occurrences can be important.

In terms of data-type gaps, this is rarely a concern for most users. With our users still running a wide range of Oracle versions, from 11g to 21c, there have been very few, if any, changes in which data types LogMiner supports. I just rechecked the Oracle GoldenGate limitations for the latest flagship 26ai build, and aside from the new data types (vectors and booleans), LogMiner and Oracle's commercial product are quite similar. The one unique outlier is that Debezium supports capturing changes for tables that have exactly one CLOB/NCLOB or BLOB column, which GoldenGate does not. That's a pretty significant advantage for Debezium in certain scenarios.

Regarding recoverability, Debezium provides two very useful on-demand snapshot mechanisms called Incremental and Blocking snapshots. These provide the ability to backfill a subset of rows or all rows from one or more tables. Incremental snapshots in particular can be extremely useful for large systems where you want to capture a subset or all historical data but cannot rely on Flashback queries due to the duration that snapshots take to conclude. In addition, Debezium relies heavily on the runtime to handle recovering from certain connector failures such as network communication issues. The Kafka Connect environment provides an error/retry and back-off mechanism to restart connectors automatically.
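
For illustration, here is a rough sketch of triggering an ad-hoc incremental snapshot through a signaling table. This assumes the connector's signal.data.collection property points at the table below; the table, schema, and data-collection names are made up, and the exact fully-qualified naming format should be checked against the connector documentation linked earlier:

    -- Hypothetical signaling table; column sizes follow the Debezium docs.
    CREATE TABLE debezium.debezium_signal (
      id   VARCHAR2(42) PRIMARY KEY,
      type VARCHAR2(32) NOT NULL,
      data VARCHAR2(2048)
    );

    -- Ask the connector to backfill one table on demand.
    INSERT INTO debezium.debezium_signal (id, type, data)
    VALUES ('adhoc-1',
            'execute-snapshot',
            '{"data-collections": ["INVENTORY.CUSTOMERS"], "type": "incremental"}');
    COMMIT;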

Lastly, I'm afraid the 30-character limit is not something that can be worked around within LogMiner itself. This is an Oracle limitation, and it directly impacts not only LogMiner but any Oracle tooling that relies on LogMiner, such as Oracle XStream. The recommendation in this case is to rename the table or column; however, that may not always be an option. In such cases, you can create a second table that adheres to the 30-character limit and use triggers to apply insert, update, and delete events to that second table as rows are mutated on the primary table. One might think a materialized view is the answer to this problem, but I'm afraid it isn't. When a materialized view is refreshed, it deletes and re-inserts all rows, and this happens at intervals. For large tables, that creates a tremendous amount of unnecessary change event traffic, as every delete and insert is replicated. So I would advise a second table maintained by triggers for the best performance.
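
To illustrate the trigger-based workaround, here is a rough, hypothetical sketch. All names are invented; the shadow table's columns would mirror the long-named originals under names that fit the 30-character limit, and the shadow table is what you add supplemental logging to and list in table.include.list:

    -- Shadow table whose identifiers fit within the 30-character limit.
    CREATE TABLE ord_shadow (
      id          NUMBER PRIMARY KEY,
      customer_id NUMBER,
      amount      NUMBER
    );

    -- Keep the shadow table in sync as rows change on the original table,
    -- whose name exceeds the 30-character limit.
    CREATE OR REPLACE TRIGGER trg_ord_shadow
    AFTER INSERT OR UPDATE OR DELETE
    ON order_transactions_with_a_very_long_name
    FOR EACH ROW
    BEGIN
      IF INSERTING THEN
        INSERT INTO ord_shadow (id, customer_id, amount)
        VALUES (:NEW.id, :NEW.customer_id, :NEW.amount);
      ELSIF UPDATING THEN
        UPDATE ord_shadow
           SET customer_id = :NEW.customer_id,
               amount      = :NEW.amount
         WHERE id = :NEW.id;
      ELSIF DELETING THEN
        DELETE FROM ord_shadow WHERE id = :OLD.id;
      END IF;
    END;
    /

You would also need a one-time backfill (an INSERT INTO ord_shadow SELECT ... from the original table) before the connector snapshots the shadow table, so existing rows are not missed.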

Hope that helps.
-cc
