I'm reaching out to the community for guidance on ingesting data from our production Oracle database with the Debezium Oracle connector.
We ran into a performance issue while setting this up. Initially, supplemental logging wasn't enabled on the database. To capture more detailed change data, we enabled it with the following commands:
alter database add supplemental log data;
alter table Schema_name.Table_name add supplemental log data (all) columns;
Problem:
Unfortunately, enabling supplemental logging resulted in a significant slowdown in application requests, ultimately impacting service performance. To address this, we disabled supplemental logging.
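Before we try re-enabling anything, we plan to verify the logging level and quantify the extra redo volume it generates. These are ad-hoc queries against the standard v$database, dba_log_groups, and v$log_history views (our own sketch, not from the Debezium docs):
-- current database-level supplemental logging status
select supplemental_log_data_min,
       supplemental_log_data_pk,
       supplemental_log_data_all
from v$database;
-- per-table supplemental log groups in the captured schema
select table_name, log_group_name, log_group_type, always
from dba_log_groups
where owner = 'SCHEMA_NAME';
-- log switches per day, to compare redo volume before/after the change
select trunc(first_time) as day, count(*) as log_switches
from v$log_history
group by trunc(first_time)
order by day;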
Question:
While we believe supplemental logging was the culprit, we'd appreciate some insights from the community:
- What are the recommended approaches for using supplemental logging with the Oracle connector while maintaining optimal performance?
- Alternative strategies: are there other ways to achieve comprehensive data capture without sacrificing performance? (A rough sketch of one idea we've been considering follows below.)
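One direction we've been considering (purely our own speculation; we haven't confirmed whether the connector works correctly without (all) columns logging) is scoping the table-level supplemental logging down to key columns only, along these lines:
-- hypothetical narrower scope: log only primary-key columns
-- (unverified against Debezium's LogMiner adapter requirements)
alter table Schema_name.Table_name drop supplemental log data (all) columns;
alter table Schema_name.Table_name add supplemental log data (primary key) columns;
If that loses too much before-image data for updates, we'd go back to (all) columns on just the captured tables.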
We're aiming to re-implement this in production, and your expertise would be invaluable in ensuring a smooth rollout.
After the incident, we changed the configuration as follows.
Changed:
- poll.interval.ms: 100 --> 10000
- snapshot.locking.mode: shared --> none
- schema.history.internal.kafka.recovery.poll.interval.ms: 100 --> 10000
Added:
- "log.mining.strategy": "online_catalog"
- "incremental.snapshot.chunk.size": "1024"
The full connector configuration is below.
{
"name": "name1",
"config": {
"connector.class": "io.debezium.connector.oracle.OracleConnector",
"
offset.flush.interval.ms": "1000",
"database.hostname": "x.x.x.x",
"database.port": "xxxx",
"database.user": "xxxxxxx",
"database.password": "xxxxxxx",
"database.dbname": "xxxx",
"
database.server.name": "xxxx",
"topic.prefix": "xxxx",
"snapshot.mode": "initial",
"database.connection.adapter": "logminer",
"
poll.interval.ms": "10000",
"log.mining.strategy": "online_catalog",
"tasks.max": "1",
"snapshot.locking.mode": "none",
"incremental.snapshot.chunk.size": "1024",
"schema.history.internal.kafka.bootstrap.servers": "bootstrap,server,list",
"schema.history.internal.kafka.topic": "schema",
"
schema.history.internal.kafka.recovery.poll.interval.ms": "10000",
"schema.history.internal.store.only.captured.tables.ddl": "true",
"table.include.list": "schema.table",
"schema.include.list": "schema",
"transforms": "Reroute",
"transforms.Reroute.type": "io.debezium.transforms.ByLogicalTableRouter",
"transforms.Reroute.topic.regex": "(*.*)",
"transforms.Reroute.topic.replacement": "table",
"topic.creation.default.replication.factor": "3",
"topic.creation.default.partitions": "8",
"incremental.snapshot.chunk.size": "512",
"log.mining.include.redo.sql": "true",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable": "false",
"value.converter": "org.apache.kafka.connect.json.JsonConverter"
}
}
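One more prerequisite we'll re-confirm before redeploying with the logminer adapter: LogMiner-based capture requires the database to run in ARCHIVELOG mode.
-- expected result: ARCHIVELOG
select log_mode from v$database;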
Thanks in advance for your time and knowledge sharing!