Debizium Oracle connector with snapshots does not work with some options

686 views
Skip to first unread message

Restwin E

unread,
Sep 18, 2023, 9:19:14 AM9/18/23
to debezium
Hi, can you help me understand how to configure the connector?

We are trying to run the connector for Oracle 19c using snapshots.
Tried running it without snapshots at all in "snapshot.mode": "schema_only". The connector starts without errors, listens to logs, sends creation records correctly.

When running in "snapshot.mode": "initial" mode, it does not take into account the parameter with the snapshot limitation:
"snapshot.select.statement.overrides.mdm6.client": "SELECT * FROM schema.table WHERE persistdate > TO_TIMESTAMP('07.03.2023','DD.MM.YYYYY') AND VERSION = 0".
and starts copying the whole table with only one constraint to the last scn.
INFO For table 'DB.SCHEMA.TABLE' using select statement: 'SELECT "DTYPE", "LOCALSYSTEMID", "VERSION", "APPLICATIONNAME", "PERSISTDATE" [...] FROM "SCHEMA". "TABLE" AS OF SCN 310072308644114'.

When running the connector with the signal table in the database and in schema_only mode, it crashes with a NullPointer error, and I can't figure out from the previous log entries why it don't run with these settings.
Last messages from logs:
...
Sep 18, 2023 @ 16:00:10.164,"[2023-09-18 16:00:10,164] INFO Snapshot step 5 - Reading structure of captured tables (io.debezium.relational.RelationalSnapshotChangeEventSource:135)"
Sep 18, 2023 @ 16:00:10.164,"[2023-09-18 16:00:10,164] INFO Only captured tables schema should be captured, capturing: [DB.SCHEMA.HISTORY] (io.debezium.connector.oracle.OracleSnapshotChangeEventSource:160)"
Sep 18, 2023 @ 16:00:10.497,"[2023-09-18 16:00:10,497] INFO Capturing structure of table DB.SCHEMA.HISTORY (io.debezium.relational.RelationalSnapshotChangeEventSource:344)"
Sep 18, 2023 @ 16:00:10.497,"[2023-09-18 16:00:10,497] INFO Snapshot step 6 - Persisting schema history (io.debezium.relational.RelationalSnapshotChangeEventSource:144)"
Sep 18, 2023 @ 16:00:11.685,"[2023-09-18 16:00:11,685] INFO Already applied 1 database changes (io.debezium.relational.history.SchemaHistoryMetrics:140)"
Sep 18, 2023 @ 16:00:11.685,"[2023-09-18 16:00:11,685] INFO Capturing structure of table DB.USER.SIGNAL_SNAPSHOT (io.debezium.relational.RelationalSnapshotChangeEventSource:344)"
Sep 18, 2023 @ 16:00:11.686,"[2023-09-18 16:00:11,686] WARN Snapshot was not completed successfully, it will be re-executed upon connector restart (io.debezium.pipeline.source.AbstractSnapshotChangeEventSource:97)"
Sep 18, 2023 @ 16:00:11.686,"[2023-09-18 16:00:11,686] INFO Snapshot - Final stage (io.debezium.pipeline.source.AbstractSnapshotChangeEventSource:88)"
Sep 18, 2023 @ 16:00:11.686,"[2023-09-18 16:00:11,686] INFO Connected metrics set to 'false' (io.debezium.pipeline.ChangeEventSourceCoordinator:240)"
Sep 18, 2023 @ 16:00:11.823,"[2023-09-18 16:00:11,823] INFO WorkerSourceTask{id=reindexer_client_1809_3-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask:433)"
Sep 18, 2023 @ 16:00:11.823,"[2023-09-18 16:00:11,823] INFO WorkerSourceTask{id=reindexer_client_1809_3-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask:416)"
Sep 18, 2023 @ 16:00:11.824,"[2023-09-18 16:00:11,824] INFO Stopping down connector (io.debezium.connector.common.BaseSourceTask:278)"
Sep 18, 2023 @ 16:00:11.835,"[2023-09-18 16:00:11,835] INFO Connection gracefully closed (io.debezium.jdbc.JdbcConnection:947)"
Sep 18, 2023 @ 16:00:11.835,"[2023-09-18 16:00:11,835] INFO [Producer clientId=DB-schemahistory] Closing the Kafka producer with timeoutMillis = 30000 ms. (org.apache.kafka.clients.producer.KafkaProducer:1183)"
Sep 18, 2023 @ 16:00:11.836,"[2023-09-18 16:00:11,836] INFO [Producer clientId=connector-producer-reindexer_client_1809_3-0] Closing the Kafka producer with timeoutMillis = 30000 ms. (org.apache.kafka.clients.producer.KafkaProducer:1183)"

In status of connector:
org.apache.kafka.connect.errors.ConnectException: An exception occurred in the change event producer. This connector will be stopped.
at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:72)
at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:116)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.debezium.DebeziumException: java.lang.NullPointerException
at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:85)
at io.debezium.pipeline.ChangeEventSourceCoordinator.doSnapshot(ChangeEventSourceCoordinator.java:155)
at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:137)
at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
5 more
Caused by: java.lang.NullPointerException
at io.debezium.connector.oracle.OracleSnapshotChangeEventSource.getCreateTableEvent(OracleSnapshotChangeEventSource.java:210)
at io.debezium.relational.RelationalSnapshotChangeEventSource.createSchemaChangeEventsForTables(RelationalSnapshotChangeEventSource.java:353)
at io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:146)
at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:76)
8 more


Maybe it can help for understanding - connector with signal table:
{
    "name":"reindexer_client_1409_1",
    "config":{
        "connector.class":"io.debezium.connector.oracle.OracleConnector",
        "tasks.max":"1",
        "database.server.name":"DB",
        "database.url":"jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 10.10.10.10)(PORT = 1521)))(CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = DB)))",
        "database.port":"1521",
        "database.user":"user",
        "database.password":"password",
        "database.dbname": "DB",
"database.connection.adapter": "logminer",
"log.mining.strategy": "online_catalog",
        "schema.history.internal.kafka.bootstrap.servers":"10.10.10.31:9092,10.10.10.32:9092,10.10.10.33:9092",
        "schema.history.internal.kafka.topic":"DB.SCHEMA.HISTORY",
"schema.history.internal.store.only.captured.tables.ddl": "true",
"schema.history.internal.store.only.captured.databases.ddl": "true",
        "schema.include.list":"SCHEMA",
        "table.include.list":"SCHEMA.TABLE",
"topic.prefix": "DB",
        "topic.creation.default.replication.factor":"3",
        "topic.creation.default.partitions":"3",
        "snapshot.mode":"initial",
"snapshot.locking.mode":"none",
"snapshot.select.statement.overrides":"SCHEMA.TABLE",
"snapshot.select.statement.overrides.schema.table":"SELECT * FROM SCHEMA.TABLE WHERE persistdate > TO_TIMESTAMP('07.03.2023','DD.MM.YYYY') AND VERSION = 0",
"signal.data.collection":"DB.USER.SIGNAL_SNAPSHOT",
"signal.enabled.channels":"source",
        "skipped.operations":"t",
        "unavailable.value.placeholder":"__debezium_unavailable_value",
        "publication.autocreate.mode":"filtered",
"decimal.handling.mode": "string",
"include.schema.changes": "false",
        "transforms":"unwrap",
        "transforms.unwrap.type":"io.debezium.transforms.ExtractNewRecordState",
        "transforms.unwrap.drop.tombstones":"false",
        "transforms.unwrap.delete.handling.mode":"rewrite",
"transforms.unwrap.add.fields":"op,table",
"internal.log.mining.read.only":"true"
    }
}

Restwin E

unread,
Sep 19, 2023, 11:11:29 AM9/19/23
to debezium
Hi all! We switched the application to DEBUG logging and got the following details when in signal table mode (also nothing special to my understanding):

"Sep 19, 2023 @ 17:35:40.600","[2023-09-19 17:35:40,600] INFO Capturing structure of table  DB.USER.SIGNAL_SNAPSHOT  (io.debezium.relational.RelationalSnapshotChangeEventSource:344)"
"Sep 19, 2023 @ 17:35:40.600","[2023-09-19 17:35:40,600] ERROR Error during snapshot (io.debezium.relational.RelationalSnapshotChangeEventSource:171)

java.lang.NullPointerException
at io.debezium.connector.oracle.OracleSnapshotChangeEventSource.getCreateTableEvent(OracleSnapshotChangeEventSource.java:210)
at io.debezium.relational.RelationalSnapshotChangeEventSource.createSchemaChangeEventsForTables(RelationalSnapshotChangeEventSource.java:353)
at io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:146)
at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:76)
at io.debezium.pipeline.ChangeEventSourceCoordinator.doSnapshot(ChangeEventSourceCoordinator.java:155)
at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:137)
at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)"
"Sep 19, 2023 @ 17:35:40.601","[2023-09-19 17:35:40,601] INFO Snapshot - Final stage (io.debezium.pipeline.source.AbstractSnapshotChangeEventSource:88)"
"Sep 19, 2023 @ 17:35:40.601","[2023-09-19 17:35:40,601] WARN Snapshot was not completed successfully, it will be re-executed upon connector restart (io.debezium.pipeline.source.AbstractSnapshotChangeEventSource:97)"
"Sep 19, 2023 @ 17:35:40.601","[2023-09-19 17:35:40,601] ERROR Producer failure (io.debezium.pipeline.ErrorHandler:57)

io.debezium.DebeziumException: java.lang.NullPointerException
at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:85)
at io.debezium.pipeline.ChangeEventSourceCoordinator.doSnapshot(ChangeEventSourceCoordinator.java:155)
at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:137)
at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NullPointerException
at io.debezium.connector.oracle.OracleSnapshotChangeEventSource.getCreateTableEvent(OracleSnapshotChangeEventSource.java:210)
at io.debezium.relational.RelationalSnapshotChangeEventSource.createSchemaChangeEventsForTables(RelationalSnapshotChangeEventSource.java:353)
at io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:146)
at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:76)
... 8 more"
"Sep 19, 2023 @ 17:35:40.601","[2023-09-19 17:35:40,601] INFO Connected metrics set to 'false' (io.debezium.pipeline.ChangeEventSourceCoordinator:240)"
"Sep 19, 2023 @ 17:35:40.672","[2023-09-19 17:35:40,672] INFO WorkerSourceTask{id=reindexer_client_1909_3-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask:416)"
"Sep 19, 2023 @ 17:35:40.672","[2023-09-19 17:35:40,672] INFO WorkerSourceTask{id=reindexer_client_1909_3-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask:433)"
"Sep 19, 2023 @ 17:35:40.672","[2023-09-19 17:35:40,672] ERROR WorkerSourceTask{id=reindexer_client_1909_3-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)

org.apache.kafka.connect.errors.ConnectException: An exception occurred in the change event producer. This connector will be stopped.
at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:72)
at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:116)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.debezium.DebeziumException: java.lang.NullPointerException
at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:85)
at io.debezium.pipeline.ChangeEventSourceCoordinator.doSnapshot(ChangeEventSourceCoordinator.java:155)
at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:137)
at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
... 5 more

Caused by: java.lang.NullPointerException
at io.debezium.connector.oracle.OracleSnapshotChangeEventSource.getCreateTableEvent(OracleSnapshotChangeEventSource.java:210)
at io.debezium.relational.RelationalSnapshotChangeEventSource.createSchemaChangeEventsForTables(RelationalSnapshotChangeEventSource.java:353)
at io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:146)
at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:76)
... 8 more"
"Sep 19, 2023 @ 17:35:40.673","[2023-09-19 17:35:40,673] ERROR WorkerSourceTask{id=reindexer_client_1909_3-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:180)"
"Sep 19, 2023 @ 17:35:40.673","[2023-09-19 17:35:40,673] INFO Stopping down connector (io.debezium.connector.common.BaseSourceTask:278)"
"Sep 19, 2023 @ 17:35:40.685","[2023-09-19 17:35:40,685] INFO Connection gracefully closed (io.debezium.jdbc.JdbcConnection:947)"
"Sep 19, 2023 @ 17:35:40.685","[2023-09-19 17:35:40,685] INFO [Producer clientId=DB-schemahistory] Closing the Kafka producer with timeoutMillis = 30000 ms. (org.apache.kafka.clients.producer.KafkaProducer:1183)"
"Sep 19, 2023 @ 17:35:40.686","[2023-09-19 17:35:40,686] INFO [Producer clientId=connector-producer-reindexer_client_1909_3-0] Closing the Kafka producer with timeoutMillis = 30000 ms. (org.apache.kafka.clients.producer.KafkaProducer:1183)"


понедельник, 18 сентября 2023 г. в 16:19:14 UTC+3, debezium:

Chris Cranford

unread,
Sep 19, 2023, 11:56:31 AM9/19/23
to debe...@googlegroups.com
Hi,

Please make sure that when you define the snapshot.select.statement.overrides that you use the proper case for the table as returned by the Oracle JDBC driver.  For example, Oracle stores all object references in upper-case by default unless you create the table using double-quotes to make the table or schema names case-sensitive.  Therefore, its critical you define the configuration as follows:

snapshot.select.statement.overrides=SCHEMA.TABLE1,SCHEMA.TABLE2
snapshot.select.statement.overrides.SCHEMA.TABLE1=select ...
snapshot.select.statement.overrides.SCHEMA.TABLE2=select...

That would explain the fact the override was not used in your first email.  In the second use case, this looks like there is some case mismatch problem still.  Can you share the exact configuration for the NPE use case?  If you sanitize table names, please be sure to preserve the case used so we can determine what could be going on.

Thanks,
Chris
--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/debezium/01a8a709-de7f-4877-b7ab-858cda668069n%40googlegroups.com.

Restwin E

unread,
Sep 20, 2023, 10:34:05 AM9/20/23
to debezium
Hi. Thank you so much for your reply.
Indeed changing the case helped - the application overrides Select. Thanks.
I think this can either be added to the documentation or use the toUpperCase function for OracleConnector. I can create an issue later.

Regarding signaling tables with NPE, I forgot to mention that we are using version 2.2 on the test so far, but plan to switch to 2.3 or 2.4 when we bring it to the production environment. This will allow us to maybe localize the bug.
The configuration is used as stated in the first post, except we tried in schema_only mode too. I have replaced the schema, table and DB names as they have no special meaning and have a simple name with no dots (mdm1.client). Specified in the same case. All other parameters are specified as is.
Could the error be caused by the fact that we are collecting data from one schema (SCHEMA) and the signaling table is in a other schema (USER) given the specified parameters (in the first message)?

вторник, 19 сентября 2023 г. в 18:56:31 UTC+3, Chris Cranford:

Chris Cranford

unread,
Sep 21, 2023, 7:11:09 AM9/21/23
to debe...@googlegroups.com
Hi,

There is already a Jira issue and work being done to introduce case-insensitive select override behavior.  Hopefully that will be included in Debezium 2.5 next quarter.

As for the last question, ideally utilizing two schemas shouldn't be a problem as I believe we have such a use case in our test suite, but I'll add a todo to check this and report back with my findings.  In the meantime, if you use the exact same configuration but with the signal table and the data table in the same schema, do you observe any change in the connector's behavior?

Thanks,
Chris

Restwin E

unread,
Sep 22, 2023, 10:34:02 AM9/22/23
to debezium
Hi!
We have created a table in the same database schema. The connector was created successfully. So there seems to be an impact that the signal table and the tables being captured are in different schemas.
There are of course questions on how else to create records for the signal tables (lots of conditions with value in quotes), but I'll ask that in another thread if I fail.
We've decided to defer incremental snapshots to another release of our solution, so we're not dependent on this problem for now.

Thanks

четверг, 21 сентября 2023 г. в 14:11:09 UTC+3, Chris Cranford:

Chris Cranford

unread,
Sep 22, 2023, 2:27:24 PM9/22/23
to debe...@googlegroups.com
Hi,

Can you confirm what grants the Debezium database user has on the signal table as well as the schema itself that the signal table is within? 

Thanks,
Chris

Restwin E

unread,
Sep 29, 2023, 10:42:50 AM9/29/23
to debezium
Hi. Sorry for the slow response.
"signal.data.collection":"TL_DEBEZIUM.SIGNAL_SNAPSHOT"
It has SELECT, INSERT,  UPDATE, DELETE ON  SIGNAL_SNAPSHOT

пятница, 22 сентября 2023 г. в 21:27:24 UTC+3, Chris Cranford:

Chris Cranford

unread,
Oct 4, 2023, 5:05:28 PM10/4/23
to debe...@googlegroups.com
Hi Restwin -

I tested this with the latest 2.4.0.Final and unfortunately I was unable to reproduce the failure.  I will say that if you place the signal table in a secondary schema, you will need to define that schema in the "schema.include.list" or simply rely on the "table.include.list" without the "schema.include.list".  Are you using a specific micro-release of Debezium 2.2?

Thanks,
Chris

Restwin E

unread,
Oct 9, 2023, 9:55:36 AM10/9/23
to debezium
Hi!
I think I'll try version 2.4.0.Final tomorrow and come back with an answer.

четверг, 5 октября 2023 г. в 00:05:28 UTC+3, Chris Cranford:

Restwin E

unread,
Oct 19, 2023, 10:20:41 AM10/19/23
to debezium
Hi!
I tried with the new version and on one hand the connector doesn't crash when you create it. But when you create a signal it crashes with the same error.
The error is corrected, as you advised, by adding the scheme where the table of signals is located to the whitelist.
However, I have another error, but maybe I can solve it myself :)

понедельник, 9 октября 2023 г. в 16:55:36 UTC+3, Restwin E:
Reply all
Reply to author
Forward
0 new messages