Free Download Oracle 10g

Sanny Olafeso

Jul 25, 2024, 9:02:35 PM
to SKAT and MetaSKAT user group

To optimally configure and run a Debezium Oracle connector, it is helpful to understand how the connector performs snapshots, streams change events, determines Kafka topic names, uses metadata, and implements event buffering.

Typically, the redo logs on an Oracle server are configured to not retain the complete history of the database. As a result, the Debezium Oracle connector cannot retrieve the entire history of the database from the logs. To establish a baseline for the current state of the database, the connector performs an initial consistent snapshot of the database the first time that it starts.

If the time needed to complete the initial snapshot exceeds the UNDO_RETENTION time that is set for the database (fifteen minutes, by default), an ORA-01555 exception can occur. For more information about the error, and about the steps that you can take to recover from it, see the Frequently asked questions.
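
As a rough planning aid, the comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name and the idea of estimating the snapshot duration up front are assumptions, not part of Debezium, and the default of 900 seconds simply restates the fifteen minutes mentioned above.

```python
# Hypothetical helper: compares an estimated snapshot duration with the
# database's UNDO_RETENTION setting (fifteen minutes by default, per the text).
DEFAULT_UNDO_RETENTION_SECONDS = 15 * 60


def snapshot_exceeds_undo_retention(estimated_snapshot_seconds,
                                    undo_retention_seconds=DEFAULT_UNDO_RETENTION_SECONDS):
    """Return True when the estimated snapshot time is longer than the undo retention."""
    if estimated_snapshot_seconds < 0 or undo_retention_seconds <= 0:
        raise ValueError("durations must be positive")
    return estimated_snapshot_seconds > undo_retention_seconds


# A 20-minute snapshot is at risk with the default retention; a 10-minute one is not.
print(snapshot_exceeds_undo_retention(20 * 60))
print(snapshot_exceeds_undo_retention(10 * 60))
```

A True result only signals that an ORA-01555 error becomes possible; it does not predict that the error will occur.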

The following workflow lists the steps that Debezium takes to create a snapshot. These steps describe the process for a snapshot when the snapshot.mode configuration property is set to its default value, which is initial. You can customize the way that the connector creates snapshots by changing the value of the snapshot.mode property. If you configure a different snapshot mode, the connector completes the snapshot by using a modified version of this workflow.

Determine the tables to be captured. By default, the connector captures all tables except those with schemas that exclude them from capture. After the snapshot completes, the connector continues to stream data for the specified tables. To capture data only from a subset of tables or table elements, set properties such as table.include.list or table.exclude.list.
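
To make the table filtering concrete, here is a minimal Python sketch that assembles a connector registration payload with table.include.list set. The connector name and the table names are placeholders, and the payload is deliberately incomplete: a working connector also needs connection and schema history settings, which are omitted here.

```python
import json


def build_connector_config(name, include_tables):
    """Build a partial Kafka Connect registration payload for a Debezium Oracle connector."""
    if not include_tables:
        raise ValueError("include_tables must not be empty")
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.oracle.OracleConnector",
            # Default snapshot mode, as described above.
            "snapshot.mode": "initial",
            # Comma-separated list of tables to capture.
            "table.include.list": ",".join(include_tables),
        },
    }


# Placeholder names, chosen only for illustration.
config = build_connector_config(
    "inventory-connector",
    ["INVENTORY.CUSTOMERS", "INVENTORY.ORDERS"],
)
print(json.dumps(config, indent=2))
```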

Capture the structure of all database tables, or all tables that are designated for capture. The connector persists schema information in its internal database schema history topic. The schema history provides information about the structure that is in effect when a change event occurs.

By default, the connector captures the schema of every table in the database that is in capture mode, including tables that are not configured for capture. If tables are not configured for capture, the initial snapshot captures only their structure; it does not capture any table data. For more information about why snapshots persist schema information for tables that you did not include in the initial snapshot, see Understanding why initial snapshots capture the schema for all tables.

Confirm that the table was created before the snapshot began. If the table was created after the snapshot began, the connector skips the table. After the snapshot is complete, and the connector transitions to streaming, it emits change events for any tables that were created after the snapshot began.

After the snapshot process begins, if the process is interrupted due to connector failure, rebalancing, or other reasons, the process restarts after the connector restarts. After the connector completes the initial snapshot, it continues streaming from the position that it read in Step 3 so that it does not miss any updates. If the connector stops again for any reason, after it restarts, it resumes streaming changes from where it previously left off.

The connector performs a database snapshot as described in the default workflow for creating an initial snapshot. After the snapshot completes, the connector begins to stream event records for subsequent database changes.

Set this option to restore a database schema history topic that is lost or corrupted. After a restart, the connector runs a snapshot that rebuilds the topic from the source tables. You can also set the property to periodically prune a database schema history topic that experiences unexpected growth.

WARNING: Do not use this mode to perform a snapshot if schema changes were committed to the database after the last connector shutdown.

The custom snapshot mode lets you inject your own implementation of the io.debezium.spi.snapshot.Snapshotter interface. Set the snapshot.mode.custom.name configuration property to the name provided by the name() method of your implementation. The name is specified on the classpath of your Kafka Connect cluster. If you use the Debezium EmbeddedEngine, the name is included in the connector JAR file. For more information, see custom snapshotter SPI.

In some cases, you might want to limit schema capture in the initial snapshot. This can be useful when you want to reduce the time required to complete a snapshot, or when Debezium connects to the database instance through a user account that has access to multiple logical databases, but you want the connector to capture changes only from tables in a specific logical database.

In some cases, you might want the connector to capture data from a table whose schema was not captured by the initial snapshot. Depending on the connector configuration, the initial snapshot might capture the table schema only for specific tables in the database. If the table schema is not present in the history topic, the connector fails to capture the table, and reports a missing schema error.

All entries for the table in the transaction log use the same schema. For information about capturing data from a new table that has undergone structural changes, see Capturing data from tables not captured by the initial snapshot (schema change).

(Optional) After the snapshot completes, initiate an incremental snapshot on the newly added tables. The incremental snapshot first streams the historical data of the newly added tables, and then resumes reading changes from the redo and archive logs for previously configured tables, including changes that occurred while the connector was offline.

If a schema change is applied to a table, records that are committed before the schema change have different structures than those that were committed after the change. When Debezium captures data from a table, it reads the schema history to ensure that it applies the correct schema to each event. If the schema is not present in the schema history topic, the connector is unable to capture the table, and an error results.

If you want to capture data from a table that was not captured by the initial snapshot, and the schema of the table was modified, you must add the schema to the history topic, if it is not already available. You can add the schema by running a new schema snapshot, or by running an initial snapshot for the table.

Removing offsets should be performed only by advanced users who have experience in manipulating internal Kafka Connect data. This operation is potentially destructive, and should be performed only as a last resort.

In this procedure, the connector performs a full initial snapshot of the database. As with any initial snapshot, in a database with many large tables, running an initial snapshot can be a time-consuming operation. After the snapshot completes, you can optionally trigger an incremental snapshot to capture any changes that occurred while the connector was offline.

By default, a connector runs an initial snapshot operation only after it starts for the first time. Following this initial snapshot, under normal circumstances, the connector does not repeat the snapshot process. Any future change event data that the connector captures comes in through the streaming process only.

However, in some situations the data that the connector obtained during the initial snapshot might become stale, lost, or incomplete. To provide a mechanism for recapturing table data, Debezium includes an option to perform ad hoc snapshots. You might want to perform an ad hoc snapshot after any of the following changes occur in your Debezium environment:

You can re-run a snapshot for a table for which you previously captured a snapshot by initiating an ad hoc snapshot. Ad hoc snapshots require the use of signaling tables. You initiate an ad hoc snapshot by sending a signal request to the Debezium signaling table.

When you initiate an ad hoc snapshot of an existing table, the connector appends content to the topic that already exists for the table. If a previously existing topic was removed, Debezium can create a topic automatically if automatic topic creation is enabled.

Ad hoc snapshot signals specify the tables to include in the snapshot. The snapshot can capture the entire contents of the database, or capture only a subset of the tables in the database. The snapshot can also capture a subset of the contents of those tables.

You specify the tables to capture by sending an execute-snapshot message to the signaling table. Set the type of the execute-snapshot signal to incremental or blocking, and provide the names of the tables to include in the snapshot, as described in the following table:

An array that contains regular expressions matching the fully-qualified names of the tables to be snapshotted.
The format of the names is the same as for the signal.data.collection configuration option.
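
To illustrate how such an array of regular expressions selects tables, the following Python sketch filters a list of fully-qualified names. It assumes that each expression must match the whole name, and it uses Python's re module, whose syntax differs in places from the Java regular expressions that Debezium evaluates. The function name and all table names are hypothetical.

```python
import re


def select_tables(patterns, known_tables):
    """Return the tables whose full name matches at least one pattern."""
    compiled = [re.compile(p) for p in patterns]
    return [
        table
        for table in known_tables
        # fullmatch requires the pattern to cover the entire name (assumed behavior).
        if any(rx.fullmatch(table) for rx in compiled)
    ]


# Hypothetical fully-qualified names in database.schema.table form.
tables = [
    "ORCLPDB1.INVENTORY.CUSTOMERS",
    "ORCLPDB1.INVENTORY.ORDERS",
    "ORCLPDB1.HR.EMPLOYEES",
]
selected = select_tables([r"ORCLPDB1\.INVENTORY\..*"], tables)
print(selected)
```

Escaping the dots keeps the expression from matching unintended names, because an unescaped dot matches any character.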

An optional array that specifies a set of additional conditions that the connector evaluates to determine the subset of records to include in a snapshot.
Each additional condition is an object that specifies the criteria for filtering the data that an ad hoc snapshot captures. You can set the following parameters for each additional condition:

Specifies column values that must be present in a database record for the snapshot to include it, for example, "color='blue'".

The values that you assign to the filter parameter are the same types of values that you might specify in the WHERE clause of SELECT statements when you set the snapshot.select.statement.overrides property for a blocking snapshot. In earlier Debezium releases, an explicit filter parameter was not defined for snapshot signals; instead, filter criteria were implied by the values that were specified for the now deprecated additional-condition parameter.
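
The pieces described above can be put together in a short Python sketch that builds the contents of an execute-snapshot signal. The field names data-collections, additional-conditions, and data-collection, and the id/type/data layout of a signaling table row, are assumptions based on the Debezium signaling documentation as I recall it, so check them against the release you run. The function and table names are hypothetical.

```python
import json
import uuid

VALID_SNAPSHOT_TYPES = {"incremental", "blocking"}


def build_execute_snapshot_signal(data_collections, snapshot_type="incremental",
                                  additional_conditions=None):
    """Return the id, type, and data values for one signaling table row."""
    if snapshot_type not in VALID_SNAPSHOT_TYPES:
        raise ValueError("snapshot_type must be 'incremental' or 'blocking'")
    if not data_collections:
        raise ValueError("at least one table pattern is required")
    data = {"data-collections": list(data_collections), "type": snapshot_type}
    if additional_conditions:
        # Each condition pairs a table with a WHERE-style filter expression.
        data["additional-conditions"] = [
            {"data-collection": table, "filter": filter_expr}
            for table, filter_expr in additional_conditions
        ]
    return {
        "id": str(uuid.uuid4()),  # arbitrary unique identifier for the signal
        "type": "execute-snapshot",
        "data": json.dumps(data),
    }


signal = build_execute_snapshot_signal(
    ["ORCLPDB1.INVENTORY.PRODUCTS"],
    additional_conditions=[("ORCLPDB1.INVENTORY.PRODUCTS", "color='blue'")],
)
print(signal["type"])
print(signal["data"])
```

The three returned values would then be written as one row to the signaling table, for example with a parameterized INSERT statement issued through your database driver.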
