Online REDO LOG files or archive log files do not contain the offset


Johan Vandevenne

Jun 26, 2022, 7:27:25 AM
to debezium
Hello,

My connector fails to start with the error message below. I assume I have to talk to the DBA about this, but my question is:
How do I trigger a new snapshot? The signal table is unavailable because the connector is not running.

[2022-06-26 13:20:04,854] ERROR Mining session stopped due to the {} (io.debezium.connector.oracle.logminer.LogMinerHelper)
io.debezium.DebeziumException: Online REDO LOG files or archive log files do not contain the offset scn 146545375278.  Please perform a new snapshot.
        at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:142)
        at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:60)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:174)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:141)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2022-06-26 13:20:04,854] ERROR Producer failure (io.debezium.pipeline.ErrorHandler)
io.debezium.DebeziumException: Online REDO LOG files or archive log files do not contain the offset scn 146545375278.  Please perform a new snapshot.
        at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:142)
        at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:60)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:174)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:141)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:109)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

marcelo...@gmail.com

Jun 26, 2022, 2:44:01 PM
to debezium
This is because the SCN has been deleted or moved. I faced this same problem on a recent installation of Debezium with Oracle LogMiner. (Debezium version 1.9.3 and Oracle 11g)

You need to talk to a DBA and understand why the logs containing that SCN are being deleted or moved.

Here, in our case, the backup process erased the redo logs. So we changed the backup strategy.

Here are my 9 tips for configuring Debezium with Oracle LogMiner (use at your own risk):

1. Archived Redo Log files must be in a single directory only, as Debezium always collects data from a single ARCHIVE destination, chosen with this lookup against V$ARCHIVE_DEST_STATUS (a fuller query sketch follows after this list):

SELECT * FROM V$ARCHIVE_DEST_STATUS WHERE STATUS='VALID' AND TYPE='LOCAL' AND ROWNUM=1

2. Archived Redo Log files must not be removed or moved to another directory. Debezium may need these files when it has to resume or re-snapshot, and if they no longer exist, the SCN data in them is lost.

3. If it is necessary to move the archived Redo Logs to another directory, due to lack of disk space or other reasons, we recommend keeping the files for at least 10 to 15 days to ensure that everything has been captured by Debezium. Removing them earlier is not advised because, in an incident (such as loss of capture or a database outage), we may need older Redo Logs for the SCNs contained in them.

4. We should not keep any Redo Logs with SEQ=0, as this causes Redo Log conflict issues in Debezium.

5. A minimum Redo Log retention of at least 24 hours is required. Depending on the size of the table and the seasonality with which the data is updated, this time is necessary to ensure that all data is captured. You can reduce this period as you learn how your data behaves. We are in an age of live data (remember real time?), so learn from your data.

6. You need to dig in and understand which tables in your ecosystem go more than a day without receiving any records. Oracle treats their SCN as "old" or outdated, and the backup process removes the Redo Log once its "age" indicates that it needs to be moved or deleted. This breaks the capture and stops the Debezium connector from working.

7. We use a Heartbeat strategy to work around this improper "removal" of the SCN. With this strategy, the Heartbeat sends data every 10 seconds, advancing the SCN automatically so that it never goes stale and is never removed prematurely.

8. By default, the Debezium connector on Oracle LogMiner only considers the Redo Logs present in the +DGRECO directory; any Redo Log that a process moves to another directory is ignored by Debezium. A lot of confusion occurs here because people are led to believe that, as if by magic, Debezium automatically identifies where the files are, which is not true.

9. Another important point: /orabackup/logminer is not considered by the connector, so if a Redo Log is moved to this directory as a backup strategy, it will not be recognized by the connector.
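
To make tips 1, 3 and 5 concrete, here is a minimal SQL sketch of the checks involved (it assumes only the standard Oracle dynamic views and SELECT access on them; adapt it to your environment):

-- Which local archive destination the connector will use (the tip 1 lookup)
SELECT DEST_ID, DEST_NAME, STATUS, TYPE
  FROM V$ARCHIVE_DEST_STATUS
 WHERE STATUS = 'VALID' AND TYPE = 'LOCAL';

-- The SCN range each archived log covers; the connector's last committed SCN
-- must still fall within (or before) one of these ranges for streaming to resume
SELECT NAME, SEQUENCE#, FIRST_CHANGE#, NEXT_CHANGE#
  FROM V$ARCHIVED_LOG
 ORDER BY SEQUENCE#;

If the SCN from an error like the one above falls outside every [FIRST_CHANGE#, NEXT_CHANGE#) range returned, the logs that contained it are gone and a new snapshot is unavoidable.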

What I learned from this is that you are going to configure this data capture with Debezium 100 times in a LogMiner environment, and each of those 100 times will be different. Be patient and fight the red demon just like Gandalf fought the Balrog.

Regards,

Marcelo Costa

https://www.linkedin.com/in/marcelojscosta/






Nathan Smit

Jun 26, 2022, 3:27:29 PM
to debezium
Just to add to the comment above: in order to get things up and running again, you're going to have to clear the offsets. There's some detail on clearing committed offsets in the FAQ: https://debezium.io/documentation/faq/#how_to_remove_committed_offsets_for_a_connector

Here's some helpful advice from Chris that I've stolen from a Zulip chat.  In the past I've found it easiest to just delete the history topic and then redeploy the connector with a new name.

Chris Cranford

The best and easiest way to perform a new snapshot when you hit this scenario is to either:

  1. Delete the database history & offset topic from the Kafka broker
  2. Restart the connector

OR

  1. Delete the database history topic from the Kafka broker
  2. Redeploy the connector in Kafka Connect with a brand-new name

Using a new name in the latter approach guarantees that a new offsets key will be used, since offset keys are based on the connector's name, so there is no need to remove the prior offsets from the broker when using this approach.


Chris Cranford

Jun 27, 2022, 9:06:06 AM
to debe...@googlegroups.com, marcelo...@gmail.com
Hi Marcelo -

These are great points, though I have a few comments inline.

On 6/26/22 14:44, marcelo...@gmail.com wrote:
This is because the SCN has been deleted or moved. I faced this same problem on a recent installation of Debezium with Oracle LogMiner. (Debezium version 1.9.3 and Oracle 11g)

You need to talk to a DBA and understand why the logs containing that SCN are being deleted or moved.

Here, in our case, the backup process erased the redo logs. So we changed the backup strategy.

Here are my 9 tips for configuring Debezium with Oracle LogMiner (use at your own risk):

1. Archived Redo Log files must be in a single directory only, as Debezium always collects data from a single ARCHIVE destination, chosen with this lookup against V$ARCHIVE_DEST_STATUS:

SELECT * FROM V$ARCHIVE_DEST_STATUS WHERE STATUS='VALID' AND TYPE='LOCAL' AND ROWNUM=1


You can also influence which archive log destination gets used by using:
https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-property-log-mining-archive-destination-name

This ties into a few of the other points here, as it allows the DBA to configure a special location for archive logs with longer retention policies and space allocations that do not directly affect the database server. Some users with extremely large installations set up a separate location just to keep the archive logs off the database hardware, mounted via this destination mapping for Debezium compatibility while keeping server space allocation minimal.
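
If it helps, here is a hedged sketch for finding the destination names that property can reference (standard V$ views; DEST_NAME values such as LOG_ARCHIVE_DEST_1 are whatever the DBA has configured):

-- List the configured archive destinations; the DEST_NAME value is what
-- log.mining.archive.destination.name expects
SELECT DEST_ID, DEST_NAME, DESTINATION, STATUS
  FROM V$ARCHIVE_DEST
 WHERE STATUS <> 'INACTIVE';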


2. Archived Redo Log files must not be removed or moved to another directory. Debezium may need these files when it has to resume or re-snapshot, and if they no longer exist, the SCN data in them is lost.

See above.


3. If it is necessary to move the archived Redo Logs to another directory, due to lack of disk space or other reasons, we recommend keeping the files for at least 10 to 15 days to ensure that everything has been captured by Debezium. Removing them earlier is not advised because, in an incident (such as loss of capture or a database outage), we may need older Redo Logs for the SCNs contained in them.

See above. The 10-15 day period is really arbitrary, and I would say it depends on your environment. The best way to know how far back you need to keep archive logs is to review the connector's metrics, checking the OffsetScn value. Any archive log whose range does not contain or come after this SCN point can safely be removed without any concern about restarting the connector, as those logs have been fully processed.
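
As a sketch of that check (the literal SCN here is just the value from the original error; substitute the OffsetScn reported by your connector's metrics):

-- Archived logs that end at or before the connector's OffsetScn have been
-- fully processed and are safe to age out
SELECT NAME, SEQUENCE#, FIRST_CHANGE#, NEXT_CHANGE#
  FROM V$ARCHIVED_LOG
 WHERE NEXT_CHANGE# <= 146545375278
 ORDER BY SEQUENCE#;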


4. We should not keep any Redo Logs with SEQ=0, as this causes Redo Log conflict issues in Debezium.

This is the first time I've seen this. Can you elaborate on how this happens and what the implications are, Marcelo?


5. A minimum Redo Log retention of at least 24 hours is required. Depending on the size of the table and the seasonality with which the data is updated, this time is necessary to ensure that all data is captured. You can reduce this period as you learn how your data behaves. We are in an age of live data (remember real time?), so learn from your data.

I would recommend seeing my comments on (3), as they give a clear indication of how to know when it's safe to deal with aged log files.


6. You need to dig in and understand which tables in your ecosystem go more than a day without receiving any records. Oracle treats their SCN as "old" or outdated, and the backup process removes the Redo Log once its "age" indicates that it needs to be moved or deleted. This breaks the capture and stops the Debezium connector from working.

This "age" is called the Undo Retention period and your DBA knows all about how this works and can give more detail if needed.


7. We use a Heartbeat strategy to work around this improper "removal" of the SCN. With this strategy, the Heartbeat sends data every 10 seconds, advancing the SCN automatically so that it never goes stale and is never removed prematurely.

8. By default, the Debezium connector on Oracle LogMiner only considers the Redo Logs present in the +DGRECO directory; any Redo Log that a process moves to another directory is ignored by Debezium. A lot of confusion occurs here because people are led to believe that, as if by magic, Debezium automatically identifies where the files are, which is not true.

See my comments in (1), this can be explicitly influenced.

9. Another important point: /orabackup/logminer is not considered by the connector, so if a Redo Log is moved to this directory as a backup strategy, it will not be recognized by the connector.

This is likely because the value in the V$ARCHIVED_LOG.NAME column no longer aligns with the location on the file system. Log filenames are resolved from V$ARCHIVED_LOG, fwiw.
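
One way to confirm such a mismatch (a sketch, assuming the files were moved outside Oracle's control so the catalog still reports the old paths):

-- The path Oracle still advertises for each archived log; if NAME points at
-- the old directory while the file now lives under /orabackup/logminer,
-- LogMiner (and therefore Debezium) cannot open it
SELECT NAME, SEQUENCE#, STATUS, DELETED
  FROM V$ARCHIVED_LOG
 ORDER BY SEQUENCE#;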
