ERROR| REDO LOG files or archive log


HariBabu kuruva

Nov 10, 2023, 5:24:50 AM11/10/23
to debe...@googlegroups.com
Hi All,
We got the below error in the connector logs. When I checked with the DB team, they said they retain only the last 3 days of archive logs, not more than that.

Please help me with the following doubts:

1. How do we address this issue?
2. Why does the connector look at logs that are 3 days old, when it should capture changes from the DB immediately? Please help me understand how exactly the capturing works here.

Thanks in advance.

[2023-11-09 21:27:19,284] ERROR [cap-connector|task-0] Mining session stopped due to error. (io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource:260)
io.debezium.DebeziumException: Online REDO LOG files or archive log files do not contain the offset scn 10972785816716.  Please perform a new snapshot.
        at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:159)
        at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:62)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:272)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:197)
        at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:137)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
[2023-11-09 21:27:19,286] ERROR [cap-connector|task-0] Producer failure (io.debezium.pipeline.ErrorHandler:52)
io.debezium.DebeziumException: Online REDO LOG files or archive log files do not contain the offset scn 10972785816716.  Please perform a new snapshot.

--

Thanks and Regards,
 Hari 
Mobile:9790756568

Chris Cranford

Nov 10, 2023, 11:32:35 AM11/10/23
to debe...@googlegroups.com
Hi Hari -

We recently became aware of a specific use case that could lead to a bug where the connector would not advance the low watermark SCN correctly.  This could lead to a situation where one or more transactions were considered active in the internal transaction buffer, and upon a restart the low watermark SCN would point to a position in the past, which could very well reference a position in an archive log that no longer exists.  I would suggest that all our Oracle users consider upgrading to Debezium 2.5.0.Alpha2, released today, or to Debezium 2.4.1.Final, which will be released next week; both include the fix for this bug.

Now the more detailed explanation is that the Oracle connector maintains two watermarks, or positions into the redo logs: a low and a high position.  The low watermark represents a resume position, while the high watermark represents the last committed transaction, per redo thread, for which we have sent events.  These watermarks are necessary because Oracle writes a combination of committed and uncommitted changes to the redo logs, and we do not know whether we should act on a transaction until we see the COMMIT or the ROLLBACK for that specific transaction. Therefore, the connector maintains an internal buffer of all in-progress transactions, and a transaction is only released from that buffer when we see its COMMIT or ROLLBACK.

When the connector restarts, it must rebuild this buffer, so it starts from the low watermark and re-processes all data from that point forward. The high watermark is used to avoid resending transactions to your Kafka topics that we've already sent, so the state of the buffer after a restart, once the redo logs have been re-consumed from the low watermark, should be identical to what it was before the restart.

And as such, since we are rebuilding this buffer from a position in the past (the low watermark), if you have an in-progress transaction that is left in this state for longer than your archive log retention period, then when the connector restarts we won't be able to find the resume position and this error will be thrown.  There are also user-controlled cases (beyond the bug I mentioned above) that can cause this.  We've had at least one report where an engineer left a SQL Developer session open over a long weekend with an active transaction; the connector restarted later in the weekend and failed because the resume point was older than the available archive logs.  For these types of corner cases, you can explore whether the `log.mining.transaction.retention.ms` setting will help.
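
As a rough sketch, the retention setting Chris mentions would sit alongside the usual connector properties like this (the connector name is taken from the log snippet above; the value and other property names are illustrative, and any transaction open longer than the retention window will be discarded from the buffer, so its events would be lost):

```json
{
  "name": "cap-connector",
  "config": {
    "connector.class": "io.debezium.connector.oracle.OracleConnector",
    "log.mining.transaction.retention.ms": "3600000"
  }
}
```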

Let me know if you have any other questions.
Chris
--

HariBabu kuruva

Nov 13, 2023, 9:39:04 AM11/13/23
to debe...@googlegroups.com
Thanks for your detailed explanation.

I have a few questions, appreciate it if you could help.

Is this bug present only in Debezium 2.4, or does it exist in previous versions as well?

As I am working in a POC environment, how can I temporarily fix this issue? (e.g., by deleting and recreating the connector, or by deleting the topics)
Chris Cranford

Nov 13, 2023, 9:54:07 AM11/13/23
to debe...@googlegroups.com
Hi Hari -

This bug existed in all previous versions of the connector.

You can certainly elect to delete the database history topic, clear the offsets, and redeploy the connector, taking a new snapshot to rectify the problem.  If you're confident in the process, you can also manipulate the offsets manually and then use incremental snapshots to re-take the snapshot for the affected tables as needed.  I'd suggest using whichever process is the easiest for you to manage.
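
A rough ops sketch of the first option, assuming Kafka Connect 3.5+ (which added the stop/offsets REST endpoints) and hypothetical connector, topic, and host names — adjust all of them to your environment:

```shell
# Stop the connector so its offsets can be cleared
curl -X PUT http://localhost:8083/connectors/cap-connector/stop

# Clear the connector's stored offsets (requires the connector to be STOPPED)
curl -X DELETE http://localhost:8083/connectors/cap-connector/offsets

# Delete the schema history topic (name is hypothetical)
kafka-topics.sh --bootstrap-server localhost:9092 \
  --delete --topic schema-history.oracle

# Resume; the connector will take a fresh snapshot
curl -X PUT http://localhost:8083/connectors/cap-connector/resume
```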

Thanks,
Chris

HariBabu kuruva

Nov 20, 2023, 11:41:13 PM11/20/23
to debe...@googlegroups.com
Hi Chris,

In continuation of this thread, please help me with the below things.

I have installed debezium-2.4.1.Final and restarted the connector, but it gave the same error message about the missing archive log.  So I tried to delete the DB history topic and restart the connector, and then I got the error "db history topic is missing".  For now I have restarted the connector with a new name.

  1. As I have continued with the same old setup, is the archive-log-missing error expected the first time?
  2. When I delete the DB history topic and restart the connector, I get the "db history topic missing" error.  How can I address this without using a new connector name?
Thank you.

Chris Cranford

Nov 27, 2023, 9:49:40 AM11/27/23
to debe...@googlegroups.com
Hi Hari -

Sorry for the late reply, I was on holiday break last week. 

As for (1), normally missing archive logs occur because the SCN the connector wishes to resume from points to a log your DBA has already purged from the database.  These archive logs can sometimes be several GB per file, so it's not possible to retain them indefinitely, and often not even for more than a few days.  We recently made some improvements to the Oracle connector to address a corner case that could cause a transaction to be viewed by the connector as uncommitted, leaving the buffer open and therefore not advancing the low watermark, which causes this particular error on restart.  We attempted to fix this in 2.4.1 and 2.5.0.Alpha1, but unfortunately the fix was short-sighted in that we overlooked a small corner case.  This was reported in DBZ-7158 [1], and we've since scheduled the fix for Debezium 2.5 Beta 1 and the next Debezium 2.4.2 (likely due out sometime in December).

As for (2), you could have attempted to use a "snapshot.mode" of "schema_only_recovery", but since you were facing an issue with the missing SCN on start-up, you would have simply gotten past the database history missing problem to face exactly the same issue with the SCN in the offset having aged out of the archive logs.  Unfortunately the only recourse when the SCN has aged out is to remove the offset details for the connector, remove the database history, and redeploy so the connector retakes a snapshot.  Redeploying the connector with a brand new name and no database history topic is an alternative that yields the same outcome.
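
For reference, a "schema_only_recovery" attempt would look roughly like this in the connector config (topic name hypothetical); as noted above, it would not have helped here, because it only rebuilds the schema history and the offset SCN itself had aged out of the archive logs:

```json
{
  "snapshot.mode": "schema_only_recovery",
  "schema.history.internal.kafka.topic": "schema-history.oracle"
}
```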

Hope that helps.
Chris

[1]: https://issues.redhat.com/browse/DBZ-7158

HariBabu kuruva

Nov 27, 2023, 10:26:11 AM11/27/23
to debe...@googlegroups.com
Thank you for your reply Chris.

Helmi Aziz Muhammad

Dec 21, 2023, 4:33:56 AM12/21/23
to debezium
Hello, Chris.

We recently got this error too, and we've replaced the connector with version 2.4.2, yet the error still came up four days after we first started Kafka and the connector. Our DBA hasn't deleted or moved the archive logs during those four days either. Do you think this is still caused by the bug in the connector, or is there another potential cause for it?

Thank you,
Helmi Aziz Muhammad.

Chris Cranford

Dec 24, 2023, 10:35:18 PM12/24/23
to debe...@googlegroups.com
Hi Helmi -

If you restart the connector and you're seeing this error, it's most likely that (a) the file doesn't exist, or (b) you have multiple log destinations configured and you haven't set up Debezium to use the right destination.  Please provide the SCN from the error to your DBA and confirm that the file exists and that it has an entry in the V$ARCHIVED_LOG table.

Thanks,
Chris

Helmi Aziz Muhammad

Dec 26, 2023, 4:53:11 AM12/26/23
to debezium
Hello, Chris.

Thanks for the reply. While it's true that the error occurred after we restarted the connector, it has been four days since then and no archive logs were deleted in that time interval. Knowing that log mining went smoothly for four consecutive days, we can safely say that the file does indeed exist, which takes point (a) out of our equation. For point (b), you mentioned setting up the Debezium connector with a proper destination. Do you mean that we haven't done this step properly, or is it another step we might have missed during the preparation? Also, in which column does V$ARCHIVED_LOG store the SCN? Please let us know.

Thank you,
Helmi Aziz Muhammad.

Chris Cranford

Jan 2, 2024, 7:42:08 AM1/2/24
to debe...@googlegroups.com
Hi Helmi -

No, I am referring to this particular section [1] of the documentation on archive log destinations.  If your DBA configures multiple destinations on the system, one may have a much more rigid retention policy than the other, and the connector could be reading from a different destination than you think.  When multiple destinations are configured in Oracle, it's imperative that you explicitly tell the connector which one to use.
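
Assuming the property described in that documentation section, pinning the connector to a specific destination looks roughly like this (the destination name is an example; ask your DBA which LOG_ARCHIVE_DEST_n is appropriate):

```json
{
  "log.mining.archive.destination.name": "LOG_ARCHIVE_DEST_2"
}
```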

In V$ARCHIVED_LOG, there should be a FIRST_CHANGE# and a NEXT_CHANGE# column; these are the SCN values.  FIRST_CHANGE# is inclusive, meaning that SCN will be part of that archive log; NEXT_CHANGE# is exclusive, meaning that SCN values before it are in the archive log, but that specific SCN is not.
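
For example, a query like the following (using the SCN from the error earlier in this thread) would show whether any archived log still covers that position; if it returns no rows, the log has been purged:

```sql
-- Find the archived log(s) covering a given SCN.
-- FIRST_CHANGE# is inclusive, NEXT_CHANGE# exclusive.
SELECT name, thread#, sequence#, first_change#, next_change#
FROM   v$archived_log
WHERE  10972785816716 >= first_change#
AND    10972785816716 <  next_change#;
```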

Hope that helps.
Chris

[1]: https://debezium.io/documentation/reference/stable/connectors/oracle.html#_archive_log_destinations

Helmi Aziz Muhammad

Jan 4, 2024, 9:37:04 AM1/4/24
to debezium
Hello,

Sorry for the late reply. We've discussed this with our DBA, and it seems your guess about the nonexistent archive log was right. Thanks a lot for the help.

Helmi Aziz Muhammad.