Missing SCN and Long Running Transactions - debezium oracle connector

229 views
Skip to first unread message

Eduardo Susini Ribeiro

unread,
Nov 7, 2024, 3:13:00 PM11/7/24
to debezium
Hello, we need your help.

We are encountering a problem using the debezium connector oracle 3.0 in a production environment consisting of Openshift 4.9, confluent for kubernetes 2.9.3, confluent platform 7.7.1 and oracle database 19c - RAC on exadata.

In our last test, we enabled the "rac.nodes" parameter to try to mitigate the error that we were encountering frequently, but even with this new parameter the error occurred again after a week. 

The error we are encountering is the following: after a random amount of time, the connector reports "missing scn" warnings for a specific scn (scn=32152259204298) and after about 5 minutes, it starts reporting "long running transaction" errors referencing a scn number immediately before the one reported in the "missing scn" error (scn=32152259204297). And it continues reporting this error forever. If we try to restart the connector, it crashes because it cannot find the scn (scn=32152259204297). 

When we try to locate the scn (scn=32152259204297) in the archive logs, it does not exist, only the previous and the next ones exist. for example: 32152259204296 and 32152259204298. 
And according to what we were informed by Oracle support, this behavior would be normal, that is, there may be gaps in the SCN numbers. 

Therefore, some questions we have are: 
is this a bug in Debezium? 
How does Debezium detect a "missing SCN"? 
Why does Debezium assume that a "missing SCN" is a problem? 
And why does it consider that a "missing SCN" is a "long running transaction"? 
Or is it possible that the information given to us by Oracle support was inaccurate, and there should not be any SCN gaps?

More information is attached. 
Note: The timezone in log file is UTC, and in png file is UTC-3, this means that at the same time that the "missing scn" was detected, the metric debezium_metrics_numberofactivetransactions changed to 1.

Thank you in advance !
active-transactions.png
connector-configuration.txt
connector-log.txt

Chris Cranford

unread,
Nov 7, 2024, 3:40:52 PM11/7/24
to debe...@googlegroups.com
Hi Eduardo -

Before we dive into the specifics of your questions, can you clarify two things for me.

1. Is the "missing SCN" concern you have related to a log entry that looks like this?

    WARN Missing SCN detected. LogMinerEventRow
{scn=32152259204297,...}

2. The "long running transaction" log entry, does it look like this?

    Offset has not changed in 25 mining session iterations. This indicates long running transaction(s) are active. Commit SCNs ...

If not, could you please provide the exact messages as they're recorded.

Thanks,
-cc
--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/debezium/38dbaadb-728a-41e6-b118-e26bf3c5a06fn%40googlegroups.com.

Eduardo Susini Ribeiro

unread,
Nov 8, 2024, 12:59:45 AM11/8/24
to debezium
Hi, Chris Cranford.

Thank you for your quick response. 

Regarding your questions, the answer is yes to both. The file attached to my first message, "connector_log.txt" is very small and contains the two messages you referenced.

Thanks.

Chris Cranford

unread,
Nov 8, 2024, 10:40:47 AM11/8/24
to debe...@googlegroups.com
Hi Eduardo -

Thanks, my apologies for missing it.

The MISSING_SCN event is not something that Debezium resolves, this is something provided directly by Oracle. Debezium 2.7 or later has a robust solution to prevent gaps in archive and redo logs, so you shouldn't observe any MISSING_SCN due to gaps. What I think might be helpful is if you observe this frequently, can you enable logging as follows, in this order:

    io.debezium.connector.oracle=DEBUG
    io.debezium.connector.oracle.OracleValueConverters=INFO

This way we get all but the value converter output and we can see what the log collector gathers and the mining step data to see if we can identify the problem.

The message about long running transactions, I think this is related to the fact that the SCN that comes before the SCN assigned to the MISSING_SCN events comes just before, so this seems related and perhaps solving the former addresses the latter.

Can you provide those logs?

Thanks,
-cc
Reply all
Reply to author
Forward
0 new messages