Offset SCN stuck and not advancing during high transaction volume


Vinoth Kumar

Nov 3, 2025, 12:45:38 PM
to debezium
Hello Team,

I'm getting the below error with an Oracle 19c database. I've attached the connector config details and shared the analysis so far.

Can you please assist us in moving forward?

Error
"Offset SCN 9420122456881 has not changed in 25 mining session iterations. This indicates long running transaction(s) are active.  Commit SCNs {1=9420159644839, 2=9420159644743}."

What we have tried
Verified from the database end that LogMiner is functioning correctly and successfully capturing all archive logs.
Monitored via SUMO Logic, where we noticed that the offset SCN remains stuck even though LogMiner has completed processing those archives.
Restarted the connector multiple times (both manual and automatic restarts); however, it continues to pick up from the same stuck SCN, referring to archives that have already been mined.
The SCN has not moved during such high loads, indicating that the connector offset isn't updating as expected.

Findings
After the data load, streaming gets stuck at the offset SCN.
This behavior is observed only in PROD, particularly during periods of high transaction volume (30-60 GB per hour).

Thanks,
Vinoth
Connector-Debezium.txt

Chris Cranford

Nov 3, 2025, 12:53:33 PM
to debe...@googlegroups.com
Hi Vinoth -

First and foremost, this message isn't an error but rather just a warning, indicating that you could have a long-running transaction present. For example, if a large batch operation modifies millions of rows in a single transaction, this warning can appear until the transaction is completely consumed. But there are also some other causes that are related to bugs.

What Debezium version are you using, so we can identify whether your issue may be related to a bug?

Thanks,
-cc

Vinoth Kumar

Nov 3, 2025, 1:38:32 PM
to debe...@googlegroups.com
Hi Chris,

First of all thank you for the prompt response!

Yes, it's a warning, but the SCN did not move at all for a prolonged window and the connector either kept restarting or failed with a timeout. In addition, there were no long-running sessions at the DB end.

Version: 3.1.3

We have GoldenGate running on the same DB server, where we don't see any latency issues. We also stopped GoldenGate for a while; however, Debezium log mining re-read from the SCN where it was stuck, even though those archives had already been mined according to the DB alert log.

Regards,
Vinoth Kumar
8056-274-822

Chris Cranford

Nov 3, 2025, 1:43:17 PM
to debe...@googlegroups.com
Hi Vinoth -

Please update to 3.3.1.Final. There was a bug (DBZ-8747), present in versions prior to 3.2.4.Final, that could mistakenly leave transactions in-progress. Could you update and see whether you continue to experience any issues with stuck OffsetScn values?

Thanks,
-cc

Vinoth Kumar

Nov 3, 2025, 1:45:25 PM
to debe...@googlegroups.com
Hi Chris,

Thanks much !

We will update and keep you informed on this.

Regards,
Vinoth Kumar
8056-274-822

Vinoth Kumar

Nov 4, 2025, 12:07:18 PM
to debe...@googlegroups.com
Hi Chris,

We've updated Debezium to the latest version, 3.3.1, and did a clean initial load.
The large archive/transaction batch starts at 5 AM IST; we will monitor and keep you updated 🤞
--
Regards,
Vinoth Kumar
8056-274-822

Vinoth Kumar

Nov 5, 2025, 9:21:28 AM
to debe...@googlegroups.com
Hi Chris,
Following the Debezium update, please see the updates below.

Before Debezium Update to 3.3.1

Monitored via SUMO Logic, where we noticed that the offset SCN remains stuck even though LogMiner has completed processing those archives.
Restarted the connector multiple times (both manual and automatic restarts); however, it continues to pick up from the same stuck SCN, referring to archives that have already been mined.

After Debezium Update to 3.3.1

Stuck SCN: "9420755786214" - Archives "Thread1=767758 & Thread2=526103" --> 7:12 AM IST
Detected possible SCN gap (but the SCN is moving ahead): using upperBounds SCN, startSCN 9420786218992, prevEndSCN 9420786218997 - Archives "Thread1=767958 & Thread2=526165" --> 9:04 AM IST
During shutdown - offsets at shutdown: OracleOffsetContext [scn=9420796942649, txId=46000300bc3d0a00, txSeq=1, commit_scn=["9420797074942:1:38000900f8231000","9420797028473:2:18000100e3eb1800"], lcr_position=null]
Here "9420796942649" - Archives "Thread1=768027 & Thread2=526188" --> 11:07 AM IST
Post restart: got the below messages
"No latest table SCN could be resolved, defaulting to current SCN"
Event with `MISSING_SCN` operation found with SCN 9420801068798 (Archives Thread1=768167 & Thread2=526233)

Here the SCN moved far ahead instead of picking up from where it left off. Is this expected? Please let me know your thoughts.

Note: In the current connector I kept only a small set of tables that won't have many transactions; the bigger tables, which generate more transactions, are yet to be onboarded.

Thanks,
Vinoth

Vinoth Kumar

Nov 5, 2025, 7:02:25 PM
to debe...@googlegroups.com
Hi Chris/Community Team,

Can you please help me with the request below?

Regards,
Vinoth Kumar
8056-274-822

Chris Cranford

Nov 6, 2025, 5:56:52 AM
to debe...@googlegroups.com
Hi Vinoth -

The "No latest table SCN could be resolved" happens only at one point in time, which is during the snapshot, which itself explains why the connector did not resume from where it left off at shutdown. As to why that happened is what we need to understand.  Did you delete the offsets and history topic, or is your snapshot.mode set to `when_needed`?  What is your connector configuration? 

Thanks,
-cc

Vinoth Kumar

Nov 6, 2025, 7:26:39 AM
to debe...@googlegroups.com
Hi Chris,

We haven't deleted anything.
Initially we had a small set of tables with "snapshot.mode: initial"; once the snapshot was completed, we stopped the connector after some time and added the snapshot mode below. This is to add new tables when we perform an incremental load via the signal table. We've followed a similar practice on other DBs, where we don't see such messages.
 
    snapshot.mode: configuration_based
    snapshot.mode.configuration.based.snapshot.schema: true
    snapshot.mode.configuration.based.snapshot.data

Note: The small set of tables won't have many transactions; however, the newly added tables are bigger, generate more transactions, and are yet to be onboarded.


Vinoth Kumar

Nov 6, 2025, 7:35:29 AM
to debe...@googlegroups.com
Hi Chris,

I mean, I did this after the completion of the initial snapshot.
#commented
"snapshot.mode: initial"
Added

    snapshot.mode: configuration_based
    snapshot.mode.configuration.based.snapshot.schema: true
    snapshot.mode.configuration.based.snapshot.data

Chris Cranford

Nov 6, 2025, 9:02:15 AM
to debe...@googlegroups.com
Hi -

If you have set the following

    "snapshot.mode.configuration.based.snapshot.schema": "true",
    "snapshot.mode.configuration.based.snapshot.data": "true"

Then you have effectively told Debezium to use
    
    "snapshot.mode": "always"

This explains your circumstance.
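
If the intent is to keep streaming for the already-captured tables and bring the new tables in later via an incremental snapshot signal, a configuration_based sketch along these lines (illustrative values; schema snapshot kept, data snapshot disabled) avoids the full re-snapshot on restart:

    snapshot.mode: configuration_based
    snapshot.mode.configuration.based.snapshot.schema: true
    snapshot.mode.configuration.based.snapshot.data: false
    snapshot.mode.configuration.based.start.stream: true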

Thanks,
-cc

Vinoth Kumar

Nov 6, 2025, 10:57:42 AM
to debe...@googlegroups.com
Hi Chris,

Apologies, my bad for the typo.
Actually, I enabled CDC and disabled the initial load so the new tables can be loaded incrementally via the signal table.
I'm worried about whether this means data loss or is expected, because transactions on those big tables were not in the replication list.

    snapshot.mode: configuration_based
    snapshot.mode.configuration.based.snapshot.schema: true
    snapshot.mode.configuration.based.snapshot.data: false
    snapshot.mode.configuration.based.start.stream: true

Thanks,
Vinoth

Chris Cranford

Nov 11, 2025, 10:14:58 PM
to debe...@googlegroups.com
Hi Vinoth -

If a user performs a large transaction or if there is a period of many transactions, it's very plausible that you could see the OffsetScn plateau and not change for a period of time. This can be because the connector is waiting for data to be returned by Oracle, because it is actively processing a large batch of data, or because there is an in-flight transaction that has not yet committed or rolled back.

I always suggest that folks look at the broader picture of the connector, checking other JMX metrics to see whether or not they're updating periodically. Do you see the FetchQueryCount incrementing, or the TotalCapturedDmlCount or TotalProcessedRows increasing? These are signs that the connector is operating and consuming data.

In the JMX metrics, what is the value of OffsetScn, OldestScn, CommittedScn, and CurrentScn?
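
If it helps, here is a minimal JMX client sketch for pulling those attributes. The host/port and the server=BCPROD value (your topic.prefix) are assumptions for illustration, and the exact MBean name, typically debezium.oracle:type=connector-metrics,context=streaming,server=<topic.prefix>, is best confirmed in jconsole against your Connect worker:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    // Sketch: read the Debezium Oracle streaming metrics over remote JMX.
    public class ReadDebeziumOracleMetrics {
        public static void main(String[] args) throws Exception {
            // Remote JMX endpoint of the Kafka Connect worker (placeholder host/port).
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://connect-worker-host:9999/jmxrmi");
            JMXConnector jmx = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = jmx.getMBeanServerConnection();
                ObjectName streaming = new ObjectName(
                        "debezium.oracle:type=connector-metrics,context=streaming,server=BCPROD");
                String[] attrs = {
                        "OffsetScn", "OldestScn", "CommittedScn", "CurrentScn",
                        "FetchQueryCount", "TotalCapturedDmlCount", "TotalProcessedRows"};
                for (String attr : attrs) {
                    // Print each metric so values can be compared between mining iterations.
                    System.out.println(attr + " = " + mbs.getAttribute(streaming, attr));
                }
            } finally {
                jmx.close();
            }
        }
    }

Running it a few minutes apart and comparing the values is enough to tell whether the connector is still consuming data even while OffsetScn plateaus.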

-cc

Vinoth Kumar

Nov 12, 2025, 5:24:11 AM
to debe...@googlegroups.com
Hi Chris,

We've captured these logs from Sumo Logic, and we do see the SCN values keep changing across the "offset SCN" and "SCN gap" messages, which means streaming continues as usual without hiccups. Correct me if I'm wrong?
I've also asked the Infra team to share the JMX metric info for reference.

@timestamp:"2025-11-12T07:44:03.842Z" "Offset SCN 9421572168715 has not changed in 25 mining session iterations. This indicates long running transaction(s) are active. Commit SCNs {1=9421573925565, 2=9421573925272}."

@timestamp:"2025-11-12T03:34:10.947Z" "Detected possible SCN gap, using upperBounds SCN, startSCN 9421562666535, prevEndSCN 9421562667455, timestamp 2025-11-11T19:34:03Z, upperBounds SCN 9421565658898 timestamp 2025-11-11T19:34:09Z."  

@timestamp:"2025-11-12T02:02:05.173Z" "Offset SCN 9421534993086 has not changed in 25 mining session iterations. This indicates long running transaction(s) are active. Commit SCNs {1=9421535001588, 2=9421535001572}."

Chris Cranford

Nov 12, 2025, 10:02:17 AM
to debe...@googlegroups.com
That's correct, so I would say this is the classic pattern where a batch job generates a substantial amount of redo in a small window, which triggers the connector to warn that there could be an issue. The defaults for this check are quite conservative, tuned to work best for lower-activity environments. I would recommend that you customize the defaults to see if it helps minimize the frequency of this warning.
  1. Change log.mining.scn.gap.detection.gap.size.min from 1 million to perhaps 2 or 5 million.
  2. Change log.mining.scn.gap.detection.time.interval.max.ms from 20000 (20 seconds) to 10000 (10 seconds)

So right now the connector checks whether the SCN grows by 1 million within a 20-second window. By reducing the time window and increasing the gap size, you reduce the frequency.
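
Applied to the connector config, that would look something like this (values are illustrative, per the suggestion above):

    log.mining.scn.gap.detection.gap.size.min: 5000000          # 5 million SCNs
    log.mining.scn.gap.detection.time.interval.max.ms: 10000    # 10 seconds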

As I think this often raises more questions and concerns than it helps the community, I've logged DBZ-9672 and will likely remove the behavior entirely in Debezium 3.4.

Thanks,
-cc

Vinoth Kumar

Nov 12, 2025, 10:04:58 AM
to debe...@googlegroups.com
Hi Chris,

Great, thanks much for the excellent support !

Sure, we will plan to modify the default settings to suppress these messages in the logs.

Regards,
Vinoth Kumar
8056-274-822

Vinoth Kumar

Dec 2, 2025, 2:12:17 AM
to debezium
Hi Chris/Team,

Hope it is okay to follow up on the existing thread, as we see a similar pattern with a slight difference.

We do see that the LogMiner StartScn remains the same for a long time while the EndScn keeps changing; however, it progresses very slowly (after almost 8 hours the StartScn still hasn't changed). We have ample free memory available in the Streams Pool. In addition, we have GoldenGate configured, where we don't see any latency.

Could you please help us boost the performance of these connectors?

Note: We onboarded 500+ tables. 1st connector: 480 tables, all small in size. 2nd connector: 15 tables with huge LOBs. The remaining 3 connectors each have no more than 3-5 tables, which are either big in size or lack a PK/unique key.

Debezium version: 3.3.1 
Connector Info:
    #Debezium Engine/Queueing Settings
    max.batch.size: 8192
    max.queue.size: 65536
    max.queue.size.in.bytes: 536870912  #512MB
    log.mining.query.filter.mode: in

    topic.prefix: BCPROD
    topic.creation.enable: true
    topic.creation.default.replication.factor: -1
    topic.creation.default.partitions: -1
    topic.creation.default.retention.ms: 1209600000
    schema.include.list: APPNAME,DBZUSER
    table.include.list: APPNAME.TABLE1,APPNAME.TABLE2,DBZUSER.HEARTBEAT,DBZUSER.SIGNAL

    log.mining.batch.size.default: 60000
    log.mining.batch.size.max: 1000000
    log.mining.batch.size.min: 10000
    log.mining.scn.gap.detection.gap.size.min: 1000000      # 1 million SCNs
    log.mining.scn.gap.detection.time.interval.max.ms: 10000 # 10 seconds

    #Special character
    field.name.adjustment.mode: avro
    #BLOB, RAW  
    binary.handling.mode: base64

Sumo Logic:
Offset SCN 9423036857229 has not changed in 25 mining session iterations. This indicates long running transaction(s) are active. Commit SCNs {1=9423062374267, 2=9423062373495}."

DB ALERT LOG
LOGMINER: Summary for session 2164177921, OS id 374401
LOGMINER: StartScn: 9423036857229 (0x00000891f8c3bb8d)
LOGMINER: EndScn: 9423069174443 (0x00000891fab0daab)
LOGMINER: HighConsumedScn: 0
LOGMINER: PSR flags: 0x0
LOGMINER: Session Flags: 0x4000441
LOGMINER: Session Flags2: 0x400
LOGMINER: Read buffers: 4
LOGMINER: Region Queue size: 256
LOGMINER: Redo Queue size: 4096
LOGMINER: Memory Size = 15M, HWM 14M, LWM 13M, 90%
LOGMINER: Memory Release Limit: 0M
LOGMINER: Max Decomp Region Memory: 1M
LOGMINER: Transaction Queue Size: 1024
2025-12-01T22:48:04.942578-08:00

Vinoth Kumar

Dec 2, 2025, 5:25:13 AM
to debe...@googlegroups.com
Adding to the below: with this, it has progressed almost 63M SCNs and the StartScn still has not changed. Even though these are valid long-running transactions from the application side, can we overcome such cases in Debezium?

LOGMINER: Summary for session 2209447938, OS id 249842
LOGMINER: StartScn: 9423036857229 (0x00000891f8c3bb8d)
LOGMINER: EndScn: 9423100213632 (0x00000891fc8a7980)

Chris Cranford

Dec 2, 2025, 10:53:45 AM
to debe...@googlegroups.com
Hi Vinoth, can you please do me a favor,

Can you check out the code from https://github.com/Naros/debezium-oracle-query.tool and follow the README on how to compile/build it? Once you have the code built, could you please run the following:

    java -jar target/quarkus-app/quarkus-run.jar \
        list-changes \
        --hostname <your database.host> \
        --username <your database.user> \
        --password <your database.password> \
        --port <your database.port> \
        --service <your database.dbname> \
        --start-scn 9423036857200 \
        --end-scn 9423069174443 \
        --output extract.csv 

That will generate a CSV file of all the data changes between the two SCNs. Can you please ZIP that up and share it with me privately by email so I can review the data? The log entry indicates that a transaction was started and is still in progress, which would explain why the lower SCN does not change. The extract will help identify whether that's the case or whether there is a potential bug in the connector.

Thanks,
-cc    

Chris Cranford

Dec 2, 2025, 10:56:34 AM
to debe...@googlegroups.com
Hi Vinoth, if you have such long-running transactions and those are expected, there is little we can do. Transactions must be mined in a very specific way, and we cannot advance the low watermark until the transaction commits or rolls back, in case the connector crashes or is restarted due to a rebalance. Using the tool I shared in the prior email, it would be helpful to look at such transactions and see whether those 63M SCNs span a period of a few minutes or a period of hours. If it's the former, we need to look closer at potential optimizations. But if it's the latter, that's a business issue that unfortunately we cannot solve.
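
As an aside, if you want to confirm on the database side that an application transaction really is still open at or below the stuck SCN, a rough JDBC sketch against GV$TRANSACTION could look like the following. The connection details are placeholders, the account needs SELECT on the GV$ views, and a plain SQL query over the same views works just as well:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Sketch: list open transactions whose start SCN is at or below the stuck offset SCN.
    public class ListOpenTransactions {
        public static void main(String[] args) throws Exception {
            // Placeholder JDBC URL and credentials; needs the Oracle JDBC driver on the classpath.
            String url = "jdbc:oracle:thin:@//database-host:1521/DBNAME";
            try (Connection conn = DriverManager.getConnection(url, "DBZUSER", "password");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT s.inst_id, s.sid, s.serial#, s.username, s.program, "
                       + "       t.start_scn, t.start_time, t.used_urec "
                       + "FROM   gv$transaction t "
                       + "JOIN   gv$session s ON s.saddr = t.ses_addr AND s.inst_id = t.inst_id "
                       + "WHERE  t.start_scn <= ? "
                       + "ORDER  BY t.start_scn")) {
                ps.setLong(1, 9423036857229L); // the stuck offset SCN from the warning
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.printf(
                                "inst=%d sid=%d serial#=%d user=%s program=%s start_scn=%d start_time=%s used_urec=%d%n",
                                rs.getLong(1), rs.getLong(2), rs.getLong(3), rs.getString(4),
                                rs.getString(5), rs.getLong(6), rs.getString(7), rs.getLong(8));
                    }
                }
            }
        }
    }

Any row returned is a transaction the connector must keep buffering, and its start_time tells you how long the low watermark has been pinned.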

-cc

Vinoth Kumar

Dec 2, 2025, 12:02:09 PM
to debe...@googlegroups.com
Hi Chris,

We will check the suggested options and share the required details. Also, in the DB alert log we noticed that it takes some time to progress the SCN in batches; is this because of the long-running transactions we see, or because multiple connectors are mining the same logs?

Regards,
Vinoth Kumar
8056-274-822

Chris Cranford

Dec 2, 2025, 12:23:23 PM
to debe...@googlegroups.com

Vinoth Kumar

Dec 2, 2025, 12:29:43 PM
to debe...@googlegroups.com
Do you mean long-running transactions or multiple connectors mining the same logs? Please confirm.

Regards,
Vinoth Kumar
8056-274-822

Chris Cranford

Dec 2, 2025, 12:59:40 PM
to debe...@googlegroups.com
Long-running transactions will influence how the low watermark advances.
Having multiple connectors running can create an IO bottleneck if the redo subsystem is already saturated by normal user load.

Vinoth Kumar

6:21 AM
to debezium

Hi Chris,

We exported the CSV using the given tool; it comes to around 3.1 GB in size. Later, for better analysis, we imported it into a test MySQL instance, where we noticed that the stuck SCN 9423036857229 does not exist. Details are attached for reference.

Is it stuck because of this, never moving ahead and waiting for the transaction to complete? If that's the case, how can we overcome such a scenario? Please suggest.

Also, please let us know if you need any additional details for the analysis.

"Offset SCN 9423036857229 has not changed in 25 mining session iterations. This indicates long running transaction(s) are active. Commit SCNs {1=9423127090737, 2=9423127088242}."

SCN.sql

Chris Cranford

9:21 AM
to debe...@googlegroups.com
Hi Vinoth, do you see 9423036857228 (-1) or 9423036857230 (+1) in the output?