High latency with Oracle database


Gali Winer

Sep 22, 2022, 8:13:49 AM
to debezium
Hi,
I am running a script that inserts rows into an Oracle DB and measures the lag between the time a record is inserted into the database and the time it is stored in a Redis database.
I see high latency in the case of the Oracle database.
I changed the properties debezium.source.log.mining.sleep.time.max.ms, debezium.source.log.mining.sleep.time.default.ms, and log.mining.sleep.time.increment.ms to 100 and it improved my results, so I wonder if the high latency can be related to the fact that there is low activity on the database.
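For reference, the three sleep-related settings changed above, as they might appear in a Debezium Server application.properties (a sketch; the debezium.source. prefix on the increment property is assumed to match the other two):

```properties
debezium.source.log.mining.sleep.time.max.ms=100
debezium.source.log.mining.sleep.time.default.ms=100
debezium.source.log.mining.sleep.time.increment.ms=100
```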
Attached is the Debezium log file.
In line number 51173 you can see the time from when the row was inserted into the table TABLE_TEST until Debezium actually reads it from the redo log.
If you need any other input please let me know.
Thanks,
Gali
 
2022-09-20 13:46:43,161 TRACE [io.deb.con.ora.log.pro.AbstractLogMinerEventProcessor] (debezium-oracleconnector-redislabs-change-event-source-coordinator) DML: LogMinerEventRow{scn=18867944, tableId='ORCLPDB1.C##DBZUSER.TABLE_TEST', tableName='TABLE_TEST', tablespaceName='C##DBZUSER', eventType=INSERT, changeTime=2022-09-20T13:43:18Z, transactionId='0a001c005a0c0000', operation='INSERT', userName='C##DBZUSER', rowId='AAASL/AAOAAAFC7AAD', rollbackFlag=false, rsId= 0x000290.00039afc.0010 , ssn=3, redoSql='insert into "C##DBZUSER"."TABLE_TEST"("ID","DATE_FIELD") values ('4',TO_TIMESTAMP('2022-09-20 13:43:18.'));'}


debez.zip

Chris Cranford

Sep 22, 2022, 9:35:52 AM
to debe...@googlegroups.com, Gali Winer
Hi Gali -

So depending on how the script is written, there is one latency that is outside of anyone's control: Oracle's internal log writer (LGWR) buffers.  Say your script creates a single large transaction that takes more than 3 seconds to complete, but the data being inserted is small enough to fit into the LGWR buffers; then none of those changes will be visible to the connector for at least 3 seconds (longer if you're using Oracle RAC).  Oracle follows a number of rules to decide when the LGWR buffer is flushed: one is a 3-second timer, another is the buffer reaching 75% capacity, and another is the transaction issuing a COMMIT or ROLLBACK.  Unfortunately, this latency isn't something that can be tuned beyond making sure you don't have overly large transactions and that you COMMIT or ROLLBACK often, all of which is just generally good SQL behavior.
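The flush rules above can be summarized in a tiny model (a simplification for illustration only, not Oracle's actual internals; the function name and the "first matching rule wins" ordering are assumptions):

```python
def lgwr_flush_reason(seconds_since_last_flush, buffer_used_bytes,
                      buffer_capacity_bytes, transaction_ended):
    """Return which rule would trigger an LGWR buffer flush, or None."""
    if transaction_ended:                               # COMMIT or ROLLBACK
        return "commit/rollback"
    if buffer_used_bytes >= 0.75 * buffer_capacity_bytes:
        return "75% capacity"
    if seconds_since_last_flush >= 3:
        return "3-second timer"
    return None                                         # change not yet visible

# A small, uncommitted transaction stays invisible to the connector
# until the 3-second timer fires:
print(lgwr_flush_reason(1.0, 1024, 1_048_576, False))   # None
print(lgwr_flush_reason(3.5, 1024, 1_048_576, False))   # 3-second timer
print(lgwr_flush_reason(0.1, 1024, 1_048_576, True))    # commit/rollback
```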

The "log.mining.sleep.time.*" settings also play a role in some of the connector's latency, but it's important to understand the implications of changing these values.  The default implementation uses a range of [0, 3] seconds.  When the connector is caught up, close to the current SCN, it sleeps at a value closer to the upper boundary; if the connector falls behind, it adapts and moves closer to the lower boundary.  In other words, when we're behind, we mine faster; when we're near real-time, we mine slower.  Why mine slower?  A mining step involves disk and memory IO on the database server, and issuing mining steps too frequently could impact database performance if the hardware isn't capable of handling it.  I would recommend not setting log.mining.sleep.time.max.ms to anything below 1 second.
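That adaptive behavior can be sketched in a few lines (the constants mirror the documented defaults of log.mining.sleep.time.{min,max,increment}.ms; the function name and the exact update rule are illustrative, the real logic lives in Debezium's LogMiner source):

```python
SLEEP_MIN_MS = 0        # log.mining.sleep.time.min.ms default
SLEEP_MAX_MS = 3000     # log.mining.sleep.time.max.ms default
SLEEP_STEP_MS = 200     # log.mining.sleep.time.increment.ms default

def next_sleep_ms(current_ms, caught_up):
    """Caught up -> drift toward the upper bound (mine slower);
    falling behind -> drift toward the lower bound (mine faster)."""
    if caught_up:
        return min(current_ms + SLEEP_STEP_MS, SLEEP_MAX_MS)
    return max(current_ms - SLEEP_STEP_MS, SLEEP_MIN_MS)

sleep = 1000
sleep = next_sleep_ms(sleep, caught_up=True)   # 1200 -- mine slower
sleep = next_sleep_ms(sleep, caught_up=False)  # 1000 -- mine faster
```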

So yes, reducing those sleep times will improve latency, but on average a 3 second latency is expected when the connector is working in ideal conditions.

But in your example, you're showing approximately a 3 minute latency, and I suspect this may have more to do with the mining strategy and your log group configuration.  Can you share your full connector configuration, Debezium version, and can you also tell me what is the value of the BYTES column using the query "SELECT BYTES FROM V$LOG"?

Thanks,
Chris


Oren Elias

Sep 22, 2022, 10:13:04 AM
to debe...@googlegroups.com, Gali Winer
Chris,
Thanks for the explanation about the log.mining.sleep.time. This was indeed our understanding. 
A few questions:
Is the back-off exponential? I saw there is a setting of 200ms but was not sure if this means it increases linearly by 200ms or exponentially.
Also, when it reaches the max (e.g. 3 seconds) and senses that transactions are arriving, does it drop down to the minimum immediately, or is it also reduced over time? Linearly?
This will help us understand the mechanism and set expectations.

Finally, the log shows a lot of COMMITs that have 0 transactions. This is in a lab environment where no other client is connected to the DB except the script. What is the source of so many COMMITs without any transactions in them?

Thanks,
-Oren


Chris Cranford

Sep 22, 2022, 10:25:21 AM
to debe...@googlegroups.com, Oren Elias, Gali Winer
Hi Oren -

The back-off is linear and is re-calculated on each mining iteration.  When it reaches the max, I believe it will fluctuate between max and max-increment in a ping-pong fashion while the calculation determines the connector is caught up.  It will only continue to move back toward the lower boundary if the calculation determines that the connector is falling behind.  There are also the log.mining.batch.size.* settings, which work in tandem with the sleep algorithm.  Both are adaptive in nature: if we're caught up, we use a smaller batch size, and when we fall behind we use a larger batch size, up to the max.  Both play an important role in LogMinerStreamingChangeEventSource#calculateEndScn, which is where we look at the current batch and sleep values and adjust them adaptively, in a linear way, depending on whether we are near real-time or behind.
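The batch-size half of that tandem moves in the opposite direction from the sleep time: larger batches while behind, smaller ones when caught up. A minimal sketch (min/max mirror the documented log.mining.batch.size.{min,max} defaults; the per-iteration step and function name are assumptions for illustration):

```python
BATCH_MIN = 1_000       # log.mining.batch.size.min default
BATCH_MAX = 100_000     # log.mining.batch.size.max default
BATCH_STEP = 1_000      # hypothetical per-iteration adjustment

def next_batch_size(current, caught_up):
    """Caught up -> shrink the batch; falling behind -> grow it, up to max."""
    if caught_up:
        return max(current - BATCH_STEP, BATCH_MIN)
    return min(current + BATCH_STEP, BATCH_MAX)

batch = 20_000                                  # log.mining.batch.size.default
batch = next_batch_size(batch, caught_up=False)  # 21000 -- mine more per step
batch = next_batch_size(batch, caught_up=True)   # 20000 -- back off again
```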

Also, the 0-transaction COMMITs you see are likely from internal Oracle transactions.  Unfortunately, we have to read all START, COMMIT, and ROLLBACK transactions produced in the system because we have no prior knowledge of what operations will be included in those transactions, and we need to know when transactions start/end in order to manage the in-memory transaction buffer as well as emit transaction metadata events if those are enabled.  If those are creating any sort of problem, let us know.  One thing that does come to mind: when transaction metadata is enabled, we could be emitting a larger number of begin/end events than necessary, and we could look at whether we could avoid this when the transaction contains 0 events.

Hope that answers your questions,
Chris

Chris Cranford

Sep 22, 2022, 10:27:15 AM
to debe...@googlegroups.com, Oren Elias, Gali Winer
To clarify one point Oren, I should have said "we have to read all START, COMMIT, and ROLLBACK events", not transactions.  Sorry if that caused any confusion.

Chris