Oracle logminer - commits/rollbacks


Donald Lyon

May 21, 2021, 12:50:18 PM
to debezium
I have a question about the logminer contents query.  It looks like I'm getting thousands of commit/rollback messages for schemas I'm not interested in.  Do those need to be in there, or is that something that could be filtered?

Thanks,
Don

Chris Cranford

May 21, 2021, 3:07:44 PM
to debe...@googlegroups.com, Donald Lyon
Hi,

Both COMMIT and ROLLBACK operations lack a tablespace / owner / schema value in the row metadata, so there isn't a safe way to exclude them while still capturing the COMMIT and ROLLBACK operations that are pertinent to the transactions of interest.  I suspect many of those you are seeing are generated by the SYS and SYSTEM users.  Early iterations of the connector filtered those out, but we later found this presented a problem for SQL clients with auto-commit enabled: the associated COMMIT record was performed by the SYSTEM user, so we never concluded the transaction and the events were never emitted.  This is why all COMMIT and ROLLBACK operations are included.
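To illustrate the point, here is a simplified sketch of the kind of query issued against V$LOGMNR_CONTENTS (this is not the connector's actual query, just an illustration). COMMIT rows (OPERATION_CODE 7) and ROLLBACK rows (OPERATION_CODE 36) carry NULL in SEG_OWNER and TABLE_NAME, so any schema predicate would have to explicitly let them through:

```sql
-- Hypothetical simplified mining query; 'MYSCHEMA' is a placeholder.
-- OPERATION_CODE: 1 = INSERT, 2 = DELETE, 3 = UPDATE, 7 = COMMIT, 36 = ROLLBACK.
SELECT scn, operation_code, seg_owner, table_name, xid, sql_redo
  FROM v$logmnr_contents
 WHERE (seg_owner = 'MYSCHEMA' AND operation_code IN (1, 2, 3))
    -- COMMIT/ROLLBACK rows have NULL seg_owner, so they must be
    -- matched by operation code alone and cannot be schema-filtered:
    OR operation_code IN (7, 36);
```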

That said, the connector will weed out these empty transactions as a no-op.  Is this creating any issue for you in some way?

Chris


Donald Lyon

May 21, 2021, 5:14:25 PM
to debezium
Chris,

I'm not positive this is creating the issue, but the engine doesn't appear to be keeping up with the changes in my test environment.  I'm not running much load, but it does appear to get about 30,000 events in every batch.  Many of those commits are from the c##dbzuser, and a bunch are from a tungsten-replicator CDC process that I'm trying to replace.  The events do have a username associated with them, which is why I thought filtering them might be viable.

Are there some other toggles I should be looking at for performance concerns?

Don 

Chris Cranford

May 24, 2021, 1:57:15 PM
to debe...@googlegroups.com, Donald Lyon
Hi Donald -

If I may, what is your redo log configuration in your test environment, i.e. log size and number of log groups?  In addition, are you using the default mining strategy?
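For anyone following along, the redo log configuration can be checked with a couple of standard dictionary queries (these are just convenience queries, not something Debezium requires you to run):

```sql
-- Log group sizes and states
SELECT group#, bytes / 1024 / 1024 AS size_mb, status
  FROM v$log
 ORDER BY group#;

-- Members (files) per group
SELECT group#, member
  FROM v$logfile
 ORDER BY group#;
```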

Donald Lyon

May 24, 2021, 4:53:07 PM
to debezium
Hi Chris,

I'm running 4 groups at 512M log size.  I'm also using the default strategy.

Don

Donald Lyon

May 25, 2021, 3:17:44 PM
to debezium
Chris,

I'm testing out the online_catalog mining strategy now and that appears to behave much better.  It's processing smaller chunks of < 10 rows at a time, as opposed to 30k rows from the redo_log_catalog strategy.  I don't need to track DDL changes for my use case, so that solution works for me.  Thanks for pointing me in that direction.
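For reference, the switch discussed here is a single connector property (shown as a properties fragment; the property name matches the Debezium Oracle connector configuration, and the trade-off is that online_catalog cannot resolve table schema changes made while mining):

```properties
# Default is redo_log_catalog, which writes the data dictionary to the
# redo logs and supports tracking DDL changes, at a performance cost.
# online_catalog uses the current online dictionary instead.
log.mining.strategy=online_catalog
```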

I haven't tested the XStream method yet.  Is that considered more reliable than LogMiner in your experience?  I'm not excited about the GoldenGate licensing requirement, but I haven't seen anybody comment on the tradeoffs.

Don

Chris Cranford

May 26, 2021, 6:15:46 AM
to debe...@googlegroups.com, Donald Lyon
Hi Don,

I'm glad the online_catalog strategy is working.  The smaller chunks you're seeing are likely more a product of timing than anything else.  With the default strategy, you'll often see a longer start-up time while the data dictionary is written to the redo logs and analyzed, so you'll initially see a larger number of rows, but that should eventually settle to the same levels as the online_catalog strategy as long as you don't have any log swaps.  Either way, if you don't need to track DDL, the online_catalog strategy is definitely better performance-wise on Oracle.

Both methods are equally reliable and the Xstream method is based on LogMiner at its foundation. 

The biggest difference is that XStream does quite a bit out of the box, such as combining events into transactional batches and merging multi-part operations into single logical events (i.e. BLOBs/CLOBs), so all the connector code does is respond to the receipt of an LCR change event handed to the callback handler.  With the LogMiner implementation, ownership of the entire process rests with Debezium.  This means that when we mine changes, we have to build the transactional batches, merge multi-part operations, and manage the logs to be mined.  It seems like a lot, but it has the advantage that we're in full control of the pipeline, allowing us to adjust it as needed.

HTH,
Chris