Hello everyone, I am opening this thread to report an issue we are having with our Kafka Connect installation running the Debezium Oracle connector (Debezium v2.4 and Kafka v3.6.0).
Basically, we noticed that the query the connector runs against the Oracle REDOLOG table retrieves far more data than actually concerns the table we are listening on.
We extracted the query from the database logs:
We noticed that no filter on the table name is applied. After some research in your documentation, we found this connector property: log.mining.query.filter.mode
Its default value is "none", which results in no filtering, so we set it to "in" instead.
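For readers who want to reproduce the setting, here is a minimal sketch of what such a connector registration payload might look like. It is not our actual configuration: the connector name "oracle-vehicle-orders" and the helper build_connector_config are hypothetical, and the database connection, topic, and schema history settings are deliberately left out.

```python
import json

# Values documented for log.mining.query.filter.mode
VALID_FILTER_MODES = {"none", "in", "regex"}


def build_connector_config(table: str, filter_mode: str = "in") -> dict:
    """Build a (partial) Kafka Connect payload for the Debezium Oracle connector.

    Hypothetical helper for illustration only; connection settings are omitted.
    """
    if filter_mode not in VALID_FILTER_MODES:
        raise ValueError(f"unsupported filter mode: {filter_mode!r}")
    return {
        "name": "oracle-vehicle-orders",  # hypothetical connector name
        "config": {
            "connector.class": "io.debezium.connector.oracle.OracleConnector",
            # Table(s) the connector should capture changes for
            "table.include.list": table,
            # Ask the connector to filter by table in the mining query itself
            "log.mining.query.filter.mode": filter_mode,
        },
    }


payload = build_connector_config("DBAOBT.OBT_VEHICLE_ORDERS")
print(json.dumps(payload, indent=2))
```

A payload like this would typically be sent to the Kafka Connect REST API once the omitted connection properties are filled in.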
After doing so, the query changed to this:
Where "DBAOBT.OBT_VEHICLE_ORDERS" is the name of the table we are monitoring.
Unfortunately, this query has two issues that make it useless:
Because of this, even with log.mining.query.filter.mode set to "in", the query still retrieves a lot of useless data from the REDOLOGS.
We ran some tests to put an actual number on it: on our database the query retrieves around 3 billion records, while the records related to our monitored table number only in the hundreds.
This caused a heavy load on our infrastructure in terms of network usage and DB CPU usage.
We even went as far as contacting Confluent support about this, and they indicated that this looks like a bug on the Debezium side.
Could you please give us your feedback on this? In its current state, the Debezium connector is unusable for our use case.
Thank you!