Debezium connector for Oracle ask about SCN

966 views
Skip to first unread message

Suphakit Wongsarawit

unread,
Feb 23, 2022, 2:45:30 AM2/23/22
to debezium
I have question about SCN on how it work in oracle database. If the connector finish running the snapshot and I wait for connector to run for about 5-10 minutes. And the data got inserted in another table that is not included in the config of the connector in this 5-10 minutes. After 5-10 minutes I change the config of this connector to include this table too. Those data that inserted in 5-10 minutes will be detected by connector or not. And how does it work
Thank you very much. 

Chris Cranford

unread,
Feb 23, 2022, 10:06:55 AM2/23/22
to debe...@googlegroups.com, Suphakit Wongsarawit
Hi Suphakit -

So the short answer is it may or it may not.  How this works gets a bit technical, so if you have any questions feel free to ask.

If you have not configured "lob.enabled=true" then the connector is operating in what I call "sliding mining window" mode.  In this mode, we mine from A to B, B to C, C to D, etc.  Whether your change is picked up will depend entirely on whether or not the connector is processing near real-time events or if its processing older events from the past.  We expose a JMX_METRIC that tells you how far behind the connector is lagging.  You can also calculate this on the sink/consumer side by comparing the delta between "ts_ms" (the time we processed the event) and "source.ts_ms" (the time the event was written by Oracle).  If this delta is very small (i.e. within a few seconds or a minute) then most likely the outcome will be that we've mined well past the system change number associated with your non-included table and so after you reconfigure & restart the connector, only changes you make thereafter for that table will be included.  If the connector is lagging behind to a point before your change to this non-included table and you reconfigure & restart, then yes when the connector eventually catches up to that time in the logs, your change would be emitted.

If you have configured "lob.enabled=true", then you're operating in more of an "adaptive mining window" mode where we mine from A to B, A to C, C to D, etc where the starting point of each window is based on the oldest in-progress transaction's system change number.  So while the connector may be processing near real-time events, in this mode it depends entirely on if the starting point of the window is before or after the system change number associated to your non-included table.  If its before, then yes your change would be emitted after the reconfigure & restart, but if its after then it would not.

HTH,
CC

On 2/23/22 02:45, Suphakit Wongsarawit wrote:
I have question about SCN on how it work in oracle database. If the connector finish running the snapshot and I wait for connector to run for about 5-10 minutes. And the data got inserted in another table that is not included in the config of the connector in this 5-10 minutes. After 5-10 minutes I change the config of this connector to include this table too. Those data that inserted in 5-10 minutes will be detected by connector or not. And how does it work
Thank you very much. 
--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/debezium/9e18079d-3691-4a89-ab99-e51a83d6ef29n%40googlegroups.com.

Suphakit Wongsarawit

unread,
Feb 24, 2022, 1:27:39 AM2/24/22
to debezium
Hi Chris

I have some question regard the adaptive mining window mode. What's the meaning of the oldest in-progess transcation's system change number. If the connector got start will it start reading from the system change number that got transacted by the connector? For example when connector detect change from the scn number 111110 and then send the event to Kafka. Then we wait for about 10 minutes with no change coming to the table. And then change the config for the connector to include more table. Connector will then start reading from scn number  111110 for the another table. Do I understand it correctly?

Thank you so much for your answer.

Chris Cranford

unread,
Feb 24, 2022, 8:44:32 AM2/24/22
to debe...@googlegroups.com, Suphakit Wongsarawit
Hi Suphakit -

So I want to clarify a few things that are important.  When we read redo log entries, they're not immediately sent to Kafka.  Oracle's redo logs contain entries for committed, rolled back, internal, and in-progress operations.  This means that all events we read must be buffered by the connector and only once we detect a commit are those events released to Kafka. 

So lets say you have the following events in your redo logs

    TRX1 START SCN 12345
    TRX2 START SCN 12346
    TRX2 INSERT SCN 12347
    TRX2 COMMIT SCN 12348
    ...
    TRX1 COMMIT SCN 23456

If at any point between SCN 12345 and SCN 23456 the connector is reconfigured which triggers a connector restart, the connector will restart mining at SCN 12345.  This is because we have not yet read the entire transaction into the buffer and so we do not yet know whether the transaction is committed or rolled back.  Once we reach SCN 23456, we can then advance the SCN low watermark forward to the new oldest in-progress transaction starting point. 

Now it is also worth noting that we also maintain an high watermark SCN as well.  The hi watermark is used to prevent the dispatch of transactions we've already read that in that range that we know have been committed or rolled back.  So for example, after the restart in our scenario, had the connector read at least through SCN 12348, then the offsets would have a low watermark SCN of 12345 and a hi watermark of 12348.  So on the restart, we would have ignored TRX2 since we had previously already released the transaction to Kafka.

If you adjust the configuration and include a new table, then yes changes will begin to stream to Kafka from whatever the low watermark SCN happens to be at that time.
Reply all
Reply to author
Forward
0 new messages