Hi Johan -
The Oracle connector maintains a transaction cache. This cache is
managed by a watermark system with a low and high SCN watermark.
The low watermark is the SCN where the oldest, but yet not
committed or rolled back transaction began. Since Oracle tracks
all changes in the redo logs, including those that are rolled
back, we mine continuously and maintain events in the transaction
cache until we observe a COMMIT or ROLLBACK. The high watermark
is the SCN of the most recent COMMIT or ROLLBACK. This enables
the connector to avoid re-sending committed transactions between
the low and high SCN watermarks when a connector is restarted.
So how does that relate to long running transactions?
The biggest drawback to long running transactions is that most
Oracle environments do not maintain archive logs indefinitely.
Some environments purge them after 24 hours, others are less
aggressive. So if you have a long running transaction that lasts
longer than your archive log retention period, you can quickly see
that if the connector is restarted for any reason and the low
watermark SCN points to a position in the redo logs inside a log
that has been removed, the connector will no longer be capable of
starting without adjusting the offsets and accepting data loss.
You can obviously address the data loss by running an Incremental
Snapshot after you've manipulated the offsets, but that's not
always ideal and depends entirely on the volume of data in
question.
And finally, to touch on your last question, the transaction
semantics on the destination depends on the sink connector, which
isn't Debezium related. I believe the JdbcSinkConnector likely
handles the sink side as one transaction per event. There may be
some ways you can work around this by writing a KStreams
application to combine the transaction metadata topic with all the
table-specific topics for a given transaction and emit an
aggregated event to a new topic for consumption, but you'd be
looking at an entirely custom sink connector to handle that
aggregate.
Hope that answers your questions,
Chris