Hi Fairy
So then it would seem that your concern is mostly about the
performance impact of the snapshot rather than the streaming, do I
understand correctly?
If we step back, Debezium offers two types of snapshot modes:
1. Traditional snapshot that happens before streaming begins
2. Incremental snapshot that happens concurrently with
streaming
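For reference, the traditional snapshot is selected via the connector's `snapshot.mode` property, while incremental snapshots are triggered at runtime through a signaling table that you register with `signal.data.collection`. A minimal sketch (the signal table name is a placeholder you'd replace with your own):

```properties
# Traditional snapshot: taken once, before streaming begins
snapshot.mode=initial

# Incremental snapshots are not a snapshot.mode value; they are
# triggered at runtime via a signaling table, which the connector
# must know about (placeholder schema/table name):
signal.data.collection=DEBEZIUM.DEBEZIUM_SIGNAL
```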
As you know, the traditional snapshot must be completed in one
execution or else it must be restarted. I'm not sure whether 2 TB
of data can be extracted in a 2-3 hour window; you'd likely need
to run a test in a non-production environment to verify that you
can meet that requirement. What I can say is that this mode
uses Oracle flashback queries. This mode also requires that your
UNDO RETENTION on the database be longer than the runtime of the
snapshot or the snapshot will fail when the SCN used for the
flashback query ages out of the retention period. The default (as
of Oracle 19) is 15 minutes but generally your DBA will have
increased this in production environments. Increased retention
here is mostly a storage concern, since the undo tablespace that
stores the undo log has to grow to store the volume of changes
for the UNDO_RETENTION period. This mode also requires that your
archive log retention period be longer than the snapshot duration,
since logs for streamed events won't be read until after the
snapshot concludes. Lastly, do note that we don't
apply any locks to the tables when we generate a snapshot; there
is only ever a brief lock taken to capture the table schema
structures for the connector, and this lock can be disabled if
you can guarantee yourself that no schema changes occur during
the snapshot.
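That lock behavior is controlled by the connector's `snapshot.locking.mode` property; a sketch, assuming you can rule out concurrent DDL during the snapshot:

```properties
# Default: a brief shared lock while table structures are read
snapshot.locking.mode=shared

# If you can guarantee no schema changes during the snapshot,
# the lock can be skipped entirely:
# snapshot.locking.mode=none
```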
For incremental snapshots, we shift gears quite a bit. In this
mode, the snapshot can be resumed from where it left off. In
addition, we stream changes concurrently with the snapshot. In
this mode, we do not rely on flashback queries for a consistent
snapshot and so the UNDO_RETENTION configuration does not come
into play. Additionally, since we perform the snapshot
concurrently with streaming, archive log retention is less of a
concern, because we begin streaming changes much sooner; we don't
have to take a consistent snapshot first. But with
this mode, there is a caveat that schema changes are *NOT*
permitted during the incremental snapshot.
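For completeness, an incremental snapshot is kicked off at runtime by inserting a row into the connector's signaling table; a sketch, where the schema and table names are placeholders that must match your `signal.data.collection` setting:

```sql
-- Ad-hoc incremental snapshot of one table; the id is an
-- arbitrary unique string, and the data-collections list names
-- the tables to snapshot (placeholder names shown)
INSERT INTO DEBEZIUM.DEBEZIUM_SIGNAL (id, type, data)
VALUES ('adhoc-1',
        'execute-snapshot',
        '{"data-collections": ["MYSCHEMA.MYTABLE"]}');
COMMIT;
```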
Now, if you can only run the connector for 2 to 3 hours per day
even with streaming changes, then I think you may have some
additional concerns. The connector always resumes mining from
where it left off, so you'll need a fairly long archive log
retention period in place to support this. Furthermore, you will
likely remain perpetually behind in capturing change events, so
you're unlikely to ever reach near real-time.
Chris