Hello everyone,
I am fairly new to Debezium and am currently trying to set up a CDC pipeline between our source DB on SQL Server and a Postgres DWH.
After speaking with the DevOps team, it turned out that we cannot realistically take the initial snapshot from the source before starting streaming (due to current hardware limitations on the Debezium server). Do you think the following workaround would be feasible?
1. Turning on CDC for the needed tables in prod and on a read-only replica
2. Determining the max LSN on the read-only replica
3. Taking a manual snapshot from the replica through ETL (bypassing Debezium)
4. Storing that max LSN in the connector config
5. Starting the connector with schema_only_recovery (it will connect to prod, assuming the logs are still retained)
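For steps 2 and 4, here is a minimal sketch of how the max LSN might be read and turned into a readable string. It assumes the value comes from SQL Server's `sys.fn_cdc_get_max_lsn()`, which returns a `binary(10)`. The helper name `format_lsn` is made up for illustration, and the colon-separated 4:4:2-byte hex layout is an assumption about how Debezium displays LSNs, so it should be checked against the connector's actual offsets. The sample value is invented.

```python
# T-SQL to run on the replica (step 2); the function returns binary(10).
MAX_LSN_QUERY = "SELECT sys.fn_cdc_get_max_lsn() AS max_lsn;"


def format_lsn(raw: bytes) -> str:
    """Render a 10-byte SQL Server LSN as colon-separated hex (4:4:2 bytes).

    Hypothetical helper: the layout is assumed, not confirmed by the post.
    """
    if len(raw) != 10:
        raise ValueError(f"expected 10 bytes, got {len(raw)}")
    # Split into the three LSN components and hex-encode each one.
    return ":".join(part.hex() for part in (raw[0:4], raw[4:8], raw[8:10]))


if __name__ == "__main__":
    # Invented sample value standing in for the query result.
    sample = bytes.fromhex("0000002700000ac00002")
    print(format_lsn(sample))  # prints 00000027:00000ac0:0002
```

The query string would be executed with whatever SQL Server driver the ETL already uses. The sketch only covers the formatting, so it runs without a database connection.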
In theory this sounds like a feasible solution, but I probably lack the insight to be completely sure, and we cannot start testing just yet, so I thought I would ask here.
If it is indeed feasible, what are some things besides log retention that I have to consider?
Any tips, critique, and help are appreciated.
Kind regards,
Aiana.