Hey guys,
Hope you are doing fine. I have a couple of questions about snapshot modes and the Postgres connector.
I read in the documentation that the initial snapshot mode is not recommended because it can lead to loss of events, and that the exported mode is recommended instead.
What does having a snapshot really help with? As far as I'm aware, the LSN is stored in the Kafka offset topic and is used to recover from connector failures, so where does a snapshot come into play?
What happens if I use the never snapshot mode? Let's say that, for whatever reason, there is an extended downtime for the connector, but a previous LSN is still stored in the Kafka offset topic. How long does this downtime have to be to cause a loss of events, for instance because the WAL was purged?
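To make the downtime question concrete, here is a small Python sketch of how I picture the gap between the stored LSN and the current WAL position. It relies only on the textual LSN format (two hexadecimal numbers separated by a slash, giving the high and low 32 bits of a 64-bit position). The function names and the sample LSN values are made up for illustration and are not taken from our system.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual Postgres LSN such as '16/B374D848' to a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def lsn_lag_bytes(current_lsn: str, stored_lsn: str) -> int:
    """Bytes of WAL between the connector's stored LSN and the current position."""
    return parse_lsn(current_lsn) - parse_lsn(stored_lsn)


if __name__ == "__main__":
    # Hypothetical values: the current WAL position on the server and the
    # LSN last recorded in the Kafka offset topic.
    print(lsn_lag_bytes("16/B374D848", "16/B3000000"))
```

My assumption is that if the WAL covering that gap is no longer retained, the events in between cannot be streamed anymore, which is the situation I am asking about.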
If I originally had the snapshot mode set to never for an extended period of time and decided to change the configuration to exported instead, would that cause reprocessing of old events?
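As a rough illustration of the change I mean, the Python sketch below builds two connector configurations that differ only in snapshot.mode. The host, database, and slot names are placeholders I made up, and the rest of a real configuration is omitted.

```python
import json

# Hypothetical base configuration; host, database and slot names are placeholders.
base_config = {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "example-host",
    "database.dbname": "exampledb",
    "slot.name": "example_slot",
    "snapshot.mode": "never",
}

# The change in question: same connector and slot, only the snapshot mode differs.
updated_config = {**base_config, "snapshot.mode": "exported"}

# Collect the properties whose values differ between the two configurations.
changed = {
    key: (base_config[key], updated_config[key])
    for key in base_config
    if base_config[key] != updated_config[key]
}
print(json.dumps(changed))
```

The question is what the connector does when it is restarted with the second configuration while offsets from the first one already exist.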
Finally, the documentation for both modes uses the phrase "...based on the point in time when the replication slot was created". The difference is that the never mode starts streaming from that point on, while the exported mode starts the snapshot from that point on. However, I fail to understand how to know when that point is. Is it the last time the WAL was purged, or the very beginning of the creation of the Postgres DB?
If it is of any help, we are using AWS RDS for PostgreSQL to host the DB that Debezium is connected to.
Sorry for having so many questions, but I really would like to understand how it works underneath to make sure that we are doing things right.
Hope to hear from you, and thank you very much in advance.
Cheers,
Randy