Hi Team,
We have observed that whenever the Debezium connector is restarted, duplicate messages are published to the Kafka topic by the connector. This is particularly concerning because it occurs in our production environment, which demands high data consistency and integrity.
In an effort to address this issue, we have reviewed the official Debezium documentation, specifically:
https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-when-things-go-wrong
https://debezium.io/documentation/faq/#why_must_consuming_applications_expect_duplicate_events
These resources explain why duplicate events can occur during connector restarts: Debezium provides at-least-once delivery semantics, so consuming applications must be prepared to handle redelivered events. However, to maintain the reliability and data consistency of our system, we are seeking your assistance in resolving this behaviour permanently.
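Since the FAQ states that consumers must expect duplicates, one interim mitigation on our side is consumer-level deduplication. The sketch below is a hypothetical helper (not part of Debezium); it assumes each MySQL change event's `source` block carries the binlog `file` and `pos` fields, which together identify a change uniquely, and drops redeliveries of positions already seen:

```python
from collections import OrderedDict

class Deduplicator:
    """Bounded memory of recently seen binlog positions (sketch, not production-hardened)."""

    def __init__(self, max_entries=100_000):
        self._seen = OrderedDict()
        self._max = max_entries

    def is_duplicate(self, event):
        # Key the event by its binlog coordinates from the Debezium `source` block.
        source = event["source"]
        key = (source["file"], source["pos"])
        if key in self._seen:
            return True
        self._seen[key] = True
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # evict the oldest entry
        return False

# Example: the same event delivered twice, as happens after a connector restart.
dedup = Deduplicator()
event = {"source": {"file": "mysql-bin.000001", "pos": 154}}
print(dedup.is_duplicate(event))  # first delivery -> False, process it
print(dedup.is_duplicate(event))  # redelivery -> True, skip it
```

A durable implementation would persist the seen positions (or use idempotent upserts keyed on the record's primary key in the sink), but the principle is the same.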
We kindly request your support and expertise in rectifying this issue within the Debezium connector. Our primary objective is to ensure that connector restarts do not result in duplicate messages in the Kafka topic.
We understand that Debezium is a widely adopted tool for change data capture (CDC), and we value the benefits it provides. However, the duplicate message issue is a critical concern for us, and we believe that, with your assistance, we can overcome this challenge.
Please let us know how we can proceed with this.
Thanks.
A note on the production failure: after a restart, the connector looks for a binlog file that has already been deleted under the MySQL binlog retention policy.
The duplicates appear whenever the connector is restarted.
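If the binlog is expiring before the connector can resume, one mitigation on our side (an assumption to be confirmed for our MySQL version and disk budget) is to lengthen the retention window so the connector's last recorded position is still available after a restart. On MySQL 8.0 this is controlled by the `binlog_expire_logs_seconds` system variable:

```sql
-- Check the current binlog retention (MySQL 8.0+)
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';

-- Extend retention to 7 days (value in seconds) so the connector can
-- resume from its stored offset after an outage; needs disk headroom.
SET GLOBAL binlog_expire_logs_seconds = 604800;
```

Note that a longer retention only prevents the "binlog file not found" failure; it does not by itself eliminate duplicate events on restart.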
Thanks.