Hi Chris, thanks for that suggestion - it worked perfectly. Because the history topic on the original connector was empty anyway (the retention period had passed), it was a simple matter to configure the new "schema_only" connector to repopulate the original connector's history topic directly.
So for anyone else, this is what we did:
- Create a duplicate Connector for each failing Connector, with a unique name to avoid conflicts (we have a Terraform module that configures each Connector and its associated resources using the Mongey Kafka Connect provider, so it was simple to supply a different prefix for the naming)
- Ensure the new Connector is configured with: snapshot.mode=schema_only
- We had to hack our module a little to allow overriding the database.history.kafka.topic value, so that the new Connector reused the same topic name as the original Connector
- Ensure the Kafka identity used by the new Connector has the required ACLs on the old DB History Topic
- Start the new Connector; it repopulated the DB History Topic with the current state of the DB schema
- The old Connectors picked up that the topic was no longer empty, started running, and recent changes began to flow through again
- Delete the old Connector and associated resources
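To illustrate the key settings from the steps above, here is a minimal Python sketch of building and submitting the duplicate connector's configuration via the Kafka Connect REST API. All hostnames, credentials, and the `myserver` naming are hypothetical placeholders; the points that matter are `snapshot.mode=schema_only` and pointing `database.history.kafka.topic` at the original connector's history topic:

```python
import json
import urllib.request


def build_duplicate_connector_config(prefix: str, original_history_topic: str) -> dict:
    """Build a Debezium SQL Server connector config that takes a schema-only
    snapshot and writes into the ORIGINAL connector's history topic.
    Host, database, and credential values are placeholders."""
    return {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
        "database.hostname": "sqlserver.example.com",  # placeholder
        "database.port": "1433",
        "database.user": "debezium",                   # placeholder
        "database.password": "********",
        "database.dbname": "MyDb",                     # placeholder
        # Unique logical server name per duplicate, so topics don't clash.
        "database.server.name": f"{prefix}-myserver",
        # Snapshot the schema only; do not re-snapshot table data.
        "snapshot.mode": "schema_only",
        # Reuse the ORIGINAL connector's history topic so it gets
        # repopulated with the current schema state.
        "database.history.kafka.topic": original_history_topic,
        "database.history.kafka.bootstrap.servers": "kafka:9092",  # placeholder
    }


def create_connector(connect_url: str, name: str, config: dict) -> None:
    """POST the connector definition to the Kafka Connect REST API."""
    body = json.dumps({"name": name, "config": config}).encode()
    req = urllib.request.Request(
        f"{connect_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)
```

In our case the config was generated by the Terraform module rather than posted by hand, but the resulting settings were equivalent.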
The remaining problem at that point was that CDC in SQL Server appears to default to retaining 72 hours of change history, so we didn't magically get all of the changes since the Connectors stopped working, and we had to signal a refresh. I believe this also means that even if the schema had changed at some point, the refresh would still work, since it would match the schema that we had just recreated.
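For reference, the "signal a refresh" step used Debezium's signalling table to request an ad-hoc incremental snapshot. A minimal sketch of building that insert, assuming a hypothetical signal table named dbo.debezium_signal and an illustrative table list (the actual table names and signal table are whatever your signal.data.collection setting points at):

```python
import json
import uuid


def build_snapshot_signal_sql(signal_table: str, data_collections: list) -> str:
    """Build the INSERT that asks Debezium to run an ad-hoc incremental
    snapshot of the given tables. Debezium watches the signal table
    (configured via signal.data.collection) and acts on new rows."""
    payload = json.dumps({"data-collections": data_collections})
    signal_id = str(uuid.uuid4())  # any unique id is fine
    return (
        f"INSERT INTO {signal_table} (id, type, data) "
        f"VALUES ('{signal_id}', 'execute-snapshot', '{payload}')"
    )
```

Executing the resulting statement against the source database triggers the re-snapshot of the listed tables.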
I'm still puzzled about why we weren't warned earlier that our topic was not configured correctly (we had 30-day retention configured, which I triple-checked). One thing I think I noticed while working this out: at the point where I had not yet granted the new Connector's Kafka identity the ACL to read the old DB History Topic, the Connector still reported the topic as OK but then failed at some other point (sorry, I should have made more detailed notes). I note that the methods in the Kafka client library throw checked exceptions for authorisation errors, and I saw nothing in the Debezium code to suggest that those exceptions were swallowed or turned into something else.
We are on the latest Debezium 1.9 version.
Cheers,
Patrick