Not strictly a Debezium question, but one about Debezium and Kafka together.
Apologies, but I thought the easiest way to explain this is to lead you down my thought-process path... :)
So, for example, I have a DB with these tables:
- customers
- customer_emails
- customer_addresses
- customer_telephones
- etc
...I've created a Debezium connector that routes the CDC events for all of these tables into a single customer topic.
I want to use this topic for replay by any new consumer, i.e. allowing it to get a full load of all the customer data before it then gets new messages coming through.
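For context, the connector config is roughly the shape below (a minimal sketch, assuming the Debezium Postgres connector and the stock Kafka Connect RegexRouter SMT; the hostnames, database and table names are made up, and the key handling that gives me the (customerId / table name) key isn't shown):

```json
{
  "name": "customer-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "db",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "dbz",
    "database.dbname": "shop",
    "topic.prefix": "shop",
    "table.include.list": "public.customers,public.customer_emails,public.customer_addresses,public.customer_telephones",
    "transforms": "route",
    "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.route.regex": "shop\\.public\\.customer.*",
    "transforms.route.replacement": "customer"
  }
}
```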
Given the topic contains these messages, with CRUD events keyed on (customerId / table name):
- New customer (C)
- Change customer name (U)
- New customer_address (C)
- New customer_email (C)
- Change customer_email (U)
...any new consumer will have logic in it to deal with Creates and Updates (as well as Reads and Deletes) differently.
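To make that concrete, this is roughly the shape of the consumer logic I have in mind (a minimal Java sketch; the topic name, group id and the handle* methods are placeholders, and I'm assuming the standard Debezium envelope with its op field, serialised as schemaless JSON):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CustomerConsumer {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "customer-projector");           // placeholder group id
        props.put("auto.offset.reset", "earliest");            // new consumers replay from the start
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("customer"));            // the single routed topic
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    if (record.value() == null) {                // tombstone left behind after a delete
                        handleDelete(record.key());
                        continue;
                    }
                    JsonNode envelope = MAPPER.readTree(record.value());
                    String op = envelope.path("op").asText();    // Debezium op: c, u, d, r
                    JsonNode after = envelope.path("after");
                    switch (op) {
                        case "c", "r" -> handleCreate(record.key(), after); // insert / snapshot read
                        case "u"      -> handleUpdate(record.key(), after); // update of an existing row
                        case "d"      -> handleDelete(record.key());        // delete of a row
                        default       -> { /* ignore unknown ops */ }
                    }
                }
            }
        }
    }

    // Placeholder projections into whatever store the consumer maintains.
    static void handleCreate(String key, JsonNode after) { /* ... */ }
    static void handleUpdate(String key, JsonNode after) { /* ... */ }
    static void handleDelete(String key) { /* ... */ }
}
```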
That all sounds fine.
However, now I add topic compaction into the mix....
To avoid consumers re-reading duplicates and deleted items, I'm setting the topic to use log compaction and tombstoning.
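Concretely, the topic config change is along these lines (a sketch using the Kafka Java AdminClient; the topic name and tuning values are just examples):

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class CompactCustomerTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "customer");
            Collection<AlterConfigOp> ops = List.of(
                // keep only the latest record per key
                new AlterConfigOp(new ConfigEntry("cleanup.policy", "compact"), AlterConfigOp.OpType.SET),
                // how long tombstones (null values) survive after compaction
                new AlterConfigOp(new ConfigEntry("delete.retention.ms", "86400000"), AlterConfigOp.OpType.SET),
                // compact more eagerly than the default
                new AlterConfigOp(new ConfigEntry("min.cleanable.dirty.ratio", "0.1"), AlterConfigOp.OpType.SET)
            );
            admin.incrementalAlterConfigs(Map.of(topic, ops)).all().get();
        }
    }
}
```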
Log compaction will leave the latest message for each key in the topic:
- Change customer name (U)
- New customer_address (C)
- Change customer_email (U)
In this scenario a new consumer can't treat these replayed Update events as Updates, but has to treat them as Creates? However it will have to treat new, non-compacted Update messages as actual Updates (or whatever their actual CRUD event is)?
So a consumer will have to treat messages in the compacted tail differently to messages in the non-compacted head.
This also sounds doable, but it's not something I've come across in my Kafka travels.
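To show what I mean, the dispatch in the earlier consumer sketch would have to become something like this (a fragment that drops into the class above; isInCompactedTail() is entirely hypothetical, and is exactly the predicate I don't know how to implement):

```java
// Replacement for the per-record dispatch in the earlier consumer sketch.
static void dispatch(ConsumerRecord<String, String> record) throws Exception {
    if (record.value() == null) {          // tombstone
        handleDelete(record.key());
        return;
    }
    JsonNode envelope = MAPPER.readTree(record.value());
    JsonNode after = envelope.path("after");

    if (isInCompactedTail(record.topic(), record.partition(), record.offset())) {
        // Compacted tail: only the latest event per key survives, so regardless of
        // its original op this is effectively the full current state -> treat as a Create.
        handleCreate(record.key(), after);
        return;
    }

    // Non-compacted head: the op still means what it says.
    switch (envelope.path("op").asText()) {
        case "c", "r" -> handleCreate(record.key(), after);
        case "u"      -> handleUpdate(record.key(), after);
        case "d"      -> handleDelete(record.key());
        default       -> { /* ignore unknown ops */ }
    }
}

// Hypothetical: "has compaction already passed this offset on this partition?"
// I can't find any consumer API that answers this question.
static boolean isInCompactedTail(String topic, int partition, long offset) {
    throw new UnsupportedOperationException("this is the bit I'm missing");
}
```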
However I'm currently stuck, as there doesn't seem to be a way for a consumer to tell where the head and tail meet on a topic, i.e. how far compaction has currently got.
This leads me to think I'm missing something here?
Grateful for any advice if people have figured this out already, or can see a flaw in my thinking?
best wishes
n99