Transactional updates - How to get data from all the tables updated in a transaction?

240 views
Skip to first unread message

Jay

unread,
Jul 8, 2022, 9:01:18 AM7/8/22
to debezium
Hi All

I am reading the documentation on Debezium. Based on that my understanding is Debezium will create topic per table or it can be configured to have one topic for all the tables. In addition, there is a separate topic created for transaction.

Is there a way to get a consolidated json for all the tables updated in a transaction? Is there a design/architecture I can refer to achieve the same?

Thank you

Regards
Jay

Jay

unread,
Jul 8, 2022, 9:15:49 AM7/8/22
to debezium
Ok, I found something in the conversation on Complex transactions: https://groups.google.com/g/debezium/c/M2Lqbbv-oZM/m/Gg1GZrOSAAAJ .

As per the comments in the above conversation, one alternative it to write a K-Stream application. Are there any other alternatives possible?

With K-Stream, I am assuming I will have to read data from each and every topic to create an aggregated payload. This will have impact on performance. We have huge number of tables and I mean really large. I want to ensure transaction updates are done in order of the transaction. Reading on all the topics and then aggregating would come at a cost of latency and throughput. Has anyone done this? What has been the observation?

Kindly guide me.

Thank you

Chris Cranford

unread,
Jul 8, 2022, 9:44:04 AM7/8/22
to debe...@googlegroups.com, Jay
Hi Jay -

That's probably the most reliable way when you look at it from the perspective of a transaction could be sourced from anything, a CLI change by a DBA, a change made by an application, or someone inside a GUI SQL client.  You could look at the outbox pattern to see if that would fit into your architecture, but I suspect Kstreams will probably be the most reliable and flexible solution.

Chris
--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/debezium/5f13a8b3-83f9-4094-839f-608313108443n%40googlegroups.com.

Jay

unread,
Jul 8, 2022, 9:58:07 AM7/8/22
to debezium
Thank you Chris.

We have 100+ tables and in a transaction any number of tables can get updated. Is there a way I can figure out which table was updated and scan only those topics to create an aggregated payload?

Based on your experience, can you guide me what would the impact be on latency and throughput? We will do the POC, but any insights would be extremely helpful.

Appreciate your help.

Regards
Jay

Chris Cranford

unread,
Jul 8, 2022, 10:07:32 AM7/8/22
to debe...@googlegroups.com, Jay
Hi Jay -

Absolutely, the "END" (commit) transaction event we emit when "provide.transaction.metadata" is enabled provides a "data-collections" property that contains the name of all the collections/tables that had an event emitted for that specific transaction.  So you should be able to use this event to then look at the contents of the respective table-level topics and locate records that contain that transaction identifier within the event's payload in the source information block.

Chris

Jay

unread,
Jul 8, 2022, 10:23:34 AM7/8/22
to debezium
Thank you Chris

Any guidance on the performance impact?

Regards
Jay

Chris Cranford

unread,
Jul 9, 2022, 10:41:20 AM7/9/22
to debe...@googlegroups.com, Jay
Hi Jay -

Kstream apps aren't all that uncommon when you need to do parallelized stream processing, and that would definitely help in terms of performance.  But unfortunately, I don't have a lot of experience with Kstreams and I'm likely not the best person to give any evaluation on performance.  You could consider reaching out to the Kafka team directly and propose your scenario and they may be able to give you a better idea in terms of performance benchmarks.

Chris

Jay

unread,
Jul 18, 2022, 4:36:55 AM7/18/22
to debezium
Hi Chris

Thank you for your suggestions and guidance. I will read on K-Stream and will try to find more details.

Regards
Jay
Reply all
Reply to author
Forward
0 new messages