Outbox and schemas


Morrigan Jones

Apr 29, 2021, 11:51:33 AM
to debezium
Hey mailing list,

At my company, we're exploring a switch from our current Debezium setup to an outbox-style pattern. One downside of this approach is that we lose the ability to easily and automatically get schematized records from the DB and write them out to Kafka as Avro with their schemas registered.

We found this issue which covers this problem: https://issues.redhat.com/browse/DBZ-1297. Gunnar's proposed solution makes sense, though I think you should also be able to infer or provide a schema for the aggregate ID. However, there hasn't been any movement on this issue in about a year. Is there any plan to move forward on this issue?

I'd also be interested to hear how other people who are using the outbox SMT that comes with Debezium have tackled this issue.

Thanks!

Gunnar Morling

Apr 29, 2021, 4:39:35 PM
to debezium
Hey Morrigan,

> Is there any plan to move forward on this issue?

I'd personally still very much like to see progress here; so far no one has been ready to pick it up, though. So if you're interested in implementing this, a PR would be more than welcome.

> I think you should also be able to infer or provide a schema for the aggregate ID.

This should already be the case? Do you see a case where the message key schema wouldn't be set? If so, I agree that this should be changed.

Reading the issue again, I'd by now lean towards using regular JSON Schema (https://json-schema.org/) for describing the payload schema.

Hth,

--Gunnar
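
To make the JSON Schema idea above concrete, here is a minimal sketch of an outbox row that carries a JSON Schema document next to its payload. The `payload_schema` column, the sample values, and the `check_payload` helper are all assumptions made for illustration; nothing here is an existing Debezium feature, and the helper is only a tiny stand-in for a real JSON Schema validator.

```python
import json

# Hypothetical outbox row. "payload_schema" is NOT a column the Debezium
# outbox SMT knows about; it only illustrates the idea discussed in DBZ-1297.
outbox_row = {
    "id": "6f1c0f0e-0000-0000-0000-000000000001",
    "aggregatetype": "order",
    "aggregateid": "42",
    "type": "OrderCreated",
    "payload": json.dumps({"orderId": 42, "total": 19.99}),
    "payload_schema": json.dumps({
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            "orderId": {"type": "integer"},
            "total": {"type": "number"},
        },
        "required": ["orderId", "total"],
    }),
}

# Maps the few JSON Schema type names used above to Python types.
_JSON_TYPES = {
    "integer": int,
    "number": (int, float),
    "string": str,
    "object": dict,
}


def check_payload(row):
    """Check top-level required fields and types only.

    A stand-in for a real JSON Schema validator: it ignores nesting,
    formats, and every keyword other than "required" and "properties".
    """
    schema = json.loads(row["payload_schema"])
    payload = json.loads(row["payload"])
    errors = []
    for name in schema.get("required", []):
        if name not in payload:
            errors.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name in payload and not isinstance(payload[name], _JSON_TYPES[spec["type"]]):
            errors.append(f"wrong type for field: {name}")
    return errors
```

Calling `check_payload(outbox_row)` on the row above returns an empty list. In a real setup the schema would more likely be handed to a converter or schema registry than checked by hand like this.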

Morrigan Jones

May 3, 2021, 2:54:17 PM
to debe...@googlegroups.com
Thanks for your response, Gunnar! We're still deciding how we want to handle schemas across teams. If we decide to do something in line with the ticket, we'll see about contributing.

>> I think you should also be able to infer or provide a schema for the aggregate ID.

> This already should be the case? Do you see a case where the message key schema wouldn't be set? If so, I agree that this should be changed.

Maybe I'm mistaken, but I thought the SMT handles the aggregate ID and payload similarly, in that it (mostly) just gets passed along. If both are JSON strings, and we add the ability to specify a schema column for the payload, then I'd want to do the same for the ID.
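
The symmetry described above could be sketched roughly as follows. This is a simplified model and not Debezium's actual code: the `route` function and the `aggregateid_schema` and `payload_schema` columns are hypothetical, and the topic name only mirrors the `outbox.event.<aggregatetype>` pattern that the outbox SMT uses by default, as far as I understand it.

```python
import json
from typing import NamedTuple, Optional


class Routed(NamedTuple):
    topic: str
    key: str
    value: str
    key_schema: Optional[dict]
    value_schema: Optional[dict]


def route(row: dict) -> Routed:
    """Simplified model of outbox routing, not Debezium's implementation."""

    def schema_of(column: str) -> Optional[dict]:
        # Parse the schema column if the row has one, otherwise report None.
        raw = row.get(column)
        return json.loads(raw) if raw is not None else None

    return Routed(
        topic=f"outbox.event.{row['aggregatetype']}",
        key=row["aggregateid"],    # passed along unchanged
        value=row["payload"],      # passed along unchanged
        key_schema=schema_of("aggregateid_schema"),  # hypothetical column
        value_schema=schema_of("payload_schema"),    # hypothetical column
    )
```

If the aggregate ID is itself a JSON string, for example `{"id": 42}`, the key gets the same treatment as the payload here: the string is forwarded as-is, and its schema travels in a column of its own.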

--


Morrigan Jones

Principal Engineer

jwplayer.com
