Messages to multiple topics with different transforms


Noe Charmet

Oct 21, 2021, 8:08:48 AM
to debezium
Hello,
One thing I would like to achieve is to publish change events to different topics while applying different transforms to each.

The idea on our side is that a table may contain sensitive data, and our consumers have different levels of permissions. We would have one topic with all columns, and another one with either only a whitelist of columns or with hash-masked columns.

Would this be possible with the current Debezium server implementation (we use Google Pub/Sub)?

Best regards

Chris Cranford

Oct 21, 2021, 9:30:06 AM
to debe...@googlegroups.com, Noe Charmet
Hi Noe -

If you wanted to accomplish this entirely with Debezium, then you would start multiple Debezium Server instances, each running the specific SMT configuration needed to emit changes to the destination topic in the form its consumers need.

Another option, which is Kafka-based, is to have Debezium Server emit events to Kafka with all the columns and then use a KStreams application to read the events and re-emit them to the obfuscated topics with the appropriate fields masked or removed. That way you only read from the data source once. I'm not a Google Pub/Sub expert, so I'm not sure whether that platform offers similar functionality.
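
To make the "masked or removed" step concrete, here is a minimal sketch in plain Python of what the re-emitting application would do to each event. It assumes the event has already been decoded into a dict with Debezium's `before`/`after` row images and no schema wrapper. The function names, the salt handling, and the choice of SHA-256 are assumptions for illustration only, not a Debezium or Kafka API.

```python
import hashlib
from typing import Any, Dict, Iterable, Optional

Row = Dict[str, Any]


def mask_value(value: Any, salt: str) -> Optional[str]:
    """Replace a value with a salted SHA-256 hex digest; keep NULLs as None."""
    if value is None:
        return None
    return hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()


def redact_row(row: Optional[Row], drop: set, mask: set, salt: str) -> Optional[Row]:
    """Return a copy of one row image with columns dropped or hash-masked."""
    if row is None:  # e.g. "before" on an insert, "after" on a delete
        return None
    out: Row = {}
    for column, value in row.items():
        if column in drop:
            continue
        out[column] = mask_value(value, salt) if column in mask else value
    return out


def redact_event(event: Dict[str, Any],
                 drop: Iterable[str] = (),
                 mask: Iterable[str] = (),
                 salt: str = "") -> Dict[str, Any]:
    """Redact both row images of a change event; other fields pass through."""
    redacted = dict(event)  # shallow copy so the input event is not modified
    for side in ("before", "after"):
        if side in redacted:
            redacted[side] = redact_row(redacted[side], set(drop), set(mask), salt)
    return redacted
```

For example, `redact_event(event, drop={"ssn"}, mask={"email"}, salt="...")` would remove a hypothetical `ssn` column and hash a hypothetical `email` column in both row images, leaving `op`, `source`, and the other envelope fields untouched.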

HTH,
CC
--
You received this message because you are subscribed to the Google Groups "debezium" group.
To unsubscribe from this group and stop receiving emails from it, send an email to debezium+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/debezium/cfecc36c-a7a4-4330-9e52-fd8d7bea5a82n%40googlegroups.com.

Nuria Ruiz

Oct 21, 2021, 2:04:49 PM
to debezium
Hello, 

It is probably a lot easier to have one CDC consumer (i.e. one Debezium process) publishing to `raw.<database>.<table>` topics, and then have a Kafka consumer (or a ksql/kstream app) that reads from those topics and (using some configuration) republishes each one to a matching topic (say `redacted.<database>.<table>`), one per table. This has fewer moving parts, and replicating data from CDC topics into topics with different configs (say compaction) is a common strategy.

Thanks,

Nuria

Nuria Ruiz

Oct 21, 2021, 2:13:51 PM
to debe...@googlegroups.com
Also, a simple Kafka-to-Kafka consumer is easier to set up than a ksql
or kstream application, and redacting of fields does not benefit from
streaming.
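
As a rough illustration of how small such a consumer can be, here is a sketch of the read, redact, republish loop in plain Python. It is deliberately not tied to any Kafka client: `records` stands in for whatever the consumer yields and `send` for whatever the producer exposes, and both are assumptions, as are the function names. The `raw.` and `redacted.` prefixes follow the topic naming suggested earlier in this thread.

```python
from typing import Any, Callable, Dict, Iterable, Tuple

RAW_PREFIX = "raw."
REDACTED_PREFIX = "redacted."

Event = Dict[str, Any]


def redacted_topic(raw_topic: str) -> str:
    """Map raw.<database>.<table> to redacted.<database>.<table>."""
    if not raw_topic.startswith(RAW_PREFIX):
        raise ValueError(f"not a raw CDC topic: {raw_topic!r}")
    return REDACTED_PREFIX + raw_topic[len(RAW_PREFIX):]


def republish(records: Iterable[Tuple[str, Event]],
              redact: Callable[[Event], Event],
              send: Callable[[str, Event], None]) -> int:
    """Redact each (topic, event) pair and send it to the matching topic.

    Every record is handled independently, so no windowing, joins, or
    state are needed. Returns the number of events republished.
    """
    count = 0
    for topic, event in records:
        send(redacted_topic(topic), redact(event))
        count += 1
    return count
```

In a real deployment `records` would be fed by the Kafka consumer's poll loop and `send` would wrap the producer call; offset commits, serialization, and error handling are left out of this sketch.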

Nuria Ruiz

Oct 22, 2021, 2:02:50 PM
to debezium
Sorry, corrected below.
* as redacting of fields does not benefit from streaming.
