Kafka Engine vs JDBC Driver

Sateesh Kommineni

Mar 29, 2021, 4:11:02 PM
to ClickHouse

We are trying to monitor Kafka topics and use the data to drive analytical dashboards.

We can either create the tables in ClickHouse using the Kafka Engine, or use the JDBC driver and send the event data to ClickHouse from a custom Kafka consumer, Apache NiFi, etc.

What are the downsides of using the Kafka Engine? We want to monitor every topic created on the Kafka cluster, and the number of topics is only going to grow. How does that affect the ClickHouse server/cluster if we use the Kafka Engine? If we don't use the Kafka Engine, the retrieval of data from Kafka is externalized and therefore has no impact on the cluster.

If we use the JDBC driver and send the data in batches, what are the optimal batch sizes?

Are there any best practices for this?
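
To make the batching option concrete, here is a minimal sketch of the buffer an external consumer might keep in front of ClickHouse: rows are collected and handed to a single insert call once a row-count or age limit is reached. Everything in it is an assumption for illustration: the `BatchBuffer` class, the `flush` callback (which would wrap whatever batch insert the JDBC driver or another client performs), and the `max_rows` / `max_age_s` defaults, which are placeholders to tune rather than recommended values.

```python
import time
from typing import Callable, List, Optional


class BatchBuffer:
    """Collects rows and passes them to `flush` as one batch when a limit is hit."""

    def __init__(self, flush: Callable[[List[dict]], None],
                 max_rows: int = 10_000,      # placeholder, not a recommendation
                 max_age_s: float = 5.0,      # placeholder, not a recommendation
                 clock: Callable[[], float] = time.monotonic):
        self._flush = flush
        self._max_rows = max_rows
        self._max_age_s = max_age_s
        self._clock = clock
        self._rows: List[dict] = []
        self._first_at: Optional[float] = None  # arrival time of the oldest buffered row

    def add(self, row: dict) -> None:
        if not self._rows:
            self._first_at = self._clock()
        self._rows.append(row)
        too_many = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._first_at >= self._max_age_s
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered; does nothing when the buffer is empty."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._first_at = None
            self._flush(batch)
```

Note that the age limit is only checked when a new row arrives, so a real consumer would also call `flush()` from its poll loop or on shutdown, and would commit Kafka offsets only after the insert succeeds.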


Denis Zhuravlev

Mar 29, 2021, 4:33:29 PM
to ClickHouse
The biggest problem is that the Kafka Engine does not provide logs and has no tools for debugging or for processing bad messages.
One day the Kafka Engine will stop consuming, and you will spend hours trying to understand why. Eventually you will find that some field in some message was corrupted by bit rot, and you will have to shift the commit offset manually.
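
For contrast, an external consumer can set bad messages aside instead of stalling on them. The sketch below is only an illustration under assumptions: it supposes the messages are JSON objects (the thread does not say what format is used), and `split_messages` is a hypothetical helper, not part of any Kafka or ClickHouse client.

```python
import json
from typing import Iterable, List, Tuple


def split_messages(messages: Iterable[Tuple[int, bytes]]):
    """Parse (offset, payload) pairs, keeping going when a payload is bad.

    Returns (good, bad): parsed rows, and (offset, payload, error) triples
    that can be logged or written to a dead-letter topic for inspection.
    """
    good: List[dict] = []
    bad: List[Tuple[int, bytes, str]] = []
    for offset, payload in messages:
        try:
            row = json.loads(payload)
            if not isinstance(row, dict):
                raise ValueError("expected a JSON object")
            good.append(row)
        except ValueError as exc:  # also covers JSONDecodeError and UnicodeDecodeError
            bad.append((offset, payload, str(exc)))
    return good, bad
```

Because the offset and the raw payload are kept with each failure, the corrupted message can be found directly rather than by hunting for the point where consumption stopped.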