Hi,
We ingest data through a Kafka table -> materialized view -> final_table pipeline.
It looks like the Kafka engine is not reliable: we see cases where a single Kafka message is written to final_table twice (or more).
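For reference, the pipeline looks roughly like the sketch below. The names kafka_queue and kafka_consumer, the columns (event_id, ts, value), the broker/topic/group settings, and the plain MergeTree engine are placeholders for illustration, not our real schema. The last query is one way to see the duplicates, assuming event_id uniquely identifies a message.

```sql
-- Hypothetical Kafka source table (all settings are placeholder values).
CREATE TABLE kafka_queue
(
    event_id UInt64,
    ts       DateTime,
    value    Float64
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list  = 'events',
         kafka_group_name  = 'clickhouse_consumer',
         kafka_format      = 'JSONEachRow';

-- Target table; a plain MergeTree is assumed here for brevity.
CREATE TABLE final_table
(
    event_id UInt64,
    ts       DateTime,
    value    Float64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (ts, event_id);

-- Materialized view that moves consumed rows from Kafka into final_table.
CREATE MATERIALIZED VIEW kafka_consumer TO final_table
AS SELECT event_id, ts, value
FROM kafka_queue;

-- List messages that were written more than once
-- (assumes event_id is unique per Kafka message).
SELECT event_id, count() AS copies
FROM final_table
GROUP BY event_id
HAVING copies > 1;
```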
There is also an agg_view that aggregates data from final_table:
CREATE MATERIALIZED VIEW agg_view
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{layer}-{shard}/...
as select ... from final_table
How can we deal with duplicates in final_table so that only a single copy of each row is aggregated?
I hoped to use a collapsing engine for final_table, but then I run into the problem that the materialized view with the aggregating merge tree engine cannot use the FINAL keyword:
CREATE MATERIALIZED VIEW agg_view
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{layer}-{shard}/...
as select ... from final_table FINAL
"
SQL Error [397]: ClickHouse exception, code: 397, host: localhost, port: 8123; Code: 397, e.displayText() = DB::Exception: There was an error on [node01:9000]: Code: 397, e.displayText() = DB::Exception: MATERIALIZED VIEW cannot have PREWHERE, SAMPLE or FINAL. (version 19.16.3.6) (version 19.16.3.6)
"
How can we aggregate correctly when final_table contains duplicates?
Regards,
Zoran