I have created a 5-node cluster with 3 replicas per shard.
Next, I created a ReplicatedMergeTree local table on all the nodes and a Distributed table on top of these replicated tables. I then inserted data into the Distributed table from a log table that contains 2337420 records.
When I run count(*) on my Distributed table, I get 7012260, i.e. the total across all the replicas, which is exactly 3 x 2337420.
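To make the comparison concrete, the check looks roughly like the sketch below. It assumes the current database is the one that holds both tables (defined further down) and that the local count is taken on a single node; the last query only restates the arithmetic on the two numbers above.

```sql
-- Count through the Distributed table (the query that returns 7012260).
SELECT count(*) FROM replica_tableD;

-- Count in the local ReplicatedMergeTree table on one node, for comparison.
SELECT count(*) FROM replica_table;

-- Ratio of the Distributed count to the source count; evaluates to 3,
-- which matches the number of replicas per shard.
SELECT 7012260 / 2337420 AS ratio;
```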
I have configured replication as follows:
<Clickhouse_10shards_3replica>
    <shard>
        <replica>
            <host>host1</host>
            <port>9020</port>
        </replica>
        <replica>
            <host>host2</host>
            <port>9020</port>
        </replica>
        <replica>
            <host>host3</host>
            <port>9020</port>
        </replica>
    </shard>
    ...
</Clickhouse_10shards_3replica>
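One way to double-check how the server actually interprets this layout is to read it back from the system.clusters table. This is only a sketch; it assumes the cluster name above is the one the server has loaded.

```sql
-- List every shard/replica pair the server knows for this cluster.
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'Clickhouse_10shards_3replica'
ORDER BY shard_num, replica_num;
```

If the configuration is applied as intended, each shard_num should appear with three replica rows.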
The ReplicatedMergeTree table is created as follows:
CREATE TABLE replica_table () ENGINE = ReplicatedMergeTree('/clickhouse/Clickhouse_10shards_3replica/online/replica_table/{shard}', 'host1-01', end_date, (id), 8192)
The Distributed table is created as follows:
CREATE TABLE replica_tableD AS replica_table ENGINE = Distributed(Clickhouse_10shards_3replica, online, replica_table, rand())
Am I missing something? What is the best way to replicate a distributed table?