I am playing with Star Schema Benchmark and multi-nodes setup.
I started with two nodes, and two tables.
The tables are:
and distributed tables created as
CREATE TABLE lineorderd AS lineorder ENGINE = Distributed(perftest_2shards_1replicas, default, lineorder, LO_CUSTKEY);
CREATE TABLE customerd AS customer ENGINE = Distributed(perftest_2shards_1replicas, default, customer, C_CUSTKEY);
Two shards defined as
<remote_servers>
<perftest_2shards_1replicas>
<shard>
<replica>
<host>10.10.7.165</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>10.10.7.167</host>
<port>9000</port>
</replica>
</shard>
</perftest_2shards_1replicas>
</remote_servers>
Table sizes are
SELECT count(*)
FROM customerd
┌──count()─┐
│ 30000000 │
└──────────┘
SELECT count(*)
FROM lineorderd
┌────count()─┐
│ 5999989709 │
└────────────┘
and data loaded as:
cat lineorder.tbl | clickhouse-client --host=ch --query "INSERT INTO lineorderd FORMAT CSV"
So the problem:
On the 2 shards the query execution time is about 5x times slower than if I use only single box.
The query is:
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM
(
SELECT
LO_EXTENDEDPRICE,
LO_DISCOUNT,
LO_ORDERDATE,
LO_QUANTITY,
LO_CUSTKEY AS cust
FROM lineorderd
) AS a
ALL INNER JOIN
(
SELECT
C_REGION,
C_CUSTKEY AS cust
FROM customerd
) AS b USING (cust)
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25) AND (C_REGION = 'ASIA')
Also during execution, the 2nd box is practically idle, I do not see any CPU or IO activity during query execution.
Can you please give any advice how I should design sharded setup to see some performance improvement comparing to the single box?