Advise on schema design for sharding

va...@percona.com

unread,

Mar 31, 2017, 4:45:25 PM3/31/17

to ClickHouse

I am playing with Star Schema Benchmark and multi-nodes setup.

I started with two nodes, and two tables.

The tables are:

https://github.com/vadimtk/ssb-clickhouse/blob/master/create.sql

and distributed tables created as

CREATE TABLE lineorderd AS lineorder ENGINE = Distributed(perftest_2shards_1replicas, default, lineorder, LO_CUSTKEY);
CREATE TABLE customerd AS customer ENGINE = Distributed(perftest_2shards_1replicas, default, customer, C_CUSTKEY);

Two shards defined as

<remote_servers>

<perftest_2shards_1replicas>

<shard>

</replica>

</shard>

<shard>

</replica>

</shard>

</perftest_2shards_1replicas>

</remote_servers>

Table sizes are

SELECT count(*)

FROM customerd

┌──count()─┐

│ 30000000 │

└──────────┘

SELECT count(*)

FROM lineorderd

┌────count()─┐

│ 5999989709 │

└────────────┘

and data loaded as:

cat lineorder.tbl | clickhouse-client --host=ch --query "INSERT INTO lineorderd FORMAT CSV"

So the problem:

On the 2 shards the query execution time is about 5x times slower than if I use only single box.

The query is:

SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT)

FROM

(

SELECT

LO_EXTENDEDPRICE,

LO_DISCOUNT,

LO_ORDERDATE,

LO_QUANTITY,

LO_CUSTKEY AS cust

FROM lineorderd

) AS a

ALL INNER JOIN

(

SELECT

C_REGION,

C_CUSTKEY AS cust

FROM customerd

) AS b USING (cust)

WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25) AND (C_REGION = 'ASIA')

Also during execution, the 2nd box is practically idle, I do not see any CPU or IO activity during query execution.

Can you please give any advice how I should design sharded setup to see some performance improvement comparing to the single box?

man...@gmail.com

unread,

Apr 10, 2017, 3:19:08 PM4/10/17

to ClickHouse

In this query, JOIN is actually not distributed.
The query is executed as following:

1. Execute "right" subquery and fill a hash table on the initial server; data is transferred over the network to initiating server.
2. Execute "left" subquery, read all data over the network to initial server and do the join.

As you see, join is processed only on initiating server.

You could rewrite query like this:

SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorderd

ALL INNER JOIN
(
SELECT
C_REGION,

C_CUSTKEY AS LO_CUSTKEY
FROM customerd
) USING (LO_CUSTKEY)

WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25) AND (C_REGION = 'ASIA')

(I had not tried it yet...)

Vadim Tkachenko

unread,

Apr 10, 2017, 4:25:28 PM4/10/17

to ClickHouse

Your query works much better, I just had to use

SET distributed_product_mode = 'allow'

Thanks,

Vadim

man...@gmail.com

unread,

Apr 10, 2017, 4:40:41 PM4/10/17

to ClickHouse

In your case, data is co-located on nodes.
You could use distributed_product_mode = 'local',
or specify local table in "right" subquery.

Vadim Tkachenko

unread,

Apr 11, 2017, 2:04:32 AM4/11/17

to ClickHouse

ok, now I am trying following query.

This time tables are defined as:

CREATE TABLE lineorderd AS lineorder ENGINE = Distributed(perftest_3shards_1replicas, default, lineorder, rand());

CREATE TABLE customerd AS customer ENGINE = Distributed(perftest_3shards_1replicas, default, customer, rand());

and

SET distributed_product_mode = 'allow'

(when I set SET distributed_product_mode = 'local' - I am getting the wrong result)

SELECT

C_REGION,

C_NATION,

count(*)

FROM lineorderd

ALL INNER JOIN

(

SELECT

C_REGION,

C_NATION,

C_CUSTKEY AS LO_CUSTKEY

FROM customerd

) USING (LO_CUSTKEY)

WHERE (toYear(LO_ORDERDATE) > 1993) AND (LO_DISCOUNT >= 1) AND (LO_QUANTITY < 25)

GROUP BY

C_REGION,

C_NATION

Received exception from server:

Code: 241. DB::Exception: Received from ch:9000, 172.18.0.2. DB::Exception: Memory limit (for query) exceeded: would use 9.31 GiB (attempt to allocate chunk of 8388608 bytes), maximum: 9.31 GiB: while receiving packet from 10.10.7.165:9000.

I have

<max_bytes_before_external_group_by>50000000000</max_bytes_before_external_group_by>

<max_memory_usage>100000000000</max_memory_usage>

What settings should I change to have this query executed?

man...@gmail.com

unread,

Apr 11, 2017, 9:57:12 PM4/11/17

to ClickHouse

1. Try to specify
distributed_product_mode = 'global'
instead of 'allow'.

Difference is explained here: https://groups.google.com/forum/#!topic/clickhouse/11OceYDtPVI

Or (effect is the same), add GLOBAL keyword before ALL INNER JOIN.

> (when I set SET distributed_product_mode = 'local' - I am getting the wrong result)

It because you have changed your sharding key to rand().
If you had specified intHash32(LO_CUSTKEY), intHash32(C_CUSTKEY) as sharding keys, everything will be fine with 'local' mode of execution.

2. You may use ANY INNER JOIN instead of ALL INNER JOIN, because you know in advance, that customer table have unique keys.

3. Seems that settings for memory usage and for external aggregation has not passed to remote server.
Possible reasons:
- readonly access between shards of the cluster. If you have user with readonly = 1, the settings will not be passed from the client to the server.
You could specify readonly = 2 instead, that means "readonly, but allow to modify settings".

Also specify distributed_aggregation_memory_efficient = 1 for distributed queries.

Reply all

Reply to author

Forward