How does Replication work in ClickHouse?

1,622 views
Skip to first unread message

Prashant Bhardwaj

unread,
Feb 17, 2017, 8:24:57 AM2/17/17
to ClickHouse
I have two servers each running Clickhouse. I have created a ReplictedMergeTree table on both the instance with the following schema
CREATE TABLE user_activity
(
    event_date Date MATERIALIZED toDate(now()),
    time_stamp DateTime MATERIALIZED toDateTime(now()),
    device_id String,
    partner_id UInt32 DEFAULT toUInt32(0), 
    event_id String,
    os String,    
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/user_activity', '{replica}', event_date, (partner_id, os, time_stamp, event_id, device_id), 8192)

Also, I have added following configuration to respective servers
<macros>
    <shard>01</shard>
    <replica>clickhouseserver2</replica>
</macros>

<macros>
    <shard>02</shard>
    <replica>clickhouseserver1/replica>
</macros>

I have created a distributed table on top of these two tables
CREATE TABLE user_activity_distribute AS user_activity ENGINE = Distributed(clickhouse, datalayer, retargeting, partner_id);

Still, I can see that replica's for each shard is created on the same server. 

2017.02.17 09:05:06.829412 [ 3 ] <Debug> datalayer.user_activity (StorageReplicatedMergeTree): Creating table /clickhouse/tables/01/user_activity

2017.02.17 09:05:06.898994 [ 3 ] <Debug> datalayer.user_activity (StorageReplicatedMergeTree): Creating replica /clickhouse/tables/01/user_activity/replicas/clickhouseserver2


So, If one server goes down that part of data is lost. How do I make sure that there is replica of shard is present on different server?

If it helps following is the remote server config's
<remote_servers>
    <datalayer>
        <shard>
            <weight>1</weight>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouseserver1</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>clickhouseserver2</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <weight>1</weight>
            <internal_replication>true</internal_replication>
            <replica>
                <host>clickhouseserver2</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>clickhouseserver1</host>
                <port>9000</port>
            </replica>
        </shard>
    </datalayer>
</remote_servers>

Prashant Bhardwaj

unread,
Feb 21, 2017, 1:33:21 AM2/21/17
to ClickHouse
So figured out that you can create a replica on another server by substituting shard name and replica name while creating ReplicatedMergeTree Table.
Is there any other way around? Like specifying in the config file (specifying multiple shard and replica through macros).

Also some followup questions:
1. You can also specify shard and replica in remote_servers config. How is this related to shard and replica of ReplicatedMergeTree?
2. I have two columns which are MATERIALIZED in ReplicatedTables. Inserting data directly to these tables is fine. But when I try to insert that data through distributed table, created on top of these replicated tables it throws following exception. Here event_data is calculated column. 

Code: 44. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Cannot insert column event_date.

How can I insert data in distributed table without providing calculated column?

3. How can I reshard data after adding another shard? I have inserted to a ReplicatedTable. I have created another shard on another server , now I want to reshard data by partner_id. I ran following command.
ALTER TABLE datalayer.user_activity RESHARD TO '/clickhouse/tables/01/user_activity' WEIGHT 1, '/clickhouse/tables/02/user_activity' WEIGHT 1 USING partner_id


But it is throwing following exception.

Code: 310. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Table is inconsistent accross shards. 

Reply all
Reply to author
Forward
0 new messages