Need help to set up ZooKeeper on Google Kubernetes


Divya Sai Teja Pamarthi

Feb 23, 2022, 7:31:50 PM
to ClickHouse
Hi Everyone, 

I was recently able to set up ClickHouse on a Google Kubernetes Engine cluster (3 nodes) with the following configuration.

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "simple-03"
spec:
  configuration:
    clusters:
      - name: "replicas"
        layout:
          shardsCount: 3
          replicasCount: 2
      - name: "shards"
        layout:
          shardsCount: 2
      - name: "shards-replicas"
        layout:
          replicasCount: 2

I would like to know whether I need ZooKeeper to keep replication consistent across nodes. It would be very helpful if someone could help me set up ZooKeeper on a three-node Kubernetes cluster (GKE). Please help!

Thanks,
Sai Teja


Denis Zhuravlev

Feb 23, 2022, 9:08:26 PM
to ClickHouse

Divya Sai Teja Pamarthi

Feb 24, 2022, 4:46:17 PM
to ClickHouse
Hi Denis,

Thank you so much for the links. 

It would be really helpful if you could also help me with the following issues and questions.

1. I followed this link (https://docs.altinity.com/clickhouseonkubernetes/kubernetesquickstartguide/quickzookeeper/) and created the ZooKeeper service. It has stayed in the Pending state the whole time. I am on GKE, and it reports insufficient CPU. I am currently using a three-node cluster with 2 CPUs and 4 GB RAM per node.
 

zookeeper1.PNG

zoo2.PNG


2. I used the following configuration, which specifies two shards, but it created four pods. Could you explain how this works?


pods.PNG

3. According to the above example (the link), is my cluster name demo-01?

4. I created a database called "first" on all four nodes, created a table on each of the four nodes manually (I got an error while using on [cluster demo-01]), and inserted records. When I run select *, it gives different output every time. Could you please tell me what mistake I am making here?

I see the same issue when I connect to a pod using the kubectl exec command.
image1.PNG

image2.PNG


5. Since I have a three-node Kubernetes cluster, what would be the best number of ZooKeeper nodes, replicas, and shards?


I know I am asking for a lot of information. Please help me to understand ClickHouse. Thank you so much in advance. 

Thanks,
Sai Teja

Denis Zhuravlev

Feb 24, 2022, 5:21:05 PM
to ClickHouse
GKE is unable to allocate enough resources. How many nodes / CPUs are available in your GKE cluster?

I guess you can remove the resources: section.

>2. I used the following configuration, which specifies two shards, but it created four pods. Could you explain how this works?
shardsCount: 2, replicasCount: 2, so 2 shards × 2 replicas = 4 pods.

>3. According to the above example(the link), is my cluster name demo-01?
right
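
One way to see both points (the cluster name and the shard/replica layout) from inside any ClickHouse pod is to query the system.clusters table. This is a minimal sketch, assuming the cluster is named demo-01 as in the linked example; it should list one row per replica, so four rows for 2 shards × 2 replicas.

```sql
-- List every shard/replica that ClickHouse knows about for this cluster.
-- 'demo-01' is the cluster name assumed from the linked quick-start example.
SELECT cluster, shard_num, replica_num, host_name
FROM system.clusters
WHERE cluster = 'demo-01'
ORDER BY shard_num, replica_num;
```

Running the same query without the WHERE clause shows all cluster names defined in the configuration.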

>4. I created a database called "first" on all four nodes, created a table on each of the four nodes manually (I got an error while using on [cluster demo-01]),
>and inserted records. When I run select *, it gives different output every time. Could you please tell me what mistake I am making here?

It's expected. Replication only works with ReplicatedMergeTree tables, and ZooKeeper is mandatory for Replicated engines.
You need to understand the engines MergeTree / ReplicatedMergeTree / Distributed: https://kb.altinity.com/engines/
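
To make the engine distinction concrete, here is a minimal sketch rather than a definitive setup. It assumes the cluster is named demo-01, that the database "first" already exists on every pod, and that the {shard} and {replica} macros are defined on each pod (the Altinity operator normally sets them). The table names events_local and events, the columns, and the ZooKeeper path are hypothetical.

```sql
-- Local replicated table, created on every pod of the cluster.
-- The cluster name contains a hyphen, so it is quoted; square brackets
-- in the documentation only mark optional clauses and are not typed.
CREATE TABLE `first`.events_local ON CLUSTER 'demo-01'
(
    event_date Date,
    id         UInt64,
    message    String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/first/events_local', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, id);

-- Distributed table on top: it stores no data itself, it routes inserts
-- to the shards and merges results from all shards on SELECT.
CREATE TABLE `first`.events ON CLUSTER 'demo-01' AS `first`.events_local
ENGINE = Distributed('demo-01', 'first', 'events_local', rand());

-- Querying the Distributed table returns a consistent result
-- regardless of which pod receives the query.
SELECT count() FROM `first`.events;
```

With plain MergeTree tables created by hand on each pod, every pod holds only the rows inserted into it, which is why a load-balanced select * returns different results from one run to the next.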

>5. Since I have a three-node Kubernetes cluster, what would be the best number of ZooKeeper nodes, replicas, and shards?
A three-node Kubernetes cluster with how many CPUs? How much RAM?
Is it a playground, or production that has to ingest 20 billion rows per hour?

Divya Sai Teja Pamarthi

Feb 24, 2022, 7:37:37 PM
to ClickHouse
Hi Denis,

I am really really thankful for your help. 

1. I am using three nodes (2 CPUs, 4 GB RAM per node).

2. I was able to set up ZooKeeper, and it is up and running (3 nodes). Could you please tell me what is wrong with the query below?
CREATE DATABASE FIRST_DB  [ON CLUSTER demo-01] ENGINE =  ReplicatedMergeTree();
error: DB::Exception: Syntax error: failed at position 27 ('['): [ON CLUSTER demo-01] ENGINE =  ReplicatedMergeTree(). Expected one of: storage definition, ENGINE, end of query, UUID, INTO OUTFILE, SETTINGS, ON, FORMAT (version 21.8.10.1.altinitystable (altinity build))

zoo_running.PNG

3. How do I know that the ClickHouse pods/nodes are linked to ZooKeeper, how can I test that replication is happening, and where/how can I see the replicas?

4. As of now, we have 3.5 billion records on an on-premises Ubuntu system. Once the setup is done, I need to migrate them. What would you suggest as the best number of ZooKeeper nodes, replicas, and shards for my cluster?

Kindly help me with the above questions. 

Thanks,
Sai Teja
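
As a possible starting point for questions 2 and 3 above, here is a hedged sketch, assuming the cluster is named demo-01. Three things in the failing query stand out: the square brackets are documentation notation for an optional clause and are not typed, the hyphenated cluster name needs quoting, and ReplicatedMergeTree is a table engine, not a database engine. The system tables queried below are standard ClickHouse tables; what they return depends on the installation.

```sql
-- Q2: create the database on every pod of the cluster.
-- No brackets, quoted cluster name, no table engine on the database.
CREATE DATABASE IF NOT EXISTS FIRST_DB ON CLUSTER 'demo-01';

-- Q3a: check that this ClickHouse pod can reach ZooKeeper.
-- The query fails if no ZooKeeper is configured or reachable.
SELECT name FROM system.zookeeper WHERE path = '/';

-- Q3b: list replicated tables on this pod and their replication state.
-- active_replicas should match total_replicas when all replicas are up.
SELECT database, table, is_readonly, is_session_expired,
       total_replicas, active_replicas, zookeeper_path, replica_path
FROM system.replicas;
```

A simple replication test is to insert a row into a ReplicatedMergeTree table on one pod and select it from the other replica of the same shard.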