I need help with KairosDB clustering using Cassandra

Rosita Hormann Lobos

Dec 23, 2019, 9:26:19 AM

to KairosDB

Hello everyone, I am new to time series databases, and I would like some advice about configuring a cluster of KairosDB nodes backed by Cassandra. I have read a lot, but I am still lost about how to configure a cluster.

I understand how to do it with Cassandra. For example, if I have 5 nodes, the configuration of each node is: cluster_name=node_1, listen_address=IP address of the node, -seeds: IP addresses of the rest of the nodes; the other nodes then have a different cluster_name, and their listen_address should be their own IP (correct me if I'm wrong, please). I then configure the firewall to open the corresponding ports. After that, all queries and insertions can be made against any of the nodes by specifying that node's IP, and the data I insert is distributed randomly (by default) across the cluster. I can do this easily with this Docker image: https://hub.docker.com/_/cassandra by making every new node "point" to the first one (the "master"?).

Then, for example, if I am setting up a stand-alone node, I should install and run KairosDB on the server, and it would listen on port 8080 for inserting and querying data. But here I start having problems understanding how to expand this to a cluster configuration:

1) Should I install KairosDB on every node? What configuration should I use for KairosDB?

2) Will the data be distributed automatically over the entire cluster, or will it be saved only on the node I insert into? How can I expose just one node of the cluster to the external network, so that I make every insertion and query through that node and it is responsible for distributing the data over the cluster?

3) If I have 5 nodes and I want some level of replication, how can I achieve that?

Sorry for my English (I'm not a native English speaker). Any help would be much appreciated!

Brian Hawkins

Jan 1, 2020, 1:13:05 AM

to KairosDB

All Cassandra nodes should have the same cluster name. For a 5-node cluster I would pick two to be the seed nodes. Then all 5 will list those two IP addresses as the seed nodes. The seed nodes are just for bootstrapping the cluster when starting it up. You will have to start one of the seed nodes first when starting the cluster.
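
To make the seed layout concrete, here is a rough sketch that prints the relevant cassandra.yaml settings for each of the 5 nodes. The IP addresses and the cluster name are made up for illustration, and the helper `cassandra_settings` is hypothetical; only the setting names (`cluster_name`, `listen_address`, `seed_provider`) come from Cassandra itself.

```python
# Hypothetical addresses for a 5-node cluster; replace with your own.
NODE_IPS = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]
SEED_IPS = NODE_IPS[:2]          # two nodes picked as seeds
CLUSTER_NAME = "kairos_cluster"  # hypothetical name, identical on every node


def cassandra_settings(node_ip, seed_ips=SEED_IPS, cluster_name=CLUSTER_NAME):
    """Return the cassandra.yaml fragment for one node as plain text."""
    seeds = ",".join(seed_ips)
    return (
        f"cluster_name: '{cluster_name}'\n"
        f"listen_address: {node_ip}\n"
        "seed_provider:\n"
        "  - class_name: org.apache.cassandra.locator.SimpleSeedProvider\n"
        "    parameters:\n"
        f"      - seeds: \"{seeds}\"\n"
    )


if __name__ == "__main__":
    # Only listen_address differs between nodes; the rest is shared.
    for ip in NODE_IPS:
        print(f"# --- settings for {ip} ---")
        print(cassandra_settings(ip))
```

Every node gets the same cluster name and the same two seed addresses, and only `listen_address` changes from node to node.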

The Kairos server doesn't have to run on the same machine as Cassandra; unless you have reasons for doing so, I would run them on separate machines. The Kairos configuration should also point to the seed nodes. Data sent to a Kairos node will automatically be balanced across the Cassandra cluster.
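
As an illustration of sending data to a single Kairos node, the sketch below posts one data point to the KairosDB REST endpoint `/api/v1/datapoints` on port 8080. The host name, metric name, and tag are invented for the example, and `build_datapoint` and `push` are hypothetical helpers, not part of KairosDB. The client only ever talks to Kairos; where the rows land in Cassandra is decided by the cluster.

```python
import json
import time
import urllib.request

# Hypothetical address of the one Kairos server exposed to clients.
KAIROS_URL = "http://kairos.example.internal:8080"


def build_datapoint(metric, value, tags, timestamp_ms=None):
    """Build the JSON body for one data point (timestamps are in milliseconds)."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return [{"name": metric, "datapoints": [[timestamp_ms, value]], "tags": tags}]


def push(payload, base_url=KAIROS_URL):
    """POST the payload to Kairos and return the HTTP status code."""
    request = urllib.request.Request(
        base_url + "/api/v1/datapoints",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Metric name and tag are placeholders; Kairos expects at least one tag.
    body = build_datapoint("room.temperature", 21.5, {"sensor": "s1"})
    print(json.dumps(body))
    # push(body)  # uncomment once KAIROS_URL points at a running server
```

With this setup only the Kairos machine needs to be reachable from outside; the Cassandra nodes can stay on the internal network.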

Altering the replication can be done after Kairos has installed its schema: https://www.tutorialspoint.com/cassandra/cassandra_alter_keyspace.htm
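
For reference, a small sketch that builds the CQL statement for changing the replication factor. It assumes the keyspace is called `kairosdb` (check the keyspace name in your Kairos configuration) and uses `SimpleStrategy`, which fits a single-datacenter cluster; the helper `alter_replication_cql` is hypothetical. The resulting statement would be run in cqlsh against any Cassandra node.

```python
def alter_replication_cql(keyspace="kairosdb", replication_factor=3):
    """Return an ALTER KEYSPACE statement for a single-datacenter cluster.

    The keyspace name is an assumption; use the one Kairos actually created.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {replication_factor}}};"
    )


if __name__ == "__main__":
    # Keep 3 copies of every row across the 5 nodes.
    print(alter_replication_cql(replication_factor=3))
```

After raising the replication factor on a keyspace that already holds data, it is usually necessary to run `nodetool repair` on the nodes so that the existing rows are copied to their new replicas.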