Hello,
Currently, setting up ClickHouse for a distributed environment is not easy. In order to create a distributed table, we need to create the actual table on all nodes in the cluster explicitly, and the table definition must be the same on all nodes. Although INSERT INTO on distributed tables works, the documentation suggests that we should consider INSERTing data into the individual tables because it is more efficient and flexible. Also, ALTER TABLE on a distributed table doesn't seem to work. All these problems make it hard for us to think of ClickHouse as a distributed database, since we usually need to deal with each node in the cluster in order to perform an operation.
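To make the per-node work concrete, here is a minimal sketch of the statements that have to be repeated on every node. The host names, cluster name, table names and schema are all hypothetical, and the MergeTree clause is only a placeholder that may differ between ClickHouse versions. The script only prints the statements and does not connect to anything.

```python
# Hypothetical values for illustration only; none of these come from a real setup.
NODES = ["ch-node-1", "ch-node-2", "ch-node-3"]
CLUSTER = "my_cluster"  # assumed to match a cluster name in each node's config file
DATABASE = "default"

# The "actual" table that must exist, with an identical definition, on every node.
LOCAL_DDL = (
    "CREATE TABLE {db}.events_local "
    "(event_date Date, user_id UInt64) "
    "ENGINE = MergeTree ORDER BY (event_date, user_id)"
)

# The distributed table that points at the local tables across the cluster.
DISTRIBUTED_DDL = (
    "CREATE TABLE {db}.events AS {db}.events_local "
    "ENGINE = Distributed({cluster}, {db}, events_local)"
)


def statements_per_node(nodes, cluster, db):
    """Return {node: [ddl, ...]} with the same statements repeated for every node."""
    ddl = [
        LOCAL_DDL.format(db=db),
        DISTRIBUTED_DDL.format(db=db, cluster=cluster),
    ]
    return {node: list(ddl) for node in nodes}


plan = statements_per_node(NODES, CLUSTER, DATABASE)
for node, stmts in plan.items():
    for stmt in stmts:
        print(f"[{node}] {stmt}")
```

Every schema change means repeating this loop by hand (or via our own tooling) against each node, which is the part that feels like it should be handled by the database.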
The other problem is that scaling the cluster is not easy. Adding nodes to the cluster requires a configuration file change on all nodes, and the ClickHouse process needs to be restarted in order to be able to use the new node.
Also, I'm not sure how we can scale down the cluster (remove a node) without losing data. Maybe a ReplicatedMergeTree table can recover the data, but it's not clear which configuration we should use in order to create a ClickHouse cluster.
Given that the use-cases of columnar databases mostly involve distributed environments and replication in practice, I think it should be easy and straightforward to create distributed tables that have failure recovery (via replication). I understand that it may not be easy, and that you guys already have a setup for ClickHouse at Yandex and solve these problems at the application level, but it's hard for us to do that and it's an important barrier for ClickHouse adoption.
I wonder if you have any plans to solve these problems and create a feature-complete distributed database. You may also have other priorities, such as extending the SQL syntax or improving the performance of ClickHouse (it's already great, BTW), so it would be great if you could share your roadmap for ClickHouse. I have seen that there is a "cloud databases" feature in development which might solve these problems, but I couldn't find documentation about it.
Thanks for open-sourcing ClickHouse BTW!