Hi everyone,
We are testing ClickHouse to determine whether it is a good fit for our use case and whether it could replace our current statistics system. We installed a cluster of 4 nodes with the following configuration:
- 1 Layer
- 2 Shards
- 2 Replicas per Shard
So we have four nodes in total, plus a separate node that handles all the data ingestion. It inserts CSV files concurrently: each file is split into 16 chunks, and the chunks are inserted into the working nodes in round-robin fashion. This works fine, but sometimes we get the pair of error codes 252 and 242 on one of the shards.
I guess it is probably a network issue between the two nodes that makes replication lag, but I don't really know, as I could not find any information about this problem on the internet. The errors are:
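The round-robin distribution described above can be sketched roughly as follows. This is a simplified illustration, not the actual ingestion code: the shard 1 host names are assumed by analogy with the shard 2 names in the errors, the function names are made up, and the real inserts (which run concurrently) are left out.

```python
from itertools import cycle

# Host names are assumptions for illustration; only the two shard 2
# names appear in the error messages.
NODES = [
    "layer1-shard1-replica1",
    "layer1-shard1-replica2",
    "layer1-shard2-replica1",
    "layer1-shard2-replica2",
]


def split_into_chunks(lines, n_chunks=16):
    """Split a list of CSV lines into at most n_chunks contiguous chunks
    of nearly equal size (empty chunks are skipped)."""
    size, extra = divmod(len(lines), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional line each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(lines[start:end])
        start = end
    return chunks


def assign_round_robin(chunks, nodes=NODES):
    """Pair each chunk with a node, cycling through the nodes in order."""
    return list(zip(cycle(nodes), chunks))
```

With 16 chunks and 4 nodes, each node receives 4 chunks per file, so every replica of both shards gets direct inserts.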
Error code 252: DB::Exception: Too much parts. Merges are processing significantly slower than inserts. (this one happens on layer1-shard2-replica1)
Error code 242: DB::Exception: Table is in readonly mode. (this one happens on layer1-shard2-replica2)
Shard 1 never shows the problem.
Any ideas?
Thanks.