Hey guys,
hope you're doing great.
I've been using clickhouse with docker for about a year now without any problems. A simple cluster with 3 replicas and 1 shard. No problem at all with this cluster, can execute ddl queries and do everything.
Now a few days needed to create a new cluster. The cluster creation went well, clickhouse and zookeeper running as expected, but I'm stuck with an issue that is not letting me run "on cluster" queries, like
"create database mydb on cluster my cluster"
The query is timing out and the job is failing on all hosts.
Clickhouse error:2020.12.11 14:25:00.727536 [ 90 ] {00707b04-f3ef-4eee-bb03-cda2ab014017} <Error> DynamicQueryHandler: Code: 159, e.displayText() = DB::Exception: Watching task /clickhouse/task_queue/ddl/query-0000000006 is executing longer than distributed_ddl_task_timeout (=180) seconds. There are 2 unfinished hosts (0 of them are currently active), they are going to execute the query in background, Stack trace (when copying this message, always include the lines below):
0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x110c8ef0 in /usr/bin/clickhouse
1. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x998893d in /usr/bin/clickhouse
2. DB::DDLQueryStatusInputStream::readImpl() @ 0xdcad7d7 in /usr/bin/clickhouse
3. DB::IBlockInputStream::read() @ 0xdae814d in /usr/bin/clickhouse
4. void DB::copyDataImpl<std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&>(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool ()> const&, std::__1::function<void (DB::Block const&)> const&) @ 0xdb1bfcb in /usr/bin/clickhouse
5. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void (std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)>) @ 0xe1fdda6 in /usr/bin/clickhouse
6. DB::HTTPHandler::processQuery(DB::Context&, Poco::Net::HTTPServerRequest&, HTMLForm&, Poco::Net::HTTPServerResponse&, DB::HTTPHandler::Output&) @ 0x9a6ce9c in /usr/bin/clickhouse
7. DB::HTTPHandler::handleRequest(Poco::Net::HTTPServerRequest&, Poco::Net::HTTPServerResponse&) @ 0x9a7095e in /usr/bin/clickhouse
8. Poco::Net::HTTPServerConnection::run() @ 0x10f798a3 in /usr/bin/clickhouse
9. Poco::Net::TCPServerConnection::start() @ 0x10fb4c6b in /usr/bin/clickhouse
10. Poco::Net::TCPServerDispatcher::run() @ 0x10fb50fb in /usr/bin/clickhouse
11. Poco::PooledThread::run() @ 0x11162cc6 in /usr/bin/clickhouse
12. Poco::ThreadImpl::runnableEntry(void*) @ 0x1115df80 in /usr/bin/clickhouse
(version 20.4.8.99 (official build))
2020.12.11 14:25:00.728261 [ 90 ] {00707b04-f3ef-4eee-bb03-cda2ab014017} <Error> DynamicQueryHandler: Cannot send exception to client: Code: 24, e.displayText() = DB::Exception: Cannot write to ostream at offset 664, Stack trace (when copying this message, always include the lines below):
Zookeeper error:
2020-12-11 14:17:33,081 [myid:5] - INFO [CommitProcessor:5:ZooKeeperServer@694] - Established session 0x500717ef9a00000 with negotiated timeout 30000 for client /
10.244.4.76:476162020-12-11 14:17:33,086 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x500717ef9a00000 type:create cxid:0x1 zxid:0x200000004 txntype:-1 reqpath:n/a Error Path:/clickhouse Error:KeeperErrorCode = NodeExists for /clickhouse
2020-12-11 14:17:33,088 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x500717ef9a00000 type:create cxid:0x2 zxid:0x200000005 txntype:-1 reqpath:n/a Error Path:/clickhouse/task_queue Error:KeeperErrorCode = NodeExists for /clickhouse/task_queue
2020-12-11 14:17:33,089 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x500717ef9a00000 type:create cxid:0x3 zxid:0x200000006 txntype:-1 reqpath:n/a Error Path:/clickhouse/task_queue/ddl Error:KeeperErrorCode = NodeExists for /clickhouse/task_queue/ddl
2020-12-11 14:17:38,199 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x4002893c4ff0000 type:create cxid:0x1 zxid:0x200000008 txntype:-1 reqpath:n/a Error Path:/clickhouse Error:KeeperErrorCode = NodeExists for /clickhouse
2020-12-11 14:17:38,203 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x4002893c4ff0000 type:create cxid:0x2 zxid:0x200000009 txntype:-1 reqpath:n/a Error Path:/clickhouse/task_queue Error:KeeperErrorCode = NodeExists for /clickhouse/task_queue
2020-12-11 14:17:38,207 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x4002893c4ff0000 type:create cxid:0x3 zxid:0x20000000a txntype:-1 reqpath:n/a Error Path:/clickhouse/task_queue/ddl Error:KeeperErrorCode = NodeExists for /clickhouse/task_queue/ddl
2020-12-11 14:19:34,348 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x4002893c4ff0000 type:create cxid:0x10 zxid:0x20000000b txntype:-1 reqpath:n/a Error Path:/clickhouse Error:KeeperErrorCode = NodeExists for /clickhouse
2020-12-11 14:19:34,350 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x4002893c4ff0000 type:create cxid:0x11 zxid:0x20000000c txntype:-1 reqpath:n/a Error Path:/clickhouse/task_queue Error:KeeperErrorCode = NodeExists for /clickhouse/task_queue
2020-12-11 14:19:34,351 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x4002893c4ff0000 type:create cxid:0x12 zxid:0x20000000d txntype:-1 reqpath:n/a Error Path:/clickhouse/task_queue/ddl Error:KeeperErrorCode = NodeExists for /clickhouse/task_queue/ddl
2020-12-11 14:22:00,094 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x4002893c4ff0000 type:create cxid:0x154 zxid:0x200000010 txntype:-1 reqpath:n/a Error Path:/clickhouse Error:KeeperErrorCode = NodeExists for /clickhouse
2020-12-11 14:22:00,096 [myid:5] - INFO [ProcessThread(sid:5 cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when processing sessionid:0x4002893c4ff0000 type:create cxid:0x155 zxid:0x200000011 txntype:-1 reqpath:n/a Error Path:/clickhouse/task_queue Error:KeeperErrorCode = NodeExists for /clickhouse/task_queue
Cluster topology: 2 clickhouse nodes, 2 zookeeper nodes
metrika.xml:
<yandex>
<clickhouse_remote_servers> <!-- name of substitution-->
<single_shard_3_replicas>
<shard>
<replica>
<host>clickhouse4</host>
<port>9000</port>
<!--
<user>user</user>
<password>pass</password>
-->
</replica>
<replica>
<host>clickhouse5</host>
<port>9000</port>
<!--
<user>user</user>
<password>pass</password>
-->
</replica>
</shard>
</single_shard_3_replicas>
</clickhouse_remote_servers>
<zookeeper-servers>
<node>
<host>zoo4</host>
<port>2181</port>
</node>
<node>
<host>zoo5</host>
<port>2181</port>
</node>
</zookeeper-servers>
<macros>
<shard>01</shard>
<replica>serv4</replica>
</macros>
</yandex>
I'm obv able to create databases and tables without using the "on cluster" but all queries with "on cluster" are failing.
The only differece between this setup and the working setup that I have is that this one has 2 clickhouse/zookeeper nodes while the other one has 3.
I have also sql user access enabled on this cluster, and have tried to include user info in metrika.xml but did not make any difference.
Any tip/help would be highly appreciated.