Need help in historical - ZK setup


Harshal Chaudhari

Apr 7, 2025, 7:52:19 AM
to Druid User
Hi everyone, 
I have set up Druid in cluster mode. It was working fine: I had migrated data to it and we were able to query it properly.
But suddenly my Historical service started going down. In the logs for the Historical service I am getting the following error:

2025-04-07T10:55:34,282 WARN [main-SendThread()] org.apache.zookeeper.ClientCnxn - Session 0x1064123e51b059d for server ip-.ec2.internal/, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
java.io.IOException: Packet len 1414591 is out of range!
at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:121) ~[zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:84) ~[zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) ~[zookeeper-3.8.3.jar:3.8.3]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1289) ~[zookeeper-3.8.3.jar:3.8.3]
2025-04-07T10:55:34,398 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager - State change: SUSPENDED
2025-04-07T10:55:34,398 INFO [ZkCoordinator] org.apache.druid.server.coordination.ZkCoordinator - Ignoring event[PathChildrenCacheEvent{type=CONNECTION_SUSPENDED, data=null}]
2025-04-07T10:55:34,399 WARN [NodeRoleWatcher[COORDINATOR]] org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher - Ignored event type [CONNECTION_SUSPENDED] for node watcher of role [coordinator].

I increased "jute.maxbuffer" in ZooKeeper. For some time one of my two data sources came up, but now I have started getting the error again. NOTE: The value I have set in ZooKeeper is greater than the value in the error.

Can anyone help me with this?

Thank you,
Harshal Chaudhari

gi...@imply.io

Apr 7, 2025, 3:11:59 PM
to Druid User
You may need to increase this in many places, as the ZK docs mention:

> This option can only be set as a Java system property. There is no zookeeper prefix on it. It specifies the maximum size of the data that can be stored in a znode. The default is 0xfffff, or just under 1M. If this option is changed, the system property must be set on all servers and clients otherwise problems will arise. This is really a sanity check. ZooKeeper is designed to store data on the order of kilobytes in size.
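
As a quick sanity check on the numbers in this thread, the sketch below compares the packet length from the Historical log with the default limit quoted above. It assumes the Historical's ZooKeeper client is still running with the default `jute.maxbuffer` (0xfffff) and that a packet is rejected when its length exceeds that limit; the function name is made up for illustration.

```python
# Default limit quoted in the ZooKeeper docs: 0xfffff bytes (just under 1 MiB).
DEFAULT_JUTE_MAXBUFFER = 0xFFFFF

# Packet length reported in the Historical log earlier in this thread.
PACKET_LEN = 1414591


def exceeds_limit(packet_len: int, max_buffer: int = DEFAULT_JUTE_MAXBUFFER) -> bool:
    """Return True if a packet of this size is larger than the configured limit.

    Assumption: the client rejects packets strictly larger than the limit.
    """
    return packet_len > max_buffer


print(DEFAULT_JUTE_MAXBUFFER)               # 1048575
print(exceeds_limit(PACKET_LEN))            # True
print(PACKET_LEN - DEFAULT_JUTE_MAXBUFFER)  # 366016 bytes over the default
```

If the client is indeed on the default, this would explain why raising the value only on the ZooKeeper servers does not make the error go away: the process that throws is the client inside the Historical.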

Btw, is your ZK cluster shared with other systems besides Druid? Druid these days does not use ZK for much beyond service discovery and leader election. It seems strange that Druid itself might write a znode that large. I wonder if it came from somewhere else.

Gian

Harshal Chaudhari

Apr 8, 2025, 3:23:35 AM
to Druid User
Thank you, Gian, for the response.
But I am still not sure what should be done next. Could you please point me to a document containing the details of the setup?

Ben Krug

Apr 8, 2025, 12:19:10 PM
to druid...@googlegroups.com
According to zk docs, you'd need to set it on all "clients", which would be all of the druid services, iiuc.
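
A minimal sketch of what that could look like, assuming each Druid service is started from a `jvm.config` file with one JVM argument per line (as in the stock distribution layout). The helper name and the 4 MiB value are hypothetical; pick a value larger than the packet length in the error and use the same one on the ZooKeeper servers.

```python
def with_jute_maxbuffer(jvm_config: str, max_bytes: int) -> str:
    """Return jvm.config text containing exactly one -Djute.maxbuffer line.

    Any existing -Djute.maxbuffer line is dropped and the new one is appended,
    so applying the function twice gives the same result as applying it once.
    """
    prefix = "-Djute.maxbuffer="
    lines = [
        line for line in jvm_config.splitlines()
        if not line.strip().startswith(prefix)
    ]
    lines.append(f"{prefix}{max_bytes}")
    return "\n".join(lines) + "\n"


# Hypothetical existing jvm.config contents for one service.
sample = "-server\n-Xms8g\n-Xmx8g\n"

# 4 MiB is only an illustrative value, not a recommendation.
print(with_jute_maxbuffer(sample, 4 * 1024 * 1024), end="")
```

The same line would need to go into the `jvm.config` of every Druid service that talks to ZooKeeper, followed by a restart of each service.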

It might be simpler to try increasing druid.indexer.runner.maxZnodeBytes instead (you can search for it on this doc page).
