Historical nodes crashing


Rajesh MK

Jun 3, 2021, 9:56:19 AM6/3/21
to druid...@googlegroups.com
Hi team,

         I have a 4-node cluster (2 data nodes, 1 master, 1 query node), and the historical process is crashing repeatedly on both data nodes with an out-of-memory heap error, even after increasing the heap multiple times. This eventually leads to ZooKeeper cluster data corruption, and I had to remove the ZooKeeper data to fix it. The cluster had been running fine for around 3 weeks. Is the increased load causing this issue? Is there a way to fix it?

We are using Kafka native ingestion.

Error from the historical log:

2021-06-03T09:57:04,141 INFO [Announcer-0] org.apache.druid.curator.announcement.Announcer - Reinstating [/druid/segments/drulx1002:8083/drulx1002:8083_historical__default_tier_2021-06-03T09:09:46.124Z_e0
a8eab478ff4b5cbdb61cfacf4ea9f42061]
Terminating due to java.lang.OutOfMemoryError: Java heap space
2021-06-03T09:57:33,503 INFO [main] org.hibernate.validator.internal.util.Version - HV000001: Hibernate Validator 5.2.5.Final

Total RAM 64 GB

[root@drulx1002 ~]# cat  /druid/apache-druid-0.20.1/conf/druid/cluster/data/historical/jvm.config
-server
-Xms13g
-Xmx16g
-XX:MaxDirectMemorySize=24g
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager


Please let me know your thoughts on this.
Regards
Rajesh

Stelios Philippou

Jun 3, 2021, 10:37:56 AM6/3/21
to druid...@googlegroups.com
Hi Rajesh,

We just faced the same issue; our system had also been running perfectly for a couple of months.

One thing we tried was increasing the ZooKeeper ensemble to 3 nodes to help out.

But that did not work out in our case.

Our issue was more extensive: our ingestion was too aggressive and we ended up creating far too many small files.
Druid likes to have 600-700 MB segment files.

So perhaps you ended up with too many segments for your system and thus ran out of memory when trying to bring them online every time.

We ended up losing that data, but we have since changed the ingestion to collect the data more appropriately in our case, tuning segmentGranularity and intermediatePersistPeriod to not be very aggressive.
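For reference, the knobs mentioned above live in the Kafka supervisor spec. A minimal sketch of the relevant fields, with a hypothetical datasource name and placeholder values (not Rajesh's actual spec):

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "my_datasource",
      "granularitySpec": {
        "segmentGranularity": "DAY",
        "queryGranularity": "MINUTE"
      }
    },
    "tuningConfig": {
      "type": "kafka",
      "maxRowsPerSegment": 5000000,
      "intermediatePersistPeriod": "PT10M"
    }
  }
}
```

A coarser segmentGranularity and a longer intermediatePersistPeriod both reduce the number of small segments produced.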






Rajesh MK

Jun 4, 2021, 2:21:27 AM6/4/21
to druid...@googlegroups.com
Hi Stelios,

          Thank you for the quick response. We have another lingering issue: of the two data nodes, only one historical process is ever running. Is there any additional load-balancing setting we need when adding a second data node?

Regards
Rajesh 

Max Gorinevsky

Jun 4, 2021, 2:40:02 AM6/4/21
to druid...@googlegroups.com
Hi Rajesh,

Druid is definitely able to have multiple data nodes, each with a historical (and middlemanager) process. What happens when you start the historical process on the second node? Are there any errors in the historical logs?

Thanks,
Max



--
Thanks,

Max Gorinevsky
Imply Support

Rajesh MK

Jun 4, 2021, 4:58:14 AM6/4/21
to druid...@googlegroups.com
Hi Max,

         Even though both nodes have the same hardware spec and Java config, one of the nodes fails with a java heap space error.

drulx1001:8081 (plain)
drulx1001:8081 (plain)
drulx1003:8888 (plain)
drulx1003:8082 (plain)
drulx1010:8083 (plain)  -  37.54 GB / 300.00 GB (12.5%), empty load/drop queues
drulx1010:8091 (plain)  -  2 / 4 slots, last completed task: 2021-06-04T08:44:27.501Z
drulx1002:8091 (plain)  -  4 / 6 slots, last completed task: 2021-06-04T08:39:59.531Z
drulx1010:8102 (plain)
drulx1010:8100 (plain)
drulx1002:8103 (plain)
drulx1002:8102 (plain)
drulx1002:8101 (plain)
drulx1002:8100 (plain)


a8eab478ff4b5cbdb61cfacf4ea9f42061]
Terminating due to java.lang.OutOfMemoryError: Java heap space
2021-06-03T09:57:33,503 INFO [main] org.hibernate.validator.internal.util.Version - HV000001: Hibernate Validator 5.2.5.Final

Regards
Rajesh

Max Gorinevsky

Jun 4, 2021, 5:38:25 AM6/4/21
to druid...@googlegroups.com
Hi Rajesh,

If this happens right away, it could be that the node tries to cache lookups into memory and fails. It is strange that the other node does not have the issue.
What are the specs of the node and how much heap is allocated to the historical process? What is the size of any lookups?

Thanks,
Max

Rajesh MK

Jun 4, 2021, 8:22:34 AM6/4/21
to druid...@googlegroups.com
Hi Max,

          Each data node has 64 GB RAM and 16 CPU cores. Below is the JVM config for the historical nodes:


cat /druid/apache-druid-0.20.1/conf/druid/cluster/data/historical/jvm.config
-server
-Xms16g
-Xmx19g
-XX:MaxDirectMemorySize=27g
-XX:+ExitOnOutOfMemoryError
-Duser.timezone=UTC
-Dfile.encoding=UTF-8
-Djava.io.tmpdir=var/tmp
-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

Sometimes the warning below appears; is this something we should be worried about?

2021-06-04T12:10:09,296 WARN [main-SendThread(zoolx1003:2181)] org.apache.zookeeper.ClientCnxn - Session 0x200a170f6bc003f for server zoolx1003/10.59.108.225:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Packet len34829732 is out of range!
at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366) ~[zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141) [zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
2021-06-04T12:10:09,397 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager - State change: SUSPENDED
2021-06-04T12:10:09,397 INFO [ZkCoordinator] org.apache.druid.server.coordination.ZkCoordinator - Ignoring event[PathChildrenCacheEvent{type=CONNECTION_SUSPENDED, data=null}]
2021-06-04T12:10:09,397 WARN [NodeRoleWatcher[COORDINATOR]] org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider$NodeRoleWatcher - Ignored event type[CONNECTION_SUSPENDED] for node watcher of role[coordinator].

Regards
Rajesh
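The "Packet len ... is out of range" error above typically means a znode payload exceeded ZooKeeper's default ~1 MB client buffer, which can happen when a very large number of segments is announced under one path. One possible mitigation (an assumption on my part, not something confirmed in this thread) is raising jute.maxbuffer consistently on the ZooKeeper servers and on every Druid JVM, e.g. in jvm.config:

```
# Allow larger ZooKeeper packets (value in bytes; 64 MB here is a placeholder).
# Must be set on both the ZooKeeper servers and all Druid processes.
-Djute.maxbuffer=67108864
```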

Diego Lucas Jiménez

Jun 7, 2021, 4:54:46 AM6/7/21
to Druid User
Interesting.
We are facing the same problem. From time to time, one or two Historicals go down (not the service, which keeps running): the Historical starts screaming a similar message ("org.apache.druid.curator.announcement.Announcer - Reinstating ...") without the out-of-heap error, and ZooKeeper also logs the same "org.apache.zookeeper.ClientCnxn - Session xxx for server xyz/IP:PORT, unexpected error, closing socket connection and attempting reconnect" that Rajesh is seeing.

The difference on my side is that we have 20 Historicals, 2 Coordinators, 2 Overlords, 25 MiddleManagers, 3 ZK, 2 Brokers, and 2 Routers, each on a dedicated machine, with a special highlight on the Historicals (64 vCPUs, dedicated NVMe disks, 384 GB RAM).

It happens several times per day: a Historical is "kicked out" of the cluster by ZK, then comes back, all while the Historical logs read like "yeah, trying to talk to ZK but he hates me".

I wonder if it's a Druid 0.20.x bug; it never happened to us before.
What version of Druid do you have, Rajesh?

Samarth Jain

Jun 7, 2021, 1:50:36 PM6/7/21
to druid...@googlegroups.com
Diego,

We have seen this happen in the past when the historical nodes are going through "stop the world" gc cycles. What does the gc activity look like for your historical process? Wondering if you need to do some tuning there.

Rajesh MK

Jun 8, 2021, 1:48:03 AM6/8/21
to druid...@googlegroups.com
Thank you for the inputs. We are currently using version 0.20.1 and planning to upgrade to version 0.21.0.

Regards
Rajesh

Diego Lucas Jiménez

unread,
Jun 14, 2021, 5:06:40 AM6/14/21
to Druid User
@samarth, you were actually right: stop-the-world GC was the cause. In one tier of our servers, increasing the heap fixed the problem, but the other tier already has 24 GB of heap.
Should we try to increase it even further? We're already using G1GC and the hardware can't get any better... any ideas on how to tune that heap?

Joseph Mocker

Jun 14, 2021, 11:29:20 AM6/14/21
to druid...@googlegroups.com

You mention a few times in this thread that you see OutOfMemoryErrors in the logs. To me, this suggests that your heap is not large enough and/or there is a memory leak somewhere. This is probably the cause of the long stop-the-world cycles: the GC is continually scouring memory to free up enough for what it needs, but can't because the heap is nearly full. You can confirm that by turning on some GC logging, or by connecting JVisualVM/JConsole and watching heap allocation over time.

I didn't see in your config settings below that you are using G1GC but you may have added that later.

Good luck!

  --joe
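For the GC logging Joe suggests, lines like the following can be added to the historical jvm.config (a sketch; the log file path is a placeholder):

```
# JDK 8 and earlier:
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/var/log/druid/historical-gc.log

# JDK 9+ (unified logging) equivalent:
-Xlog:gc*:file=/var/log/druid/historical-gc.log:time,uptime,level,tags
```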

Max Gorinevsky

Jun 15, 2021, 4:57:40 AM6/15/21
to druid...@googlegroups.com
Hi Diego,

The historical heap is used to store the following:

-Lookups
-Unmerged query results
-Per-segment and per-column information

Very large lookups can use a lot of heap. Complex queries covering a large interval can require a lot of heap.

Typically, the last of these is not a big factor, as Druid stores maybe a few KB per segment and a few hundred bytes per segment-column in heap. However, if you have a lot of segments or very many columns per segment, this can add up. For example, with 100k segments and 1,000 columns per segment, at roughly 100 bytes per segment-column, you'd need about ~10 GB of heap for this alone.


If you are encountering OOMs, you will need to figure out where they are happening; there should be clues in the stack trace. But, most likely, you will need to increase the heap available to the historical process.

Thanks,
Max
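Max's back-of-the-envelope numbers can be sketched as follows; the per-segment and per-column byte counts are rough assumptions taken from the description above, not exact Druid internals:

```python
def metadata_heap_bytes(num_segments, columns_per_segment,
                        bytes_per_segment=2_000, bytes_per_column=100):
    """Rough estimate of historical heap used for per-segment and
    per-segment-column bookkeeping (byte counts are assumptions)."""
    per_segment = num_segments * bytes_per_segment
    per_column = num_segments * columns_per_segment * bytes_per_column
    return per_segment + per_column

# 100k segments x 1,000 columns per segment, as in Max's example:
estimate = metadata_heap_bytes(100_000, 1_000)
print(f"{estimate / 1024**3:.1f} GiB")  # on the order of 10 GiB
```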


Ben Krug

Jun 15, 2021, 4:57:40 AM6/15/21
to druid...@googlegroups.com
This is more of a G1GC answer than a Druid answer, but with that much RAM, you might try a 31 GB heap.  That's a sweet spot for G1GC.  (31 GB is better than 32 GB; e.g., see here.)
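The 31 GB figure relates to compressed ordinary object pointers (oops): the JVM can use 32-bit compressed references only while the heap stays below roughly 32 GB, so a 31 GB heap can hold more objects than a slightly larger one. A historical jvm.config along those lines, as a sketch rather than a setting verified for this cluster:

```
-server
-Xms31g
-Xmx31g
-XX:+UseG1GC
-XX:MaxDirectMemorySize=24g
-XX:+ExitOnOutOfMemoryError
```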
