Overlord: "The leader threw an exception" related to Apache Curator

Chris Freyer

Sep 14, 2016, 7:54:58 PM
to Druid User
Hello all.
My team has a Druid cluster in production, and we've had several Overlord outages in the past 2 months. The Overlord process stays running but becomes unresponsive, and we have to kill -9 it.

Looking back at our logs, I see a pattern for the outages.  All of them include these two lines:

2016-09-09T22:39:48,546 ERROR [Curator-LeaderSelector-0] org.apache.curator.framework.recipes.leader.LeaderSelector - The leader threw an exception
java.lang.IllegalMonitorStateException: You do not own the lock: /druid/indexer/leaderLatchPath
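
For anyone who wants to check their own Overlord logs for the same pattern, here is a rough sketch of a scanner. It assumes each log entry begins with a timestamp in the format shown above and that the exception line directly follows the ERROR line; the function name find_leader_failures is made up for illustration.

```python
import re

# Substrings taken from the two log lines quoted in this post.
LEADER_ERROR = "LeaderSelector - The leader threw an exception"
LOCK_ERROR = "java.lang.IllegalMonitorStateException: You do not own the lock"

# Assumption: entries start with a timestamp like 2016-09-09T22:39:48,546.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2},\d{3})")


def find_leader_failures(lines):
    """Return timestamps of LeaderSelector errors directly followed by the lock exception."""
    hits = []
    previous = None  # timestamp of the preceding line, if it was a LeaderSelector error
    for line in lines:
        if previous is not None and LOCK_ERROR in line:
            hits.append(previous)
        match = TIMESTAMP.match(line)
        previous = match.group(1) if match and LEADER_ERROR in line else None
    return hits
```

To run it over a file, pass an open file handle, e.g. find_leader_failures(open("overlord.log")), where overlord.log is a placeholder for your own log path.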

Digging a bit deeper into Curator's LeaderSelector class, I found a defect in Apache's issue tracker that was recently resolved in Curator.

The fix was applied on July 28, 2016, and is included in Curator versions 3.2.1 and 2.11.1.
I checked our Druid (0.9.1.1) and it uses Curator version 2.10.0.
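
As a quick way to compare a bundled Curator version against the fixed releases named above, here is a minimal sketch. The fixed versions come from this post; treating later releases on the same major line as also fixed is an assumption, and has_leader_fix is a made-up helper name.

```python
# Fixed releases as stated in this post, keyed by major version.
# Assumption: any later release on the same major line also contains the fix.
FIXED = {2: (2, 11, 1), 3: (3, 2, 1)}


def parse_version(text):
    """Turn a dotted version string such as '2.10.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def has_leader_fix(version):
    """Return True if the version is at or past the fixed release on its major line."""
    parsed = parse_version(version)
    fixed = FIXED.get(parsed[0])
    # Unknown major lines return False because this post says nothing about them.
    return fixed is not None and parsed >= fixed
```

For the version bundled with Druid 0.9.1.1, has_leader_fix("2.10.0") returns False.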

Has anyone else experienced this error?
Any advice on upgrading Curator?

Thanks--
Chris Freyer

Fangjin Yang

Oct 7, 2016, 4:36:04 PM
to Druid User
Hi Chris, when you say outage, can you describe in more detail what happens? Do you have logs from the Overlord during these outages? I wonder if the problem is something else and the Overlord error messages are a symptom rather than the cause.