Hadoop Indexer hangs on 0.8.2


Hagen Rother

Nov 20, 2015, 4:14:14 PM
to druid-de...@googlegroups.com
Hi,

the upgrade was pretty smooth, however now I have hanging jobs. I.e. Hadoop exits cleanly, but the indexer doesn't pick that up; asking for the task logs, it looks like the pipe broke somewhere.

Very rarely, I do see an uncaught OutOfMemoryError in the log. Increasing the -Xmx of the middle manager and the tasks did not help. This worked reliably for the last couple of months. It only affects my larger jobs; the small ones work just as they did on 0.8.1.

Any ideas?

Cheers,
Hagen
--
Hagen Rother
Lead Architect | LiquidM


Hagen Rother

Nov 20, 2015, 4:18:08 PM
to druid-de...@googlegroups.com
Just after sending the mail, I found a stack trace in the logs for the very first time:

2015-11-20T17:55:10,712 INFO [task-runner-0] com.metamx.http.client.pool.ChannelResourceFactory - Generating: http://master1.dw.lqm.io:8090
Exception in thread "task-runner-0" java.lang.OutOfMemoryError: PermGen space
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at io.druid.indexing.overlord.TaskRunnerWorkItem.compareTo(TaskRunnerWorkItem.java:91)
	at io.druid.indexing.overlord.TaskRunnerWorkItem.compareTo(TaskRunnerWorkItem.java:31)
	at java.util.concurrent.ConcurrentSkipListMap.doRemove(ConcurrentSkipListMap.java:1064)
	at java.util.concurrent.ConcurrentSkipListMap.remove(ConcurrentSkipListMap.java:1896)
	at java.util.concurrent.ConcurrentSkipListSet.remove(ConcurrentSkipListSet.java:250)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$1.onSuccess(ThreadPoolTaskRunner.java:96)
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$1.onSuccess(ThreadPoolTaskRunner.java:92)
	at com.google.common.util.concurrent.Futures$4.run(Futures.java:1181)
	at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
	at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
	at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
	at com.google.common.util.concurrent.ListenableFutureTask.done(ListenableFutureTask.java:91)
	at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:380)
	at java.util.concurrent.FutureTask.set(FutureTask.java:229)
	at java.util.concurrent.FutureTask.run(FutureTask.java:270)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Fangjin Yang

Nov 21, 2015, 12:39:49 PM
to Druid Development
I don't recall any changes in 0.8.2 that could cause this problem. I wonder if the error is a result of configuration and/or data changes.

Mark Gastel

Nov 23, 2015, 7:10:16 PM
to Druid Development
I added -XX:MaxPermSize=256m to my middle manager's druid.indexer.runner.javaOpts configuration option. That got rid of the PermGen problem. After that I ran into some Jackson version compatibility issues between Hadoop and Druid, but the first change might get rid of your memory issue.
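For reference, a minimal sketch of what that looks like in the middle manager's runtime.properties; only the MaxPermSize flag is the change described above, the other JVM options are illustrative placeholders:

# middle manager runtime.properties (sketch; adjust heap size etc. to your setup)
druid.indexer.runner.javaOpts=-server -Xmx2g -XX:MaxPermSize=256m -Duser.timezone=UTC -Dfile.encoding=UTF-8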

Mark

