Loading native snappy in middleManager/peon


Andrew Otto

May 20, 2016, 4:44:18 PM
to Druid User
Hi all!

I'm trying to set up Druid with Hadoop 2.6.0 (CDH 5.5.2).  After wading through and working around various dependency issues, I'm hitting a wall.  Our default compression codec is snappy.  We force the Hadoop java processes to load the native snappy libraries by setting LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native.  Earlier this week I got an error about not being able to load snappy in the middleManager, but got around it by setting LD_LIBRARY_PATH in the middleManager's environment.  Now I'm at the point where an indexing task successfully completes a MapReduce job and writes out snappy files to /tmp/druid-indexing.  The MapReduce job finishes fine, but after it does, it looks like a Peon process attempts to read what was written, and while doing so I get a snappy loading error:




2016-05-20T20:35:27,591 INFO io.druid.indexer.DetermineHashedPartitionsJob: Job completed, loading up partitions for intervals[Optional.of([2015-09-01T00:00:00.000Z/2015-09-02T00:00:00.000Z])].
2016-05-20T20:35:27,643 ERROR io.druid.indexing.overlord.ThreadPoolTaskRunner: Exception while running task[HadoopIndexTask{id=index_hadoop_pageviews_2016-05-20T20:33:38.361Z, type=index_hadoop, dataSource=pageviews}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:160) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:175) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:318) [druid-indexing-service-0.9.0.jar:0.9.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_101]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_101]
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_101]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_101]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_101]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_101]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
... 7 more
Caused by: java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65) ~[?:?]
at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193) ~[?:?]
at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178) ~[?:?]
at org.apache.hadoop.io.compress.CompressionCodec$Util.createInputStreamWithCodecPool(CompressionCodec.java:157) ~[?:?]
at org.apache.hadoop.io.compress.SnappyCodec.createInputStream(SnappyCodec.java:163) ~[?:?]
at io.druid.indexer.Utils.openInputStream(Utils.java:101) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.Utils.openInputStream(Utils.java:77) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.DetermineHashedPartitionsJob.run(DetermineHashedPartitionsJob.java:161) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.JobHelper.runJobs(JobHelper.java:323) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:86) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:291) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_101]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_101]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_101]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_101]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
... 7 more
2016-05-20T20:35:27,763 INFO io.druid.indexing.worker.executor.ExecutorLifecycle: Task completed with status: {
  "id" : "index_hadoop_pageviews_2016-05-20T20:33:38.361Z",
  "status" : "FAILED",
  "duration" : 82683
}


I've been trying variations to keep it from failing, but nothing has worked yet.  I would have thought that setting -Djava.library.path=/usr/lib/hadoop/lib/native in druid.indexer.runner.javaOpts would help, but it doesn't.  Is there a way I can pass LD_LIBRARY_PATH down to the Peon's environment?  Has anyone else run into this?
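
For concreteness, the middleManager property I've been experimenting with looks something like this (the other flags here are illustrative placeholders, not my exact values):

    # middleManager runtime.properties -- attempted, but did not fix the peon error
    druid.indexer.runner.javaOpts=-server -Xmx2g -Djava.library.path=/usr/lib/hadoop/lib/native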

Thanks!
-Andrew

Andrew Otto

May 23, 2016, 11:38:08 AM
to druid...@googlegroups.com
Quick update this morning.  I'm pretty sure that setting just LD_LIBRARY_PATH in the middleManager's env does propagate down to the Peon.  In the logs I see:

2016-05-23T15:35:07,702 INFO io.druid.cli.CliPeon: * java.library.path: /usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib


Andrew Otto

May 23, 2016, 3:46:31 PM
to druid...@googlegroups.com
Ok, I’m getting close to stumped.  As far as I can tell, both the Hadoop and Snappy native libs should load properly when I set LD_LIBRARY_PATH, since LD_LIBRARY_PATH is prepended to java.library.path.

I prepped some code to help me make sure I wasn’t doing something dumb: https://gist.github.com/ottomata/6caf158d3b787a1c3439d936a1e28916#file-snappynativetest-java

I am able to load native hadoop and snappy using the same classpath and java.library.path that druid uses.  
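
The test boils down to roughly this (a minimal sketch; the class name and printouts are just my scaffolding):

    import org.apache.hadoop.util.NativeCodeLoader;

    // Run with the same classpath and -Djava.library.path that the peon uses.
    public class SnappyNativeTest {
        public static void main(String[] args) {
            // true only if libhadoop.so was found and loaded from java.library.path
            boolean hadoopLoaded = NativeCodeLoader.isNativeCodeLoaded();
            System.out.println("native hadoop loaded: " + hadoopLoaded);

            if (hadoopLoaded) {
                // true only if that libhadoop build was compiled with snappy support;
                // SnappyCodec.checkNativeCodeLoaded() makes this same check
                System.out.println("build supports snappy: " + NativeCodeLoader.buildSupportsSnappy());
            }
        }
    }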

At the bottom of this email is a bit more middleManager logging detail that leads up to this error. In summary I see:

- middleManager starts, uses /usr/lib/hadoop/lib/native (zookeeper too?)

- Peon indexing job starts, uses /usr/lib/hadoop/lib/native (zookeeper too?), but prints out ‘Unable to load native-hadoop library for your platform… using builtin-java classes where applicable'

- YARN Hadoop indexing job is submitted and completes.  I believe this writes a .snappy file somewhere into hdfs:///tmp/hadoop-indexing/…

- middleManager (or Peon task?) attempts to read the previously written snappy file, and errors out with ‘native snappy library not available: this version of libhadoop was built without snappy support’.

So ja, something is fishy with the Peon’s java.library.path.  Even though the java.library.path is clearly set properly when the Peon starts up, it does not register the shared library files, as indicated by the ‘Unable to load native-hadoop library…’ message.
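
To spell out the distinction: having java.library.path set is not the same as the library actually being loaded.  A quick probe (just a sketch) separates the two:

    // Sketch: java.library.path being set does not mean libhadoop actually loaded.
    public class LoadProbe {
        public static void main(String[] args) {
            System.out.println("java.library.path = " + System.getProperty("java.library.path"));
            try {
                System.loadLibrary("hadoop");  // looks for libhadoop.so on java.library.path
                System.out.println("libhadoop loaded");
            } catch (UnsatisfiedLinkError e) {
                System.out.println("libhadoop NOT loadable: " + e.getMessage());
            }
        }
    }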


I guess if I don’t hear from someone by tomorrow, I’ll file an issue on GitHub.

Actual logs below.  I’ve removed stuff that looked uninteresting.  I see classpaths, extensions, and hadoop-dependencies all loading as expected.


2016-05-23T19:18:31,500 INFO io.druid.cli.CliMiddleManager: *                   java.library.path:/usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
...
2016-05-23T19:18:32,700 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
...
2016-05-23T19:18:35,840 INFO org.eclipse.jetty.server.ServerConnector: Started ServerConnector@6685f71a{HTTP/1.1}{0.0.0.0:8091}
2016-05-23T19:18:35,844 INFO org.eclipse.jetty.server.Server: Started @25796ms
...
2016-05-23T19:19:40,744 INFO io.druid.cli.CliPeon: *                            java.library.path:/usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
...
2016-05-23T19:19:42,894 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
...

2016-05-23T19:19:43,745 INFO io.druid.indexing.worker.executor.ExecutorLifecycle: Running with task: {
  "type" : "index_hadoop",
  "id" : "index_hadoop_pageviews_2016-05-23T19:19:22.575Z",
...
2016-05-23T19:19:48,880 INFO org.eclipse.jetty.server.ServerConnector: Started ServerConnector@1371e566{HTTP/1.1}{0.0.0.0:8100}
2016-05-23T19:19:48,881 INFO org.eclipse.jetty.server.Server: Started @25487ms
...
2016-05-23T19:20:20,066 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1463163743644_0030
...

2016-05-23T19:21:17,670 INFO io.druid.indexer.DetermineHashedPartitionsJob: Job completed, loading up partitions for intervals[Optional.of([2015-09-01T00:00:00.000Z/2015-09-02T00:00:00.000Z])].
2016-05-23T19:21:17,959 ERROR io.druid.indexing.overlord.ThreadPoolTaskRunner: Exception while running task[HadoopIndexTask{id=index_hadoop_pageviews_2016-05-23T19:19:22.575Z, type=index_hadoop, dataSource=pageviews}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:160) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:175) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid-indexing-service-0.9.0.jar:0.9.0]
...
Caused by: java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65) ~[?:?]
at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193) ~[?:?]
at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178) ~[?:?]
at org.apache.hadoop.io.compress.CompressionCodec$Util.createInputStreamWithCodecPool(CompressionCodec.java:157) ~[?:?]
at org.apache.hadoop.io.compress.SnappyCodec.createInputStream(SnappyCodec.java:163) ~[?:?]
at io.druid.indexer.Utils.openInputStream(Utils.java:101) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.Utils.openInputStream(Utils.java:77) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]

...
2016-05-23T19:21:18,084 INFO io.druid.indexing.worker.executor.ExecutorLifecycle: Task completed with status: {
  "id" : "index_hadoop_pageviews_2016-05-23T19:19:22.575Z",
  "status" : "FAILED",
  "duration" : 93849
}





Andrew Otto

May 24, 2016, 6:08:45 PM
to druid...@googlegroups.com

I gave up on snappy and decided to try to force the Druid indexer jobs to output gzip files instead.  I tried to do this in three different ways.  In all cases I modified properties for both the middleManager and Peon (via druid.indexer.runner.javaOpts) JVMs:

- -Dmapreduce.output.fileoutputformat.compress=org.apache.hadoop.io.compress.GzipCodec

- -Dmapred.child.java.opts=-Dmapreduce.output.fileoutputformat.compress=org.apache.hadoop.io.compress.GzipCodec

- -Dmapreduce.map.java.opts=-Dmapreduce.output.fileoutputformat.compress=org.apache.hadoop.io.compress.GzipCodec -Dmapreduce.reduce.java.opts=-Dmapreduce.output.fileoutputformat.compress=org.apache.hadoop.io.compress.GzipCodec

I examined the job properties for the YARN job launched by the indexing task.  None of these settings were passed down to the job.  The SnappyCodec configured in mapred-site.xml was used.


Andrew Otto

May 24, 2016, 6:41:26 PM
to druid...@googlegroups.com
Ah, but of course this won’t work.  These are JVM options on the middleManager and Peon; they won’t be passed down to the MapReduce job automatically.

Is there a way to provide Hadoop related settings to the Peon before it submits the MapReduce indexing job?​

Andrew Otto

May 24, 2016, 7:25:57 PM
to druid...@googlegroups.com
Ah, I was finally able to run an indexing job!  The answer to my previous question is to add

      "jobProperties" : {"mapreduce.output.fileoutputformat.compress": "org.apache.hadoop.io.compress.GzipCodec"}

to the indexing task specification.  Yay!
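
For reference, jobProperties lives under tuningConfig in the Hadoop task spec.  A trimmed sketch, with the dataSchema and ioConfig elided:

    {
      "type" : "index_hadoop",
      "spec" : {
        "dataSchema" : { ... },
        "ioConfig" : { ... },
        "tuningConfig" : {
          "type" : "hadoop",
          "jobProperties" : {
            "mapreduce.output.fileoutputformat.compress" : "org.apache.hadoop.io.compress.GzipCodec"
          }
        }
      }
    }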

I still think native libs should work.  This is a bug.  I will file one. :)

charles.allen

May 24, 2016, 7:40:10 PM
to Druid User
This is a fun one. Looking at SnappyCodec, it seems that it does some basic checking, as per:


if (!NativeCodeLoader.buildSupportsSnappy()) {
  throw new RuntimeException("native snappy library not available: " +
      "this version of libhadoop was built without " +
      "snappy support.");
}

Now, the FUN thing to know at this point is that you are in a special classloader (as per io.druid.indexing.common.task.HadoopTask.invokeForeignLoader). This means that all the classes and jars need to be within the hadoop directory found by the coordinates specified, as per http://druid.io/docs/0.9.0/operations/other-hadoop.html.

Now, what I DON'T know is how well native libraries play with isolated classloaders. So look at the directory where your isolated hadoop stuff is located and make sure the correct jars are there.
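
Roughly, the isolation works like this (a sketch of the idea only, not Druid's actual code; the directory path below is hypothetical):

    import java.io.File;
    import java.net.URL;
    import java.net.URLClassLoader;

    public class ForeignLoaderSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical hadoop-dependencies layout; adjust to your install,
            // and assume the directory exists and contains the hadoop jars.
            File dir = new File("hadoop-dependencies/hadoop-client/2.6.0");
            File[] jars = dir.listFiles();
            URL[] urls = new URL[jars.length];
            for (int i = 0; i < jars.length; i++) {
                urls[i] = jars[i].toURI().toURL();
            }

            // A separate classloader over just those jars: classes resolved here
            // come from the hadoop-dependencies directory, not the main classpath.
            URLClassLoader isolated = new URLClassLoader(urls, null);

            // So the hadoop-common answering buildSupportsSnappy() is whichever jar
            // sits in that directory. Also note the JNI rule that a given native
            // library can be loaded by only one classloader in a JVM at a time.
            Class<?> ncl = Class.forName("org.apache.hadoop.util.NativeCodeLoader", false, isolated);
            System.out.println("NativeCodeLoader would come from: " + ncl.getClassLoader());
        }
    }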

charles.allen

May 24, 2016, 7:45:35 PM
to Druid User
To kind of complete this: the thing Hadoop is doing in the boolean check is pretty simple:

JNIEXPORT jboolean JNICALL Java_org_apache_hadoop_util_NativeCodeLoader_buildSupportsSnappy
  (JNIEnv *env, jclass clazz)
{
#ifdef HADOOP_SNAPPY_LIBRARY
  return JNI_TRUE;
#else
  return JNI_FALSE;
#endif
}

So as long as the hadoop-common jar is the one you intend it to be (with snappy support) then it shoooooouuuulllllddddd be ok

Andrew Otto

May 24, 2016, 8:13:01 PM
to druid...@googlegroups.com
> So as long as the hadoop-common jar is the one you intend it to be (with snappy support) then it shoooooouuuulllllddddd be ok

Hm, it should be!  https://gist.github.com/ottomata/6caf158d3b787a1c3439d936a1e28916#file-snappynativetest-java uses the same jars that I have in hadoop-dependencies (loaded via hadoopDependencyCoordinates), and it all works fine from there.




Andrew Otto

May 26, 2016, 10:11:47 AM
to Druid User
I just created an issue for this: https://github.com/druid-io/druid/issues/3025

Anuj Singhania

Aug 17, 2017, 11:29:33 AM
to Druid User
Hi,

I am also facing a similar kind of issue.

But my task is failing randomly because the lz4 lib is not available:

Error: java.lang.RuntimeException: native lz4 library not available

I am not able to understand why the task fails only some of the time.