Loading native snappy in middleManager/peon


Andrew Otto

May 20, 2016, 4:44:18 PM
to Druid User
Hi all!

I'm trying to set up Druid with Hadoop 2.6.0 (CDH 5.5.2).  After wading through and working around various dependency issues, I'm hitting a wall.  Our default compression codec is snappy.  We force the Hadoop java processes to load the native snappy libraries by setting LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native.  Earlier this week I got an error about not being able to load snappy in the middleManager, but got around it by setting LD_LIBRARY_PATH in the middleManager's environment.  Now I'm at the point where an indexing task successfully completes a MapReduce job and writes out snappy files to /tmp/druid-indexing.  The MapReduce job finishes fine, but after it does, it looks like a Peon process attempts to read what was written, and while doing so I get a snappy loading error:




2016-05-20T20:35:27,591 INFO io.druid.indexer.DetermineHashedPartitionsJob: Job completed, loading up partitions for intervals[Optional.of([2015-09-01T00:00:00.000Z/2015-09-02T00:00:00.000Z])].
2016-05-20T20:35:27,643 ERROR io.druid.indexing.overlord.ThreadPoolTaskRunner: Exception while running task[HadoopIndexTask{id=index_hadoop_pageviews_2016-05-20T20:33:38.361Z, type=index_hadoop, dataSource=pageviews}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:160) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:175) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:318) [druid-indexing-service-0.9.0.jar:0.9.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) [?:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [?:1.7.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [?:1.7.0_101]
at java.lang.Thread.run(Thread.java:745) [?:1.7.0_101]
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_101]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_101]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_101]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_101]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
... 7 more
Caused by: java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65) ~[?:?]
at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193) ~[?:?]
at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178) ~[?:?]
at org.apache.hadoop.io.compress.CompressionCodec$Util.createInputStreamWithCodecPool(CompressionCodec.java:157) ~[?:?]
at org.apache.hadoop.io.compress.SnappyCodec.createInputStream(SnappyCodec.java:163) ~[?:?]
at io.druid.indexer.Utils.openInputStream(Utils.java:101) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.Utils.openInputStream(Utils.java:77) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.DetermineHashedPartitionsJob.run(DetermineHashedPartitionsJob.java:161) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.JobHelper.runJobs(JobHelper.java:323) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.HadoopDruidDetermineConfigurationJob.run(HadoopDruidDetermineConfigurationJob.java:86) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask$HadoopDetermineConfigInnerProcessing.runTask(HadoopIndexTask.java:291) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.7.0_101]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[?:1.7.0_101]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.7.0_101]
at java.lang.reflect.Method.invoke(Method.java:606) ~[?:1.7.0_101]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:157) ~[druid-indexing-service-0.9.0.jar:0.9.0]
... 7 more
2016-05-20T20:35:27,763 INFO io.druid.indexing.worker.executor.ExecutorLifecycle: Task completed with status: {
  "id" : "index_hadoop_pageviews_2016-05-20T20:33:38.361Z",
  "status" : "FAILED",
  "duration" : 82683
}


I've been trying variations to keep it from failing, but nothing has worked yet.  I would have thought that setting -Djava.library.path=/usr/lib/hadoop/lib/native in druid.indexer.runner.javaOpts would help, but it doesn't.  Is there a way I can pass LD_LIBRARY_PATH down to the Peon's environment?  Has anyone else run into this?
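
For concreteness, the middleManager property I've been experimenting with looks something like this (the other flags here are illustrative placeholders, not my exact values):

    # middleManager runtime.properties -- attempted, but did not fix the peon error
    druid.indexer.runner.javaOpts=-server -Xmx2g -Djava.library.path=/usr/lib/hadoop/lib/native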

Thanks!
-Andrew

Andrew Otto

May 23, 2016, 11:38:08 AM
to druid...@googlegroups.com
Quick update this morning.  I'm pretty sure that setting just LD_LIBRARY_PATH in the middleManager's env does propagate down to the Peon.  In the logs I see:

2016-05-23T15:35:07,702 INFO io.druid.cli.CliPeon: * java.library.path: /usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib


Andrew Otto

May 23, 2016, 3:46:31 PM
to druid...@googlegroups.com
Ok, I’m getting close to stumped.  As far as I can tell, both the Hadoop and Snappy native libs should load properly when I set LD_LIBRARY_PATH, since LD_LIBRARY_PATH is prepended to java.library.path.

I prepped some code to help me make sure I wasn’t doing something dumb: https://gist.github.com/ottomata/6caf158d3b787a1c3439d936a1e28916#file-snappynativetest-java

I am able to load native hadoop and snappy using the same classpath and java.library.path that druid uses.  
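
The test boils down to roughly this (a minimal sketch; the class name and printouts are just my scaffolding):

    import org.apache.hadoop.util.NativeCodeLoader;

    // Run with the same classpath and -Djava.library.path that the peon uses.
    public class SnappyNativeTest {
        public static void main(String[] args) {
            // true only if libhadoop.so was found and loaded from java.library.path
            boolean hadoopLoaded = NativeCodeLoader.isNativeCodeLoaded();
            System.out.println("native hadoop loaded: " + hadoopLoaded);

            if (hadoopLoaded) {
                // true only if that libhadoop build was compiled with snappy support;
                // SnappyCodec.checkNativeCodeLoaded() makes this same check
                System.out.println("build supports snappy: " + NativeCodeLoader.buildSupportsSnappy());
            }
        }
    }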

At the bottom of this email is a bit more middleManager logging detail that leads up to this error. In summary I see:

- middleManager starts, uses /usr/lib/hadoop/lib/native (zookeeper too?)

- Peon indexing job starts, uses /usr/lib/hadoop/lib/native (zookeeper too?), but prints out ‘Unable to load native-hadoop library for your platform… using builtin-java classes where applicable'

- YARN Hadoop indexing job is submitted and completes.  I believe this writes a .snappy file somewhere into hdfs:///tmp/hadoop-indexing/…

- middleManager (or Peon task?) attempts to read the previously written snappy file, and errors out with ‘native snappy library not available: this version of libhadoop was built without snappy support’.

So ja, something is fishy with the Peon’s java.library.path.  Even though the java.library.path is clearly set properly when the Peon starts up, it does not register the shared library files, as indicated by the ‘Unable to load native-hadoop library…’ message.
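
To spell out the distinction: having java.library.path set is not the same as the library actually being loaded.  A quick probe (just a sketch) separates the two:

    // Sketch: java.library.path being set does not mean libhadoop actually loaded.
    public class LoadProbe {
        public static void main(String[] args) {
            System.out.println("java.library.path = " + System.getProperty("java.library.path"));
            try {
                System.loadLibrary("hadoop");  // looks for libhadoop.so on java.library.path
                System.out.println("libhadoop loaded");
            } catch (UnsatisfiedLinkError e) {
                System.out.println("libhadoop NOT loadable: " + e.getMessage());
            }
        }
    }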


I guess if I don’t hear from someone by tomorrow, I’ll file an issue on GitHub.

Actual logs below.  I’ve removed stuff that looked uninteresting.  I see classpaths, extensions, and hadoop-dependencies all loading as expected.


2016-05-23T19:18:31,500 INFO io.druid.cli.CliMiddleManager: *                   java.library.path:/usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
...
2016-05-23T19:18:32,700 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
...
2016-05-23T19:18:35,840 INFO org.eclipse.jetty.server.ServerConnector: Started ServerConnector@6685f71a{HTTP/1.1}{0.0.0.0:8091}
2016-05-23T19:18:35,844 INFO org.eclipse.jetty.server.Server: Started @25796ms
...
2016-05-23T19:19:40,744 INFO io.druid.cli.CliPeon: *                            java.library.path:/usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
...
2016-05-23T19:19:42,894 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/usr/lib/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
...

2016-05-23T19:19:43,745 INFO io.druid.indexing.worker.executor.ExecutorLifecycle: Running with task: {
  "type" : "index_hadoop",
  "id" : "index_hadoop_pageviews_2016-05-23T19:19:22.575Z",
...
2016-05-23T19:19:48,880 INFO org.eclipse.jetty.server.ServerConnector: Started ServerConnector@1371e566{HTTP/1.1}{0.0.0.0:8100}
2016-05-23T19:19:48,881 INFO org.eclipse.jetty.server.Server: Started @25487ms
...
2016-05-23T19:20:20,066 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1463163743644_0030
...

2016-05-23T19:21:17,670 INFO io.druid.indexer.DetermineHashedPartitionsJob: Job completed, loading up partitions for intervals[Optional.of([2015-09-01T00:00:00.000Z/2015-09-02T00:00:00.000Z])].
2016-05-23T19:21:17,959 ERROR io.druid.indexing.overlord.ThreadPoolTaskRunner: Exception while running task[HadoopIndexTask{id=index_hadoop_pageviews_2016-05-23T19:19:22.575Z, type=index_hadoop, dataSource=pageviews}]
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:160) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.common.task.HadoopIndexTask.run(HadoopIndexTask.java:175) ~[druid-indexing-service-0.9.0.jar:0.9.0]
at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:338) [druid-indexing-service-0.9.0.jar:0.9.0]
...
Caused by: java.lang.RuntimeException: native snappy library not available: this version of libhadoop was built without snappy support.
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65) ~[?:?]
at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193) ~[?:?]
at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178) ~[?:?]
at org.apache.hadoop.io.compress.CompressionCodec$Util.createInputStreamWithCodecPool(CompressionCodec.java:157) ~[?:?]
at org.apache.hadoop.io.compress.SnappyCodec.createInputStream(SnappyCodec.java:163) ~[?:?]
at io.druid.indexer.Utils.openInputStream(Utils.java:101) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]
at io.druid.indexer.Utils.openInputStream(Utils.java:77) ~[druid-indexing-hadoop-0.9.0.jar:0.9.0]

...
2016-05-23T19:21:18,084 INFO io.druid.indexing.worker.executor.ExecutorLifecycle: Task completed with status: {
  "id" : "index_hadoop_pageviews_2016-05-23T19:19:22.575Z",
  "status" : "FAILED",
  "duration" : 93849
}





Andrew Otto

May 24, 2016, 6:08:45 PM
to druid...@googlegroups.com

I gave up on snappy and decided to try to force the Druid indexer jobs to output gzip files instead.  I tried to do this in three different ways.  In all cases I modified properties for both the middleManager and Peon (via druid.indexer.runner.javaOpts) JVMs:

- -Dmapreduce.output.fileoutputformat.compress=org.apache.hadoop.io.compress.GzipCodec

- -Dmapred.child.java.opts=-Dmapreduce.output.fileoutputformat.compress=org.apache.hadoop.io.compress.GzipCodec

- -Dmapreduce.map.java.opts=-Dmapreduce.output.fileoutputformat.compress=org.apache.hadoop.io.compress.GzipCodec -Dmapreduce.reduce.java.opts=-Dmapreduce.output.fileoutputformat.compress=org.apache.hadoop.io.compress.GzipCodec

I examined the job properties for the YARN job launched by the indexing task.  None of these settings were passed down to the job.  The SnappyCodec configured in mapred-site.xml was used.


Andrew Otto

May 24, 2016, 6:41:26 PM
to druid...@googlegroups.com
Ah, but of course this won’t work.  These are JVM options on the middleManager and Peon; they won’t be passed down to the MapReduce job automatically.

Is there a way to provide Hadoop related settings to the Peon before it submits the MapReduce indexing job?​

Andrew Otto

May 24, 2016, 7:25:57 PM
to druid...@googlegroups.com
Ah, I was finally able to run an indexing job!  The answer to my previous question is to add

      "jobProperties" : {"mapreduce.output.fileoutputformat.compress": "org.apache.hadoop.io.compress.GzipCodec"}

to the indexing task specification.  Yay!
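
For reference, jobProperties lives under tuningConfig in the Hadoop task spec.  A trimmed sketch, with the dataSchema and ioConfig elided:

    {
      "type" : "index_hadoop",
      "spec" : {
        "dataSchema" : { ... },
        "ioConfig" : { ... },
        "tuningConfig" : {
          "type" : "hadoop",
          "jobProperties" : {
            "mapreduce.output.fileoutputformat.compress" : "org.apache.hadoop.io.compress.GzipCodec"
          }
        }
      }
    }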

I still think native libs should work.  This is a bug.  I will file one. :)

charles.allen

May 24, 2016, 7:40:10 PM
to Druid User
This is a fun one. Looking at SnappyCodec, it seems that it does some basic checking, as per:


if (!NativeCodeLoader.buildSupportsSnappy()) {
  throw new RuntimeException("native snappy library not available: " +
      "this version of libhadoop was built without " +
      "snappy support.");
}

Now, the FUN thing to know at this point is that you are in a special classloader (as per io.druid.indexing.common.task.HadoopTask.invokeForeignLoader). This means that all the classes and jars need to be within the hadoop directory found by the coordinates specified, as per http://druid.io/docs/0.9.0/operations/other-hadoop.html.

Now, what I DON'T know is how well native libraries play with isolated classloaders. So look at the directory where your isolated hadoop stuff is located and make sure the correct jars are there.
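
Roughly, the isolation works like this (a sketch of the idea only, not Druid's actual code; the directory path below is hypothetical):

    import java.io.File;
    import java.net.URL;
    import java.net.URLClassLoader;

    public class ForeignLoaderSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical hadoop-dependencies layout; adjust to your install,
            // and assume the directory exists and contains the hadoop jars.
            File dir = new File("hadoop-dependencies/hadoop-client/2.6.0");
            File[] jars = dir.listFiles();
            URL[] urls = new URL[jars.length];
            for (int i = 0; i < jars.length; i++) {
                urls[i] = jars[i].toURI().toURL();
            }

            // A separate classloader over just those jars: classes resolved here
            // come from the hadoop-dependencies directory, not the main classpath.
            URLClassLoader isolated = new URLClassLoader(urls, null);

            // So the hadoop-common answering buildSupportsSnappy() is whichever jar
            // sits in that directory. Also note the JNI rule that a given native
            // library can be loaded by only one classloader in a JVM at a time.
            Class<?> ncl = Class.forName("org.apache.hadoop.util.NativeCodeLoader", false, isolated);
            System.out.println("NativeCodeLoader would come from: " + ncl.getClassLoader());
        }
    }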

charles.allen

May 24, 2016, 7:45:35 PM
to Druid User
To kind of complete this: the thing Hadoop is doing in the boolean check is pretty simple:

JNIEXPORT jboolean JNICALL Java_org_apache_hadoop_util_NativeCodeLoader_buildSupportsSnappy
  (JNIEnv *env, jclass clazz)
{
#ifdef HADOOP_SNAPPY_LIBRARY
  return JNI_TRUE;
#else
  return JNI_FALSE;
#endif
}

So as long as the hadoop-common jar is the one you intend it to be (with snappy support) then it shoooooouuuulllllddddd be ok

Andrew Otto

May 24, 2016, 8:13:01 PM
to druid...@googlegroups.com
> So as long as the hadoop-common jar is the one you intend it to be (with snappy support) then it shoooooouuuulllllddddd be ok

Hm, it should be!  https://gist.github.com/ottomata/6caf158d3b787a1c3439d936a1e28916#file-snappynativetest-java uses the same jars that I have in hadoop-dependencies (loaded via hadoopDependencyCoordinates), and it all works fine from there.




Andrew Otto

May 26, 2016, 10:11:47 AM
to Druid User
I just created an issue for this: https://github.com/druid-io/druid/issues/3025

Anuj Singhania

Aug 17, 2017, 11:29:33 AM
to Druid User
Hi,

I am also facing a similar kind of issue.

But my task is failing randomly because the lz4 lib is not available:

Error: java.lang.RuntimeException: native lz4 library not available

I am not able to understand why the task fails only some of the time.