snappy nativelib bundled in snappy-java exports different symbols than vanilla snappy nativelib


tucu

Jun 1, 2011, 9:12:43 PM
to Xerial
Hello,

I'm running into problems because of the following:

snappy-java adds its JNI bindings to snappy and republishes the
combined native library under the name snappy.

This causes problems for Java code that uses both snappy-java and
snappy (via hadoop-snappy), because two Java components then load
different native libraries with the same name.

snappy-java should either use a different native library name, or
keep its JNI bindings in a separate native library (as hadoop-snappy
does).

Thoughts?

Thanks.

Alejandro

Taro L. Saito

Jun 1, 2011, 10:39:43 PM
to xer...@googlegroups.com
Hi,

Renaming the native library is easy.
I will change the library name from snappy to snappyjava.

Thank you for the suggestion.

--
Taro L. Saito
<l...@xerial.org>
University of Tokyo
http://www.xerial.org/leo
Tel. +81-47-136-4065 (64065)


Taro L. Saito

Jun 1, 2011, 10:47:21 PM
to xer...@googlegroups.com
Hi,

I created a snapshot version that renames native libraries from
libsnappy to libsnappyjava:
http://maven.xerial.org/repository/snapshot/org/xerial/snappy/snappy-java/1.0.2-SNAPSHOT/snappy-java-1.0.2-20110602.024416-2.jar

I am not sure this simple renaming is enough: snappy-java and
hadoop-snappy use the same snappy library, so function names in the
two native libraries might still collide.


Alejandro Abdelnur

Jun 2, 2011, 9:35:09 AM
to xer...@googlegroups.com
Taro,

Thanks for the quick response, I'll test it today and I will let you know.

Thanks again.

Alejandro

Alejandro Abdelnur

Jun 2, 2011, 2:26:16 PM
to xer...@googlegroups.com
Hi Taro,

I've just built the latest code from snappy-java, with your fix.

Now the JVM does not core dump.

But it still fails if the snappy .so happens to be loaded. The error I'm getting now is:

-----
java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
        at org.xerial.snappy.SnappyNative.maxCompressedLength(Native Method)
        at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:188)
        at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:61)
        at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:54)
-----

Is there any chance of having libsnappyjava contain only the JNI bindings and link dynamically to libsnappy? (If so, libsnappy should ideally be looked up in the LD_LIBRARY_PATH before trying to load it from the JAR; this would allow patching libsnappy without having to repatch snappy-java.)

Thanks again.

Cheers

Alejandro

Taro L. Saito

Jun 2, 2011, 10:50:51 PM
to xer...@googlegroups.com
Hi,

The solution you mentioned is possible by supplying JVM arguments, for example,
-Dorg.xerial.snappy.lib.path=(directory path containing libsnappy.so)
-Dorg.xerial.snappy.lib.name=libsnappy.so
(You can also use System.setProperty(...) )

If these two variables are set before calling methods in
org.xerial.snappy.Snappy, snappy-java tries to load the specified
native library instead of the one embedded in snappy-java.jar.
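
A minimal sketch of the System.setProperty(...) variant. The two property names are the ones quoted above; the directory /opt/native/lib and the class name SnappyLibConfig are made-up placeholders. The properties need to be set before the first use of org.xerial.snappy.Snappy.

```java
public class SnappyLibConfig {
    // Property names as quoted in this message.
    static final String LIB_PATH_KEY = "org.xerial.snappy.lib.path";
    static final String LIB_NAME_KEY = "org.xerial.snappy.lib.name";

    /** Points snappy-java at an external native library instead of the bundled one. */
    public static void useExternalLibrary(String directory, String fileName) {
        System.setProperty(LIB_PATH_KEY, directory);
        System.setProperty(LIB_NAME_KEY, fileName);
    }

    public static void main(String[] args) {
        // Hypothetical directory; call this before any org.xerial.snappy.Snappy method.
        useExternalLibrary("/opt/native/lib", "libsnappy.so");
        System.out.println(System.getProperty(LIB_PATH_KEY));
        System.out.println(System.getProperty(LIB_NAME_KEY));
    }
}
```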

However, UnsatisfiedLinkError is observed when the same native library
(in this case libsnappy.so) is loaded twice in JVM (see also
http://code.google.com/p/snappy-java/#Using_snappy-java_with_Tomcat6_Web_Server
)

Oops, I have just noticed that using libsnappy.so as a replacement for
libsnappyjava.so is impossible, because libsnappyjava contains not
only snappy but also the JNI interfaces and the code for invoking the
original snappy API.

If the UnsatisfiedLinkError is caused by loading libsnappyjava.so
two or more times, it can be fixed by putting snappy-java-(version).jar
where the parent class loader can find it.

Regards,


Alejandro Abdelnur

Jun 3, 2011, 12:32:13 AM
to xer...@googlegroups.com
Hi Taro,

Loading a nativelib twice in a JVM from the same classloader is a NOP; loading it from different classloaders fails.

Hadoop Snappy's libhadoopsnappy.so contains only its JNI bindings and is dynamically linked to libsnappy.so. The Hadoop Snappy Java code loads both.

If you can help me build Snappy Java in the same way I can test things out.

Thanks.

Alejandro

Taro L. Saito

Jun 3, 2011, 1:04:47 AM
to xer...@googlegroups.com
Hi Alejandro,

Technically speaking, splitting the .so file into two parts is possible:
snappy itself and the interface for accessing snappy, as in the Hadoop
Snappy implementation. But accessing the native code without calling
loadLibrary for libsnappy.so would be difficult, because we would have
to check whether the native code has already been loaded by some other
class outside org.xerial.snappy.

Java has no function to check whether the specified native library
(e.g., libsnappy.so) is loaded or not in ancestor or parent class
loaders that are visible from the current class loader. If snappy-java
is loaded in a parent class loader, its child class loader uses
snappy-java loaded in the parent, so UnsatisfiedLinkError will not be
thrown.

A possible workaround would be to call the native code through JNI
and, if an UnsatisfiedLinkError is thrown, try to load the native
library at that point. I am not sure whether this approach works properly.
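
A rough sketch of that call-then-load idea, not tested against the real libraries. LazyNativeLoader and callWithLazyLoad are hypothetical names, and the "native call" in main is a plain Java stand-in so the example runs without any .so file; in practice the loader would wrap a System.load or System.loadLibrary call.

```java
import java.util.function.Supplier;

public class LazyNativeLoader {
    /**
     * Runs nativeCall; if the JVM reports that the native method is not linked,
     * runs loader once and retries the call.
     */
    public static <T> T callWithLazyLoad(Supplier<T> nativeCall, Runnable loader) {
        try {
            return nativeCall.get();
        } catch (UnsatisfiedLinkError notLinkedYet) {
            loader.run();            // may itself throw UnsatisfiedLinkError
            return nativeCall.get(); // a second failure propagates to the caller
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real native method: fails until "loaded" flips to true.
        boolean[] loaded = {false};
        Supplier<Integer> fakeNative = () -> {
            if (!loaded[0]) throw new UnsatisfiedLinkError("not loaded");
            return 42;
        };
        int result = callWithLazyLoad(fakeNative, () -> loaded[0] = true);
        System.out.println(result);
    }
}
```

Whether a retry like this behaves well when another class loader already holds the library is exactly the open question here.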

Also, note that libsnappyjava.so uses a statically linked snappy. I guess
function name collision between libsnappy.so and libsnappyjava.so is
not a real problem.

Is there any possibility that your code uses snappy-java from
different class loaders in the same JVM?

I guess Hadoop might use different class loaders to run map tasks. If
you embed snappy-java.jar and your map function classes in a single
jar file for running a MapReduce task in Hadoop, that would be a
problem. I often see similar problems when using JNI-based libraries
with Tomcat, which also creates several class loaders in a JVM.

If that is the case, adding snappy-java to the classpath of the Hadoop
node, rather than to the jars of your MapReduce programs, might solve
your problem.

Bests,


Alejandro Abdelnur

Jun 3, 2011, 1:29:50 AM
to xer...@googlegroups.com
Hi Taro,

Hadoop does not use classloaders, so that is not a problem there.

Plus you could try/catch the loading of the library.

We could first verify that in split mode (same as hadoop-snappy) things work, then we see how to make the loading robust.
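
The try/catch idea could look roughly like this. It is only a sketch: GuardedLoad and tryLoadLibrary are hypothetical names, and "snappy" is the library name that System.loadLibrary maps to libsnappy.so on Linux.

```java
public class GuardedLoad {
    /** Tries System.loadLibrary and reports success instead of throwing. */
    public static boolean tryLoadLibrary(String name) {
        try {
            System.loadLibrary(name); // searches java.library.path
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The result depends on what is installed on the machine.
        System.out.println("snappy loaded: " + tryLoadLibrary("snappy"));
    }
}
```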

Thanks.

Alejandro

Taro L. Saito

Jun 3, 2011, 1:44:45 AM
to xer...@googlegroups.com
Hi,

I would like to confirm where you put snappy-java-(version).jar:
in HADOOP_HOME/lib, or within another jar file passed to Hadoop?

If snappy-java is in HADOOP_HOME/lib and the task tracker node is
configured to load snappy-java from HADOOP_HOME/lib, the
UnsatisfiedLinkError is probably a bug in snappy-java (failure to find
a proper native library, or something else, e.g., function symbol
collision).

If snappy-java is somewhere else, the first call to Snappy will
succeed, but subsequent calls will fail.

This is my understanding.


Alejandro Abdelnur

Jun 3, 2011, 7:08:45 PM
to xer...@googlegroups.com
Taro,

I'm testing standalone, using snappy-java and hadoop-snappy. My classpath consists of the snappy-java JAR, the hadoop-snappy JAR and the hadoop-common JAR, all in the same classloader, with libsnappy.so in the LD_LIBRARY_PATH.

The UnsatisfiedLinkError is for a method, not for the library. Thus I believe that there is a symbol collision between libsnappy.so and libsnappyjava.so, the latter containing the objects from both libsnappy.so and libsnappyjava.so.
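
For reference, when a native method is resolved by name, the JVM looks for a symbol derived from the class and method names. The sketch below reproduces the short-name mangling rule from the JNI specification; JniNames and its methods are made up for illustration, and it assumes the bindings rely on this default lookup rather than on RegisterNatives.

```java
public class JniNames {
    /** Escapes one name per the JNI rules: '.' or '/' -> "_", '_' -> "_1", ';' -> "_2", '[' -> "_3". */
    public static String mangle(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
            if (alnum) out.append(c);
            else if (c == '.' || c == '/') out.append('_');
            else if (c == '_') out.append("_1");
            else if (c == ';') out.append("_2");
            else if (c == '[') out.append("_3");
            else out.append(String.format("_0%04x", (int) c)); // any other character
        }
        return out.toString();
    }

    /** Short (non-overloaded) symbol name for a native method. */
    public static String shortName(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        System.out.println(shortName("org.xerial.snappy.SnappyNative", "maxCompressedLength"));
    }
}
```

The printed name can be compared against the dynamic symbols listed by nm -D for each .so file.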

Thanks.

Alejandro

Taro L. Saito

Jun 3, 2011, 9:04:14 PM
to xer...@googlegroups.com
Hi,

Thank you for the information.

I checked the exported symbols in libsnappyjava.so using 'nm -C -D' and
found that the original snappy API from the statically linked library
is also exported. You are correct.

I hid the snappy API using -fvisibility=hidden option in gcc-4.x, and shipped
a new snapshot version:
http://maven.xerial.org/repository/snapshot/org/xerial/snappy/snappy-java/1.0.3-SNAPSHOT/snappy-java-1.0.3-20110604.005740-2.jar

The fix is as follows:
http://code.google.com/p/snappy-java/source/detail?r=60dd607e755e13d7e0378186a19ab1322cc1819c



Alejandro Abdelnur

Jun 3, 2011, 9:06:11 PM
to xer...@googlegroups.com
Thanks Taro,

I'll check this and I'll let you know.

Have a great weekend.

Alejandro