Since some big claims were made, I thought it would be nice to see if
there's truth to these claims. :)
The compression format itself is a variation of the usual Lempel-Ziv
format, and that alone does not quite explain why the thing should be
any faster than Snappy or LZF; but it is always possible that some
clever tricks can turbo-charge processing.
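For anyone less familiar with what "usual Lempel-Ziv format" means here, a toy sketch of the decoding side follows. The token layout (one control byte, high bit selecting literal run vs. back-reference) is invented purely for illustration; it is not the actual wire format of LZ4, Snappy or LZF, and the class name is hypothetical.

```java
import java.util.Arrays;

/** Toy LZ77-style decoder; the token layout is made up for illustration only. */
final class ToyLzDecoder {
    private ToyLzDecoder() {}

    /**
     * Control byte: low 7 bits hold (length - 1).
     * High bit clear: a literal run of that length follows.
     * High bit set: one more byte holds (offset - 1) of a back-reference
     * into the output produced so far.
     */
    public static byte[] decode(byte[] in) {
        byte[] out = new byte[Math.max(16, in.length * 2)];
        int outLen = 0;
        int i = 0;
        while (i < in.length) {
            int ctrl = in[i++] & 0xFF;
            int len = (ctrl & 0x7F) + 1;
            if (outLen + len > out.length) {
                out = Arrays.copyOf(out, Math.max(out.length * 2, outLen + len));
            }
            if ((ctrl & 0x80) == 0) {
                // Literal run: copy bytes straight from the input.
                if (i + len > in.length) {
                    throw new IllegalArgumentException("truncated literal run");
                }
                System.arraycopy(in, i, out, outLen, len);
                i += len;
                outLen += len;
            } else {
                // Back-reference: copy bytes already present in the output.
                if (i >= in.length) {
                    throw new IllegalArgumentException("truncated back-reference");
                }
                int offset = (in[i++] & 0xFF) + 1;
                if (offset > outLen) {
                    throw new IllegalArgumentException("offset points before start of output");
                }
                // Byte-by-byte so that overlapping copies (offset < len) repeat recent output.
                for (int k = 0; k < len; k++) {
                    out[outLen] = out[outLen - offset];
                    outLen++;
                }
            }
        }
        return Arrays.copyOf(out, outLen);
    }
}
```

All codecs in this family share roughly this inner loop, which is why speed differences tend to come from implementation tricks (copy widths, branch layout, hashing on the compression side) rather than from the format as such.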
Alas: I lost interest after 15 minutes, not finding a suitable JNI/jar
snippet to plug in; with more time I can add a test for it,
eventually.
But I figured that maybe someone else could have a longer attention span
and/or more time to spend on doing this? If so, I'd be glad to
integrate a new codec.
I assume the Hadoop ticket & project sources have enough bits and pieces,
so it'd mostly just be a task of hunting down the required stuff and
assembling it together.
-+ Tatu +-
P.S. I did add a codec for the Java conversion, but based on the code I
doubt that'd actually be as fast as the existing fastest codecs. But
perhaps it'd come in useful in creating a codec for the C/JNI version.
Ok thanks, I'll have a look.
> It's there :
> https://github.com/decster/jnicompressions
>
> Btw, even Snappy seems turbo-charged in this version
In what way? Snappy-java (iq80) was already rather fast...
-+ Tatu +-
Ah, never mind, so this is for the JNI-accessible C Snappy. It could
improve compression speed a bit, I suspect.
-+ Tatu +-
Hmmh. Not very impressive: it promptly fails to build on my Mac. :-p
Although the failure comes from the Snappy part, so maybe it's a Snappy
build failure.
If anyone has a jar ready, might it be easiest to just use that?
-+ Tatu +-
Thanks, good ideas, esp. latter. API is simple enough.
-+ Tatu +-
Sure, see below.
By the way, not sure if these Maven warnings might be relevant:
[WARNING] Some problems were encountered while building the effective model for com.github.decster:jnicompressions:jar:0.1.0
[WARNING] 'build.plugins.plugin.version' for org.codehaus.mojo:native-maven-plugin is missing. @ line 62, column 19
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-compiler-plugin is missing. @ line 35, column 15
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-surefire-plugin is missing. @ line 43, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
> Anyway I updated the project, now it contains pre-built jar & jnilib
> for intel 64bit macosx.
Thanks!
So the errors start with:
[INFO] OSNAME=Mac OS X
[INFO] snappy already exists
[INFO] Building using config macos x86_64
[INFO] In file included from /opt/java/jnicompressions/src/main/native/src/compressions_SnappyCompression.cc:23:
[INFO] /opt/java/jnicompressions/src/main/native/src/../snappy/snappy.h:45:33: error: snappy-stubs-public.h: No such file or directory
[INFO] In file included from /opt/java/jnicompressions/src/main/native/src/compressions_SnappyCompression.cc:23:
[INFO] /opt/java/jnicompressions/src/main/native/src/../snappy/snappy.h:59: error: ‘uint32’ has not been declared
[INFO] /opt/java/jnicompressions/src/main/native/src/../snappy/snappy.h:69: error: ‘string’ has not been declared
[INFO] /opt/java/jnicompressions/src/main/native/src/../snappy/snappy.h:78: error: ‘string’ has not been declared
[INFO] In file included from /opt/java/jnicompressions/src/main/native/snappy/snappy.cc:29:
[INFO] /opt/java/jnicompressions/src/main/native/snappy/snappy.h:45:33: error: snappy-stubs-public.h: No such file or directory
[INFO] In file included from /opt/java/jnicompressions/src/main/native/snappy/snappy-internal.h:34,
[INFO] from /opt/java/jnicompressions/src/main/native/snappy/snappy.cc:30:
[INFO] /opt/java/jnicompressions/src/main/native/snappy/snappy-stubs-internal.h:35:20: error: config.h: No such file or directory
[INFO] In file included from /opt/java/jnicompressions/src/main/native/snappy/snappy.cc:29:
[INFO] /opt/java/jnicompressions/src/main/native/snappy/snappy.h:59: error: ‘uint32’ has not been declared
[INFO] /opt/java/jnicompressions/src/main/native/snappy/snappy.h:69: error: ‘string’ has not been declared
[INFO] /opt/java/jnicompressions/src/main/native/snappy/snappy.h:78: error: ‘string’ has not been declared
[INFO] In file included from /opt/java/jnicompressions/src/main/native/snappy/snappy-internal.h:34,
[INFO] from /opt/java/jnicompressions/src/main/native/snappy/snappy.cc:30:
[INFO] /opt/java/jnicompressions/src/main/native/snappy/snappy-stubs-internal.h:95: error: ‘uint32’ does not name a type
...
----
(and as usual, native compilation produces much, much more; let me
know if more is needed)
Thank you for your help,
-+ Tatu +-
Works now, thank you!
-+ Tatu +-
I was just able to get things working and do some test runs of the
lz4/jni codec with the jvm-compressor benchmark.
Numbers do look good indeed!
One improvement suggestion I have is to do something similar to:
http://code.google.com/p/snappy-java/
where the native code is actually bundled in the jar. Without this I
have to add things to 'java.library.path', which is a bit cumbersome;
it would be nice if this could be improved.
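To make the bundling idea concrete, here is a rough sketch of the general technique: look up the native library as a classpath resource, copy it to a temp file, and load it by absolute path. The class name and the `/native/<os>/<arch>/` resource layout are assumptions made up for this example; I have not checked how snappy-java or jnicompressions actually lay things out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Locale;

/** Sketch of loading a JNI library bundled inside the jar (hypothetical layout). */
final class BundledNativeLoader {
    private BundledNativeLoader() {}

    /** Maps os.name / os.arch values to an assumed resource path inside the jar. */
    public static String resourcePathFor(String osName, String osArch, String libName) {
        String os = osName.toLowerCase(Locale.ROOT);
        String dir;
        String file;
        if (os.startsWith("mac")) {
            dir = "macosx";
            file = "lib" + libName + ".jnilib";
        } else if (os.startsWith("windows")) {
            dir = "windows";
            file = libName + ".dll";
        } else {
            dir = "linux";
            file = "lib" + libName + ".so";
        }
        return "/native/" + dir + "/" + osArch + "/" + file;
    }

    /** Copies the stream to a fresh temp directory and returns the extracted file. */
    public static Path extract(InputStream in, String fileName) throws IOException {
        Path dir = Files.createTempDirectory("jni-");
        Path target = dir.resolve(fileName);
        // deleteOnExit runs in reverse registration order: file first, then directory.
        dir.toFile().deleteOnExit();
        target.toFile().deleteOnExit();
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    /** Loads the bundled library if present, else falls back to java.library.path. */
    public static void load(String libName) throws IOException {
        String resource = resourcePathFor(
                System.getProperty("os.name"), System.getProperty("os.arch"), libName);
        try (InputStream in = BundledNativeLoader.class.getResourceAsStream(resource)) {
            if (in == null) {
                System.loadLibrary(libName);
                return;
            }
            String fileName = resource.substring(resource.lastIndexOf('/') + 1);
            System.load(extract(in, fileName).toAbsolutePath().toString());
        }
    }
}
```

A codec class would then call something like `BundledNativeLoader.load("lz4")` from a static initializer (the library name is a placeholder), so users would not need to touch 'java.library.path' unless their platform is not bundled.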
-+ Tatu +-