[nativelibs4java] Linux JavaCL problem identified with workaround

jcpalmer

May 20, 2010, 1:26:13 PM
to NativeLibs4Java, mb...@fh-landshut.de
First the good news (you cannot solve a problem until you identify
it): the problem with JavaCL on Linux / NVidia has been identified.
As I understand it, the problem is that the JVM and the device driver
both install handlers for the same OS signals. Eventually, the driver
steals a signal meant for the JVM, or maybe the other way around, and
death is instantaneous with no diagnostic file ever written out. The
issue is also a potential problem with AMD.

When I ran an OpenCL test program with a kernel loop in Netbeans
(implemented by 'ant run'), I was getting a "Java Result: 139"
written to System.err by Ant. Some web searches suggested I use
the "-Xcheck:jni" option. I got 5 warnings that looked like:
Warning: SIGSEGV handler expected:libjvm.so+0x54ac80 found:libnvidia-compiler.so.256.22+0x6df830
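
For readers wondering what the 139 means: by shell convention, an exit status above 128 indicates that the process was killed by signal (status - 128), so 139 corresponds to signal 11, which is SIGSEGV on Linux. A minimal sketch of that arithmetic follows; the function name is made up for illustration.

```shell
#!/bin/bash
# Exit statuses above 128 conventionally mean "terminated by signal (status - 128)".
# signal_number_from_status is a hypothetical helper name, not part of any tool.
signal_number_from_status() {
    local status=$1
    if [ "$status" -gt 128 ]; then
        echo $((status - 128))
    else
        echo 0   # the process was not killed by a signal
    fi
}

sig=$(signal_number_from_status 139)
echo "signal number: $sig"
# kill -l translates a signal number into its name
echo "signal name:   $(kill -l "$sig")"
```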

Some more web searches showed that Michael Bien had started
communicating about this 2 weeks ago. Doing anything more myself
(other than this thread) was not likely to yield a faster result, so I
am leaving management of this to him.

Even better, there is a workaround today, with caveats. If the
environment variable LD_PRELOAD is set to the path of the file
libjsig.so, the JVM chains signal handlers instead of letting native
code replace its own, so this is not a problem. This is NOT
possible to do with JWS, so the demo apps will not work (FYI, they are
out of commission right now, anyway). The second caveat is that it makes
testing while developing in an IDE more difficult, at least in Netbeans.

Here is what I do for Netbeans. Maybe someone else can post for other
IDEs. Create a script file on the desktop to build/run your project
outside the IDE. It should look like:
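#!/bin/bash
# Preload the JVM signal-chaining library; adjust the path for your JDK and architecture.
export LD_PRELOAD=/usr/lib/jvm/java-6-openjdk/jre/lib/amd64/libjsig.so
# Stop here if the project directory does not exist.
cd myProjectDir || exit 1
ant run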
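
The path to libjsig.so differs between JDK installs and architectures, so a variant of the script above could search for the library instead of hardcoding it. This is only a sketch: the helper names `find_libjsig` and `run_with_libjsig` are made up here, and it assumes JAVA_HOME (or whatever directory you pass) contains the JVM you actually run.

```shell
#!/bin/bash
# Print the path of the first libjsig.so found under the given JDK/JRE directory.
# If several architectures are installed, check that the match suits your JVM.
find_libjsig() {
    find "$1" -name libjsig.so -print 2>/dev/null | head -n 1
}

# Run a command with LD_PRELOAD set only for that command.
# Usage: run_with_libjsig <jdk-dir> <command> [args...]
run_with_libjsig() {
    local jdk_dir=$1 jsig
    shift
    jsig=$(find_libjsig "$jdk_dir")
    if [ -z "$jsig" ]; then
        echo "libjsig.so not found under $jdk_dir" >&2
        return 1
    fi
    LD_PRELOAD="$jsig" "$@"
}

# Example (not executed here):
#   cd myProjectDir && run_with_libjsig "$JAVA_HOME" ant run
```

Setting the variable on the command line rather than exporting it keeps the preload out of every other program started from the same shell.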

Do a file save in Netbeans, then double click the script file's icon,
& select "Run in Terminal". You probably also need to edit the
properties of Terminal, setting "When command exits" to "Hold the
terminal open". If you leave -Xcheck:jni as an option of the project,
you will see this as the first line of execution output:
[java] Info: libjsig is activated, all active signal checking is disabled
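
If you do not run with -Xcheck:jni, another way to confirm the preload took effect on Linux is to look for libjsig.so among the files mapped into the running JVM process, which the kernel lists in /proc/<pid>/maps. A rough sketch, with a made-up helper name:

```shell
#!/bin/bash
# Succeeds if the given maps file mentions libjsig.so.
# maps_has_libjsig is a hypothetical helper name used only for this sketch.
maps_has_libjsig() {
    grep -q 'libjsig\.so' "$1" 2>/dev/null
}

# Pass the PID of the running JVM as the first argument;
# without an argument this inspects the current shell instead.
pid=${1:-$$}
if maps_has_libjsig "/proc/$pid/maps"; then
    echo "libjsig is loaded in process $pid"
else
    echo "libjsig is NOT loaded in process $pid"
fi
```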

I just ran a 250,000 kernel execution loop with no problems.

Olivier Chafik

May 20, 2010, 5:34:48 PM
to nativel...@googlegroups.com
Wow, that's some excellent news !!!

I've created a wiki page where we/you can detail the issue and the workaround, feel free to edit it :

Thanks a lot Jeff... (and Michael !)
Cheers
--
Olivier


Frederic Godin

May 20, 2010, 5:56:13 PM
to nativel...@googlegroups.com
Glad to hear it!

Only I needed to hand in my project yesterday xD
But it's good to hear you found it!

Greetz

Fréderic


Shin Yoo

May 21, 2010, 7:36:08 AM
to NativeLibs4Java
I probably need to do more thorough testing of the workaround, but for
now it seems to work. What used to SEGFAULT now works. For Intel
machines, i386/libjsig.so also works fine.

Thanks!!

Benjamin

May 21, 2010, 11:36:54 AM
to NativeLibs4Java
Yep, I can confirm that this workaround works for me too. In Eclipse,
set the LD_PRELOAD environment variable in your run configuration.

jcpalmer

May 21, 2010, 1:42:11 PM
to NativeLibs4Java
Olivier,
I see that Eclipse can handle environment vars internally (via
Benjamin), and I am hoping this is not a permanent situation. I think
time is better spent on offense rather than making a workaround for
Netbeans look pretty in a formatted wiki.

To that end, I want to make contact with someone at IBM to see if this
might be a problem there. I would like to reference the hardware report
JWS at the same time, to see if someone is willing to post the results
of running it on an IBM BladeCenter QS22, IBM BladeCenter JS23, and / or
PS3. Something, probably very small, is wrong with the demo apps. I
have gleaned that OpenCL on Cell is actually implemented as two
devices, but that is about all I know. Could you please take a
look?

Thanks,

Jeff

Olivier Chafik

May 21, 2010, 2:58:31 PM
to nativel...@googlegroups.com
2010/5/21 jcpalmer <jeffrey....@gmail.com>
Something, probably very small, is wrong with the demo apps.  I
have gleaned that OpenCL on Cell is actually implemented as two
devices, but that is about all I know.  Could you please take a
look?

Sure, I'll check it out this weekend... last time I tried, ParticlesDemo ran fine (but that was a few weeks ago)... Is it crashing now ?

Cheers
--
Olivier

jcpalmer

May 21, 2010, 7:24:45 PM
to NativeLibs4Java
I think it is a JNLP mistype. The app fails to launch:

java.lang.NoClassDefFoundError: com/nativelibs4java/opencl/CLPlatform
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2427)
    at java.lang.Class.getMethod0(Class.java:2670)
    at java.lang.Class.getMethod(Class.java:1603)
    at com.sun.javaws.Launcher.executeApplication(Launcher.java:1509)
    at com.sun.javaws.Launcher.executeMainClass(Launcher.java:1466)
    at com.sun.javaws.Launcher.doLaunchApp(Launcher.java:1277)
    at com.sun.javaws.Launcher.run(Launcher.java:117)
    at java.lang.Thread.run(Thread.java:637)
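
One way to narrow down a NoClassDefFoundError like this is to check which jar, if any, actually contains the missing class, and then compare that against the jars the JNLP file lists. The sketch below assumes `unzip` is installed and that the jars have been downloaded into a local directory; the helper name `jar_has_class` and the directory argument are made up for illustration.

```shell
#!/bin/bash
# Succeeds if the given jar lists the given class file
# (path form, e.g. com/foo/Bar.class). Hypothetical helper name.
jar_has_class() {
    local jar=$1 class_file=$2
    unzip -l "$jar" 2>/dev/null | grep -qF "$class_file"
}

# Report which jars in a directory contain the class the launcher could not find.
# "$1" is a hypothetical directory holding the jars referenced by the JNLP file.
for jar in "${1:-.}"/*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    if jar_has_class "$jar" 'com/nativelibs4java/opencl/CLPlatform.class'; then
        echo "found in: $jar"
    fi
done
```

If no jar reports the class, the jar that provides it is probably missing from the JNLP resources or misspelled there.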

Olivier Chafik

May 25, 2010, 8:21:14 AM
to nativel...@googlegroups.com
Hi Jeff,

I fixed that yesterday, so you can happily launch HardwareReport again :-)

Cheers
--
Olivier


jcpalmer

May 25, 2010, 10:45:33 AM
to NativeLibs4Java
Olivier,
Yes, it is up again. Thanks! I'll see if someone will run this when I
ask if OpenCL signaling will cause problems for Java bindings. I'll
have to just post on their site. I contracted at IBM's Local / Kodak
data center in the late 90's, but that was a long time ago. I am not
hopeful that they currently implement the poorly named, Image Access
Method. It can be used to randomly get any 4 things in a single
transaction, e.g. Open-high-low-close.

If they do not, then I can put them out of my mind for the short
term. It's good to have those running anyway though.

Jeff