Help with a SegFault when calling native code.


Avik Sengupta

Nov 29, 2014, 6:20:51 AM
to julia...@googlegroups.com
So I am trying to get JavaCall.jl working with JDK8. It is currently supported only on JDK7. Theoretically, everything should be backward compatible and work out of the box, but of course, the gap between theory and practice... etc.

While everything works with 1.7, I get a seg fault with 1.8, but strangely the process does not exit. The call looks like this:

res = ccall(create, Cint, (Ptr{Ptr{JavaVM}}, Ptr{Ptr{JNIEnv}}, Ptr{J.JavaVMInitArgs}), ppjvm, ppenv, &vm_args)

This call is of course deeply intertwined with the specifics of the JVM, so I'm really looking for help with some ideas to debug this.

So when I make this call in 1.8, I see the following message printed on screen:

signal (11): Segmentation fault: 11
unknown function (ip: 311308980)

However, strangely, the call still returns a value indicating success. And (most) subsequent calls to the JVM also return successfully. In fact, the entire JavaCall.jl test suite runs successfully. The "unknown function" message seems to come out of task.c, but I don't understand where the segfault is trapped.

When I run julia within lldb, I don't seem to get much information:

signal (11): Segmentation fault: 11
unknown function (ip: 314786484)
Process 86657 stopped
* thread #1: tid = 0xe2de0, 0x0000000112c342b4, queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x0000000112c342b4
-> 0x112c342b4:  movl   (%rsi), %eax
   0x112c342b6:  leaq   0xf8(%rbp), %rsi
   0x112c342bd:  vmovdqu %ymm0, (%rsi)
   0x112c342c1:  vmovdqu %ymm7, 0x20(%rsi)
(lldb) bt
* thread #1: tid = 0xe2de0, 0x0000000112c342b4, queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
  * frame #0: 0x0000000112c342b4

So what can I do to debug this further?

Thanks
-
Avik

Isaiah Norton

Nov 29, 2014, 12:45:15 PM
to julia...@googlegroups.com
You can see where the SEGV handler is set up in init.c (see mach_segv_listener).
It might also be useful to set a break in the "create" function on the JVM C API side, and step through from there.

Avik Sengupta

Nov 29, 2014, 6:02:41 PM
to julia...@googlegroups.com
Thanks Isaiah, that helps. Your hints have let me debug further into this.

The bad news, however, is that the fault is raised in some hand-coded assembly that seems to do CPU detection, in code that looks like this (though probably not exactly this): http://hg.openjdk.java.net/hsx/hsx25/hotspot/file/0c94c41dcd70/src/cpu/x86/vm/vm_version_x86.cpp#l95

I presume the truncated backtrace is due to jumping into assembly. The backtrace disappears in lldb as soon as execution goes into the hand-generated assembly.

I've confirmed that raw C code calling this function works correctly. Also, this section of the code, and its callers, do not take any parameters, so it is unlikely that the Julia to C translation of the parameters is at fault. So yeah, stumped :(

Isaiah Norton

Dec 20, 2014, 11:46:21 PM
to julia...@googlegroups.com
Hi Avik,

It looks like it is still an issue because version 1_6 is still asserted in JavaCall.init?

If I disable that assert, on linux I am able to run your examples on the doc page, and Pkg.test("JavaCall") completes successfully against java-8-oracle. Probably doesn't help much, but that is a data point on a different OS for you.

Some random debugging ideas:
- valgrind's gdbserver can be very useful for general debugging in situations when gdb can't otherwise trap or follow threads reliably.
- you could try calling JNI_CreateJavaVM in a C shared library invoked from Julia
- try this from a Julia built with LLVM_VER=svn ... You might get decent backtraces from the pure-Julia code (or maybe not at all? I do on linux, but there were some issues with backtraces on OS X and I am not sure if they have been resolved).
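
For the shared-library idea, something along these lines might do as a starting point. It is only a sketch under assumptions: shim_create_vm, the Shim* types and the -100/-101 error codes are made up for illustration, and the structs are hand-written stand-ins for the jni.h declarations so that it compiles without a JDK (a real build should include jni.h instead). The libjvm path has to be supplied by the caller.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for JavaVMOption / JavaVMInitArgs from jni.h (assumed layout:
 * jint is a 32-bit int, jboolean is an unsigned char). */
typedef struct {
    char *optionString;
    void *extraInfo;
} ShimVMOption;

typedef struct {
    int version;
    int nOptions;
    ShimVMOption *options;
    unsigned char ignoreUnrecognized;
} ShimVMInitArgs;

/* Same shape as JNI_CreateJavaVM(JavaVM **pvm, void **penv, void *args). */
typedef int (*create_vm_fn)(void **pvm, void **penv, void *args);

#define SHIM_JNI_VERSION_1_6 0x00010006 /* value of JNI_VERSION_1_6 */

/* Hypothetical shim: loads libjvm from the given path and calls
 * JNI_CreateJavaVM with no options. Returns the JNI result (0 on success),
 * or the made-up codes -100 (dlopen failed) / -101 (symbol not found). */
int shim_create_vm(const char *libjvm_path, void **pvm, void **penv)
{
    void *handle;
    create_vm_fn create = NULL;
    ShimVMInitArgs args;

    assert(libjvm_path != NULL);

    handle = dlopen(libjvm_path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -100;
    }

    /* POSIX idiom for turning the dlsym result into a function pointer. */
    *(void **)(&create) = dlsym(handle, "JNI_CreateJavaVM");
    if (create == NULL) {
        fprintf(stderr, "JNI_CreateJavaVM not found\n");
        dlclose(handle);
        return -101;
    }

    args.version = SHIM_JNI_VERSION_1_6;
    args.nOptions = 0;
    args.options = NULL;
    args.ignoreUnrecognized = 0;

    /* The handle is deliberately left open: the JVM must stay loaded. */
    return create(pvm, penv, &args);
}
```

Built as a shared library (older glibc versions need -ldl at link time), this could be invoked from Julia with ccall in place of the direct call. If the segfault message still shows up, that would suggest the Julia-side argument marshalling is not the cause.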

Isaiah