SIGALRM using g95 compiled Matlab mex files


bcom...@gmail.com

Oct 9, 2006, 3:44:31 PM
to gg95
I'm using g95 to compile fortran mex files to call from Matlab. They
compile and run fine, but about 9-10 minutes after I execute them, my
Matlab session will terminate with an "Alarm clock" message. I tried
running Matlab through gdb, and got messages like this after the 9-10
minute period:

Cannot find user-level thread for LWP 4532: generic error
(gdb) backtrace
#0 0x00002b955c467747 in kill () from /lib/libc.so.6
Cannot fetch general-purpose registers for thread 47920578785888:
generic error
(gdb) bt
#0 0x00002b955c467747 in kill () from /lib/libc.so.6

I'm using gcc 4.1 and Matlab 2006a (or 2006b) on a Linux amd64
platform. A trivial example mex file (yprimef) works and does not stop
Matlab with a SIGALRM. Others have had similar problems (see link
below); Chris reports offline that his issue remains unresolved using
g95, but does not occur using PGI compilers.

http://groups.google.com/group/gg95/browse_frm/thread/22826e722dded059/d5d6689d241429d4?lnk=gst&q=SIGALRM&rnum=1#d5d6689d241429d4

Has anyone seen this issue, or have a guess where I can look to further
troubleshoot? I plan on trying an older version of gcc, and perhaps
gfortran for comparison. I'll try writing a signal handler for the
Fortran code and see if I can get any more information.

Bill

bcom...@gmail.com

Oct 9, 2006, 6:44:04 PM
to gg95

update: same code compiled with gfortran does not cause Matlab to
exit.

Bill

Andy

Oct 12, 2006, 12:57:37 PM
to gg95
Weird. SIGALRM should not be set anywhere. It's used for the corefile
resume, but the x86_64 version isn't merged yet (it doesn't quite
work). I would run your program under strace and see if you can see
where the alarm() call is being called-- that is what is eventually
causing the crash.

Andy

Chris Hulbert

Oct 13, 2006, 7:58:57 AM
to gg95

From my past experiences, it's Matlab's polling that raises a SIGALRM,
the g95 MEX functions seem to abort on this signal.

Here's a Matlab session backtracing the SIGALRM:

[chulbert@mellin ~]$ cat test_g95_mex.f90
SUBROUTINE MEXFUNCTION(nlhs,plhs,nrhs,prhs)
INTEGER :: nlhs,nrhs
INTEGER(kind=8) :: plhs(nlhs),prhs(nrhs)
END SUBROUTINE MEXFUNCTION
[chulbert@mellin ~]$ g95 -shared -o test_g95_mex.mexa64 test_g95_mex.f90
[chulbert@mellin ~]$ /usr/local/bin/matlab -Dgdb
GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
(no debugging symbols found)
Using host libthread_db library "/lib64/libthread_db.so.1".

(gdb) handle SIGALRM stop
Signal Stop Print Pass to program Description
SIGALRM Yes Yes Yes Alarm clock
(gdb) r -nojvm
Starting program: /usr/local/matlab/R2006b/bin/glnxa64/MATLAB -nojvm
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 46912601027440 (LWP 24182)]
[New Thread 1084229952 (LWP 24187)]
[Thread 1084229952 (LWP 24187) exited]

< M A T L A B >
Copyright 1984-2006 The MathWorks, Inc.
Version 7.3.0.298 (R2006b)
August 03, 2006

Detaching after fork from child process 24188.

To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

>> test_g95_mex
>>
>> clear all
>>
Program received signal SIGALRM, Alarm clock.
[Switching to Thread 46912601027440 (LWP 24182)]
0x00000030c2cc3086 in poll () from /lib64/libc.so.6
(gdb) bt
#0 0x00000030c2cc3086 in poll () from /lib64/libc.so.6
#1 0x00000030c862d40a in _XtWaitForSomething () from /usr/lib64/libXt.so.6
#2 0x00000030c862e543 in XtAppNextEvent () from /usr/lib64/libXt.so.6
#3 0x00002aaaaeba1a5b in UIX_AssertXThread_internal ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwuix.so
#4 0x00002aaaaeba1d0e in UIX_AssertXThread_internal ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwuix.so
#5 0x00002aaaab123330 in ioGetCharNoEcho ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwservices.so
#6 0x00002aaaab47215a in iolib::IOProxy::ReportIqmRequest ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwbridge.so
#7 0x00002aaaab4733a0 in ioCmdLineEditLoad ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwbridge.so
#8 0x00002aaaab47a40b in mnGetExecStatusAsInt ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwbridge.so
#9 0x00002aaaab47a52a in mnGetExecStatusAsInt ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwbridge.so
#10 0x00002aaaab47ac7b in mnParser ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwbridge.so
#11 0x00002aaaab5af503 in mcrInstance::mnParser ()
from /usr/local/matlab/R2006b/bin/glnxa64/libmwmcr.so
#12 0x000000000040159a in ?? ()
#13 0x00000030c2c1c784 in __libc_start_main () from /lib64/libc.so.6
#14 0x00000000004013da in ?? ()
#15 0x00007fffeeec3e98 in ?? ()
#16 0x0000000000000000 in ?? ()

Andy

Oct 15, 2006, 2:46:56 PM
to gg95

Chris Hulbert wrote:
> Andy wrote:
> > Weird. SIGALRM should not be set anywhere. It's used for the corefile
> > resume, but the x86_64 version isn't merged yet (it doesn't quite
> > work). I would run your program under strace and see if you can see
> > where the alarm() call is being called-- that is what is eventually
> > causing the crash.
> >
> > Andy
>
> From my past experiences, it's Matlab's polling that raises a SIGALRM,
> the g95 MEX functions seem to abort on this signal.
>

Give it a try now. The library initialization was setting up a
handler for SIGALRM that was never being used. It's used on some
platforms to implement checkfiles. Although adding that capability to
x86_64 is in the works, I am going to work things such that it is
disabled if you are running from a non-fortran main program. Let me
know how it goes.

Andy

bcom...@gmail.com

Oct 15, 2006, 7:04:28 PM
to gg95

I'm getting the same results that Chris posted using the latest g95
source, a SIGALRM with the same stack trace. The last few lines from
strace:

18264 16:18:05.379524 poll([{fd=4, events=POLLIN}, {fd=0, events=POLLIN|POLLPRI}], 2, 499) = -1 EINTR (Interrupted system call)
18264 16:18:05.698031 --- SIGALRM (Alarm clock) @ 0 (0) ---
18264 16:18:05.702710 +++ killed by SIGALRM +++
18324 16:18:05.703331 <... read resumed> "", 4) = 0
18324 16:18:05.703595 close(3) = 0
18324 16:18:05.703651 exit_group(0) = ?

Bill

bcom...@gmail.com

Oct 16, 2006, 12:44:04 PM
to gg95

Maybe I didn't think this through... I was telling gdb to stop on
SIGALRM, which apparently Matlab is generating. So it should stop
whether or not g95 has redefined the SIGALRM handler. I'm running
now without gdb, and no unhandled SIGALRMs ("Alarm clock") thus far.
If I see any I'll report back.

Bill

bcom...@gmail.com

Nov 1, 2006, 1:02:11 PM
to gg95
No SIGALRM timeouts to report; the fix appears to work fine.

However, there is the related issue of SIGINT. After running a
g95-compiled MEX file in Matlab, a Ctrl-C will cause Matlab to exit
immediately. Normally Matlab will intercept the SIGINT, interrupt
any running Matlab process, and return to the Matlab command line. Can
a fix similar to the SIGALRM issue be applied for SIGINT to prevent the
Matlab SIGINT handler from being redefined?

Bill

Zack

Nov 21, 2006, 1:24:50 AM
to gg95
I recently installed g95 on a newly arrived MacBook Pro Core 2 Duo and
ran into the same problem when trying to get some Fortran mex functions
written previously on a Linux box to work. It seems from the thread
that there is a fix for Linux now; is there anything that could be done
for Mac also?

Zack
