
Problem during "make test"


Harry Jackson

Dec 29, 2003, 5:12:41 PM
to perl6-i...@perl.org
During

[parrot@cpc5-lutn1-6-0-cust175 parrot]$ make test
echo imcc/imcc.y -d -o imcc/imcparser.c
imcc/imcc.y -d -o imcc/imcparser.c
perl -e 'open(A,qq{>>$_}) or die foreach @ARGV' imcc/imcc.y.flag
imcc/imcparser.c imcc/imcparser.h
perl t/harness --gc-debug --running-make-test -b t/op/*.t t/pmc/*.t
t/native_pbc/*.t
t/op/00ff-dos...........

This is as far as it gets. I am assuming since no one else has noticed
this that it is a problem with my set up but I am at a bit of a loss as
to what has happened to cause it.

It gets even stranger. If I do a make clean and make test again it does
not necessarily stop in the same place each time ie.

perl t/harness --gc-debug --running-make-test -b t/op/*.t t/pmc/*.t
t/native_pbc/*.t
t/op/00ff-dos...........ok

t/op/00ff-unix..........

Sometimes it gets as far as the arithmetic tests. Has anyone seen this
before? My myconfig is at the bottom of the page.


On a side note I noticed some warnings about a predeclared variable in
/parrot/ops/core.ops line 1059. Patch attached.

Summary of my parrot 0.0.13 configuration:
configdate='Mon Dec 29 21:21:06 2003'
Platform:
osname=linux, archname=i386-linux
jitcapable=1, jitarchname=i386-linux,
jitosname=LINUX, jitcpuarch=i386
execcapable=1
perl=perl
Compiler:
cc='gcc', ccflags=' -I/usr/local/include',
Linker and Libraries:
ld='gcc', ldflags=' -L/usr/local/lib',
cc_ldflags='',
libs='-lnsl -ldl -lm -lcrypt -lutil -lpthread'
Dynamic Linking:
so='.so', ld_shared='-shared -L/usr/local/lib',
ld_shared_flags=''
Types:
iv=long, intvalsize=4, intsize=4, opcode_t=long, opcode_t_size=4,
ptrsize=4, ptr_alignment=4 byteorder=1234,
nv=double, numvalsize=8, doublesize=8

core.ops.diff

Jeff Clites

Dec 30, 2003, 3:55:20 PM
to ha...@uklug.co.uk, perl6-i...@perl.org
On Dec 29, 2003, at 2:12 PM, Harry Jackson wrote:

> During
>
> [parrot@cpc5-lutn1-6-0-cust175 parrot]$ make test
> echo imcc/imcc.y -d -o imcc/imcparser.c
> imcc/imcc.y -d -o imcc/imcparser.c
> perl -e 'open(A,qq{>>$_}) or die foreach @ARGV' imcc/imcc.y.flag
> imcc/imcparser.c imcc/imcparser.h
> perl t/harness --gc-debug --running-make-test -b t/op/*.t t/pmc/*.t
> t/native_pbc/*.t
> t/op/00ff-dos...........
>
> This is as far as it gets. I am assuming since no one else has noticed
> this that it is a problem with my set up but I am at a bit of a loss
> as to what has happened to cause it.
>
> It gets even stranger. If I do a make clean and make test again it
> does not necessarily stop in the same place each time ie.

Here are 3 things to try:

1) When it hangs there, check with 'top' to see if it is using CPU (ie,
is it blocking, or in an infinite loop).

2) Try running one of the tests which blocks, individually. If you can
get it to happen this way, then run it in gdb and see what it's doing.
(Or, attach to an already blocked one from 'make test'--this is
assuming it's parrot that's actually blocking, and not t/harness.)

3) Try building from a clean checkout, and see if that shows the
problem. If not, it's probably something you've changed and don't
realize.

JEff
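
Since the freeze is intermittent, a small script can put a number on
points 1 and 2 before reaching for gdb. What follows is a minimal sketch
in Python, not something from the Parrot tree: `count_hangs` is a made-up
helper, and the 10-second default timeout is an arbitrary assumption that
only needs to be longer than a healthy run of the test.

```python
import subprocess
import sys


def count_hangs(cmd, runs=20, timeout=10.0):
    """Run cmd `runs` times and count the runs that exceed `timeout` seconds.

    cmd is an argument list such as ["perl", "-w", "t/op/arithmetics.t"].
    """
    hangs = 0
    for _ in range(runs):
        try:
            # Output is discarded; only "did it finish in time?" matters here.
            subprocess.run(cmd, stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL, timeout=timeout)
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the direct child before raising this.
            hangs += 1
    return hangs
```

For example, `count_hangs(["perl", "-w", "t/op/arithmetics.t"], runs=50)`
would report how many of 50 runs had to be killed (the command line is an
assumption about how the test gets invoked). Only the direct child is
killed on timeout, so a hung grandchild process may still need cleaning up
by hand.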

Harry Jackson

Dec 30, 2003, 6:11:34 PM
to perl6-i...@perl.org
Jeff Clites wrote:

>
> Here are 3 things to try:
>
> 1) When it hangs there, check with 'top' to see if it is using CPU (ie,
> is it blocking, or in an infinite loop).

Already done that and it is eating no cycles.

> 2) Try running one of the tests which blocks, individually. If you can
> get it to happen this way, then run it in gdb and see what it's doing.
> (Or, attach to an already blocked one from 'make test'--this is assuming
> it's parrot that's actually blocking, and not t/harness.)

When run individually I get the same error. Complete freeze at what
appears to be an arbitrary point.

Running gdb

0x080a7625 in Perl_sv_gets ()
(gdb) n
Single stepping until exit from function Perl_sv_gets,
which has no line number information.
0x0809d254 in Perl_do_readline ()
(gdb) n
Single stepping until exit from function Perl_do_readline,
which has no line number information.
0x08099fd8 in Perl_runops_standard ()
(gdb) n
Single stepping until exit from function Perl_runops_standard,
which has no line number information.
t/op/arithmetics....ok 13/18


This is where gdb freezes execution. CTL-C then frees it up to continue
until the next one freezes.


> 3) Try building from a clean checkout, and see if that shows the
> problem. If not, it's probably something you've changed and don't realize.

I have tried something a bit more drastic. Deleted the entire tree and
downloaded it again (sorry about the bandwidth).

I have also tried strace and got the following.

mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x401fe000
read(3, "#! perl -w\n\nuse Parrot::Test tes"..., 4096) = 4096
brk(0x81e2000) = 0x81e2000
close(3) = 0
munmap(0x401fe000, 4096) = 0
pipe([3, 4]) = 0
pipe([5, 6]) = 0
fork() = 19676
close(4) = 0
close(6) = 0
read(5, "", 4) = 0
close(5) = 0
fcntl64(0x3, 0x3, 0xbffff5c4, 0) = 0
fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x401fe000
_llseek(3, 0, 0xbffff410, SEEK_CUR) = -1 ESPIPE (Illegal seek)
fcntl64(0x3, 0x2, 0x1, 0x1d) = 0
read(3, "1..18\n", 4096) = 6
read(3,


Doing a "ps ax" reveals the following (ignore the test number it keeps
changing)


20802 pts/11 S 0:00 make test
21598 pts/11 S 0:00 perl t/harness --gc-debug --running-make-test
-b t/op/00ff-dos.t t/op/00ff-unix.t t/op/arithmetics.t t/op/basic.t
21610 pts/11 S 0:00 perl -w t/op/arithmetics.t
21620 pts/11 S 0:00 ./parrot --gc-debug -b t/op/arithmetics_4.pasm
21621 pts/11 S 0:00 ./parrot --gc-debug -b t/op/arithmetics_4.pasm
21622 pts/11 S 0:00 ./parrot --gc-debug -b t/op/arithmetics_4.pasm

From all of this I am guessing that something has corrupted a module in
Perl at least that is all I can think of.

Harry Jackson

Jeff Clites

Dec 30, 2003, 7:37:35 PM
to ha...@uklug.co.uk, perl6-i...@perl.org
On Dec 30, 2003, at 3:11 PM, Harry Jackson wrote:

>> 2) Try running one of the tests which blocks, individually. If you
>> can get it to happen this way, then run it in gdb and see what it's
>> doing. (Or, attach to an already blocked one from 'make test'--this
>> is assuming it's parrot that's actually blocking, and not t/harness.)
>
> When run individually I get the same error. Complete freeze at what
> appears to be an arbitrary point.
>
> Running gdb
>
> 0x080a7625 in Perl_sv_gets ()

I meant try running parrot in the debugger--Perl is probably hanging
b/c it's waiting for parrot to exit. For instance, see if the following
hangs, and if so run it in gdb:

./parrot --gc-debug -b t/op/arithmetics_4.pasm

> This is where gdb freezes execution. CTL-C then frees it up to
> continue until the next one freezes.

The ctrl-C is probably killing the subprocess (parrot), which does
imply that it's parrot that's hanging.

>> 3) Try building from a clean checkout, and see if that shows the
>> problem. If not, it's probably something you've changed and don't
>> realize.
>
> I have tried something a bit more drastic. Deleted the entire tree and
> downloaded it again (sorry about the bandwidth).

Yep, that's what I meant.

> I have also tried strace and got the following.

Try this on parrot rather than Perl.

> Doing a "ps ax" reveals the following (ignore the test number it keeps
> changing)
>
>
> 20802 pts/11 S 0:00 make test
> 21598 pts/11 S 0:00 perl t/harness --gc-debug
> --running-make-test -b t/op/00ff-dos.t t/op/00ff-unix.t
> t/op/arithmetics.t t/op/basic.t
> 21610 pts/11 S 0:00 perl -w t/op/arithmetics.t
> 21620 pts/11 S 0:00 ./parrot --gc-debug -b
> t/op/arithmetics_4.pasm
> 21621 pts/11 S 0:00 ./parrot --gc-debug -b
> t/op/arithmetics_4.pasm
> 21622 pts/11 S 0:00 ./parrot --gc-debug -b
> t/op/arithmetics_4.pasm

That does seem odd that it appears to be running the same test 3
times....

JEff

Harry Jackson

Dec 30, 2003, 8:19:31 PM
to perl6-i...@perl.org
Jeff Clites wrote:
> On Dec 30, 2003, at 3:11 PM, Harry Jackson wrote:
>
>>> 2) Try running one of the tests which blocks, individually. If you
>>> can get it to happen this way, then run it in gdb and see what it's
>>> doing. (Or, attach to an already blocked one from 'make test'--this
>>> is assuming it's parrot that's actually blocking, and not t/harness.)
>>
>>
>> When run individually I get the same error. Complete freeze at what
>> appears to be an arbitrary point.
>>
>> Running gdb
>>
>> 0x080a7625 in Perl_sv_gets ()
>
>
> I meant try running parrot in the debugger--Perl is probably hanging b/c
> it's waiting for parrot to exit. For instance, see if the following
> hangs, and if so run it in gdb:
>
> ./parrot --gc-debug -b t/op/arithmetics_4.pasm

[parrot]$ ./parrot --gc-debug -b t/op/arithmetics_4.pasm
4123
4123
[parrot]$ ./parrot --gc-debug -b t/op/arithmetics_3.pasm
3877
3877
[parrot]$ ./parrot --gc-debug -b t/op/arithmetics_2.pasm
0
1234567890
1234567890
0
1234567890
1234567890
[parrot]$ ./parrot --gc-debug -b t/op/arithmetics_1.pasm
0
-1234567890
1234567890
0
-1234567890
1234567890


Looks good to me. I have taken the drastic measure of upgrading to 5.8.2
and....... no change. I am still locking up during tests, or should I
say, parrot is still locking up during tests, I seem to be continuing
along fine which is why I am still complaining ;-)

>> I have also tried strace and got the following.
>
>
> Try this on parrot rather than Perl.


strace on parrot gets to

rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
wait4(-1, [WIFEXITED(s) && WEXITSTATUS(s) == 0], 0, NULL) = 8423
--- SIGCHLD (Child exited) ---
sigreturn() = ? (mask now [])
write(1, ": blib/lib/libparrot.a\n", 23: blib/lib/libparrot.a
) = 23
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT TERM XCPU XFSZ], NULL, 8) = 0
vfork() = 8424
--- SIGCHLD (Child exited) ---
sigreturn() = ? (mask now [HUP INT QUIT TERM
XCPU XFSZ])
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
wait4(-1, [WIFEXITED(s) && WEXITSTATUS(s) == 0], 0, NULL) = 8424
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT TERM XCPU XFSZ], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
stat64("blib/lib/libparrot.a", {st_mode=S_IFREG|0664, st_size=18102092,
...}) = 0
write(1, "cc -o parrot -Wl,-E -g imcc/ma"..., 98cc -o parrot -Wl,-E
-g imcc/main.o blib/lib/libparrot.a -lnsl -ldl -lm -lcrypt -lutil -lpthread
) = 98
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT TERM XCPU XFSZ], NULL, 8) = 0
vfork() = 8425
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
wait4(-1, [WIFEXITED(s) && WEXITSTATUS(s) == 0], 0, NULL) = 8425
--- SIGCHLD (Child exited) ---
sigreturn() = ? (mask now [])
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT TERM XCPU XFSZ], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
stat64("parrot", {st_mode=S_IFREG|0775, st_size=4133629, ...}) = 0
stat64("test_prep", 0xbfffddf0) = -1 ENOENT (No such file or
directory)
stat64("testb", 0xbfffddf0) = -1 ENOENT (No such file or
directory)
write(1, "/usr/local/bin/perl5.8.2 t/harne"...,
106/usr/local/bin/perl5.8.2 t/harness --gc-debug --running-make-test -b
t/op/*.t t/pmc/*.t t/native_pbc/*.t
) = 106
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT TERM XCPU XFSZ], NULL, 8) = 0
vfork() = 8428
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
t/op/00ff-dos...........ok

t/op/00ff-unix..........ok

t/op/arithmetics........ok 15/18

freeze punk it's the police.

I am now convinced that, given the baffling nature of the problem,
it will be something stupid.

Harry Jackson

Leopold Toetsch

Dec 31, 2003, 4:32:16 AM
to ha...@uklug.co.uk, perl6-i...@perl.org
Harry Jackson <ha...@uklug.co.uk> wrote:

> It gets even stranger. If I do a make clean and make test again it does
> not necessarily stop in the same place each time ie.

Do you have an SMP machine with SMP enabled in your OS?
The unpredictable behavior of your freezes makes me think that it could
be related to multi-threading. OTOH arithmetic tests and the like don't
utilize threads, and no events are being generated.

leo

Harry Jackson

Jan 1, 2004, 6:29:31 PM
to perl6-i...@perl.org
Leopold Toetsch wrote:
>
> Do you have an SMP machine with SMP enabled in your OS?
> The unpredictable behavior of your freezes makes me think that it could
> be related to multi-threading. OTOH arithmetic tests and the like don't
> utilize threads, and no events are being generated.

I am running a Cray X1 ( I wish ). I am on a redhat box on a bog
standard single Athlon XP1700. I am stumped.

Harry

Jeff Clites

Jan 1, 2004, 6:47:11 PM
to ha...@uklug.co.uk, perl6-i...@perl.org

That looks like you ran strace on 'make'.

Here's one more thing to investigate: When you get to the point where
it freezes, run ps with the -j option, to display the parent pid of
each process (so maybe 'ps -jax'). (Do this in a separate terminal
session, of course.) Find whatever process is at the bottom of the
"tree" of processes descending from 'make' (that is, 'make' should be
the parent of 'perl t/harness', which should be the parent of another
perl process running a ".t" script, which should be the parent of some
parrot process running an individual test), then try to 'gdb attach' to
that pid, and do a backtrace to see where it is hanging.

JEff
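
Walking the tree by eye gets tedious when the hang has to be caught
repeatedly, so here is a rough Python sketch of the same idea. It is only
an illustration: `parse_ps` and `leaf_processes` are hypothetical helper
names, and the input is assumed to be two-column PID/PPID output such as
`ps -eo pid,ppid` prints on Linux.

```python
def parse_ps(text):
    """Turn two-column 'PID PPID' text into a list of (pid, ppid) tuples.

    Header lines and anything else non-numeric are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            rows.append((int(fields[0]), int(fields[1])))
    return rows


def leaf_processes(rows, root_pid):
    """Return the PIDs below root_pid that have no children of their own."""
    children = {}
    for pid, ppid in rows:
        children.setdefault(ppid, []).append(pid)

    leaves = []
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        kids = children.get(pid, [])
        if kids:
            stack.extend(kids)      # keep descending
        elif pid != root_pid:
            leaves.append(pid)      # bottom of the tree: a gdb candidate
    return sorted(leaves)
```

Given the PID of 'make' as `root_pid`, the returned PIDs are the ones
worth attaching gdb to. More than one leaf may come back if the bottom
process shows up as several entries in the process list.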

Dan Sugalski

Jan 1, 2004, 6:53:03 PM
to ha...@uklug.co.uk, perl6-i...@perl.org

Just out of curiosity... what version of gcc are you running? We were
having no end of problems with the JIT and one of the mutant versions
of 2.95 that redhat was packaging up at one point.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
d...@sidhe.org                         have teddy bears and even
                                      teddy bears get drunk

Harry Jackson

Jan 1, 2004, 7:34:29 PM
to perl6-i...@perl.org
Dan Sugalski wrote:
> Just out of curiosity... what version of gcc are you running? We were
> having no end of problems with the JIT and one of the mutant versions of
> 2.95 that redhat was packaging up at one point.

[parrot@cpc5-lutn1-6-0-cust175 pbin]$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
gcc version 2.96 20000731 (Red Hat Linux 7.2 2.96-108.1)


RHAS 2.1 dev edition

Harry Jackson

Dan Sugalski

Jan 1, 2004, 7:49:42 PM
to ha...@uklug.co.uk, perl6-i...@perl.org

Yeah, that was the one, unfortunately. Try disabling the JIT during
configuration and seeing if that takes care of the problem. If so,
we'll need to update configure to do that automatically, since I
think we're going to end up running into this for years. :(

Harry Jackson

Jan 2, 2004, 6:52:40 AM
to perl6-i...@perl.org
Dan Sugalski wrote:
>>
>> [parrot@cpc5-lutn1-6-0-cust175 pbin]$ gcc -v
>> Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
>> gcc version 2.96 20000731 (Red Hat Linux 7.2 2.96-108.1)
>
>
> Yeah, that was the one, unfortunately. Try disabling the JIT during
> configuration and seeing if that takes care of the problem. If so, we'll
> need to update configure to do that automatically, since I think we're
> going to end up running into this for years. :(


No Joy. I will upgrade gcc and see what happens.

Harry Jackson

Dan Sugalski

Jan 2, 2004, 3:11:53 PM
to ha...@uklug.co.uk, perl6-i...@perl.org

Let us know either way -- if upgrading gcc works then we're going to
have to figure out how RH/GCC2.96 is breaking things so we can make
it not happen. :(

Harry Jackson

Jan 4, 2004, 11:48:41 AM
to perl6-i...@perl.org
Dan Sugalski wrote:
> Let us know either way -- if upgrading gcc works then we're going to
> have to figure out how RH/GCC2.96 is breaking things so we can make it
> not happen. :(

I have now upgraded gcc to 3.3.2 and I am getting the same error. We are
still freezing during test.

I have also noticed something that might be my crap "imc" or might be
related to the problem.

Can someone tell me if there is an error in the code below. When I run
it repeatedly from the command line it sometimes freezes ie it prints
the contents of the array and then just stops and I need to do a CTRL-C
to get back to the command line.

.pcc_sub _MAIN prototyped
    .param pmc argv
    .local PerlArray a
    a = new PerlArray
    .local PerlString s
    s = new PerlString

    a = 10
    a[0] = "Zero"
    a[1] = "One"
    a[2] = "Two"
    a[3] = "Three"
    a[4] = "Four"
    a[5] = "Five"
    a[6] = "Six"
    a[7] = "Seven"
    a[8] = "Eight"

    s = a[2]
    print "\n"
    print s
    print "\n"
    end
.end

I have also tried the above code using the "set" syntax and I get the
same problem.

Are there any recommended examples of IMC in the source tree, and which
docs are the most recent? I have noticed that there are a lot of
different ways of doing things (typical perl). I am trying to pick it up
from the FAQ, some examples and the docs but it's an uphill struggle. For
instance I have noticed that

set a[0], "one"

or

a[0] = "one"


appear to do the same thing. I cannot confirm that they do due to the
bug above.

I have got to the point where I am trying to put rows from Postgres into
arrays and this is slowing me down a bit.

Harry

Leopold Toetsch

Jan 4, 2004, 12:23:26 PM
to ha...@uklug.co.uk, perl6-i...@perl.org
Harry Jackson <ha...@uklug.co.uk> wrote:

> Can someone tell me if there is an error in the code below.

The code is fine.

> it repeatedly from the command line it sometimes freezes ie it prints
> the contents of the array and then just stops and I need to do a CTRL-C
> to get back to the command line.

Are you sure that there is no hardware problem? Run memcheck for a
couple of hours for example.

> ...I have noticed that

> set a[0], "one"

> or

> a[0] = "one"

> appear to do the same thing. I cannot confirm that they do due to the
> bug above.

They are the same. The first one is PASM syntax, the second is PIR
syntax.

E.g. running your "imc trouble" code
$ parrot -o- hj.imc
_MAIN:
new P16, 31 # .PerlArray
new P17, 36 # .PerlString
set P16, 10
set P16[0], "Zero"
...

yields the generated PASM code (with variable names allocated to Parrot
registers).

> Harry

leo

Harry Jackson

Jan 4, 2004, 3:09:46 PM
to perl6-i...@perl.org
Leopold Toetsch wrote:
> Harry Jackson <ha...@uklug.co.uk> wrote:
>
>
>>Can someone tell me if there is an error in the code below.
>
>
> The code is fine.
>
>
>>it repeatedly from the command line it sometimes freezes ie it prints
>>the contents of the array and then just stops and I need to do a CTRL-C
>>to get back to the command line.
>
>
> Are you sure that there is no hardware problem? Run memcheck for a
> couple of hours for example.
>

I managed to compile gcc which is a fairly good indication that my
hardware is ok but you never know. I will try memtest86 and see how it goes.

> They are the same. The first one is PASM syntax, the second is PIR
> syntax.
> E.g. running your "imc trouble" code
> $ parrot -o- hj.imc
> _MAIN:
> new P16, 31 # .PerlArray
> new P17, 36 # .PerlString
> set P16, 10
> set P16[0], "Zero"
> ...
>
> yields the generated PASM code (with variable names allocated to Parrot
> registers).

I tried that as well, it spits out identical PASM each time but on the
odd occasion I need to use CTRL-C to get back to the shell.

H

Chromatic

Jan 16, 2004, 1:42:59 AM
to ha...@uklug.co.uk, perl6-i...@perl.org
On Sun, 2004-01-04 at 12:09, Harry Jackson wrote:

> I tried that as well, it spits out identical PASM each time but on the
> odd occasion I need to use CTRL-C to get back to the shell.

I'm seeing the same thing on Linux PPC -- odd hangs from time to time
when running PIR, while running the PASM emitted with -o works well.
t/op/arithmetics 3 and 9 seem to be the big culprits in the test suite.

Perl 5.8.2, gcc version 3.2.3 20030422.

I've checked out a fresh source tree and still see this behavior.
Removing -DHAVE_JIT from the Makefile (since I didn't find the configure
argument) had no effect.

-- c

Leopold Toetsch

Jan 16, 2004, 2:26:10 AM
to Chromatic, perl6-i...@perl.org
Chromatic <chro...@wgz.org> wrote:
> On Sun, 2004-01-04 at 12:09, Harry Jackson wrote:

>> I tried that as well, it spits out identical PASM each time but on the
>> odd occasion I need to use CTRL-C to get back to the shell.

> I'm seeing the same thing on Linux PPC -- odd hangs from time to time
> when running PIR, while running the PASM emitted with -o works well.
> t/op/arithmetics 3 and 9 seem to be the big culprits in the test suite.

Could you attach gdb to the hanging parrot?

$ cat sl.pasm
sleep 10000
end
$ parrot sl.pasm

[ in second term ]

$ ps ax | grep [p]arrot
28952 pts/0 S 0:00 parrot sl.pasm
28953 pts/0 S 0:00 parrot sl.pasm
28954 pts/0 S 0:00 parrot sl.pasm

$ gdb parrot 28952
GNU gdb 5.3
...
0x4011a391 in __libc_nanosleep () at __libc_nanosleep:-1
-1 __libc_nanosleep: No such file or directory.
in __libc_nanosleep
(gdb) bac
#0 0x4011a391 in __libc_nanosleep () at __libc_nanosleep:-1
#1 0x4011a31b in __sleep (seconds=10000)
at ../sysdeps/unix/sysv/linux/sleep.c:82
#2 0x08086792 in Parrot_sleep (seconds=10000) at src/platform.c:47
#3 0x080f89c4 in Parrot_sleep_ic (cur_opcode=0x826e488, interpreter=0x824b0a8)
at ops/sys.ops:151
#4 0x08082921 in runops_slow_core (interpreter=0x824b0a8, pc=0x826e488)
at src/runops_cores.c:115
...

This is on Linux, where the lowest PID is the main thread.
The backtrace should give some hints about where it hangs.

> -- c

leo

Chromatic

Jan 16, 2004, 2:33:18 AM
to l...@toetsch.at, perl6-i...@perl.org
On Thu, 2004-01-15 at 23:26, Leopold Toetsch wrote:

> Could you attach gdb to the hanging parrot?

This time, it's hanging at t/op/00ff-dos.t:

(gdb) bac
#0 0x0fd0e600 in sigsuspend () from /lib/libc.so.6
#1 0x0ff970ac in __pthread_wait_for_restart_signal ()
from /lib/libpthread.so.0
#2 0x0ff96cf8 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd10bc8 in exit () from /lib/libc.so.6
#4 0x1008c750 in Parrot_exit (status=0) at src/exit.c:54
#5 0x100320b4 in main (argc=1, argv=0x7ffff5c0) at imcc/main.c:555

Here's another run, this time hanging at test #3 in t/op/arithmetics.t:

#0 0x0fd0e600 in sigsuspend () from /lib/libc.so.6
#1 0x0ff970ac in __pthread_wait_for_restart_signal ()
from /lib/libpthread.so.0
#2 0x0ff96cf8 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd10bc8 in exit () from /lib/libc.so.6
#4 0x1008c750 in Parrot_exit (status=0) at src/exit.c:54
#5 0x100320b4 in main (argc=1, argv=0x7ffff5b0) at imcc/main.c:555

I can upgrade glibc to see if that helps.

-- c

Jeff Clites

Jan 16, 2004, 3:02:35 AM
to chromatic, Internals List

Yeah, I think JIT is a red herring--I don't see how JIT problems can be
involved when not running with the JIT core....

JEff

Leopold Toetsch

Jan 16, 2004, 4:07:33 AM
to Chromatic, perl6-i...@perl.org
Chromatic <chro...@wgz.org> wrote:
> This time, it's hanging at t/op/00ff-dos.t:

I've checked in now:

* terminate the event loop thread on destroying of the last interp
* this could help against the spurious hangs reported on p6i

Could you please check if that helps.

Thanks,
leo

Leopold Toetsch

Jan 16, 2004, 3:37:28 AM
to Chromatic, perl6-i...@perl.org
Chromatic <chro...@wgz.org> wrote:
> On Thu, 2004-01-15 at 23:26, Leopold Toetsch wrote:

>> Could you attach gdb to the hanging parrot?

> This time, it's hanging at t/op/00ff-dos.t:

> (gdb) bac
> #0 0x0fd0e600 in sigsuspend () from /lib/libc.so.6
> #1 0x0ff970ac in __pthread_wait_for_restart_signal ()
> from /lib/libpthread.so.0
> #2 0x0ff96cf8 in pthread_onexit_process () from /lib/libpthread.so.0
> #3 0x0fd10bc8 in exit () from /lib/libc.so.6
> #4 0x1008c750 in Parrot_exit (status=0) at src/exit.c:54
> #5 0x100320b4 in main (argc=1, argv=0x7ffff5c0) at imcc/main.c:555

Ugly. Parrot starts the event thread as detached, so that *should* not
cause problems. But maybe I'm doing something stupid somewhere.
The event thread is waiting on a queue condition, but when main exits,
it should just terminate AFAIK. I can send a special
event_loop_terminate event though.

> I can upgrade glibc to see if that helps.

That's another possibility.

> -- c

leo
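
To make the "special event_loop_terminate event" idea concrete, here is
a small model of it in Python. It is a sketch under assumptions, not
Parrot's C code: `TERMINATE`, `event_loop` and `run_and_shutdown` are
invented names, and a Python `queue.Queue` stands in for the event queue
and its condition variable.

```python
import queue
import threading

TERMINATE = object()  # stand-in for a special "terminate the loop" event


def event_loop(events, handled):
    """Block on the queue, record each event, stop on the terminate event."""
    while True:
        event = events.get()      # blocks, like waiting on a queue condition
        if event is TERMINATE:
            return
        handled.append(event)


def run_and_shutdown(pending):
    """Start the event thread, feed it `pending`, then shut it down."""
    events = queue.Queue()
    handled = []
    thread = threading.Thread(target=event_loop, args=(events, handled))
    thread.start()
    for event in pending:
        events.put(event)
    events.put(TERMINATE)         # explicit wake-up rather than relying on exit
    thread.join(timeout=5.0)      # wait for the thread before going on
    return handled, thread.is_alive()
```

The point of the design is that the main thread never depends on process
exit to get rid of a thread that is blocked waiting for work; it wakes
the thread up with an event the loop understands and waits for it to
finish.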

Chromatic

Jan 16, 2004, 1:08:52 PM
to l...@toetsch.at, perl6-i...@perl.org
On Fri, 2004-01-16 at 01:07, Leopold Toetsch wrote:

> I've checked in now:
>
> * terminate the event loop thread on destroying of the last interp
> * this could help against the spurious hangs reported on p6i
>
> Could you please check if that helps.

Yes, that's better. (Upgrading glibc didn't help -- I was worried that
this was an NPTL issue that Parrot couldn't fix.)

Now it hangs on t/pmc/timer:

0x10090b30 in Parrot_del_timer_event (interpreter=0x10273e88,
timer=0x30185838)
at src/events.c:176
176 for (entry = event_queue->head; entry; entry = entry->next)
{
(gdb) bac
#0 0x10090b30 in Parrot_del_timer_event (interpreter=0x10273e88,
timer=0x30185838) at src/events.c:176
#1 0x101f35e0 in del_timer (interpreter=0x10273e88, pmc=0x30185838)
at timer.c:88
#2 0x101f3868 in Parrot_Timer_destroy (interpreter=0x10273e88,
pmc=0x30185838)
at timer.c:151
#3 0x1008bc10 in free_unused_pobjects (interpreter=0x10273e88,
pool=0x10294808) at src/dod.c:529
#4 0x100345a8 in Parrot_really_destroy (exit_code=0,
vinterp=0x10273e88)
at src/interpreter.c:1137
#5 0x1008c7c0 in Parrot_exit (status=0) at src/exit.c:48
#6 0x10032134 in main (argc=1, argv=0x7ffff5c0) at imcc/main.c:555

and imcc/t/syn/file:

0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
(gdb) bac
#0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1 0x0ff973e0 in __pthread_wait_for_restart_signal ()
from /lib/libpthread.so.0
#2 0x0ff96fe4 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd0a694 in exit () from /lib/libc.so.6
#4 0x101a72fc in fataly (code=1, file=0x7ffff764 "temp.imc", lin=2,
fmt=0xfe00610 "No such file or directory") at imcc/debug.c:34
#5 0x1019ca94 in include_file (file_name=0x102ba670 "non_existent.imc")
at imcc.l:771
#6 0x10199330 in yylex (valp=0x7fffed60, interp=0x10273e88) at
imcc.l:299
#7 0x10193e40 in yyparse (interp=0x10273e88) at imcc/imcparser.c:1611
#8 0x10031d34 in main (argc=1, argv=0x7ffff5d8) at imcc/main.c:493


Ahh, here's another for t/pmc/timer:

Parrot_del_timer_event (interpreter=0x10273e88, timer=0x30185868)
at src/events.c:177
177 if (entry->type == QUEUE_ENTRY_TYPE_TIMED_EVENT) {
(gdb) bac
#0 Parrot_del_timer_event (interpreter=0x10273e88, timer=0x30185868)
at src/events.c:177
#1 0x101f35e0 in del_timer (interpreter=0x10273e88, pmc=0x30185868)
at timer.c:88
#2 0x101f3868 in Parrot_Timer_destroy (interpreter=0x10273e88,
pmc=0x30185868)
at timer.c:151
#3 0x1008bc10 in free_unused_pobjects (interpreter=0x10273e88,
pool=0x10294808) at src/dod.c:529
#4 0x100345a8 in Parrot_really_destroy (exit_code=0,
vinterp=0x10273e88)
at src/interpreter.c:1137
#5 0x1008c7c0 in Parrot_exit (status=0) at src/exit.c:48
#6 0x10032134 in main (argc=1, argv=0x7ffff5c0) at imcc/main.c:555


I hope this helps.

-- c

Leopold Toetsch

Jan 16, 2004, 3:42:33 PM
to Chromatic, perl6-i...@perl.org
Chromatic <chro...@wgz.org> wrote:

> Yes, that's better. (Upgrading glibc didn't help -- I was worried that
> this was an NPTL issue that Parrot couldn't fix.)

Cool.

> Now it hangs on t/pmc/timer:

> 0x10090b30 in Parrot_del_timer_event (interpreter=0x10273e88,

Ah yep. When committing the first (trial) fix, I thought about such
a problem, which is related:
- if it seems to hang on a condition variable (still AFAIK: it shouldn't)
- but anyway - it could depend on objects, that need destruction, like
a timer event, so ...

I moved killing the event loop a bit down in the interpreter destroy
sequence. Reaching that point, timers should be removed from the queue.

HTH and thanks for your valuable feedback to track things down.

> -- c

leo
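
The ordering described above (objects that touch the event queue are
destroyed first, the event loop is stopped afterwards) can be modelled
roughly as follows. This is a Python illustration only: `EventQueue` and
`destroy_interpreter` are hypothetical names, and the model does not
claim to reproduce why the original sequence hung.

```python
import threading


class EventQueue:
    """A toy event queue guarded by a condition variable."""

    def __init__(self):
        self.entries = []
        self.cond = threading.Condition()
        self.running = True

    def add(self, entry):
        with self.cond:
            self.entries.append(entry)
            self.cond.notify()

    def remove_timer(self, timer_id):
        # Needs a live, consistent queue, so it must run before terminate().
        with self.cond:
            self.entries = [e for e in self.entries if e != ("timer", timer_id)]

    def terminate(self):
        with self.cond:
            self.running = False
            self.cond.notify_all()

    def loop(self):
        # The event thread sleeps here until it is told to stop.
        with self.cond:
            while self.running:
                self.cond.wait()


def destroy_interpreter(events, thread, timer_ids):
    """Tear down in the order described: timers first, event loop last."""
    order = []
    for timer_id in timer_ids:
        events.remove_timer(timer_id)
        order.append(("del_timer", timer_id))
    events.terminate()
    thread.join(timeout=5.0)
    order.append(("event_thread_alive", thread.is_alive()))
    return order
```

By the time `terminate()` runs, no timer entries are left in the queue,
which matches the "timers should be removed from the queue" expectation.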

Chromatic

Jan 16, 2004, 4:09:14 PM
to l...@toetsch.at, perl6-i...@perl.org
On Fri, 2004-01-16 at 12:42, Leopold Toetsch wrote:

> Ah yep. When committing the first (trial) fix, I thought about such
> a problem, which is related:
> - if it seems to hang on a condition variable (still AFAIK: it shouldn't)

It reminds me of a problem I'm having with MySQL, actually.

> - but anyway - it could depend on objects, that need destruction, like
> a timer event, so ...
>
> I moved killing the event loop a bit down in the interpreter destroy
> sequence. Reaching that point, timers should be removed from the queue.

This gets further. Now it's imcc/t/syn/macro, around tests 13 and 14:

#0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1 0x0ff973e0 in __pthread_wait_for_restart_signal ()
from /lib/libpthread.so.0
#2 0x0ff96fe4 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd0a694 in exit () from /lib/libc.so.6

#4 0x101a730c in fataly (code=1, file=0x102b66f7 "M", lin=6,
fmt=0x1021ae9c "Macro '%s' requires %d arguments, but %d given")
at imcc/debug.c:34
#5 0x1019c940 in expand_macro (valp=0x7fffed30, interp=0x10273e98,
name=0x102b66f7 "M") at imcc.l:731
#6 0x101996f4 in yylex (valp=0x7fffed30, interp=0x10273e98) at
imcc.l:349
#7 0x10193e50 in yyparse (interp=0x10273e98) at imcc/imcparser.c:1611
#8 0x10031d34 in main (argc=1, argv=0x7ffff5b0) at imcc/main.c:493

and a similar one in imcc/t/syn/clash, around test 9:

#0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1 0x0ff973e0 in __pthread_wait_for_restart_signal ()
from /lib/libpthread.so.0
#2 0x0ff96fe4 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd0a694 in exit () from /lib/libc.so.6

#4 0x101a730c in fataly (code=1, file=0x7ffff756
"imcc/t/syn/clash_9.imc",
lin=2, fmt=0x10216920 "Unknown PMC type '%s'\n") at imcc/debug.c:34
#5 0x101953b4 in yyparse (interp=0x10273e98) at imcc.y:694
#6 0x10031d34 in main (argc=1, argv=0x7ffff5b0) at imcc/main.c:493

Would you like me to keep reporting them as I find them? Looking at the
source code of fataly, there's a TODO comment just above the culprit.

-- c

Leopold Toetsch

Jan 17, 2004, 7:29:51 AM
to Chromatic, perl6-i...@perl.org
Chromatic <chro...@wgz.org> wrote:

> This gets further. Now it's imcc/t/syn/macro, around tests 13 and 14:

Good.

> #3 0x0fd0a694 in exit () from /lib/libc.so.6

I've converted this exit() to Parrot_exit() now. If that helps, I'll
change a bunch of other such code too.

leo

Chromatic

Jan 17, 2004, 2:29:51 PM
to l...@toetsch.at, perl6-i...@perl.org
On Sat, 2004-01-17 at 04:29, Leopold Toetsch wrote:

> I've converted this exit() to Parrot_exit() now. If that helps, I'll
> change a bunch of other such code too.

Yep, that fixed it. Now there are hangs in some of the t/src files. Right
now, it's t/src/sprintf.t at test 2. The backtrace looks familiar:

(gdb) bac


#0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1 0x0ff973e0 in __pthread_wait_for_restart_signal ()
from /lib/libpthread.so.0
#2 0x0ff96fe4 in pthread_onexit_process () from /lib/libpthread.so.0

#3 0x0fd0a694 in exit () from /lib/libc.so.6

#4 0x0fcf23a8 in __libc_start_main () from /lib/libc.so.6

I saw something similar in t/src/extend.t as well. Otherwise,
everything passes.

Perhaps a Parrot_cleanup() would help extenders and embedders?

-- c

Leopold Toetsch

Jan 17, 2004, 3:12:03 PM
to Chromatic, perl6-i...@perl.org
Chromatic <chro...@wgz.org> wrote:
> On Sat, 2004-01-17 at 04:29, Leopold Toetsch wrote:

>> I've converted this exit() to Parrot_exit() now. If that helps, I'll
>> change a bunch of other such code too.

> Yep, that fixed it. Now there are hangs in some of the t/src files. Right
> now, it's t/src/sprintf.t at test 2. The backtrace looks familiar:

Fine. I just changed 2 occurrences of exit. There are some more; changing
those should fix more stuff.
But I'm a bit worried about the reason, why it actually hangs here:

> (gdb) bac
> #0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
> ...
> #3 0x0fd0a694 in exit () from /lib/libc.so.6
> #4 0x0fcf23a8 in __libc_start_main () from /lib/libc.so.6

These are exit calls from the main thread. That should AFAIK just kill
all threads that got started eventually and finish the process.
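That expectation can be checked in isolation with something like the sketch below. It is only a rough test under stated assumptions: every name in it (demo_blocked_worker, demo_exit_with_blocked_thread) is made up for illustration and none of it is Parrot code. A child process starts one thread that blocks forever in pthread_cond_wait(), then calls plain exit() from its main thread; the parent reports whether the child went away on its own or had to be killed.

```c
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  demo_cond = PTHREAD_COND_INITIALIZER;

/* Hypothetical worker: blocks forever on a condition nobody signals. */
void *demo_blocked_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    for (;;)
        pthread_cond_wait(&demo_cond, &demo_lock);
    return NULL; /* not reached */
}

/*
 * Returns 0 if the child exited cleanly by itself within the timeout,
 * 1 if it was still hanging and had to be killed, -1 on any other error.
 */
int demo_exit_with_blocked_thread(int timeout_seconds)
{
    pid_t pid;
    int i;

    assert(timeout_seconds > 0);
    fflush(NULL);               /* avoid duplicating buffered output in the child */
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, demo_blocked_worker, NULL) != 0)
            _exit(2);
        sleep(1);               /* give the worker time to block */
        exit(0);                /* plain exit() from the main thread */
    }

    /* Parent: poll every 100 ms until the child is gone or time runs out. */
    for (i = 0; i < timeout_seconds * 10; i++) {
        int status;
        struct timespec ts = { 0, 100000000L };
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
        if (r < 0)
            return -1;
        nanosleep(&ts, NULL);
    }

    kill(pid, SIGKILL);         /* the child hung in exit(); clean it up */
    waitpid(pid, NULL, 0);
    return 1;
}
```

Calling demo_exit_with_blocked_thread(5) from a small main() and printing the result would show whether a given libpthread hangs in this situation; a return value of 1 would correspond to the hang seen in the backtrace.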

There seem to be a lot of exit() calls in the icu lib. Changing these
isn't really a solution.

> Perhaps a Parrot_cleanup() would help extenders and embedders?

That is: Parrot_exit(exit_status) - we have that already.

My first attempts to clean up all interpreter resources used an on_exit(3)
or atexit(3) handler, but neither was portable, so the code was changed
to do explicit destroy processing via Parrot_exit().
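As a rough illustration of the two approaches, here is a minimal sketch. All names in it (demo_interp, demo_destroy, demo_exit and so on) are hypothetical stand-ins, not Parrot's real structures or functions; it only contrasts registering a handler with atexit(3) against an explicit wrapper that callers use in place of exit().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for interpreter state; not Parrot's real struct. */
typedef struct {
    int resources_open;
} demo_interp;

demo_interp g_interp = { 1 };

/* Explicit destroy step: release resources. Safe to call more than once. */
void demo_destroy(demo_interp *interp)
{
    assert(interp != NULL);
    if (interp->resources_open)
        interp->resources_open = 0;
}

/* Variant 1: cleanup hooked into process exit via atexit(3). */
void demo_atexit_handler(void)
{
    demo_destroy(&g_interp);
}

int demo_register_atexit(void)
{
    return atexit(demo_atexit_handler);   /* 0 on success */
}

/*
 * Variant 2: explicit wrapper, in the spirit of Parrot_exit(status).
 * Callers use this instead of exit(), so the destroy step always runs
 * before the C library starts its own exit processing.
 */
void demo_exit(demo_interp *interp, int status)
{
    demo_destroy(interp);
    exit(status);
}
```

The catch with the second variant is the one seen in this thread: any code path that still calls plain exit() skips the destroy step.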

> -- c

Thanks for digging into this,
leo

Chromatic

Jan 18, 2004, 9:10:17 PM
to l...@toetsch.at, perl6-i...@perl.org
On Sat, 2004-01-17 at 12:12, Leopold Toetsch wrote:

> But I'm a bit worried about the reason, why it actually hangs here:
>
> > (gdb) bac
> > #0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
> ...
> > #3 0x0fd0a694 in exit () from /lib/libc.so.6
> > #4 0x0fcf23a8 in __libc_start_main () from /lib/libc.so.6
>
> These are exit calls from the main thread. That should AFAIK just kill
> all threads that got started eventually and finish the process.

What if the event thread is stuck? When the tests hang, suspending and
resuming the process unsticks it, though the current test will fail. I
wonder if it's waiting for a masked signal or a signal that never
arrives.

(Warning: I've just about exhausted my knowledge of pthreads programming
coming up with that idea, so if it sounds crack-addled, it's definitely
due to gaps in my knowledge.)

I base that idea on this backtrace:

/home/chromatic/dev/parrot/t/src/extend_8, process 19479
Reading symbols from /lib/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 19479)]
[New Thread 32769 (LWP 19480)]
[New Thread 16386 (LWP 19481)]
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libutil.so.1...done.
Loaded symbols for /lib/libutil.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld.so.1...done.
Loaded symbols for /lib/ld.so.1


0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
(gdb) info threads
3 Thread 16386 (LWP 19481) 0x0ff976a4 in __pthread_sigsuspend ()
from /lib/libpthread.so.0
2 Thread 32769 (LWP 19480) 0x0ff9c140 in waitpid ()
from /lib/libpthread.so.0
1 Thread 16384 (LWP 19479) 0x0ff976a4 in __pthread_sigsuspend ()
from /lib/libpthread.so.0
(gdb) bac
#0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1 0x0ff973e0 in __pthread_wait_for_restart_signal ()
from /lib/libpthread.so.0
#2 0x0ff96fe4 in pthread_onexit_process () from /lib/libpthread.so.0
#3 0x0fd0a694 in exit () from /lib/libc.so.6
#4 0x0fcf23a8 in __libc_start_main () from /lib/libc.so.6

(gdb) thread 2
[Switching to thread 2 (Thread 32769 (LWP 19480))]#0 0x0ff9c140 in
waitpid ()
from /lib/libpthread.so.0
(gdb) bac
#0 0x0ff9c140 in waitpid () from /lib/libpthread.so.0
#1 0x0ff9c128 in waitpid () from /lib/libpthread.so.0
#2 0x0ff958d0 in pthread_handle_exit () from /lib/libpthread.so.0
#3 0x0ff94c90 in __pthread_manager () from /lib/libpthread.so.0
#4 0x0fdb3118 in clone () from /lib/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 16386 (LWP 19481))]#0 0x0ff976a4 in
__pthread_sigsuspend () from /lib/libpthread.so.0
(gdb) bac
#0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
#1 0x0ff973e0 in __pthread_wait_for_restart_signal ()
from /lib/libpthread.so.0
#2 0x0ff93f9c in pthread_cond_wait@GLIBC_2.0 () from
/lib/libpthread.so.0
#3 0x101d7614 in queue_wait (queue=0x10279e30) at src/tsq.c:159
#4 0x1009c968 in event_thread (data=0x10279e30) at src/events.c:349
#5 0x0ff94d98 in pthread_start_thread () from /lib/libpthread.so.0
#6 0x0fdb3118 in clone () from /lib/libc.so.6
(gdb) q

and the fact that the attached patch seems to fix things. I don't
expect that it's correct. It might paper over a real problem, perhaps
on my system. It's food for thought though.

If it turns out that the problem does lie here, can anyone suggest a
very small test program that would demonstrate the real problem?

-- c


event_thread.patch

Leopold Toetsch

Jan 19, 2004, 6:40:41 AM
to Chromatic, perl6-i...@perl.org
Chromatic <chro...@wgz.org> wrote:
> (gdb) bac

This is the main thread, which has suspended itself during exit
processing.

> #0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
...
> #3 0x0fd0a694 in exit () from /lib/libc.so.6

> (gdb) thread 2

That's the thread-manager thread. It has sent the event thread a
sig_cancel signal and is now waiting on the event thread to terminate:

> #1 0x0ff9c128 in waitpid () from /lib/libpthread.so.0
> #2 0x0ff958d0 in pthread_handle_exit () from /lib/libpthread.so.0
> #3 0x0ff94c90 in __pthread_manager () from /lib/libpthread.so.0

> (gdb) thread 3

And finally the event handler thread, which seems to hang waiting for
the condition. WTF, why doesn't it get the cancel signal sent by the
thread-manager?

> #0 0x0ff976a4 in __pthread_sigsuspend () from /lib/libpthread.so.0
> #1 0x0ff973e0 in __pthread_wait_for_restart_signal ()
> from /lib/libpthread.so.0
> #2 0x0ff93f9c in pthread_cond_wait@GLIBC_2.0 () from
> /lib/libpthread.so.0
> #3 0x101d7614 in queue_wait (queue=0x10279e30) at src/tsq.c:159
> #4 0x1009c968 in event_thread (data=0x10279e30) at src/events.c:349

> and the fact that the attached patch seems to fix things. I don't
> expect that it's correct.

I don't know why it seems to fix things. Parrot_new_terminate_event()
places a "stop the run-loop" event into this interpreter's task queue.
It does *not* affect the hanging event thread. But it seems to trigger
something in an odd way, so that threads can make some progress and
finally terminate. Really strange.
I don't even know if your inserted line is executed, as it seems that
exit() was called somewhere else.

OTOH some lines later (interpreter.c:1143) the main thread tells the
event thread to terminate via Parrot_kill_event_loop(), which pushes an
event into the event queue. This also signals the waiting event thread
that something has arrived, so it should wake up and the event handler
thread should finally terminate.

I think a better solution is to just call Parrot_exit() instead of
exit(), so that Parrot_kill_event_loop() is run.
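The shutdown handshake described above can be sketched roughly as follows. This is not the code in src/tsq.c or src/events.c; the types and functions (demo_queue, demo_queue_push, demo_event_thread, demo_run_and_shutdown) are hypothetical and only show the general pattern: the event thread sleeps in pthread_cond_wait() until something is queued, and the main thread stops it by pushing a terminate event and joining the thread before the process exits.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

enum demo_event_type { DEMO_EVENT_WORK, DEMO_EVENT_TERMINATE };

typedef struct demo_event {
    enum demo_event_type type;
    struct demo_event *next;
} demo_event;

/* Hypothetical thread-safe queue: a linked list guarded by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    demo_event *head, *tail;
    int handled;                /* number of work events processed */
} demo_queue;

/* Append a caller-owned event and wake the waiting event thread. */
void demo_queue_push(demo_queue *q, demo_event *ev, enum demo_event_type type)
{
    assert(q != NULL && ev != NULL);
    ev->type = type;
    ev->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = ev;
    else
        q->head = ev;
    q->tail = ev;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Event thread: wait for events, stop when a terminate event arrives. */
void *demo_event_thread(void *data)
{
    demo_queue *q = data;
    for (;;) {
        demo_event *ev;
        pthread_mutex_lock(&q->lock);
        while (q->head == NULL)             /* guards against spurious wakeups */
            pthread_cond_wait(&q->nonempty, &q->lock);
        ev = q->head;
        q->head = ev->next;
        if (q->head == NULL)
            q->tail = NULL;
        if (ev->type == DEMO_EVENT_WORK)
            q->handled++;
        pthread_mutex_unlock(&q->lock);
        if (ev->type == DEMO_EVENT_TERMINATE)
            return NULL;
    }
}

/* Start the thread, queue some work, then shut down cleanly.
 * Returns the number of work events handled, or -1 on setup failure. */
int demo_run_and_shutdown(int work_events)
{
    demo_queue q;
    pthread_t tid;
    demo_event *events;
    int i, handled;

    assert(work_events >= 0);
    events = calloc((size_t)work_events + 1, sizeof *events);
    if (events == NULL)
        return -1;
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.nonempty, NULL);
    q.head = q.tail = NULL;
    q.handled = 0;
    if (pthread_create(&tid, NULL, demo_event_thread, &q) != 0) {
        free(events);
        return -1;
    }
    for (i = 0; i < work_events; i++)
        demo_queue_push(&q, &events[i], DEMO_EVENT_WORK);
    demo_queue_push(&q, &events[work_events], DEMO_EVENT_TERMINATE);
    pthread_join(tid, NULL);                /* wait for the thread to finish */
    handled = q.handled;
    pthread_cond_destroy(&q.nonempty);
    pthread_mutex_destroy(&q.lock);
    free(events);
    return handled;
}
```

With this pattern no thread is left blocked on the condition variable when exit processing starts, so the process does not depend on the thread library cancelling a waiting thread.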

leo

Leopold Toetsch

Jan 19, 2004, 7:11:04 AM
to ha...@uklug.co.uk, perl6-i...@perl.org
Harry Jackson <ha...@uklug.co.uk> wrote:

> RHAS 2.1 dev edition

Harry, do you still see these hanging parrot programs?
chromatic, do you run a DeadRat (sorry) linux too?

RedHat is well known to provide b0rken patches to otherwise running
software. Could you try to up/down/side-grade *libpthread* (*not*
glibc, at least if it's separate)?

> Harry Jackson

leo

Harry Jackson

Jan 19, 2004, 7:35:37 AM
to perl6-i...@perl.org

I am no longer using deadrat. I am now using Debian and it seems to have
cured my problem ;-)

Harry

Chromatic

Jan 19, 2004, 1:17:40 PM
to l...@toetsch.at, ha...@uklug.co.uk, perl6-i...@perl.org
On Mon, 2004-01-19 at 04:11, Leopold Toetsch wrote:

> Harry, do you still see these hanging parrot programs?
> chromatic, do you run a DeadRat (sorry) linux too?

Nope, none here. I can try a different pthread library though.

-- c
