test.pl runperl() exit oddity

Nicholas Clark

Aug 7, 2013, 1:18:37 PM

to perl5-...@perl.org, vms...@perl.org

In trying to fix this, I think I might have also fixed 3 problems with
different recently failing tests on VMS. But I don't understand why.
I have questions at the end, but the story in the middle is probably necessary.

On Wed, Aug 07, 2013 at 06:20:00PM +0200, H.Merijn Brand wrote:
> Automated smoke report for branch blead 5.19.3 patch 6a9ebaf3683d6b0799ce26b5b512469b9c2384d5 v5.19.2-294-g6a9ebaf
> i3: PPC_POWER5 (PPC/1 cpu)
> on AIX - 5.3.0.0/ML12

> Failures: (common-args) -Dcc=gcc
> [stdio]
> ../t/op/fork.t..............................................FAILED
> 7

git bisect reveals that this started failing on AIX with this commit:

commit 684b0ecaad281760b04dd2da317ee0459cafebf6
Author: Tony Cook <to...@develop-help.com>
Date: Tue Jul 16 14:57:20 2013 +1000

    [perl #116190] feed an empty stdin to run_multiple_progs() programs

    Two tests for -a were attempting to read stdin and blocking with the -a
    implies -n change.

diff --git a/t/test.pl b/t/test.pl
index 41efbb8..eb4f868 100644
--- a/t/test.pl
+++ b/t/test.pl
@@ -1134,7 +1134,8 @@ sub run_multiple_progs {
     print $fh "\n#line 1\n"; # So the line numbers don't get messed up.
     print $fh $prog,"\n";
     close $fh or die "Cannot close $tmpfile: $!";
-    my $results = runperl( stderr => 1, progfile => $tmpfile, $up
+    my $results = runperl( stderr => 1, progfile => $tmpfile,
+                           stdin => '', $up
                            ? (switches => ["-I$up/lib", $switch], nolib => 1)
                            : (switches => [$switch])
                            );

What that commit does is change the generated backtick command from

`./perl -Ilib -e ...`

to

`./perl -e "print qq()" | ./perl -Ilib -e ...`

so that stdin is at EOF.

It turns out that this causes fun on AIX with two tests, because on AIX
/bin/sh is actually ksh, and ksh does pipes differently (with one less
process). With sh, for the latter command line the sh process forks two
children, which use exec to start the two perl processes. The parent shell
process persists for the duration of the pipeline, and the second perl
process starts with no children. With ksh (and zsh), the shell saves a
process by forking a child for just the first perl process, and execing
itself to start the second. This means that the second perl process starts
with one child which it didn't create. This breaks the tests that assume that
wait (or waitpid) will only return information about processes started
within the test. One can replicate this on Linux:

$ sh -c 'pstree -p $$ | cat'
sh(13261)-+-cat(13263)
          `-pstree(13262)
$ ksh -c 'pstree -p $$ | cat'
cat(13349)---pstree(13350)

I thought about fixing the tests to make them immune to unexpected extra
child processes, but then realised that it was probably easier to change the
generated backtick command to be

`./perl </dev/null -Ilib -e ...`
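A minimal sketch of why the redirect form is equivalent for this purpose, using wc as a stand-in for the perl under test (the stand-in and the variable names are assumptions made only for the illustration). It counts the bytes available on stdin in each case, and also shows that the shell accepts the redirection ahead of the arguments, as in the generated command:

```shell
# Pipeline form: an upstream command that prints nothing supplies stdin.
piped=$(printf '' | wc -c)

# Redirect form: stdin is /dev/null, and no second process is involved.
redirected=$(wc -c </dev/null)

# The redirection may appear before the arguments; the shell removes it
# from the command line before running the command.
leading=$(wc </dev/null -c)

# Unquoted on purpose, so any padding that wc adds is dropped.
echo piped=$piped redirected=$redirected leading=$leading
```

All three counts should be zero. The difference that matters for the failing tests is only in the process tree, since the redirect form never gives the shell a pipeline to arrange.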

I wasn't sure whether this would work on VMS, so I tested it there. I ran
the tests before making that change, and t/lib/croak.t, lib/charnames.t and
lib/warnings.t failed. These failures are new. The failures all look like
this:

$ @[.vms]test .EXE "" "-v" "[.lib]croak.t"
%DELETE-I-FILDEL, PTAC$DKA0:[NCLARK.I.perl-7d9633e5aba8.t]Perl.EXE;1 deleted (16
blocks)
%COPY-S-COPIED, PTAC$DKA0:[NCLARK.I.perl-7d9633e5aba8]PERL.EXE;1 copied to PTAC$
DKA0:[NCLARK.I.perl-7d9633e5aba8.t]Perl.EXE;1 (16 blocks)
%DELETE-I-FILDEL, PTAC$DKA0:[NCLARK.I.perl-7d9633e5aba8.t]vmspipe.com;1 deleted
(16 blocks)
%COPY-S-COPIED, PTAC$DKA0:[NCLARK.I.perl-7d9633e5aba8]vmspipe.com;1 copied to PT
AC$DKA0:[NCLARK.I.perl-7d9633e5aba8.t]vmspipe.com;1 (2 blocks)

7-AUG-2013 21:00:37.83 User: NCLARK Process ID: 20A3A4EB
Node: PTAC Process name: "_FTA275:"

Accounting information:
Buffered I/O count: 1289345 Peak working set size: 31056
Direct I/O count: 1685993 Peak virtual size: 224288
Page faults: 1735001 Mounted volumes: 0
Images activated: 4216
Elapsed CPU time: 0 00:07:36.58
Connect time: 48 19:31:58.07
1..37
# From PTAC$DKA0:[NCLARK.I.perl-7d9633e5aba8.t.lib.croak]mg.
PROG:
# mg.c
$SIG{_HUNGRY} = \&mmm_pie;
warn "Mmm, pie";
EXPECTED:
No such hook: _HUNGRY at - line 2.
EXIT STATUS: != 0
GOT:
No such hook: _HUNGRY at - line 2.
EXIT STATUS: 0
not ok 1 - Perl_magic_setsig
t/[.lib]croak ... FAILED at test 1
Failed 1 test out of 1, 0.00% okay.
[.lib]croak.t
### Since not all tests were successful, you may want to run some of
### them individually and examine any diagnostic messages they produce.
### See the INSTALL document's section on "make test".
u=456.66 s=0.00 cu=0.00 cs=0.00 scripts=1 tests=37

7-AUG-2013 21:01:21.01 User: NCLARK Process ID: 20A3A4EB
Node: PTAC Process name: "_FTA275:"

Accounting information:
Buffered I/O count: 1290342 Peak working set size: 31056
Direct I/O count: 1686033 Peak virtual size: 224288
Page faults: 1735612 Mounted volumes: 0
Images activated: 4218
Elapsed CPU time: 0 00:07:36.68
Connect time: 48 19:32:41.25
%SYSTEM-F-ABORT, abort

Note

1) All three fail on the first program which has an expected nonzero
EXIT STATUS
2) For all three, the test script aborts at that point

When I change t/test.pl to use the generated backticks redirecting from
/dev/null instead of using a pipeline, all 3 pass again.

(This is currently on smoke-me/nicholas/runperl-empty-STDIN and I will merge
it to blead when results come in for Win32 and a couple of other systems)

Questions

1) Why does using a pipeline in backticks change the exit status?

2) Why does runperl() in t/test.pl abort on VMS if exit status is wrong?
(or is it vmspipe.com aborting?)
*nix doesn't abort on the first test, and I don't think that Win32 does
either.

Nicholas Clark

Nicholas Clark

Aug 7, 2013, 1:40:04 PM

to Steve Hay, perl5-...@perl.org, vms...@perl.org

On Wed, Aug 07, 2013 at 06:29:26PM +0100, Steve Hay wrote:

> Nicholas Clark wrote on 2013-08-07:
> > It turns out that this causes fun on AIX with two tests, because on
> > AIX /bin/sh is actually ksh, and ksh does pipes differently (with one
> > less process). With sh, for the latter command line the sh process
> > forks two children, which use exec to start the two perl processes.
> > The parent shell process persists for the duration of the pipeline,
> > and the second perl process starts with no children. With ksh (and
> > zsh), the shell saves a process by forking a child for just the first
> > perl process, and execing itself to start the second. This means that
> > the second perl process starts with one child which it didn't create.
> > This breaks the tests that assume that wait (or waitpid) will only return
> > information about processes started within the test. One can replicate
> > this on Linux:
> >
> > $ sh -c 'pstree -p $$ | cat'
> > sh(13261)-+-cat(13263)
> >           `-pstree(13262)
> > $ ksh -c 'pstree -p $$ | cat'
> > cat(13349)---pstree(13350)
> >
> >
> > I thought about fixing the tests to make them immune to unexpected
> > extra child processes, but then realised that it was probably easier
> > to change the generated backtick command to be
> >
> > `./perl </dev/null -Ilib -e ...`
> >
>

> And the commit message for reads:
> "Fortuitously it seems that the string /dev/null is portable enough to work
> with the command line parsing code on VMS and Win32 too."

Yes, I asked on #win32 and didn't get a useful answer.
Thanks for spotting this.

> What command line parsing code are you talking about here? I'm not aware
> of /dev/null being valid on Windows in general; I've always used the
> device called nul instead:
>
> D:\temp>perl </dev/null -e1
> The system cannot find the path specified.
>
> D:\temp>perl <nul -e1

Does

perl <nul.txt -e1

work too?

(ie it's still compatible with MS/DOS and earlier)

My assumption was based on what seem to be several regression tests
that opened /dev/null and aren't failing on Win32.

Starting with this in t/base/term.t:

open(try, "/dev/null") || open(try,"nla0:") || (die "Can't open /dev/null.");

also t/lib/warnings/pp_sys has

########
# pp_sys.c [pp_leavewrite]
use warnings 'io' ;
format STDOUT_TOP =
abc
.
format STDOUT =
def
ghi
.
$= = 1 ;
$- =1 ;
open STDOUT, ">".($^O eq 'VMS'? 'NL:' : '/dev/null') ;
write ;
no warnings 'io' ;
write ;
EXPECT
page overflow at - line 13.
########

and t/lib/warnings/sv contains

open F, ">".($^O eq 'VMS'? 'NL:' : '/dev/null') ;

Is it that /dev/null works from within perl, but not on the command line?

If so, the fix would seem to be to change the new code to

$runperl = $runperl . ($is_mswin ? ' <nul' : ' </dev/null');

Does this make t/op/fork.t test 16 pass again on Win32?
That now seems to be failing, and I don't know why.

Nicholas Clark

Craig A. Berry

Aug 7, 2013, 1:50:55 PM

to Nicholas Clark, Perl5 Porters (E-mail), VMSperl Mailing List

The root cause of the problem on VMS is that command-line redirection
is done by Perl and not by the shell. Tony's addition of the stdin
parameter to runperl gives us the equivalent of:

$ perl -e "exit 2;" | perl -e "exit 0;"
%NONAME-E-NOMSG, Message number 00000002
$ show symbol $status
$STATUS == "%X00000002"

The Perl process with an exit value of 0 is a child of the one that
has an exit value of 2 so the final status we see in runperl is the
exit value of the parent, not of the child. But the child is actually
the test program whose exit value we're interested in and we don't get
it this way.

I *think* on *nix both perl invocations are subprocesses of the shell,
or in any case $? is 0 when all is said and done, i.e., the exit
status from the second Perl invocation.

I have a working patch that fixes it by doing a piped open and writing
to the pipe that provides the stdin of the child, then capturing the
child's exit and passing it along. It's ugly and makes the VMS code
in t/test.pl even more different than it already is. I've been
sitting on it for a bit hoping to think of something better, but it
looks like you've now done that. If /dev/null has portability problems
on Win32, simply lifting the platform-specific code from
File::Spec::dev_null should provide the solution in a very small
amount of additional code.

Craig A. Berry

Aug 7, 2013, 2:04:47 PM

to Nicholas Clark, Perl5 Porters (E-mail), VMSperl Mailing List

On Wed, Aug 7, 2013 at 12:18 PM, Nicholas Clark <ni...@ccl4.org> wrote:

> 2) Why does runperl() in t/test.pl abort on VMS if exit status is wrong?
> (or is it vmspipe.com aborting?)

When you run vms/test.com, it's running t/TEST by default. Isn't
bailing out early the normal behavior for TEST (as opposed to
harness)? The abort messages are just the usual result of
transmogrifying a generic unixy failure exit to a generic VMS failure
exit:

$ perl -e "exit 1;"
%SYSTEM-F-ABORT, abort

Eric Brine

Aug 7, 2013, 2:54:03 PM

to Nicholas Clark, Steve Hay, perl5 porters, vms...@perl.org

On Wed, Aug 7, 2013 at 1:40 PM, Nicholas Clark <ni...@ccl4.org> wrote:

> Does
>
> perl <nul.txt -e1
>
> work too?
>
> (ie it's still compatible with MS/DOS and earlier)

Yes.

> My assumption was based on what seem to be several regression tests
> that opened /dev/null and aren't failing on Win32.

There isn't that much difference between a file handle to an empty file and an unopened file handle. <$fh> returns false for both.

Nicholas Clark

Aug 8, 2013, 3:50:56 AM

to Craig A. Berry, Perl5 Porters (E-mail), VMSperl Mailing List

On Wed, Aug 07, 2013 at 12:50:55PM -0500, Craig A. Berry wrote:

> > $ sh -c 'pstree -p $$ | cat'
> > sh(13261)-+-cat(13263)
> >           `-pstree(13262)
> > $ ksh -c 'pstree -p $$ | cat'
> > cat(13349)---pstree(13350)

> The root cause of the problem on VMS is that command-line redirection
> is done by Perl and not by the shell. Tony's addition of the stdin
> parameter to runperl gives us the equivalent of:
>
> $ perl -e "exit 2;" | perl -e "exit 0;"
> %NONAME-E-NOMSG, Message number 00000002
> $ show symbol $status
> $STATUS == "%X00000002"
>
> The Perl process with an exit value of 0 is a child of the one that
> has an exit value of 2 so the final status we see in runperl is the
> exit value of the parent, not of the child. But the child is actually
> the test program whose exit value we're interested in and we don't get
> it this way.
>
> I *think* on *nix both perl invocations are subprocesses of the shell,
> or in any case $? is 0 when all is said and done, i.e., the exit
> status from the second Perl invocation.

Well, it turns out to differ between /bin/sh and /bin/ksh (which was the
cause of the AIX bug), but whether it's (ksh) child | parent
or (sh) child | child, either way the exit status of the pipeline is that
of the *last* command.
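The rule can be checked with a short sketch, which mirrors the exit 2 / exit 0 pair from the VMS example. Each pipeline runs in a fresh sh so that no inherited shell option (such as pipefail) affects the result. The variable names are only for illustration:

```shell
# The first command fails and the last succeeds.
first_fails=0
sh -c '(exit 2) | (exit 0)' || first_fails=$?

# The first command succeeds and the last fails.
last_fails=0
sh -c '(exit 0) | (exit 2)' || last_fails=$?

echo "failure in first command: pipeline status $first_fails"
echo "failure in last command:  pipeline status $last_fails"
```

This reports status 0 for the first pipeline and 2 for the second: only the last command's exit status reaches the caller.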

(This has turned out to be dangerous in the *nix Makefile. There were one
or two rules which were

some_command ... | sort >output

and if some_command failed, it wasn't noticed, because make picked up the
exit status of the sort, which was 0. I refactored things to avoid the need
for pipelines.)
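One way of avoiding such a pipeline is sketched below. It is not necessarily what the actual Makefile refactoring did. The some_command function here is a hypothetical stand-in that prints a line and then fails, and the file names are likewise made up:

```shell
# Hypothetical stand-in for some_command: prints a line, then fails.
some_command() { echo "partial output"; return 3; }

workdir=$(mktemp -d)
status=0

# Run the command on its own, so its exit status is not masked by sort.
some_command >"$workdir/unsorted" || status=$?

if [ "$status" -eq 0 ]; then
    # Sort only after the command is known to have succeeded.
    sort "$workdir/unsorted" >"$workdir/output"
    result="sorted"
else
    result="some_command failed with status $status"
fi

echo "$result"
rm -rf "$workdir"
```

Written as `some_command | sort`, the same failure would go unnoticed, because the status seen by the caller would be that of sort.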

This sort of means that the VMS pipeline setup is the wrong way round.
At least, it's the wrong way round if it wanted to be consistent with Unix.
Even if it's possible to change, I doubt that it would be a good idea.

> I have a working patch that fixes it by doing a piped open and writing
> to the pipe that provides the stdin of the child, then capturing the
> child's exit and passing it along. It's ugly and makes the VMS code
> in t/test.pl even more different than it already is. I've been
> sitting on it for a bit hoping to think of something better, but it
> looks like you've now done that. If /dev/null has portability problems
> on Win32, simply lifting the platform-specific code from
> File::Spec::dev_null should provide the solution in a very small
> amount of additional code.

Well, I sort of worked round it by not using a pipeline. But it achieves
the end goal.

Nicholas Clark

Nicholas Clark

Aug 23, 2013, 3:58:29 AM

to Craig A. Berry, Perl5 Porters (E-mail), VMSperl Mailing List

Yes, it is. I've got spoiled by t/harness, which doesn't, and failed to
realise that t/TEST behaves differently.

Nicholas Clark