Cmd pipe failure


dslate

Jul 4, 2012, 8:27:46 AM
to juli...@googlegroups.com
While playing around with Cmd objects, I ran into a situation in which a Cmd pipe fails because of the irrelevant failure of one of its components.  A common type of pipe written in the shell (sh, bash, etc.) looks like:

  ./program | sed 5q

The pipe input, ./program, may abort due to a write failure to a broken pipe after sed has read its 5 lines, but that is of no consequence to the output of the whole command line.  However, a similar pipe constructed using Julia Cmd objects apparently fails without producing its output if one of its components aborts.  The following Julia log shows what happens.  Has this situation been considered by the developers of Cmd, and if so, is there a convenient workaround (besides combining the pipe components into a separate shell command file)?
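
For reference, here is a minimal bash sketch of the shell behaviour described above. It assumes bash (for the PIPESTATUS array) and uses `yes line` as a stand-in for ./program; the variable names are only illustrative. It captures both the pipeline's output and the exit status of each component.

```shell
#!/usr/bin/env bash
# `yes line` writes "line" forever; `sed 5q` quits after five lines.
# The trailing echo appends the per-command exit statuses of the pipeline.
result=$(yes line 2>/dev/null | sed 5q; echo "${PIPESTATUS[*]}")

output=${result%$'\n'*}      # everything before the last line
statuses=${result##*$'\n'}   # the last line, typically "141 0"
read -r writer_status reader_status <<< "$statuses"

printf '%s\n' "$output"
echo "writer status: $writer_status, reader status: $reader_status"
```

The five lines arrive intact even though the writer reports a non-zero status (normally 141, i.e. 128 + SIGPIPE), which is the situation the Julia pipe treats as an error.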

--------------------------------------------------------------------------------------------------------
Script started on Wed Jul  4 07:01:51 2012
david@LC2430HD:~/julia$ cat tstcmd.jl
println( VERSION)
println( readall( `uname -a`))
println( readall( `date`))
system( "dd if=/dev/urandom bs=512 count=1000 | base64 >tstcmd.in")
system( "cat tstcmd.in | sed 5q >tstcmd.out")
println( readall( "tstcmd.out"))
println( readall( `cat tstcmd.in` | `sed 5q`))
david@LC2430HD:~/julia$ 
david@LC2430HD:~/julia$ julia tstcmd.jl
0.0.0+90138694.r4e63
Linux LC2430HD 3.2.0-26-generic #41-Ubuntu SMP Thu Jun 14 17:49:24 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Wed Jul  4 07:02:17 CDT 2012

1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.0486268 s, 10.5 MB/s
fv5uw7Vu3cMhg7l6gk23CATZEfCddcaZRoY5+0uVk3a4ax3Alxim0lXfIMOzMkDCfg3B+RHveh+/
tqcYPblFS/FY3Zh8edB0eO2VmtiIRrSyO/5r5HyddxvtO58kde9YvrdXn7q0WaF0gHedQE4CbF+B
6xMFNA+nNUIdiNmkZEzF+qiyAzWUEdXxj21rzgFG81uK2pEPD4NQxuY13oA4z27XT1g2RFZnepHf
mSGVSkX7Ohg38LoZVl3tlUf1+RcRsSGwGXA6mkyCXz/OSAhCKrGB9Spq8CnKbnqzCPZY+2rDURnV
g3an1Bpn0K5iaEEq75+70s7G5Xo/zQkrk0rveXzbVYaX/DsWnuMqhCxp+FLxXgWCt4Wo6suTLM0B

failed process: `cat tstcmd.in`
 in pipeline_error at process.jl:468
 in _readall at process.jl:485
 in readall at process.jl:492
 in include at boot.jl:197
at tstcmd.jl:7
 in include at boot.jl:197
david@LC2430HD:~/julia$ exit
exit

Script done on Wed Jul  4 07:02:20 2012
--------------------------------------------------------------------------------------------------------


Miguel Bazdresch

Jul 4, 2012, 8:49:46 AM
to juli...@googlegroups.com
On my machine, running


system( "cat tstcmd.in | sed 5q >tstcmd.out")

works properly (substituting tstcmd.in for the name of a local large text file).

Are you using a recent build of Julia?

-- mb

dslate

Jul 4, 2012, 9:12:07 AM
to juli...@googlegroups.com
Miguel,

Yes, the system() command works, but the julia Cmd pipe doesn't.  That's the point of my post.

-- Dave Slate



Miguel Bazdresch

Jul 4, 2012, 9:34:17 AM
to juli...@googlegroups.com
I'm sorry; I missed that. In any case, I ran your script and it indeed errors out as you describe. What is weird is that if I run

println( readall( `cat COPYING` | `sed 5q`))

Julia prints the first five lines of COPYING with no error (this file contains the BSD license). So there is something in tstcmd.in that is causing the pipe to fail.

-- mb

Miguel Bazdresch

Jul 4, 2012, 10:31:43 AM
to juli...@googlegroups.com
I think this is related to the size of tstcmd.in. I did a manual binary search over file sizes. My results are that this works:

system( "dd if=/dev/urandom bs=1 count=51543 | base64 >tstcmd.in")
run(`cat tstcmd.in` | `sed 5q`)

and this fails:

system( "dd if=/dev/urandom bs=1 count=51544 | base64 >tstcmd.in")
run(`cat tstcmd.in` | `sed 5q`)

-- mb

dslate

Jul 4, 2012, 12:54:42 PM
to juli...@googlegroups.com
My understanding is that if there is enough room in the pipe buffer for the complete output of the pipe writer, then the writer won't necessarily abort when the pipe reader quits before reading all of the pipe buffer's contents.  But if the pipe fills up and the writer blocks waiting for a reader that is no longer running, then that will definitely cause the writer to abort.
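
As a rough check of this explanation against the 51543/51544 threshold reported earlier in the thread, here is a small bash sketch of the arithmetic. It rests on assumptions that nobody in the thread confirmed: a 64 KiB pipe buffer (the Linux default), a reader that consumes one 4 KiB block before quitting, and base64 wrapping its output at 76 characters per line (the GNU default). The helper b64_size is hypothetical.

```shell
#!/usr/bin/env bash
# Size in bytes of base64 output for n input bytes, assuming 4 output
# characters per 3 input bytes (padded) and a newline after every
# 76-character line, including the last one.
b64_size() {
    local n=$1
    local chars=$(( (n + 2) / 3 * 4 ))
    local lines=$(( (chars + 75) / 76 ))
    echo $(( chars + lines ))
}

# Assumed values, not taken from the thread.
pipe_capacity=65536   # bytes the pipe can hold
reader_block=4096     # bytes the reader takes before it quits
limit=$(( pipe_capacity + reader_block ))

for n in 51543 51544; do
    size=$(b64_size "$n")
    if (( size <= limit )); then verdict="fits"; else verdict="overflows"; fi
    echo "$n input bytes -> $size encoded bytes ($verdict, limit $limit)"
done
```

Under those assumptions the encoded file is 69629 bytes in the working case and 69633 bytes in the failing case, on either side of a 69632-byte limit. That is consistent with the buffer explanation, though it is an inference rather than something verified against the Julia or sed sources.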

I hope that makes sense,

-- Dave Slate



Stefan Karpinski

Jul 5, 2012, 2:26:32 PM
to juli...@googlegroups.com
The error message in your original post indicates that the cat command doesn't exit cleanly (SPOILER: it dies from receiving an unhandled SIGPIPE signal). In general, Julia's command execution is far more careful (aka finicky) when it comes to termination status of subprocesses than the shell or other programming languages. In the shell (and therefore in system), how a process terminates is completely ignored by default. I wrote a blog post back in March about this kind of thing if you want to read it:


I've been meaning to publish a follow-up since then but haven't gotten around to finishing it up :-\

In this case, let's see the difference:

bash-3.2$ (cat /dev/zero | head -c100) && echo true
true
bash-3.2$ set -o pipefail
bash-3.2$ (cat /dev/zero | head -c100) && echo true

Before I set the pipefail option, the pipeline is considered successful because the head command succeeds, even though the cat command fails when its output stream gets forcibly closed. After the pipefail option is set, bash considers the pipeline to have failed if any component fails, so the failure of cat prevents success. Either way, system doesn't care and will just ignore the exit status of all commands. Julia is not so happy-go-lucky:
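
The same comparison can be sketched as a script that records the pipeline's exit status in both modes. This assumes bash; run_pipeline and the status_* variables are illustrative names, not anything from Julia or the thread.

```shell
#!/usr/bin/env bash
# The function's return status is the status of its last command,
# which here is the whole pipeline.
run_pipeline() {
    cat /dev/zero 2>/dev/null | head -c100 > /dev/null
}

set +o pipefail
run_pipeline; status_default=$?     # status of head only

set -o pipefail
run_pipeline; status_pipefail=$?    # last non-zero status in the pipeline
set +o pipefail

echo "default: $status_default, pipefail: $status_pipefail"
```

Without pipefail the status is that of head, which is 0. With pipefail it is the non-zero status of cat, normally 141 (128 + signal 13).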

julia> readall(`cat /dev/zero` | `head -c100`)
failed process: `cat /dev/zero` [ProcessSignaled(13)]
 in pipeline_error at process.jl:459
 in pipeline_error at process.jl:468
 in _readall at process.jl:488
 in readall at process.jl:495

The cat process terminated because it got an untrapped signal 13 (SIGPIPE), which makes sense since head just closes the pipe and cat keeps trying to write to it. [I just improved the error message so that it includes the process status. Also, I have no idea why the cat process sometimes prints "cat: write error: Broken pipe" and sometimes doesn't.]

Julia, by default, considers any termination status besides exit(0) to indicate that something went wrong. If we want to ignore the status of cat in this pipeline, we can use the ignorestatus function:

julia> readall(ignorestatus(`cat /dev/zero`) | `head -c100`)
"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"

I had to tweak the code a bit to make this work because ignorestatus used to treat processes with any *exit* status as successful, but still considered processes that died from an untrapped signal to be failures. Now it considers both to be fine.

dslate

Jul 5, 2012, 3:16:47 PM
to juli...@googlegroups.com
Thanks Stefan for your detailed explanation.  Yes, I had read your "shelling-out-sucks" post.  I was not aware of the ignorestatus function.

-- Dave Slate

John Cowan

Jul 5, 2012, 7:02:46 PM
to juli...@googlegroups.com
On Thu, Jul 5, 2012 at 2:26 PM, Stefan Karpinski <ste...@karpinski.org> wrote:

> In general, Julia's command execution is far more careful (aka
> finicky) when it comes to termination status of subprocesses than the shell
> or other programming languages.

I very much agree that this is the Right Thing in general, but I also
strongly believe that SIGPIPE should be treated as normal termination.
Pipes are essentially a form of lazy data stream connecting two
coroutines, in which the consumer resumes the producer when it needs
more input. If the consumer chooses not to read from the producer,
this ought not to be treated as a fault in the producer, which is what
any other sort of signal would mean. The reason for SIGPIPE to happen
at all is to prevent useless processes from hanging about with nowhere
to send their output to.

In particular, using ignorestatus on any occasion when the consumer
*might* terminate spontaneously without reading all of its input ("sed
q" is just one such case), which in principle can mean *any* process
that writes to a pipe, will wind up masking actual errors (the best is
the enemy of the good). It should be possible to determine that a
SIGPIPE has occurred, of course, but it should not propagate an error
into the controlling process.
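
To make the proposed policy concrete, here is a bash sketch of it, not a description of how Julia behaves. It treats a component killed by SIGPIPE as having terminated normally and still reports any other failure. The helper pipeline_ok is hypothetical, and the sketch assumes bash's convention of reporting death by signal N as status 128 + N, with SIGPIPE being signal 13.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if every status passed in is
# 0 (clean exit) or 141 (128 + 13, killed by SIGPIPE).
pipeline_ok() {
    local status
    for status in "$@"; do
        case $status in
            0|141) ;;          # clean exit, or producer stopped by SIGPIPE
            *) return 1 ;;     # any other exit code or signal is a real failure
        esac
    done
    return 0
}

# A producer cut short by its consumer.
yes 2>/dev/null | head -n 1 > /dev/null
statuses=("${PIPESTATUS[@]}")
if pipeline_ok "${statuses[@]}"; then echo "ok"; else echo "failed"; fi

# A producer that genuinely fails.
false | cat
statuses=("${PIPESTATUS[@]}")
if pipeline_ok "${statuses[@]}"; then echo "ok"; else echo "failed"; fi
```

Because the statuses are kept in an array, the caller can still see that a SIGPIPE occurred even though it is not reported as an error.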

--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures