On both unix and Windows, a process that tries to write to a pipe whose
far end is closed exits.
In unix terms, that's because the default action for SIGPIPE is to
terminate the process.
The rationale behind that is the ubiquitous use of idioms like
some_producer | head
But Tcl sets SIGPIPE to SIG_IGN (or masks it, which amounts to the same
thing), which lets it get an error back from write() instead of
losing control, and report it at script level as the exception
error writing <channel>: broken pipe
which, if it goes uncaught all the way up to the toplevel, adds the
usual traceback to stderr:
while executing
"foo bar"
(file "foo.tcl" line 123)
....
Of course this is annoying. But when the pipeline is longer, like in
tclsh a | tclsh b | tclsh c | ... | head
The result is hideous. Why on earth doesn't Tcl consider SIGPIPE just
as lethal as all other shells and utilities (sh csh cat sed awk grep
cut paste sort find ...)?
Of course, the workaround is to wrap the whole toplevel of the script
in a [catch], and detect and intercept the "broken pipe" error. But
how ugly, indenting everything by one level just for that; not to
mention the brittleness of pattern matching on an error message...
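For concreteness, that workaround looks roughly like this (just a
sketch, with a trivial cat-like body standing in for the real script):

if {[catch {
    # ... the whole script goes here, indented one level ...
    while {[gets stdin line] >= 0} {
        puts $line
    }
} msg]} {
    if {![string match "*broken pipe*" $msg]} {
        error $msg   ;# rethrow anything that is not the broken-pipe case
    }
}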
-Alex
I agree that this is an annoying quirk. My pragmatic workaround in
scripts that typically run in pipes is
proc puts! str {if {[catch {puts $str}]} exit}
but of course I'd be happy if Tcl itself handled this situation more
gracefully...
Well, that and that the shell arranges for things like ^C to signal the
last process in the pipeline rather than the first.
> The result is hideous. Why on earth doesn't Tcl consider SIGPIPE just
> as lethal as all other shells and utilities
Because SIGPIPE is also generated by broken sockets?
> mention the brittleness of pattern matching on an error message...
You're parsing the wrong error message. Check $errorCode.
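For a broken pipe, errorCode is a list of the form
POSIX EPIPE {broken pipe}, so the test can look roughly like this
(a sketch):

if {[catch {puts $line} msg]} {
    if {[string match {POSIX EPIPE *} $::errorCode]} {
        exit          ;# the reader went away: leave quietly
    }
    error $msg        ;# any other write error is a real problem
}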
--
Darren New / San Diego, CA, USA (PST)
Remember the good old days, when we
used to complain about cryptography
being export-restricted?
Because many Tcl scripts want to recover from an enclosed pipeline
terminating? (Yeah yeah, we ought to offer better signal handling.
Patches to make this work in threaded mode are welcome; we can always
steal the code from TclX for the non-threaded case.)
Donal.
Because the failing 'puts' might only be one (minor) part of your
application? As it is now, I have the chance to 'catch' that failing
'puts', take appropriate action (signal user, reopen pipe),
AND continue with the rest of the program.
R'
set a [exec somecommand]
which will deliver a SIGPIPE to the interpreter when
somecommand exits.
Nope... let's leave the behavior alone thank you very much.
Ron
Amen!
Don't touch SIGPIPE handling...
--
Matthias Kraft
Software AG, Germany
No. No SIGPIPE is delivered to the interpreter in that situation.
> Nope... let's leave the behavior alone thank you very much.
Oh, you're very welcome, but I advise you to learn a bit about signals
before giving definite advice like this.
-Alex
OK, I've been unclear. I meant the *default* behavior. This implies
that some mechanism should be available to get protection against
the lethal SIGPIPE. For example, [fconfigure -erronsigpipe].
Notice that this is exactly what happens in C: by default SIGPIPE is
lethal, but you can handle it yourself by ignoring SIGPIPE and
handling the EPIPE from write().
Now I do realize that Tcl has behaved this way for so many years
that such a compatibility earthquake is impractical ;-) But an
acceptable compromise would be to add to the beginning of all proper
scripts:
fconfigure stdout -lethalsigpipe
Which of course doesn't need any signal handler (so it is readily
Threads-friendly); just intercept EPIPE returned from write() and exit
instead of raising an exception. Yes I know who you think should write
it ;-)
-Alex
That's an intriguing possibility.
TIP maybe?
--Joe English
That really depends. Sometimes it will and sometimes it won't. For example:
set a [exec echo foo <<bar]
will send a SIGPIPE. That would be annoying, especially as in some cases
programs can terminate early if an unexpected error condition occurs
(e.g. you're trying to dump a file to lpr, but lpd is down) and it is in
the script's interest to recover from the failure by itself instead of
keeling over.
It should also be noted that on some operating systems it is not just
writes to pipes that can cause SIGPIPE to be sent. A case in point is
that of sockets on Solaris. You don't want your tclhttpd instance
keeling over just because the client timed out on you!
Arguably, making EPIPE optionally lethal (but only settable in trusted
interpreters) is best. But that's just like putting in a low-level
"if-catch-puts-then-exit", and in fact you could script that in
yourself, like this.
package require Tcl 8.5 ;# I use 8.5-isms :-)
set deadlyFDs stdout
rename puts real_puts
proc puts args {
    catch {real_puts {*}$args} msg opt
    if {
        [dict get $opt -code] == 1 &&
        [string match "POSIX EPIPE *" [dict get $opt -errorcode]]
    } then {
        global deadlyFDs
        set ch [lindex $args end-1]
        if {$ch eq "" || $ch eq "-nonewline"} {set ch stdout}
        if {$ch in $deadlyFDs} exit
    }
    return -options $opt $msg
}
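With that override in place a filter can keep using plain [puts]; for
example (a sketch), it will simply exit once the reader of stdout has
gone away:

while {[gets stdin line] >= 0} {
    puts $line
}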
Now, tell me again why we need to handle SIGPIPE?
Donal.
"...it is delivered to the interpreter."
You said
> No. No SIGPIPE is delivered to the interpreter in that situation.
Are these different?
If the interpreter has set the SIGPIPE action to exit, as you propose,
that will cause the script to exit, since it is the interpreter that
is executing the script. There is no separate 'script process': the
interpreter's handling of SIGPIPE and the script's 'handling' of
SIGPIPE are not separable, unless exposed to the script via some
interface.
Unless what you propose is that SIGPIPE handling be turned on and off
within the interpreter... in which case I think you need to specify
more clearly what you want: that is, when you want the interpreter to
allow SIGPIPE to cause it to exit, when you want it to be caught or
ignored, how it should differentiate between those cases, and what
action to take if it catches a signal rather than exiting.
I'm in agreement with Donal that any change in SIGPIPE handling
should be something that must be explicitly requested by the
script. On the other hand, that's exactly what the tclx extension's
[signal] command supports. Is there a reason you can't use that?
Ron.
fconfigure filedescriptor -sigpipe action
where action might be one of:
lethal
ignore
script {some command}
That would give some nice general functionality everyone might
enjoy and that tclx does not offer.
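Hypothetical usage, just to illustrate the proposal (none of these
option values exist today; $sock and $pipe are made-up channel names):

fconfigure stdout -sigpipe lethal
fconfigure $sock -sigpipe ignore
fconfigure $pipe -sigpipe {script {puts stderr "reader went away"}}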
Ron.
The one reason I have for objecting to this is that I don't think
SIGPIPE tells you which FD had the pipe-closed condition. Better to
structure it as an EPIPE handler.
Donal.
I once thought about
fileevent $fd exception $script
to handle EOF, SIGPIPE and friends
uwe
That's what he suggested at the start of this subthread: see the EPIPE
error on the write and act accordingly... so -sigpipe is probably a bad
name as well... maybe -pipeerroraction is a better option name?
Or maybe -onremoteclose, and so hide the details about how the condition
is detected? (OK, it's late so my name-creating powers are weak...)
Donal.
-alexandreferrieux
;-)
Short of anything better, I'm tempted to settle for that. I'm more
interested in
seeing it happen than finding a kewl name... Let's TIP, then.
-Alex
Is there any difference between Tcl as a shell (tclsh) and Tcl as a
language? I can't see why a program should exit just because of an
error. Communication errors in particular are not unexpected. The
reason for so many types of error codes is so that the application can
recover, not to provide more signals to exit. Also, when you consider
that Tcl is a glue language, exiting by default on anything short of
unrecoverable disaster seems unwise. Adding a callback switch seems
like a poor way to code your recovery procedure. Error detection and
recovery shouldn't be hidden away somewhere like they are an
unimportant part of the application. Maybe the code would look nice,
but it would also look even nicer if these critical operations were
bundled up in a proc and reused.
Primarily because traditionally, UNIX programs didn't check the return
value from write() to see if the call worked. A signal that defaulted to
crashing the program was the only way to get their attention when you
hit ^C. I don't think Tcl needs such, given that it throws an error
that crashes the program "cleanly" if you don't catch it.
See the beginning of the thread for the rationale.
See the end for the conclusion: we're heading for a [fconfigure]
switch, which you're welcome not to use if you dislike it.
The main target of this switch is stdout, for _filters_, which are a
specific case of glue, with specific properties: when a filter learns
that nobody is interested in its output any longer, it makes
sense to exit. The situation is of course utterly different for a more
complex glue with higher connectivity.
-Alex
Not that I oppose a new fconfigure option, but I still haven't seen
any explanation of what keeps you from exiting now if you want to.
I see - "I get an error during a script, i really would prefer to exit"
I don't see why "if [catch {do stuff}] exit" doesn't satisfy.
Bruce
On Oct 13, 9:38 am, Alexandre Ferrieux <alexandre.ferri...@gmail.com>
wrote:
I did see the beginning of the thread. Tcl is a programming language.
Bash, and tclsh, and less and more are programs. Putting a switch on a
channel is trying to turn a command into a program. Are there any
other examples of commands doing this? In any language? Even a shell
(the shell command, not the shell)? It seems like poor programming
style and pure laziness. Programming languages need to be simple.
Actually this is one of my biggest gripes with switches. Some
developers think that a switch is okay to add because it is backwards
compatible (the "don't use it if you don't need it" argument... but all
switches are used by everyone: every call passes through the switch
handling, and the source code becomes more complex). In most cases, it is
better to wrap a regular API with a filter to get the behavior you
want.
Instead of coding behavior into a switch, you could add something
else to fconfigure. Maybe fconfigure could return the last
result code. Then this tiny amount of information could be used by
developers familiar with channel programming to decide what to do.
Programming channels/sockets/fds with Tcl is easy because much of the
real complexity is removed; trying to add a little bit of it back via
switches is very short-sighted. One other problem is that the call to
fconfigure could be well removed from the actual use of the channel
and the decision to handle the error.
I'm not sure this response is coherent, but looking at fconfigure, I
see channel configuration (get and set options), and at most
filtering. Exit code is not a filter, it is a branch. My argument is
that branch code should not be in switches. I think this is distinct
from constructs like try..., etc. where you make very explicit, in one
place, what is going on.
Just to note that there is a huge difference between program behavior
and command behavior. Shell pipelines are composed of programs, not
commands. Solving this problem with a switch really trivializes the
issue. In a shell pipeline, if a program fails to work as a filter,
the whole thing has failed. There is no other possible interpretation:
pipes don't retry, loop, or branch off. Any single point of failure is
fatal. Essentially a pipeline is a new composite command. It returns
an error and 'exits' (actually it just finishes badly), but the OS
doesn't exit.
It doesn't satisfy conciseness and readability.
I'd like to preserve the ability to express simple things in a few
lines like
while {[gets stdin line] >= 0} {
    if {[regexp $pat $line]} {puts $line}
}
as a tiny grep. Adding a toplevel if {[catch]} is of course possible,
but blurs the elegance of the idiom.
Elegance and conciseness matter in several situations. The main one is
writing few-liner filters like the above. When you aim for modularity
and orthogonality, it's not unusual to split a processing chain into a
dozen such filters. Another likely environment is the Computer
Language Shootout: there, languages get compared in expressivity and
style on various tasks, some of which are very simple.
-Alex
I'd code that, for benign close-pipe behavior (which just adds one
line):
proc puts! str {if {[catch {puts $str}]} exit}
while {[gets stdin line] >= 0} {
    if {[regexp $pat $line]} {puts! $line}
}
The proposed solution with fconfigure also consumes a line, so you are
even.
However, the original premise is that tclsh doesn't work well in a
pipeline. That it does something strange. So, I decided to test the
behavior with a little program. I rewrote it to gets stdin and puts to
stdout. The claim is that there is a difference between calling exit
and responding to an error and unwinding. Here is the pipeline:
$ cat /tmp/errors.txt | ./matchLinePipe a | grep -c "" /dev/fd/0
1492
Then I added into the while loop this code:
if {$matches > 100} { exit }
Result was 101 instead of 1492.
Then I changed [exit] to [set expression {[ab}], a regexp which will
not compile.
An error is printed to console, but grep gets all the data and returns
101.
If tclsh doesn't work well in a pipeline, then we have a real problem,
because other things besides a broken pipe can cause a program to
error out. But in this example, I don't see anything wrong. The error
didn't pass through to grep. If I put the results into a file, the
error doesn't show up there either. So, other than seeing the error on
your screen, I'm not sure what the issue is here.
Here is the entire test program:
#!/bin/sh
# This line continues for Tcl, but is a single line for 'sh' \
exec /web/nsd45/bin/tclsh8.4 "$0" ${1+"$@"}
global argv
if {[string eq "-c" [lindex $argv 0]]} {
    set count 1
    set argv [lreplace $argv 0 0]
} else {
    set count 0
}
set expression [lindex $argv 0]
set matches 0
while {![eof stdin]} {
    set line [gets stdin]
    if {[regexp $expression $line]} {
        incr matches
        if {$count} {
        } else {
            puts stdout $line
        }
    }
    if {0} {
        # test error conditions here
        if {$matches > 100} {
            set expression {[ab}
        }
    }
}
if {$count} {
    puts stdout $matches
}
If I am concerned with seeing error messages I can do this:
$ cat /tmp/errors.txt | ./matchLinePipe a > result 2>/dev/null
$ grep -c "" result
101
Yes, but the performance hit is too much for small filters.
-Alex
My only problem is that if a,b,c,d, and e are simple tclsh scripts
doing uncaught [puts], then the frequent idiom
a | b | c | d | e | head
will emit several lines of errorInfo to stderr, five times over.
That's not a tsunami, right, but if the stderr of this goes to a log
file (which it often does in my case), it tends to "dilute" more
interesting errors.
-Alex
unfortunately only "enter" and "leave" are currently supported ;-)
tip di tip ...tata
uwe
Ahh, after some more testing, what I see is that each tclsh script
a,b,c... may keep chugging along regardless of what is happening
upstream. I tried this:
$ cat /tmp/errors.txt |./matchLinePipe a | ./matchLinePipe2 a | grep
-c {[ab} /dev/fd/0
Got an error for grep, then got an error for the first matchLinePipe
(at 101, regexp is made bad), then the second matchLinePipe2 aborts
due to a broken pipe writing to grep. I tried adding flush to each
puts, but it didn't help. Is this caused by a channel buffer? Where is
the data going? It seems like everything should fail with a message at
the first iteration. If the problem is with detecting the downstream
exit, it might be worth having, but not necessarily an exit, just an
error to puts/gets.
OK. Suddenly I realize you're not that familiar with what the SIGPIPE
mechanism is about...
The key is that each SIGPIPE needs (1) a write() and (2) that the
reader be already dead.
So, the end of a pipeline like a | b | c | d | e | head is a five-step
"progressive wave" of doom:
Suppose for simplicity that all commands are line-buffered, and do
have the default libc SIGPIPE handler set to exit.
(1) after 10 lines, head dies.
(2) Next line written by e kills it (SIGPIPE)
(3) Next line written by d kills it (SIGPIPE)
(4) Next line written by c kills it (SIGPIPE)
(5) Next line written by b kills it (SIGPIPE)
(6) Next line written by a kills it (SIGPIPE)
The progressiveness can be fairly conspicuous if some of the early
filters pass only a small percentage of their input (eg grep w/ rare
patterns). But in most everyday uses, the process is quick enough (so
much so that you never noticed ;-).
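For the Tcl side of it, a trivial generator (hypothetical gen.tcl, run
as "tclsh gen.tcl | head") makes the behavior easy to watch: instead of
dying silently from SIGPIPE, it dies with the traceback this thread
started with, once its output can no longer be flushed:

# gen.tcl -- endless producer; with Tcl's SIG_IGN default, the death
# surfaces as an uncaught "error writing stdout: broken pipe"
set i 0
while 1 {
    puts "line [incr i]"
}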
This mechanism is in fact a natural extension of the overall flow
control in the pipeline: if e blocks (stops reading), eventually d
will block too (after filling the pipe's buffer of course), then c,
then b, then a. This allows you to write
a | b | c | d | e | less
and notice that, however CPU-intensive the generator a and filters
b,c,d,e may be, while you sit reading at less's display, not
scrolling, computations end up *pausing*. The SIGPIPE mechanism, then,
extends this by allowing you to quit "less" by typing its own quit-
command (Q), and have the rest of the chain properly (though not
instantaneously) dismantled. No, don't tell me you exit by ^C in this
case ;-)
The ubiquity of the default SIGPIPE->exit binding is a Good Thing in
this context, because if you had to rely on the five developers of
a,b,c,d,e to properly react to a write error, then most of the time
one member of the chain would just continue until its input were
exhausted, which may be never (yes | grep y).
Notice also that even strong discipline among the programmers, eg
saying Thou Shalt Always Exit On Write Errors, is not a solution
either. Indeed, the EPIPE case (the errno value set by write when
SIGPIPE is masked) is, as shown above, a kind of "normal" error; while
"Disk Full" is more of an bnormal condition, which should at least be
reported before exiting. This means that the recommended idiom should
contain the EPIPE test.
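In Tcl terms, that recommended idiom could look like this (a sketch,
using errorCode to tell EPIPE apart from, say, a full disk):

if {[catch {puts $line} msg]} {
    if {[string match {POSIX EPIPE *} $::errorCode]} {
        exit             ;# reader gone: the "normal" end of a filter's life
    }
    puts stderr $msg     ;# disk full and friends get reported...
    exit 1               ;# ...before exiting with a failure status
}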
Now that you see the background, maybe you'll stop thinking that the
libc tradition of SIGPIPE->exit is just a sweetie for lazy programmers.
Otherwise, I have bad news for you: I'm not alone ;-)
-Alex
Well, the difference is that my line works right now in 8.4, while the
fconfigure solution is a feature request :^) I'm often happy to find a
fast workaround, just to get the job done (and not make a bad
impression on other users of my scripts, who most often still are at
8.4.1 ...)
Alex,
I admit that I'm starting to warm up to your idea. Thanks for
persisting in explaining the situation. At the start of the discussion
I made an assumption about puts/gets, and that was that they would
fail instantly and the upstream and downstream programs would somehow
know this. Now I'm not too sure I understand the timing of program
failure and writing to stderr, etc. I'm also not too sure what is to
be expected.
The bottom line appears to be that each program in the pipeline is
relatively independent. Each one may continue to chug-along at a
somewhat independent rate. Assuming a buffer in either in or out, it
seems difficult to predict behavior. Personally I'm more interested in
the buffering than what goes to stderr. From my limited tests, each
program in the pipeline gets to emit its own error message. Why did it
fail? Grep emits one, tclsh is verbose, but each one also emits an
error. If a program doesn't error out, it doesn't send anything to
stderr. But in my simple test, grep exited because the regexp didn't
compile, then the first tcl program exited because it got a bad
regexp, then the second tcl program, between the two, exited
because grep exited. I see four programs (cat was first), three
exited for different reasons and at different times. Which of these is
unimportant? Can you predict in advance? I'm surprised at the
complexity, but I don't know how exiting without printing a message to
stderr would be helpful.
Yes Richard, I'm the first to applaud something that works right now.
I myself am playing everyday with 8.3.5 on dozens of machines (Fedora
Core 1 shipped with that one, and upgrading a working system makes no
sense).
But my request (soon to be turned into a TIP, promised) is not in a
hurry. Of course I know I can overload puts, or even as you suggest
write a variant. But that comes with a runtime cost. My dream would be
that the tiny grep-loop higher up in this thread become as fast as
reasonable given the level of bytecode compilation we have at a given
time [*], while exactly emulating the nice SIGPIPE behavior of the
libc tradition.
-Alex
[*] This means the speed of [gets]/[regexp]/[puts] with little
overhead.
What you describe as "difficult to predict", I prefer to call
"globally predictable"; this means that some uncertainty remains
(described below), but it does absolutely no harm.
Here is why:
An important notion with pipes is the (limited, but nonzero)
"elasticity" provided by the pipe's buffer. That is, in "a | b", even
if "b" doesn't read anything on its stdin (for example because it is
busy initializing some internal, complicated thing), "a" can still
start writing things, up to PIPEBUF bytes (typically 4k), before
blocking. This produces the "somewhat independent rate" you mention.
This elasticity has important consequences on performance: even when
"a" writes by very small chunks, the OS's scheduler is not compelled
to yield the CPU to "b" at each and every one, as would be the case
with "rigid" connectivity. Instead, "a" will consume its time-quantum
fully (assuming the sum of its writes doesn't exceed PIPEBUF), and
*then* "b" will be woken up, and be content to receive the whole load
in one read() syscall (assuming the length argument is large enough).
The "elasticity->performance" rule applies also one layer up with
stdio's buffering. The very same description applies, replacing kernel
PIPEBUF buffers with stdio's userland (FILE *) buffers, and context
switch overhead by syscall overhead.
In everyday life, the usual unix tools use stdio in the "canonical"
way, that is, without forcing either block-buffered or unbuffered
output. In this case, stdio uses a smart heuristic to guess what is
best: if the stdout is a terminal, switch to unbuffered, so that the
human on the other side will be happy to see timely results; if the
stdout is a file, pipe, or socket, instead, switch to block-buffering
so that scheduling elasticity is maximal and syscall overhead minimal,
so that the human in the end is still happy to see the whole job
finish in no time over gigabytes of input.
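Tcl's channel layer follows a similar convention and exposes the knob
through fconfigure; for example (a sketch; exact defaults vary by
platform and version):

# stdout is typically line-buffered on a terminal and fully buffered
# on a pipe or file; a filter stage can ask for a bigger buffer:
fconfigure stdout -buffering full -buffersize 65536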
All the above is here to illustrate the "globally predictable" part.
Now, what's the remaining uncertainty, and why doesn't it do any
harm?
It is in the "position" within its own input, at which each of the
a,b,c,d,e processes will die. The reason is clear: "elasticity" means
that you'll be "ahead" by a various amount, depending on which is the
"scarce" resource among:
- input: producer is slow, you block on read()
- output: consumer is slow, you block on write()
- own-cpu: your own processing is slower than i/o,
you never block but force the one on your left to block on
write() and the one on your right to block on read()
So, for example, in the second case, when you get the lethal SIGPIPE,
maybe the consumer will not even have read the last PIPEBUF bytes you
wrote! In another situation, where rare inputs get instantly through
the whole chain (everybody blocks on read()), the progressive wave
will take a long time to get rid of "a".
So, the exact ordering of things and position-of-doom vary broadly.
But who cares?
If you followed me so far, you'll see that the whole mechanism of
preemptive scheduling and interprocess-communication, whose job is to
distribute CPU fairly in a weakly-coupled ecosystem of processes, does
its job correctly ;-) Specifically, the finite PIPEBUF value
guarantees that the amount of "chugged-ahead" data (which are "wasted"
from the human's point of view) stay limited. Could we expect anything
better?
-Alex