Literal pipes in [open "|cmd..."]

Alexandre Ferrieux

Dec 3, 2006, 6:30:19 PM
Hi,

I'm an old Tcler and this one looks like an FAQ, but... Who knows ?
When starting a child with [open "|cmd args" r], it seems impossible to
properly pass a single "|" character as a standalone argument of the
child. Is it, or did I miss a secret escape out of this little piece of
quoting hell ?

(I know workarounds like [open "|sh -c {exec cmd a | b}" r]. I just
want to be sure there's no other way.)

TIA,

-Alex

Donal K. Fellows

Dec 3, 2006, 7:40:13 PM
Alexandre Ferrieux wrote:
> Is it, or did I miss a secret escape out of this little piece of
> quoting hell ?

You are correct that there is no way to do that. Your workaround (using
an external shell) is the easiest one, and is probably why nobody's
bothered fixing the problem. If you're on Windows, I believe the
easiest thing involves something horrible with batch files...

Donal.
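
To make the problem concrete, here is a minimal sketch, assuming a Unix-like system with echo and cat on the PATH. The point it tries to show is that [exec] and [open "|..."] look at the value of each argument after Tcl has finished substituting, so quoting at the Tcl level does not hide a "|". The variable names are only for illustration.

```tcl
# Assumes a Unix-like system with echo and cat available on PATH.
# All three calls build the pipeline "echo a | cat": exec sees the
# argument value "|" and never learns how it was quoted in the script.
set plain  [exec echo a | cat]
set quoted [exec echo a "|" cat]
set braced [exec echo a {|} cat]

# The same holds for a command pipeline opened as a channel.
set chan [open "|echo a | cat" r]
set piped [gets $chan]
close $chan

puts [list $plain $quoted $braced $piped]
```

In each case cat receives only "a" on its standard input; the "|" never reaches echo as an argument.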

Cameron Laird

Dec 3, 2006, 7:18:17 PM
In article <1165188618.8...@j44g2000cwa.googlegroups.com>,
.
.
.
Yes--at least, as far as I know. More certain is that these
exec-and-open problems are, in many ways, Tcl's most undeniable
blemish.

'Good to cross paths with you, Alexandre.

Alexandre Ferrieux

Dec 4, 2006, 2:27:46 PM

On Dec 4, 1:40 am, "Donal K. Fellows" <donal.k.fell...@man.ac.uk>
wrote:


> If you're on Windows, I believe the
> easiest thing involves something horrible with batch files...

Naively, I'd have said something like

open "|cmd.exe /c {c a ^| b}"

but cannot try it right now...

And yes, I forgot a backslash in my sh -c example, which should read:

open "|sh -c {c a \| b}"

-A

Alexandre Ferrieux

Dec 4, 2006, 5:14:41 PM

Of course by

> open "|sh -c {c a \| b}"

I meant:

open "|sh -c {c a \\| b}"

Once you've wandered in quoting hell, you never know whether you're
quite back :-)

-Alex
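
The quoting layers in the corrected example can be peeled apart as follows. This is a sketch assuming a POSIX sh on the PATH, with echo standing in for the real command "c"; building the pipeline with [list] is one way (not the only one) to drop the double-quote layer.

```tcl
# Assumes a POSIX sh on PATH; echo stands in for the real command.
# Layer 1 (Tcl double quotes): "\\|" becomes "\|".
# Layer 2 (list parsing done by open): {...} keeps "\|" intact in one word.
# Layer 3 (sh): "\|" becomes a literal "|" argument for echo.
set chan [open "|sh -c {echo a \\| b}" r]
set viaQuotes [gets $chan]
close $chan

# Building the pipeline with [list] removes layer 1, so a single
# backslash is enough inside the braces.
set chan [open |[list sh -c {echo a \| b}] r]
set viaList [gets $chan]
close $chan

puts [list $viaQuotes $viaList]
```

With a single backslash inside double quotes, Tcl itself consumes the backslash and sh sees a bare "|", which it treats as a pipe between "echo a" and "b".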

Andreas Leitgeb

Dec 7, 2006, 11:16:12 AM
Cameron Laird <cla...@lairds.us> wrote:
> In article <1165188618.8...@j44g2000cwa.googlegroups.com>,
> Alexandre Ferrieux <alexandre...@gmail.com> wrote:
>>I'm an old Tcler and this one looks like an FAQ, but... Who knows ?
>>When starting a child with [open "|cmd args" r], it seems impossible to
>>properly pass a single "|" character as a standalone argument of the
>>child. Is it, or did I miss a secret escape out of this little piece of
>>quoting hell ?

> Yes--at least, as far as I know. More certain is that these
> exec-and-open problems are, in many ways, Tcl's most undeniable
> blemish.

It's not only with "|", but also with "<", ">", "2>",...

I once developed a patch (specifically for exec) which
would have made the problem workaround-able, but it had
two bad sides: first, it only applied to "exec" (by introducing
a new option), not to "open |"; second, the option wasn't
exactly self-explanatory. If you don't mind shipping some
platform-specific native program with your tcl-scripts,
there is a trick at the end of this posting.

I mention it here in the hope that, based on your current
problem, you might see yourself able to improve it. :-)

There would have been a new option "-quoted" to exec, which
caused the following behaviour:
each argument is checked to see whether it begins (no enclosing!)
with a quote-char ('), and for each that does, *at most one* such
quote-char is stripped off (*after* pipes and redirections
have been parsed, and before calling out to the external
program).

What does that mean? You could write code like this:

exec -quoted myprog '| '' | myotherprog '<hello

where "myprog" would see "|" and "'" as its args, and
"myotherprog" would see "<hello" as its arg. The non-quoted
pipe would still be taken as a pipe. "myprog" and
"myotherprog" being literal constants without magic chars,
obviously don't really need a "'" before them, but could
just as well have been written as "'myprog".

On one hand I don't like it, since it's hideously ugly,
but on the other hand, it would at least make it possible
to pass through arbitrary strings to external programs.
The ugliness could still be hidden through wrappers.

exec -quoted prog '$argument
would be safe (as long as "prog" is safe). Only for {*}-expanded
stuff, some extra caution would still be needed.
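
The stripping rule described above can be sketched in plain Tcl. The proc name stripOneQuote is hypothetical and this is not the patch itself: it only shows what would happen to each word, while the real step would have to run inside exec, after pipes and redirections have been parsed.

```tcl
# Hypothetical helper: mimics the proposed stripping rule only.
# It removes at most one leading "'" from each word and leaves
# everything else untouched.
proc stripOneQuote {words} {
    set result {}
    foreach word $words {
        if {[string index $word 0] eq "'"} {
            set word [string range $word 1 end]
        }
        lappend result $word
    }
    return $result
}

# Command and arguments of "myprog" from the example above: '| and ''
puts [stripOneQuote [list myprog '| '']]
# A quoted command name plus the argument of "myotherprog": '<hello
puts [stripOneQuote [list 'myotherprog '<hello]]
```

Because only one quote-char is removed, a word that really has to start with "'" is written with two of them, as in the '' argument above.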


Now, if patching tcl is not an option, there is another trick:
You write a platform-specific program, that does just this:
from every argument it sees, strip at most one "'"-char, then
pass the rest to system's exec()-function. (Obviously, this
external program cannot be itself a tcl-script!) Let's call that
program "unquote":

exec unquote myprog '| '' | unquote myotherprog '<hello

voila!
I think this is still clearer than entering system-shell-quoting-hell.

PS: the patch was made for 8.4.3, and I doubt that it cleanly
applies with current versions, but it probably could be
changed to apply with not too much effort...
Here: <http://www.logic.at/people/avl/stuff/tcl-exec-patch>

Alexandre Ferrieux

Dec 7, 2006, 5:32:54 PM

On Dec 7, 5:16 pm, Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at>
wrote:

> I once developed a patch (specifically for exec) which
> would have made the problem workaround-able, but it had
> two bad sides: first, it only applied to "exec" (by introducing
> a new option), not to "open |"; second, the option wasn't
> exactly self-explanatory.

May I propose something a bit more "tclish" ?

Idea: a "-deep" flag to [exec], which means that each
command-and-args-list must be passed as a single list argument of exec.
This way, the toplevel args of exec are only:
- lists (command and args)
- redirections and pipes
Hence there's no room for ambiguity about the pipe/redir characters,
because if they appear as arguments of one of the commands, then they
are no longer at toplevel (they are in a sublist).

Examples:

exec a b c | d e f

can be also written

exec -deep {a b c} | {d e f}

And your example becomes

exec -deep {myprog | '} | {myotherprog <hello}

Isn't this more readable ?
Also, it can be extended to [open "|..."], for example by using a
leading double pipe:

open "||{a b c} | {d e f}"

corresponds to

exec -deep {a b c} | {d e f}

Reactions ?

-Alex
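
For experimenting with the sublist syntax today, it can be emulated on top of the sh workaround. The following is only a sketch under stated assumptions: a POSIX sh on the PATH, hypothetical proc names shQuote and deepExec, and support for the "|" directive alone (redirections are left out). It is not how a real -deep flag would be implemented inside exec.

```tcl
# Hypothetical emulation of the proposed "exec -deep" on top of sh.
# Assumes a POSIX sh on PATH. Only the "|" directive is handled;
# redirections are left out to keep the sketch short.

# Quote one word for sh: wrap it in single quotes and rewrite each
# embedded ' as '\'' so that sh passes the word through untouched.
proc shQuote {word} {
    return '[string map [list ' {'\''}] $word]'
}

proc deepExec {args} {
    set parts {}
    foreach item $args {
        if {$item eq "|"} {
            # A top-level "|" is a pipe between two commands.
            lappend parts |
        } else {
            # Anything else is a list: command name plus arguments.
            foreach word $item {
                lappend parts [shQuote $word]
            }
        }
    }
    return [exec sh -c [join $parts " "]]
}

puts [deepExec {echo a | b}]
puts [deepExec {echo a | b} | {tr a-z A-Z}]
```

In the first call the "|" sits inside a sublist, so echo receives it as an ordinary argument. In the second call the top-level "|" is a real pipe into tr.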

Fredderic

Dec 8, 2006, 3:16:20 AM
On 7 Dec 2006 14:32:54 -0800,
"Alexandre Ferrieux" <alexandre...@gmail.com> wrote:

> Idea: a "-deep" flag to [exec], which means that each
> command-and-args-list must be passed as a single list argument of
> exec. This way, the toplevel args of exec are only:
> - lists (command and args)
> - redirections and pipes
> Hence there's no room for ambiguity about the pipe/redir characters,
> because if they appear as arguments of one of the commands, then they
> are no longer at toplevel (they are in a sublist).

> exec -deep {myprog | '} | {myotherprog <hello}

Sounds like a jolly good idea to me... Though the switch "-deep", I
think, needs a better name... ;)

But having each command in its own individual list, with tokens such as
pipes and redirection separate, would make life an awful lot simpler in
many cases, and safer too in quite a few, with no need to quote
anything. (Except if the name of your program is "|", or something)


Fredderic

Alexandre Ferrieux

Dec 8, 2006, 3:54:41 PM

On Dec 8, 9:16 am, Fredderic <put_my_name_h...@optusnet.com.au> wrote:

> Sounds like a jolly good idea to me... Though the switch "-deep", I
> think, needs a better name... ;)

Oh, sure... No strong opinion here. What about :
-sub
-sublists
-nested
-ex (like Windows' WhateverFunctionEx with 15 args instead of 13 ;-)
-quoting-heaven
-safepipes
...
Any other idea ?

-Alex

Darren New

Dec 8, 2006, 6:07:14 PM

You're addressing what it fixes or how it's implemented, rather than
what it's doing.

How about -processes or some such, indicating each argument is indeed
one process instead of one argument? Or -chunked. Or "-finally!" ;-)

--
Darren New / San Diego, CA, USA (PST)
Scruffitarianism - Where T-shirt, jeans,
and a three-day beard are "Sunday Best."

Alexandre Ferrieux

Dec 9, 2006, 5:27:53 AM

On Dec 9, 12:07 am, Darren New <d...@san.rr.com> wrote:
> You're addressing what it fixes or how it's implemented, rather than
> what it's doing.

Yes, the reason being that the old syntax will still exist (I'm not
asking for a revolution). Hence the option should highlight what's
different.

> How about -processes or some such, indicating each argument is indeed
> one process instead of one argument?

See above: spawning processes is exec's job anyway, so for a newbie
reading [exec -processes] can be completely confusing.

> Or -chunked.

I prefer this one (though it's not parsecs away from -sublist ;-)

-Alex

Andreas Leitgeb

Dec 11, 2006, 6:29:27 AM
Alexandre Ferrieux <alexandre...@gmail.com> wrote:
> Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at> wrote:
>> I once developed a patch (specifically for exec) which
>> would have made the problem workaround-able, but it had
>> two bad sides: first, it only applied to "exec" (by introducing
>> a new option), not to "open |"; second, the option wasn't
>> exactly self-explanatory.

> May I propose something a bit more "tclish" ?

Actually, my strategy was to change exec as little as
possible, just to make it "safe" in principle.

Then some wrapper could be developed (under a different
name, e.g. "pipe", "process", "spawn") to make it really
user friendly.

This wrapper then could take all sorts of syntax, reorganize
them to safe exec arguments, and let exec do the OS-specific stuff.

PS: I do like the idea of using [open "||..."] to trigger
the behaviour for [open].

Fredderic

Dec 11, 2006, 10:03:53 PM
On 11 Dec 2006 11:29:27 GMT,
Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at> wrote:

> > There would have been a new option "-quoted" to exec, which
> > caused the following behaviour:
> > each argument is checked to see whether it begins (no enclosing!)
> > with a quote-char ('), and for each that does, *at most one* such
> > quote-char is stripped off (*after* pipes and redirections
> > have been parsed, and before calling out to the external
> > program).

I don't like that idea at all... It's ugly, and non-TCL as far as I
see it. It'd probably also be better to put that quote character on
ALL the non-command parts. eg. exec -quoted '| command arg '<@fileid


The idea of each command being a list would avoid the whole issue
entirely. Within a list, there is no redirection or anything else.
The first word is the command to execute, the rest are its arguments,
passed to the command verbatim. Even raw binary data could probably be
passed safely, if the OS will allow it.

In addition, it would probably also be entirely simpler than the
current exec implementation. If it's not an [exec] directive, then
it's a command. No extra parsing necessary. You can probably do away
with the intermediary pipe directives entirely also (or more to the
point, they'd become optional).

In the event that the command itself looks like a redirection or pipe
directive, a "--" option or something could be used (in listed mode) to
inform [exec] that the next argument *IS* actually a command to run.


Though personally, I'd also like to see [exec] able to return a list of
the standard channels that were passed to it, and an option to take
a matching list to bind to before running the command (and/or discrete
options to set individual channels one at a time). Any standard
channels specified as {} in the option would be filled in
automatically with newly opened pipes. This again, would alleviate the
entire issue since [exec] then wouldn't need to parse its arguments at
all. Each and every non-option argument would be an argument to be
passed directly to the command being [exec]d.

Something along the (vague) lines of:
foreach {in out} [exec -fids -in {} -- command here] {break}
puts $in "Some input..."
set output [exec -in $out -- /bin/second half of the pipe]


Fredderic

Andreas Leitgeb

Dec 14, 2006, 9:11:47 AM
Fredderic <put_my_n...@optusnet.com.au> wrote:
> It'd probably also be better to put that quote character on
> ALL the non-command parts. eg. exec -quoted '| command arg '<@fileid

This utterly fails, because it still restricts the range of
external commands' arguments. (they wouldn't be allowed
to start with a single-quote then, or they'd possibly be
mistaken for meta-arguments)

Currently, there are a couple of moderately well recognisable
(but not escape-able) arguments, that are magic to exec.
Quoting the magic ones just moves the reserved space, but
doesn't remove it.

> The idea of each command being a list ...
Is just toooo strong a change for old achy-breaky exec :-)

> In addition, it would probably also be entirely simpler than the
> current exec implementation. If it's not an [exec] directive, then
> it's a command. No extra parsing necessary. You can probably do away
> with the intermediary pipe directives entirely also (or more to the
> point, they'd become optional). ... [+ return fd's]

I'm not against this new syntax in principle, I just think
it doesn't fit into exec. Not even with explicit options.
Otoh, I could imagine a new command with a clean syntax,
and exec being the compatibility-wrapper-function.

While we could compete each other thinking up gobs of
"nice to haves" for new-exec syntax, I think this just leads
to nothing being done.

Otoh, if we close just the fundamental gap in exec as concisely
as possible, this has a better chance of actually being done.
Then you have all the time to come up with a really handy API
without bothering about the real low-level (OS-specific) issues
of exec, by just building on a gap-filled, stable exec.


Fredderic

Dec 14, 2006, 11:08:20 AM
On 14 Dec 2006 14:11:47 GMT,
Andreas Leitgeb <a...@gamma.logic.tuwien.ac.at> wrote:

> > In addition, it would probably also be entirely simpler than the
> > current exec implementation. If it's not an [exec] directive, then
> > it's a command. No extra parsing necessary. You can probably do
> > away with the intermediary pipe directives entirely also (or more
> > to the point, they'd become optional). ... [+ return fd's]
> I'm not against this new syntax in principle, I just think
> it doesn't fit into exec. Not even with explicit options.

What can be a more TCLish improvement to the existing [exec] than
options to set up the program's descriptors, and each program with its
arguments specified in individual lists? It is certainly more TCLish than
the current shell-lookalike syntax, plus it gets rid of any [eval] type
behaviour with respect to the program and its arguments since they're
encapsulated within a list. The only remaining issue is the case of a
program that looks like an [exec] redirection/piping directive and has
no arguments.

Of course, the simplest is to simply support escaping the redirection
directives as I'd suggested, but with a slightly more thought-out
escape character. Though as pointed out, that's not going to be a whole
lot better, either, because whatever character you pick as the
escape, also has to somehow deal with the case of it NOT being an escape
character (double it up?). And now we're getting into some very
NON-TCL realms, indeed.

The command-in-a-list idea IS the best and simplest way to go I've seen
so far in this thread.


> Otoh, I could imagine a new command with a clean syntax,
> and exec being the compatibility-wrapper-function.

As for a new command with a clean syntax, the only way you're going to
get an exec with clean syntax, is by moving the bells and whistles of
[exec] into options before the program and its arguments, or even
removing most of them from the new [exec] command entirely.

1) Redirection by discrete options (or maybe a dict); would allow you
to attach the program's stdin to an already-opened channel, or join the
program's stderr to stdout. Dump the capability to return the program's
output, or feed it the contents of a variable as input, and always (or
at least an option to) run programs as background tasks.

2) A means to create a pipe that's linked to a variable or procedure.
(Possibly an extension to the [open] syntax, with a leading "<" or ">"
similar to the "|"). Something along the vague lines of this:

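proc popen {mode arg} {
    # create-a-pipe is a placeholder for a primitive that returns
    # both ends of a new pipe.
    foreach {in ou} [create-a-pipe] {break}
    switch -exact -- $mode {
        "<" {
            fileevent $ou writable $arg
            return $in
        }
        ">" - out {
            fileevent $in readable $arg
            return $ou
        }
        "<<" {
            fconfigure $ou -blocking 0
            puts $ou $arg
            return $in
        }
        ">>" {
            # built-in-variable-appender is likewise a placeholder.
            fileevent $in readable [list built-in-variable-appender $arg]
            return $ou
        }
    }
}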

(I could well have my ins and outs and what-not confused, but you get
the idea)
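
For readers on later Tcl releases: the "create-a-pipe" step no longer has to be imagined, since Tcl 8.6 provides [chan pipe]. The following is a minimal sketch, assuming Tcl 8.6 or later and a Unix-like system with echo on the PATH; the variable names are illustrative, and it covers only the plain pipe, not the event-driven variants of the popen idea.

```tcl
# Assumes Tcl 8.6 or later (for "chan pipe" and "lassign") and a
# Unix-like system with echo on PATH.
# chan pipe returns the read side and the write side of a new pipe.
lassign [chan pipe] readSide writeSide

# Hand the write side to a background child as its stdout.
exec echo hello >@ $writeSide &

# Close our copy of the write side so that the reader sees end-of-file
# once the child has exited.
close $writeSide

set output [read $readSide]
close $readSide
puts -nonewline $output
```

Closing the parent's copy of the write side matters: as long as it stays open, [read] on the other end would never see end-of-file.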

3) Possibly restore some user-friendliness by allowing [exec]-like
redirection directives within the descriptor binding arguments
(though they'll still be a little simpler and neater).

4) Possibly providing a means to request that a descriptor be filled in
by a pipe automatically. You'd then need to make the return value a
list containing the pid of the process, and a list of the channels it
was provided. You could then use [lindex] to pluck out the one(s)
needed for chaining onto the next command.

5) A command (and/or an option) to wait for a child process to
complete. Having it as a discrete command would be more useful, but as
an option would be easier.

Have I missed anything? Though when it's all said and done, I'm not
entirely certain how much cleaner it'll end up being. Though there is
some rather handy functionality in there... Point #2 certainly has uses
outside of [exec].


Fredderic
