[Haskell-cafe] createProcess running non-existent programs

Niklas Hambüchen

Aug 12, 2012, 9:18:56 PM

to haskel...@haskell.org

I just came across the fact that running

createProcess (proc "asdfasdf" [])

with non-existing command "asdfasdf" returns perfectly fine handles.
I would expect an exception.
You can even hGetContents on stdout: You just get "".

I find this highly counter-intuitive. Is this intended?

Thanks
Niklas

PS: I checked how some other programming languages do this:

Python:
import subprocess; subprocess.call("asdfasdf", shell=False)

OSError: [Errno 2] No such file or directory

Ruby:
exec("asdfasdf")

Errno::ENOENT: No such file or directory - asdfasdf

_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Evan Laforge

Aug 13, 2012, 2:18:30 AM

to Niklas Hambüchen, haskel...@haskell.org

On Sun, Aug 12, 2012 at 6:18 PM, Niklas Hambüchen <ma...@nh2.me> wrote:
> I just came across the fact that running
>
> createProcess (proc "asdfasdf" [])
>
> with non-existing command "asdfasdf" returns perfectly fine handles.
> I would expect an exception.
> You can even hGetContents on stdout: You just get "".
>
> I find this highly counter-intuitive. Is this intended?

Yes, I ran into the same thing a while back. The problem is that the
subprocess has already been forked off before it runs exec() and finds
out the file doesn't exist. The reason python reports the right error
is that it sets up a pipe from child to parent to communicate just
this error. It's more friendly, but on the other hand the
implementation is more complicated.

If you don't want to hack up the whole send-exception-over-the-pipe
thing, the easiest thing to do is to wait for the process's return
code. If you don't want to do that, you can at least have the
subprocess log, e.g.:

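    import qualified Control.Concurrent as Concurrent
    import qualified System.Exit as Exit
    import qualified System.IO as IO
    import qualified System.Process as Process

    loggedProcess :: Process.CreateProcess -> IO (Maybe IO.Handle,
        Maybe IO.Handle, Maybe IO.Handle, Process.ProcessHandle)
    loggedProcess create = do
        r@(_, _, _, pid) <- Process.createProcess create
        -- Wait in a separate thread so the caller is not blocked.
        Concurrent.forkIO $ do
            code <- Process.waitForProcess pid
            case code of
                Exit.ExitFailure c -> notice $
                    "subprocess " ++ show (binaryOf create) ++ " failed: "
                    ++ if c == 127 then "binary not found" else show c
                _ -> return ()
        return r
        where
        binaryOf create = case Process.cmdspec create of
            Process.RawCommand fn _ -> fn
            Process.ShellCommand cmd -> fst $ break (==' ') cmd
        -- 'notice' is my own logging function; stderr stands in for it here.
        notice = IO.hPutStrLn IO.stderr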

As an aside, I've had the idea to at some point go look at the latest
python's version of the subprocess module and see about porting it to
haskell, or at least make sure the haskell version doesn't suffer from
problems fixed in the python one. They went through a lot of
iterations trying to get it right (and earlier python versions are
broken in one way or another) and we might as well build on their
work.
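
For readers who would rather block than log from a background thread, the "wait for the return code" idea above can be sketched synchronously. This is only a sketch under assumptions: runChecked is a hypothetical helper (not part of System.Process), exit code 127 is taken to mean "exec failed" as in the logging example above, and the IOException branch is there because some versions of the process library may report the failure from createProcess itself.

```haskell
import Control.Exception (IOException, try)
import System.Exit (ExitCode (..))
import System.Process (createProcess, proc, waitForProcess)

-- Hypothetical helper, not part of System.Process.
-- Left: the program could not be started; Right: its exit code.
runChecked :: FilePath -> [String] -> IO (Either String ExitCode)
runChecked cmd args = do
    started <- try (createProcess (proc cmd args))
    case started of
        -- Some library versions may raise the error right here.
        Left e -> return (Left (show (e :: IOException)))
        Right (_, _, _, ph) -> do
            code <- waitForProcess ph
            return $ case code of
                -- Assumption: 127 means exec() failed in the child.
                ExitFailure 127 -> Left (cmd ++ ": binary not found")
                _ -> Right code
```

Something like `runChecked "asdfasdf" [] >>= print` should then yield a Left value either way. The obvious limitation is that a program which legitimately exits with 127 is indistinguishable from a missing binary.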

Andrew Cowie

Aug 13, 2012, 4:16:50 AM

to haskel...@haskell.org

On Sun, 2012-08-12 at 23:18 -0700, Evan Laforge wrote:
> Yes, I ran into the same thing a while back. The problem is that the
> subprocess has already been forked off before it runs exec() and finds
> out the file doesn't exist.

Given how astonishingly common it is to pass an invalid executable name
and/or path, wouldn't it be worth doing a quick probe to see if the file
exists before createProcess actually forks?

[It's not like the effort the OS is going to do for the stat is going to
be thrown away; whether that call pulls it up off of disk or the one
after the fork that exec will do doesn't matter]

AfC
Sydney

David Feuer

Aug 13, 2012, 5:23:05 AM

to Andrew Cowie, haskel...@haskell.org

In Unix, at least, "check, then act" is generally considered unwise:
1. Something can go wrong between checking and acting.
2. You might not be checking the right thing(s). In this case, the fact that the file exists is not useful if you don't have permission to execute it. You may not be able to determine whether you have the appropriate permissions without fairly deep manipulation of ACLs.
3. Even if the OS has the info at hand, making the system call(s) necessary to get it is not free. Various non-free things happen every time control passes between user-space and kernel-space.
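
To make the objection concrete, here is roughly what the proposed probe would look like, as a sketch only. probeThenSpawn is a hypothetical name; findExecutable comes from the directory package and is assumed to be given a bare command name to look up on the search path. The comment marks the window that point 1 is about.

```haskell
import System.Directory (findExecutable)
import System.Process (ProcessHandle, createProcess, proc)

-- Hypothetical helper illustrating "check, then act".
probeThenSpawn :: String -> [String] -> IO (Either String ProcessHandle)
probeThenSpawn cmd args = do
    found <- findExecutable cmd
    case found of
        Nothing -> return (Left (cmd ++ ": not found on the search path"))
        Just _ -> do
            -- Race window: the file can be removed, replaced, or have its
            -- permissions changed between the probe and the exec().
            (_, _, _, ph) <- createProcess (proc cmd args)
            return (Right ph)
```

The probe catches the common typo case, but a Right result still does not guarantee that the exec in the child succeeded.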

Alexander Kjeldaas

Aug 13, 2012, 7:26:28 AM

to David Feuer, haskel...@haskell.org

This isn't that hard - a pipe shouldn't be needed anymore. Just require a post-2003 glibc.

fexecve is a system call in most BSDs. It is also implemented in glibc using a /proc hack.

http://www.kernel.org/doc/man-pages/online/pages/man3/fexecve.3.html

Apparently, there are proposals/RFCs to get a system call named execveat into the Linux kernel, which would make this work properly without /proc.

http://www.gossamer-threads.com/lists/linux/kernel/1574831

Alexander

Donn Cave

Aug 13, 2012, 10:23:42 AM

to haskel...@haskell.org

Quoth Evan Laforge <qdu...@gmail.com>,
...
> ... or at least make sure the haskell version doesn't suffer from
> problems fixed in the python one.

Exactly. This morning I'm reading suggested solutions that would
work only some of the time, or on only some platforms, which wouldn't
be satisfactory in the long run.

Though speaking of platforms, I guess one large headache would be
what to do about Microsoft operating systems. Given the unusual
nature of these functions (I mean, what operating-system-independent
command are you going to invoke, anyway?), maybe it would be OK for
the more elaborate support functions to be POSIX / Windows specific.
At the level where people are redirecting the output FD and not the
error FD, etc.

Donn

Brandon Allbery

Aug 13, 2012, 10:28:50 AM

to Alexander Kjeldaas, David Feuer, haskel...@haskell.org

On Mon, Aug 13, 2012 at 7:26 AM, Alexander Kjeldaas <alexander...@gmail.com> wrote:

> This isn't that hard - a pipe shouldn't be needed anymore. Just require a post-2003 glibc.

So, we are desupporting the *BSDs and OS X (and Solaris etc.) now? glibc is only used on Linux and the Hurd (and debian kfreebsd, if that hasn't fallen on its face yet).

POSIX has some new spawn-type calls, btw, but I don't know how widely implemented they are or how buggy they are.

--
brandon s allbery allb...@gmail.com
wandering unix systems administrator (available) (412) 475-9364 vm/sms

Brandon Allbery

Aug 13, 2012, 10:29:58 AM

to Donn Cave, haskel...@haskell.org

On Mon, Aug 13, 2012 at 10:23 AM, Donn Cave <do...@avvanta.com> wrote:

> Though speaking of platforms, I guess one large headache would be
> what to do about Microsoft operating systems. Given the unusual

Microsoft provides APIs that work as is for this, by my understanding; it's the POSIX fork/exec model that makes life difficult.

Donn Cave

Aug 13, 2012, 11:14:33 AM

to haskel...@haskell.org

Quoth Brandon Allbery <allb...@gmail.com>,

> On Mon, Aug 13, 2012 at 10:23 AM, Donn Cave <do...@avvanta.com> wrote:
>
>> Though speaking of platforms, I guess one large headache would be
>> what to do about Microsoft operating systems. Given the unusual
>>
>
> Microsoft provides APIs that work as is for this, by my understanding; it's
> the POSIX fork/exec model that makes life difficult.

Or interesting, anyway. I wasn't thinking of the `exception in child'
problem here, so much as more generally, how much is a fully cross-platform
API worth, in a situation where the eventual application of the API is
inherently unlikely to be of a cross platform nature. The Python version
goes to some length, but can't fully resolve the inconsistencies. That's
OK if someone wants to go to the trouble, but if I'm right about inherent
platform dependence, it runs the risk of being more irritating than helpful!

Richard O'Keefe

Aug 13, 2012, 5:49:52 PM

to Alexander Kjeldaas, David Feuer, haskel...@haskell.org

On 13/08/2012, at 11:26 PM, Alexander Kjeldaas wrote:

>
> This isn't that hard - a pipe shouldn't be needed anymore. Just require a post-2003 glibc.
>
> fexecve is a system call in most BSDs. It is also implemented in glibc using a /proc hack.

fexecve is now in the Single Unix Specification, based on
POSIX as of 2008, I believe. However,
http://www.gnu.org/software/gnulib/manual/html_node/fexecve.html
says
Portability problems not fixed by Gnulib:
* This function is missing on many non-glibc platforms: MacOS X 10.5, FreeBSD 6.0,
NetBSD 5.0, OpenBSD 3.8, Minix 3.1.8, AIX 5.1, HP-UX 11, IRIX 6.5, OSF/1 5.1,
Solaris 11 2010-11, Cygwin 1.5.x, mingw, MSVC 9, Interix 3.5, BeOS.

That warning doesn't seem to be fully up to date. I'm using MacOS X 10.6.8
and fexecve() isn't in the manuals or in <unistd.h>.

Alexander Kjeldaas

Aug 14, 2012, 4:42:57 AM

to Richard O'Keefe, David Feuer, haskel...@haskell.org

On 13 August 2012 23:49, Richard O'Keefe <o...@cs.otago.ac.nz> wrote:

> On 13/08/2012, at 11:26 PM, Alexander Kjeldaas wrote:
>
> > This isn't that hard - a pipe shouldn't be needed anymore. Just require a post-2003 glibc.
> >
> > fexecve is a system call in most BSDs. It is also implemented in glibc using a /proc hack.
>
> fexecve is now in the Single Unix Specification, based on
> POSIX as of 2008, I believe. However,
> http://www.gnu.org/software/gnulib/manual/html_node/fexecve.html
> says
> Portability problems not fixed by Gnulib:
> * This function is missing on many non-glibc platforms: MacOS X 10.5, FreeBSD 6.0,
> NetBSD 5.0, OpenBSD 3.8, Minix 3.1.8, AIX 5.1, HP-UX 11, IRIX 6.5, OSF/1 5.1,
> Solaris 11 2010-11, Cygwin 1.5.x, mingw, MSVC 9, Interix 3.5, BeOS.
>
> That warning doesn't seem to be fully up to date. I'm using MacOS X 10.6.8
> and fexecve() isn't in the manuals or in <unistd.h>.

FreeBSD 8.0 is covered.

OpenBSD not covered

OS X not covered

http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/execve.2.html

Solaris probably not covered.

So support is pretty good, I'd say. For non-modern systems, checking the existence of the file first is possible. The race isn't important, and one can always upgrade to a modern operating system.

Alexander

Niklas Larsson

Aug 14, 2012, 11:22:28 AM

to Alexander Kjeldaas, David Feuer, haskel...@haskell.org

2012/8/14 Alexander Kjeldaas <alexander...@gmail.com>:

The check would be unreliable, the file's existence doesn't imply that
it's executable. Furthermore it would add unnecessary overhead,
createProcess can be run thousands of times in a program and should be
lean and mean.

Niklas

> Alexander

Alexander Kjeldaas

Aug 14, 2012, 5:09:51 PM

to Niklas Larsson, David Feuer, haskel...@haskell.org

See access(2)

> Furthermore it would add unnecessary overhead,
> createProcess can be run thousands of times in a program and should be
> lean and mean.

Just to keep the bikeshedding going, I'm going to state as a fact that running a performance-sensitive *server* workload on any Unix other than Linux is purely of theoretical interest. No sane person would do it. Therefore, as far as overhead is concerned, Linux performance is the only measure that matters.

But even given the above, the overhead we're talking about is minuscule. A program like '/bin/echo -n ''' which does exactly *nothing*, requires 35(!) system calls to do its job :-).

A more complex program like 'id' requires 250 system calls!

Also, to see just how minuscule this is, the dynamic linker, ld-linux.so does a few extra access(2) system calls *to the same file*, /etc/ld.so.hwcaps, on startup of every dynamically linked executable. 2 in the 'echo' case, and 8 in the 'id' case above. Even the glibc folks haven't bothered to optimize those syscalls away.

Alexander

Donn Cave

Aug 15, 2012, 1:25:41 AM

to haskel...@haskell.org

Quoth Alexander Kjeldaas <alexander...@gmail.com>,

> See access(2)

... a classic "code smell" in UNIX programming, for the same reasons.

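We can solve this problem in an efficient way that works well, and equally
well, on any POSIX platform that supports FD_CLOEXEC on pipes, and I can't
think of anything that doesn't. The appended code is the basic gist of it.
This was not invented by the Python world, but as suggested it's one of
the things that we'd get from a review of their subprocess module.

Donn

    -- Gist only: pipe, fork, close, readFd, writeFd and execve stand for the
    -- corresponding POSIX calls; errToChr / chrToErrType (not shown) encode
    -- the error type as a single leading Char.
    spawn file cmd env = do
        (e0, e1) <- pipe
        -- The write end closes by itself if the exec succeeds.
        fcntlSetFlag e1 FD_CLOEXEC
        t <- fork (fex e0 e1)
        close e1
        -- EOF with no data means the exec went through.
        rx <- readFd e0 256
        if null rx
            then return t
            else ioerr (chrToErrType rx) file
        where
        fex e0 e1 = do
            close e0
            -- Only reached past execve if it failed: report to the parent.
            catch (execve file cmd env)
                (\ e -> writeFd e1 (errToChr e : ioeGetErrorString e))
        ioerr (e, s) file = ioError (mkIOError e s Nothing (Just file))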
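
For anyone who wants to experiment with this, the same close-on-exec pipe technique can be written against the unix package roughly as follows. Treat it as a sketch under assumptions rather than a finished replacement for createProcess: spawnChecked is a hypothetical name, the error is passed back as plain text via show instead of the one-character encoding in the gist, the environment is inherited, and 127 is just the exit status chosen here for the failed child.

```haskell
import Control.Exception (IOException, try)
import System.Exit (ExitCode (..))
import System.IO (hClose, hGetContents, hPutStr)
import System.Posix.IO
    (FdOption (CloseOnExec), closeFd, createPipe, fdToHandle, setFdOption)
import System.Posix.Process
    (executeFile, exitImmediately, forkProcess, getProcessStatus)
import System.Posix.Types (ProcessID)

-- Hypothetical helper: Left carries the exec error text, Right the child pid.
spawnChecked :: FilePath -> [String] -> IO (Either String ProcessID)
spawnChecked file args = do
    (readEnd, writeEnd) <- createPipe
    -- A successful exec closes the write end, so the parent sees EOF.
    setFdOption writeEnd CloseOnExec True
    pid <- forkProcess $ do
        closeFd readEnd
        r <- try (executeFile file True args Nothing)
                 :: IO (Either IOException ())
        case r of
            Left e -> do
                -- exec failed: send the error text to the parent and quit.
                h <- fdToHandle writeEnd
                hPutStr h (show e)
                hClose h
                exitImmediately (ExitFailure 127)
            Right () -> return ()
    closeFd writeEnd
    h <- fdToHandle readEnd
    msg <- hGetContents h
    if null msg  -- blocks until the child has exec'd or reported an error
        then return (Right pid)
        else do
            -- Read the whole message, then reap the failed child.
            _ <- length msg `seq` getProcessStatus True False pid
            return (Left msg)
```

With this, `spawnChecked "asdfasdf" []` should come back as a Left with the "does not exist" error text, while a successful spawn returns the pid for the caller to wait on. The parent pays one extra pipe and one read per spawn, and no check-then-act race is involved.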

Niklas Hambüchen

Aug 30, 2012, 9:19:30 PM

to Donn Cave, haskel...@haskell.org

Well, overhead or not, it would be nice to at least have *some* solution.

Currently, it just doesn't work.

I am sure that as soon as the functionality is there, somebody will step in
to make it fast.
