[Haskell-cafe] Discussion: The CLOEXEC problem

Niklas Hambüchen

Jul 20, 2015, 9:07:29 AM

To: haskell-cafe

Hello Cafe,

I would like to point out a problem common to all programming languages,
and that Haskell hasn't addressed yet while other languages have.

It is about what happens to file descriptors when the `exec()` syscall
is used (whenever you `readProcess`, `createProcess`, `system`, use any
form of `popen()`, Shake's `cmd` etc.).

(A Markdown-rendered version containing most of this email can be found
at https://github.com/ndmitchell/shake/issues/253.)

Take the following function

    f :: IO ()
    f = do
      inSomeTemporaryDirectory $ do
        BS.writeFile "mybinary" binaryContents
        setPermissions "mybinary" (setOwnerExecutable True emptyPermissions)
        _ <- readProcess "./mybinary" [] ""
        return ()

If this is happening in parallel, e.g. using,

    forkIO f >> forkIO f >> forkIO f >> threadDelay 5000000

then on Linux the `readProcess` might often fail with the error message

mybinary: Text file busy

This error means "Cannot execute the program 'mybinary' because it is
open for writing by some process".

How can this happen, given that we're writing all `mybinary` files in
completely separate temporary directories, and given that `BS.writeFile`
guarantees to close the file handle / file descriptor (`Fd`) before it
returns?

The answer is that by default, child processes on Unix (`fork()+exec()`)
inherit all open file descriptors of the parent process. An ordering
that leads to the problematic case could be:

* Thread 1 writes its file completely (opens and closes an Fd 1)
* Thread 2 starts writing its file (Fd 2 open for writing)
* Thread 1 executes "myBinary" (which calls `fork()` and `exec()`). Fd 2
is inherited by the child process
* Thread 2 finishes writing (closes its Fd 2)
* Thread 2 executes "myBinary", which fails with `Text file busy`
because an Fd is still open to it in the child of Process 1

The scope of this problem is quite general unfortunately: It will happen
for any program that uses parallel threads, and that runs two or more
external processes at some time. It cannot be fixed by the part that
starts the external process (e.g. you can't write a reliable
`readProcess` function that doesn't have this problem, since the problem
is rooted in the Fds, and there is no version of `exec()` that doesn't
inherit parent Fds).

This problem is a general problem in C on Unix, and was discovered quite
late.

Naive solutions to this use `fcntl`, e.g. `fcntl(fd, F_SETFD, FD_CLOEXEC)`:

http://stackoverflow.com/questions/6125068/what-does-the-fd-cloexec-fcntl-flag-do

which is the equivalent of Haskell's `setFdOption`, to set the `CLOEXEC`
flag on all Fds before `exec()`ing. Fds with this flag are not inherited
by `exec()`ed child processes. However, these solutions are racy in
multi-threaded programs (such as typical Haskell programs), where a
`fork()` + `exec()` made by some thread can fall just in between the
`int fd = open(...)` and the `fcntl(fd, F_SETFD, FD_CLOEXEC)` of some
other thread.
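
For illustration, here is a minimal sketch of this after-the-fact approach in Haskell, assuming a POSIX system and the `unix` package. The names `markCloseOnExec` and `pipeCloseOnExecDemo` are made up for this example and do not come from the gist or any library. The sketch has exactly the race described above: a fork in another thread between creating the Fd and setting the flag still leaks it.

```haskell
import Control.Exception (bracket)
import System.Posix.IO
  ( FdOption (CloseOnExec), closeFd, createPipe, queryFdOption, setFdOption )
import System.Posix.Types (Fd)

-- Hypothetical helper: the equivalent of fcntl(fd, F_SETFD, FD_CLOEXEC).
-- It is NOT atomic with respect to the call that created the Fd.
markCloseOnExec :: Fd -> IO ()
markCloseOnExec fd = setFdOption fd CloseOnExec True

-- Hypothetical demo: create a pipe, mark both ends, and report the
-- close-on-exec flag of the (read end, write end).
pipeCloseOnExecDemo :: IO (Bool, Bool)
pipeCloseOnExecDemo =
  bracket createPipe (\(r, w) -> closeFd r >> closeFd w) $ \(r, w) -> do
    -- Race window: a fork() in another thread right here would still
    -- hand both Fds to the child.
    markCloseOnExec r
    markCloseOnExec w
    rFlag <- queryFdOption r CloseOnExec
    wFlag <- queryFdOption w CloseOnExec
    pure (rFlag, wFlag)
```
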

For this reason, the `O_CLOEXEC` flag was added in Linux 2.6.23, see
e.g. `man 2 open`

http://man7.org/linux/man-pages/man2/open.2.html

to the `open()` syscall to atomically open a file and set the Fd to
CLOEXEC in a single step.

This flag is not the default in Haskell - but maybe it should be. Other
languages set it by default, for example Python. See

PEP-433: https://www.python.org/dev/peps/pep-0433/
and the newer
PEP-446: https://www.python.org/dev/peps/pep-0446/

for a very good description of the situation.

Python >= 3.2 closes open Fds *after* the `fork()`, before the `exec()`,
when a child is started with its `subprocess` module.
Python 3.4 uses O_CLOEXEC by default on all Fds opened by Python.

It is also noted that "The programming languages Go, Perl and Ruby make
newly created file descriptors non-inheritable by default: since Go 1.0
(2009), Perl 1.0 (1987) and Ruby 2.0 (2013)":

https://www.python.org/dev/peps/pep-0446/#related-work

A work-around for Haskell is to use `O_CLOEXEC` explicitly, as in this
example module `System/Posix/IO/ExecSafe.hsc`:

https://gist.github.com/nh2/4932ecf5ca919659ae51

Then we can implement a safe version of `BS.writeFile`:

https://gist.github.com/nh2/4932ecf5ca919659ae51

Using this form of `writeFileExecSafe` helps in cases when your program
is very small, when you can change all the code and you don't use any
libraries that open files. However, this is a very rare case, and not a
real solution.

All multi-threaded Haskell programs that write and execute files will
inherently trigger the `Text file busy` problem.

We need to discuss what to do about this.

Let us run this discussion on haskell-cafe and move to the libraries@
mailing list once we've got some ideas and opinions.

My personal stance is that we should follow Python's example, and all
functions in our standard libraries should open files with the O_CLOEXEC
flag set.

Niklas
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe

Erik Hesselink

Jul 20, 2015, 9:32:07 AM

To: Niklas Hambüchen, haskell-cafe

I've run into this, but with sockets instead of files. For example, if
you run a kind of launcher that spawns processes with a double fork,
and it listens on its own socket, restarting it will fail to rebind
the socket, since the spawned processes inherited it. We set
FD_CLOEXEC on the socket now, but, at least on Linux, you could pass
SOCK_CLOEXEC to 'socket' in a similar way as with 'open'. Mac support
is trickier: it does seem to support the flag on 'open', but not on
'socket', as far as I can tell. I have no idea if this discussion
applies to Windows at all.

Personally I agree with you that we should probably set this by
default, and expose a flag to change it.

Erik

Mike Meyer

Jul 20, 2015, 9:57:31 AM

To: Niklas Hambüchen, haskell-cafe

This is just one example of the general problem of fork()'ing with threads holding a lock. Given that locks come in different flavors and have different uses, they need individual handling. The best general solution I know of is "don't do that". One of the things I like about Haskell is that the type system gives me hints when I'm trying to do that.

That said, there are a number of issues raised by leaving FD's open across an exec, and that's rarely the right thing to do. So making it the default behavior is probably a good idea.

Donn Cave

Jul 21, 2015, 10:48:16 PM

To: haskell-cafe

quoth Niklas Hambüchen,
...

> The scope of this problem is quite general unfortunately: It will happen
> for any program that uses parallel threads, and that runs two or more
> external processes at some time. It cannot be fixed by the part that
> starts the external process (e.g. you can't write a reliable
> `readProcess` function that doesn't have this problem, since the problem
> is rooted in the Fds, and there is no version of `exec()` that doesn't
> inherit parent Fds).
>
> This problem is a general problem in C on Unix, and was discovered quite
> late.

I believe it has actually been a familiar issue for decades. I don't
have any code handy to check, but I'm pretty sure the UNIX system(3)
and popen(3) functions closed extraneous file descriptors back in the
early '90s, and probably had been doing it for some time by then.

I believe this approach to the problem is supported in System.Process,
via close_fds. Implementation is a walk through open FDs, in the child
fork, closing anything not called for by the procedure's parameters
prior to the exec.
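
As a rough sketch of what using this looks like, assuming the `process` package (1.2.3.0 or later, for `readCreateProcess`): the helper name `runWithClosedFds` is hypothetical, while `close_fds` is the record field of `CreateProcess` mentioned above.

```haskell
import System.Process (CreateProcess (..), proc, readCreateProcess)

-- Hypothetical helper: run a command with close_fds enabled, so that the
-- forked child closes inherited Fds (other than stdin/stdout/stderr)
-- before the exec, and return what the command wrote to stdout.
runWithClosedFds :: FilePath -> [String] -> IO String
runWithClosedFds cmd args =
  readCreateProcess ((proc cmd args) { close_fds = True }) ""
```

For example, `runWithClosedFds "echo" ["hello"]` should yield the command's output while keeping any other Fds of the parent out of the child.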

That approach has the advantage that it applies to all file descriptors,
whether created by open(2) or by other means - socket, dup(2), etc.

I like this already implemented solution much better than adding a
new flag to "all" opens (really only those opens that occur within
the Haskell runtime, while of course not for external library FDs.)
The O_CLOEXEC proposal wouldn't be the worst or most gratuitous
way Haskell tampers with normal UNIX parameters, but any time you
do that, you set up the conditions for breaking something that
works in C, which I hate to see happen with Haskell.

Donn

Niklas Hambüchen

Jul 22, 2015, 8:33:30 AM

To: Donn Cave, haskell-cafe

Hello Donn,

Python has a detailed discussion of this suggestion:

*
https://www.python.org/dev/peps/pep-0433/#close-file-descriptors-after-fork
*
https://www.python.org/dev/peps/pep-0446/#closing-all-open-file-descriptors

It highlights some problems with this approach, most notably Windows
problems, not solving the problem when you exec() without fork(), and
looping up to MAXFD being slow. The latter is what the current Haskell
`runInteractiveProcess` code
(http://hackage.haskell.org/package/process-1.2.3.0/src/cbits/runProcess.c)
seems to be doing; Python improved upon this by not looping up to MAXFD,
but instead looking up the open FDs in /proc/<PID>/fd/, after people
complained about this loop of close() syscalls being very slow when many
FDs were open.
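
To make the /proc idea concrete, here is a small Linux-only sketch in Haskell, assuming the `directory` package and a mounted /proc. `listOpenFds` is a hypothetical name; it only lists the Fds and is not how the `process` package or Python actually implement their closing step.

```haskell
import Data.List (sort)
import Data.Maybe (mapMaybe)
import System.Directory (getDirectoryContents)
import Text.Read (readMaybe)

-- Hypothetical helper (Linux-specific): list the Fd numbers open in this
-- process by reading /proc/self/fd, instead of probing every number up
-- to the Fd limit. The result may include the (by then closed) Fd that
-- was used to read the directory itself.
listOpenFds :: IO [Int]
listOpenFds = do
  entries <- getDirectoryContents "/proc/self/fd"
  -- "." and ".." do not parse as numbers and are dropped.
  pure (sort (mapMaybe readMaybe entries))
```
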

> do that, you set up the conditions for breaking something that
> works in C, which I hate to see happen with Haskell.

While I understand your opinion here, I'm not sure that "breaking
something that works in C" is the right description. O_CLOEXEC changes a
default setting, but does not irrevocably disable any feature that is
available in C. The difference is that you'd have to say which FDs you
want to keep in the child - which to my knowledge is OK, since it is a
much more common thing to work with *some* designated FDs in the child
process than with all of them.

To elaborate a bit, if you wanted to write a program where a child
process would access the parent's Fds, you would in most cases already
have those Fds in some Haskell variables you're working with. In that
case, it is easy to `setFdOption fd CloseOnExec False` on those if
CLOEXEC is the default, and everybody is happy.
If CLOEXEC is not the default, then you'd get a problem with all those
Fds on which you do *not* have a grip in your program, and it's much
harder to fix problems with resources that sit invisibly in the
background than with those that you have in variables that you use.

In other words, CLOEXEC is something that is easy to *undo* locally when
you don't want it, but hard to *do* globally when you need it.
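
A minimal sketch of such a local "undo", assuming the `unix` package: `withInheritableFd` and `demoWithInheritableFd` are hypothetical names for this example. Note that while the flag is cleared, a fork made by any other thread would inherit the Fd as well.

```haskell
import Control.Exception (bracket)
import System.Posix.IO
  ( FdOption (CloseOnExec), closeFd, createPipe, queryFdOption, setFdOption )
import System.Posix.Types (Fd)

-- Hypothetical helper: make one chosen Fd inheritable while `action`
-- runs (e.g. a createProcess call), then restore its previous flag.
withInheritableFd :: Fd -> IO a -> IO a
withInheritableFd fd action =
  bracket
    (queryFdOption fd CloseOnExec <* setFdOption fd CloseOnExec False)
    (setFdOption fd CloseOnExec)
    (const action)

-- Hypothetical demo on a pipe whose read end starts out close-on-exec:
-- returns the flag as seen inside the bracket and after it.
demoWithInheritableFd :: IO (Bool, Bool)
demoWithInheritableFd =
  bracket createPipe (\(r, w) -> closeFd r >> closeFd w) $ \(r, _) -> do
    setFdOption r CloseOnExec True
    inside <- withInheritableFd r (queryFdOption r CloseOnExec)
    after <- queryFdOption r CloseOnExec
    pure (inside, after)
```
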

Let me know what you think about this.

Niklas

Donn Cave

Jul 22, 2015, 9:24:48 PM

To: haskell-cafe

quoth Niklas_Hambüchen,

>> do that, you set up the conditions for breaking something that
>> works in C, which I hate to see happen with Haskell.
>
> While I understand your opinion here, I'm not sure that "breaking
> something that works in C" is the right description. O_CLOEXEC changes a
> default setting, but does not irrevocably disable any feature that is
> available in C.

Sure, it isn't irrevocable - so what's broken may be fixed, if you
have access to it, but of course it's better not to break things
in the first place.

> In other words, CLOEXEC is something that is easy to *undo* locally when
> you don't want it, but hard to *do* globally when you need it.

Yes, of course, I understand the appeal. But it's a deep change
to the way FDs have historically worked that affects widely used
UNIX features, and it doesn't solve the problem - sockets, file
descriptors created by external libraries or inherited from the
parent process, child processes that don't exec - so if you want
to relieve a child process of all extraneous open files, you still
have to walk the FD table, the same way it's been done for the last
20 or 30 years. Fork-exec is the relatively unusual event where
it makes sense to deal with these issues - including other resources
besides FDs as required. Fork-exec outside of GHC should of course
continue to work as written.

Donn

Alexander Kjeldaas

Jul 23, 2015, 2:31:53 AM

To: Donn Cave, haskell-cafe

This history is from before the c10k problem and related file descriptor scaling became relevant.

Yes, we need to walk the open file descriptors by walking /proc/self/fd and using obscure APIs on OS X. No matter how you see it, it's not what it was 30 years ago.

Alexander

Donn Cave

Jul 23, 2015, 12:34:24 PM

To: haskell-cafe

Quoth David Turner <dave.c...@gmail.com>,
...
> Non-inheritable-by-default makes sense to me - if you forget to clear
> FD_CLOEXEC when you meant to then your program breaks loudly and obviously;

This is where we differ. First, it isn't fair to say we "forget"
to clear FD_CLOEXEC, if in pre-existing software we didn't have a
CLOEXEC bit to clear. Rather, in the existing set of applications
that have an exec - whether in Haskell code or not - we can expect
that some subset of them will start to behave differently. Maybe
behave better, but in any case likely a surprise to everyone because
we've reversed the default on some large subset of file descriptors.
And I think it's very optimistic to predict that the breakage will
be obvious - the obvious effects will naturally be obvious, but
I personally am not up to accounting for all non-obvious cases.

And there's the other FDs - inherited, non-Haskell, sockets - so
we still need another solution if we're serious about fixing
the problem.

Donn Cave

Jul 24, 2015, 3:23:25 PM

To: haskell-cafe

Quoth David Turner <dave.c...@gmail.com>,

> Could you be a bit more specific? Which bits of pre-existing software
> didn't have a FD_CLOEXEC bit and would be broken by this proposal?

Well, of course to be precise, the bit's always there, it's just
normally not set - that's the normal environment that anything
written up to now would expect. And of course, anything that depends
on a GHC-opened file to stay open over an exec would be broken.
I can't enumerate the software that meets that criterion.

> Since Python recently decided to go through this exact transition, their
> experience should be instructive. Do you know if there was negative fallout
> from PEP 0446?

I gave up on Python a long time ago and don't follow what goes on.
If recently means less than a decade or so, though, it's not much to
go on. If the problem addressed by the O_CLOEXEC proposal is obscure,
the problems it may create are even more so - I'll certainly concede
that - and it could take a lot of experience before those problems
would be well known enough to show up if you went looking for them.

> When thinking about FDs from outside the Haskell runtime (whether inherited
> or simply opened in an external library), can you give an example of a case
> where such a FD causes a problem if inherited and yet cannot be set as
> FD_CLOEXEC at source?

Sorry, I'm confused here. Files opened within GHC and externally have
equal potential to work as intended or to cause problems, it seems to me.
If we infer from the proposal that GHC-opened files with CLOEXEC unset
may cause a problem, then it follows that other files with CLOEXEC unset
also may cause the same problem. The proposal addresses only the former,
and not the latter, and only for normal files - while the ordinary solution,
as implemented in UNIX popen(3), deals with all - pipes, sockets, etc.

Alexander Kjeldaas

Jul 24, 2015, 6:29:44 PM

To: Niklas Hambüchen, haskell-cafe

On Mon, Jul 20, 2015 at 3:07 PM, Niklas Hambüchen <ma...@nh2.me> wrote:

I think CLOEXEC should be the default, but it doesn't seem to solve your problem. What if thread 2 executes "myBinary" before thread 1 called exec()?

Alexander

Bardur Arantsson

Jul 25, 2015, 1:24:59 PM

To: haskel...@haskell.org

On 07/24/2015 09:22 PM, Donn Cave wrote:
> Quoth David Turner <dave.c...@gmail.com>,
>> Could you be a bit more specific? Which bits of pre-existing software
>> didn't have a FD_CLOEXEC bit and would be broken by this proposal?
>
> Well, of course to be precise, the bit's always there, it's just
> normally not set - that's the normal environment that anything
> written up to now would expect. And of course, anything that depends
> on a GHC-opened file to stay open over an exec would be broken.
> I can't enumerate the software that meets that criterion.
>
>> Since Python recently decided to go through this exact transition, their
>> experience should be instructive. Do you know if there was negative fallout
>> from PEP 0446?
>
> I gave up on Python a long time ago and don't follow what goes on.
> If recently means less than a decade or so, though, it's not much to
> go on. If the problem addressed by the O_CLOEXEC proposal is obscure,
> the problems it may create are even more so - I'll certainly concede
> that - and it could take a lot of experience before those problems
> would be well known enough to show up if you went looking for them.
>

It seems to me that discovering a
"FD-was-unexpectedly-closed-before-it-was-supposed-to" problem is a lot
more likely than discovering FD leaks, no?

(Not that I'm advocating any particular solution to this -- backward
compatibility is a harsh mistress.)

Regards,

Donn Cave

Jul 25, 2015, 3:09:21 PM

To: haskel...@haskell.org

Quoth Bardur Arantsson <sp...@scientician.net>,

> On 07/24/2015 09:22 PM, Donn Cave wrote:

...

>> If recently means less than a decade or so, though, it's not much to
>> go on. If the problem addressed by the O_CLOEXEC proposal is obscure,
>> the problems it may create are even more so - I'll certainly concede
>> that - and it could take a lot of experience before those problems
>> would be well known enough to show up if you went looking for them.
>
> It seems to me that discovering a
> "FD-was-unexpectedly-closed-before-it-was-supposed-to" problem is a lot
> more likely than discovering FD leaks, no?

Maybe ... Note that if it were exactly about FD leaks, that problem
would be undiscovered yet. The reason anyone cares is that the leaked
file descriptor may go on to inconveniently hold a file open.

In what I think is the most common case, the file is a pipe, and the
open write end makes a read hang when it should complete. Pipes
aren't created by open(2) so won't be part of an O_CLOEXEC solution,
but I imagine this is where the issue is usually first encountered,
and why popen(3) closes all file descriptors.

With disk files ... off the top of my head, the most likely effect
might NOT be read/write I/O errors, because here we're talking about
passing a file descriptor value through an exec, which I think is
an unusual programming practice. It's easy enough to do, e.g. you
can format the value into a shell command like "echo whatever >&6",
but eccentric. But there are other things that could turn up.
For example, you could use flock(2) (Berkeley, not POSIX fcntl
lock) to keep an advisory file lock until the exec exits. If the
file is closed prematurely, you lose the lock, and ... whatever
happens then.

Donn

Mike Meyer

Jul 25, 2015, 3:33:02 PM

To: Donn Cave, haskel...@haskell.org

While this discussion has been about the programming errors that result from leaked file descriptors, can I point out what I think is a more important issue?

A leaked file descriptor is a potential security hole. If you want your code to be secure - and in this age of internet-based applications built by plugging things together, that should always be the case - you want bugs from not dealing with an access issue to result in a permission denied error, not someone being able to read stuff they shouldn't.

So while we can't fix all the holes related to this issue or the larger issues related to forking a threaded program, changing the default to automatically close things will result in improving the security of haskell programs.

Niklas Hambüchen

Aug 28, 2015, 3:54:24 PM

To: Alexander Kjeldaas, haskell-cafe

On 25/07/15 00:29, Alexander Kjeldaas wrote:
> I think CLOEXEC should be the default, but it doesn't seem to solve your
> problem. What if thread 2 executes "myBinary" before thread 1 called exec()?

I'm not sure I understand your scenario - does it change the problem?

Brandon Allbery

Aug 28, 2015, 4:14:45 PM

To: Alexander Kjeldaas, haskell-cafe

On Fri, Jul 24, 2015 at 6:29 PM, Alexander Kjeldaas <alexander...@gmail.com> wrote:

> I think CLOEXEC should be the default, but it doesn't seem to solve your problem. What if thread 2 executes "myBinary" before thread 1 called exec()?

I think you missed that each thread is using its own temporary directory --- they're not all running at the same time in the same directory, which would be pretty much guaranteed to fail.

--

brandon s allbery kf8nh sine nomine associates

allb...@gmail.com ball...@sinenomine.net

unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net

Alexander Kjeldaas

Aug 30, 2015, 9:19:38 AM

To: Brandon Allbery, haskell-cafe

The directory is irrelevant. fork() + exec() is not an atomic operation:

* Thread 1 writes its file completely (opens and closes an Fd 1, using O_CLOEXEC)
* Thread 2 starts writing its file (Fd 2 open for writing, using O_CLOEXEC)
* Thread 1 starts executing "myBinary" by calling fork(). Fd 2 is inherited by the child process

* Thread 2 finishes writing (closes its Fd 2)
* Thread 2 executes "myBinary", which fails with `Text file busy`
because an Fd is still open to it in the child of Process 1

* Thread 1 executes "myBinary" (calling exec()). Fd 2 is automatically closed during exec(), but it's too late.

You need the file descriptor to not be inherited by a child process, which is != from O_CLOEXEC.

Alexander

Niklas Hambüchen

Aug 30, 2015, 9:42:16 AM

To: Alexander Kjeldaas, Brandon Allbery, haskell-cafe

On 30/08/15 15:19, Alexander Kjeldaas wrote:
> The directory is irrelevant. fork() + exec() is not an atomic operation:
>
> * Thread 1 writes its file completely (opens and closes an Fd 1, using
> O_CLOEXEC)
> * Thread 2 starts writing its file (Fd 2 open for writing, using O_CLOEXEC)

> * Thread 1 starts executing "myBinary" by calling *fork()*. Fd 2 is

> inherited by the child process
> * Thread 2 finishes writing (closes its Fd 2)
> * Thread 2 executes "myBinary", which fails with `Text file busy`
> because an Fd is still open to it in the child of Process 1
> * Thread 1 executes "myBinary" (calling exec()). Fd 2 is automatically
> closed during exec(), but it's too late.
>
> You need the file descriptor to not be inherited by a child process,
> which is != from O_CLOEXEC.

You are right. This makes solving my original problem impossible, and

* writing the file
* then renaming it
* then executing it

seems to be the only way to do it safely.

Let us then move the discussion back to whether CLOEXEC should be the default or not.

Mike Meyer

Aug 30, 2015, 9:53:32 AM

To: Alexander Kjeldaas, Brandon Allbery, haskell-cafe

On Sun, Aug 30, 2015 at 9:19 AM Alexander Kjeldaas <alexander...@gmail.com> wrote:

> The directory is irrelevant. fork() + exec() is not an atomic operation:

This creates problems for all resources that act as locks. IIRC (it's been a few years since I looked through it thoroughly), it's been shown that there isn't a general fix for this. I.e. that the POSIX threading model & fork() will have timing issues of some sort or another no matter what you do. The work-around is to only fork when no such resources are held. So you do things like fork all your processes before starting a thread, or fork a server that will do all further forks upon request before starting a thread, etc.

So the question should not be whether CLOEXEC "fixes everything", but whether having it as the default is a good enough idea to be worth the pain of changing. I suspect the answer is yes, as most cases where it isn't set are probably because it's the default, so won't need changing.

Niklas Hambüchen

Aug 30, 2015, 9:58:41 AM

To: Donn Cave, haskel...@haskell.org

On 25/07/15 21:09, Donn Cave wrote:
> But there are other things that could turn up.
> For example, you could use flock(2) (Berkeley, not POSIX fcntl
> lock) to keep an advisory file lock until the exec exits. If the
> file is closed prematurely, you lose the lock, and ... whatever
> happens then.

This is a very valid point. Applications that rely on this will break by
changing the default here.

I'm wondering though whether this is an acceptable price to pay for
better (in my opinion) defaults. Given enough announcement and time, it
should not be too difficult to find Berkeley flock() invocations, and
explicitly fcntl their FDs to CLOEXEC=False, or open() them with
CLOEXEC=False.

I would even be surprised if there is a single Haskell program out there
that uses this; I know of one that uses file locking, but that's using
fcntl style locks.

Brandon Allbery

Aug 30, 2015, 10:18:33 AM

To: Niklas Hambüchen, haskell-cafe

On Sun, Aug 30, 2015 at 9:58 AM, Niklas Hambüchen <ma...@nh2.me> wrote:

> I would even be surprised if there is a single Haskell program out there
> that uses this; I know of one that uses file locking, but that's using
> fcntl style locks.

Also note that many systems emulate flock() with fcntl() locks, so that trick is nonportable anyway. (Linux used to do that, but stopped; unless you're holding onto a system with a pre-2.0 kernel or a weird Linux distribution whose glibc has been modified to do emulation, you should have real flock().)

Donn Cave

Aug 30, 2015, 11:00:36 AM

To: haskel...@haskell.org

Quoth Niklas Hambüchen <ma...@nh2.me>,

> On 25/07/15 21:09, Donn Cave wrote:
>> But there are other things that could turn up.
>> For example, you could use flock(2) (Berkeley, not POSIX fcntl
>> lock) to keep an advisory file lock until the exec exits. If the
>> file is closed prematurely, you lose the lock, and ... whatever
>> happens then.

> This is a very valid point. Applications that rely on this will break by
> changing the default here.

... and you go on to demonstrate that you didn't really take the point
I was trying to make. How did you find out about this flock(2) scenario?
Do you suppose that this is the only one, because you and I don't know
of any more?

My point is not that the whole thing might hinge on whether we can
deal with flock locking in this scenario, it is that when we think
about reversing ancient defaults in the underlying system, we have
to assume the risk of obscure breakage as a result. To worry about
the flock problem now is to miss the point.

And again, this half-a-fix inconsistently applies to only files created
by open(2), and not to pipes, sockets and whatever else creates a file
descriptor in some other way, so if it's a real problem, it seems you
must address it in some other way anyway.

Donn

Niklas Hambüchen

Aug 30, 2015, 6:16:57 PM

To: Donn Cave, haskel...@haskell.org

On 30/08/15 16:59, Donn Cave wrote:
> Quoth Niklas Hambüchen <ma...@nh2.me>,

> ... and you go on to demonstrate that you didn't really take the point
> I was trying to make. How did you find out about this flock(2) scenario?
> Do you suppose that this is the only one, because you and I don't know
> of any more?

No, I do take your point, and I admit that more things may break when
changing the default.

What I'm saying is that such cases are relatively easy to find. When you
rely on inheriting FDs, you know it.
It is exotic enough that when you've built something that needs it,
you'll remember it.

This leads me to think that the damage done / cost to fix introduced by
breaking the backwards compatibility here might be smaller than the
damage / fixing that will arise in the future from surprising behaviour
and security/privilege problems through leaked FDs in all non-exotic
programs that want to exec() something. I'd assume the Python and Perl
people came to this conclusion.

Further, our community is small, and if you announce something loud and
long-term via mailing list and Reddit, it will go a long way and make it
unlikely that somebody will be unaware of such a change. Your point that
we may break things that we don't know of / understand still stands though.

Regarding pipes and sockets, pipe2()/socket() accept CLOEXEC flags as well (O_CLOEXEC and SOCK_CLOEXEC respectively, at least on Linux).

David Turner

Aug 31, 2015, 3:28:33 AM

To: Niklas Hambüchen, haskel...@haskell.org

Exactly. As I said earlier, if you forget to clear FD_CLOEXEC when you meant to then your program breaks loudly and obviously; if you forget to *set* FD_CLOEXEC then the bug is much quieter and more subtle.

I asked for a specific example of some existing code that would be broken by this, but none was forthcoming. I understand why, in theory, changing this would be a problem (indeed, changing anything is similarly problematic) but in practice the pros enormously outweigh the cons here.

Niklas Hambüchen

Aug 31, 2015, 4:05:56 AM

To: David Turner, haskel...@haskell.org

On 31/08/15 09:28, David Turner wrote:
> Exactly. As I said earlier, if you forget to clear FD_CLOEXEC when you
> meant to then your program breaks loudly and obviously; if you forget to
> *set* FD_CLOEXEC then the bug is much quieter and more subtle.

I think the example given by Donn (a lock silently being cleared too
early) is a case where it does not break loudly and obviously.

I agree with the second part though that bugs related to leaking tend to
be quieter and more subtle in general.

David Turner

Aug 31, 2015, 4:46:52 AM

To: Niklas Hambüchen, haskel...@haskell.org

Yes, sure, but what actually does that? i.e. what locks a FD opened by the Haskell runtime and then expects to preserve the lock across a fork? I see that this is a problem in theory, but in practice is it?

The point is, how much code would really have to change to fix the fallout from this?
