How to handle EINTR from syscall.Dup2

pboam...@gmail.com

Feb 28, 2020, 3:41:39 PM
to golang-nuts
(I've asked the same question already, but probably in the wrong thread, sorry for the repost.)

What to do on EINTR from syscall.Dup2 (Linux)?

1) It never happens.
2) Retry.
3) Take it as irrecoverable.
4) Take it as success.

I know this is more of an OS question, but it all started with the asynchronous preemption announcement, and I don't know where else to get help.

In the signal(7) man page, dup2 is neither mentioned in the "affected by SA_RESTART" list, nor in the other list. Is it affected or not?

According to [1][2], dup2 can close newfd and still fail, therefore it should *never* be retried because a retry would cause a dangerous race.
This would mean that a signal during dup2 (nothing out of the ordinary) would produce an irrecoverable condition!
The man page says that close+dup is done "atomically", but it isn't clear whether "close and fail" is a possibility or not.

Someone also hypothesizes that EINTR from dup2 can actually mean success, because it comes from the implicit close(2) and EINTR from close is not a failure indication on Linux (there are plans to change it to EINPROGRESS), see also [3][4].

I hope someone can shed some light on this,
thanks.

[1] https://github.com/libuv/libuv/issues/462
[2] https://www.python.org/dev/peps/pep-0475
[3] https://stackoverflow.com/questions/15930013/can-dup2-really-return-eintr
[4] https://lwn.net/Articles/576478

Kurtis Rader

Feb 28, 2020, 5:42:45 PM
to pboam...@gmail.com, golang-nuts
On Fri, Feb 28, 2020 at 12:41 PM <pboam...@gmail.com> wrote:
> What to do on EINTR from syscall.Dup2 (Linux)?
>
> 1) It never happens.
> 2) Retry.
> 3) Take it as irrecoverable.
> 4) Take it as success.
>
> I know this is more of an OS question, but it all started with the asynchronous preemption announcement, and I don't know where else to get help.

Note that dup2() can only fail with EINTR if the new fd is currently open on a "slow" device and the implicit close() fails due to being interrupted. In my experience it is usually an error if the new fd is currently in use unless the new fd is 0, 1 or 2. If you expect the new fd to be in use at the time of the dup2() it is usually better, from the viewpoint of clarity, to incur the cost of an explicit close() call. If the new fd isn't in use then dup2() is atomic and will never fail with EINTR. So your question is really what to do if close() fails with EINTR and the answer is retry. However, in practice the only situation where EINTR will happen on close() is if the fd is open on a remote file system; e.g., an NFS-mounted file system. Even that is extremely unlikely unless there is a problem with the network or remote file server, in which case retrying is unlikely to succeed.

--
Kurtis Rader
Caretaker of the exceptional canines Junior and Hank

Philip Boampong

Feb 28, 2020, 11:27:29 PM
to Kurtis Rader, golang-nuts
Thanks for the reply.

> Note that dup2() can only fail with EINTR if the new fd is currently open on a "slow" device and the implicit close() fails due to being interrupted.

I understand the condition may be rare, but I still want to know the
correct way to handle it.

> In my experience it is usually an error if the new fd is currently in use unless the new fd is 0, 1 or 2.

The Go programmer is not fully in charge of the FD namespace:
libraries and the runtime can create new FDs at any time. Therefore,
unless you are sure that newfd *is* in use and know exactly what it
is, you are probably looking for trouble.

> If you expect the new fd to be in use at the time of the dup2() it is usually better, from the viewpoint of clarity, to incur the cost of an explicit close() call.

That seems very dangerous, especially in Go.
If another goroutine opens a file between the close and the dup2, such
open will likely reuse newfd which is about to be replaced. Then the
second goroutine will have the wrong file and operate concurrently on
it.
Atomicity with close is the whole point of dup2.
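
The reuse hazard described above is visible even in a single goroutine. The following is a minimal Linux-only sketch, not code from this thread: the helper name reusedAfterClose is hypothetical and /dev/null is just a convenient file to open. It closes a descriptor and then opens a file, which normally lands on the number that was just freed.

```go
package main

import (
	"fmt"
	"syscall"
)

// reusedAfterClose closes the read end of a fresh pipe and then opens
// /dev/null. It returns the closed descriptor number and the number handed
// out by open. The kernel allocates the lowest free descriptor, so the two
// usually match when nothing else in the process is allocating descriptors.
func reusedAfterClose() (int, int, error) {
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return 0, 0, err
	}
	defer syscall.Close(p[1])

	closed := p[0]
	if err := syscall.Close(closed); err != nil {
		return 0, 0, err
	}
	// Between the close above and a later dup, any open/socket/pipe call
	// made by any goroutine may take the freed slot, as this open does.
	reopened, err := syscall.Open("/dev/null", syscall.O_RDONLY, 0)
	if err != nil {
		return 0, 0, err
	}
	defer syscall.Close(reopened)
	return closed, reopened, nil
}

func main() {
	closed, reopened, err := reusedAfterClose()
	fmt.Println(closed, reopened, err)
}
```

With an explicit close followed by a dup, the second goroutine's open plays the role of the /dev/null open here; dup2 leaves no such window.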

> So your question is really what to do if close() fails with EINTR and the answer is retry.

The man page [1] explicitly says that Linux close(2) should *never* be
retried, not even on EINTR.
As I mentioned, there are plans to change close to return EINPROGRESS,
or even no error at all, instead of EINTR [2].

[1] http://man7.org/linux/man-pages/man2/close.2.html
[2] https://lwn.net/Articles/576478/

Brian Candler

Feb 29, 2020, 3:32:58 AM
to golang-nuts
Just to ask an obvious question: is dup2() idempotent or not?

Ian Lance Taylor

Feb 29, 2020, 7:35:06 AM
to Philip Boampong, Kurtis Rader, golang-nuts
On Fri, Feb 28, 2020 at 8:27 PM Philip Boampong <pboam...@gmail.com> wrote:
>
> The Go programmer is not fully in charge of the FD namespace:
> libraries and the runtime can create new FDs at any time. Therefore,
> unless you are sure that newfd *is* in use and know exactly what it
> is, you are probably looking for trouble.

It does not make sense to use dup2 if you are not in control of the FD
namespace. In order to use dup2 you need to specify the new FD. If
that FD might be concurrently opened by some other package, or by the
runtime, then you can not use dup2 safely. That is true regardless of
whether dup2 returns EINTR or not.

This doesn't mean that a Go program can never use dup2. The runtime
will only open a file descriptor when requested. You can avoid
packages that open descriptors at unpredictable times. You can use
locks or channels to ensure that when you call dup2 nothing else can
be opening a file. Of course, this does mean that a large Go program
with uncontrolled dependencies may not be able to call dup2 safely.
The same is true of a multi-threaded C program. It's in the nature of
the system call.
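
One way to picture the "locks or channels" suggestion is a process-wide read/write lock. This is only a sketch under strong assumptions: fdMu, openShared and replaceFD are hypothetical names, every piece of code that creates descriptors must cooperate by going through the lock, and syscall.Dup3 with flags 0 stands in for syscall.Dup2 because Dup2 is not available on some Linux ports such as arm64.

```go
package main

import (
	"fmt"
	"os"
	"sync"
	"syscall"
)

// fdMu is a hypothetical process-wide lock. Code that creates descriptors
// takes the read side; code that replaces a descriptor takes the write side.
var fdMu sync.RWMutex

// openShared opens a file while holding the read side of fdMu, so it can
// never run concurrently with replaceFD.
func openShared(name string) (*os.File, error) {
	fdMu.RLock()
	defer fdMu.RUnlock()
	return os.Open(name)
}

// replaceFD makes newfd refer to the same open file as oldfd while no
// cooperating opener is running. Dup3 with flags 0 behaves like dup2,
// except that it fails with EINVAL when oldfd == newfd.
func replaceFD(oldfd, newfd int) error {
	fdMu.Lock()
	defer fdMu.Unlock()
	return syscall.Dup3(oldfd, newfd, 0)
}

// sameFile reports whether two descriptors refer to the same device and inode.
func sameFile(a, b int) (bool, error) {
	var sa, sb syscall.Stat_t
	if err := syscall.Fstat(a, &sa); err != nil {
		return false, err
	}
	if err := syscall.Fstat(b, &sb); err != nil {
		return false, err
	}
	return sa.Dev == sb.Dev && sa.Ino == sb.Ino, nil
}

// demo opens two different files, replaces the second descriptor with the
// first, and reports whether both now refer to the same file.
func demo() (bool, error) {
	a, err := openShared("/dev/null")
	if err != nil {
		return false, err
	}
	defer a.Close()
	b, err := openShared("/dev/zero")
	if err != nil {
		return false, err
	}
	defer b.Close()
	if err := replaceFD(int(a.Fd()), int(b.Fd())); err != nil {
		return false, err
	}
	return sameFile(int(a.Fd()), int(b.Fd()))
}

func main() {
	same, err := demo()
	fmt.Println(same, err)
}
```

The lock protects nothing against packages that open descriptors without taking it, which is the "uncontrolled dependencies" caveat above.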

Is this a theoretical question or one that arises from real code? The
dup2 system call has fairly specialized uses. It's unusual for a
general purpose program to need to call it.

Ian

Ian Lance Taylor

Feb 29, 2020, 7:37:24 AM
to Brian Candler, golang-nuts
On Sat, Feb 29, 2020 at 12:33 AM Brian Candler <b.ca...@pobox.com> wrote:
>
> Just to ask an obvious question: is dup2() idempotent or not?

dup2 in itself is idempotent. But I'm not sure that is a useful
question. The issue is whether some other thread in the same process
can open a file at the target file descriptor between calls to dup2.
To put it another way, dup2 is idempotent, but if you make multiple
calls to dup2, the order in which you make those calls matters.

Ian

Manlio Perillo

Feb 29, 2020, 9:41:00 AM
to golang-nuts
On Friday, February 28, 2020 at 9:41:39 PM UTC+1, pboam...@gmail.com wrote:
> (I've asked the same question already, but probably in the wrong thread, sorry for the repost.)
>
> What to do on EINTR from syscall.Dup2 (Linux)?
>
> 1) It never happens.
> 2) Retry.
> 3) Take it as irrecoverable.
> 4) Take it as success.
>
> I know this is more of an OS question, but it all started with the asynchronous preemption announcement, and I don't know where else to get help.
>
> In the signal(7) man page, dup2 is neither mentioned in the "affected by SA_RESTART" list, nor in the other list. Is it affected or not?
>
> According to [1][2], dup2 can close newfd and still fail, therefore it should *never* be retried because a retry would cause a dangerous race.
> This would mean that a signal during dup2 (nothing out of the ordinary) would produce an irrecoverable condition!
> The man page says that close+dup is done "atomically", but it isn't clear whether "close and fail" is a possibility or not.


What about using fcntl with F_DUPFD?
It will "Duplicate the file descriptor fd using the lowest-numbered available file descriptor greater than or equal to arg"

It does not have the problems of dup2 where you can use an already in use fd, and unlike dup you can specify where the new fd should be allocated.
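
In Go on Linux, F_DUPFD can be reached through the raw fcntl system call. A minimal sketch, assuming Linux; dupAtLeast is a hypothetical helper name, and real code would probably prefer F_DUPFD_CLOEXEC so the duplicate is not inherited across exec:

```go
package main

import (
	"fmt"
	"syscall"
)

// dupAtLeast duplicates fd onto the lowest free descriptor that is greater
// than or equal to lowest, using fcntl(fd, F_DUPFD, lowest). Unlike dup2,
// it never replaces a descriptor that is already open.
func dupAtLeast(fd, lowest int) (int, error) {
	r, _, errno := syscall.Syscall(syscall.SYS_FCNTL,
		uintptr(fd), uintptr(syscall.F_DUPFD), uintptr(lowest))
	if errno != 0 {
		return -1, errno
	}
	return int(r), nil
}

func main() {
	// Duplicate stdout at descriptor 100 or above.
	nfd, err := dupAtLeast(1, 100)
	if err != nil {
		fmt.Println("fcntl:", err)
		return
	}
	defer syscall.Close(nfd)
	fmt.Println("stdout duplicated at", nfd)
}
```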


Manlio 

Uli Kunitz

Feb 29, 2020, 10:35:24 AM
to golang-nuts
My reading of the Linux kernel sources (5.3) is that dup2 will never return -EINTR. Any necessary file closure will happen, but its return value will be ignored.

But you will have to program the loop around the syscall anyway, because Linux may return EBUSY, which happens if the new fd has been allocated by somebody else, but has not finished the initialization of the descriptor. See the comment in the source below.

So if you want to write a safe program, you have to implement the loop around dup2 checking for EBUSY. It cannot harm to check for EINTR as well.

Here is the core routine from fs/file.c in the Linux kernel.

static int do_dup2(struct files_struct *files,
        struct file *file, unsigned fd, unsigned flags)
__releases(&files->file_lock)
{
        struct file *tofree;
        struct fdtable *fdt;

        /*
         * We need to detect attempts to do dup2() over allocated but still
         * not finished descriptor.  NB: OpenBSD avoids that at the price of
         * extra work in their equivalent of fget() - they insert struct
         * file immediately after grabbing descriptor, mark it larval if
         * more work (e.g. actual opening) is needed and make sure that
         * fget() treats larval files as absent.  Potentially interesting,
         * but while extra work in fget() is trivial, locking implications
         * and amount of surgery on open()-related paths in VFS are not.
         * FreeBSD fails with -EBADF in the same situation, NetBSD "solution"
         * deadlocks in rather amusing ways, AFAICS.  All of that is out of
         * scope of POSIX or SUS, since neither considers shared descriptor
         * tables and this condition does not arise without those.
         */
        fdt = files_fdtable(files);
        tofree = fdt->fd[fd];
        if (!tofree && fd_is_open(fd, fdt))
                goto Ebusy;
        get_file(file);
        rcu_assign_pointer(fdt->fd[fd], file);
        __set_open_fd(fd, fdt);
        if (flags & O_CLOEXEC)
                __set_close_on_exec(fd, fdt);
        else
                __clear_close_on_exec(fd, fdt);
        spin_unlock(&files->file_lock);

        if (tofree)
                filp_close(tofree, files);

        return fd;

Ebusy:
        spin_unlock(&files->file_lock);
        return -EBUSY;
}
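
A loop around the syscall, as suggested above, might look like the following in Go. This is a hedged sketch rather than a recommendation: dupRetryBusy, the attempt limit and the one-millisecond back-off are all made up for illustration, syscall.Dup3 with flags 0 is used in place of syscall.Dup2 (which is not available on some Linux ports such as arm64), and EINTR is deliberately not retried because whether that is safe is exactly what this thread is debating.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
	"time"
)

// dupRetryBusy calls dup3(oldfd, newfd, 0) and retries a bounded number of
// times while the kernel reports EBUSY, i.e. while newfd is allocated but
// not yet fully installed by another thread. Any other result, including
// EINTR, is returned to the caller unchanged.
func dupRetryBusy(oldfd, newfd, attempts int) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		err = syscall.Dup3(oldfd, newfd, 0)
		if !errors.Is(err, syscall.EBUSY) {
			return err // nil on success, or an error this sketch does not retry
		}
		time.Sleep(time.Millisecond) // back off instead of spinning
	}
	return err
}

// demo replaces a /dev/null descriptor with the write end of a pipe and
// checks that data written to the replaced descriptor arrives on the pipe.
func demo() (string, error) {
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return "", err
	}
	defer syscall.Close(p[0])
	defer syscall.Close(p[1])

	target, err := syscall.Open("/dev/null", syscall.O_WRONLY, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(target)

	if err := dupRetryBusy(p[1], target, 3); err != nil {
		return "", err
	}
	if _, err := syscall.Write(target, []byte("x")); err != nil {
		return "", err
	}
	buf := make([]byte, 1)
	n, err := syscall.Read(p[0], buf)
	return string(buf[:n]), err
}

func main() {
	got, err := demo()
	fmt.Printf("%q %v\n", got, err)
}
```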


Brian Candler

Feb 29, 2020, 3:13:13 PM
to golang-nuts
On Saturday, 29 February 2020 12:37:24 UTC, Ian Lance Taylor wrote:
> On Sat, Feb 29, 2020 at 12:33 AM Brian Candler <b.ca...@pobox.com> wrote:
> >
> > Just to ask an obvious question: is dup2() idempotent or not?
>
> dup2 in itself is idempotent.  But I'm not sure that is a useful
> question.

I think it makes sense in the context of the question "should I repeat dup2() if it returns with EINTR?"  If it's idempotent, then it's safe to do so.

 
> The issue is whether some other thread in the same process
> can open a file at the target file descriptor between calls to dup2.
> To put it another way, dup2 is idempotent, but if you make multiple
> calls to dup2, the order in which you make those calls matters.

I don't quite follow.  If two threads are fighting to use the target fd, then that's just a race anyway.

fds are a global resource; if one thread is doing (say) dup2(3,10), then it must already be sure that fd 10 is available globally, i.e. it won't be stomping on any other thread using fd 10.

Philip Boampong

Feb 29, 2020, 3:17:05 PM
to Ian Lance Taylor, golang-nuts
On Sat, Feb 29, 2020 at 1:34 PM Ian Lance Taylor <ia...@golang.org> wrote:
>
> It does not make sense to use dup2 if you are not in control of the FD
> namespace. In order to use dup2 you need to specify the new FD. If
> that FD might be concurrently opened by some other package, or by the
> runtime, then you can not use dup2 safely.

But if you opened newfd yourself, or if it is 0/1/2 and you never
closed os.Std*, then you *can* dup2 safely, regardless of other
packages.

> This doesn't mean that a Go program can never use dup2. The runtime
> will only open a file descriptor when requested. You can avoid
> packages that open descriptors at unpredictable times.

Can you really do that? I don't think the standard library guarantees
that it will not create a new FD behind the scenes tomorrow (nor does it
exactly document its FD usage and timing).

> Is this a theoretical question or one that arises from real code?

It's not theoretical, I became aware of EINTR and I'm trying to fix my own code.
I use dup2 to redirect FD 0/1/2 from inside the program itself (I know
I can assign to os.Std* but that is not always sufficient or safe).
The most common case is to redirect stderr including panic output.
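
For readers who have not seen the pattern, redirecting fd 2 from inside the process looks roughly like this. It is a minimal Linux-only sketch, not the poster's actual code: redirectStderr and the temporary log file are hypothetical, and syscall.Dup3 with flags 0 replaces syscall.Dup2 because Dup2 is not available on some Linux ports such as arm64.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// redirectStderr makes descriptor 2 refer to the named log file, so that
// anything written directly to fd 2, including the runtime's panic traces,
// lands in the file.
func redirectStderr(path string) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0o644)
	if err != nil {
		return err
	}
	defer f.Close() // fd 2 keeps the file open once the duplicate exists
	return syscall.Dup3(int(f.Fd()), 2, 0)
}

// demo redirects stderr to a temporary file, writes one line through
// os.Stderr, restores the original stderr and returns the file contents.
func demo() (string, error) {
	saved, err := syscall.Dup(2) // keep the original stderr for restoring
	if err != nil {
		return "", err
	}
	defer func() {
		syscall.Dup3(saved, 2, 0)
		syscall.Close(saved)
	}()

	tmp, err := os.CreateTemp("", "stderr-*.log")
	if err != nil {
		return "", err
	}
	name := tmp.Name()
	tmp.Close()
	defer os.Remove(name)

	if err := redirectStderr(name); err != nil {
		return "", err
	}
	fmt.Fprintln(os.Stderr, "hello") // os.Stderr still writes to fd 2
	data, err := os.ReadFile(name)
	return string(data), err
}

func main() {
	got, err := demo()
	fmt.Printf("%q %v\n", got, err)
}
```

Assigning a new *os.File to os.Stderr would not achieve the same thing, because code that writes to fd 2 directly never consults that variable.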

As you can see, I cannot retry dup2 on EINTR unless I'm sure that the
first call has left newfd open, otherwise I will incur the race. But
if I don't retry it, then I have no way to recover from ordinary
signals!

I don't think you can argue that I'm supposed to have control of the
FD namespace: the whole point of dup2 being atomic [1] is that users
may not have such control, see also [2][3].

(I'm not actually getting EINTR from my dup2's, but I want to handle
it correctly if it can happen.)

> > Just to ask an obvious question: is dup2() idempotent or not?
>
> dup2 in itself is idempotent.

It's hard to talk about idempotence when the context changes
unpredictably (FD state).
Dup2 is "atomic", in the sense that newfd is never reusable during the
whole syscall.
But is it "atomic" in the sense that it will either "leave FDs
unchanged with an error", or "complete without error", nothing in
between?

[1] http://man7.org/linux/man-pages/man2/dup2.2.html
[2] https://lwn.net/Articles/236843/
[3] https://stackoverflow.com/questions/23440216/race-condition-when-using-dup2

Philip Boampong

Feb 29, 2020, 3:24:35 PM
to Manlio Perillo, golang-nuts
On Sat, Feb 29, 2020 at 3:41 PM Manlio Perillo <manlio....@gmail.com> wrote:
>
> What about using fcntl with F_DUPFD?
> [...]
> It does not have the problems of dup2 where you can use an already in use fd, and unlike dup you can specify where the new fd should be allocated.

Thanks, good to know and may come in handy!
But my current dup2 use case is to redirect a specific FD (0/1/2). If
between close and fcntl another goroutine steals the FD I need, I can
detect it but I can't do anything about it.

Philip Boampong

Feb 29, 2020, 3:32:11 PM
to Uli Kunitz, golang-nuts
On Sat, Feb 29, 2020 at 4:35 PM Uli Kunitz <uli.k...@gmail.com> wrote:
>
> My reading of the Linux kernel sources (5.3) is that dup2 will never return -EINTR.

Thanks, good to know.

> But you will have to program the loop around the syscall anyway, because Linux may return EBUSY, which happens if the new fd has been allocated by somebody else, but has not finished the initialization of the descriptor.

In my use cases I don't think I need to worry about EBUSY because
newfd is always 0/1/2 which should be already fully open.
Someone argues that it doesn't make much sense to retry dup2 on EBUSY,
because it's an indication of a serious problem (you are trying to
swap a file under another goroutine's feet), see [1].
Also an open can take any amount of time to complete (e.g. a fifo) and
it doesn't seem like a good idea to busy-wait for it in a loop.

> So if you want to write a safe program, you have to implement the loop around dup2 checking for EBUSY. It cannot harm to check for EINTR as well.

Whether it cannot harm is what I'm trying to find out.
If newfd gets closed then a retry loop is racy, see my previous messages.

[1] https://stackoverflow.com/questions/23440216/race-condition-when-using-dup2

Philip Boampong

Feb 29, 2020, 4:04:56 PM
to Brian Candler, golang-nuts
On Sat, Feb 29, 2020 at 9:13 PM Brian Candler <b.ca...@pobox.com> wrote:
>
> I don't quite follow. If two threads are fighting to use the target fd, then that's just a race anyway.

One case is when you have full control of the FD namespace, then you
can rely on your own synchronization and do whatever you want, but
that seems very hard to me in Go (you would have to be aware of every
FD creation from any package).

The other case is when both oldfd and newfd are open and you control
them (or at least you are 100% sure that they will not be closed by
other goroutines). Then you can dup2 safely, thanks to the atomicity
guarantee.
But if a failed dup2 closes newfd without replacing it, then you lose
control of newfd (which becomes reusable) and there's no way to retry
the operation safely.

See my reply to Ian for more details:
https://groups.google.com/d/msg/golang-nuts/1AbvThUg3YE/phVzNNvoAAAJ

Uli Kunitz

Feb 29, 2020, 4:13:34 PM
to golang-nuts


On Saturday, February 29, 2020 at 9:32:11 PM UTC+1, Philip Boampong wrote:
> Whether it cannot harm is what I'm trying to find out.
> If newfd gets closed then a retry loop is racy, see my previous messages.
>
> [1] https://stackoverflow.com/questions/23440216/race-condition-when-using-dup2

I don't understand. The point of dup2 is that it is atomic. If you are getting an error nothing has happened, no replacement of newfd and no close, so there is no harm in trying again after EBUSY or EINTR.

Philip Boampong

Feb 29, 2020, 5:39:48 PM
to Uli Kunitz, golang-nuts
> If you are getting an error nothing has happened, no replacement of newfd and no close

I wish that sentence was written on the man page.
That was the way I first understood it too (and it makes more sense)
but the little information I found disagrees (libuv [1], python [2]
(see the note about dup2)).

The man page says that "The steps of closing and reusing the file
descriptor newfd are performed atomically", but it is possible that
such sentence is only meant to imply that newfd is never reusable
during the syscall.
I'm not comfortable accepting your interpretation of "atomic" when
there is no clear reference and the python implementation disagrees;
that's why I'm asking for more evidence.

[1] https://github.com/libuv/libuv/issues/462
[2] https://www.python.org/dev/peps/pep-0475/

Ian Lance Taylor

Feb 29, 2020, 9:07:07 PM
to Philip Boampong, golang-nuts
On Sat, Feb 29, 2020 at 12:16 PM Philip Boampong <pboam...@gmail.com> wrote:
>
> On Sat, Feb 29, 2020 at 1:34 PM Ian Lance Taylor <ia...@golang.org> wrote:
> >
> > It does not make sense to use dup2 if you are not in control of the FD
> > namespace. In order to use dup2 you need to specify the new FD. If
> > that FD might be concurrently opened by some other package, or by the
> > runtime, then you can not use dup2 safely.
>
> But if you opened newfd yourself, or if it is 0/1/2 and you never
> closed os.Std*, then you *can* dup2 safely, regardless of other
> packages.

Those are examples where you are in charge of the FD namespace
(assuming you know that no other code is going to touch descriptors
0/1/2).


> > This doesn't mean that a Go program can never use dup2. The runtime
> > will only open a file descriptor when requested. You can avoid
> > packages that open descriptors at unpredictable times.
>
> Can you really do that? I don't think the standard library guarantees
> that it will not create a new FD behind the scenes tomorrow (nor does it
> exactly document its FD usage and timing).
>
> > Is this a theoretical question or one that arises from real code?
>
> It's not theoretical, I became aware of EINTR and I'm trying to fix my own code.
> I use dup2 to redirect FD 0/1/2 from inside the program itself (I know
> I can assign to os.Std* but that is not always sufficient or safe).
> The most common case is to redirect stderr including panic output.

So you are calling dup2(N, 2)? What problem are you trying to avoid?


> As you can see, I cannot retry dup2 on EINTR unless I'm sure that the
> first call has left newfd open, otherwise I will incur the race. But
> if I don't retry it, then I have no way to recover from ordinary
> signals!

What race are you worried about? You are already assuming that
nothing else is going to touch descriptor 2. dup2 is documented to
atomically close newfd and duplicate oldfd onto it. That means that
either newfd is untouched, or oldfd is duplicated onto it. That is
true whether dup2 returns EINTR or not. And dup2 is not documented to
return EINTR, and the Linux kernel code shown above does not have any
path that returns EINTR.


> I don't think you can argue that I'm supposed to have control of the
> FD namespace: the whole point of dup2 being atomic [1] is that users
> may not have such control, see also [2][3].

You need to have control because you have to know what is happening
with newfd. If two different goroutines call dup2(..., 2) then you
need to know which goroutine is going to run last.


> (I'm not actually getting EINTR from my dup2's, but I want to handle
> it correctly if it can happen.)
>
> > > Just to ask an obvious question: is dup2() idempotent or not?
> >
> > dup2 in itself is idempotent.
>
> It's hard to talk about idempotence when the context changes
> unpredictably (FD state).
> Dup2 is "atomic", in the sense that newfd is never reusable during the
> whole syscall.
> But is it "atomic" in the sense that it will either "leave FDs
> unchanged with an error", or "complete without error", nothing in
> between?

That is what it means to atomically close newfd and duplicate oldfd
onto newfd. It will either leave newfd unchanged, or it will
duplicate oldfd onto newfd. Any other possibility would not be
atomic.

Quoting the GNU/Linux man page:

The steps of closing and reusing the file descriptor newfd are performed
atomically. This is important, because trying to implement
equivalent functionality using close(2) and dup() would be subject to
race conditions, whereby newfd might be reused between the two steps.
Such reuse could happen because the main program is interrupted by a
signal handler that allocates a file descriptor, or because a parallel
thread allocates a file descriptor.

Note the explicit mention of a signal handler.

Ian

Ian Lance Taylor

Feb 29, 2020, 9:11:53 PM
to Philip Boampong, golang-nuts
On Sat, Feb 29, 2020 at 12:16 PM Philip Boampong <pboam...@gmail.com> wrote:
>
> Can you really do that? I don't think the standard library guarantees
> that it will not create a new FD behind the scenes tomorrow (nor does it
> exactly document its FD usage and timing).

I forgot to reply to this point. The Go standard library guarantees
to not be surprising or bizarre. It's not going to open a file unless
the function/method clearly requires opening a file.

Ian

Kurtis Rader

Feb 29, 2020, 11:35:11 PM
to Philip Boampong, golang-nuts
In addition to all of the other points that have been made by Ian and others I think it is important to reinforce another point. On UNIX like systems every function call that returns a file descriptor (e.g., `open()` and `socket()`) is expected to return the lowest unused file descriptor. If the second fd you pass to `dup2()` is not already open then there is an inherent race, resulting in random failures, of any use of `dup2()` without coordination with every other thread that might allocate a file descriptor. This has nothing to do with Go. It is inherent in the UNIX process model.

The `dup2()` function is typically used in two highly stylized ways. The first, and most common, is to open a file like object and alias it to stdin (fd 0), stdout (fd 1), or stderr (fd 2). Assuming those three file descriptors were open at the time of the `dup2()` it is guaranteed there will not be a race with other threads that might allocate a new fd. Even if the `dup2()` fails with EINTR. Retrying the `dup2()` in that instance might succeed but will probably fail since EINTR on `close()` typically only occurs when there is a persistent error.

The other common use case is exemplified by POSIX 1003.2 shells such as ksh and bash. Here we're talking about processes with a single thread that might open a file like object but want to ensure the fd is not in a "reserved" range of file descriptors. Specifically, POSIX shells allow a script to explicitly redirect to a single digit (i.e., 0 <= fd < 10) file descriptor. So if the shell performs an operation that assigns a fd in that range it will need to move it outside of that range. A typical example is opening the interactive command history file. The fd from opening the command history file might be in the reserved range. So the shell uses `dup2()` to move it outside that range. The shell knows which file descriptors are in use and can safely use `dup2()` to reassign the fd.

If you don't use `dup2()` in the stylized ways discussed above then you are guaranteed to have random failures of your code. What makes those failures particularly problematic is that they will occur exceedingly infrequently. You wrote:

> Atomicity with close is the whole point of dup2.

No, it is not.

Kurtis Rader

Feb 29, 2020, 11:54:30 PM
to Uli Kunitz, golang-nuts
On Sat, Feb 29, 2020 at 7:35 AM Uli Kunitz <uli.k...@gmail.com> wrote:
> My reading of the Linux kernel sources (5.3) is that dup2 will never return -EINTR. Any necessary file closure will happen, but its return value will be ignored.

That a specific implementation might never return EINTR as a result of a `dup2()` call is irrelevant. The question is what happens on any UNIX like system. It is also not at all clear that even Linux won't return EINTR given the vagaries of complex API layering. Your statement that "Any necessary file closure will happen" is unfounded. The whole point of returning EINTR is that the close did not happen. If the close was successful then there was no point in returning an EINTR failure.
 

Philip Boampong

Mar 1, 2020, 12:46:29 AM
to Ian Lance Taylor, golang-nuts
On Sun, Mar 1, 2020 at 3:06 AM Ian Lance Taylor <ia...@golang.org> wrote:
>
> > But if you opened newfd yourself, or if it is 0/1/2 and you never
> > closed os.Std*, then you *can* dup2 safely, regardless of other
> > packages.
>
> Those are examples where you are in charge of the FD namespace
> (assuming you know that no other code is going to touch descriptors
> 0/1/2).

I think we've understood each other on this point. We were just
interpreting the expression "being in charge of the FD namespace"
differently.
I actually meant "being in control of the timing of every single FD
creation in the program" and I wrote it in answer to Rader (who was
talking about "unused" FDs) to point out that you cannot be sure that
one is unused unless you have that kind of control.

> So you are calling dup2(N, 2)?

Yes.

> And dup2 is not documented to return EINTR

Yes, it is. In the same man page you quoted [1].
| EINTR The dup2() or dup3() call was interrupted by a signal; see
| signal(7).

> and the Linux kernel code shown above does not have any path that
> returns EINTR.

That's a relief, but I would be more comfortable if the man page was accurate.

> dup2 is documented to atomically close newfd and duplicate oldfd
> onto it. That means that either newfd is untouched, or oldfd is
> duplicated onto it.

If that is true, there is indeed no problem in retrying on EINTR, but
(as I explained two times already) someone disagrees with your
premise, notably the python standard library implementation [2].

From PEP 475 [3]:
| os.close, close() methods and os.dup2() are a special case: they will
| ignore EINTR instead of retrying. The reason is complex but involves
| behaviour under Linux and the fact that the file descriptor may really
| be closed even if EINTR is returned.

"the file descriptor may really be closed even if EINTR is returned",
hence the alleged race if dup2 is retried.

libuv is of the same opinion [4].

That's why I was asking for more evidence about the exact meaning of
"performed atomically". It could just mean that no one can "steal"
newfd between close and duplication, even though your stronger
interpretation makes sense and seems to reflect the actual kernel
code.

Maybe the python folks are wrongly assuming that errors from the
implicit close are reported by dup2 (the man page clearly says that they
are not).

[1] http://man7.org/linux/man-pages/man2/dup2.2.html
[2] https://github.com/python/cpython/blob/6e02691f300c9918ac5806dafa1f2ecef451d733/Modules/posixmodule.c#L8730
[3] https://www.python.org/dev/peps/pep-0475/
[4] https://github.com/libuv/libuv/issues/462

Uli Kunitz

Mar 1, 2020, 3:13:44 AM
to golang-nuts
Thanks, now I understand the concerns better. For Linux I stand by my remark: if you get an error from dup2, nothing has happened. But this is not true for POSIX, which requires the error from the close to be reported. Linux doesn't do that; if you look carefully at the code I sent, the result of the close is ignored.

        if (tofree)
                filp_close(tofree, files);

So I have to withdraw my remark that handling EINTR can do no harm. It actually can on a system that follows the POSIX specification, because POSIX.1-2013 leaves the state of the file descriptor unspecified after an interrupted close. So the correct behavior for EINTR handling of dup2 and close is to report it back to the caller. On existing Linux kernels, however, EINTR will never be returned by dup2. The Python people got it right.

Ian Lance Taylor

Mar 1, 2020, 9:56:28 AM
to Philip Boampong, golang-nuts
If dup2 can 1) close newfd; 2) receive a signal before duping oldfd to
newfd; 3) return EINTR leaving newfd closed, then dup2 requires
considerable care in any multi-threaded program. It requires that if
one thread is calling dup2, no other thread is permitted to open a
file or socket or other file descriptor. That seems both unfortunate
and unbelievable. I would like to see hard evidence before believing
that kernel developers for any OS would create a system with such a
bug.

Ian

Michael Jones

Mar 1, 2020, 10:53:34 PM
to Ian Lance Taylor, Philip Boampong, golang-nuts
...and if true, this notion of “it’s atomic unless it’s not” deserves a name: the “subatomic operation.”

--
Michael T. Jones
michae...@gmail.com

Uli Kunitz

Mar 1, 2020, 11:46:03 PM
to golang-nuts
Ian, It is unclear how to interpret the POSIX specification regarding dup2 returning EINTR.

The POSIX specification [1] of dup2 says:

"If the close operation fails to close fildes2, dup2() shall return -1 without changing the open file description to which fildes2 refers."

It appears that the authors assumed that close failing means that the close has not been executed and the file descriptor is still open. POSIX 2013 allows EINTR as return from dup2. However the same specification says for close [2]:

"If close() is interrupted by a signal that is to be caught, it shall return -1 with errno set to [EINTR] and the state of fildes is unspecified."

Now it is anybody's guess what the state of the old fildes2/newfd is after the hypothetical POSIX dup2 sets errno to EINTR.

On the Linux kernel close can indeed return EINTR. See the Linux man page for close [3]. However it is guaranteed that the file descriptor is actually closed, when EINTR is reported. The actual Linux kernel dup2 system call deals with the situation by ignoring errors of the close call and replaces the file descriptor no matter what. The situation is confused by the dup2 man page still stating that EINTR might be returned by dup2, which actually cannot happen.

The Linux man page mentions that close on HP-UX returns EINTR and leaves the file descriptor open. On this system an EINTR errno set by dup2 can be handled by repeating the dup2 call.
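
For close itself, the Linux behavior described above suggests calling it exactly once and not repeating it on EINTR. A minimal sketch, assuming Linux semantics only (on a system like HP-UX the opposite handling would apply); closeOnce is a hypothetical helper name:

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// closeOnce closes fd exactly once. On Linux the descriptor is released
// even when close reports EINTR, so the call is never repeated: a second
// close could hit a descriptor that another goroutine has just been given.
func closeOnce(fd int) error {
	err := syscall.Close(fd)
	if errors.Is(err, syscall.EINTR) {
		return nil // Linux-specific assumption: fd is already closed
	}
	return err
}

// demo closes the read end of a pipe once and then uses fstat to check
// that the descriptor is gone (fstat fails with EBADF).
func demo() (bool, error) {
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return false, err
	}
	defer syscall.Close(p[1])

	if err := closeOnce(p[0]); err != nil {
		return false, err
	}
	var st syscall.Stat_t
	return errors.Is(syscall.Fstat(p[0], &st), syscall.EBADF), nil
}

func main() {
	gone, err := demo()
	fmt.Println(gone, err)
}
```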

Philip Boampong

Mar 4, 2020, 9:20:49 AM
to Ian Lance Taylor, golang-nuts
On Sun, Mar 1, 2020 at 3:55 PM Ian Lance Taylor <ia...@golang.org> wrote:
>
> If dup2 can 1) close newfd; 2) receive a signal before duping oldfd to
> newfd; 3) return EINTR leaving newfd closed, then dup2 requires
> considerable care in any multi-threaded program. It requires that if
> one thread is calling dup2, no other thread is permitted to open a
> file or socket or other file descriptor. That seems both unfortunate
> and unbelievable. I would like to see hard evidence before believing
> that kernel developers for any OS would create a system with such a
> bug.

I'm not trying to convince you, I'm just showing the little
information I found (with links); you could have just said they are
wrong.
What you wrote here is what I have been trying to say from the start. That kind of
dup2 would be impractical to use, and why have *any* atomicity
guarantee at all if you have to synchronize everything yourself
anyway?
But I could argue that the man page is not explicit enough to disprove the
wrong interpretation, even if it is "common sense".

It seems that if I want to use dup2 at all, I have no other choice
than to believe that retry on EINTR is harmless (and I hope in every
sane system it is).

Philip Boampong

Mar 4, 2020, 9:36:27 AM
to Uli Kunitz, golang-nuts
Thanks for the feedback.

> "If the close operation fails to close fildes2, dup2() shall return -1 without changing the open file description to which fildes2 refers."
> "If close() is interrupted by a signal that is to be caught, it shall return -1 with errno set to [EINTR] and the state of fildes is unspecified."

The two statements you quoted are not necessarily contradictory: one
is in the context of dup2(2), the other is in the context of close(2).
In the context of dup2 they may be using the word "close" informally,
without implying that a literal close(2) should be called internally.
(Even though it seems inconvenient to actually have two different behaviors.)

I wish the spec was more specific, but the dup2 prescription not to
change the open file description to which fildes2 refers in case of
error, seems to be intended as a guarantee that you can safely retry
dup2 on EINTR.

As you observed, by ignoring the result of the implicit close, Linux
dup2 solves the problem from another angle. It deviates from the
standard but it maintains compatibility, in the sense that it won't
break if you wrap your dup2's into retry loops.

If a dup2 can free fildes2/newfd and return EINTR, you are basically
accepting that even ordinary signals (e.g. used by the runtime) can
crash your dup2 in an irrecoverable way. I hope this can't be the case
in a sane system (unless someone knows of a system which is actually
affected).

> The situation is confused by the dup2 man page still stating that EINTR might be returned by dup2, which actually cannot happen.

Yes, I was confused by that too.
Also, neither dup2 nor close is mentioned in the signal(7) man page,
so you can't tell whether they are supposed to be affected by
SA_RESTART.

Ian Lance Taylor

Mar 4, 2020, 10:10:26 AM
to Philip Boampong, golang-nuts
On Wed, Mar 4, 2020 at 6:20 AM Philip Boampong <pboam...@gmail.com> wrote:
>
> On Sun, Mar 1, 2020 at 3:55 PM Ian Lance Taylor <ia...@golang.org> wrote:
> >
> > If dup2 can 1) close newfd; 2) receive a signal before duping oldfd to
> > newfd; 3) return EINTR leaving newfd closed, then dup2 requires
> > considerable care in any multi-threaded program. It requires that if
> > one thread is calling dup2, no other thread is permitted to open a
> > file or socket or other file descriptor. That seems both unfortunate
> > and unbelievable. I would like to see hard evidence before believing
> > that kernel developers for any OS would create a system with such a
> > bug.
>
> I'm not trying to convince you, I'm just showing the little
> information I found (with links); you could have just said they are
> wrong.

The links that I looked at were speculating about behavior, not
demonstrating real behavior in kernels. I don't know how to judge
whether those speculations about hypothetical behavior are correct or
not. Many things are hypothetically possible. I'm just saying that
before changing any code I would need to see actual evidence that some
kernel behaves in that way, not just speculation that it might.

If I missed something in those links, I apologize.

> It seems that if I want to use dup2 at all, I have no other choice
> than to believe that retry on EINTR is harmless (and I hope in every
> sane system it is).

In practice, if you don't use a FUSE file system, I bet that you will
never see an EINTR error from dup2. That goes double for Go programs,
as the Go runtime installs all signal handlers with SA_RESTART set.

This isn't worth all that much, but using the os/exec package to start
a new program will use dup2 to set up file descriptors. Those calls
to dup2 do not retry on EINTR; any EINTR error will be reported back
to the caller as an execution failure. I am not aware of any bug
reports in which that actually happened.

Ian

Philip Boampong

Mar 4, 2020, 3:29:54 PM
to Ian Lance Taylor, golang-nuts
On Wed, Mar 4, 2020 at 4:09 PM Ian Lance Taylor <ia...@golang.org> wrote:
>
> The links that I looked at were speculating about behavior, not
> demonstrating real behavior in kernels. I don't know how to judge
> whether those speculations about hypothetical behavior are correct or
> not. Many things are hypothetically possible. I'm just saying that
> before changing any code I would need to see actual evidence that some
> kernel behaves in that way, not just speculation that it might.

If they were demonstrating real behavior in kernels, I wouldn't be asking.
I had already wrapped all of my dup2's inside retry loops when I came
across that part of the Python source which recommends not to do it
because it would cause problems; then I started having doubts.
I reiterated their theory here in response to the feedback only
because I was trying to make my question understood. I'm not
suggesting to change any code nor to act on speculation.

> In practice, if you don't use a FUSE file system, I bet that you will
> never see an EINTR error from dup2.

In Linux it seems that it can't happen regardless of the filesystem,
but I would prefer to leave a little slack if the man page says
otherwise.
I would also like to handle FUSE file systems as gracefully as possible.

> That goes double for Go programs, as the Go runtime installs all
> signal handlers with SA_RESTART set.

Unfortunately the signal(7) man page doesn't say whether dup2 (or even
close) is affected by SA_RESTART.

> This isn't worth all that much, but using the os/exec package to start
> a new program will use dup2 to set up file descriptors.

Good to know. I did try to search for "syscall.Dup2" in the Go source
but that's not how dup2 is called; now I've found some.