Reopening a file descriptor

torto...@gmail.com

Sep 18, 2017, 7:59:04 PM
to ISO C++ Standard - Future Proposals
Possibly slightly off topic but bear with me.
Do C, C++ and Posix require a way to re-open a file descriptor?

The two common cases are:
* I have a file and after reading it I may decide I want to write it after all.
* I create a new file and having finished setting it up I want to make it read only.

Recent discussions about race free file systems have made me more aware of the potential problems with this.
When going from read to write you do need a permission check but the other way around you do not.

Ideally you should be able to re-open based on the file handle or descriptor to avoid a race if the file is moved, renamed or deleted.
This is similar to the situation with moving locks from shared read to write and back that can also be weak on Posix.

At present I think the only way to do this is to re-open the file using its path name, unless you are using freopen() with a null pathname.

The primary use of the freopen() function is to change the file associated with a standard text stream (stderr, stdin, or stdout).
In Posix it is implementation-defined whether freopen() will allow you to change the open mode:
  http://pubs.opengroup.org/onlinepubs/009695399/functions/freopen.html
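A minimal sketch of probing that implementation-defined behaviour, assuming a POSIX system with a writable /tmp. The function name and the scratch path are made up for illustration, and the result may legitimately differ between C libraries:

```cpp
#include <cassert>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Opens a scratch file with "r", then asks freopen() with a null pathname to
// switch the stream to "r+". Returns 1 if the switch succeeded and a write
// reached the file, 0 if the implementation refused or the write failed,
// and -1 if the scratch file could not be set up.
int try_mode_change_with_freopen() {
    char path[] = "/tmp/freopen_demo_XXXXXX";   // hypothetical scratch location
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    close(fd);

    FILE* f = fopen(path, "r");
    if (f == nullptr) { unlink(path); return -1; }

    // If this fails the stream is closed, so f must not be used afterwards.
    FILE* g = freopen(nullptr, "r+", f);
    int result = 0;
    if (g != nullptr) {
        assert(g == f);   // on success freopen returns the same stream
        if (fputs("x", g) != EOF && fflush(g) == 0) result = 1;
        fclose(g);
    }
    unlink(path);
    return result;
}
```

Either outcome is conforming, which is the point being made above: portable code cannot rely on the mode change.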

I'm not clear whether the C standard says anything more.

There doesn't seem to be an equivalent for file descriptors.
Apparently some Unix-like OSs support using fcntl() with F_SETFL for this, but Posix specifically says flags for the read or write mode shall be ignored.

Windows has a ReOpenFile() call - https://msdn.microsoft.com/en-us/library/windows/desktop/aa365497(v=vs.85).aspx

So my question is: should we try to make this possible?
For example we might:

* make the behaviour of freopen() fully defined for the case of converting a FILE* from read to write or write to read
(something for WG14 not directly in scope for C++)

* add a new function to Posix like:
  int reopen(int fd, int flags)
(something for the Austin group not directly in scope for C++)

* add a reopen(mode) to std::basic_filebuf & std::basic_fstream
(very much in scope for C++ and perhaps a way of driving the other two?)

I can't see this functionality in AFIO. Was it considered? Perhaps rejected because there is not currently a portable solution?

Thiago Macieira

Sep 18, 2017, 8:50:09 PM
to std-pr...@isocpp.org
On Monday, 18 September 2017 16:59:04 PDT torto...@gmail.com wrote:
> Possibly slightly off topic but bear with me.
> Do C, C++ and Posix require a way to re-open a file descriptor?
>
> The two common cases are:
> * I have a file and after reading it I may decide I want to write it after
> all.
> * I create a new file and having finished setting it up I want to make it
> read only.

There's no such API. The closest I can think of is fcntl(F_SETFL), which won't
work. http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
says for F_SETFL:
"Bits corresponding to the file access mode and the file creation flags, as
defined in <fcntl.h>, that are set in arg shall be ignored."
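A minimal sketch of that F_SETFL rule in action, assuming a POSIX system with a writable /tmp. The function name and scratch path are invented for the example:

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

// Opens a fresh temporary file read-only, asks fcntl(F_SETFL) for O_RDWR,
// and returns the access mode the descriptor reports afterwards
// (or -1 if the scratch file could not be set up).
int access_mode_after_setfl() {
    char path[] = "/tmp/setfl_demo_XXXXXX";   // hypothetical scratch location
    int tmp = mkstemp(path);
    if (tmp < 0) return -1;
    close(tmp);

    int fd = open(path, O_RDONLY);
    unlink(path);                 // the open descriptor keeps the file alive
    if (fd < 0) return -1;

    int flags = fcntl(fd, F_GETFL);
    assert(flags != -1);

    // Request read-write. The access-mode bits in the argument are ignored,
    // so this call does not upgrade the descriptor.
    fcntl(fd, F_SETFL, (flags & ~O_ACCMODE) | O_RDWR);

    int mode = fcntl(fd, F_GETFL) & O_ACCMODE;
    close(fd);
    return mode;
}
```

The descriptor should still report O_RDONLY afterwards, which is why F_SETFL cannot serve as a reopen.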

Other possibilities I investigated:
1) using openat(2) with an empty path and the open file descriptor as the dfd:
this results in ENOENT because of the empty path (openat does not accept
AT_EMPTY_PATH). If you pass any non-empty path, you get ENOTDIR.

2) using first open(2) with O_PATH (Linux-specific) and then trying to open
that. Also fails the same way.

However, on Linux, you can make it work by opening "/proc/self/fd/%d" with the
file descriptor number. This will work even if the file has been deleted or was
not even present in the file system somewhere you could find. It also works for
file descriptors of other processes.
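A minimal sketch of the /proc/self/fd approach, assuming Linux with /proc mounted and a writable /tmp. Both function names are hypothetical helpers, not an existing API:

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper: obtain a new, independent open of whatever `fd`
// refers to by going through the Linux-only /proc/self/fd/N link.
// Returns a new descriptor, or -1 with errno set. The usual permission
// checks on the underlying file still apply.
int reopen_via_proc(int fd, int flags) {
    assert(fd >= 0);
    char link[64];
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);
    return open(link, flags);
}

// Demonstration: a file opened read-only and then unlinked is reopened
// read-write. Returns the number of bytes written through the new
// descriptor, or -1 on failure.
long demo_reopen_deleted_file() {
    char path[] = "/tmp/reopen_demo_XXXXXX";   // hypothetical scratch location
    int tmp = mkstemp(path);                   // created with mode 0600
    if (tmp < 0) return -1;
    close(tmp);

    int ro = open(path, O_RDONLY);
    unlink(path);                              // no usable path name from here on
    if (ro < 0) return -1;

    int rw = reopen_via_proc(ro, O_RDWR);
    long written = (rw >= 0) ? write(rw, "hello", 5) : -1;

    if (rw >= 0) close(rw);
    close(ro);
    return written;
}
```

Because the new descriptor comes from a separate open(), it has its own file offset and its own access mode.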

> So my question is should we try make this possible?
> For example we might:
>
> * make the behaviour of freopen() fully defined for the case of converting
> a FILE* from read to write or write to read
> (something for WG14 not directly in scope for C++)

freopen doesn't help because it's a process-local thing. It's not an operating
system object. You can't pass FILE* over Unix sockets and child processes
can't inherit them.

> * add a new function to Posix like:
> int reopen(int fd, int flags)
> (something for the Austin group not directly in scope for C++)

First, you implement it in the major OSes to show it is possible. Then POSIX
may adopt it.

This was first done for O_CLOEXEC, which allows for thread-safe opening of file
descriptors. I also see O_NOSIGPIPE as a future standardisation possibility,
based on existing practice.

reopen() doesn't exist yet.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

torto...@gmail.com

Sep 19, 2017, 4:15:12 AM
to ISO C++ Standard - Future Proposals


On Tuesday, 19 September 2017 01:50:09 UTC+1, Thiago Macieira wrote:
> On Monday, 18 September 2017 16:59:04 PDT torto...@gmail.com wrote:
> > Possibly slightly off topic but bear with me.
> > Do C, C++ and Posix require a way to re-open a file descriptor?
> >
> > The two common cases are:
> > * I have a file and after reading it I may decide I want to write it after
> > all.
> > * I create a new file and having finished setting it up I want to make it
> > read only.
>
> There's no such API. The closest I can think of is fcntl(F_SETFL), which won't
> work. http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
> says for F_SETFL:
> "Bits corresponding to the file access mode and the file creation flags, as
> defined in <fcntl.h>, that are set in arg shall be ignored."

Yes, as I already alluded to.
The rationale is not stated, but I presume it's to make implementation easier.
 
> Other possibilities I investigated:
> 1) using openat(2) with an empty path and the open file descriptor as the dfd:
> this results in ENOENT because of the empty path (openat has no field to pass
> AT_EMPTY_PATH). If you pass any non-empty path, you get ENOTDIR.

openat() is not appropriate here: its file descriptor argument must refer to a directory, so ENOTDIR is exactly the expected behaviour.
 
> 2) using first open(2) with O_PATH (Linux-specific) and then trying to open
> that. Also fails the same way.
>
> However, on Linux, you can make it work by opening "/proc/self/fd/%d" with the
> file descriptor number. This will work even if the file has been deleted or was
> not even present in the file system somewhere you could find. It also works for
> file descriptors of other processes.

I did find that trick but it is still opening a path. When I first tried it I had accidentally
created the file with the wrong permissions and unsurprisingly got an EPERM.
If you were really re-opening using the file descriptor that would not apply.

> > So my question is should we try make this possible?
> > For example we might:
> >
> > * make the behaviour of freopen() fully defined for the case of converting
> > a FILE* from read to write or write to read
> > (something for WG14 not directly in scope for C++)
>
> freopen doesn't help because it's a process-local thing. It's not an operating
> system object. You can't pass FILE* over Unix sockets and child processes
> can't inherit them.

Yes, but it's one of three interfaces which I'm arguing should be consistent and complementary.

> > * add a new function to Posix like:
> >   int reopen(int fd, int flags)
> > (something for the Austin group not directly in scope for C++)
>
> First, you implement it in the major OSes to show it is possible. Then POSIX
> may adopt it.
>
> This was first done for O_CLOEXEC, which allows for thread-safe opening of file
> descriptors. I also see O_NOSIGPIPE as a future standardisation possibility,
> based on existing practice.
>
> reopen() doesn't exist yet.

My thought was to try adding it to C++ with semantics that are initially implementation-defined
with regard to races (i.e. they use the path under the hood for now) and use that to help drive the need for putting it into Posix
(as Niall was hinting at for some things in AFIO).
However, you are right: adding it to Linux and BSD would be a far better driver towards Posix in the first instance.

I suppose my question here is, if it's such an obvious and good idea, why hasn't it been done already?
I suspect, as I discovered with accessing low-level file descriptors, there are some interesting corner cases to think about,
especially as this is once again at the intersection of at least 3 standards.

Regards,

Bruce.

Thiago Macieira

Sep 19, 2017, 11:05:59 AM
to std-pr...@isocpp.org
On Tuesday, 19 September 2017 01:15:11 PDT torto...@gmail.com wrote:
> I did find that trick but it is still opening a path. When I first tried it
> I had accidentally created the file with the wrong permissions and
> unsurprisingly got an EPERM. If you were really re-opening using the file
> descriptor that would not apply.

If the file can't be opened, it shouldn't be reopenable either. Note that you
can do open(O_RDWR, 0400).
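A minimal sketch of that open(O_RDWR, 0400) point, assuming a POSIX system. The function name and the file name inside `dir` are invented for the example, and the caller is assumed to pass a directory it can write to:

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdlib.h>
#include <string>
#include <unistd.h>

// Creates a file whose permission bits are read-only (0400) while still
// obtaining a read-write descriptor: the mode argument governs later opens,
// not the access mode of the open() call that creates the file.
// Returns the number of bytes written, or -1 on failure.
long write_through_0400_file(const char* dir) {
    assert(dir != nullptr);
    std::string path = std::string(dir) + "/readonly_bits";   // hypothetical name

    int fd = open(path.c_str(), O_CREAT | O_EXCL | O_RDWR, 0400);
    if (fd < 0) return -1;

    long written = write(fd, "data", 4);

    // A second O_RDWR open by path would now be checked against the 0400
    // bits (not exercised here, since the outcome depends on privileges).
    close(fd);
    unlink(path.c_str());
    return written;
}
```

So a descriptor can hold more access than the file's current permission bits would grant to a new open, which is exactly the case a reopen has to decide how to treat.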

Now, the details of your issue here are relevant. You're brushing them over,
so if you want a discussion of the use-case, go back and look at what you did
wrong when reopening /proc/self/fd/%d.

> I suppose my question here is, if it's such an obvious and good idea, why
> hasn't it been done already?
> I suspect as I discovered with accessing low level file descriptors there
> are some interesting corner cases to think about,
> especially as this is once again at the intersection of at least 3
> standards.

Because I suppose there isn't such a big need. There aren't many uses of file
descriptor passing to require this feature often enough. That's where you need
OS-provided enforcement.

If you meant protection against accidental writes, then updating FILE* and
std::fstream is fine. That's enforcement by the API, but with no protection
against malicious code doing write(fileno(fptr), "h4x0rz", 6).

torto...@gmail.com

Sep 19, 2017, 5:04:16 PM
to ISO C++ Standard - Future Proposals


On Tuesday, 19 September 2017 16:05:59 UTC+1, Thiago Macieira wrote:
> On Tuesday, 19 September 2017 01:15:11 PDT torto...@gmail.com wrote:
> > I did find that trick but it is still opening a path. When I first tried it
> > I had accidentally created the file with the wrong permissions and
> > unsurprisingly got an EPERM. If you were really re-opening using the file
> > descriptor that would not apply.
>
> If the file can't be opened, it shouldn't be reopenable either. Note that you
> can do open(O_RDWR, 0400).

That is a fair point. I was assuming it should be reopenable because it was already open but then it wouldn't be respecting its own permissions.
 
> Now, the details of your issue here are relevant. You're brushing them over,
> so if you want a discussion of the use-case, go back and look at what you did
> wrong when reopening /proc/self/fd/%d.
>
> > I suppose my question here is, if it's such an obvious and good idea, why
> > hasn't it been done already?
> > I suspect as I discovered with accessing low level file descriptors there
> > are some interesting corner cases to think about,
> > especially as this is once again at the intersection of at least 3
> > standards.
>
> Because I suppose there isn't such a big need. There aren't many uses of file
> descriptor passing to require this feature often enough. That's where you need
> OS-provided enforcement.
>
> If you meant protection against accidental writes, then updating FILE* and
> std::fstream is fine. That's enforcement by the API, but with no protection
> against malicious code doing write(fileno(fptr), "h4x0rz", 6).

Correct. For this case I was thinking solely about a single process for the purpose of protecting
against bugs. For example, accidentally writing to a file after it has been 'completed' fails if you
make it read-only but not if you open it read-write to start with.

One way to enforce that is by adding reopen() functions to the APIs.
But isn't there a race if the file you are reopening is moved, renamed or deleted during the reopen operation?
Wouldn't that present a potential attack surface meaning you were better off just opening the file for both
reading and writing in the first place?

An alternative is constructing a stream with more restrictive permissions from the more general one.
Something which should already work is:

    fstream readWriteable("foobar");

    ostream writeOnly(readWriteable.rdbuf());

    istream readOnly(readWriteable.rdbuf());

but this can be subverted easily using:

   iostream readWriteableAgain(writeOnly.rdbuf());

Is it worth preventing that kind of madness? Perhaps by making the intent clearer?

I would never do this myself. Instead I would do:

    void writeOnlyStuff(ostream& writable) { /* do stuff write only */ }
    void readOnlyStuff(istream& readOnly) { /* do stuff read only */ }

    void doStuff(void)
    {
        fstream readWriteable("foobar");
        writeOnlyStuff(readWriteable);
        readOnlyStuff(readWriteable);
    }

I don't think there is a way to do the equivalent using existing cstdio or Posix APIs (without a racy open).
I suppose this has come up for me now as I am working with some legacy code written in C.
Using C is inherently more dangerous. Should we even try fixing it?

Rather than reopen() perhaps,
for Posix:

int readOnlyFd = dup4(readWriteFd, O_RDONLY); //reduce permissions - always works
int readWriteAgainFd = dup4(readOnlyFd, O_RDWR); //increase permissions - may fail with EPERM

for C maybe:

FILE* readWriteFile = fopen("foobar","w+");
FILE* readOnlyFile = freopen2(readWriteFile,"r");

dup4 and freopen2 are bad names (any name with a number on the end is probably a bad name).

But for FILE* in Posix we can presumably do:

FILE* readOnlyFile = fdopen(fileno(readWriteFile),"r");

Posix says the flags must respect those of the file descriptor
"The application shall ensure that the mode of the stream as expressed by the mode argument is allowed by the file access mode of the open file description to which fildes refers"

My Linux man page is a bit more ambiguous:

"The mode of the stream (one of the values "r", "r+", "w", "w+", "a", "a+") must be compatible with the mode of the file descriptor."

Compatible could be read as implying it should be the exact same mode.
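A minimal sketch of the fdopen() route, assuming a POSIX system. The helper name is hypothetical. dup() is used so that closing the restricted stream does not close the caller's descriptor, with the side effect that both share one file offset:

```cpp
#include <cassert>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper: wraps a duplicate of a read-write descriptor in a
// stream opened with mode "r". Returns the stream, or nullptr on failure.
// The restriction lives in the stdio layer only; the descriptor underneath
// keeps its original O_RDWR access mode.
FILE* read_only_stream_from(int rw_fd) {
    assert(rw_fd >= 0);
    int copy = dup(rw_fd);          // shares the open file description
    if (copy < 0) return nullptr;

    FILE* f = fdopen(copy, "r");    // "r" is allowed by an O_RDWR descriptor
    if (f == nullptr) close(copy);
    return f;
}
```

Reading through the returned stream works, but a raw write(fileno(stream), ...) is still expected to succeed, so this guards against accidents rather than against code that deliberately goes around the stream.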




 

Thiago Macieira

Sep 19, 2017, 5:52:35 PM
to std-pr...@isocpp.org
On Tuesday, 19 September 2017 14:04:16 PDT torto...@gmail.com wrote:
> Correct. For this case I was thinking solely about a single process for
> the purpose of protecting
> against bugs. For example, accidentally writing to a file after it has been
> 'completed' fails if you
> make it read-only but not if you open it read-write to start with.
>
> One way to enforce that is by adding reopen() functions to the APIs.

That can easily be done on the API that is under the purview of WG14 (stdio.h)
and WG21 (fstream).

Adding it to the POSIX API is a different matter completely.

> But isn't there a race if the file you are reopening is moved, renamed or
> deleted during the reopen operation?

Yes, if you open it by the original path, which you must have recorded. If you
just reopen /proc/self/fd/%d, there's no race (provided you don't have another
thread closing that fd at the same time).

> for Posix:
>
> int readOnlyFd = dup4(readWriteFd, O_RDONLY); //reduce permissions - always
> works

> dup4 and freopen2 are bad names (any name with a number on the end is
> probably a bad name).

According to the Linux naming, "dup4" should have 4 arguments but you only
have 2. The call you'd have wanted is actually dup3, since it already has a
flags argument that is currently used to pass O_CLOEXEC. Except
that no one passes O_RDONLY or O_WRONLY or O_RDWR flags today, so a flags
value of 0 (which is what O_RDONLY is on Linux) could be misinterpreted.

Anyway, I don't think "dup" should be used. That means duplicating the file
descriptor but still pointing to the same kernel structures, which include the
current read/write cursor position, in addition to the flags. You actually want
something with "open" in the name, indicating it's a different open.
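A minimal sketch of that difference between dup() and a separate open(), assuming a POSIX system. The function name is made up, and `path` is assumed to name an existing file the caller may open read-write:

```cpp
#include <cassert>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

// Returns true if a dup()ed descriptor sees the offset change caused by a
// write on the original, while a descriptor from a separate open() of the
// same path keeps its own offset.
bool dup_shares_offset(const char* path) {
    assert(path != nullptr);
    int fd = open(path, O_RDWR);
    if (fd < 0) return false;

    int copy = dup(fd);                 // same open file description
    int fresh = open(path, O_RDONLY);   // new open file description

    bool ok = false;
    if (copy >= 0 && fresh >= 0 && write(fd, "0123456789", 10) == 10) {
        off_t via_dup = lseek(copy, 0, SEEK_CUR);    // follows the original
        off_t via_open = lseek(fresh, 0, SEEK_CUR);  // still at the start
        ok = (via_dup == 10 && via_open == 0);
    }

    if (fresh >= 0) close(fresh);
    if (copy >= 0) close(copy);
    close(fd);
    return ok;
}
```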


Niall Douglas

Sep 19, 2017, 6:28:17 PM
to ISO C++ Standard - Future Proposals
On Tuesday, September 19, 2017 at 12:59:04 AM UTC+1, Bruce Adams wrote:
> Possibly slightly off topic but bear with me.
> Do C, C++ and Posix require a way to re-open a file descriptor?

POSIX does not.

Windows does via an undocumented set of parameters for NtOpenFile(). This undocumented use case proved so popular it was formalised into the Win32 ReOpenFile().
 

> The two common cases are:
> * I have a file and after reading it I may decide I want to write it after all.
> * I create a new file and having finished setting it up I want to make it read only.
>
> Recent discussions about race free file systems have made me more aware of the potential problems with this.
> When going from read to write you do need a permission check but the other way around you do not.
>
> Ideally you should be able to re-open based on the file handle or descriptor to avoid a race if the file is moved, renamed or deleted.
> This is similar to the situation with moving locks from shared read to write and back that can also be weak on Posix.

Right now, on POSIX, the only way to achieve this is to loop fetching the path, opening it, and comparing st_ino and st_dev for equivalence.

This ought to not be too bad. Linux opens a fd in single digit microseconds. Windows is two orders of magnitude slower, even with ReOpenFile().
 

> I can't see this functionality in AFIO. Was it considered? Perhaps rejected because there is not currently a portable solution?

Honestly it's because nobody made me think of it until now.

Expect to see it land by end of week at the latest :) It's trivially easy to implement. It's a good idea, and thanks for suggesting it.

BTW Thiago you may be interested to learn that I wrote a template adapter which lets you adapt any afio::handle to no longer rely on path retrieval working (correctly/at all). It's on the user to instantiate adapted handles however as they incur significant runtime overhead.

Niall 

Thiago Macieira

Sep 19, 2017, 7:17:05 PM
to std-pr...@isocpp.org
On Tuesday, 19 September 2017 15:28:17 PDT Niall Douglas wrote:
> BTW Thiago you may be interested to learn that I wrote a template adapter
> which lets you adapt any afio::handle to no longer rely on path retrieval
> working (correctly/at all). It's on the user to instantiate adapted handles
> however as they incur significant runtime overhead.

My kernel developer colleagues advised against relying on readlink(2) on
/proc/self/fd. It's not meant for the purpose you want to use it for. So my
recommendation is that you redesign so that retrieving the path name is an
optional and possibly-failing feature.

Niall Douglas

Sep 19, 2017, 9:11:47 PM
to ISO C++ Standard - Future Proposals

> > BTW Thiago you may be interested to learn that I wrote a template adapter
> > which lets you adapt any afio::handle to no longer rely on path retrieval
> > working (correctly/at all). It's on the user to instantiate adapted handles
> > however as they incur significant runtime overhead.
>
> My kernel developer colleagues advised against relying on readlink(2) on
> /proc/self/fd. It's not meant for the purpose you want to use it for. So my
> recommendation is that you redesign so that retrieving the path name is an
> optional and possibly-failing feature.

As I explained before, you cannot race free unlink or rename without it. And that's a data destroying kind of race, let alone the security implications.

As I've also explained before, I've soak tested /proc/self/fd under a very rapidly changing filesystem on Linux as far back as the 2.6 kernels and it works just fine. If end users don't like it, it's trivial to override current_path() in a subclass returning a stored path by the subclass, or now to use the provided template adaptor class, or just turn off it being used at all via passing in flags requesting it not be used at handle creation.

I think that's plenty of options for end users. Nobody has to use it or rely on it.

Regarding whether it should be the default or not, I think we should always default to safety. The implementation is carefully written to detect instability and fail. Your program may error out, but no data will be lost. That's a very sensible default in my book. The programmer should be the one to explicitly opt out of default safety.

Niall