[boost] [process] Formal Review

518 views

Skip to first unread message

Klaim - Joël Lamotte

unread,

Nov 3, 2016, 2:27:10 PM11/3/16

to Boost Developers List

Boost.Process Review
====================

...At a minimum, kindly state:
- Whether you believe the library should be accepted into Boost
* Conditions for acceptance

YES it should be accepted, though documentation need improvements.

- Your knowledge of the problem domain

I implemented naive and less naives ways to manage different processes in
the past.
I currently work on libraries which are all about multi-process
communication
and we have a poor-man's process management (launching, tracking, etc.) api
that I have to maintain
on Linux, Windows and MacOSX.
I also have home projects which are organized as client-server with
master-slave relationship,
clients spawning servers and pinging them to keep processes alive mutually
(using network communication).

So basically, I implemented basic process management libraries, but I'm
more a process
management library user, in all my projects, paid or not.
I have been exposed to a lot of problems related and solved a lot of them.

I also used the last 2 Boost.Process versions proposed to Boost (though I
voted no for these ones)
but in cases where I just needed a cross-platform way to launch an
arbitrary executable (no
tracking nor communication).

You are strongly encouraged to also provide additional information:
- What is your evaluation of the library's:
* Design

The weird choice of variadic arguments in launching functions have been
explained
so I'm more at ease with it, though maybe ways to simplify it could be
attemted in the future.

I like that the design of boost::process::child follow, where it make
sense, the design of std::thread.
Both could be considered as execution resource, so it's fine to me that
they share
similar interface phylosophies of design (like spawning on construction and
move-only).

I would love it if there was a way (even slow) to track processes lifetime
for processes that
I don't have code for. Though it's ok for me if it's a future extension.

I like the minimalism of the scope, though useful extensions could be added
later.

* Implementation

I briefly eyescanned the code, looked more closely at the termination
trigerriung code
because of the discussion we had on termination reqest vs kill.
I don't bother much about performance for this library as launching
processes is rarely
done in a tight high-performance loop.

* Documentation

I did a report of the documentation a few weeks or months ago, I do not
know if
changes have been made since.
Anyway I think the documentation should have a part focused on explaining
the concepts
because even if most programmers knows what a process is, all related
concepts (like what
is the "environnement") are so specific to processes that they are often
ignored as long as
naive usage works.
So as soon as the user need more specific usage, most, I believe, will not
know what word
to use to find the right concept.
I think this library documentation should follow something similar to
Boost.MSM which introduce
to the basic concepts, though without walls of text.

* Tests

I didn't have time to try it yet in real applications, but so far didn't
find
any problem in basic usage.

* Usefulness

It is very useful for almost all my projects.

- Did you attempt to use the library?
YES

If so:
* Which compiler(s)

Visual Studio 2015 for now.

* What was the experience? Any problems?

I only did minimal testing, nothing deep, no problems.

- How much effort did you put into your evaluation of the review?

Several hours diffused through several months.

Joël Lamotte

_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Raphaël Londeix

unread,

Nov 4, 2016, 12:22:02 PM11/4/16

to bo...@lists.boost.org

>
> We encourage your participation in this review. At a minimum, kindly state:

> - Whether you believe the library should be accepted into Boost
>

YES

> * Conditions for acceptance
>

The interface used to spawn processes should be reworked.

> - Your knowledge of the problem domain
>

I have implemented some process wrappers many times. I have used the last
release of Boost.Process (0.5 I think?).

> - What is your evaluation of the library's:
> * Design
>

I think the similarity with the boost::thread api is a great choice. The
lifetime of process
is handled by small functors that will be called at specified times. I
think is a very good
design as it allows one to write small "plugins" in order to affect the
setup or the tear down
of a process. The integration with Boost.Asio follows the previous choice
of using boost::thread
as a reference, i.e. the child process is not the resource you read
from/write to, you need
some separate stream objects whose lifetime are tied to the child process:

struct MyProcess
{
bp::async_pipe std_in;
bp::async_pipe std_out;
bp::async_pipe std_err;
bp::child child;
};

I haven't seen a way however to be asynchronously notified when a child
exited, so I'm not
sure how one would handle gracefully multiple async_read/write while the
process could exit
at any time.

The choice of having a nice interface to spawn processes have been
discussed already, but I
still think that the library should aim to have the simplest and lowest
level interface.
IMO, C++ users do not need to have a nice and expressive syntax to spawn
process, I
cannot imagine a code base where many processes are launched at many places
in the
code. In other words, I think that either you rarely need to spawn a
process (a hack for example),
or your job is to spawn processes (a shell, a make clone, ...). In both
cases, you will
spawn processes in very few places in the code.

Moreover, this interface style has an implication on build times, errors
seen at build time
and makes it hard to use in cases where your program needs to spawn
processes based on
runtime decisions. if one decided to implement a shell that supports
redirections, pipes,
environment modification and background jobs, they would certainly need to
make runtime
adjustments to the arguments given to child constructor. As of today, this
has to be
worked around by the user ; I think this should be the primary interface.
A suggestion has been made to have a more idiomatic C++ construction, I
would extend it
with a specific options type:

auto options = bp::options().args("gcc");
if (verbose) options.append_arg("-v");
if (ignore_errors) options.std_err.redirect(bp::null_device);
bp::child child(io_service, std::move(options));

This way you make it impossible to modify the options after the child
process creation.
But that's just an idea :)

* Implementation
>

The implementation is clear, but the support for the aforementioned
constructor interface
"leaks" into the implementation everywhere. As the previous version of
Boost.Process,
the different OS support is nicely separated and each stage of the process
creation is
clearly implemented in its own header.

> * Documentation
>

The asynchronous section feels a bit short given the many possibilities the
interface
gives us.

> * Tests
>

Didn't check

> * Usefulness
>

I think it would be very useful to have something we can rely on instead of
our own
half baked implementations.

> - Did you attempt to use the library? If so:
> * Which compiler(s)
>

g++4.8 using c++11

> * What was the experience? Any problems?
>

I had many problems and decided to fallback on popen().
The first kind of problems were some compile errors. But the author kindly
stated
that I needed the HEAD of boost (which is not an option for me) but I found
a
workaround by not using some niceties of the interface:

https://github.com/klemens-morgenstern/boost-process/issues/20
https://github.com/klemens-morgenstern/boost-process/issues/21

As said previously, for someone not used to boost.fusion, the errors are
hard to
read.

The other problem that made me give up (temporarily) on using Boost.Process
was the fact that asynchronous reads where unreliable when more than one
process
is spawned.

https://github.com/klemens-morgenstern/boost-process/issues/22

This issue has been closed despite the fact that it is not fixed.

- How much effort did you put into your evaluation of the review?
>

I'd say half a day counting this mail and the testing I made.

Cheers,

--
Raphaël Londeix
http://hotgloupi.fr

Klemens Morgenstern

unread,

Nov 4, 2016, 3:26:42 PM11/4/16

to bo...@lists.boost.org

You've got bp::on_exit for that if you use boost.asio, i.e. pass a
io_service.

> The choice of having a nice interface to spawn processes have been
> discussed already, but I
> still think that the library should aim to have the simplest and lowest
> level interface.
> IMO, C++ users do not need to have a nice and expressive syntax to spawn
> process, I
> cannot imagine a code base where many processes are launched at many places
> in the
> code. In other words, I think that either you rarely need to spawn a
> process (a hack for example),
> or your job is to spawn processes (a shell, a make clone, ...). In both
> cases, you will
> spawn processes in very few places in the code.
>
> Moreover, this interface style has an implication on build times, errors
> seen at build time
> and makes it hard to use in cases where your program needs to spawn
> processes based on
> runtime decisions. if one decided to implement a shell that supports
> redirections, pipes,
> environment modification and background jobs, they would certainly need to
> make runtime
> adjustments to the arguments given to child constructor. As of today, this
> has to be
> worked around by the user ; I think this should be the primary interface.

The only annoying thing there are the redirections, because there you'd
need a variant. I really don't want to add variant to the
std_in/std_err/std_out initializer, since would include them everytime
then. But: I could add it as an optional feature. Now that thing could
be used like this:

auto var = bp::make_variant(bp::std_out);
bp::pipe p;
var = p;

bp::child c("foo", var);

> A suggestion has been made to have a more idiomatic C++ construction, I
> would extend it
> with a specific options type:
>
> auto options = bp::options().args("gcc");
> if (verbose) options.append_arg("-v");
> if (ignore_errors) options.std_err.redirect(bp::null_device);
> bp::child child(io_service, std::move(options));
>
> This way you make it impossible to modify the options after the child
> process creation.
> But that's just an idea :)

Thanks for the idea: option objects might be the best solution for that,
and I think the best way would be to allow an easy implementation of
that. Maybe something like that:

auto options = bp::make_options(bp::exe, bp::args, bp::std_err);
options.set_exe("gcc");
if (verbose) options.add_arg("-v");
if (ignore_errors) options.std_err(bp::null);
bp::child c(options);

I'll need to check if I can come up with a syntax for that, but that
would be a way to easily build an option type, without changing anything
underneath. And that would also be extensible, which the other ideas
were not. And if that works, I might even be able to automatically add
the io_service if you do not pass it, hence reducing the pain caused by
the fusion-errors.

auto opt = bp::make_options(bp::std_err & bp::std_out); //requires an
io_service, since it might be async.
opt.get_io_service(); //and there it is.

Now if you combine these two you'd have no problem making your own
child-implementation.

struct my_child
{
asio::io_service ios;
decltype(bp::make_options(bp::exe, bp::args, bp::std_err) opt{ios};
//so it's only one
bp::async_pipe out(ios); //because you already know what that's
going to be.

bp::child c;
void operator()() {c = child(opt, std_out > out); };
};

This is also extensible. I would however not make this the default,
since it incurrs run- and compile-time overhead.

Yep, those are really ugly even if you know fusion. But I've got no idea
how to create a good error message there, if anybody has an idea, let me
know. SFINAE doesn't really help, because that just tells you, that no
matching function could be called. And deep nested static_asserts also
don't help anybody, especially since they get buried in the template
error messages.

>
> The other problem that made me give up (temporarily) on using Boost.Process
> was the fact that asynchronous reads where unreliable when more than one
> process
> is spawned.
>
> https://github.com/klemens-morgenstern/boost-process/issues/22
>
> This issue has been closed despite the fact that it is not fixed.

Well I've got several tests for that, and the particular example (with
echo) also can fail when only used with one - that's why I assume(d)
there's some other issue.
But since we found a design bug during the review, which is actually
caused by a faulty analysis on my part, the behaviour of pipes will
change. That means that the way the async_pipe handles the exit of the
child process will be different and we'll see if that fixes it.

> - How much effort did you put into your evaluation of the review?
> I'd say half a day counting this mail and the testing I made.
>
> Cheers,
>

Nat Goodspeed

unread,

Nov 5, 2016, 7:44:51 PM11/5/16

to bo...@lists.boost.org

I would like to surface a number of issues, large and small.

You are clearly proud to support a number of different syntactic ways
to express the same semantic operation. To me this is a minor
negative. While in theory it sounds nice to be able to write Process
consumer code any way I want, in a production environment I spend more
time reading and maintaining other people's code than I do writing
brand-new code. Since each person contributing code to our (large)
code base will select his/her own preferred Process style, I will be
repeatedly confused as I encounter each new usage.

I admire the support for synchronous pipe I/O, while remaining
skeptical of its practical utility. Synchronous I/O with a child
process presents many legitimate opportunities for deadlock: traps for
the unwary. I would be content with a combination of:

* async I/O on pipes (yes, using Asio)
* system() for truly simple calls
* something analogous to Python's subprocess.Popen.communicate(): pass
an optional string for the child's stdin, retrieve a string (or, say,
a stringstream) for each of its stdout and stderr.

The example under I/O pipes the stdout from 'nm' to the stdin of
'c++filt'. But the example code seems completely unaware that c++filt
could be delayed for any of a number of reasons. It assumes that as
soon as nm terminates, c++filt must have produced all the output it
ever will. I worry about the Process implementation being confused
about such issues.

I'm dubious about the native_environment / environment dichotomy. As
others have questioned, why isn't 'environment' a typedef for a
map<string, string> (or unordered_map)?

I understand that you desire to avoid copying the native process
environment into such a map until necessary, but to me that suggests
something like an environment_view (analogous to string_view) that can
perform read-only operations on either back-end implementation.

Operations involving splitting and joining on ':' or ';' should be
defined purely in terms of strings and ranges of strings. They should
not be conflated with environment-map support.

The documentation so consistently uses literal ';' as the PATH
separator that I worry the code won't correctly process standard Posix
PATH strings.

At this moment in history, an example showing integration of Process
with Boost.Fiber seems as important as examples involving coroutines.

Why is the native_handle_t typedef in the boost::this_process namespace?

While in full context it makes sense to speak of "assigning" an
individual process to a process group, the method name assign() has
conventional connotations. Use add() instead.

There's a Note that says: "If a default-constructed group is used
before being used in a process launch, the behaviour is undefined." I
assume you mean "destroyed before being used," but this is a concern.
If a program has already instantiated a process group, but for any
reason decides not to (or fails to) launch any more child processes,
does that put the entire parent process at risk? What remedial action
can it take? Move the group object to the heap somewhere and let it
leak? Spawn a bogus child process for the purpose of defusing the
ticking process group instance?

If you're going to reify process group at all, you should wrap more
logic around it to give it well-defined cross-platform behavior. And
it should definitely be legal to create and destroy one without
associating any child process with it.

Given support elsewhere for splitting/joining PATH strings, the
string_type path parameter to search_path() feels oddly low-level.
Maybe accept a range of strings?

> At a minimum, kindly state:
> - Whether you believe the library should be accepted into Boost

> * Conditions for acceptance

YES, IF the present Boost.Process preserves the ability of its
predecessor to extend it without modifying it. I'm sorry, thorough
reading of the documentation plus some browsing through the code
leaves me doubtful.

Examples of such extensions:

* With Boost.Process 0.5 I can use Boost.TypeErasure to pass an
"initializer" which is a runtime collection of other initializers.
Such a feature in Process 0.6 would support people's requests to be
able to assemble an arbitrary configuration with conditional elements
and use it repeatedly.
* I can create an initializer (actually one for each of Posix and
Windows, presenting the same API, so portable code can use either
transparently) to implicitly forcibly terminate the child process
being launched when the parent process eventually terminates in
whatever way.

Please understand that I am not asking for the above features to be
absorbed into the Process library: I am asking for a Process library
extensible enough that I can implement such things with consumer code,
without having to modify the library. Perhaps Process 0.6 already is!
It's just hard for me tell.

Another feature should, in my opinion, be incorporated into the library:

* On Windows, if you desire to pass any file handles at all to the new
child process, it is completely whimsical what *other* open file
handles you may inadvertently pass -- unless you play games with
STARTUPINFOEXA and PROC_THREAD_ATTRIBUTE_LIST to enumerate exactly the
set of handles you intend to pass.
If the library doesn't natively support that, I *must* be able to
pass a custom initializer with deep access to the relevant control
blocks to set it up. This is a showstopper, as in "you can't remove
that file because, unbeknownst to you, some other process has it
open."

> - Your knowledge of the problem domain

I have hand-written process-management code in C++ several times
before -- each with multiple implementations to support multiple
platforms. I have tested the previous candidate Boost.Process 0.5 with
a number of programs exercising a number of features.

> You are strongly encouraged to also provide additional information:

> - What is your evaluation of the library's:
> * Design

Design notes are at the top of this mail.

> * Implementation

I have only glanced over parts of the implementation. It seems
somewhat more obscure than the corresponding pieces of Process 0.5,
which is why I couldn't quickly satisfy myself as to the library's
extensibility.

> * Documentation

Others have noted that the documentation is very sketchy in places. I
too wish for more explanation.

Much of the time I spent on this review was reading through the
documentation. Apologies if I have misunderstood the library's
capabilities.

> * Usefulness

I have felt for years now that it is essential to get a child-process
management library into Boost. It grieves me to have to write
platform-specific API calls into different applications.

> - Did you attempt to use the library?

I did not, sorry, ran out of time -- as you can infer from this review
arriving at the end of the final day!

> - How much effort did you put into your evaluation of the review?

Most of my time was spent reading the documentation. I looked through
a couple of the implementation files for further information.

Klemens Morgenstern

unread,

Nov 5, 2016, 9:10:52 PM11/5/16

to bo...@lists.boost.org

Am 06.11.2016 um 00:44 schrieb Nat Goodspeed:
> I would like to surface a number of issues, large and small.
>
> You are clearly proud to support a number of different syntactic ways
> to express the same semantic operation. To me this is a minor
> negative. While in theory it sounds nice to be able to write Process
> consumer code any way I want, in a production environment I spend more
> time reading and maintaining other people's code than I do writing
> brand-new code. Since each person contributing code to our (large)
> code base will select his/her own preferred Process style, I will be
> repeatedly confused as I encounter each new usage.

First of all: thanks for taking the time to write the review.
I just saw the critizism of "std_out > p" coming, while others really
liked that. In practice you'd have "std_out > p", "std_out = p" and
"std_out(p)". I think that is manageable.

> I admire the support for synchronous pipe I/O, while remaining
> skeptical of its practical utility. Synchronous I/O with a child
> process presents many legitimate opportunities for deadlock: traps for
> the unwary. I would be content with a combination of:

I think that is an overkill for many problems, I'd rather provide all
the possibilities. Btw.: the main deadlock issue I had will be fixed,
I'm working on that right now.
But I'm generally not very fond of forcing people to use one solution
when there are several.

> * async I/O on pipes (yes, using Asio)
> * system() for truly simple calls

Not sure why this would need to be for simple calls, I really like my
coroutine solution (that's what I'm proud of here ;)).

> * something analogous to Python's subprocess.Popen.communicate(): pass
> an optional string for the child's stdin, retrieve a string (or, say,
> a stringstream) for each of its stdout and stderr.

That'd be limiting the interface, we discussed that in length. There
cannot be any solution using stringstream, but boost::asio::streambuf is
available. To make that even easier the library does provide a syntax
for that:

boost::asio::streambuf in, out, err;
child c("foo", std_out > out, std_err > err, std_in < in);

I might be adding a new feature to make it easier to implement your own
process object.

Then again: it's not meant as a high-level implementation, it's meant to
make it easy for you implement something like the python module. I.e.
you can do that without worrying about the OS-specific stuff.

> The example under I/O pipes the stdout from 'nm' to the stdin of
> 'c++filt'. But the example code seems completely unaware that c++filt
> could be delayed for any of a number of reasons. It assumes that as
> soon as nm terminates, c++filt must have produced all the output it
> ever will. I worry about the Process implementation being confused
> about such issues.

I fear you are right, I'd need to be checking for an empty line at the end.

>
> I'm dubious about the native_environment / environment dichotomy. As
> others have questioned, why isn't 'environment' a typedef for a
> map<string, string> (or unordered_map)?

For native_environment:

auto nat_env = this_process::environment();
nat_env["FOO"] = "BAR";
assert(getenv("FOO") == "BAR");

For environment: it stores the data already in the needed format for the
OS. Thereby I don't need to transform anything when launching a process.
That just seemed like the most logical approach to me, and once I have
the native_environment it just makes sense. Also it is easily convertible:

environment env = this_process::environment();

> I understand that you desire to avoid copying the native process
> environment into such a map until necessary, but to me that suggests
> something like an environment_view (analogous to string_view) that can
> perform read-only operations on either back-end implementation.

Well native_environnent modifies the environment of the curent process,
while environment allows you to build

> Operations involving splitting and joining on ':' or ';' should be
> defined purely in terms of strings and ranges of strings. They should
> not be conflated with environment-map support.

Seems convenient to me - what's wrong with that?

auto p = nat_env["PATH"].to_vector();
assert(p == {"bar", "foo"});

and that also goes the other way around:

nat_env["PATH"] = {"bar/foo", "foo/bar"};

> The documentation so consistently uses literal ';' as the PATH
> separator that I worry the code won't correctly process standard Posix
> PATH strings.

I'll look into that.

> At this moment in history, an example showing integration of Process
> with Boost.Fiber seems as important as examples involving coroutines.

To be honest, I haven't looked that deep into boost.fiber, not sure how
that would look. I'm open for any suggestions; async_pipe implements an
asio-I/O object, so you probably can already use that with the interface
boost.fiber provides to boost.asio.

> Why is the native_handle_t typedef in the boost::this_process namespace?

Because it's used as a return-type there.

> While in full context it makes sense to speak of "assigning" an
> individual process to a process group, the method name assign() has
> conventional connotations. Use add() instead.

Good point, assign's the term used in the OS documentations, but that of
course has a different connotation in C++.

>
> There's a Note that says: "If a default-constructed group is used
> before being used in a process launch, the behaviour is undefined." I
> assume you mean "destroyed before being used," but this is a concern.
> If a program has already instantiated a process group, but for any
> reason decides not to (or fails to) launch any more child processes,
> does that put the entire parent process at risk? What remedial action
> can it take? Move the group object to the heap somewhere and let it
> leak? Spawn a bogus child process for the purpose of defusing the
> ticking process group instance?

Nope, it just may return garbage; yeah undefined behaviour is a term
that smells like it would blow up you program, that's a misnomer, sry.
On posix it will just yield nonsense (has(pid)) or do nothing and give
an error (terminate()), while that works on windows. I think
platform-dependent would be the correct word and I'll describe the details.

> If you're going to reify process group at all, you should wrap more
> logic around it to give it well-defined cross-platform behavior. And
> it should definitely be legal to create and destroy one without
> associating any child process with it.

Destroying is no affected by that.

> Given support elsewhere for splitting/joining PATH strings, the
> string_type path parameter to search_path() feels oddly low-level.
> Maybe accept a range of strings?

Would make sense, though a range of fs::path would be even better.

>> At a minimum, kindly state:
>> - Whether you believe the library should be accepted into Boost
>> * Conditions for acceptance
> YES, IF the present Boost.Process preserves the ability of its
> predecessor to extend it without modifying it. I'm sorry, thorough
> reading of the documentation plus some browsing through the code
> leaves me doubtful.

It is possible, that's one of the main reasons I insist on that interface.

>
> Examples of such extensions:
>
> * With Boost.Process 0.5 I can use Boost.TypeErasure to pass an
> "initializer" which is a runtime collection of other initializers.
> Such a feature in Process 0.6 would support people's requests to be
> able to assemble an arbitrary configuration with conditional elements
> and use it repeatedly.

That is not possible, because the functions in the handlers need to be
templated - which is a major change in that regard from 0.5. This one is
necessary to allow the initializer to access the sequence; this again is
required for the simplified asio-stuff.
Regarding the construction at runtime: I've got an idea for that (though
also not runtime) which might help out there and be extensible. Or there
could be an initializer_variant, though that would also not be
type-erasure. I'll let you know if I got a prototype of either one up.

> * I can create an initializer (actually one for each of Posix and
> Windows, presenting the same API, so portable code can use either
> transparently) to implicitly forcibly terminate the child process
> being launched when the parent process eventually terminates in
> whatever way.

You can do that, by just inherting the handler_base which has on_error,
on_success and on_setup or the os-specific extension (handler_base_ext)
which on posix adds on_exec_setup, on_exec_error, on_fork_error.

>
> Please understand that I am not asking for the above features to be
> absorbed into the Process library: I am asking for a Process library
> extensible enough that I can implement such things with consumer code,
> without having to modify the library. Perhaps Process 0.6 already is!
> It's just hard for me tell.

Sure, that's important to me, especially since that would be a good way
to eventually add more features if they are required and to know that
they're already used. I fear that calls for another section in the
documentation, though that would be more implemenation notes than a
tutorial.

>
> Another feature should, in my opinion, be incorporated into the library:
>
> * On Windows, if you desire to pass any file handles at all to the new
> child process, it is completely whimsical what *other* open file
> handles you may inadvertently pass -- unless you play games with
> STARTUPINFOEXA and PROC_THREAD_ATTRIBUTE_LIST to enumerate exactly the
> set of handles you intend to pass.
> If the library doesn't natively support that, I *must* be able to
> pass a custom initializer with deep access to the relevant control
> blocks to set it up. This is a showstopper, as in "you can't remove
> that file because, unbeknownst to you, some other process has it
> open."

You can do that through a custom initializer.

>> - Your knowledge of the problem domain
> I have hand-written process-management code in C++ several times
> before -- each with multiple implementations to support multiple
> platforms. I have tested the previous candidate Boost.Process 0.5 with
> a number of programs exercising a number of features.
>
>> You are strongly encouraged to also provide additional information:
>> - What is your evaluation of the library's:
>> * Design
> Design notes are at the top of this mail.
>
>> * Implementation
> I have only glanced over parts of the implementation. It seems
> somewhat more obscure than the corresponding pieces of Process 0.5,
> which is why I couldn't quickly satisfy myself as to the library's
> extensibility.

That's mainly because of the dual-syntax, to allow `child c("foo",
"/bar")`, which means that there are two ways to obtain the actual
initializers. That makes is not as straight-forward as 0.5.

Bjorn Reese

unread,

Nov 6, 2016, 6:26:10 AM11/6/16

to bo...@lists.boost.org

On 11/06/2016 02:10 AM, Klemens Morgenstern wrote:

> To be honest, I haven't looked that deep into boost.fiber, not sure how
> that would look. I'm open for any suggestions; async_pipe implements an
> asio-I/O object, so you probably can already use that with the interface
> boost.fiber provides to boost.asio.

The pattern used in Asio is to have both synchronous and asynchronous
functions:

return_value foo(args);
return_value foo(args, error_code);
void async_foo(args, void (*)(return_value, error_code));

If async_foo supports boost::asio::async_result, then the coroutine and
fiber support can be implemented seperately from your library.

Klemens Morgenstern

unread,

Nov 6, 2016, 6:41:06 AM11/6/16

to bo...@lists.boost.org

Am 06.11.2016 um 12:26 schrieb Bjorn Reese:
> On 11/06/2016 02:10 AM, Klemens Morgenstern wrote:
>
>> To be honest, I haven't looked that deep into boost.fiber, not sure how
>> that would look. I'm open for any suggestions; async_pipe implements an
>> asio-I/O object, so you probably can already use that with the interface
>> boost.fiber provides to boost.asio.
>
> The pattern used in Asio is to have both synchronous and asynchronous
> functions:
>
> return_value foo(args);
> return_value foo(args, error_code);
> void async_foo(args, void (*)(return_value, error_code));
>
> If async_foo supports boost::asio::async_result, then the coroutine and
> fiber support can be implemented seperately from your library.

Well there are two things. First you can use async_pipe as any asio
I/O-Object, thus this should work with fiber. Secondly there's a special
case for process::system, where you can suspend the coroutine for the
time the process runs. This requires a custom solution, since the event
handler is different - and must be. The reason is, that I need to start
the async-wait on posix before launching the process, which would be
sort of difficult, if the corutine get's suspendend when I do that.
Therefore this application needs an extra implementation, where I
imagined something like that:

void build(boost::asio::yield_context yield)
{
bp::system("g++", "-c", "foo.cpp", "-o", "foo.o", yield);
bp::system("g++", "-o", "foo.exe", "foo.o", yield);
};

I don't know really if anyone would like to use boost.fiber here; this
however would require an implementation inside process/system.hpp. So I
wouldn't add this now.

Gavin Lambert

unread,

Nov 6, 2016, 9:40:03 PM11/6/16

to bo...@lists.boost.org

On 6/11/2016 14:10, Klemens Morgenstern wrote:
>> The example under I/O pipes the stdout from 'nm' to the stdin of
>> 'c++filt'. But the example code seems completely unaware that c++filt
>> could be delayed for any of a number of reasons. It assumes that as
>> soon as nm terminates, c++filt must have produced all the output it
>> ever will. I worry about the Process implementation being confused
>> about such issues.
>
> I fear you are right, I'd need to be checking for an empty line at the end.

Once the pipe-closing issues are resolved, you should simply read the
output stream until EOF. This will only occur once c++filt has
terminated (and thus completed and closed its output stream).

(This is the right thing to do regardless of whether it's sync or async.)

>> The documentation so consistently uses literal ';' as the PATH
>> separator that I worry the code won't correctly process standard Posix
>> PATH strings.
>
> I'll look into that.

A common tactic to resolve this sort of issue is to provide a
platform-dependent property that contains the joining character and/or a
join() method. Of course, encouraging developers to actually *use* them
instead of hard-coding it is another matter.

>> * On Windows, if you desire to pass any file handles at all to the new
>> child process, it is completely whimsical what *other* open file
>> handles you may inadvertently pass -- unless you play games with
>> STARTUPINFOEXA and PROC_THREAD_ATTRIBUTE_LIST to enumerate exactly the
>> set of handles you intend to pass.

FWIW, a slight saving grace with that is that it's most common to open
files using NULL security attributes, which makes the file non-inherited
by default. You can also flip any existing handle to non-inherited
whenever you want.

Gavin Lambert

unread,

Nov 6, 2016, 9:57:20 PM11/6/16

to bo...@lists.boost.org

Mere moments ago, quoth I:

> Once the pipe-closing issues are resolved, you should simply read the
> output stream until EOF. This will only occur once c++filt has
> terminated (and thus completed and closed its output stream).

[...]

>>> * On Windows, if you desire to pass any file handles at all to the new
>>> child process, it is completely whimsical what *other* open file
>>> handles you may inadvertently pass -- unless you play games with
>>> STARTUPINFOEXA and PROC_THREAD_ATTRIBUTE_LIST to enumerate exactly the
>>> set of handles you intend to pass.
>
> FWIW, a slight saving grace with that is that it's most common to open
> files using NULL security attributes, which makes the file non-inherited
> by default. You can also flip any existing handle to non-inherited
> whenever you want.

Incidentally, this raises another good reason why it's important to
close the child-end of pipes as soon as possible after child processes
are launched -- as this would be the only child-inheritable handle of
the pipe, then as long as processes are launched only on a single thread
(and thus a second process can't be launched until after the inheritable
handle is closed) this will avoid unintended inheritance.

If you're going to be launching processes concurrently from multiple
threads then unless Boost.Process has internal locking to prevent the
race it would need to use STARTUPINFOEX to ensure only the "correct"
handles get inherited. That API is not available on XP, though; not
sure if you still care about that.

Nat Goodspeed

unread,

Nov 7, 2016, 1:42:00 PM11/7/16

to bo...@lists.boost.org

On Sun, Nov 6, 2016 at 9:38 PM, Gavin Lambert <gav...@compacsort.com> wrote:

>>> * On Windows, if you desire to pass any file handles at all to the new
>>> child process, it is completely whimsical what *other* open file
>>> handles you may inadvertently pass -- unless you play games with
>>> STARTUPINFOEXA and PROC_THREAD_ATTRIBUTE_LIST to enumerate exactly the
>>> set of handles you intend to pass.

> FWIW, a slight saving grace with that is that it's most common to open files
> using NULL security attributes, which makes the file non-inherited by
> default.

I believe the std::ifstream implementation from Visual Studio 2013 --
the API you use in portable C++ code -- opens files as inheritable by
default. Moreover, this API gives you no opportunity to change their
inheritability.

> You can also flip any existing handle to non-inherited whenever you want.

If your application uses third-party libraries, you have no way to
enumerate the complete set of handles held open by your process.

Nat Goodspeed

unread,

Nov 7, 2016, 1:50:37 PM11/7/16

to bo...@lists.boost.org

On Sun, Nov 6, 2016 at 9:56 PM, Gavin Lambert <gav...@compacsort.com> wrote:

> If you're going to be launching processes concurrently from multiple threads

or for other reasons mentioned earlier in this thread

> then Boost.Process would

> need to use STARTUPINFOEX to ensure only the "correct" handles get
> inherited. That API is not available on XP, though; not sure if you still
> care about that.

I think this falls under the heading of new Boost libraries not
needing to support obsolete platforms.

I have written code to make use of STARTUPINFOEX and the essential
InitializeProcThreadAttributeList(), etc., API functions conditional.
I don't consider it necessary for Boost.Process, but would be happy to
share if others deem it essential.

Bjorn Reese

unread,

Nov 13, 2016, 7:34:38 AM11/13/16

to bo...@lists.boost.org

On 11/06/2016 12:40 PM, Klemens Morgenstern wrote:
> Am 06.11.2016 um 12:26 schrieb Bjorn Reese:

>> The pattern used in Asio is to have both synchronous and asynchronous
>> functions:
>>
>> return_value foo(args);
>> return_value foo(args, error_code);
>> void async_foo(args, void (*)(return_value, error_code));
>>
>> If async_foo supports boost::asio::async_result, then the coroutine and
>> fiber support can be implemented seperately from your library.

> Well there are two things. First you can use async_pipe as any asio
> I/O-Object, thus this should work with fiber. Secondly there's a special
> case for process::system, where you can suspend the coroutine for the
> time the process runs. This requires a custom solution, since the event
> handler is different - and must be. The reason is, that I need to start
> the async-wait on posix before launching the process, which would be
> sort of difficult, if the corutine get's suspendend when I do that.

Here is how you could implement an async_system() with async_result
on Posix platforms.

Replace waitpid() in bp::child with a self-pipe [1], set up a SIGCHLD
handler which writes to the pipe (using the syscall write(2) to be
async-signal-safe), and use asio to read from the pipe with the yield
context. The biggest task here is that the signal handler must serve
all child processes, so you would need one self-pipe per process, and
the management of that map of self-pipes and process identifiers have
to be async-signal-safe (which means you have to go lock-free.)

Next, add async_result to async_system() and forward the handler to the
asio read operation. This will immediately give you support for
coroutines, futures, and (using the example from fiber.)

With the above, you can completely remove the yield context meta
programming from the process library, and thereby avoid having to mess
around with its internals.

[1] http://cr.yp.to/docs/selfpipe.html

> I don't know really if anyone would like to use boost.fiber here; this
> however would require an implementation inside process/system.hpp. So I
> wouldn't add this now.

The point with async_result is that this can be done outside the
process library.

Reply all

Reply to author

Forward

0 new messages