Hi,

We all know about the stackless coroutines proposal Gor Nishanov is working on right now -- there is a good chance it is going to land in C++20.

It is essentially a composition of various tools that causes the compiler to generate a C++ object (a state machine) that optionally allows the calling coroutine to "subscribe" to a "ready" event (via co_await). All of this is hidden behind a plain function declaration. And to make it efficient, a few compiler optimizations are available (if the coroutine body is visible) -- heap elision, etc.

More info can be found in discussion here:

I find that the approach taken isn't ideal:
- if we need the body to be visible for optimizations -- what is the benefit of hiding the coroutine behind a "plain function" facade?
- all these transformations lead to the creation of a plain C++ object -- instead of inventing new semantics for interacting with a coroutine, why don't we simply expose that object's interface to the end user?
- having to propagate 'co_await's down the call tree doesn't seem ideal
- for coroutines that can be located on the stack you have to rely on the compiler to notice that
- etc.

So instead of hiding the coroutine behind a 'plain function' interface and teaching the compiler how to work around its limitations -- why don't we use an approach similar to the one used by templates? They do a similar thing -- generate C++ entities (functions and objects) from template code... Like this:

coroutine int mycoro(int a, char* b) { ... } // has to be defined in declaration

int main()
{
    for(auto x: mc(1,"abc")) ... ; // we explicitly allocate coroutine on stack
}
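To make the "coroutine is just a C++ object" point concrete, here is a minimal hand-written sketch of the kind of state machine a compiler might generate for a simple counting coroutine. Everything in it (`counter_coro`, `resume()`, `value()`, `sum_range`) is a hypothetical name chosen for illustration; it is not taken from the Coroutines TS or from any actual compiler output.

```cpp
#include <cassert>

// Hypothetical, hand-written stand-in for a compiler-generated coroutine
// object. It yields first, first + 1, ..., last - 1.
class counter_coro {
public:
    counter_coro(int first, int last) : current_(first), last_(last) {}

    // Runs to the next suspension point; returns false once finished.
    bool resume() {
        switch (state_) {
        case state::start:
            state_ = state::running;   // first call: yield `first` as is
            break;
        case state::running:
            ++current_;                // later calls: advance to next value
            break;
        case state::done:
            return false;
        }
        if (current_ >= last_) {
            state_ = state::done;
            return false;
        }
        return true;
    }

    int value() const { return current_; }

private:
    enum class state { start, running, done };
    state state_ = state::start;
    int current_;
    int last_;
};

// The caller sees the complete type, so the "frame" can simply be a local
// variable; no heap allocation and no elision optimization is involved.
inline int sum_range(int first, int last) {
    assert(first <= last);
    counter_coro coro(first, last);
    int sum = 0;
    while (coro.resume()) sum += coro.value();
    return sum;
}
```

Because `sizeof(counter_coro)` is known at the call site, the object could equally be a member of another class or an element of an array, which is the kind of explicit placement the post argues for.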
On Tuesday, March 6, 2018 at 5:02:02 PM UTC-5, Michael Kilburn wrote:

But you're not going to stop the Coroutines TS from becoming part of the standard by presenting a solution that only solves a minor part of the problem the TS is intended to solve.
On Tue, Mar 6, 2018 at 11:33 PM, Nicol Bolas <jmck...@gmail.com> wrote:
> On Tuesday, March 6, 2018 at 5:02:02 PM UTC-5, Michael Kilburn wrote:
> But you're not going to stop the Coroutines TS from becoming part of the standard by presenting a solution that only solves a minor part of the problem the TS is intended to solve.

Damn it! My evil plans got foiled again!

Well, all I wanted to hear is opinions on the given approach -- because fundamentally it changes very little in the current proposal. Only instead of hiding behind an opaque "just a function" facade, this coroutine hides behind a transparent "template" facade. The way I see it -- it makes implementation (on the compiler side) easier. Was this approach ever considered? If yes -- why was it rejected?

> Which optimizations are we talking about?

There is a talk given by Gor about "disappearing coroutines" where (assuming the compiler has the required optimizations) the coroutine gets completely "inlined". He mentioned a "heap allocation elision" optimization there -- which to me looks like a big mistake (just like copy elision was).

> They look great for generators, but they require a lot more effort to write genuine asynchronous function calls.

No, not really... At least I don't see it.
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/33409afd-5bf6-40e0-a214-720535f9b54d%40isocpp.org.
> Gor iterated heavily on the proposal, while the others never really improved or changed.

Oliver Kowalke has updated his series of fibers papers: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0876r0.pdf so there is continued work on the stackful coroutines approach, at least in the sense of the required primitives.
On Wednesday, March 7, 2018 at 1:29:30 AM UTC-5, Michael Kilburn wrote:
> On Tue, Mar 6, 2018 at 11:33 PM, Nicol Bolas <jmck...@gmail.com> wrote:
> > On Tuesday, March 6, 2018 at 5:02:02 PM UTC-5, Michael Kilburn wrote:
> > But you're not going to stop the Coroutines TS from becoming part of the standard by presenting a solution that only solves a minor part of the problem the TS is intended to solve.
>
> Damn it! My evil plans got foiled again!
>
> Well, all I wanted to hear is opinions on given approach -- because fundamentally it changes very little in current proposal. Only instead of hiding behind opaque "just a function" facade this coroutine hides behind transparent "template" facade. The way I see it -- it makes implementation (on compiler side) easier. Was this approach ever considered? If yes -- why it was rejected?
Resumable function-style coroutines weren't "rejected", as I understand it. They just stopped running the race. The people proposing it never implemented it, while Gor took the time and effort to get a decent implementation into a shipping compiler for people to use. Gor iterated heavily on the proposal, while the others never really improved or changed.
Coroutines TS has won essentially by default; the people behind it put in the work to prove that the idea functions, and its competition didn't.
> > Which optimizations are we talking about?
>
> There is a talk given by Gor about "disappearing coroutines" where (assuming compiler has required optimizations) coroutine gets completely "inlined". He mentioned "heap allocation elision" optimization there -- which to me looks like a big mistake (just like copy elision was).
Sure, but not every case of coroutines relies on that. Coroutines being completely inlined is primarily for generator coroutines, which again aren't exactly the primary use case for the feature.
> > They look great for generators, but they require a lot more effort to write genuine asynchronous function calls.
>
> No, not really... At least I don't see it.
OK, take this synchronous code:
int sync_call(...)
{
    auto val = some_function(...);
    return val + 1;
}

This is the async version of that, using `co_await`:

std::future<int> async_call(...)
{
    auto val = co_await std::async(some_function, ...);
    co_return val + 1;
}
Write the equivalent of `async_call` using your coroutines system. Here are the rules. You must invoke `std::async`. You must use `future::then` to resume the rest of `async_call`. Your function must return a `future<int>` (which the caller themselves can use `::then` on). Oh, and don't forget: the future returned from the `async` call may not be a `future<int>`; merely some type that can have one added to it, the result of which is convertible to `int`.
On Wed, Mar 7, 2018 at 8:51 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> On Wednesday, March 7, 2018 at 1:29:30 AM UTC-5, Michael Kilburn wrote:
> > On Tue, Mar 6, 2018 at 11:33 PM, Nicol Bolas <jmck...@gmail.com> wrote:
> > > On Tuesday, March 6, 2018 at 5:02:02 PM UTC-5, Michael Kilburn wrote:
> > > But you're not going to stop the Coroutines TS from becoming part of the standard by presenting a solution that only solves a minor part of the problem the TS is intended to solve.
> >
> > Damn it! My evil plans got foiled again!
> >
> > Well, all I wanted to hear is opinions on given approach -- because fundamentally it changes very little in current proposal. Only instead of hiding behind opaque "just a function" facade this coroutine hides behind transparent "template" facade. The way I see it -- it makes implementation (on compiler side) easier. Was this approach ever considered? If yes -- why it was rejected?
>
> Resumable function-style coroutines weren't "rejected", as I understand it. They just stopped running the race. The people proposing it never implemented it, while Gor took the time and effort to get a decent implementation into a shipping compiler for people to use. Gor iterated heavily on the proposal, while the others never really improved or changed.
>
> Coroutines TS has won essentially by default; the people behind it put in the work to prove that the idea functions, and its competition didn't.

I think you (and others) misunderstood my idea -- I do not advocate against the current proposal, I am aiming only at one aspect of it -- namely hiding the coroutine behind a "plain function" declaration.

In my idea the compiler can generate precisely the same language constructs during coroutine "instantiation" (as it does in the current proposal) -- the same Awaitable<T> as return types, etc.

> > > Which optimizations are we talking about?
> >
> > There is a talk given by Gor about "disappearing coroutines" where (assuming compiler has required optimizations) coroutine gets completely "inlined". He mentioned "heap allocation elision" optimization there -- which to me looks like a big mistake (just like copy elision was).
>
> Sure, but not every case of coroutines relies on that. Coroutines being completely inlined is primarily for generator coroutines, which again aren't exactly the primary use case for the feature.

And yet it is one of the selling points -- unfortunately it requires an optional compiler optimization, i.e. you can't rely on it. Also, it requires the compiler to be able to observe the coroutine body, which naturally leads to a question -- "why hide it behind a 'plain function' facade at all?". Why don't we allow the user to make related decisions explicitly (like where the coroutine frame will be allocated)?

The current proposal leads to a situation where I can't use a coroutine in a noexcept function -- because I can't rely on the compiler to use heap allocation elision.

> > > They look great for generators, but they require a lot more effort to write genuine asynchronous function calls.
> >
> > No, not really... At least I don't see it.
OK, take this synchronous code:
int sync_call(...)
{
    auto val = some_function(...);
    return val + 1;
}

This is the async version of that, using `co_await`:

std::future<int> async_call(...)
{
    auto val = co_await std::async(some_function, ...);
    co_return val + 1;
}
> Write the equivalent of `async_call` using your coroutines system. Here are the rules. You must invoke `std::async`. You must use `future::then` to resume the rest of `async_call`. Your function must return a `future<int>` (which the caller themselves can use `::then` on). Oh, and don't forget: the future returned from the `async` call may not be a `future<int>`; merely some type that can have one added to it, the result of which is convertible to `int`.

coroutine int coro_call(...)
{
    auto val = std::coro_async(some_function, ...)();
    return val + 1;
}
On Wednesday, March 7, 2018 at 10:07:37 PM UTC-5, Michael Kilburn wrote:
> I think you (and others) misunderstood my idea -- I do not advocate against current proposal, I am aiming only at one aspect of it -- namely hiding coroutine behind "plain function" declaration.
But that's practically the point of the Coroutines TS design: that the compiler generates the coroutine machinery based entirely on what is going on inside of the function, not how the outside world uses it. And as will be discussed below, the ramifications of changing "only one aspect of it" fundamentally changes the nature of what you're talking about.
> And yet it is one of selling points -- unfortunately it requires optional compiler optimization, i.e. you can't rely on it. Also, it requires compiler to be able to observe coroutine body, which naturally leads to a question -- "why hiding it behind 'plain function' facade at all?". Why don't we allow user to make related decisions explicitly (like where coroutine frame will be allocated).
>
> Current proposal leads to a situation where I can't use coroutine in noexcept function -- because I can't rely on compiler to use heap allocation elision.
If it's a concern, trap the exception so that it doesn't try to exit the `noexcept` function.
You broke the rules. You didn't invoke `std::async`; you made a new function. Under this design, any asynchronous library will have to have `coro_` versions of all of its asynchronous functions, rather than just using `future`s or similar such types that allow you to apply continuations to them.
Also, note that this function doesn't return a `future<int>`. Which means that if the user wants to use a continuation, they actually can't. Indeed, the user can't even call it like a regular function, can they? Since it may halt mid-stream, you have to either qualify the call to it in some synchronization primitive or the caller themselves must be a `coroutine` function that can therefore be halted.
By contrast, a coroutine function can *always* be called just like a regular function. It has the specified return value, and behaves exactly like its signature says it does. This fact is a fundamental part of the system's design.
The kind of design you've presented is exactly what the resumable expressions proposal used. Only it was a bit cleverer about it, such that you would just have a resumable function called "await" that could be overloaded for some "awaitable" type, which would do the scheduling and unpacking, returning the unpacked value once resumed.
You really should look at that proposal; it's clearly what you want. And yes, it was looked at, but it didn't move forward past P0114R0.
Michael:

On "having to propagate 'co_await's down the call tree doesn't seem ideal": for many of us this is absolutely ideal. It's so ideal that we've experimented with enforcing it even in fiber-based code that doesn't need it, and this feature is one reason that may cause us to port some of our fiber-based code to these coroutines! The safety benefits of doing that are enormous in terms of enforcement of asynchronous interfaces, guaranteeing where code runs under complete control of caller and callee library constructs, and ensuring that different libraries do not interfere with each others' synchronisation primitives.
We've actively been *removing* continuation support from futures in some parts of our codebase for similar reasons - explicitness has enormous value. A few weeks of extra work strengthening a library to make it trivial to use by a caller, while maintaining all the safety my library needs, is well worth the effort.
We clearly need to be able to co_await on opaque library calls - so there are going to be cases all over the codebase where the compiler cannot, and indeed should not, have any visibility into the library code.
We even want coroutines to sit behind dynamic dispatch - is there any reason why a virtual function should not be a coroutine?
Enforcing visibility to the compiler would break that. Those are cases where we will not get automatic heap elision. We may want explicit stack allocation instead and this can be implemented on top of the current TS, though some modifications would make it cleaner. We may not care because the odd heap allocation across asynchronous library interfaces is trivial, and we have worse than that now with synchronising promise/future pairs.
A lot of work has gone into this, and while it is not perfect and there are certainly concerns from some parties with rushing it into the standard, on balance long years of discussions around similar questions to those you are asking have got us to this point.
On Wed, Mar 7, 2018 at 9:33 PM, Nicol Bolas <jmck...@gmail.com> wrote:
> On Wednesday, March 7, 2018 at 10:07:37 PM UTC-5, Michael Kilburn wrote:
> > I think you (and others) misunderstood my idea -- I do not advocate against current proposal, I am aiming only at one aspect of it -- namely hiding coroutine behind "plain function" declaration.
>
> But that's practically the point of the Coroutines TS design: that the compiler generates the coroutine machinery based entirely on what is going on inside of the function, not how the outside world uses it. And as will be discussed below, the ramifications of changing "only one aspect of it" fundamentally changes the nature of what you're talking about.
I am sticking to my guns (i.e. going to claim I am being misunderstood).

I'll try to explain it again:
- stackless coroutines is when your code gets transformed into a C++ object which represents a state machine (and associated fluff to tie it with the rest of your code)
- the idea is to use a template-like approach to the "transformation" step -- as far as I am concerned it could generate exactly the same declarations as you write in the MSVC version that supports Gor's proposal. I.e.

But the difference is that all translation units will see everything, unlike the current state where (typically) only one will have the full view. This would allow every call site to know everything about the generated state machine (e.g. its size). This will allow certain features:
- the caller can explicitly control the location of the coroutine frame
- a coroutine will be clearly different from a function -- which would avoid the need for co_await or co_return. The compiler can see that the current coroutine calls another coroutine and generate the required fluff automatically. You can introduce a new keyword (or convention) that will allow you to call a coroutine in another way (but I doubt this need will be great).
- inlining of some of the generated C++ object's methods
- you need to update only one portion of the compiler -- the one that transforms coroutine declarations into a state machine class declaration. No need to introduce additional logic in other areas.
- etc.

i.e. the idea is not about coroutine implementation or the semantics of generated functions -- it is about having every client see the entire coroutine. At this stage I really don't care exactly how the resulting state machine will behave or what methods it will expose.
> You broke the rules. You didn't invoke `std::async`; you made a new function. Under this design, any asynchronous library will have to have `coro_` versions of all of its asynchronous functions, rather than just using `future`s or similar such types that allow you to apply continuations to them.

I see... one of the aims is to integrate coroutines with the future/async stuff... I am probably behind on all this. All async libs I looked at used the same approach -- the public API consists of a function (async_foo) that takes a callback. Then someone somewhere cranks the event loop, which eventually calls the "process_events()" library function, which in turn calls the aforementioned callback.

So, for this library to add coroutine support you'd have to create a second inline function (or macro) coro_foo() that takes the address of the resume() method of the current coroutine, registers it using async_foo() and suspends the current coroutine. I.e. in this design you still have to add a second version of your async_foo() -- a coroutine-aware wrapper coro_foo(). I see no fault in this approach -- everything is clear and clean.

Now, if the library instead exports "future<int> foo()" -- there is no need for coro_foo(), but the implementation becomes less efficient and more complex, because of type erasure, having to allocate memory, having to move values and other stuff (like synchronization). Am I right? Is it a good price for not having to add a (rather simple) coro_foo()?
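The coro_foo() wrapper described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `event_loop`, `async_foo`, `coro_foo` and `doubler_coro` are hypothetical names, the "coroutine" is a hand-written state machine rather than compiler output, and the toy library simply doubles its input.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Toy stand-in for a callback-based async library (all names hypothetical).
struct event_loop {
    std::queue<std::function<void()>> pending;

    // Someone "cranks" the loop; queued callbacks run here.
    void process_events() {
        while (!pending.empty()) {
            auto callback = std::move(pending.front());
            pending.pop();
            callback();
        }
    }
};

// Library API: start an operation and register a completion callback.
inline void async_foo(event_loop& loop, int input,
                      std::function<void(int)> on_done) {
    loop.pending.push(
        [input, on_done = std::move(on_done)] { on_done(input * 2); });
}

// Hand-written state machine standing in for a generated coroutine object.
struct doubler_coro {
    int result = 0;
    bool finished = false;

    // Called when the awaited operation completes.
    void resume(int value) {
        result = value + 1;
        finished = true;
    }
};

// Coroutine-aware wrapper: registers the coroutine's resume() as callback.
inline void coro_foo(event_loop& loop, int input, doubler_coro& self) {
    async_foo(loop, input, [&self](int value) { self.resume(value); });
}

inline int run_coro_foo_example(int input) {
    event_loop loop;
    doubler_coro coro;           // frame lives on the caller's stack
    coro_foo(loop, input, coro);
    assert(!coro.finished);      // "suspended" until the loop runs
    loop.process_events();
    assert(coro.finished);
    return coro.result;
}
```

Note that the wrapper captures the coroutine object by reference, so the object must outlive the pending operation; that lifetime question is part of what a future-based interface would otherwise manage.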
> The kind of design you've presented is exactly what the resumable expressions proposal used. Only it was a bit cleverer about it, such that you would just have a resumable function called "await" that could be overloaded for some "awaitable" type, which would do the scheduling and unpacking, returning the unpacked value once resumed.
>
> You really should look at that proposal; it's clearly what you want. And yes, it was looked at, but it didn't move forward past P0114R0.

Thank you, I will read it. But as I said -- it isn't what I was talking about.

I am probably not the best communicator.

In any case -- I am not insisting that this is a brilliant idea that will turn the world on its head. Just asking for opinions and whether it was already considered.
[...]
The way the resumable expressions system handled this was to force you to do what the Coroutines TS effectively does when you invoke `co_await` and such: manually wrap your resumable function in a hidden lambda that stores a promise type and returns a future hooked into it.
The entire point of the Coroutines TS is to keep you from having to write such boilerplate. Every design decision is based on that.
[...]
`co_await`-style coding handles this with no manual intervention. Herb Sutter made a great presentation a while back (skip ahead to around 51 minutes) on the failings of explicit callback-style continuations through lambdas and such, and demonstrated how use of `await` can make asynchronous code look exactly like synchronous code.
I'll reproduce his example here, in case you're unwilling to watch the video.
Here's the synchronous code. It reads from a given filename, appending `suffix` to each "chunk" in the file. It's based on one of Microsoft's asynchronous file IO APIs (note that I've slightly adjusted some of the code):
string read(string filename, string suffix)
{
    istream fi = open(filename).get();
    string ret, chunk;
    while((chunk = fi.read().get()).size())
        ret += chunk + suffix;
    return ret;
}
All of the `.get()` calls are there to convert asynchronous tasks into synchronous operations. Our goal is to take this code and make it asynchronous to the caller.
That `while` loop is the pernicious part. You have to invoke `fi.read()`, but then you have to provide a continuation function to it. That continuation function must provoke additional `fi.read()` calls as needed. And each of those calls must pass a continuation function. Namely itself. That pretty much requires heap allocation, lambdas, and heap allocation of lambdas. Nobody wants to write that code, and nobody wants to debug it.
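A rough, self-contained sketch of that callback loop is below, to show where the heap allocation and the self-referencing continuation come from. It is not Herb's code or Microsoft's API: `chunk_source`, `read_state`, `read_step` and `read_all` are hypothetical names, and the toy `read()` invokes its callback immediately instead of from an event loop.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy "asynchronous" stream (hypothetical): read() hands the next chunk,
// or an empty string at end of input, to the supplied callback.
struct chunk_source {
    std::vector<std::string> chunks;
    std::size_t next = 0;

    void read(std::function<void(std::string)> on_chunk) {
        on_chunk(next < chunks.size() ? chunks[next++] : std::string());
    }
};

// State shared by every continuation. It is heap-allocated because it has
// to outlive the function that started the loop.
struct read_state {
    chunk_source source;
    std::string suffix;
    std::string result;
    std::function<void(std::string)> on_done;
};

inline void read_step(std::shared_ptr<read_state> st) {
    chunk_source& source = st->source;
    source.read([st](std::string chunk) {
        if (chunk.empty()) {
            st->on_done(std::move(st->result));
            return;
        }
        st->result += chunk + st->suffix;
        read_step(st);  // the continuation re-registers itself
    });
}

inline void read_all(chunk_source source, std::string suffix,
                     std::function<void(std::string)> on_done) {
    auto st = std::make_shared<read_state>();
    st->source = std::move(source);
    st->suffix = std::move(suffix);
    st->on_done = std::move(on_done);
    read_step(std::move(st));
}

inline std::string read_all_example() {
    chunk_source source;
    source.chunks = {"ab", "cd"};
    std::string out;
    read_all(std::move(source), "-",
             [&out](std::string s) { out = std::move(s); });
    return out;
}
```

Even in this toy form the three-line `while` loop has turned into a shared state object plus a function that schedules itself, which is the boilerplate the paragraph above is complaining about.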
The Coroutines TS equivalent would look like this:
task<string> read(string filename, string suffix)
{
    istream fi = co_await open(filename);
    string ret, chunk;
    while((chunk = co_await fi.read()).size())
        ret += chunk + suffix;
    co_return ret; // a coroutine body must use co_return, not return
}
You cannot get code that is simpler than that.
std::future<int> async_call(...) {
    auto val = co_await std::async(some_function, ...);
    co_return val + 1;
}

nonstd::future<int> async_call(...) {
    return nonstd::async(some_function, ...)
        .on_value([](auto val) { return val + 1; });
}
task<string> read(string filename, string suffix)
{
    istream fi = co_await open(filename);
    string ret, chunk;
    while((chunk = co_await fi.read()).size())
        ret += chunk + suffix;
    co_return ret; // a coroutine body must use co_return, not return
}
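nonstd::future<string> read_impl(string ret, istream fi, string suffix)
{
    // Read one chunk; recurse until an empty chunk signals the end of input.
    return fi.read().on_value_f(
        [ret = std::move(ret), fi = std::move(fi), suffix = std::move(suffix)](string chunk) mutable
        {
            if (chunk.size()) {
                ret += chunk + suffix;
                return read_impl(std::move(ret), std::move(fi), std::move(suffix));
            } else {
                return nonstd::make_ready_future<string>(std::move(ret));
            }
        }
    );
}

nonstd::future<string> read(string filename, string suffix)
{
    // `suffix` has to be captured here so it can be passed on to read_impl.
    return open(filename).on_value_f([suffix = std::move(suffix)](istream fi) mutable {
        return read_impl("", std::move(fi), std::move(suffix));
    });
}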
(Notice that in both Herb's co_foo example and in mine, we ignore the fact that the initialization of `istream fi` from the awaited value of `open(...)` is highly likely to be slicing away important information. Let's assume that this is some STL2-ish "value-semantic istream" that doesn't care about slicing issues.)
Personally I'm also skeptical of the Coroutines TS package (weird keywords, lots and lots of compiler magic, unclear customizability), but I do have to say that the "future.then" approach is... suboptimal. We desperately need some kind of idiom for working with continuations in C++, and Coroutines TS is gamely attempting to tackle that problem head-on.
On Wednesday, March 7, 2018 at 10:07:37 PM UTC-5, Michael Kilburn wrote:
> I think you (and others) misunderstood my idea -- I do not advocate against current proposal, I am aiming only at one aspect of it -- namely hiding coroutine behind "plain function" declaration. In my idea compiler can generate precisely same language constructs during coroutine "instantiation" (as it does in current proposal) -- same Awaitable<T> as return types, etc.
>
> And yet it is one of selling points -- unfortunately it requires optional compiler optimization, i.e. you can't rely on it. Also, it requires compiler to be able to observe coroutine body, which naturally leads to a question -- "why hiding it behind 'plain function' facade at all?". Why don't we allow user to make related decisions explicitly (like where coroutine frame will be allocated).
>
> Current proposal leads to a situation where I can't use coroutine in noexcept function -- because I can't rely on compiler to use heap allocation elision.

Instead of using templates as an analogy, how about lambda functions?

If lambda functions were like the current stackless coroutine proposal:
- Type erasure would be mandatory, not opt-in.
- Lambda functions would always live on the heap, except when the optimizer finds a way not to.
If the proposal was more like lambda functions:
- The user could opt into type erasure.
- The coroutine object could live on the stack, or inside another object, or anywhere else.
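For readers less familiar with the lambda side of this analogy, here is a small sketch of what "opt-in type erasure" means for lambdas today. The function name `lambda_analogy_example` is made up for the example; `std::function` is the standard type-erasing wrapper.

```cpp
#include <cassert>
#include <functional>

inline int lambda_analogy_example() {
    int base = 10;

    // The closure has a unique, complete type known to the caller, so the
    // object is an ordinary local variable on the stack.
    auto add_base = [base](int x) { return base + x; };

    // Type erasure is something the user asks for explicitly. The wrapper
    // hides the closure type and is allowed to allocate to store it.
    std::function<int(int)> erased = add_base;

    assert(add_base(0) == erased(0));
    return add_base(1) + erased(2);
}
```

The suggestion in the list above is that a coroutine object could follow the same split: a concrete type by default, with an erased wrapper only where the caller wants one.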
to this one (note that fundamentally we change nothing -- we still need a way to distinguish between two types of calls):

future<void> coro1()
{
    auto x = co_await coro2(); // suspend here until coro2 is in "data ready" state
    auto y = coro2(); // no suspend
}

coroutine void coro1()
{
    auto x = coro2()(); // can suspend, if coro2 is awaitable
    coro2 c2;
    auto y = co_nowait c2(); // no suspend, I suspect this use case will be rare
    // or auto y = c2.resume(); // kick off coro2 until first suspend
    return; // no need for co_return
}
Other notes:
- An unwrapped coroutine object probably wouldn't be callable. It may have an inconvenient interface that's only useful for wrappers to use.
- The current proposal needs changes to std::future; this alternate idea may need additional changes. Maybe std::promise could do the type erasure for the async case?
- My initial guess is the object couldn't be copyable or movable; it'd have to be RVO'ed to its final location.
Todd
On Thursday, March 8, 2018 at 10:54:24 AM UTC+1, Michael Kilburn wrote:
> On Thu, Mar 8, 2018 at 12:15 AM, Michael Kilburn <crusad...@gmail.com> wrote:
> > i.e. idea is not about coroutine implementation or semantic of generated functions -- it is about having every client to see entire coroutine. At this stage I really don't care exactly how resulting state machine will behave or what methods it will expose.
>
> One more argument -- consider template class and stackless coroutine. Both of them are language provided mechanisms for transforming some code into a C++ class -- i.e. fundamentally they do the same thing. In case of template class there is no way to hide its declaration from users -- at any point entire class definition is visible to every user (even though you may tuck away definition of individual member functions into some TU). Stackless coroutine (in current form) hides resulting class -- this lack of symmetry bothers me. It forces designers to come up with mechanisms designed to workaround resulting problems (e.g. heap allocation elision allows us to delegate allocation to a party that knows state machine frame size and move allocation to stack, if certain criteria are met).
If I understand this problem correctly, we could alter the current proposal to optionally allow embedding the coroutine state in the return object instead of allocating it.

We would then have three kinds: always heap, heap and fixed, only fixed.

The return value would need something like `std::aligned_storage` with some arbitrary size. If the coroutine's values fit this storage, the whole function compiles and uses this storage for its variables; if they do not fit and the heap allocation option is not enabled, the function fails to compile.
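A minimal sketch of the "only fixed" variant, assuming a hand-written state type instead of a compiler-generated frame. `inline_frame`, `countdown_state` and `count_resumes` are hypothetical names; the `static_assert` plays the role of "the function fails to compile if the state does not fit".

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical return object that embeds the coroutine state in a
// fixed-size inline buffer instead of allocating it on the heap.
template <typename State, std::size_t Capacity>
class inline_frame {
    static_assert(sizeof(State) <= Capacity,
                  "coroutine state does not fit the fixed storage");

public:
    template <typename... Args>
    explicit inline_frame(Args&&... args)
        : state_(::new (static_cast<void*>(storage_))
                     State(std::forward<Args>(args)...)) {}

    ~inline_frame() { state_->~State(); }

    // Neither copyable nor movable: the state lives inside this object.
    inline_frame(const inline_frame&) = delete;
    inline_frame& operator=(const inline_frame&) = delete;

    State& state() { return *state_; }

private:
    alignas(State) unsigned char storage_[Capacity];
    State* state_;  // pointer returned by placement new
};

// Stand-in for generated coroutine state: counts down to zero.
struct countdown_state {
    int remaining;
    explicit countdown_state(int n) : remaining(n) {}

    bool resume() {
        if (remaining == 0) return false;
        --remaining;
        return true;
    }
};

inline int count_resumes(int n) {
    assert(n >= 0);
    inline_frame<countdown_state, 16> frame(n);  // 16 bytes of inline storage
    int steps = 0;
    while (frame.state().resume()) ++steps;
    return steps;
}
```

Shrinking the capacity below `sizeof(countdown_state)` turns the mismatch into a compile-time error, which is the behaviour described above for the "only fixed" kind.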
This has some drawbacks. One is that you need to be careful when you stack generators: you have to calculate the size of the things used inside (those that are live during `co_yield`):

generator<int, 30 + sizeof(bar(0))> foo(int i)
{
    int j = i;
    for (auto k : bar(i)) co_yield (++j) * k;
}

Another is that the return value would be neither movable nor copyable, because otherwise you would need to define how "local variables" are handled when the storage is copied.
> Ultimately, stackless coroutine is a C++ object with methods like resume()/etc. I don't see why you shouldn't be able to treat it as such -- i.e. storing it as member variable of another class and access in a virtual function.

Maybe I'm missing something here. You want the compiler to be able to see the body of the called coroutine, as I understand it.

To be clear, I am saying I need to be able to do something like:

class Foo {
    virtual Awaitable bar();
};

co_await my_foo->bar();

Now, you also said:

> Nothing prevents you from hiding state machine object (generated from coroutine declaration) behind a function manually

So I *think* you are saying that you can have a strictly visible coroutine that you can wrap in an async call to heap allocate it in case you want to pass the awaitable around, do some bulk collect operation or whatever. Is that right?
That's a reasonable point of view. At the moment we are pretty happy with the state of inlining when the functions are visible, and are more interested in ensuring there are no heap allocations in the example above, when they are not visible. We really can't afford to significantly increase the amount of compiler visible code we have, optimising for separate compilation is really the only option.
You are not being misunderstood. You're trying to equate all "stackless coroutine" proposals; you're claiming that they're all just minor variations of the same concept.
They are not. Coroutines TS is not just creating a resumable function; it's a lot more than that. It's implementing a specific model of coroutines, which is different from the model you're defining.
If you want "just resumable functions", then you need to understand that this really is a completely different proposal with a completely different design from the Coroutines TS. It's not a slight modification of Coroutines TS (which is evident from the fact that your design literally removes all of the Coroutines TS's keywords).
Your design is a suspend-down coroutine model. When your kind of coroutine yields, it always returns to its nearest non-coroutine caller, who is responsible for scheduling its resumption at some point. Coroutines TS is a suspend-up coroutine model: the code responsible for scheduling its resumption is the code inside the coroutine, not necessarily its caller. If the code inside the coroutine suspends to the caller, that's because the particular coroutine chooses to.
Generators are the classic case of suspend-down. That's why your design works especially well with them, and why the Coroutines TS works so poorly with them. Continuations however are a classic case of suspend-up. This is why your design requires lots of extra work to use them, while the Coroutines TS makes it look like synchronous code.
Coroutines TS will never be as good at generators as your idea is. But neither will your idea be as good at asynchronous code transformations as the Coroutines TS is.
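To make the suspend-down case concrete, here is a minimal sketch of a generator written by hand as an ordinary C++ object, roughly the kind of state machine a compiler could produce. The names `squares_generator` and `collect_squares` are hypothetical and belong to neither proposal. The point it illustrates is that the caller owns the object and decides when to resume it.

```cpp
#include <cassert>
#include <vector>

// Hand-written equivalent of a generator body such as:
//   for (int i = first; i < last; ++i) yield i * i;
class squares_generator {
    int i_;
    int last_;
    int current_ = 0;

public:
    squares_generator(int first, int last) : i_(first), last_(last) {}

    // Runs the body up to the next yield. Returns false once the body has
    // finished. Control always comes back to whoever called resume().
    bool resume() {
        if (i_ >= last_) return false;
        current_ = i_ * i_;
        ++i_;
        return true;
    }

    int value() const { return current_; }
};

std::vector<int> collect_squares(int first, int last) {
    squares_generator gen(first, last);  // state lives on the caller's stack
    std::vector<int> out;
    while (gen.resume()) out.push_back(gen.value());  // the caller schedules every resumption
    return out;
}
```

In a suspend-up design the `while` loop above would not exist in the caller; the coroutine body itself would hand its resumption to whatever it is waiting on.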
The compiler can see that the current coroutine calls another coroutine and generate the required fluff automatically. You can introduce a new keyword (or convention) that will allow you to call a coroutine in another way (but I doubt this need will be great).
- inlining of some of generated C++ object methods
Inlining is only relevant when dealing with suspend-down style coroutines. That is, generators and the like. Inlining is not relevant (or possible) when dealing with suspend-up continuations. This is why such inlining is an optimization to Coroutines TS rather than a requirement. It's a suspend-up model, so it does things in a suspend-up way.
I see... one of the aims is to integrate coroutines with future/async stuff... I am probably behind on all this. All async libs I looked at used the same approach -- the public API consists of a function (async_foo) that takes a callback. Then someone somewhere cranks the event loop, which eventually calls the "process_events()" library function, which in turn calls the aforementioned callback.
So, for this library to add coroutine support you'd have to create a second inline function (or macro) coro_foo() that takes the address of the resume() method of the current coroutine, registers it using async_foo() and suspends the current coroutine. I.e. in this design you still have to add a second version of your async_foo() -- a coroutine-aware wrapper coro_foo(). I see no fault in this approach -- everything is clear and clean.
Now, if the library instead exports "future<int> foo()" -- there is no need for coro_foo(), but the implementation becomes less efficient and more complex, because of type erasure, having to allocate memory, having to move values and other stuff (like synchronization). Am I right? Is it a good price for not having to add a (rather simple) coro_foo()?
It all depends: is the person calling the async routine the one who actually has the continuation? The `future.then`-style interface allows anyone at any time to hook a continuation into the process. Your `coro_foo` style requires that the exact caller be the one who provides the continuation.
And what if the continuation function itself needs a continuation? The caller has to provide that too. And what if the inner continuation needs to access something from the outer continuation? Well, you have to allocate that explicitly. And so forth.
`co_await`-style coding handles this with no manual intervention. Herb Sutter made a great presentation a while back (skip ahead to around 51 minutes) on the failings of explicit callback-style continuations through lambdas and such, and demonstrated how use of `await` can make asynchronous code look exactly like synchronous code.
I'll reproduce his example here, in case you're unwilling to watch the video.
Here's the synchronous code. It reads from a given filename, appending `suffix` to each "chunk" in the file. It's based on one of Microsoft's asynchronous file IO APIs (note that I've slightly adjusted some of the code):
And your `coro_func` version would not work, due to the need of this code to continue itself. You'd have to write a lambda, heap-allocate it, and do a bunch of other stuff to make the explicit continuation code work.
The kind of design you've presented is exactly what the resumable expressions proposal used. Only it was a bit cleverer about it, such that you would just have a resumable function called "await" that could be overloaded for some "awaitable" type, which would do the scheduling and unpacking, returning the unpacked value once resumed.
You really should look at that proposal; it's clearly what you want. And yes, it was looked at, but it didn't move forward past P0114R0.
Thank you, I will read it. But as I said -- it isn't what I was talking about.
How can you say it isn't what you were talking about when you haven't read it?
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/CA%2BnLVP4wXQeo%3DJsjMJybNfa9O3MxSQu1b-uACR2ygB0z1hDA3w%40mail.gmail.com.
On Thu, Mar 8, 2018 at 11:12 AM, Nicol Bolas <jmck...@gmail.com> wrote:
You are not being misunderstood. You're trying to equate all "stackless coroutine" proposals; you're claiming that they're all just minor variations of the same concept.
No, I don't. But you correctly noted that some of the existing proposals already have an "always visible" feature. That is OK -- I presented this idea in the context of the current proposal in order to enable options such as:
- manually controlling the generated state machine object (allocation, etc.) to make the heap elision optimization irrelevant
- changing convention -- e.g. use co_nowait instead of co_await (see my answer to Todd)
- a few others
If you don't like any of these options -- naturally, that idea is a no-go. But maybe some of them can be appealing?
They are not. Coroutines TS is not just creating a resumable function; it's a lot more than that. It's implementing a specific model of coroutines, which is different from the model you're defining.
If you want "just resumable functions", then you need to understand that this really is a completely different proposal with a completely different design from the Coroutines TS. It's not a slight modification of Coroutines TS (which is evident from the fact that your design literally removes all of the Coroutines TS's keywords).
Keywords may be removed, but their intended effect stays unchanged.
Your design is a suspend-down coroutine model. When your kind of coroutine yields, it always returns to its nearest non-coroutine caller, who is responsible for scheduling its resumption at some point. Coroutines TS is a suspend-up coroutine model: the code responsible for scheduling its resumption is the code inside the coroutine, not necessarily its caller. If the code inside the coroutine suspends to the caller, that's because the particular coroutine chooses to.
Generators are the classic case of suspend-down. That's why your design works especially well with them, and why the Coroutines TS works so poorly with them. Continuations however are a classic case of suspend-up. This is why your design requires lots of extra work to use them, while the Coroutines TS makes it look like synchronous code.
Coroutines TS will never be as good at generators as your idea is. But neither will your idea be as good at asynchronous code transformations as the Coroutines TS is.
Again, the idea presented doesn't fundamentally change anything in the current proposal with respect to returned types (future/etc) or the way it suspends. All it does is expose the generated state machine class to every user, thus enabling certain options that designers may or may not take.
I see... one of the aims is to integrate coroutines with future/async stuff... I am probably behind on all this. All async libs I looked at used the same approach -- the public API consists of a function (async_foo) that takes a callback. Then someone somewhere cranks the event loop, which eventually calls the "process_events()" library function, which in turn calls the aforementioned callback.
So, for this library to add coroutine support you'd have to create a second inline function (or macro) coro_foo() that takes the address of the resume() method of the current coroutine, registers it using async_foo() and suspends the current coroutine. I.e. in this design you still have to add a second version of your async_foo() -- a coroutine-aware wrapper coro_foo(). I see no fault in this approach -- everything is clear and clean.
Now, if the library instead exports "future<int> foo()" -- there is no need for coro_foo(), but the implementation becomes less efficient and more complex, because of type erasure, having to allocate memory, having to move values and other stuff (like synchronization). Am I right? Is it a good price for not having to add a (rather simple) coro_foo()?
It all depends: is the person calling the async routine the one who actually has the continuation? The `future.then`-style interface allows anyone at any time to hook a continuation into the process. Your `coro_foo` style requires that the exact caller be the one who provides the continuation.
And what if the continuation function itself needs a continuation? The caller has to provide that too. And what if the inner continuation needs to access something from the outer continuation? Well, you have to allocate that explicitly. And so forth.
How often will that happen?
Your typical caller of a coroutine is probably another coroutine that awaits on it. In this case it works just fine. In others -- related boilerplate can be buried into some utility function(s).
`co_await`-style coding handles this with no manual intervention. Herb Sutter made a great presentation a while back (skip ahead to around 51 minutes) on the failings of explicit callback-style continuations through lambdas and such, and demonstrated how use of `await` can make asynchronous code look exactly like synchronous code.
I'll reproduce his example here, in case you're unwilling to watch the video.
Ok, now that was below the belt. I did watch it.
Here's the synchronous code. It reads from a given filename, appending `suffix` to each "chunk" in the file. It's based on one of Microsoft's asynchronous file IO APIs (note that I've slightly adjusted some of the code):
<lots of snipping>
And your `coro_func` version would not work, due to the need of this code to continue itself. You'd have to write a lambda, heap-allocate it, and do a bunch of other stuff to make the explicit continuation code work.
Why wouldn't it work? coro_foo() will end up calling async_foo(), passing my current coroutine's resume() as a callback, and suspending (i.e. returning). The code will be practically the same. Either I am fantastically missing something or we are not on the same page.
The kind of design you've presented is exactly what the resumable expressions proposal used. Only it was a bit cleverer about it, such that you would just have a resumable function called "await" that could be overloaded for some "awaitable" type, which would do the scheduling and unpacking, returning the unpacked value once resumed.
You really should look at that proposal; it's clearly what you want. And yes, it was looked at, but it didn't move forward past P0114R0.
Thank you, I will read it. But as I said -- it isn't what I was talking about.
How can you say it isn't what you were talking about when you haven't read it?
I meant I did not propose a new design -- I presented an idea (for the current design) that may change it a bit, opening up some options.
On 9 March 2018 at 09:30, Lee Howes <xri...@gmail.com> wrote:
> coroutine void coro_bar(...) { ... }
> class Foo {
>     virtual Awaitable bar() { return cb_(); }
>     coro_bar cb_; // an instance of state-machine class generated from coro_bar's definition
> };
> does it make sense? I didn't think about it this way, but it should probably work.
You've added shared state. Now what happens if I want to do something like this:
Foo f;
thread t([&]() { await f.bar(); });
thread t2([&]() { await f.bar(); });
t.join();
t2.join();
I need two copies of cb_, or for cb_ to use heap allocation magic to hide that, or at the very least to synchronize the coroutine state.
coroutine void coro_bar(...) { ... }
class Foo {
    virtual Awaitable bar()
    {
        auto cb = new coro_bar(); // basically same thing current TS does under the hood
        return (*cb)();
    }
};
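The per-call allocation above can be imitated in plain C++ to show why it answers the shared-state objection. This is only a sketch under assumptions: `coro_bar` here is a hand-written stand-in for the generated state machine, and `awaitable` is a hypothetical owning handle rather than a real awaitable type. Unlike the raw `new` in the snippet, the handle owns the state so it is released when the handle goes away.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the state machine class generated from a coroutine definition.
struct coro_bar {
    int step = 0;
    int resume() { return ++step; }  // advances this instance only
};

// Hypothetical handle that owns one heap-allocated state machine.
class awaitable {
    std::unique_ptr<coro_bar> state_;

public:
    explicit awaitable(std::unique_ptr<coro_bar> s) : state_(std::move(s)) {}
    int resume() { return state_->resume(); }
};

class Foo {
public:
    virtual ~Foo() = default;

    // Each call allocates a fresh state machine, so callers never share state.
    virtual awaitable bar() {
        return awaitable(std::unique_ptr<coro_bar>(new coro_bar()));
    }
};

int independent_calls_demo() {
    Foo f;
    awaitable a = f.bar();
    awaitable b = f.bar();
    a.resume();
    a.resume();
    // a has now been resumed three times and b once; neither affects the other.
    return a.resume() * 10 + b.resume();
}
```

With a single `cb_` member instead, both callers would be advancing the same `step`, which is the situation the two-thread example describes.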
Ok, enter sent the message. I wanted to also add:
> move coroutine declarions into separate TU and hide them behind plain function.
but then I am back needing hooks like the coroutines TS provides - but unlike the coroutines TS I don't see from your example what type that plain function should return. So now, I wrap my coroutine in a free function foo.
something foo();
bar() {
await foo()? I'm not sure what the syntax would be here for awaiting on a plain function... do I need a coroutine type defined here?
}
coroutine void coro_foo() {...}
awaitable<void> bar()
{
    auto cb = new coro_foo();
    return (*cb)();
}
// somewhere in another TU
awaitable<void> bar();
co_await bar();
On Friday, March 9, 2018 at 4:10:36 AM UTC-5, Michael Kilburn wrote:
On Thu, Mar 8, 2018 at 11:12 AM, Nicol Bolas <jmck...@gmail.com> wrote:
You are not being misunderstood. You're trying to equate all "stackless coroutine" proposals; you're claiming that they're all just minor variations of the same concept.
No, I don't. But you correctly noted that some of the existing proposals already have an "always visible" feature. That is OK -- I presented this idea in the context of the current proposal in order to enable options such as:
- manually controlling the generated state machine object (allocation, etc.) to make the heap elision optimization irrelevant
- changing convention -- e.g. use co_nowait instead of co_await (see my answer to Todd)
- a few others
If you don't like any of these options -- naturally, that idea is a no-go. But maybe some of them can be appealing?
It's not a question of liking or disliking the options. These options are irrelevant for the use cases that the Coroutines TS is intended to work with. They force the system to move away from the optimal syntax for suspend-up-style programming and towards a suspend-down model.
The basic concepts you're describing are appealing; I implore you to read P0114. But they are not appropriate for this proposal. Your idea is trying to turn a suspend-up system into a suspend-down one.
They are not. Coroutines TS is not just creating a resumable function; it's a lot more than that. It's implementing a specific model of coroutines, which is different from the model you're defining.
If you want "just resumable functions", then you need to understand that this really is a completely different proposal with a completely different design from the Coroutines TS. It's not a slight modification of Coroutines TS (which is evident from the fact that your design literally removes all of the Coroutines TS's keywords).
Keywords may be removed, but their intended effect stays unchanged.
How can something which no longer exists have an effect?
coroutine ... coro1() { ... }
coroutine ... coro2() {
    // we know coro1 is a coroutine (because coro1 is marked so) and we know it is being invoked from a coroutine (because coro2 is marked so),
    // therefore we may choose to await by default when calling it
    coro1();
    // ... and (for example) require a keyword for a non-awaiting call (to distinguish between the two calling modes)
    auto x = co_noawait coro1();
}
For example, how does one of your coroutines yield values? What's funny is that, since I've read P0114, I have a pretty good idea what your answer will be ;)
Why wouldn't it work? coro_foo() will end up calling async_foo(), passing my current coroutine's resume() as a callback, and suspending (i.e. returning). The code will be practically the same. Either I am fantastically missing something or we are not on the same page.
Show me how the code would "be practically the same". Present the equivalent code using your idea.
And please note that it must be the equivalent code: based on `.then` style continuations and the like. So the return value still needs to be a `task<string>`. And the function itself must be a regular function (and thus cannot directly use your `coroutine` keyword).
void async_foo(..., user_cb, user_cb_data);

// as mentioned before, coro_foo has to be inline or a macro (to be able to have access to the coroutine that calls it)
inline string coro_foo(...)
{
    string ret;
    struct control_struct {
        ...
    } control(&ret, &current_coro.resume);
    auto cb = [](..., user_cb_data) {
        ((control_struct*)user_cb_data)->set_result_and_call_resume( make_string(...) );
    };
    async_foo(..., cb, &control);
    suspend_current_coro;
    return ret;
}
// user code:
coroutine string coro_user1(...)
{
string chunk = coro_foo(...);
while((chunk = coro_foo(...)).size()) ...;
...
}
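Since the pseudocode above relies on syntax that does not exist, the same mechanics can be emulated in standard C++ with a hand-written state machine. This is a sketch under assumptions, not the proposed feature: `event_loop`, `chunk_source` and `read_all_demo` are hypothetical, and `coro_user1` is written out manually in place of what a compiler would generate. It does not return a `task<string>`, so it illustrates only the "register resume() as the callback, then suspend" step.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical single-threaded event loop standing in for the library's process_events().
struct event_loop {
    std::deque<std::function<void()>> pending;

    void process_events() {
        while (!pending.empty()) {
            auto cb = std::move(pending.front());
            pending.pop_front();
            cb();  // may queue further callbacks
        }
    }
};

// Callback-style API: delivers one chunk per call, and an empty string at end of input.
class chunk_source {
    event_loop& loop_;
    int remaining_;

public:
    chunk_source(event_loop& loop, int chunks) : loop_(loop), remaining_(chunks) {}

    void async_foo(std::function<void(std::string)> user_cb) {
        std::string chunk = remaining_ > 0 ? "chunk" : "";
        if (remaining_ > 0) --remaining_;
        loop_.pending.push_back([user_cb, chunk] { user_cb(chunk); });
    }
};

// Hand-written state machine for a body like:
//   while ((chunk = coro_foo(...)).size()) result += chunk + suffix;
class coro_user1 {
    chunk_source& src_;
    std::string suffix_;
    std::string chunk_;
    std::string result_;
    bool done_ = false;

    // What coro_foo() would expand to: store the result, register resume()
    // as the callback, and return to the caller (i.e. suspend).
    void await_next_chunk() {
        src_.async_foo([this](std::string c) {
            chunk_ = std::move(c);
            resume();
        });
    }

public:
    coro_user1(chunk_source& src, std::string suffix)
        : src_(src), suffix_(std::move(suffix)) {}

    void start() { await_next_chunk(); }

    void resume() {
        if (chunk_.empty()) {  // end of input: the body finishes
            done_ = true;
            return;
        }
        result_ += chunk_ + suffix_;
        await_next_chunk();
    }

    bool done() const { return done_; }
    const std::string& result() const { return result_; }
};

std::string read_all_demo() {
    event_loop loop;
    chunk_source src(loop, 2);
    coro_user1 coro(src, ";");  // state machine allocated on the caller's stack
    coro.start();
    loop.process_events();      // someone cranks the event loop
    return coro.done() ? coro.result() : "<not finished>";
}
```

Note that the state machine must outlive every pending callback, because the lambda captures `this`. That lifetime question is where the heap allocation debated in this thread usually comes back in.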
The kind of design you've presented is exactly what the resumable expressions proposal used. Only it was a bit cleverer about it, such that you would just have a resumable function called "await" that could be overloaded for some "awaitable" type, which would do the scheduling and unpacking, returning the unpacked value once resumed.
You really should look at that proposal; it's clearly what you want. And yes, it was looked at, but it didn't move forward past P0114R0.
Thank you, I will read it. But as I said -- it isn't what I was talking about.
How can you say it isn't what you were talking about when you haven't read it?
I meant I did not propose a new design -- I presented an idea (for the current design) that may change it a bit, opening up some options.
You keep saying that it's not a new design, and yet all evidence says that it is. Here are a number of aspects of the design of Coroutines TS:
1. Coroutines appear to be regular functions using a regular interface. Function overloading works as normal with them.
2. Coroutines have a built-in and hidden state object with an internal return value object.
3. The internal functor that represents a coroutine is only visible to the code that absolutely needs to know it exists: the code which schedules its resumption.
4. The scheduling of the resumption of the coroutine is done entirely from within the coroutine itself. It may delegate this to the caller, but this is something it must explicitly choose to do.
5. There is direct and effortless support for suspend-up via the `co_await` operator and its associated machinery.
I would say that most of these are fundamental aspects of what makes the Coroutines TS what it is. Here is how your design compares:
1. Coroutines are not regular functions. You haven't explained how function overloading works with coroutines.
2. Coroutines do not have built-in and hidden state objects; they directly expose this state to the code talking about them.
3. The code calling a coroutine must interact with the internal functor that represents a coroutine.
4. Scheduling the resumption of the coroutine is always granted to the caller.
5. Suspend-up requires explicit effort from the caller. There is apparently no `co_await` operator at all.
It's not even clear if your system allows the user to create their own promise/future types. Coroutines TS allows this by default; the coroutine machinery inspects the coroutine function's signature to determine what the internal promise object will be (if your return value is std::future<T>, the promise type would be std::promise<T>). Yours seems to require the caller to decide what kind of promise/future will be used. So there's another difference.
How can you say that these changes do not constitute a "new design" (and, as I keep reminding you, your design is almost exactly P0114)? You're not making a minor tweak to an existing proposal; you're fundamentally changing it. You even seemed to understand that when you agreed with Todd's analogy with lambdas. An "always type-erased" lambda is very much a new design compared to a "not type-erased" lambda. They may do similar things, but they do them in a fundamentally different way.
To you and your use cases, this may not seem like a big change. But every use case you've presented is for suspend-down, not for suspend-up style coding. And suspend-up is what the Coroutines TS is all about.
So this very much is a new design.
On Fri, Mar 9, 2018 at 11:38 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, March 9, 2018 at 4:10:36 AM UTC-5, Michael Kilburn wrote:
On Thu, Mar 8, 2018 at 11:12 AM, Nicol Bolas <jmck...@gmail.com> wrote:
You are not being misunderstood. You're trying to equate all "stackless coroutine" proposals; you're claiming that they're all just minor variations of the same concept.
No, I don't. But you correctly noted that some of the existing proposals already have an "always visible" feature. That is OK -- I presented this idea in the context of the current proposal in order to enable options such as:
- manually controlling the generated state machine object (allocation, etc.) to make the heap elision optimization irrelevant
- changing convention -- e.g. use co_nowait instead of co_await (see my answer to Todd)
- a few others
If you don't like any of these options -- naturally, that idea is a no-go. But maybe some of them can be appealing?
It's not a question of liking or disliking the options. These options are irrelevant for the use cases that the Coroutines TS is intended to work with. They force the system to move away from the optimal syntax for suspend-up-style programming and towards a suspend-down model.
The basic concepts you're describing are appealing; I implore you to read P0114. But they are not appropriate for this proposal. Your idea is trying to turn a suspend-up system into a suspend-down one.
No, the idea is to force all coroutines to be inline and clearly mark them as coroutines. At minimum, everything else can stay the same -- return types, semantics, co_* keywords, etc.
[[coroutine]] inline future<int> some_coroutine(...) {...}
For example, how does one of your coroutines yield values? What's funny is that, since I've read P0114, I have a pretty good idea what your answer will be ;)
Same as it is done today -- co_yield
coroutine int mycoro(int a, char* b) { ... } // has to be defined in declaration
Here I can only bang my head against the desk -- you've decided for yourself what I am trying to present and no amount of arguing can cause you to budge. I can only give up :-)
coroutine void coro_foo() {...}
awaitable<void> bar()
{
auto cb = new coro_foo();
return (*cb)();
}