A proposal to add coroutines to C++

Oliver Kowalke

Mar 15, 2013, 1:48:17 PM

to std-pr...@isocpp.org

Hi,

this is the first version of the proposal to add coroutines to C++.
The pdf can be found here: http://ok73.funpic.de/coroutine.pdf

so long,
Oliver

maltes...@gmail.com

Mar 16, 2013, 1:43:28 PM

to std-pr...@isocpp.org

I like it and I think that coroutines should be added to the standard.

A few questions though:

1. You say that this is a library only change but you require that it has a stack that can grow on demand. As far as I know that can not currently be implemented in C++. So that would not make this a library only change, right? I think the requirement for growing stacks is a good one and we should not drop it just to make this a library only change. It might be a good idea to make the growing stack a separate class that is used by the coroutine.
2. Why does the coroutine get called from the constructor? You say that otherwise it is difficult to detect the completeness of a coroutine, but I don't understand why that would be. Can you explain the std::distance example you gave and why it wouldn't work if you created the coroutine in one line and called operator() in the next?
3. Why does operator() return the coroutine itself? I could understand if it returns the result (or a std::optional<result>) or void. But I don't see why it would return the coroutine itself. I think it should return void, because anything else is confusing. For example this will compile with boosts coroutine implementation:
coroutine<bool ()> coro(/*...*/);
bool result = coro(); // doesn't actually assign the result of the coroutine, but the current state instead
4. How does the coroutine unwind the stack on destruction? I believe that in boost it's done by throwing an exception, which unwinds the stack from the point where you last left the coroutine to where the exception is being caught. Meaning if nobody catches that exception the stack is unwound all the way, but that fails if there is a catch(...) block anywhere that doesn't re-throw. I think it's a reasonable requirement to say that nobody must have a catch(...) block that doesn't re-throw if you want stack unwinding to work, but if the standard is to be changed, then this could be made more robust. For example you could introduce a magical exception type that automatically re-throws when caught, except when it's caught by std::coroutine.

Oliver Kowalke

Mar 16, 2013, 2:24:31 PM

to std-pr...@isocpp.org

Am Samstag, 16. März 2013 18:43:28 UTC+1 schrieb maltes...@gmail.com:

1. You say that this is a library only change but you require that it has a stack that can grow on demand. As far as I know that can not currently be implemented in C++. So that would not make this a library only change, right? I think the requirement for growing stacks is a good one and we should not drop it just to make this a library only change. It might be a good idea to make the growing stack a separate class that is used by the coroutine.

On-demand growing stacks can be used from C++ - I've already done it for boost.coroutine (>=1.54 - currently in boost-trunk). But this requires support from the compiler. As I wrote in the proposal, the only compiler I know of that supports segmented/split stacks is GCC (>=4.7).
(OK - LLVM also has segmented stacks, but the clang front end does not allow you to use them, at least not yet.)
So we already have on-demand growing stacks today (you could try boost.coroutine from boost-trunk -> 'b2 toolset=gcc segmented-stack=on'). I think the user should not have to deal with stack allocation etc. - this should be done inside coroutine<>.

2. Why does the coroutine get called from the constructor? You say that otherwise it is difficult to detect the completeness of a coroutine, but I don't understand why that would be. Can you explain the std::distance example you gave and why it wouldn't work if you created the coroutine in one line and called operator() in the next?

If we do not enter the coro-fn in the coroutine<>-ctor, you are forced to call coroutine::operator() (which does the jump into the coro-fn) 4 times. The 4th call is required because after the 3rd jump back from the coro-fn we still don't know in the main thread/task (I don't know what the best wording for the caller would be) whether the coro-fn would return another value or would terminate (== return from the coro-fn body).
In contrast, if we enter the coro-fn in the coroutine<>-ctor, we know after the 3rd jump from the coro-fn that the coro-fn has terminated.

3. Why does operator() return the coroutine itself? I could understand if it returns the result (or a std::optional<result>) or void. But I don't see why it would return the coroutine itself. I think it should return void, because anything else is confusing. For example this will compile with boosts coroutine implementation:

in order to check if the coro-fn has returned a result or terminated

coroutine<bool ()> coro(/*...*/);
bool result = coro(); // doesn't actually assign the result of the coroutine, but the current state instead

result would tell you if coro is still valid (== coroutine<>::operator() can be called) or if the coro-fn has terminated (== you must not call coroutine<>::operator())

4. How does the coroutine unwind the stack on destruction? I believe that in boost it's done by throwing an exception, which unwinds the stack from the point where you last left the coroutine to where the exception is being caught.

yes - that is how boost.coroutine unwinds the stack (it is the most generic approach because not all compilers implement an __unwind API)

Meaning if nobody catches that exception the stack is unwound all the way, but that fails if there is a catch(...) block anywhere that doesn't re-throw.

no - boost.coroutine uses a trampoline function which catches the unwind-exception

I think it's a reasonable requirement to say that nobody must have a catch(...) block that doesn't re-throw if you want stack unwinding to work, but if the standard is to be changed, then this could be made more robust.

I would expect that the compiler implementer calls its compiler specific stack unwinding API so that throwing and catching a special unwind-exception isn't necessary

maltes...@gmail.com

Mar 16, 2013, 3:31:35 PM

to std-pr...@isocpp.org

On Saturday, March 16, 2013 2:24:31 PM UTC-4, Oliver Kowalke wrote:

Am Samstag, 16. März 2013 18:43:28 UTC+1 schrieb maltes...@gmail.com:
1. You say that this is a library only change but you require that it has a stack that can grow on demand. As far as I know that can not currently be implemented in C++. So that would not make this a library only change, right? I think the requirement for growing stacks is a good one and we should not drop it just to make this a library only change. It might be a good idea to make the growing stack a separate class that is used by the coroutine.

On-demand growing stacks can be used from C++ - I've already done it for boost.coroutine (>=1.54 - currently in boost-trunk). But this requires support from the compiler. As I wrote in the proposal, the only compiler I know of that supports segmented/split stacks is GCC (>=4.7).
(OK - LLVM also has segmented stacks, but the clang front end does not allow you to use them, at least not yet.)
So we already have on-demand growing stacks today (you could try boost.coroutine from boost-trunk -> 'b2 toolset=gcc segmented-stack=on'). I think the user should not have to deal with stack allocation etc. - this should be done inside coroutine<>.

If it requires support by the compiler then this is hardly a library only change. I agree that the feature is needed, I just disagree with the wording.

2. Why does the coroutine get called from the constructor? You say that otherwise it is difficult to detect the completeness of a coroutine, but I don't understand why that would be. Can you explain the std::distance example you gave and why it wouldn't work if you created the coroutine in one line and called operator() in the next?

If we do not enter the coro-fn in the coroutine<>-ctor, you are forced to call coroutine::operator() (which does the jump into the coro-fn) 4 times. The 4th call is required because after the 3rd jump back from the coro-fn we still don't know in the main thread/task (I don't know what the best wording for the caller would be) whether the coro-fn would return another value or would terminate (== return from the coro-fn body).
In contrast, if we enter the coro-fn in the coroutine<>-ctor, we know after the 3rd jump from the coro-fn that the coro-fn has terminated.

Makes sense, but I still think that it's a bit strange. If this is only needed for coroutine iterators, then maybe the initial call could be moved into the non-member begin() function instead of the constructor.
I think that if a coroutine needs to be called four times, it should have to be called four times. Not three times and once in the constructor.

I actually have another related question:
How do you tell the difference between the calling constructor and the non-calling constructor for a coroutine<void()>? It looks like the current boost implementation will always call the coroutine immediately if it's a coroutine<void ()>. I would prefer that the default behavior is that the coroutine does NOT get called immediately. That just makes it easier to use the coroutine as a functor, for example as a callback to an observer pattern.

3. Why does operator() return the coroutine itself? I could understand if it returns the result (or a std::optional<result>) or void. But I don't see why it would return the coroutine itself. I think it should return void, because anything else is confusing. For example this will compile with boosts coroutine implementation:

in order to check if the coro-fn has returned a result or terminated

coroutine<bool ()> coro(/*...*/);
bool result = coro(); // doesn't actually assign the result of the coroutine, but the current state instead

result would tell you if coro is still valid (== coroutine<>::operator() can be called) or if the coro-fn has terminated (== you must not call coroutine<>::operator())

But you could do the same thing by checking the coroutine itself instead of checking the result of the operator(). I don't see what the benefit is. All you get is essentially that you can write
while (some_coroutine()) {}
instead of
for (; some_coroutine; some_coroutine()) {}

I find it confusing if a functor doesn't return the result of its call. I understand why it shouldn't return the result, but I would prefer that it return nothing instead.

As I understand it, the reason it doesn't return the result is so that there is only one way to return values from a coroutine, right? If that is the case, then I would prefer it if we required that the last returned value from a coroutine has to be returned with a normal return statement, like it was done in the old boost.coroutine. Then we could make operator() return that value. And then a coroutine object is more useful as a functor for algorithms like std::transform. Another benefit would be that this would solve the problem you mentioned in response to my second question, because a coroutine could not be called any more after it has returned its final value.

4. How does the coroutine unwind the stack on destruction? I believe that in boost it's done by throwing an exception, which unwinds the stack from the point where you last left the coroutine to where the exception is being caught.

yes - that is how boost.coroutine unwinds the stack (it is the most generic approach because not all compilers implement an __unwind API)

Meaning if nobody catches that exception the stack is unwound all the way, but that fails if there is a catch(...) block anywhere that doesn't re-throw.

no - boost.coroutine uses a trampoline function which catches the unwind-exception

I think it's a reasonable requirement to say that nobody must have a catch(...) block that doesn't re-throw if you want stack unwinding to work, but if the standard is to be changed, then this could be made more robust.

I would expect that the compiler implementer calls its compiler specific stack unwinding API so that throwing and catching a special unwind-exception isn't necessary

OK that's a good idea. But it is another change that requires compiler support and can not be implemented as a library feature. It also makes it so that you can only pass a noexcept function to a coroutine if you know that it will complete before being destroyed. Which is not unreasonable, but should be a written requirement.

Oliver Kowalke

Mar 16, 2013, 6:12:01 PM

to std-pr...@isocpp.org

Am Samstag, 16. März 2013 20:31:35 UTC+1 schrieb maltes...@gmail.com:

If it requires support by the compiler then this is hardly a library only change. I agree that the feature is needed, I just disagree with the wording.

I wrote that it does not change the current C++ standard - not that segmented stacks can be implemented by a library only

Makes sense, but I still think that it's a bit strange. If this is only needed for coroutine iterators, then maybe the initial call could be moved into the non-member begin() function instead of the constructor.

no, it is not only required by iterators - it influences how the state of a coroutine can be checked and how returned results can be accessed
for instance ctor of std::thread enters the thread-fn too

I think that if a coroutine needs to be called four times, it should have to be called four times. Not three times and once in the constructor.

this is not an argument - you pass 3 return values back, so you would expect to call the coroutine 3 times

How do you tell the difference between the calling constructor and the non-calling constructor for a coroutine<void()>?

I don't get what you mean - there is no non-calling ctor, nor is coroutine<void()> special or different from other coroutine<> instances

It looks like the current boost implementation will always call the coroutine immediately if it's a coroutine<void ()>.

yes, and for other coroutines the ctor calls the coro-fn too

I would prefer that the default behavior is that the coroutine does NOT get called immediately.

no - as I explained it would make using and implementing a coroutine harder

That just makes it easier to use the coroutine as a functor, for example as a callback to an observer pattern.

a coroutine is not a functor - we have had the same discussion for this during the review process of boost.coroutine
I've had a version which did not call coro-fn in the ctor and we came up with some examples where this design was not so good

But you could do the same thing by checking the coroutine itself instead of checking the result of the operator(). I don't see what the benefit is. All you get is essentially that you can write
while (some_coroutine()) {}
instead of
for (; some_coroutine; some_coroutine()) {}

with the current design you could also do while(coro())

because the proposed interface contains the following member functions:

coroutine & operator()()
operator bool()

because operator() does the jump and returns a reference to the coroutine itself, operator bool() is called in the condition of the while loop

I find it confusing if a functor doesn't return the result of its call. I understand why it shouldn't return the result, but I would prefer that it return nothing instead.

you must not compare a coroutine to a function/functor it is different

with the current design you store the result of the last jump inside the coroutine; you can access it multiple times, and after returning from the context jump you know whether the call returned a value

while (coro())
coro.get();

coro().get()

we have discussed this design during the review too and the boost developers in the review voted to store the result inside the coroutine and access it via coroutine<>::get()

you must always check the coroutine before you try to access the returned value

As I understand it the reason for why it doesn't return the result is so that there is only one way to return values from a coroutine, right?

there should be only one way for a coroutine to pass values back

If that is the case then I would prefer it if we require that the last returned value from a coroutine has to be returned with a normal return statement, like it was done in the old boost.coroutine.

in some cases the coro-fn would get a weird structure - for instance when returning values from loops - because the last result would have to be returned by a return statement.
the example below becomes more natural with the current design

void fibonacci( boost::coroutines::coroutine< void( int) > & c)
{
    int first = 1, second = 1;
    for ( int i = 0; i < 10; ++i)
    {
        int third = first + second;
        first = second;
        second = third;
        c( third);
    }
    // if coro-fn would not return void then the last value must be returned here, outside the loop
}

boost::coroutines::coroutine< int() > c( fibonacci);
while( c) {
    std::cout << c.get() << std::endl;
    c(); // jump back into the coro-fn to compute the next value
}

Then we could make operator() return that value.

this does not depend on if the last value was returned by a 'return' statement or by coroutine<>::operator()(...)

And then a coroutine object is more useful as a functor for algorithms like std::transform.

a coroutine is not a functor, and it can be used with algorithms from the STL (see examples in the proposal)

It also makes it so that you can only pass a noexcept function to a coroutine if you know that it will complete before being destroyed. Which is not unreasonable, but should be a written requirement.

I don't get it - you can pass a non-noexcept function to a coroutine too. An exception thrown inside the coro-fn is caught and stored inside the coroutine, and rethrown from operator() after returning to the calling context

Oliver Kowalke

Mar 17, 2013, 2:00:31 AM

to std-pr...@isocpp.org

Am Samstag, 16. März 2013 23:12:01 UTC+1 schrieb Oliver Kowalke:

Am Samstag, 16. März 2013 20:31:35 UTC+1 schrieb maltes...@gmail.co

If that is the case then I would prefer it if we require that the last returned value from a coroutine has to be returned with a normal return statement, like it was done in the old boost.coroutine.

in some cases the coro-fn would get a weird structure - for instance when returning values from loops - because the last result would have to be returned by a return statement.

you could also take a look at the visitor example contained in the proposal (and in boost.coroutine too) - returning the last leaf node via a 'return' statement is very hard (or nearly impossible -> how do you know in the recursive traversal of the tree that you have reached the last leaf node?)

Malte Skarupke

Mar 17, 2013, 3:14:11 AM

to std-pr...@isocpp.org

OK those are good reasons to allow returning without a value. You've got me convinced.
Could you also post a link to the discussions on the boost mailing list? I don't want to ask questions that have already been discussed.

Has there been any thought to yielding without a value? It might be useful if the coroutine is computing something that takes a while but it wants to give up control of the thread every now and then. All that would be required is that the coroutine has an operator() that takes no arguments. A coroutine can already be in a state where it doesn't have a result from the call, so this wouldn't complicate things all that much.
I also think that this is needed because there is this weird asymmetry with the first call from the constructor where the function inside of the coroutine may be called without arguments. But it has to return a value anyway. I think it would make sense if the inner function can say "If I don't have any arguments yet, yield without a return value until I get arguments."

As for the noexcept thing: What I meant is that if your function is noexcept, you have to be certain that the coroutine will finish, because stack unwinding will terminate the program. You can try that with the stack unwinding example from the boost website: If you make the function in there be noexcept the program will call terminate().

Oliver Kowalke

Mar 17, 2013, 3:51:44 AM

to std-pr...@isocpp.org

Am Sonntag, 17. März 2013 08:14:11 UTC+1 schrieb Malte Skarupke:

Could you also post a link to the discussions on the boost mailing list? I don't want to ask questions that have already been discussed.

some threads can be found here: http://search.gmane.org/search.php?group=gmane.comp.lib.boost.devel&query=boost.coroutine+review

Has there been any thought to yielding without a value? It might be useful if the coroutine is computing something that takes a while but it wants to give up control of the thread every now and then. All that would be required is that the coroutine has an operator() that takes no arguments. A coroutine can already be in a state where it doesn't have a result from the call, so this wouldn't complicate things all that much.

I think this does not belong in a coroutine - with the template argument of coroutine you say that the coroutine is expected to return a value of the type given in the template signature.
I think what you propose belongs to a fiber (you can take a look at github.com/olk/boost-fiber)

I also think that this is needed because there is this weird asymmetry with the first call from the constructor where the function inside of the coroutine may be called without arguments. But it has to return a value anyway.

you can call coroutine's ctor with or without arguments - depends on the code in your coro-fn

As for the noexcept thing: What I meant is that if your function is noexcept, you have to be certain that the coroutine will finish, because stack unwinding will terminate the program. You can try that with the stack unwinding example from the boost website: If you make the function in there be noexcept the program will call terminate().

OK - I got it. Of course the current solution is a workaround. I tried some APIs for stack unwinding (destructing objects allocated on the stack) - but the most generic solution was to throw an exception.
It is a limitation of boost.coroutine - the compiler vendors can easily unwind the stack using their internal API for destructing objects (if they go out of scope).

Florian Weimer

Mar 17, 2013, 3:17:30 PM

to std-pr...@isocpp.org

* Oliver Kowalke:

> On demand growing stacks can be used by C++ - I've already done it
> for boost.coroutine (>=1.54 - currently in boost-trunk). But this
> requires support by the compiler. As I wrote in the proposal I know
> only GCC (>=4.7) to supports segmented/split stacks.

Only on some architectures, and all system libraries are typically
compiled without support for split stacks. (Stack growth is still on
demand, of course, but happens in page size increments, and address
space for the maximum stack size is reserved beforehand, sometimes
causing problems on 32 bit architectures).

It is also difficult to phrase this requirement in a meaningful way.
This is quite similar to operator delete, which is not guaranteed to
free storage. (Actually, for each common implementation, there is a
sequence of operator new/delete calls which behaves as if some
operator delete calls never free storage for later re-use, because of
fragmentation.)

Florian Weimer

Mar 17, 2013, 3:18:56 PM

to std-pr...@isocpp.org

* Oliver Kowalke:

> this is the first version of the proposal to add coroutines to C++.
> The pdf can be found here: http://ok73.funpic.de/coroutine.pdf

What's the interaction with thread_local?

Oliver Kowalke

Mar 17, 2013, 5:24:34 PM

to std-pr...@isocpp.org

the proposal says nothing about threads - threads are out of scope

Florian Weimer

Mar 17, 2013, 7:55:25 PM

to std-pr...@isocpp.org

* Oliver Kowalke:

I don't think you've got a choice in this matter.

You still have to specify what coroutine resume/yield does to
thread_local variables. Coroutine-specific variables are difficult to
implement on some platforms. If coroutines just use the thread_local
variables of the executing thread, some currently valid compiler
optimizations are broken. For example, with coroutines, the address
of a thread_local variable can change during a function call, which is
currently impossible.

One way out of this is to specify that a coroutine must not be resumed
on a different thread. But this is awfully restrictive.

Oliver Kowalke

Mar 18, 2013, 3:25:46 AM

to std-pr...@isocpp.org

Am Montag, 18. März 2013 00:55:25 UTC+1 schrieb Florian Weimer:

> the proposal says nothing about threads - threads are out of scope

I don't think you've got a choice in this matter.

You still have to specify what coroutine resume/yield does to
thread_local variables. Coroutine-specific variables are difficult to
implement on some platforms. If coroutines just use the thread_local
variables of the executing thread, some currently valid compiler
optimizations are broken. For example, with coroutines, the address
of a thread_local variable can change during a function call, which is
currently impossible.

One way out of this is to specify that a coroutine must not be resumed
on a different thread. But this is awfully restrictive.

coroutines usually are a kind of extended function call - a coroutine preserves and restores
instruction pointer, stack pointer and some CPU registers defined by the ABI.
If you execute coroutines using thread-local storage then the code should work
as usual code (without coroutines).
If you migrate coroutines using thread-local storage between threads then the same restrictions
as for usual code should apply.

sorry - I still don't get your concerns.

Giovanni Piero Deretta

Mar 18, 2013, 7:19:43 AM

to std-pr...@isocpp.org

Some ABIs use a thread status register that is used to refer to thread local storage. Compilers do assume that the value of this register is preserved across function calls and optimize accordingly (for example hoisting thread local storage address computations). If a coroutine is moved from one thread to another, the assumption obviously does not hold. In fact it would be wrong for a coroutine to preserve the thread status register. This means that coroutines put a constraint on compiler optimizations. This constraint should be mentioned.

-- gpd

Oliver Kowalke

Mar 18, 2013, 8:10:09 AM

to std-pr...@isocpp.org

Am Montag, 18. März 2013 12:19:43 UTC+1 schrieb Giovanni Piero Deretta:

Some ABIs use a thread status register that is used to refer to thread local storage. Compilers do assume that the value of this register is preserved across function calls and optimize accordingly (for example hoisting thread local storage address computations). If a coroutine is moved from one thread to another, the assumption obviously does not hold. In fact it would be wrong for a coroutine to preserve the thread status register. This means that coroutines put a constraint on compiler optimizations. This constraint should be mentioned.

But this constraint is not special to coroutines it applies to ordinary code too?!

Giovanni Piero Deretta

Mar 18, 2013, 8:20:15 AM

to std-pr...@isocpp.org

No, ordinary (standard-compliant) code cannot switch to another thread, so the optimization is normally legal. These compilers do break code that uses, for example, POSIX swapcontext for context switching. Note that the VC++ compiler has an (obscure) option to disable optimizations like thread local storage address computation hoisting that would break with fibers, so it is a real, known problem.

-- gpd

Oliver Kowalke

Mar 18, 2013, 8:40:25 AM

to std-pr...@isocpp.org

Am Montag, 18. März 2013 13:20:15 UTC+1 schrieb Giovanni Piero Deretta:

No, ordinary (standard compliant) code cannot switch to another thread, so the optimization is normally legal.

OK, I see that thread_local can have namespace scope, be a static class data member, or be a local variable. The last one can be moved to other threads via stack swapping and should not be used in the context of coroutines if the coroutine is migrated to another thread.

Giovanni Piero Deretta

Mar 18, 2013, 9:09:47 AM

to std-pr...@isocpp.org

Unfortunately automatic thread locals are not the issue. Consider errno, usually a global, static per thread variable:
#include <ucontext.h>
#include <math.h>
#include <errno.h>

int print(const char*, int);

void foo(ucontext_t* self, ucontext_t* other)
{
    double x = sin(1.0);
    print("sin returned:", errno);
    swapcontext(self, other);
    x = cos(2.0);
    print("cos returned ", errno);

}

A recent GCC, with optimization enabled, compiles it down to this:

foo(ucontext*, ucontext*):
    movq    %rbx, -24(%rsp)
    movq    %rbp, -16(%rsp)
    movq    %rdi, %rbp
    movq    %r12, -8(%rsp)
    subq    $24, %rsp
    movq    %rsi, %r12
    movsd    .LC0(%rip), %xmm0
    call    sin
    call    __errno_location
    movl    (%rax), %esi
    movl    $.LC1, %edi
    movq    %rax, %rbx
    call    print(char const*, int)
    movq    %r12, %rsi
    movq    %rbp, %rdi
    call    swapcontext
    movsd    .LC2(%rip), %xmm0
    call    cos
    movl    (%rbx), %esi
    movq    8(%rsp), %rbp
    movl    $.LC3, %edi
    movq    (%rsp), %rbx
    movq    16(%rsp), %r12
    addq    $24, %rsp
    jmp    print(char const*, int)

Now, here errno doesn't directly use the thread pointer register (%fs on this architecture), but does an external call to the (internal) __errno_location, provided by the C runtime (as per ABI) to retrieve the address of errno, which returns the current errno address in %rax. This address is passed to print. Then swapcontext is called, which could potentially move the context to another thread, invalidating the errno address. Still, for the next call to print, __errno_location is not called again; the previously computed address of errno, no longer current for this thread, is reused.

As you can see, a simple and straightforward function already breaks when errno is used. I could also show an example using an explicit thread_local variable at namespace scope.

Giovanni Piero Deretta

Mar 18, 2013, 9:13:32 AM

to std-pr...@isocpp.org

The last sentence is obviously wrong (after writing the above sentence, I had changed the signature of print to make it more realistic). The address of errno is not passed to print, just the result of dereferencing it. The issue still remain though.

-- gpd

Lawrence Crowl

Mar 18, 2013, 4:24:06 PM

to std-pr...@isocpp.org

On 3/15/13, Oliver Kowalke <oliver....@gmail.com> wrote:
> this is the first version of the proposal to add coroutines to C++.
> The pdf can be found here: http://ok73.funpic.de/coroutine.pdf

Could you make a comparison to other coroutine libraries?
For example, the original task library has been abandoned. Why?
What does your proposal do differently?

http://www.softwarepreservation.org/projects/c_plus_plus/cfront/release_2.0/doc/LibraryManual.pdf

--
Lawrence Crowl

Oliver Kowalke

Mar 18, 2013, 5:20:17 PM

to std-pr...@isocpp.org

Am Montag, 18. März 2013 21:24:06 UTC+1 schrieb Lawrence Crowl:

On 3/15/13, Oliver Kowalke <oliver....@gmail.com> wrote:
> this is the first version of the proposal to add coroutines to C++.
> The pdf can be found here: http://ok73.funpic.de/coroutine.pdf

Could you make a comparison to other coroutine libraries?
For example, the original task library has been abandoned. Why?
What does your proposal do differently?

I wasn't aware of the original task library.

In my opinion std::coroutine<> should be used for escape-and-reenter of loops and recursive computations (a special/enhanced kind of control flow).
Of course you could use std::coroutine<> as a basis for cooperative multitasking but this involves scheduling
and synchronization primitives etc. For cooperative multitasking I suggest using a different kind of object - fibers (see boost.fiber).
A fiber has the same interface as std::thread (with mutex, condition-var, future, async() ...) but provides cooperative multitasking (no inversion of control).

The proposal does not introduce scheduling of coroutines - in my opinion this is not in the context of coroutines.

Giovanni already mentioned that it might be better to add fibers to the proposal too. The question to be answered is how coroutines and fibers interact with threads async/futures etc.

In short: fiber -> concurrency, coroutines -> enhanced control flow, thread -> parallelism ...

ai.a...@gmail.com

unread,

Mar 19, 2013, 1:13:20 AM3/19/13

to std-pr...@isocpp.org

Hi,

have you read http://www.crystalclearsoftware.com/soc/coroutine/coroutine/coroutine_thread.html ? Not only automatic thread-local variables but any kind of thread-local variable might become problematic if a coroutine migrates between threads.

I think that our only option is to specify "a coroutine should not migrate between multiple threads." The wording like "thread-local storage should not be used if a coroutine migrates between threads" eventually prohibits any use of functions and classes of which the implementation detail is not known to us. Functions and classes might use thread-local storage as their internal implementation detail. In theory, even an implementation of std::vector is allowed to use thread-local storage internally for any purpose. Therefore, as far as strict compliance with the specification is concerned, std::vector cannot be used in a coroutine if the coroutine migrates between threads. This is pretty much the same as the conclusion that a coroutine should not migrate between threads.

On the other hand, I completely agree that this is awfully restrictive. I am really interested in using coroutines with reactor/proactor frameworks. For the reason I mentioned above, only one thread can be put into the event loop in boost::asio::io_service if a coroutine is used as an event handler for Boost.Asio. If the restriction were solved, we could put multiple threads into the event loop and take full advantage of hardware concurrency.

However, I think that the most fundamental way to overcome this restriction is to extend the C++ memory models...

Giovanni Piero Deretta

unread,

Mar 19, 2013, 5:40:47 AM3/19/13

to std-pr...@isocpp.org

On Tue, Mar 19, 2013 at 5:13 AM, <ai.a...@gmail.com> wrote:

Hi,

On Monday, March 18, 2013 9:40:25 PM UTC+9, Oliver Kowalke wrote:
Am Montag, 18. März 2013 13:20:15 UTC+1 schrieb Giovanni Piero Deretta:

No, ordinary (standard compliant) code cannot switch to another thread, so the optimization is normally legal.

OK, I see that thread_local can have namespace scope, be a static class data member, or be a local variable. The last one can be moved to other threads via stack swapping and should not be used in the context of coroutines if the coroutine is migrated to another thread.

have you read http://www.crystalclearsoftware.com/soc/coroutine/coroutine/coroutine_thread.html ? Not only automatic thread local variables but any kinds of thread local variables might become problematic if a coroutine migrates between threads.

I think that our only option is to specify "a coroutine should not migrate between multiple threads."

No, the right solution is to require compilers to do the right thing: they must not break code by assuming that thread_local addresses are invariant across function calls. Now whether the paper should ask for coroutine-local storage or not is another story. The safe solution might be to require thread_local storage to be coroutine-local, but this might make coroutine creation much heavier. Note that there is already the issue of whether, with a pool-based async, thread_local storage should be task-local or survive the task. The issue with coroutines is not completely unlike that one (with the additional problem that lifetimes may overlap).

The wording like "thread-local storage should not be used if a coroutine migrates between threads" eventually prohibits any use of functions and classes of which the implementation detail is not known to us. Functions and classes might use thread-local storage as their internal implementation detail. In theory, even an implementation of std::vector is allowed to use thread-local storage internally for any purpose. Therefore, as far as strict compliance with the specification is concerned, std::vector cannot be used in a coroutine if the coroutine migrates between threads. This is pretty much the same as the conclusion that a coroutine should not migrate between threads.

exactly. For example the use of errno is pervasive, and it is thread local.

-- gpd

Lawrence Crowl

unread,

Mar 20, 2013, 4:22:03 PM3/20/13

to std-pr...@isocpp.org

On 3/19/13, Giovanni Piero Deretta <gpde...@gmail.com> wrote:
> On Mar 19, 2013 <ai.a...@gmail.com> wrote:
> > On March 18, 2013 9:40:25 PM UTC+9, Oliver Kowalke wrote:

If coroutines are only migrated between threads at explicit points
in the code, then coroutines can freely use thread_local variables
as caches on global state. For example, memory allocators may
use thread_local variables to keep a pool of memory that can be
managed without locks. What programs using coroutines cannot do
is use thread_local variables as persistent state distinct from
global state.

It would be possible to simply state this property as a global
restriction. That places a burden on the programmer using coroutines
to not call libraries that are sensitive to thread identity or that
switch coroutines. How is this burden any different from asking
libraries to be data-race free?

--
Lawrence Crowl

Florian Weimer

unread,

Mar 20, 2013, 6:17:30 PM3/20/13

to std-pr...@isocpp.org

* Oliver Kowalke:

> If you execute coroutines using thread-local storage then the code
> should work as usual code (without coroutines).

Right now, the programmer (or the compiler) can take the address of a
thread-local variable, call some functions, and be sure that after
those function calls, the pointer still points to the thread-local
object.

With coroutines, one of the called functions can yield,
suspending the call stack. Afterwards, execution can be resumed from
another thread. The pointer still points to the original thread-local
object, which now belongs to another thread, not the current one.

> If you migrate coroutines using thread-local storage between threads
> then the same restrictions as for usual code should apply.

The problem here is that migration to another thread is a non-local
property which can be triggered by callback functions, for instance.
The ability to compose things this way is the point of coroutines.
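
A minimal sketch of the stale-pointer hazard described above. It uses no coroutine library: a second std::thread stands in for "the thread the coroutine was resumed on", and the names tls_value, migration_report and simulate_migration are hypothetical, introduced only for illustration. It shows that a pointer taken to a thread-local object on one thread still refers to that first thread's object when it is used from another thread.

```cpp
#include <thread>

// Hypothetical thread-local variable standing in for errno or X::y.
thread_local int tls_value = 0;

struct migration_report {
    bool addresses_differ;    // cached pointer != second thread's &tls_value
    int first_thread_value;   // first thread's object after the write
    int second_thread_value;  // what the second thread saw in its own object
};

// Takes the address of tls_value on the calling thread, then uses that
// cached address from a second thread (assumption: this models a coroutine
// that was suspended on one thread and resumed on another).
migration_report simulate_migration() {
    tls_value = 0;
    int* cached = &tls_value;  // address computed before the "switch"
    migration_report r{false, 0, 0};
    std::thread other([&r, cached] {
        r.addresses_differ = (cached != &tls_value);
        *cached = 42;                       // lands in the first thread's object
        r.second_thread_value = tls_value;  // this thread's own object is untouched
    });
    other.join();
    r.first_thread_value = tls_value;
    return r;
}
```

The write through the cached pointer changes the object of the thread that computed the address, not the object of the thread that is currently running, which is the behaviour an optimizer would produce if it reused a previously computed thread_local address after a context switch.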

Giovanni Piero Deretta

unread,

Mar 20, 2013, 6:20:35 PM3/20/13

to std-pr...@isocpp.org

that's a perfectly reasonable restriction, i.e. do not assume that thread_local values are persistent across function calls. My point, as shown by my example, is that current compiler optimizers (as opposed to user code) assume that thread_local *addresses* are invariant across function calls. No matter how careful a programmer is, his code can still be broken by an overly aggressive optimizer. A coroutine proposal must explicitly prohibit such an optimization.

-- gpd

Oliver Kowalke

unread,

Mar 21, 2013, 3:02:09 AM3/21/13

to std-pr...@isocpp.org

I was thinking of code that uses thread_local like this, and I assume it's legal:

struct Y {...};
struct Z : public Y {...};

struct X
{
   thread_local static Y * y;
   static Y* instance() {
      return y;
   }
   static Y* set( Y * y_) {
      Y * tmp = y;
      y = y_;
      return tmp;
   }
};

void coro_fn( coroutine<> & c) {
   X::instance()->abc(); // call Y installed at this thread
....
}

void thread_fn() {
   Y * old = X::set( new Z() );
   coroutine<> c( coro_fn);
   ...
}

thread_fn() is executed by each thread installing its own Y as thread_local static member of X.
each coroutine can access the instance of Y installed on the thread it is running on - if the coroutine
is migrated to another thread (executing thread_fn() ) then it should get another Y - or not?!

Oliver Kowalke

unread,

Mar 21, 2013, 3:04:32 AM3/21/13

to std-pr...@isocpp.org

thread_local Y * X::y = 0; // was missing

X == scheduler
Y == abstract interface of scheduling algorithm
Z == concrete implementation of Y

Giovanni Piero Deretta

unread,

Mar 21, 2013, 5:56:35 AM3/21/13

to std-pr...@isocpp.org

It should, but there is no guarantee, although a compiler is very
unlikely to break this specific example. This is likely to break
though:

void coro_fn( coroutine<> & c) {
   X::instance()->abc(); // call Y installed at this thread

   c(); // switches to a new thread
   ....

   X::instance()->abc(); // may call the Y installed at this thread or
                         // the previous one, depending on the phase of the moon.
}

Oliver Kowalke

unread,

Mar 21, 2013, 6:04:10 AM3/21/13

to std-pr...@isocpp.org

2013/3/21 Giovanni Piero Deretta <gpde...@gmail.com>

X::instance()->abc(); // may call the Y installed at this thread or
the previous one, depending on the phase of the moon.

why should it break? X::instance() accesses the thread_local static class member 'y'. my understanding was that if I access 'y' I get the address stored by the current thread (in its TLS).
if the code (not necessarily a coroutine) is migrated to another thread 'y' returns another value.
do you have a more detailed explanation why it could break?

Giovanni Piero Deretta

unread,

Mar 21, 2013, 6:17:27 AM3/21/13

to std-pr...@isocpp.org

I showed in a previous post. To make it short, the compiler, instead
of recomputing the address of X::y after the call to c(), can reuse
the previously computed value, as normally its address cannot change.
To be more likely to see the issue, you might need to change X::y from
a pointer to an actual instance of Y. But even the example as-is can
still break.

-- gpd

Oliver Kowalke

unread,

Mar 21, 2013, 6:19:55 AM3/21/13

to std-pr...@isocpp.org

2013/3/21 Giovanni Piero Deretta <gpde...@gmail.com>

I showed in a previous post. To make it short, the compiler, instead

of recomputing the address of X::y after the call to c(), can reuse
the previously computed value, as normally its address cannot change.

does the standard explicitly allow this optimization for thread_local variables?

Giovanni Piero Deretta

unread,

Mar 21, 2013, 6:25:51 AM3/21/13

to std-pr...@isocpp.org

Implicitly, by the as-if rule. A conforming program can't tell as
normally a flow of execution cannot move to another thread. Arguably,
a compiler that claims POSIX compatibility shouldn't do this
optimization, though, as POSIX has (or had) swapcontext.

-- gpd

denis...@bredelet.com

unread,

Mar 21, 2013, 5:53:52 PM3/21/13

to std-pr...@isocpp.org

On Friday, March 15, 2013 5:48:17 PM UTC, Oliver Kowalke wrote:

Hi,

this is the first version of the proposal to add coroutines to C++.
The pdf can be found here: http://ok73.funpic.de/coroutine.pdf

so long,
Oliver

This is a bad proposal, do not do it like that!

Julien Nitard

unread,

Mar 21, 2013, 9:22:34 PM3/21/13

to std-pr...@isocpp.org

This is a bad proposal, do not do it like that!

You may want to consider throwing in a couple arguments ...

denis...@bredelet.com

unread,

Mar 22, 2013, 4:45:15 AM3/22/13

to std-pr...@isocpp.org

I would be better off writing a new proposal, which I fully intend to do.
Let me just say that I disagree with:

[quote]This design decision makes the code using std::coroutine<> let look more symmetric.[/quote]

More symmetric does not make it more readable. Yuck. Do we really want to make C++11 more confusing?

Oliver Kowalke

unread,

Mar 22, 2013, 5:00:37 AM3/22/13

to std-pr...@isocpp.org

2013/3/22 <denis...@bredelet.com>

I tried to express with the sentence you mention that you use a coroutine<> to jump into the coro-function and you use a coroutine<> to jump out of it.

But there are some ideas that bidirectional coroutines should be split into unidirectional ones (push_coroutine, pull_coroutine), something like:

pull_coroutine< int > c(
        [&]( push_coroutine< int > & c) {
            int first = 1, second = 1;
            for ( int i = 0; i < 10; ++i)
            {
                int third = first + second;
                first = second;
                second = third;
                c( third);
            }
        });

for ( auto i : c)
std::cout << i << " ";

and vice versa

(pull_coroutine has only a pull_coroutine::operator()() + pull_coroutine::get(); push_coroutine has a push_coroutine::operator()( T) and no get() )
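
To make concrete which values the body above pushes, here is a small sketch in plain C++ with no coroutine library. The names int_sink, fibonacci_body and collect_fibonacci are hypothetical; int_sink is only a stand-in for push_coroutine<int>, under the assumption that all the body needs from it is operator()(int).

```cpp
#include <functional>
#include <vector>

// Stand-in for push_coroutine<int>: anything callable with an int.
using int_sink = std::function<void(int)>;

// Same loop as the coroutine-function in the post, written against the sink.
void fibonacci_body(const int_sink& push) {
    int first = 1, second = 1;
    for (int i = 0; i < 10; ++i) {
        int third = first + second;
        first = second;
        second = third;
        push(third);  // with a real push_coroutine this would switch context
    }
}

// Collects everything the body pushes, in order.
std::vector<int> collect_fibonacci() {
    std::vector<int> out;
    fibonacci_body([&out](int v) { out.push_back(v); });
    return out;
}
```

Unlike the pull_coroutine version, the consumer here does not drive the loop: the body runs to completion and calls back for every value. Removing that inversion of control, so that the range-based for loop can pull one value at a time, is what the coroutine pair is for.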

Oliver Kowalke

unread,

Mar 22, 2013, 5:16:39 AM3/22/13

to std-pr...@isocpp.org

I'm currently working to support checkpointing of coroutines - I think it would be useful too?!
Should I add it to the proposal?

Lawrence Crowl

unread,

Mar 22, 2013, 3:10:53 PM3/22/13

to std-pr...@isocpp.org

On 3/22/13, denis...@bredelet.com <denis...@bredelet.com> wrote:
> I would be better off writing a new proposal, which I fully intend to do.

Having it available before the April meeting would be most helpful.

--
Lawrence Crowl

Lawrence Crowl

unread,

Mar 22, 2013, 3:11:55 PM3/22/13

to std-pr...@isocpp.org

On 3/22/13, Oliver Kowalke <oliver....@gmail.com> wrote:
> I'm currently working to support checkpointing of coroutines -
> I think it would be useful too?! Should I add it to the proposal?

Useful for what purposes? As a component in checkpointing an
entire process? What other tools are needed to make that happen?

--
Lawrence Crowl

Oliver Kowalke

unread,

Mar 22, 2013, 3:44:52 PM3/22/13

to std-pr...@isocpp.org

2013/3/22 Lawrence Crowl <cr...@googlers.com>

I mean checkpointing a coroutine, also known as multi-shot coroutine.
Something like this:

checkpoint cp;
std::coroutine< int() > c(
   [&](std::coroutine< void( int) > & c) {
      int i = 0;
      cp = c.checkpoint(); // coroutine checkpoint
      std::cout << "ABC" << std::endl;
      while ( true) {
         c( ++i);
      }
   });

for ( int x = 0; x<3; ++x) {
   std::cout << c().get() << std::endl;
}

c.rollback( cp);

for ( int x = 0; x<5; ++x) {
   std::cout << c().get() << std::endl;
}

output should be:
1
2
3
ABC
1
2
3
4
5
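
The numeric part of that output can be modelled without any coroutine machinery. The sketch below is only an analogy under a strong assumption: the whole state of the coroutine is the single counter i, so a "checkpoint" is a copy of that state and a "rollback" is an assignment. The names counter_state and run_with_rollback are hypothetical, and the "ABC" line is deliberately left out.

```cpp
#include <vector>

// Hypothetical explicit-state stand-in for the counting coroutine.
struct counter_state {
    int i = 0;
    int next() { return ++i; }  // corresponds to c( ++i) in the coroutine body
};

// Replays the sequence of the example: three values, rollback, five values.
std::vector<int> run_with_rollback() {
    std::vector<int> seen;
    counter_state c;
    counter_state cp = c;  // "checkpoint": copy of the state while i == 0
    for (int x = 0; x < 3; ++x) seen.push_back(c.next());
    c = cp;                // "rollback": restore the saved state
    for (int x = 0; x < 5; ++x) seen.push_back(c.next());
    return seen;
}
```

A real checkpoint would have to capture the stack content and control block instead of one integer, which is where the difficulties with stack addresses and RAII come from.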

Lawrence Crowl

unread,

Mar 22, 2013, 8:09:20 PM3/22/13

to std-pr...@isocpp.org

What is the use case for which this facility would be useful?

Well motivated proposals have a much easier time making it through
the process.

--
Lawrence Crowl

denis...@bredelet.com

unread,

Mar 23, 2013, 11:19:50 AM3/23/13

to std-pr...@isocpp.org

Hi Oliver,

Right, that looks much better. Do you want to add it as an alternative in your proposal?

Oliver Kowalke

unread,

Mar 23, 2013, 12:39:20 PM3/23/13

to std-pr...@isocpp.org

2013/3/23 Lawrence Crowl <cr...@googlers.com>

What is the use case for which this facility would be useful?

Well motivated proposals have a much easier time making it through
the process.

checkpointing could be used:

- speculative execution: for instance, optimistic algorithms for simulations can be rolled back to a checkpoint if desired
-> I got some requests from game developers to provide it in boost.coroutine (I guess it would be used for state machines)

- copy of coroutine:
copying a stateful coroutine is not possible if you think of copying the associated stack. The copy operation would require copying the control block (stack pointer,
instruction pointer, some CPU registers) and the content of the stack. If your code uses the address of stack-local variables you get undefined behaviour in the
target coroutine. That is because the address points to an object on another stack, not on the stack associated with the target coroutine.
Therefore I wrote that coroutines should be move-only.
I think with checkpointing you can provide something like 'copying' a coroutine. If you require that coroutines can act only concurrently but not in parallel (== copies of
a coroutine must not run in different threads) we could use checkpoints to load and store the stack content + control block for a coroutine and mimic something like
copies of a coroutine. But this needs some thought (for instance resource management with RAII etc. might present some pitfalls).
-> to copy a coroutine was also requested by game developers
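
The problem with addresses of stack-local variables can be illustrated with a small analogy. The struct frame below is a hypothetical stand-in for a piece of coroutine stack that stores a pointer to one of its own locals; copying it member by member, as a raw stack copy would, leaves the pointer in the copy aimed at the original.

```cpp
// Hypothetical stand-in for a stack frame holding a pointer to its own local.
struct frame {
    int local = 0;
    int* ptr_to_local = &local;  // like taking the address of a stack variable
};

struct copy_report {
    bool copy_points_into_original;
    bool copy_points_into_itself;
};

// Copies a frame member-wise and reports where the copied pointer ends up.
copy_report copy_frame_memberwise() {
    frame original;
    frame copy = original;  // copies the pointer value, does not re-seat it
    return { copy.ptr_to_local == &original.local,
             copy.ptr_to_local == &copy.local };
}
```

In this analogy the copy still works as long as the original is alive. With a real stack copy the original stack may be modified or released independently, so the pointer in the copy can refer to stale or freed memory.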

Oliver Kowalke

unread,

Mar 23, 2013, 12:57:45 PM3/23/13

to std-pr...@isocpp.org

2013/3/23 <denis...@bredelet.com>

Right, that looks much better. Do you want to add it as an alternative in your proposal?

Of course if it's the outcome of this thread.

I believe that C++ needs something like coroutines - at least I know that some C++ applications use coroutines (for instance 'Simple Life').
My intention for this proposal was to start a discussion in this forum about what C++ coroutines should look like.
I chose boost.coroutine as the starting point because it is working C++ code and was already reviewed by the boost community.
But this does not mean that the proposal is immutable - I still hope to get some ideas from other C++ users to improve the proposal (Giovanni already suggested the unidirectional coroutines).
For instance I think we need some ideas about checkpointing/copying coroutines, or what about delimited continuations/sub-coroutines etc.

Ville Voutilainen

unread,

Mar 23, 2013, 1:01:54 PM3/23/13

to std-pr...@isocpp.org

On 23 March 2013 18:57, Oliver Kowalke <oliver....@gmail.com> wrote:
> 2013/3/23 <denis...@bredelet.com>
>> Right, that looks much better. Do you want to add it as an alternative in
>> your proposal?
> Of course if it's the outcome of this thread.

I recommend being careful about forming a proposal based on the feedback on
this forum. You will get the opinion of just anyone, whether sane or not.

Giovanni Piero Deretta

unread,

Mar 23, 2013, 7:34:15 PM3/23/13

to std-pr...@isocpp.org

On Sat, Mar 23, 2013 at 5:39 PM, Oliver Kowalke
<oliver....@gmail.com> wrote:
> 2013/3/23 Lawrence Crowl <cr...@googlers.com>
>>
>> What is the use case for which this facility would be useful?
>>
>> Well motivated proposals have a much easier time making it through
>> the process.
>
>
> checkpointing could be used:
>
> - speculativ execution: for instance optimistic algorithms for simulations
> can be rolled back to a checkpoint if desired

An example of this is the Amb special form in scheme (implemented on
top of continuations) used for backtracking
http://c2.com/cgi/wiki?AmbSpecialForm .

Checkpointable coroutines are, in general, as powerful as full continuations.

-- gpd

Giovanni Piero Deretta

unread,

Mar 23, 2013, 7:38:11 PM3/23/13

to std-pr...@isocpp.org

On Sat, Mar 23, 2013 at 1:09 AM, Lawrence Crowl <cr...@googlers.com> wrote:
> On 3/22/13, Oliver Kowalke <oliver....@gmail.com> wrote:
>> 2013/3/22 Lawrence Crowl <cr...@googlers.com>
>> > On 3/22/13, Oliver Kowalke <oliver....@gmail.com> wrote:
>> > > I'm currently working to support checkpointing of coroutines -
>> > > I think it would be useful too?! Should I add it to the proposal?
>> >
>> > Useful for what purposes? As a component in checkpointing an
>> > entire process? What other tools are needed to make that happen?
>>
>> I mean checkpointing a coroutine, also known as multi-shot coroutine.
>> Something like this:

>> [snip]

>
> What is the use case for which this facility would be useful?
>
> Well motivated proposals have a much easier time making it through
> the process.
>

A classic use case for multi-shot coroutines/continuations is
handling the back button in web applications. At every page the
server checkpoints/captures the state of the user interaction in a
continuation, and when the user clicks the back button it replays a
previous continuation.

-- gpd

Oliver Kowalke

unread,

Mar 24, 2013, 4:47:14 AM3/24/13

to std-pr...@isocpp.org

2013/3/23 <denis...@bredelet.com>

Right, that looks much better. Do you want to add it as an alternative in your proposal?

added as alternative design to proposal

best regards,
Oliver

Michael Bruck

unread,

Mar 24, 2013, 9:26:04 AM3/24/13

to std-pr...@isocpp.org

Isn't await (N3564) going to deliver mostly the same functionality (but putting the "local" variables into a compiler-generated object)?

Michael

Oliver Kowalke

unread,

Mar 24, 2013, 10:08:34 AM3/24/13

to std-pr...@isocpp.org

2013/3/24 Michael Bruck <bruck....@gmail.com>

Isn't await (N3564) going to deliver mostly the same functionality (but putting the "local" variables into a compiler-generated object).

after a brief look at N3564 it seems not to be the same because I believe it does not provide some kind of escape-and-reenter of loops or
recursive computations. After reading the code snippets I think you can't solve the same fringe problem with 'await' as is done with coroutines.
As N3564 describes 'After suspending, a resumable function may be resumed by the scheduling logic of the runtime...' - this seems not to be
equivalent to cooperative scheduling.

Lawrence Crowl

unread,

Mar 24, 2013, 8:58:29 PM3/24/13

to std-pr...@isocpp.org

I would like to see this discussion in a revision of the paper.

--
Lawrence Crowl

Lawrence Crowl

unread,

Mar 24, 2013, 9:01:40 PM3/24/13

to std-pr...@isocpp.org

We do need to be careful to identify common parts of proposals.
I view the task of the committee as finding a fusion/synthesis of
the various proposals so that we deliver maximum functionality for
minimum complexity.

So, I make a plea to all paper reviewers to look at how the proposals
interact, and how they might be merged into cleaner functionality.

--
Lawrence Crowl

Michael Bruck

unread,

Mar 24, 2013, 10:25:46 PM3/24/13

to std-pr...@isocpp.org

On Sun, Mar 24, 2013 at 3:08 PM, Oliver Kowalke <oliver....@gmail.com> wrote:

it does not provide some kind of escape-and-reenter of loops

If I understand this correctly you mean something like this:

std::coroutine<int()> c(
    [&](std::coroutine<void(int)> & c) {
        for (int i = 0; i < 10; ++i)
            c(i);
    });

int ret = 0;
for (; c; c())
    ret += c.get();

With await you should be able to write a helper class multi_return<T> such that:

future<void> corot(multi_return<int> & mr) resumable
{
    for (int i = 0; i < 10; ++i)
        await mr.set(i);
    mr.terminate();
}

multi_return<int> mr;
corot(mr);

int ret = 0;
while (mr.running())
    ret += mr.get();

The set() and get() methods on multi_return<T> can control each other via some gadgets from the #include <future>. The key here is that due to await compiler magic corot() returns as soon as it encounters a blocking await and is afterwards restarted at the await once the blocking ended. To implement the multi-return functionality you use the await to pause the function once you found the first return value and resume it once this was consumed via the get() method.

Michael

minchul park

unread,

Mar 25, 2013, 1:13:36 AM3/25/13

to std-pr...@isocpp.org

No. Oliver's point seems to be something similar to the difference between a generator and a coroutine. See the pseudocode below:

# if python supports coroutine...
def iterate_tree(coro, node):
    if node:
        iterate_tree(coro, node.left)
        coro(node.value)
        iterate_tree(coro, node.right)

# with current python generator
def iterate_tree(node):
    if node:
        for child in iterate_tree(node.left): # implicitly creates generator object
            yield child
        yield node.value
        for child in iterate_tree(node.right): # also creates
            yield child

It's a fairly tedious job to implement something like the above efficiently with generators.

Adapting N3564 to this problem makes it even worse: you need to create a future object for every item, even though the purpose of that proposal is to support asynchronous jobs rather than to enhance control flow.

I think most of N3564 can be covered by a solution built on coroutines (or a context-switching primitive), without introducing additional language complexity.

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/?hl=en.

--

Min-chul Park (summer...@gmail.com / http://summerlight.tistory.com)

Denis Bredelet

unread,

Mar 25, 2013, 3:06:38 AM3/25/13

to std-pr...@isocpp.org

On Sunday, March 24, 2013 8:47:14 AM UTC, Oliver Kowalke wrote:

2013/3/23 <denis...@bredelet.com>

Right, that looks much better. Do you want to add it as an alternative in your proposal?

added as alternative design to proposal

Nice.

Oliver Kowalke

unread,

Mar 25, 2013, 3:17:17 AM3/25/13

to std-pr...@isocpp.org

2013/3/25 Denis Bredelet <denis...@bredelet.com>

My concern with:

std::pull_coroutine <int> c( [&](std::push_coroutine <int> & c) {

is that the type of the push_coroutine is not indicated in the pull_coroutine, shouldn't it be std::pull_coroutine <int(int)>?

push_coroutine and pull_coroutine take the type transferred via the context switch as template arguments not the Signature.

pull_coroutine< int > == you get an int from the other context (you pull it from)
- int pull_coroutine<int>::get()
- pull_coroutine<int>::operator()() // no int as argument!

push_coroutine< int > == you transfer an int to the other context (you push it to)
- push_coroutine<int>::operator()( int)
// no push_coroutine::get() !

pull_coroutine and push_coroutine occur always as pair

Oliver Kowalke

unread,

Mar 25, 2013, 3:23:19 AM3/25/13

to std-pr...@isocpp.org

2013/3/25 Denis Bredelet <denis...@bredelet.com>

My concern with:

std::pull_coroutine <int> c( [&](std::push_coroutine <int> & c) {

push_coroutine<int> c([&](pull_coroutine<int> & c) { - is also possible

is that the type of the push_coroutine is not indicated in the pull_coroutine, shouldn't it be std::pull_coroutine <int(int)>?

I'm not sure because pull_coroutine <int(int)> would imply a function of signature int(int) (might confuse users?)

Beman Dawes

unread,

Mar 25, 2013, 6:38:12 PM3/25/13

to std-pr...@isocpp.org

No. The committee won't reject such a proposal because it is missing a
particular feature. They will simply ask that it be added if they care
about it.

OTOH, the committee will reject the proposal if they don't understand
what coroutines are and what the motivation is for adding them to the
C++ standard or a C++ standard technical specification.

If I were you I'd spend all the time between now and the Bristol
meeting working on your introductory material, adding references to
both current coroutine implementations and the general literature. Be
sure to dig out references to Knuth and other pioneers. Remember that
committee members come from a wide range of backgrounds. Some may be
deeply familiar with coroutines, some may have never heard of them.

And whatever else you do, submit the paper. There is nothing worse
than promising a paper on a public list like this one, and then not
delivering it. Since it is a late paper, you can just send it to me
directly. And don't wait until the last minute. If you improve the
paper later, just send that along.

HTH,

--Beman

Michael Bruck

unread,

Mar 27, 2013, 4:21:12 AM3/27/13

to std-pr...@isocpp.org

Here is the entire thing as working code in C# which already has the await mechanism. The iterate_tree function works exactly the same as with the coroutines.

Obviously the compiler needs to create an object or a side-stack to hold the state of these functions while they are paused, but that is no different from the coroutines AFAICT. The Task object that I use here to control the process would be replaced with a future. The future class doesn't do much more than store pointers to the code and the data of the resumable function, so the added complexity should be small if the compiler can eliminate any unused features included in std::future.

using System;
using System.Threading.Tasks;

namespace corot_test
{
    class multi_return<T>
    {
        T retval;
        public Task resumer;
        static Action do_nothing = () => { };

        public async Task put(T _retval)
        {
            retval = _retval;
            resumer = new Task(do_nothing);
            await resumer;
            resumer = null;
        }

        public void work()
        {
            resumer.RunSynchronously();
        }

        public T get()
        {
            return retval;
        }
    }

    class node<T>
    {
        public node(node<T> _left, T _value, node<T> _right) { left = _left; right = _right; value = _value; }
        public node(T _value) { left = null; right = null; value = _value; }

        public node<T> left;
        public node<T> right;
        public T value;

        public async Task iterate_tree(multi_return<T> mr)
        {
            if (left != null)
                await left.iterate_tree(mr);
            await mr.put(value);
            if (right != null)
                await right.iterate_tree(mr);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var t = new node<int>(new node<int>(3), 5, new node<int>(new node<int>(7), 9, new node<int>(null, 11, new node<int>(13))));
            var mr = new multi_return<int>();

            t.iterate_tree(mr); // run till the first mr.put

            for (; mr.resumer != null; mr.work())
            {
                Console.WriteLine("Next Result {0}", mr.get());
            }

            Console.WriteLine("Done.");
        }
    }
}

output:

Next Result 3
Next Result 5
Next Result 7
Next Result 9
Next Result 11
Next Result 13
Done.

Michael Bruck

unread,

Mar 27, 2013, 4:29:22 AM3/27/13

to std-pr...@isocpp.org

On Sun, Mar 24, 2013 at 3:08 PM, Oliver Kowalke <oliver....@gmail.com> wrote:

As N3564 describes 'After suspending, a resumable function may be resumed by the scheduling logic of the runtime...' - seams not to be
equivalent cooperative scheduling.

Sorry, I didn't reply to this part of the question before. The scheduling mechanisms are described in N3558. The scheduling depends on the future that you are waiting for (see the discussion of .then in that document). So this future would just run the rest of the function synchronously to simulate the coroutine behaviour. The C# sample in my other mail to Minchul does exactly that, the entire program is single-threaded.

Michael

Giovanni Piero Deretta

unread,

Mar 27, 2013, 7:37:50 AM3/27/13

to std-pr...@isocpp.org

This means that yield is O(N) where N is the average depth of the
tree, instead of O(1), which changes the complexity of the
solution. Additionally you have to change the visiting algorithm to
yield at every level; you can't reuse an existing enumeration
interface that takes a visiting function, which is a significant
limitation. Finally you have to heap allocate both the future control
block and the generator activation frame for every depth level.

The two proposals are not equivalent.
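
The complexity argument can be put into numbers with a deliberately simplified cost model. The assumption, which is mine and not part of either proposal, is that in the nested await/generator design a value produced at depth d has to pass through d activation frames, while a stackful coroutine needs one context switch per value regardless of depth. The names node, forwarding_steps, direct_switches and sample_tree_costs are hypothetical; the sample tree is the one built in the C# program.

```cpp
#include <utility>

struct node {
    int value;
    const node* left;
    const node* right;
};

// Nested design: a value at depth d is handed through d frames.
int forwarding_steps(const node* n, int depth = 1) {
    if (!n) return 0;
    return depth + forwarding_steps(n->left, depth + 1)
                 + forwarding_steps(n->right, depth + 1);
}

// Stackful coroutine: one context switch per value.
int direct_switches(const node* n) {
    if (!n) return 0;
    return 1 + direct_switches(n->left) + direct_switches(n->right);
}

// Tree from the C# example: 5 at the root, 3 on the left,
// 9 on the right with children 7 and 11, and 13 to the right of 11.
std::pair<int, int> sample_tree_costs() {
    const node n3{3, nullptr, nullptr};
    const node n7{7, nullptr, nullptr};
    const node n13{13, nullptr, nullptr};
    const node n11{11, nullptr, &n13};
    const node n9{9, &n7, &n11};
    const node n5{5, &n3, &n9};
    return { forwarding_steps(&n5), direct_switches(&n5) };
}
```

For this six-node tree the model gives 15 forwarding steps against 6 direct switches; the gap grows with the depth of the tree, which is the O(N) versus O(1) per-yield difference mentioned above.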

-- gpd

minchul park

unread,

Mar 27, 2013, 7:49:58 AM3/27/13

to std-pr...@isocpp.org

On Wed, Mar 27, 2013 at 5:21 PM, Michael Bruck <bruck....@gmail.com> wrote:

Here is the entire thing as working code in C# which already has the await mechanism. The iterate_tree function works exactly the same as with the coroutines.

Your code "emulates" its functionality. In the behind of scene, it works in completely different way. async/await is based on code transformation rather than user-level execution context management, right? Then it's completely different. Imagine that your OS can't schedule user thread because it is running a function that is not marked as "resumable". Without managing full execution context, you can't do that.

Providing a tool for managing execution context is essential in a modern concurrent programming environment. Though it's hard to implement as a first-class continuation, many modern languages support it in a limited form (e.g. goroutines in Go, tasks in Rust) in one way or another, and even for a language like Python, some implementations (greenlet, Stackless, PyPy ...) try to support coroutines.

Obviously the compiler needs to create an object or a side-stack to hold the state of these functions while they are paused, but that is no different from the coroutines AFAICT.

Nope. Tree iteration with a coroutine never needs to allocate any object solely to control its execution flow. See the code below:

def iterate_tree(coro, node):
    if node:  # null check
        iterate_tree(coro, node.left)   # normal function call
        coro(node.value)                # context switch
        iterate_tree(coro, node.right)  # normal function call

Only one allocation, for the coroutine itself before iteration starts, is required.
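
The same traversal can be sketched in C++ under some stated assumptions. `sink` stands in for the coroutine object; in this sketch it is an ordinary callable, so no context switch actually happens. The point is only that the recursion uses the normal call stack and allocates nothing per level. `TreeNode`, `in_order` and `example` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <vector>

struct TreeNode {
    int value;
    const TreeNode* left;
    const TreeNode* right;
};

// Plain recursive in-order traversal. With a stackful coroutine, `sink`
// would be the coroutine and the call to it would be the context switch.
template <typename Sink>
void iterate_tree(Sink& sink, const TreeNode* node) {
    if (node) {                             // null check
        iterate_tree(sink, node->left);     // normal function call
        sink(node->value);                  // would be the context switch
        iterate_tree(sink, node->right);    // normal function call
    }
}

// Collects the visited values; the lambda plays the role of the consumer.
std::vector<int> in_order(const TreeNode* root) {
    std::vector<int> out;
    auto sink = [&out](int v) { out.push_back(v); };
    iterate_tree(sink, root);
    return out;
}

// Three-node example tree:   2
//                           / \
//                          1   3
std::vector<int> example() {
    const TreeNode a{1, nullptr, nullptr};
    const TreeNode c{3, nullptr, nullptr};
    const TreeNode b{2, &a, &c};
    std::vector<int> values = in_order(&b);
    assert(values.size() == 3);
    return values;
}
```

Because `iterate_tree` accepts any callable, an existing visitor-style enumeration interface could be reused unchanged if a coroutine were passed in as the visitor.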

The Task object that I use here to control the process would be replaced with a future. The future class doesn't do much more than store pointers to the code and the data of the resumable function, so the added complexity should be small if the compiler can eliminate any unused features included in std::future.

The data of a resumable function can be larger than std::future itself, which leads to dynamic allocation. Since memory allocation in C++ is more expensive than in C#, it can't easily be neglected.

Oliver Kowalke

Apr 6, 2013, 5:41:08 AM

to std-pr...@isocpp.org

2013/3/25 Beman Dawes <bda...@acm.org>

And whatever else you do, submit the paper. There is nothing worse
than promising a paper on a public list like this one, and then not
delivering it. Since it is a late paper, you can just send it to me
directly. And don't wait until the last minute. If you improve the
paper later, just send that along.

done (additional email sent to you)

corn...@google.com

Apr 8, 2013, 10:59:13 AM

to std-pr...@isocpp.org

On Friday, March 22, 2013 10:16:39 AM UTC+1, Oliver Kowalke wrote:

I'm currently working to support checkpointing of coroutines - I think it would be useful too?!
Should I add it to the proposal?

No, absolutely not. While checkpointing might be useful, I think it is far too complicated and dangerous to be added to the initial proposal. Adding checkpointing means you have to define what this means:

checkpoint cp;
std::coroutine< int() > c(
   [&](std::coroutine< void( int) > & c) {
      int i = 0;
      cp = c.checkpoint(); // coroutine checkpoint

      std::vector<int> v(5, 5);

      while ( true) {
         c( ++i);
      }
   });

c().get();

c.rollback( cp);

What happens to the vector on rollback? Is it destroyed and reconstructed? Is its initialization skipped the second time around? Is this just undefined? Is it the same behavior as setjmp/longjmp exhibit? (I think setjmp/longjmp would be undefined in this case according to 18.10p4.)

In general, is checkpoint()/rollback() equivalent to setjmp and longjmp within the context of the coroutine, i.e. it only changes execution position, not values? Or would values be changed in some way?

These seem to me to be non-trivial issues that would only muddle the proposal you have now. And I see no reason why checkpointing couldn't be added later in a separate proposal.

Sebastian

Oliver Kowalke

Apr 8, 2013, 11:09:50 AM

to std-pr...@isocpp.org

2013/4/8 <corn...@google.com>

On Friday, March 22, 2013 10:16:39 AM UTC+1, Oliver Kowalke wrote:
I'm currently working to support checkpointing of coroutines - I think it would be useful too?!
Should I add it to the proposal?

No, absolutely not. While checkpointing might be useful, I think it is far too complicated and dangerous to be added to the initial proposal. Adding checkpointing means you have to define what this means:

checkpoint cp;
std::coroutine< int() > c(
   [&](std::coroutine< void( int) > & c) {
      int i = 0;
      cp = c.checkpoint(); // coroutine checkpoint

      std::vector<int> v(5, 5);

      while ( true) {
         c( ++i);
      }
   });

c().get();

c.rollback( cp);

What happens to the vector on rollback? Is it destroyed and reconstructed?

destroyed, and after resuming the coroutine it is constructed again (because the checkpoint was set before it),

Is its initialization skipped the second time around?

no

Is this just undefined?

no

Is it the same behavior as setjmp/longjmp exhibit? (I think setjmp/longjmp would be undefined in this case according to 18.10p4.)

I don't use setjmp/longjmp.

In general, is checkpoint()/rollback() equivalent to setjmp and longjmp within the context of the coroutine, i.e. it only changes execution position, not values? Or would values be changed in some way?

Code between the checkpoint and the last instruction executed in the coroutine-fn is unwound if a rollback to the checkpoint is done (in your case std::vector< int > gets destructed).

If the coroutine is then resumed, it executes the instructions following the checkpoint - in your case std::vector< int > is constructed.
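
The destroy-then-reconstruct behaviour described here can be pictured with a small emulation. This is only a sketch: it uses exception unwinding and a loop to mimic "unwind back to the checkpoint, then run the following code again", and it is not how boost.coroutine or the proposal implements checkpoints. `Trace`, `TrackedVector`, `RollbackRequest` and `run_with_rollbacks` are hypothetical names. The thread does not say what happens to variables declared before the checkpoint (such as `i`), so the sketch does not model that.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Records how often the tracked object was constructed and destroyed.
struct Trace {
    int constructed = 0;
    int destroyed = 0;
};

// Wraps the std::vector<int> v(5, 5) from the example and logs its lifetime.
class TrackedVector {
public:
    explicit TrackedVector(Trace& trace) : trace_(trace), data_(5, 5) {
        ++trace_.constructed;
    }
    ~TrackedVector() { ++trace_.destroyed; }
    TrackedVector(const TrackedVector&) = delete;
    TrackedVector& operator=(const TrackedVector&) = delete;

    std::size_t size() const { return data_.size(); }

private:
    Trace& trace_;
    std::vector<int> data_;
};

// Hypothetical signal used to emulate a rollback request.
struct RollbackRequest {};

// Emulates a coroutine body that is rolled back `rollbacks` times.
Trace run_with_rollbacks(int rollbacks) {
    Trace trace;
    int remaining = rollbacks;
    for (;;) {                            // top of the loop plays the checkpoint
        try {
            TrackedVector v(trace);       // constructed after the checkpoint
            assert(v.size() == 5);
            if (remaining > 0) {
                --remaining;
                throw RollbackRequest{};  // unwinding runs ~TrackedVector
            }
            break;                        // normal exit also destroys v
        } catch (const RollbackRequest&) {
            // "Resume": continue with the code that follows the checkpoint.
        }
    }
    return trace;
}
```

With one rollback, the vector is constructed twice and destroyed twice, matching the answers "destroyed and constructed again" and "initialization is not skipped".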

These seem to me to be non-trivial issues that would only muddle the proposal you have now. And I see no reason why checkpointing couldn't be added later in a separate proposal.

I already have some code - a proof of concept - which is working. As Beman Dawes already said, I'll add this to the proposal later (I think once it is working in boost.coroutine).

Oliver
