Standardising statement expressions as a special kind of "evil lambda"


Niall Douglas

Oct 10, 2017, 7:02:48 PM
to ISO C++ Standard - Future Proposals
In the process of preparing the operator try paper, I was hoping to gauge just how much people hate the idea of adding statement expressions to C++?

Statement expressions are an extension provided by GCC and clang (https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html) and take the form:

int a = ({
  int y = foo();
  int z;
  if (y > 0)
    z = y;
  else
    z = -y;
  z;  // the output of the statement expression
});

A statement expression is executed directly inside its calling function. It does NOT create a new stack frame, so this works:

int foo()
{
  return 1 + ({
    int a = boo();
    if (!a)
      return -1;  // returns from foo()
    a;
  });
}
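
For anyone who wants to try this locally, here is a minimal self-contained sketch of the same behaviour. It assumes GCC or Clang with GNU extensions enabled, since statement expressions are not standard C++. `boo_value` is a hypothetical global, added only so the result of `boo()` can be varied.

```cpp
#include <cassert>

// Hypothetical knob so the example can exercise both paths.
static int boo_value = 0;

int boo() { return boo_value; }

int foo()
{
  // GNU extension: the block between ({ and }) is an expression whose
  // value is that of its last statement. A `return` inside it returns
  // from foo() itself, because no new function is entered.
  return 1 + ({
    int a = boo();
    if (!a)
      return -1;  // leaves foo() immediately; the "1 +" never happens
    a;
  });
}
```

With `boo_value` left at 0, `foo()` takes the early return; with a non-zero value, it yields one more than that value.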


So what I have in mind is even worse than this. I'm thinking of proposing extending lambdas with some appalling new semantics:

int foo()
{
  return 1 + [!]{
    int a = boo();
    if (!a)
      return -1;  // returns from foo()
    a;
  }();
}

The new '!' capturing flag means "execute this in the stack frame of my caller". Or rather, it means "execute this in one stack frame above this lambda function's stack frame" because yes, you can do multiple '!':

void evil()
{
  // Inject "return 5;" into whomever calls evil()
  [!!] { return 5; }();
}


int foo()
{
  evil();  // as if "return 5;"
}

And you can also do '!!!' or '!!!!' and so on up to the current inlineability limit of the code currently being compiled.

I am very sure most reading will hate this proposal, and I am making it semi-seriously to see just how much people hate it before I describe it in P0779R0. But what I would say in its favour is that it would solve the last remaining major use case for C macros in C++: injecting boilerplate into a caller's stack frame. And furthermore, I am unaware of any other proposed alternative for injecting boilerplate into a caller's stack frame other than C macros. Even Herb's metaclasses don't cover this use case as far as I read them, which means we still will be using C macros to achieve boilerplate injection in 2025.

And I don't know about you, but I'd like to see the back of C macros.
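
For readers unfamiliar with the idiom, the kind of macro-based boilerplate injection under discussion tends to look roughly like the sketch below. This is only an illustration under assumptions: `TRY_OPT`, `parse_digit` and `add_digits` are made-up names, and `std::optional` (C++17) stands in for whatever result type a real codebase would use. The macro both declares a new variable in the caller's scope and can make the caller return early, which an ordinary function call cannot do.

```cpp
#include <cassert>
#include <optional>

// Hypothetical macro: evaluates `expr` (an optional), and if it is empty
// makes the *enclosing* function return std::nullopt. Otherwise it
// declares `var` in the caller's scope, bound to the contained value.
#define TRY_OPT(var, expr)                  \
  auto var##_opt = (expr);                  \
  if (!var##_opt) return std::nullopt;      \
  auto& var = *var##_opt

// Hypothetical helper: converts '0'..'9' to 0..9, empty otherwise.
std::optional<int> parse_digit(char c)
{
  if (c < '0' || c > '9') return std::nullopt;
  return c - '0';
}

std::optional<int> add_digits(char a, char b)
{
  TRY_OPT(x, parse_digit(a));  // may return from add_digits()
  TRY_OPT(y, parse_digit(b));  // may return from add_digits()
  return x + y;
}
```

Nothing at the call site other than the all-caps naming convention warns the reader that `TRY_OPT` can return from `add_digits`.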

These "evil lambdas" would solve the problem. You basically get to alias a function call to expand into some boilerplate, much tidier than C macro expansion because it follows normal C++ rules, not the preprocessor's weird rules. The availability of this facility would let one turn the Coroutines TS into an almost entirely library based implementation for example. My operator try idea is just injection of boilerplate as well, it becomes much tidier with this facility in place. Ranged for loops again can be implemented using these. Anywhere where we inject boilerplate, we can inject these instead and make them library implemented, not hard coded into the compiler.

But there are also lots, and lots of other awful unforeseen consequences. Like you'd surely have to allow evil lambdas which inject programmatically determined evil lambdas right? What kind of rabbit holes does that open up? Evil lambdas can inject arbitrary new variables, modify existing ones, change control flow, inject try and catch, basically do anything which C macros can do, but without the problems associated with C macros. What about ordering? Should injection of evil lambda contents occur in the order of compilation? What if an evil lambda injects code three stack frames up which transforms the evil lambda doing the injecting? It would be easiest to just bail out, but it would be more fun if evil lambdas could recursively rewrite themselves.

What do people think?

Niall

Todd Fleming

Oct 10, 2017, 8:24:22 PM
to ISO C++ Standard - Future Proposals
On Tuesday, October 10, 2017 at 7:02:48 PM UTC-4, Niall Douglas wrote:
> void evil()
> {
>   // Inject "return 5;" into whomever calls evil()
>   [!!] { return 5; }();
> }


How could compilers handle this if evil() is declared in a header, defined in TU A, but called in TU B?

Todd

Niall Douglas

Oct 10, 2017, 8:28:20 PM
to ISO C++ Standard - Future Proposals
If it can't inject in some given use case, it fails to compile with a suitable error message.

Niall 

Nicol Bolas

Oct 10, 2017, 8:55:56 PM
to ISO C++ Standard - Future Proposals
On Tuesday, October 10, 2017 at 7:02:48 PM UTC-4, Niall Douglas wrote:
> What do people think?

I think I've seen this before. I didn't much care for it then either. Or at least, the "I can pass 'functions' around that force the caller to return, without the caller having any clue this can happen" part.

Todd Fleming

Oct 10, 2017, 9:08:31 PM
to ISO C++ Standard - Future Proposals
On Tuesday, October 10, 2017 at 8:28:20 PM UTC-4, Niall Douglas wrote:
> If it can't inject in some given use case, it fails to compile with a suitable error message.


What about making it explicit like try expressions?

injectable string not_as_evil()
{
    if(...)
        return 5;
    else
        expr_return "foo"s;
}

int hear_no_evil()
{
    string s1 = inject not_as_evil();
    ...

    string s2 = inject [] injectable {
        if(...)
            return 4;
        else
            expr_return "bar"s;
    }();
}

Todd

Niall Douglas

Oct 10, 2017, 9:33:27 PM
to ISO C++ Standard - Future Proposals
For the record, I feel no love for GCC type statement expressions. They are too limited to replace C macros, specifically, you can't inject variable creation with them. They also have very mixed quality of implementation depending on the compiler. That proposal you linked to was very complicated. And GCC type statement expressions as-is can't fully replace C macros, which makes that proposal dead for me just on that.

The evil lambda thing I proposed isn't complex, it literally inserts its contents into however many stack frames higher than its call site. Now that's wrong on so many levels. But I'm exploring some options here. Constructive suggestions for how to better replace the "inserting boilerplate" use case for C macros are welcome.

Niall

Niall Douglas

Oct 10, 2017, 9:39:11 PM
to ISO C++ Standard - Future Proposals
I like the idea of marking these code objects as being of "injectable" type rather than subverting the lambda syntax. I only chose the latter because I know WG21 dislikes adding new keywords.

I don't think the explicit inject is good though. Boilerplate insertion ought to be quick to type. How about copying Rust:

void evil!()
{
  // Inject "return 5;" into whomever calls evil!()
  return 5;
}


int foo()
{
  evil!();  // as if "return 5;"
}

So, if your function name ends with '!', it injects its contents into the calling scope.

And you cannot call an injecting function without its postfix of '!' which clearly indicates that this thing will be injecting code here.

Thoughts?

Niall

Nicol Bolas

Oct 10, 2017, 9:58:35 PM
to ISO C++ Standard - Future Proposals
On Tuesday, October 10, 2017 at 9:33:27 PM UTC-4, Niall Douglas wrote:
> On Wednesday, October 11, 2017 at 1:55:56 AM UTC+1, Nicol Bolas wrote:
>> On Tuesday, October 10, 2017 at 7:02:48 PM UTC-4, Niall Douglas wrote:
>>> What do people think?
>>
>> I think I've seen this before. I didn't much care for it then either. Or at least, the "I can pass 'functions' around that force the caller to return, without the caller having any clue this can happen" part.

After skimming through that thread, my overall conclusion is that the idea, as currently stated, makes a complete hash of the very notion of "structured programming".

Higher-level languages are structured because being structured allows you to read code and have a reasonable idea of what can and cannot happen at any particular point. It lets you inspect code and understand the flow of logic of the code from beginning to end. You can easily predict what can happen, and you can easily say what cannot happen.

Structured loops (for, while, do) work because the control flow from statement to statement is heavily regulated. The various forms of control flow allowed here are exiting the function, stopping current progress and repeating the loop, or exiting the loop. Each of these forms of control flow is governed by a specific keyword. If you don't see one of those keywords in the loop, then you know a priori that none of those things will happen during the loop. And therefore, you know that the loop will proceed exactly as it is structured.

Well, there is one notable exception: exceptions. These represent a form of control flow that is, by design, outside of the formal structure of the program. If you call a function in that loop, it might throw an exception through that loop, terminating it early.

Unless you only call `noexcept` functions, you cannot know that this will not happen. However, if you really want, you can prevent it. You have the power to catch every exception that tries to flow around your loop and swallow it. You can even keep it around and throw it after the loop (via `exception_ptr`). So even this most unstructured aspect of C++ can still be caught, if you so desire.
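
As a rough sketch of that "keep it around and throw it after the loop" pattern (an illustration only; `checked` and `sum_deferring_errors` are hypothetical names): the loop below always visits every element, captures the first exception with `std::current_exception()`, and rethrows it once the loop has finished.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <vector>

// Hypothetical callee that may throw.
int checked(int x)
{
  if (x < 0) throw std::invalid_argument("negative");
  return x;
}

// Visits every element regardless of failures. The first exception is
// stored in an exception_ptr and rethrown only after the loop is done.
int sum_deferring_errors(const std::vector<int>& values, int& visited)
{
  std::exception_ptr first_error;
  int sum = 0;
  visited = 0;
  for (int v : values) {
    ++visited;
    try {
      sum += checked(v);
    } catch (...) {
      if (!first_error) first_error = std::current_exception();
    }
  }
  if (first_error) std::rethrow_exception(first_error);
  return sum;
}
```

The loop's own control flow stays exactly as written; the exception is observed, deferred, and only then allowed to propagate.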

Furthermore, if you choose not to catch such exceptions, the behavior of your program may still be reasonable. Essentially, what you have are two possible circumstances: the loop completes as normal, or the loop is terminated via an exception, which provokes the destruction of all objects on the stack. That's two very different sets of behavior, but you can plan around them.

And since exceptions are primarily intended to represent failures, cases where continuing to move forward no longer makes sense, this alternative structure is... tolerable. It's not perfect, but it is an unusual circumstance, and one with well-defined and well-understood behavior.

One last point on exceptions before I continue. Without an explicit `try/catch` block, it doesn't matter what the type of the exception is that passes through your code. The thrower could throw an integer, string, `std::exception`, or something else entirely. If you don't actually catch it yourself, if you're not interested in doing so... then you do not care. And you don't have to; anyone can throw any type of exception through any exception-safe code.

What you are proposing is for us to make it so that a function call can have another way of exiting. Instead of returning a value, it can command the caller of it to return a value. And that this is possible is not indicated anywhere. The function declaration itself doesn't indicate it, and the caller does not have to explicitly permit it. Same goes for `continue/break/etc`.

This breaks structured programming. You can no longer know by inspection that a block of code will execute in sequence. Oh yes, exceptions already made this the case, but those are supposed to be rare and unexpected occurrences. But even then, an exception will only ever do one thing: terminate processing.

Your suggestion can cause a function to invoke a "continue" on its caller. That opens up an entire host of possibilities for how control will actually flow. At present, either things execute in sequence, or they stop executing. Now, maybe they'll half-execute in sequence, then continue the loop. All without actually seeing the control flow keywords at play.

Oh sure, structured programming is not completely broken by this. It's not like a function call can randomly cause the caller to jump to an arbitrary location (though your idea can in theory invoke a `goto`). But without actually seeing the structured branch command in the code, predicting the flow of the code becomes increasingly difficult.

And then there are practical issues. What happens if you have a lambda that wants to invoke `return` on its caller, but that lambda then gets shoved into a `std::function`? Well, its caller is now `std::function::operator()`. But it's not really that function at all; the implementation will likely call some type-erased virtual function or whatever, so there are at least two layers of function calls between the lambda and where you actually want to return from. And since the implementation of `std::function::operator()` is not defined, there's no way to tell exactly how many layers there are, so there's no way to put the right number of `!` characters in your lambda to have the behavior you want: to return from the caller of `std::function::operator()`.

The only way to accomplish that is to have some direct connection between that lambda and the place you want to transfer control. Exception handling actually accomplishes this. If you want to "inject" something in some code up the call stack, you `throw` an object of a certain type. The receiving code will wrap the call in a `try` block and `catch` that object type. Thus, communication is established.

What you're looking for, as I said in the other thread, is an exception-like mechanism. That's the only way to make this actually work.
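
A minimal sketch of that throw/catch handshake, using only today's language, might look like the following. The names `ForceReturn` and `run` are hypothetical. The callable is deliberately passed through `std::function` to show that the number of intermediate call layers does not matter, because the connection is made by the exception type rather than by counting frames.

```cpp
#include <cassert>
#include <functional>

// Hypothetical signal type: "please return this value from the receiver".
struct ForceReturn { int value; };

// The receiving code explicitly opts in by catching ForceReturn.
int run(const std::function<void()>& step)
{
  try {
    step();  // however many type-erased layers deep this call goes
  } catch (const ForceReturn& r) {
    return r.value;  // the "injected" return, visible at the receiving site
  }
  return 0;  // normal completion
}
```

Calling `run` with a lambda that does `throw ForceReturn{5};` makes `run` return 5, while a lambda that completes normally makes it return 0.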

> For the record, I feel no love for GCC type statement expressions. They are too limited to replace C macros, specifically, you can't inject variable creation with them. They also have very mixed quality of implementation depending on the compiler. That proposal you linked to was very complicated. And GCC type statement expressions as-is can't fully replace C macros, which makes that proposal dead for me just on that.
>
> The evil lambda thing I proposed isn't complex, it literally inserts its contents into however many stack frames higher than its call site. Now that's wrong on so many levels. But I'm exploring some options here. Constructive suggestions for how to better replace the "inserting boilerplate" use case for C macros are welcome.

But... what if I think it's a really bad idea to be able to force "inserted boilerplate" into someone's code without their will, knowledge, or consent? Because that is ultimately the heart of your proposal. It's not merely to have a macro-like thing (we have actual macros). It's to have language support for passing around macro-like things as objects, yet still having them behave in macro-like ways at their eventual site of use.

I don't think it is a good idea to allow people to do that. Especially if it is invisible to the "eventual site of use".
 


Nicol Bolas

Oct 10, 2017, 10:04:30 PM
to ISO C++ Standard - Future Proposals
On Tuesday, October 10, 2017 at 9:39:11 PM UTC-4, Niall Douglas wrote:
> On Wednesday, October 11, 2017 at 2:08:31 AM UTC+1, Todd Fleming wrote:
>> On Tuesday, October 10, 2017 at 8:28:20 PM UTC-4, Niall Douglas wrote:
>>> If it can't inject in some given use case, it fails to compile with a suitable error message.
>>
>> What about making it explicit like try expressions?
>>
>> injectable string not_as_evil()
>> {
>>     if(...)
>>         return 5;
>>     else
>>         expr_return "foo"s;
>> }
>>
>> int hear_no_evil()
>> {
>>     string s1 = inject not_as_evil();
>>     ...
>>
>>     string s2 = inject [] injectable {
>>         if(...)
>>             return 4;
>>         else
>>             expr_return "bar"s;
>>     }();
>> }
>
> I like the idea of marking these code objects as being of "injectable" type rather than subverting the lambda syntax. I only chose the latter because I know WG21 dislikes adding new keywords.

Don't try to second-guess the committee. Get commentary from them on the design you'd actually like, then see what can be worked out as far as syntax is concerned.

> I don't think the explicit inject is good though. Boilerplate insertion ought to be quick to type. How about copying Rust:
>
> void evil!()
> {
>   // Inject "return 5;" into whomever calls evil!()
>   return 5;
> }
>
> int foo()
> {
>   evil!();  // as if "return 5;"
> }
>
> So, if your function name ends with '!', it injects its contents into the calling scope.
>
> And you cannot call an injecting function without its postfix of '!' which clearly indicates that this thing will be injecting code here.
>
> Thoughts?

It's less bad than the original, in that there is at least syntax here which makes it clear that something dubious is going on. However, considering how damaging this is to structured programming, I'd prefer a more verbose syntax. Something that is easily searchable.

Also, this doesn't solve the indirect call problem (ie: sticking `evil` in a `std::function` and making it work through that). And note that the indirect call problem also pertains to any standard library algorithm. So if you do `std::accumulate` with one of these functions, that `return` may not make it to the actual caller.
 


Ross Smith

Oct 10, 2017, 10:19:56 PM
to std-pr...@isocpp.org
On 2017-10-11 14:33, Niall Douglas wrote:
>
> The evil lambda thing I proposed isn't complex, it /literally/ inserts
> its contents into however many stack frames higher than its call site.
> Now that's wrong on so many levels. But I'm exploring some options here.
> Constructive suggestions for how to better replace the "inserting
> boilerplate" use case for C macros are welcome.

The boilerplate-injection motivating example doesn't seem to require
access to more than one stack frame above the Evil Lambda, and
restricting it to just one level up would remove some of the "spooky
action at a distance" issues with the proposal.

Your example uses two levels of access to create a function that can
do some of the things currently reserved for macros:

void evil() {
    // Inject "return 5;" into whomever calls evil()
    [!!] { return 5; }();
}

But it seems to me you only need two levels there because you've
arbitrarily restricted Evil Powers to lambdas and not ordinary
functions. Add notation for that and we can do the above in one
step:

inline export void evil() {
    // Inject "return 5;" into whomever calls evil()
    return 5;
}

Also, I'm pretty sure "stack frame" isn't a well defined concept as
far as the Standard is concerned. What your Evil Lambdas (or my Evil
Functions) really want access to is the _scope_ of their calling
context. The outermost scope of the Evil Function's body becomes
part of the call site's scope (modulo argument substitution,
hygienic renaming of the function body's local variables, and no
doubt any number of other complications).

Ross Smith

Olanrewaju Adetula

Oct 11, 2017, 1:56:41 AM
to std-pr...@isocpp.org

The injection operator ->, currently being used in the metaclass proposal, can be adapted for code injection as a form of forced inline function. For example:
->foo(a);
Copies the body of foo into the current context, substituting the arguments if there are any.



Andrey Semashev

Oct 11, 2017, 5:10:32 AM
to std-pr...@isocpp.org
On 10/11/17 02:02, Niall Douglas wrote:
>
> The new '!' capturing flag means "execute this in the stack frame of my
> caller". Or rather, it means "execute this in *one* stack frame above
> this lambda function's stack frame" because yes, you can do multiple '!':
>
> void evil()
> {
>   // Inject "return 5;" into whomever calls evil()
>   [!!] { return 5; }();
> }
>
> int foo()
> {
>   evil();  // as if "return 5;"
> }
>
> And you can also do '!!!' or '!!!!' and so on up to the current
> inlineability limit of the code currently being compiled.

If compilation ought to fail when the compiler is not able to inject the
function body, then this proposal is not functional. E.g. in debug builds
inlining is typically completely disabled (including forced inlining),
which means the compiler cannot inject any functions and would fail even
with a single !.

Another point is that I don't think the C++ standard defines the notion
of a stack frame or a stack at all, so the "execute this in the stack
frame of my caller" definition makes no sense. I think you should define
the feature in terms of scope visibility and the order of evaluation on
the abstract machine. In particular, you need to specify what symbols
are visible in the function being injected and when its body is
executed. For example:

int bar()
{
    int b = 10;
    double c = 20.0;
    printf("Hello!\n");
    [!!, &] { return a + b - c; } ();
    printf("Bye!\n");
    return 47;
}

auto foo()
{
    int a = 5, c = 30;
    allocate_resource();
    int d = bar();
    release_resource();
    return d;
}

Is `a` visible in the lambda? If the function is supposed to be executed
in the scope of `foo` I would guess the answer is yes.

Is `b` visible? Intuitively, I would expect it to be visible as well
because at the point of definition of the lambda capture statement `b`
already exists.

Which `c` is being used by the lambda and what is the return type and
value of `foo`? Does the program print Hello and/or Bye? Does
`release_resource` get called? What if `bar` is in another translation
unit (ignore the LTO)? This is where I cannot give a reasonable answer
given the proposal and I think it shows the possible problems with it.

Lastly, I think the whole idea of injecting code into an unsuspecting
caller is terrible because it breaks the caller invariants and throws
away separation of logic. Injecting only into the direct caller might
not be so terrible because this can be viewed as a way to implement the
caller's body. But in this case I don't see the benefit compared to
macros or inline functions. (Yes, you can't immediately return the
caller from an inline function and macros don't check the syntax and can
be hard to debug, but neither of these is something I would trade for
the ability to easily understand the code.) So for me the motivation
for this feature is not enough to outweigh the potential damage.

> And I don't know about you, but I'd like to see the back of C macros.

I don't have anything against macros. The preprocessor is another tool,
a very powerful one, BTW. If you want to paste some code in multiple
places it offers a way to do that and I see nothing wrong with it. The
choice whether to use it or not is always yours.

Todd Fleming

Oct 11, 2017, 10:20:15 AM
to ISO C++ Standard - Future Proposals
On Wednesday, October 11, 2017 at 1:56:41 AM UTC-4, Olanrewaju Adetula wrote:

> The injection operator ->, currently being used in the metaclass proposal, can be adapted for code injection as a form of forced inline function. For example:
> ->foo(a);
> Copies the body of foo into the current context, substituting the arguments if there are any.


It looks like metaclass -> needs these changes to work for the expression try case:
  • Work in a non-constexpr context
  • Work within expressions
  • Some way to indicate the expression result (return?) vs. forcing parent to return (parent_return?)

template<typename V, typename E>
V try_(expected<V, E> x) {
    if(x)
        return *x;
    else
        parent_return x;
}

expected<int, error> test_expected() {
    auto x = ->try_(f1());
    auto y = ->try_(f2(x + 7));
    return ->try_(f4(x + y)) / ->try_(f5(x / y));
}

 Todd

Niall Douglas

Oct 11, 2017, 10:53:03 AM
to ISO C++ Standard - Future Proposals

>> The injection operator ->, currently being used in the metaclass proposal, can be adapted for code injection as a form of forced inline function. For example:
>> ->foo(a);
>> Copies the body of foo into the current context, substituting the arguments if there are any.
>
> It looks like metaclass -> needs these changes to work for the expression try case:
>   • Work in a non-constexpr context
>   • Work within expressions
>   • Some way to indicate the expression result (return?) vs. forcing parent to return (parent_return?)

I like the idea of reusing syntax from C++ rather than from Rust.

> template<typename V, typename E>
> V try_(expected<V, E> x) {
>     if(x)
>         return *x;
>     else
>         parent_return x;
> }
>
> expected<int, error> test_expected() {
>     auto x = ->try_(f1());
>     auto y = ->try_(f2(x + 7));
>     return ->try_(f4(x + y)) / ->try_(f5(x / y));
> }

But I also find the above syntax ugly, plus I find the "try_" unfortunate.

The '!' based approach enables a separate namespace, thus allowing one to define do!, try!, return! and so on without collision with C++ keywords. So, repeating your example:

template<typename V, typename E>
V try!(expected<V, E> x) {
    if(x)
        return!(*x);  // The output of this "macro"
    else
        return x;     // Injects "return x;" into the caller
}

expected<int, error> test_expected() {
    auto x = try!(f1());
    auto y = try!(f2(x + 7));
    return try!(f4(x + y)) / try!(f5(x / y));
}

Niall

Todd Fleming

Oct 11, 2017, 12:38:41 PM
to ISO C++ Standard - Future Proposals


On Wednesday, October 11, 2017 at 10:53:03 AM UTC-4, Niall Douglas wrote:
> But I also find the above syntax ugly, plus I find the "try_" unfortunate.
>
> The '!' based approach enables a separate namespace, thus allowing one to define do!, try!, return! and so on without collision with C++ keywords. So, repeating your example:
>
> template<typename V, typename E>
> V try!(expected<V, E> x) {
>     if(x)
>         return!(*x);  // The output of this "macro"
>     else
>         return x;     // Injects "return x;" into the caller
> }
>
> expected<int, error> test_expected() {
>     auto x = try!(f1());
>     auto y = try!(f2(x + 7));
>     return try!(f4(x + y)) / try!(f5(x / y));
> }
>
> Niall

return!(*x) already has a meaning. I'm OK with the language standard redefining keyword meaning with ! when there's no existing conflict, but having some keywords with ! be user defined terrifies me.
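
To make the conflict concrete, here is a small sketch in today's C++ (the function name `is_zero` is made up for illustration). The token sequence `return!(*x)` already parses as `return` followed by logical negation of `*x`, so it cannot be given a second meaning without changing existing code.

```cpp
#include <cassert>
#include <optional>

// Valid C++17 today: `return!(*x);` is `return !(*x);`,
// i.e. logical NOT applied to the contained int.
bool is_zero(const std::optional<int>& x)
{
  return!(*x);
}
```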

Todd

inkwizyt...@gmail.com

Oct 11, 2017, 2:30:12 PM
to ISO C++ Standard - Future Proposals
One observation: if a lambda could pop the outer function, then the return type of that function should be embedded in the lambda's signature, something like this:
auto z = [!](bool a)
{
    if (a)
        return return 42; //yup double return, could be `return break;` too?
    else
        return;
} -> { void, int }; //we can return void or double return int
This lambda could only be called in functions that return `int`. Otherwise, what would happen if you tried to return an `int` when the outer function returns a `struct`?
This would also allow a sane implementation of this functionality, because each return level needs a way to clean up objects in the outer function, and this can only be done by passing multiple return addresses to the inner function.

Another thing is that you cannot skip multiple levels arbitrarily with this, but if the outer function is itself multi-return, you could triple-return to the function that calls the outer function.
Something like this (`register` as a placeholder for a keyword):
{void, int, int} tripleReturn();
{int, int} doubleReturn()
{
    register tripleReturn(); //ok, can return from `doubleReturn` or the function that calls it
}

int normalReturn()
{
    register tripleReturn(); //error, too many return levels in tripleReturn!
    int i = register doubleReturn();
    int j = register []{ register tripleReturn(); }(); //ok, lambda implicitly `->{int, int}` and `tripleReturn` can return from `normalReturn`
}


Thiago Macieira

Oct 11, 2017, 3:02:51 PM
to std-pr...@isocpp.org
On Tuesday, 10 October 2017 16:02:48 PDT Niall Douglas wrote:
> The new '!' capturing flag means "execute this in the stack frame of my
> caller". Or rather, it means "execute this in *one* stack frame above this
> lambda function's stack frame" because yes, you can do multiple '!':
>
> void evil()
> {
> // Inject "return 5;" into whomever calls evil()
> [!!] { return 5; }();
> }

This can't be done, unless you decorate "evil" with the non-existent attribute
always_inline.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Niall Douglas

Oct 11, 2017, 5:34:33 PM
to ISO C++ Standard - Future Proposals

> return!(*x) already has a meaning. I'm OK with the language standard redefining keyword meaning with ! when there's no existing conflict, but having some keywords with ! be user defined terrifies me.

Oh my yes you're right. And that's going to be an issue with using any token which is valid in an expression in C++.

The only two safe characters are therefore '?' or ':' as those can never be ambiguous. Or use a new character not legal in a valid identifier.

Speaking of which, it turns out that the C preprocessor is required to pass through a '#' when it is not the first non-whitespace token in a line. And the compiler always errors out if it sees a stray # after the preprocessor. So this, I think, is the solution:

template<typename V, typename E>
V try#(expected<V, E> x) {
    if(x)
        return#(*x);  // The output of this "macro"
    else
        return x;     // Injects "return x;" into the caller
}

expected<int, error> test_expected() {
    auto x = try#(f1());
    auto y = try#(f2(x + 7));
    return try#(f4(x + y)) / try#(f5(x / y));
}

The symmetry of # meaning a boilerplate expansion I like. The cost of this is of course that generating these boilerplate injecting functions with a parameterised C preprocessor macro expansion is going to be tricky, though not impossible.

Niall

Arthur O'Dwyer

Oct 11, 2017, 8:23:28 PM
to ISO C++ Standard - Future Proposals
On Tuesday, October 10, 2017 at 6:08:31 PM UTC-7, Todd Fleming wrote:
> On Tuesday, October 10, 2017 at 8:28:20 PM UTC-4, Niall Douglas wrote:
>> If it can't inject in some given use case, it fails to compile with a suitable error message.
>
> What about making it explicit like try expressions?
>
> injectable string not_as_evil()
> {
>     if(...)
>         return 5;
>     else
>         expr_return "foo"s;
> }

This is the way forward, IMO.
We currently have at least two active proposals related to "messing with the caller's frame":
- Gor's Coroutines TS with operator "co_await" and operator "co_yield"
- Niall and Vicente's proposal with operator "try"
(I think we have recently encountered another similar case, but I'm blanking on it right now.)
This idea of somehow increasing the scope of things an expression can do with its enclosing frame, via a named operator or some special syntax, is very powerful. I would like to see it approached systematically, and I would like to see the systematic approach attempted before anyone tries to merge the Coroutines TS into C++2a, because by then it will have become too late.

–Arthur

Niall Douglas

Oct 11, 2017, 8:53:43 PM
to ISO C++ Standard - Future Proposals

> We currently have at least two active proposals related to "messing with the caller's frame":
> - Gor's Coroutines TS with operator "co_await" and operator "co_yield"
> - Niall and Vicente's proposal with operator "try"
> (I think we have recently encountered another similar case, but I'm blanking on it right now.)
> This idea of somehow increasing the scope of things an expression can do with its enclosing frame, via a named operator or some special syntax, is very powerful. I would like to see it approached systematically, and I would like to see the systematic approach attempted before anyone tries to merge the Coroutines TS into C++2a, because by then it will have become too late.

And I have just tackled both cases in a single "operator try" proposal paper. See the other thread. 

Niall

Nicol Bolas

Oct 11, 2017, 9:38:25 PM
to ISO C++ Standard - Future Proposals


On Wednesday, October 11, 2017 at 5:34:33 PM UTC-4, Niall Douglas wrote:

>> return!(*x) already has a meaning. I'm OK with the language standard redefining keyword meaning with ! when there's no existing conflict, but having some keywords with ! be user defined terrifies me.
>
> Oh my yes you're right. And that's going to be an issue with using any token which is valid in an expression in C++.
>
> The only two safe characters are therefore '?' or ':' as those can never be ambiguous. Or use a new character not legal in a valid identifier.
>
> Speaking of which, it turns out that the C preprocessor is required to pass through a '#' when it is not the first non-whitespace token in a line. And the compiler always errors out if it sees a stray # after the preprocessor. So this, I think, is the solution:
>
> template<typename V, typename E>
> V try#(expected<V, E> x) {
>     if(x)
>         return#(*x);  // The output of this "macro"
>     else
>         return x;     // Injects "return x;" into the caller
> }

That's backwards. If `#` means "cause non-local effects", then `return#` should mean to inject that into the caller. Unadorned `return` should return from the function with that value.

Andrey Semashev

Oct 12, 2017, 4:55:58 AM
to std-pr...@isocpp.org
On 10/12/17 00:34, Niall Douglas wrote:
>
> Speaking of which, it turns out that the C preprocessor is required to
> pass through a '#' when not the first non-whitespace token in a line.
> And the compiler always errors out if it sees a stray # after the
> preprocessor. So this, I think, is the solution:
>
> template<typename V, typename E>
> V try#(expected<V, E> x) {
>     if(x)
>         return#(*x);  // The output of this "macro"
>     else
>         return x;     // Injects "return x;" into the caller
> }
>
> expected<int, error> test_expected() {
>     auto x = try#(f1());
>     auto y = try#(f2(x + 7));
>     return try#(f4(x + y)) / try#(f5(x / y));
> }

If "return#" and "try#" are not single tokens, then I believe the user is
allowed to type "#" on the line after the preceding keyword, which will
break the preprocessor.

Niall Douglas

Oct 12, 2017, 9:03:25 AM
to ISO C++ Standard - Future Proposals

> That's backwards. If `#` means "cause non-local effects", then `return#` should mean to inject that into the caller. Unadorned `return` should return from the function with that value.

I decided, for brevity and simplicity, to eliminate the ability for native C++ macros to return anything. If that bothers you a lot, please do propose a better native C++ macros proposal and champion it through the committee. I can't attend committee meetings, so I am not the one able to do this.

Niall

Niall Douglas

Oct 12, 2017, 9:06:48 AM
to ISO C++ Standard - Future Proposals

> If "return#" and "try#" are not single tokens, then I believe the user is
> allowed to type "#" on the line after the preceding keyword, which will
> break the preprocessor.

Which is desirable.

I'm no parsing expert, but in parsers I've written in the past you would simply include '#' as a valid character in an identifier so long as it is not the first character. The parser will greedily consume as many characters for an identifier as it can, so no ambiguity or confusion will result. So no problem here, I think.

Niall 

Andrey Semashev

Oct 12, 2017, 9:28:19 AM
to std-pr...@isocpp.org
On 10/12/17 16:06, Niall Douglas wrote:
>
> If "return#" and "try#" are not single tokens, then I believe the
> user is allowed to type "#" on the line after the preceding
> keyword, which will break the preprocessor.
>
> Which is desirable.
>
> I'm no parsing expert, but in parsers I've written in the past you would
> simply include '#' as a valid character in an identifier so long as it
> is not the first character.

But in this case '#' is not part of an identifier, it's a separate
token. The fact that it breaks the preprocessor in some cases contradicts
the experience we have with any other token in C/C++, which you can
format however you like, as long as the tokens can be parsed unambiguously.

Nicol Bolas

Oct 12, 2017, 10:17:34 AM
to ISO C++ Standard - Future Proposals
On Thursday, October 12, 2017 at 9:03:25 AM UTC-4, Niall Douglas wrote:

>> That's backwards. If `#` means "cause non-local effects", then `return#` should mean to inject that into the caller. Unadorned `return` should return from the function with that value.
>
> I decided for brevity and simplicity to eliminate the ability for native C++ macros to return anything.

If they don't return a value, why do they look like functions?


Niall Douglas

Oct 12, 2017, 10:45:10 AM
to ISO C++ Standard - Future Proposals

>> I'm no parsing expert, but in parsers I've written in the past you would
>> simply include '#' as a valid character in an identifier so long as it
>> is not the first character.
>
> But in this case '#' is not part of an identifier, it's a separate
> token. The fact that it breaks preprocessor in some cases contradicts
> the experience we have with any other token in C/C++, which you can
> format however you like, as long as the tokens can be parsed unambiguously.

Sorry, are you referring to the preprocessor getting confused, not the compiler?

I can see obvious problems in function-like macros, but off the top of my head, normal (object-like) macro definitions and all other preprocessor directives are safe, because the preprocessor will not interfere with an <identifier><hash> token sequence in the replacement text.

(Unless you are MSVC's preprocessor of course)

So #define foo boo# will correctly replace foo with boo#.

What can't work as-is is #define foo# boo. This is illegal because whitespace must follow the macro name in a macro definition unless the next token is '('.

These are interesting gotchas I think I'll mention in the paper. Thanks.

Niall

Niall Douglas

Oct 12, 2017, 10:47:40 AM
to ISO C++ Standard - Future Proposals
Would this form suit you better?

template<class T> foo#(T v) { tokens } 
foo#(auto v) { tokens }

Or are you thinking that the curly brackets around the tokens need to be different?

Niall

Matthew Woehlke

Oct 12, 2017, 11:13:12 AM
to std-pr...@isocpp.org, Nicol Bolas
On 2017-10-10 22:04, Nicol Bolas wrote:
> On Tuesday, October 10, 2017 at 9:39:11 PM UTC-4, Niall Douglas wrote:
>> How about copying Rust:
>>
>> void evil!()
>> {
>> // Inject "return 5;" into whomever calls evil!()
>> return 5;
>> }
>>
>>
>> int foo()
>> {
>> evil!(); // as if "return 5;"
>> }
>>
>> So, if your function name ends with '!', it injects its contents into the
>> calling scope.
>>
>> And you cannot call an injecting function without its postfix of '!' which
>> clearly indicates that this thing will be injecting code here.
>
> It's less bad than the original, in that there is at least syntax here
> which makes it clear that something dubious is going on. However,
> considering how damaging this is to structured programming, I'd prefer a
> more verbose syntax. Something that is easily searchable.

(What tools are you using that can't search for '\w!'? What false
positives would that produce?)

I suppose we could try:

void bar() inline { return 5; }

int foo()
{
    inline bar(); // returns 5 from foo()
}

> Also, this doesn't solve the indirect call problem (ie: sticking `evil` in
> a `std::function` and making it work through that).

Why not?

std::function<int! ()> f = evil();
f!(); // as if `return 5`

...or:

std::function<int () inline> f = bar();
inline f(); // as if `return 5`

(This would need some partial specialization of std::function, but
otherwise what's the problem?)

> And note that the indirect call problem also pertains to any standard
> library algorithm. So if you do `std::accumulate` with one of these
> functions, that `return` may not make it to the actual caller.

That is probably not solvable with a solution like this, at least not
without extra gymnastics. (In fact, much as I hate to say it, that
sounds like a time when throwing an "exception" is the right answer. In
this case, you need something that will unwind an arbitrary call stack
up to a point under your control. That, or you need something like
setjmp/longjmp.)

That said, if you partially specialize the algorithms for being passed
this special type of functor, that *might* do it... it's just that all
that specialization is expensive (in implementation cost; hence the
"extra gymnastics").
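To make the exception route concrete, here is a minimal sketch of unwinding out of `std::accumulate` to a point the caller controls. The names `early_return` and `sum_or_bail` are made up for illustration, and this uses only today's C++, not the proposed injected `return`.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical tag type carrying the value the outer function should return.
struct early_return {
    int value;
};

// Sums the elements, but makes the *outer* function return -1 as soon as a
// negative element is seen, unwinding through std::accumulate's own frames.
int sum_or_bail(const std::vector<int>& values) {
    try {
        return std::accumulate(values.begin(), values.end(), 0,
                               [](int acc, int x) {
                                   if (x < 0) throw early_return{-1};
                                   return acc + x;
                               });
    } catch (const early_return& e) {
        return e.value;  // the "injected" return lands here
    }
}

// Small usage check covering both exit paths.
inline void sum_or_bail_demo() {
    assert(sum_or_bail({1, 2, 3}) == 6);
    assert(sum_or_bail({1, -2, 3}) == -1);
}
```

The algorithm itself needs no changes, which is the attraction; the cost is a real throw on the early-exit path.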

--
Matthew

Ville Voutilainen

Oct 12, 2017, 11:17:47 AM
to ISO C++ Standard - Future Proposals
On 12 October 2017 at 18:13, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
>> Also, this doesn't solve the indirect call problem (ie: sticking `evil` in
>> a `std::function` and making it work through that).
>
> Why not?


Who cares? Why would I want to call these injection-macros indirectly?
They can't reasonably inject in those cases, but they are not functions,
so what does it matter?

Andrey Semashev

Oct 12, 2017, 12:36:00 PM
to ISO C++ Standard - Future Proposals
On 10/12/17 17:45, Niall Douglas wrote:
>
> > I'm no parsing expert, but in parsers I've written in the past
> you would
> > simply include '#' as a valid character in an identifier so long
> as it
> > is not the first character.
>
> But in this case '#' is not part of an identifier, it's a separate
> token. The fact that it breaks preprocessor in some cases contradicts
> the experience we have with any other token in C/C++, which you can
> format however you like, as long as the tokens can be parsed
> unambiguously.
>
>
> Sorry, are you referring to the preprocessor getting confused, not the
> compiler?

Yes. There probably are cases when the preprocessor can confuse the
compiler as well by transforming the text in unintended ways.

> I can see obvious problems in function macros, but off the top of my
> head, normal macro definitions and all other preprocessor commands are
> safe because it will not mess with a replacing <identifier><hash> token
> sequence.

I have something like this in mind:

void foo()
{
int define = 10;
return
# define + 5;
}

Even if the identifier were not "define", I don't think a line is allowed to
start with '#' followed by arbitrary text such as C++ expressions.

> So #define foo boo# will correctly replace foo with boo#.

Not sure.

#define foo boo# baz

Here, foo should expand into

boo "baz"

I.e. the # preprocessor operator envelops the following tokens in
quotes. I'm not sure what should be the result if there are no tokens.
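For what it's worth, here is a sketch of how this can be checked on a given compiler. As far as I read [cpp.stringize], `#` is only the stringizing operator in the replacement list of a function-like macro; in an object-like macro it is an ordinary token, so `foo` below should expand to the tokens `boo`, `#`, `baz` with no quotes added. The helper macros `STR` and `STR_IMPL` and the function `expanded_foo` are hypothetical names used only to capture the expansion as text, and a non-conforming preprocessor may behave differently.

```cpp
#include <cassert>
#include <cstring>

#define STR_IMPL(x) #x      // function-like macro: '#' stringizes parameter x
#define STR(x) STR_IMPL(x)  // extra level so the argument is expanded first

#define foo boo# baz        // object-like macro: '#' is an ordinary token here

// The expansion of foo, captured as text so it can be inspected.
inline const char* expanded_foo() { return STR(foo); }

// Small usage check: the '#' survives and no quotes were introduced.
inline void expanded_foo_demo() {
    assert(std::strchr(expanded_foo(), '#') != nullptr);
    assert(std::strchr(expanded_foo(), '"') == nullptr);
}
```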

Nicol Bolas

Oct 12, 2017, 12:37:40 PM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com
How about the type-erased virtual call in `function::operator()`? If you look at this feature as "token pasting", there's no way that you can have a virtual call involve "token pasting". This is not a thing that can happen through runtime polymorphism of arbitrary numbers of objects.

Even if you remove the "token pasting" aspect and just turn it into static control over the outer scope, it still can't really happen through runtime polymorphism. Not without effectively throwing an exception.



--
Matthew

inkwizyt...@gmail.com

Oct 12, 2017, 12:43:12 PM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com
As I said in another post, this needs to be part of the signature. With that, the ABI could allow passing more than one return address to the function to handle additional exit paths.

Nicol Bolas

Oct 12, 2017, 1:08:52 PM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com
That's a pretty substantial change to the idea. It's originally specified as being "macro"-like, with token pasting and the like. It's not supposed to get to the level of being part of an ABI.

What you're now talking about is a thing where a function can return in odd ways. At which point, it's really just an oddball form of exception handling, except with:

1: The caller has to explicitly ask for it.

2: The callee has to explicitly declare in its prototype that it can happen (ie: the opposite of `noexcept`).

3: There are strict limitations on what the function can do besides return values.

Indeed, object destructors might even want to do different things (transaction rollbacks) if they're destroyed due to unwinding from these kinds of returns, relative to regular scope exits.

The moment you allow this feature to work with non-static cases, you jump directly into the realm of exception handling. Even if it is somewhat limited, it's still exception handling.

Matthew Woehlke

Oct 12, 2017, 1:13:32 PM
to std-pr...@isocpp.org, Nicol Bolas
On 2017-10-12 12:37, Nicol Bolas wrote:
> On Thursday, October 12, 2017 at 11:13:12 AM UTC-4, Matthew Woehlke wrote:
>> On 2017-10-10 22:04, Nicol Bolas wrote:
>>> Also, this doesn't solve the indirect call problem (ie: sticking `evil`
>>> in a `std::function` and making it work through that).
>>
>> Why not?
>>
>> std::function<int! ()> f = evil();
>> f!(); // as if `return 5`
>>
>> ...or:
>>
>> std::function<int () inline> f = bar();
>> inline f(); // as if `return 5`
>>
>> (This would need some partial specialization of std::function, but
>> otherwise what's the problem?)
>
> How about the type-erased virtual call in `function::operator()`?

That would be `function::operator() inline` (see previous notes about
*partial specialization*). This would have to dispatch in some way that
can detect if the actual underlying function issued an early return (or
- if you allow such things, and I'm *not* convinced we should - other
control statements) and pass that along to the call operator in some
other fashion, which the call operator could detect and re-issue the
`return`.

This seems like it would be possible to implement, but *how* to do so is
left as an exercise to the reader. (Particularly since I can't say that
I've ever needed such a feature.)

(Oh... another matter... I think such functions either can't yield a
value normally, which makes them less useful than statement expressions,
or they would need to specify both the yielded and
possibly-returned-through-caller return types.)

> Even if you remove the "token pasting" aspect and just turn it into static
> control over the outer scope, it still can't really happen through runtime
> polymorphism. Not without effectively throwing an exception.

I dispute that. See above.

--
Matthew

inkwizyt...@gmail.com

Oct 12, 2017, 1:20:33 PM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com
Yes, exactly. I have probably drifted away from the original proposal, but to me this is the only logical direction it can go.
Otherwise we will have non-function objects that look like functions but can't be used like any other function.
With these three points, these objects remain functions and could be used in other contexts too, such as type-erased virtual calls.

Matthew Woehlke

Oct 12, 2017, 1:21:21 PM
to std-pr...@isocpp.org, Nicol Bolas
On 2017-10-12 13:08, Nicol Bolas wrote:
> On Thursday, October 12, 2017 at 12:43:12 PM UTC-4, Marcin Jaczewski wrote:
>> As I said in other post, this need be part of signature, with this ABI
>> could allow passing more return address to function to handle additional
>> exit paths.
>
> That's a pretty substantial change to the idea. It's originally specified
> as being "macro"-like, with token pasting and the like. It's not supposed
> to get to the level of being part of an ABI.

It *has* to be part of the ABI, and no, this isn't a departure. The
original idea implicitly had the ABI "I am not an emitted function".

> What you're now talking about is a thing where a function can return in odd
> ways.

Well... yeah. That's the only way it *can* work if you're going to stick
it in a std::function, or allow it to be called when it can't be
actually inlined.

> At which point, it's really just an oddball form exception handling [...]

...except that the compiler is expected to inline where possible to
avoid all the nasty overhead of exceptions, and even when not, the
"unwinding" mechanism would look much different. This is much closer to
expected+try (which, I believe, was the intention), except the "failing
try" can return anything.

> Indeed, object destructors might even want to do different things
> (transaction rollbacks) if they're destroyed due to unwinding from these
> kinds of returns, relative to regular scope exits.

That wasn't part of the proposal, and I'm not *at all* convinced it
should be.

--
Matthew

Nicol Bolas

Oct 12, 2017, 1:23:37 PM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com
Just forget for the moment the question of this feature, and ask yourself this: can you inline through `function::operator()`? Have you seen an implementation of `std::function` with type erasure where compiler inlining through it is possible in all cases?

Because if you can't inline through `function`, then you certainly can't do what's being discussed here. As this is based on being able to inline.

I would say that the burden of proof is on you to show how you can implement `function` so that you can inline through `operator()`. In all cases.

> (Oh... another matter... I think such functions either can't yield a
> value normally, which makes them less useful than statement expressions,
> or they would need to specify both the yielded and
> possibly-returned-through-caller return types.)

... I was just about to say the same thing. That the function declaration needs to include the control structures that it wants to export, which also includes the type of the object it can export-return.

Nicol Bolas

Oct 12, 2017, 1:44:00 PM
to ISO C++ Standard - Future Proposals, jmck...@gmail.com
On Thursday, October 12, 2017 at 1:21:21 PM UTC-4, Matthew Woehlke wrote:
> On 2017-10-12 13:08, Nicol Bolas wrote:
>> On Thursday, October 12, 2017 at 12:43:12 PM UTC-4, Marcin Jaczewski wrote:
>>> As I said in other post, this need be part of signature, with this ABI
>>> could allow passing more return address to function to handle additional
>>> exit paths.
>>
>> That's a pretty substantial change to the idea. It's originally specified
>> as being "macro"-like, with token pasting and the like. It's not supposed
>> to get to the level of being part of an ABI.
>
> It *has* to be part of the ABI, and no, this isn't a departure.

It only "has to" if these functions are actually C++ functions and can be dynamically called. The initial idea was that they're not really functions; they're macros written in C++.

Macros are not part of an ABI. If these macro-functions are intended to be macro-like and therefore always inlined, they're not part of the ABI. Especially the bit about accessing the local scope of where it's used; there's no way for that to be part of the ABI.
 
> The original idea implicitly had the ABI "I am not an emitted function".

That's called "not being part of the ABI". If you don't emit something, it isn't there.
 

>> What you're now talking about is a thing where a function can return in odd
>> ways.
>
> Well... yeah. That's the only way it *can* work if you're going to stick
> it in a std::function, or allow it to be called when it can't be
> actually inlined.
>
>> At which point, it's really just an oddball form exception handling [...]
>
> ...except that the compiler is expected to inline where possible to
> avoid all the nasty overhead of exceptions, and even when not, the
> "unwinding" mechanism would look much different.

How would it look "much different" in the non-inlining case?

Also, if the compiler could inline exception throwing in more places, why don't we see people writing code that relies on that?

The more this feature takes a non-static shape, the more it looks like exception handling. And the more likely that it will have all of exception handling's problems.

We're even discussing making people declare the type(s?) of "exceptions" that their functions "throw" in the function's signature, since we want them to have to specify the return type and the injected return type.
 
> This is much closer to
> expected+try (which, I believe, was the intention), except the "failing
> try" can return anything.
>
>> Indeed, object destructors might even want to do different things
>> (transaction rollbacks) if they're destroyed due to unwinding from these
>> kinds of returns, relative to regular scope exits.
>
> That wasn't part of the proposal,

Them working without being inlined wasn't "part of the proposal" either, but you didn't seem to have a problem with that expansion.
 
> and I'm not *at all* convinced it should be.

I'm not convinced it shouldn't be. Once you have multiple reasons for invoking the destructor of a class, then it is reasonable to investigate what that destructor's execution means.

Once you allow this feature to be used in non-static call trees, you are giving people a way to avoid exception handling. An alternative way of signaling failure up the call stack without throwing a genuine exception. At which point, users are going to start using this frequently, just to avoid exceptions. Which means that destructors which do different things in the presence of exceptions need to be able to also do different things in the presence of these "exceptions-in-all-but-name".

The potential need for this is not something which should be easily dismissed.

To put it another way, if we do this:

auto _ = scope_fail(...);

auto val = try#(some_expression);

If that `try#` function exits with an invoked return, should the `scope_fail` be called or not? I think it is very naive to think that the answer should be "no". This is something that needs to be considered very carefully.
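As a data point for that question, here is a minimal sketch of how a `scope_fail`-style guard is typically built today, assuming it keys off `std::uncaught_exceptions()`. `scope_fail_guard` and `rollback_ran` are hypothetical names, not the proposed library type. With this construction a plain early `return` does not trigger the callback, so a `return` injected by something like `try#` would presumably not trigger it either unless the rules were changed.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <utility>

// Hypothetical minimal guard in the style of scope_fail: runs its callback
// only if the scope is being left because of an exception.
template <class F>
class scope_fail_guard {
public:
    explicit scope_fail_guard(F f)
        : f_(std::move(f)), pending_(std::uncaught_exceptions()) {}
    scope_fail_guard(const scope_fail_guard&) = delete;
    scope_fail_guard& operator=(const scope_fail_guard&) = delete;
    ~scope_fail_guard() {
        // More exceptions in flight than at construction: we are unwinding.
        if (std::uncaught_exceptions() > pending_) f_();
    }

private:
    F f_;
    int pending_;
};

// Leaves the guarded scope either by a plain early return or by throwing,
// and reports whether the rollback callback ran.
inline bool rollback_ran(bool leave_by_throw) {
    bool rolled_back = false;
    try {
        scope_fail_guard guard([&] { rolled_back = true; });
        if (leave_by_throw) throw std::runtime_error("failure");
        return rolled_back;  // early return: no exception in flight
    } catch (const std::runtime_error&) {
        // swallow; the guard has already run during unwinding
    }
    return rolled_back;
}

// Small usage check covering both ways of leaving the scope.
inline void rollback_demo() {
    assert(!rollback_ran(false));
    assert(rollback_ran(true));
}
```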


--
Matthew

Matthew Woehlke

Oct 12, 2017, 2:07:31 PM
to std-pr...@isocpp.org, Nicol Bolas
On 2017-10-12 13:23, Nicol Bolas wrote:
> Just forget for the moment the question of this feature, and ask yourself
> this: can you *inline* through `function::operator()`?

If by "inline", you require the absence of some form of jump instruction
to go from the caller to the callee, then... *probably* not. Not sanely,
anyway. It would require dynamic code execution of some sort, and I
don't care to even *think* about how that might work.

If I can have that jump... well, I don't want to think that hard either.
I'm not convinced it is absolutely impossible. However, I'm also not
convinced it's relevant to this discussion.

> Because if you can't inline through `function`, then you certainly can't do
> what's being discussed here. As this is based on being able to inline.

No, really, it isn't.

What's being discussed here is the ability to turn this syntactic sugar:

int bar() inline { return 5; }

int foo()
{
    inline bar();
}

...into this:

std::optional<int> bar() { return 5; }

int foo()
{
    {
        auto __x = bar();
        if (__x) return *__x;
    }
}

(Things get a little more complicated if you can both yield a value,
like a statement-expression can, and *also* cause a return in the
calling scope. In that case, you have something like pair<optional<YT>,
optional<RT>>, though hopefully with less overhead than literally using
a pair of optionals. Also, note that I don't mean to imply this would
*actually use* std::optional. I'm merely using known constructs to
illustrate the manner in which it might work.)

...and with the compiler being "strongly encouraged" to perform the
"obvious" optimizations (which would make it equivalent to actual token
pasting, when possible).

(And yes, if the compiler can't inline, we probably need to specify that
an additional copy/move might occur. But maybe not, either, if the call
can pass the caller's return space to the callee. Again, ABI details...)
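A hand-written, runnable version of that transformation might look as follows. This is only a sketch: it really does use `std::optional`, which the actual feature would not need to, and the `want_return` parameter is an assumption added so that both paths can be exercised.

```cpp
#include <cassert>
#include <optional>

// Stand-in for `int bar() inline { return 5; }`: an engaged optional means
// "make my caller return this value".
std::optional<int> bar(bool want_return) {
    if (want_return) return 5;
    return std::nullopt;  // no injected return; the caller carries on
}

int foo(bool want_return) {
    // Hand-written equivalent of the proposed `inline bar();`
    if (auto injected = bar(want_return)) return *injected;
    return 0;  // reached only when bar() did not ask for a return
}

// Small usage check covering both paths.
inline void foo_demo() {
    assert(foo(true) == 5);
    assert(foo(false) == 0);
}
```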

> I would say that the burden of proof is on you to show how you can
> implement `function` so that you can inline through `operator()`. In all
> cases.

(You keep saying `operator()` when I think you mean `operator() inline`.
I agree that you can't implement it through `operator()`, because
`operator()` can't cause a return in its caller's scope. A `function<F>`
where `F` is one of these new critters, whatever we're calling them,
would *not have* an `operator()`.)

yield_or_return<T> yielded(T&&);
yield_or_return<T> returned(T&&); // same as implicit ctor

yield_or_return<T> dispatch(...)
{
    return yielded{inline m_fptr(...)};
}

YT operator()(...) inline
{
    auto&& y_or_r = dispatch(...);
    if (y_or_r.has_return())
        return y_or_r.returned_value(); // causes return in caller
    yield y_or_r.yielded_value();
}

...better?

If you *really* want to get fancy, you could extend this to work with
other control flow statements. However, I think that way lies madness; I
don't see how it would work without a combinatorial explosion of the ABI
specifying exactly what types of control flow can be invoked.
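To illustrate that the dispatch sketched above does not depend on inlining, here is a version in today's C++ where the callee sits behind a `std::function` and the call site re-issues the return by hand. The `yield_or_return` layout (a `std::variant` of two wrapper structs), `make_checker` and `describe` are all assumptions for illustration, not a proposed ABI.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <variant>

// Hypothetical result type: either a value yielded to the call site or a
// value the caller should return immediately.
template <class Y, class R>
struct yield_or_return {
    struct yielded { Y value; };
    struct returned { R value; };
    std::variant<yielded, returned> state;

    bool has_return() const { return std::holds_alternative<returned>(state); }
    const Y& yielded_value() const { return std::get<yielded>(state).value; }
    const R& returned_value() const { return std::get<returned>(state).value; }
};

using parse_result = yield_or_return<int, std::string>;

// Type-erased callee: the compiler cannot count on inlining through this.
inline std::function<parse_result(int)> make_checker() {
    return [](int x) -> parse_result {
        if (x < 0) return {parse_result::returned{"negative"}};
        return {parse_result::yielded{x * 2}};
    };
}

std::string describe(int x) {
    const auto check = make_checker();
    const parse_result r = check(x);
    if (r.has_return()) return r.returned_value();  // re-issued return
    return "ok:" + std::to_string(r.yielded_value());
}

// Small usage check covering the yield path and the return path.
inline void describe_demo() {
    assert(describe(21) == "ok:42");
    assert(describe(-1) == "negative");
}
```

The explicit `if` in `describe` is the part the proposed `inline` call syntax would generate.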

>> (Oh... another matter... I think such functions either can't yield a
>> value normally, which makes them less useful than statement expressions,
>> or they would need to specify both the yielded and
>> possibly-returned-through-caller return types.)
>
> ... I was just about to say the same thing. That the function declaration
> needs to include the control structures that it wants to export, which also
> includes the type of the object it can export-return.

Agreed. I'd be rather strongly inclined to limit what it can do to
'yield a YT' or 'cause the caller to return a RT' (see "combinatorial
explosion" comment, above). This also makes it easier on the compiler; a
caller can only call something that caller-returns an RT if the caller
itself returns a type that is constructible from an RT.

For similar reasons, *if* we allowed other things, having that as part
of the signature would be important to forbid calling something that
might `continue` from a point where we can do no such thing.

--
Matthew

Matthew Woehlke

Oct 12, 2017, 2:31:04 PM
to std-pr...@isocpp.org, Nicol Bolas
On 2017-10-12 13:44, Nicol Bolas wrote:
> On Thursday, October 12, 2017 at 1:21:21 PM UTC-4, Matthew Woehlke wrote:
>> It *has* to be part of the ABI, and no, this isn't a departure.
>
> It only "has to" if these functions are actually C++ functions and can be
> dynamically called.

I am arguing that *not* being able to dynamically call these is also
"part of the ABI". "I don't have one and you can't call me dynamically"
is still a specification of ABI.

But whatever. Semantics. Arguing this is not productive.

> The initial idea was that they're not really functions; they're
> macros written in C++.

...which is fine, but then they are necessarily inlined and can't exist
cross-TU. And, critically, can't be stuffed into anything resembling
std::function. (Which *you* apparently wanted. I'm trying to explain how
this could work for that use case.)

> Especially the bit about accessing the local scope of where it's used;
> there's no way for that to be part of the ABI.

I'll refrain from making muttering noises about implicitly passed
parameters. I'm even *less* convinced this is a desirable feature than
being able to stuff the critters in std::function and such ilk.

Ah, well, in general I see the problem. I've never really been talking
about the *original* idea at all. I've been expanding on the "let's copy
Rust" idea (which did not involve lambdas and manipulation of caller
state outside of control flow, AFAICT).

>> On 2017-10-12 13:08, Nicol Bolas wrote:
>>> At which point, it's really just an oddball form exception handling
>>
>> ...except that the compiler is expected to inline where possible to
>> avoid all the nasty overhead of exceptions, and even when not, the
>> "unwinding" mechanism would look much different.
>
> How would it look "much different" in the non-inlining case?

...because the "unwinding" is through an explicitly injected `if` (like
in the `try` proposal). Also see my other reply.

> Also, if the compiler could inline exception throwing in more places, why
> don't we see people writing code that relies on that?

...Apparently exceptions can't be inlined. Ever. (At least according to
Niall. See discussion at
https://groups.google.com/a/isocpp.org/d/msg/std-proposals/h9QrXu5agQM/uyK7IGiVAAAJ.)

At any rate, I believe exception handling is generally more complicated
than `if`. (Otherwise, why do people like `expected`?)

>> [stuff] wasn't part of the proposal,
>
> Them working without being inlined wasn't "part of the proposal" either,
> but you didn't seem to have a problem with that expansion.

Excuse me? *You* were the one complaining that "this doesn't solve the
indirect call problem". Now you're giving *me* flak for trying to do that?

If you don't actually want that, then fine. Let's stop discussing it.

>> I'm not *at all* convinced that [dtors behaving different due to
>> "normal" vs. "callee-injected" return] should be [part of the
>> proposal].
>
> I'm not convinced it *shouldn't* be. Once you have multiple reasons for
> invoking the destructor of a class, then it is reasonable to investigate
> what that destructor's execution *means*.

Okay, since you can't seem to make up your mind, now *I'm* going to play
the "this wasn't part of the original idea; why should we add it?" card...

If we go back to the original idea (either the *original* original idea,
or the one I first started picking at), which was for this to be "macro
like", I see no reasons there why return-via-callee should be different
from local return. Indeed, that almost seems *contrary* to the original
idea.

> Once you allow this feature to be used in non-static call trees, you are
> giving people a way to avoid exception handling. An alternative way of
> signaling failure up the call stack without throwing a genuine exception.

I would argue if you're really using it that way, you should be using
`expected`.

Maybe this is an argument *against* allowing this feature in anything
but *actual* token-pasting "modes". (Ergo, I return the ball to you, who
requested that in the first place, to submit arguments why we should
allow this to work through std::function.)

--
Matthew

torto...@gmail.com

Oct 15, 2017, 8:39:23 PM
to ISO C++ Standard - Future Proposals
I see several things in this thread. On the one hand there is an attempt to focus on designing specific and safe high-level abstractions to particular problems (i.e. try and co_await)
and on the other there is trying to come up with a more general purpose mechanism that would make these easier to do.
I'm more interested in the general purpose problem myself.
I am surprised to see what I consider two potentially quite separate facilities being conflated:
  1. making something happen in a different (parent) scope
  2. meta-programming / code generation
To my mind, these two things are quite separate.

I don't know of any implementations of C++ that are not stack based.
However, as has been pointed out, C++ does not formally identify a stack frame as a concept.

Instead it recognises a limited set of things you can do with it.
* declare variables  (though there is nothing to say these actually have to be on the stack)
* return
* unwind via an exception

A common library extension is to generate a stack trace.

Tcl has an interesting 'function' called uplevel which lets you execute code in the parent scope
(there is also an upvar which is similar but just exposes a single variable).
This is powerful enough to let you implement new control flow structures. For example "continue" can be a function.
Some other languages have similar facilities.

This is a powerful facility but open to abuse (and thus a perfect fit for C++ :-).

So why not attempt to formalise a minimal definition of call stack and stack frame that would make this possible
(allowing for threads and coroutines and perhaps more novel structures where there is a tree of stacks)?

Just as the old setcontext functions can be used to implement threads, it would be nice to have something that could be used
to implement novel control flow and things like operator try.
This does suffer from all the dangers pointed out about letting a mere function call alter the control flow invisibly. However, I see this as a Spider-Man case: "with great power comes great responsibility".

I think this can be kept quite straightforward.
Something like (straw man for picking apart):

#include <utility>  // std::pair

// 'catch' is a keyword and cannot name a type, so an opaque placeholder
// type stands in for "a catch block" here.
struct catchBlock;

struct stackFrameInfo
{
   // obtain the address of the memory location where the return value of this stack frame will be stored,
   // or null if the stack frame belongs to a function with a void return.
   void* getReturnValue() const;

   // go up one level to the next (non-inline) calling function.
   const stackFrameInfo* parentFrame() const;

   // obtain the actual data in the stack frame
   // which may include:
   //    local variables - one of which might be 'the return value'
   //    the return address
   //    an exception handler
   std::pair<void*,void*> getFrame() const;

   // obtain the frame of the function that is currently executing
   static const stackFrameInfo* getCurrentStackFrame();

   // obtain a reference to a catch block belonging to this stack frame
   // or null if there isn't one.
   catchBlock* getCatchBlock() const;

   // obtain a reference to the currently active exception handler for this frame
   // this could be a catch block in a function higher up the stack.
   catchBlock* getExceptionHandler() const;
};

Much as I would love to have a meta programming interface where any scope is a programmable object
I would not have stackFrameInfo.declareVariable(). That's the compiler's job.
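For comparison, GCC and Clang already expose a small slice of this information through the builtins __builtin_frame_address and __builtin_return_address. The sketch below only records the two addresses for the current function (level 0) and never dereferences them. FrameSnapshot, inner and outer are hypothetical names, and the noinline attribute is a compiler extension, so treat this as an illustration rather than portable C++.

```cpp
#include <cassert>

// Hypothetical illustration only; not part of the straw man interface.
struct FrameSnapshot
{
    void* frame;          // frame address of the function that took the snapshot
    void* returnAddress;  // address that function will return to
};

// noinline keeps each function in its own frame so the addresses differ.
__attribute__((noinline)) FrameSnapshot inner()
{
    return { __builtin_frame_address(0), __builtin_return_address(0) };
}

__attribute__((noinline)) FrameSnapshot outer(FrameSnapshot& innerOut)
{
    innerOut = inner();  // snapshot taken one call deeper
    return { __builtin_frame_address(0), __builtin_return_address(0) };
}
```

The snapshot taken in inner is only a pair of addresses once inner has returned. Turning that into something you can safely act on is exactly the part that would need standardising.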

However I would like syntactic sugar for getting at variables by name.
So given

int foo(std::string argA)
{
   int bar = 2;
   int zuul = 3;
   return zuul;
}

foo.return  refers to zuul
foo.bar      refers to bar
foo.argA    refers to argA

Whether these are legal expressions unfortunately depends upon whether they are accessible at runtime and not optimised away.
However, the act of referring to them by name requires them not to be optimised away.

This is why you couldn't easily have a stack frame include a heterogeneous container of variables you could iterate over at runtime.

However, this means you need to know at compile time that foo's stack frame is accessible, which perhaps means most of the
stackFrameInfo interface must be constexpr somehow.

Using something like this you could hopefully implement this "try" operator as a simple function.
That still leaves the problem of marking its use explicitly as "evil" in the same sense as an "evil" lambda.
Put another way: make "evil" a first-class concept rather than restricting it to "evil lambdas".
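For contrast, this is roughly how an early-returning "try" has to be spelled today, using a C macro to inject the return into the caller. It is a sketch under assumptions: TRY_ASSIGN, parseDigit and addDigits are made-up names, and std::optional stands in for whatever result type operator try would actually work on.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: parse a single decimal digit, or fail.
std::optional<int> parseDigit(char c)
{
    if (c < '0' || c > '9') return std::nullopt;
    return c - '0';
}

// Declares `var` in the caller's scope, or returns std::nullopt from the
// caller. Only a macro can inject that `return` today.
#define TRY_ASSIGN(var, expr)                   \
    auto var##_result = (expr);                 \
    if (!var##_result) return std::nullopt;     \
    auto var = *var##_result

std::optional<int> addDigits(char a, char b)
{
    TRY_ASSIGN(x, parseDigit(a));  // may return from addDigits
    TRY_ASSIGN(y, parseDigit(b));  // may return from addDigits
    return x + y;
}
```

With a formalised notion of the caller's frame, TRY_ASSIGN could in principle be an ordinary (explicitly "evil") function instead of a macro.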

You could also make a standard stacktrace library as a pure library implementation.

Now for the code generation / meta-programming part: why am I not hearing more talk using more 'standard' concept names like
AST macros and hygienic macros?

