Document number: DxxxxR0
Barry Revzin
2016-10-16

Abbreviated Lambdas for Fun and Profit

Motivation
Proposal

=> expr
Omitting types
Unary operator>>

Examples
Effects on Existing Code
Prior Work
Acknowledgements and References

Motivation

There are two, somewhat related motivations for an abbreviated lambda syntax. The first is to address the problem of trying to pass in overload sets as function arguments [1]:

template <class T>
T twice(T x) { return x + x; }

template <class I>
void f(I first, I last) {
    transform(first, last, twice); // error
}

C++14 generic lambdas allow us a way to solve this problem by just wrapping the overloaded name in a lambda:

transform(first, last, [](auto&& x) {
    return twice(std::forward<decltype(x)>(x));
});

But that isn't actually correct, although it's the "obvious" code that most people would produce. It's not SFINAE-friendly and it's not noexcept-correct, which can lead to avoidable errors:

struct Widget;

bool test(int );
bool test(Widget );

void invoke(std::function<bool(int)> );         // #1
void invoke(std::function<bool(std::string)> ); // #2

// error: unresolved overloaded function type
invoke(test);             

// still error: no known conversion from std::string to int or Widget
invoke([](auto&& x) {       
    return test(std::forward<decltype(x)>(x));
});

You'd really have to write:

// OK: calls #1
invoke([](auto&& x)
    noexcept(noexcept(test(std::forward<decltype(x)>(x))))
    -> decltype(test(std::forward<decltype(x)>(x)))
{
    return test(std::forward<decltype(x)>(x));
});

And that's a lot to have to type. Which brings me to the second motivation: not having to write that. For simple lambdas, those lambdas whose entire body is return expr;, the noisy stuff you have to write to get it correct where you need to use it just drowns out the signal of what it was you wanted your lambda to do in the first place. That is assuming that I succeed in writing the same code in all three places without accidentally introducing subtle differences.

Arguably the only important code in the above block is the call to test itself. All I want to do is test(x), so why so much boilerplate?
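
The long form above can be checked against a C++14 compiler. The following is a minimal sketch rather than part of the original example: it assumes Widget is given a definition and that the invoke overloads return 1 or 2, purely so that the selected overload can be observed. The helper names which_invoke and demo are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

struct Widget {};  // assumption: made complete so that test(Widget) can be defined

bool test(int) { return true; }
bool test(Widget) { return false; }

// Assumption: each overload returns a tag so the caller can see which one ran.
int invoke(std::function<bool(int)>) { return 1; }          // #1
int invoke(std::function<bool(std::string)>) { return 2; }  // #2

// Hypothetical helper wrapping the call from the text.
int which_invoke() {
    return invoke([](auto&& x)
        noexcept(noexcept(test(std::forward<decltype(x)>(x))))
        -> decltype(test(std::forward<decltype(x)>(x)))
    {
        return test(std::forward<decltype(x)>(x));
    });
}

void demo() {
    // Overload #2 drops out because the lambda is not callable with std::string.
    assert(which_invoke() == 1);
}
```

Because the trailing decltype makes the lambda uncallable with a std::string, the std::function<bool(std::string)> constructor is removed from consideration and only #1 remains viable.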

Proposal

This paper proposes three language extensions. The extensions themselves are independent, but together combine to allow for terse, readable, correct lambdas and functions.

`=> expr`

This paper proposes the creation of a new lambda introducer, =>, which allows for a single expression in the body that will be its return statement. This will synthesize a SFINAE-friendly, noexcept-correct lambda by doing the code triplication for you.

That is, the lambda:

[](auto&& x) => test(x)

shall be exactly equivalent to the lambda:

[](auto&& x) noexcept(noexcept(test(x))) -> decltype(test(x)) { return test(x); }

When SFINAE is important, the repetition of the function body makes the code difficult to read. At best. At worst, the code becomes error-prone when one changes the body of the function while forgetting to change the body of the trailing decltype. Even when SFINAE is not important, for the simplest lambdas, brevity is important for readability and omitting the return keyword would be helpful.
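
For comparison, the closest approximation available in today's language is a macro that performs the same triplication. The sketch below assumes C++14; RETURNS and callable_with are names invented for this illustration and are not part of the proposal. It shows that the expanded form is both SFINAE-friendly and noexcept-correct.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical macro, not part of this proposal: expands one expression into
// the noexcept-specifier, the trailing return type, and the function body.
#define RETURNS(...) \
    noexcept(noexcept(__VA_ARGS__)) -> decltype(__VA_ARGS__) { return __VA_ARGS__; }

bool test(int) noexcept { return true; }

// Stand-in for the proposed [](auto&& x) => test(x)
auto call_test = [](auto&& x) RETURNS(test(x));

// Detects whether F can be called with one argument of type Arg (C++14 idiom).
template <class F, class Arg, class = void>
struct callable_with : std::false_type {};

template <class F, class Arg>
struct callable_with<F, Arg,
    decltype(void(std::declval<F>()(std::declval<Arg>())))> : std::true_type {};

static_assert(callable_with<decltype(call_test), int>::value,
              "an int argument is accepted");
static_assert(!callable_with<decltype(call_test), std::string>::value,
              "std::string is rejected by substitution failure, not a hard error");
static_assert(noexcept(call_test(1)),
              "noexcept is propagated from test(int)");
```

A macro like this keeps the three copies of the expression in sync, but it remains a macro and has to be defined by every code base that wants it.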

The code duplication or triplication in short function bodies is also a common problem that wants for a solution [3]. Consider the overloads of std::begin and std::not_fn:

template <typename C>
constexpr auto begin(C& cont)
    -> decltype(cont.begin())
{
    return cont.begin();
}

template <class F>
struct not_fn {
    F f;
    
    template <class... Args>
    auto operator()(Args&&... args)
        noexcept(noexcept(!std::invoke(f, std::forward<Args>(args)...)))
        -> decltype(!std::invoke(f, std::forward<Args>(args)...)) {
        return !std::invoke(f, std::forward<Args>(args)...);
    }    
    
    // + overloads for const, etc.
};

which could be reduced, without loss of functionality, to:

template <typename C>
constexpr auto begin(C& cont)
    => cont.begin();

template <class F>
struct not_fn {
    F f;
    
    template <class... Args>
    auto operator()(Args&&... args)
        => !std::invoke(f, std::forward<Args>(args)...);

    // + overloads for const, etc.
};

These shorter implementations are easier to read, easier to write, and easier to maintain.

Omission of types in lambdas

One of the motivations of generic lambdas was to use auto as a substitute for long type names, which helps quite a bit. But since auto&& has become such a regular choice of parameter type for lambdas, and is rarely a wrong choice, it doesn't really add any information to the code. This paper proposes allowing the type to be omitted, in which case it will be assumed to be auto&&. That is:

[](x) { return x+1; }

shall be exactly equivalent to the lambda:

[](auto&& x) { return x+1; }

Parameters with omitted types can be interspersed with fully declared parameters, and a leading ... will indicate a parameter pack of forwarding references.

[](x, int y) { return x < y; }
[](...args) => test(std::forward<decltype(args)>(args)...)

shall be exactly equivalent to the lambdas

[](auto&& x, int y) { return x < y; }
[](auto&&... args) noexcept(noexcept(test(std::forward<decltype(args)>(args)...))) -> decltype(test(std::forward<decltype(args)>(args)...)) { return test(std::forward<decltype(args)>(args)...); }

No default arguments will be allowed in the case of type omission, due to potential ambiguities in parsing.
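
The reason auto&& is a reasonable default can be seen in today's syntax. The following C++14 sketch spells out what the abbreviated forms would mean; the names less_than, deref and demo are hypothetical and chosen for this illustration only.

```cpp
#include <cassert>
#include <memory>

// What [](x, int y) { return x < y; } would mean under this proposal.
auto less_than = [](auto&& x, int y) { return x < y; };

// auto&& binds to lvalues and rvalues alike, so a move-only argument can be
// passed as a named object without being copied or moved from.
auto deref = [](auto&& p) { return *p; };

int demo() {
    std::unique_ptr<int> owner = std::make_unique<int>(41);
    int from_lvalue = deref(owner);                     // p is unique_ptr<int>&
    int from_rvalue = deref(std::make_unique<int>(1));  // p is unique_ptr<int>&&
    assert(owner != nullptr);                           // owner still holds its int
    return from_lvalue + from_rvalue;
}
```

A by-value parameter (auto p) would reject the named unique_ptr, which is one reason auto&& is rarely the wrong choice.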

Unary `operator>>`

One of the last sources of boilerplate is std::forward. In a lot of generic code, the uses of forward overwhelm all the rest of the code, to the point where many talks and examples just omit references entirely to save space. I'm occasionally tempted to introduce a macro (#define FWD(x) decltype(x)(x)) which is just wrong. Unlike std::move and std::ref, which are used to do non-typical things and deserve to be visible markers for code readability, std::forward is very typically used in the context of using forwarding references. It does not have as clear a need to be a signpost.

This paper would like to see a shorter way to forward arguments and proposes a non-overloadable unary operator >>, where >>expr shall be defined as static_cast<decltype(expr)&&>(expr).

This lambda:

[](auto&& x) { return test(>>x); }

is equivalent by definition to this one:

[](auto&& x) { return test(std::forward<decltype(x)>(x)); }

This operator will have equivalent precedence to the other prefix operators, like operator!. While this paper is primarily focused on abbreviated lambdas, unary operator>> will not be limited in scope to just lambdas.
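
The proposed definition can be tried out today by spelling the cast as a macro. This is only a sketch under that assumption: FWD here is a hypothetical macro that mirrors the static_cast given above, and category, with_cast and with_forward are illustrative names.

```cpp
#include <cassert>
#include <utility>

// Hypothetical macro spelling out the proposed definition of >>expr.
#define FWD(expr) static_cast<decltype(expr)&&>(expr)

// Report the value category the argument arrives with.
int category(int&)  { return 1; }  // called with an lvalue
int category(int&&) { return 2; }  // called with an rvalue

// Stand-in for [](auto&& x) { return category(>>x); }
auto with_cast = [](auto&& x) { return category(FWD(x)); };

// The std::forward spelling that the operator is defined to match.
auto with_forward = [](auto&& x) { return category(std::forward<decltype(x)>(x)); };

void demo() {
    int n = 0;
    assert(with_cast(n) == with_forward(n));  // both forward an lvalue
    assert(with_cast(5) == with_forward(5));  // both forward an rvalue
}
```

Reference collapsing does the work: when x is declared as int&, decltype(x)&& is still int&, and when x is declared as int&&, the cast yields an rvalue.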

Examples

Putting all three features together, binding an overloaded member function, func, to an instance, obj, as a function argument is reduced from:

[&obj](auto&&... args) noexcept(noexcept(obj.func(std::forward<decltype(args)>(args)...))) -> decltype(obj.func(std::forward<decltype(args)>(args)...)) { return obj.func(std::forward<decltype(args)>(args)...); }

to:

[&obj](...args) => obj.func(>>args...)

That is a reduction from 211 characters to 38.
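
To make the long form concrete, here is a minimal C++14 sketch of it in use. Obj, its two func overloads, apply_both and demo are all hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type with an overloaded member function.
struct Obj {
    int func(int n) const { return n + 1; }
    int func(const std::string& s) const { return static_cast<int>(s.size()); }
};

// Hypothetical consumer that calls its argument with two different types.
template <class F>
int apply_both(F f) {
    return f(1) + f(std::string("abc"));
}

int demo() {
    Obj obj;
    // C++14 long form; under this proposal: [&obj](...args) => obj.func(>>args...)
    auto bound = [&obj](auto&&... args)
        noexcept(noexcept(obj.func(std::forward<decltype(args)>(args)...)))
        -> decltype(obj.func(std::forward<decltype(args)>(args)...))
    {
        return obj.func(std::forward<decltype(args)>(args)...);
    };
    return apply_both(bound);  // func(int) yields 2, func(const std::string&) yields 3
}
```

Passing obj.func directly would not compile because it names an overload set; the lambda defers overload resolution until the call.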

Here are other examples of improved usage as compared to C++14 best practices.

Sorting in decreasing order: roughly comparable typing, but arguably clearer:

std::sort(begin(v), end(v), std::greater<>{});  // C++14
std::sort(begin(v), end(v), [](x,y) => x > y);  // this proposal

Sorting in decreasing order by ID

std::sort(begin(v), end(v), [](auto&& x, auto&& y) { return x.id > y.id; }); // C++14
std::sort(begin(v), end(v), std::greater<>{}, &Object::id);                  // ranges with projections
std::sort(begin(v), end(v), [](x,y) => x.id > y.id);                         // this proposal

Calling an overload where SFINAE matters and getting it wrong is a mess:

bool invoke(std::function<bool(int)> f);         // #1
bool invoke(std::function<bool(std::string)> f); // #2

invoke([](auto x) { return x == 2; });                     // error! (283 lines of diagnostic on gcc)
invoke([](auto x) -> decltype(x == 2) { return x == 2; }); // OK C++14: calls #1
invoke([](x) => x == 2);                                   // OK this proposal: calls #1

Chaining lots of functions together from range-v3: summing the squares under 1000:

// C++14
int sum = accumulate(ints(1)
                   | transform([](int i){ return i*i; })
                   | take_while([](int i){ return i < 1000; }));
				   
// this proposal
int sum = accumulate(ints(1) | transform([](i) => i*i) | take_while([](i) => i < 1000));

Effects on Existing Code

The token => can appear in code in rare cases, such as when passing the address of an assignment operator as a non-type template argument, as in X<Y::operator=>. However, such usage is incredibly rare, so this proposal would have very limited effect on existing code. Thanks to Richard Smith for doing a search.

Omitting a type in a lambda could change the meaning of code today. For example:

struct arg { arg(int ) { } };
auto lambda = [](arg ) { return sizeof(arg); };
int x = lambda(1);

In the language today, the lambda will be interpreted as taking a single unnamed argument of type arg, x will be sizeof(arg), which is probably 1. With this proposal, the lambda will be reinterpreted as taking a single forwarding reference named arg, so x will be sizeof(int). However, such code is rare - the lambda would have to not use the argument and either rely on the side effects of creating the parameter or other unevaluated expressions based on the parameter type name.
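
Both readings can be reproduced with a current compiler by writing the second one out explicitly. In this sketch, current_meaning and proposed_meaning are hypothetical helper names; the second lambda uses auto&& to stand in for the omitted type.

```cpp
#include <cassert>
#include <cstddef>

struct arg { arg(int) {} };

// Today's meaning: one unnamed parameter of type arg, so sizeof names the type.
std::size_t current_meaning() {
    auto lambda = [](arg) { return sizeof(arg); };
    return lambda(1);
}

// Meaning under this proposal, spelled in today's syntax: a forwarding
// reference named arg, which hides the type name inside the lambda.
std::size_t proposed_meaning() {
    auto lambda = [](auto&& arg) { return sizeof(arg); };
    return lambda(1);
}

void demo() {
    assert(current_meaning() == sizeof(arg));
    assert(proposed_meaning() == sizeof(int));
}
```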

Unary operator>> cannot appear in legal code today, so that is a pure language extension.

Prior Work

The original paper introducing what are now generic lambdas [2] also proposed extensions for omitting the type-specifier and dropping the body of a lambda if it's a single expression. This paper provides a different path towards that same goal.

The usage of => (or the similar ->) in the context of lambdas appears in many, many programming languages of all varieties. A non-exhaustive sampling: C#, D, Erlang, F#, Haskell, Java, JavaScript, ML, OCaml, Swift. The widespread use is strongly suggestive that the syntax is easy to read and quite useful.

Acknowledgements and References

Thanks to Andrew Sutton and Tomasz Kaminski for considering and rejecting several bad iterations of this proposal. Thanks to Richard Smith for looking into the practicality of this design. Thanks to Nicol Bolas for refocusing the paper as three independent language extensions.

[1] Overload sets as function arguments

[2] Proposal for Generic (Polymorphic) Lambda Expressions

[3] Return type deduction and SFINAE