Proposal for additional generic behaviour when dealing with functions


DeadMG

Nov 22, 2012, 6:51:21 PM
to std-pr...@isocpp.org
This proposal primarily aims to remove the differences between functions and function objects. Right now, the only reasons to use functions are explicit template arguments and the fact that they are more convenient to declare. Otherwise, function objects are superior in every way: they are more convenient to use, more flexible, and can be more efficient, since function pointers can interfere with inlining and similar optimizations.

I propose that, as a change to the current rules, the name of a function refers to a function object of unspecified type. When that function object is invoked, it forwards all of its arguments to that function. For member functions, `this` is implicitly stored as a reference, not a copy. Consider a trivial example:

template<typename T> bool less(T lhs, T rhs) { return lhs < rhs; }

Currently, if you wish to pass this function as a function object to std::sort, it must be instantiated with a type at the point where it is passed, in order to generate a function pointer, which is often undesirable. Not only can this interfere with inlining and other optimizations if a compiler is not sufficiently intelligent, but you also have to know T in advance and specify it explicitly. A function object does not have this drawback. Under this proposal it would be possible to call

std::sort(begin, end, less);

Here, the compiler would effectively generate a function object similar to the following:

struct __some_internal_type {
    template<typename... Args> auto operator()(Args&&... args) -> decltype(less(std::forward<Args>(args)...)) {
        return less(std::forward<Args>(args)...);
    }
};

std::sort(begin, end, __some_internal_type());
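
For readers who want to try this with a current compiler, here is a minimal hand-written sketch of the wrapper described above. The names `less_than`, `less_than_fn`, `sorted_ints` and `sorted_strings` are hypothetical (the function is renamed to stay clear of `std::less`), and the struct is written by hand, not generated by any compiler.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// The function template from the post, renamed to avoid any clash with std::less.
template <typename T>
bool less_than(T lhs, T rhs) { return lhs < rhs; }

// Hand-written stand-in for the function object the proposal would generate.
struct less_than_fn {
    template <typename... Args>
    auto operator()(Args&&... args) const
        -> decltype(less_than(std::forward<Args>(args)...)) {
        return less_than(std::forward<Args>(args)...);
    }
};

// The same comparator type works for different element types, because T is
// deduced at each call made inside std::sort.
inline std::vector<int> sorted_ints(std::vector<int> v) {
    std::sort(v.begin(), v.end(), less_than_fn{});
    return v;
}

inline std::vector<std::string> sorted_strings(std::vector<std::string> v) {
    std::sort(v.begin(), v.end(), less_than_fn{});
    return v;
}
```

Nothing at the call site names `T`; that is the convenience the proposal wants to get from the bare function name.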

In order to preserve legacy code, a conversion to function pointers of an appropriate type must of course be specified. It would be quite possible to do something similar with a polymorphic lambda, except that the lambda would have to be repeatedly implemented at every call site with all the boilerplate that involves. For a member, this would look more like

struct X {
    template<typename T> bool less(T lhs, T rhs) { return lhs < rhs; }
};
std::sort(begin, end, (expression that yields an lvalue or rvalue x).less);

For a pointer to X, then

std::sort(begin, end, (expression that yields an X*)->less);
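
As a rough illustration of what the member form would have to do, the sketch below spells it out with a C++14 generic lambda that holds the object by reference. It assumes `less` is made `const` so it can be called through a `const X&`; the helper name `sort_with_member` is hypothetical.

```cpp
#include <algorithm>
#include <vector>

struct X {
    // Same member template as in the post, with const added for this sketch.
    template <typename T>
    bool less(T lhs, T rhs) const { return lhs < rhs; }
};

// Today's spelling of "x.less" as a callable: the lambda stores x by
// reference (not a copy) and passes its arguments on to the member.
inline std::vector<int> sort_with_member(const X& x, std::vector<int> v) {
    auto bound = [&x](auto&& lhs, auto&& rhs) { return x.less(lhs, rhs); };
    std::sort(v.begin(), v.end(), bound);
    return v;
}
```

Because the object is held by reference, the callable must not outlive `x`, which is the same lifetime caveat the proposed form would carry.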

As another example, if the proposals to allow return type deduction for ordinary functions (not just lambdas) are accepted, it would become possible to write a freestanding function that can be chained indefinitely.

auto f(int i) {
    std::cout << i;
    return f;
}
int main() {
    f(5)(5)(5)(5)(5)(5)(5); // arbitrary
}
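
For comparison, a function object can already be chained like this today, because it can return a copy of itself. The sketch below is an assumption-laden analogue, not the proposed feature: `chain_printer` and `chained_demo` are hypothetical names, and the output goes to a caller-supplied stream rather than `std::cout` so the result can be inspected.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// A callable that prints its argument and returns a copy of itself.
struct chain_printer {
    std::ostream* out;  // where the digits are written
    chain_printer operator()(int i) const {
        *out << i;
        return *this;
    }
};

// Chains three calls and returns what was printed.
inline std::string chained_demo() {
    std::ostringstream os;
    chain_printer f{&os};
    f(5)(5)(5);
    return os.str();
}
```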

These proposed changes would significantly simplify the use of functions as function parameters and bring their usability on a par with function objects.

Nicol Bolas

Nov 22, 2012, 8:09:45 PM
to std-pr...@isocpp.org
On Thursday, November 22, 2012 3:51:21 PM UTC-8, DeadMG wrote:
This proposal primarily aims to remove the differences between functions, and function objects. Right now, the only reason to use functions is explicit template arguments, and because they are more convenient to specify. Else, function objects are superior in every way, as they are more convenient to use, more flexible, and can be more efficient, as function pointers can interfere with inlining and such optimizations.

I propose that, in alteration of what is currently defined, the name of a function in fact refers to a function object of undefined type. When that function object is invoked, it forwards all of its arguments to that function.

Perfect forwarding is not perfect in terms of performance. Perfect forwarding breaks input copy elision; if you have this function:

void SomeFunc(std::string str);

And you call it like this:

SomeFunc({"A String"});

The compiler can construct the temporary in place. On compilers that perform this elision, there will be no copying or moving of the function argument.

However, if the call goes through a perfect-forwarding wrapper, template argument deduction makes the wrapper behave effectively like this:

void SomeFuncFwd(std::string &&str) {SomeFunc(std::move(str));}

This will cause a move where there didn't need to be one.

Perfect forwarding is perfect in terms of semantics, not performance. A compiler-generated forwarding function might be able to be more perfect, but if that problem isn't dealt with, it's a non-starter.
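
The extra move can be observed with an instrumented type. The following is a small sketch under stated assumptions: `Tracked`, `sink`, `forward_to_sink` and the two counting helpers are hypothetical names, with `sink` standing in for `SomeFunc` and `forward_to_sink` for the forwarding wrapper.

```cpp
#include <utility>

// Instrumented type that counts how often it is move-constructed.
struct Tracked {
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) = default;
    Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::moves = 0;

// By-value sink, playing the role of SomeFunc(std::string).
inline void sink(Tracked) {}

// Perfect-forwarding wrapper, playing the role of the generated function object.
template <typename T>
void forward_to_sink(T&& arg) { sink(std::forward<T>(arg)); }

// Number of moves when the temporary is passed directly.
inline int moves_direct() {
    Tracked::moves = 0;
    sink(Tracked{});
    return Tracked::moves;
}

// Number of moves when the same temporary goes through the forwarder.
inline int moves_forwarded() {
    Tracked::moves = 0;
    forward_to_sink(Tracked{});
    return Tracked::moves;
}
```

In the direct call the parameter is initialized from the temporary with no move (guaranteed since C++17, and elided by mainstream compilers before that), while the forwarder binds a reference first and then has to move into the by-value parameter.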

For member functions, this is implicitly stored as a reference, not a copy. For example, consider a trivial example:

template<typename T> bool less(T lhs, T rhs) { return lhs < rhs; }

Currently, if you wish to pass this function as a function object to std::sort, it must be instantiated with a type at the pass site to generate a function pointer, which is often undesirable. Not only does it have the potential to interfere with inlining and other optimizations if a compiler is not sufficiently intelligent, but you have to know T in advance and explicitly specify it. A function object does not have this drawback. It would now be possible to call

std::sort(begin, end, less);

I want to make sure I understand something here. You want to make massive changes to the way functions work, which will cause a number of issues (as I'll explain in a bit), all just so that you don't have to type this:

struct less
{
    template<typename T> bool operator()(T lhs, T rhs) { return lhs < rhs; }
};

The difference between these is the location of the name, the use of `struct {};`, the use of `operator()`, and the fact that calling it requires you to use `less()` to create an instance.

You are talking about a change that will be far from transparent for users. All so that you don't have to write `struct` quite so often. I don't think it's worth it.

Here, the compiler would effectively generate a function object similar to the following:

struct __some_internal_type {
    template<typename... Args> auto operator()(Args&&... args) -> decltype(less(std::forward<Args>(args)...)) {
        return less(std::forward<Args>(args)...);
    }
};

std::sort(begin, end, __some_internal_type());

In order to preserve legacy code, a conversion to function pointers of an appropriate type must of course be specified.

So, what happens to this code?

void SomeFunc(int);
auto GetFunctor() -> std::function<decltype(SomeFunc)> { return {SomeFunc};}

In C++11, decltype(SomeFunc) is a function type, and is therefore legal as a parameter to std::function. In your proposed revision, it isn't; it's an object type. Which cannot be the type provided to a std::function.
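
The type distinction can be checked at compile time. This is a small sketch using standard type traits; `SomeFunctor` is a hypothetical hand-written function object added for contrast, and `SomeFunc` is given an empty body so the snippet links.

```cpp
#include <functional>
#include <type_traits>

void SomeFunc(int) {}

// decltype of a function name is a function type, so it can parameterize std::function.
static_assert(std::is_function<decltype(SomeFunc)>::value, "function type");
static_assert(std::is_same<decltype(SomeFunc), void(int)>::value, "exactly void(int)");

// A function object has a class type instead, which std::function<...> cannot
// take as its template argument.
struct SomeFunctor {
    void operator()(int) const {}
};
static_assert(!std::is_function<SomeFunctor>::value, "class type, not a function type");

// Well-formed today because decltype(SomeFunc) is void(int).
auto GetFunctor() -> std::function<decltype(SomeFunc)> { return {SomeFunc}; }
```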

This should be an opt-in mechanism, not a magical "everything that used to work one way works completely differently now" thing. If you want it, you should have to ask for it in the declaration, using syntax like so:

void ThisIsAFunction();
void ThisIsAFunctor() functor;

At which point, you don't have to deal with decaying to function pointers (though it could be allowed for non-template functors) or even the forwarding stuff anymore. We understand that `ThisIsAFunctor` declares an empty POD class type which has an overloaded `operator()` with the contents you provide. A global variable called `ThisIsAFunctor` will be instantiated, which is admittedly going to take some very creative wording to pull off, since you now have to deal with what happens when you stick one of these in a header. Is it externed everywhere? So which translation unit defines it?
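
One way to approximate what such a declaration might expand to, using only existing C++17 features, is sketched below. The `functor` specifier itself is not valid C++; the type name `ThisIsAFunctor_fn`, the `int(int)` signature and `call_with_41` are hypothetical choices made so the example has something to compute. A C++17 `inline` variable is one existing answer to the header question: it may be defined in a header and every translation unit refers to the same object.

```cpp
// An empty class type with an overloaded operator(), as described above.
struct ThisIsAFunctor_fn {
    int operator()(int x) const { return x + 1; }
};

// C++17 inline variable: definable in a header, one object across all TUs.
inline constexpr ThisIsAFunctor_fn ThisIsAFunctor{};

// Passing the object to a template involves no decay to a function pointer.
template <typename F>
int call_with_41(F f) { return f(41); }
```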

There would be no function overloading either, which is another thing that confounds your idea, since overloading is something that can happen in different translation units in different ways (different TUs can see different sets of functions with the same name). Type definitions can't be spread across translation units, so if you wanted to group them all in a type definition, you'd have a problem.

`functor` would be one of those "contextual identifier" things like "final" and "override". So we wouldn't even need a keyword.
 

DeadMG

Nov 22, 2012, 8:24:52 PM
to std-pr...@isocpp.org
Partly, I'd suggest that not permitting elision to work through perfect forwarding is a problem, and I think the rules for elision should be relaxed. However, there's plenty of precedent for compiler-generated special functions not having to obey the same rules. For example, new T() does not involve a move; the object is constructed in place, and there's no reason why similar wording cannot be applied here. In fact, you could argue that this would be an extra justification for permitting such a proposal.

There would be no function overloading either, which is another thing that confounds your idea, since overloading is something that can happen in different translation units in different ways (different TUs can see different sets of functions with the same name). Type definitions can't be spread across translation units, so if you wanted to group them all in a type definition, you'd have a problem.

No (although this is my explanation fail). The entire point is to simplify the use of overloaded functions, which currently require explicit casts and whatnot (as well as templates). After all, if I wrote the struct definition out manually, the compiler would still have to deal with that. When you use the function object, it has the same effect as calling the function with those arguments at that exact place in that exact TU. If that necessitates creating multiple types for multiple usage points, then that's what's necessary, and it is exactly what would happen if you used a lambda instead.

In C++11, decltype(SomeFunc) is a function type, and is therefore legal as a parameter to std::function. In your proposed revision, it isn't; it's an object type. Which cannot be the type provided to a std::function.

I had considered suggesting an implicit conversion to solve this problem, but I had forgotten about what would happen if you attempted to decltype it. In principle, Special Language Wording™ could intervene and say that it yields a function type instead (for example, as a special instance of my other proposal about automatic type conversions), but it would be much simpler to say that the new wording only applies where the expression does not currently name a single function, for example where the name is an overloaded function or a function template named without template arguments. Secondly, this would not prevent the use for member functions, which I feel is just as important as free functions, if not more so.

DeadMG

Nov 22, 2012, 8:45:20 PM
to std-pr...@isocpp.org
In addition, I'd like to ask you to consider not just a simplification perspective, but also a maintenance one. Currently, given

bool f(int lhs, int rhs) { return lhs < rhs; }
std::sort(begin, end, f);

This is conforming code (assuming the required context for calling std::sort). But if I add another overload of f, even one which is completely irrelevant, it breaks.

bool f(int lhs, int rhs) { return lhs < rhs; }
bool f(std::string lhs, std::string rhs) { return lhs < rhs; }
std::sort(begin, end, f); // Illegal

In addition, you need to change the call site depending on whether you want to choose an overload, or use an instantiation, of f.

If a proposal similar to this one were accepted, this code would now be legal. That is, you could use the same code regardless of the overloads or templates of f, unless for some reason you need to pick one explicitly (e.g. because of ambiguities).
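
For reference, these are the two call-site workarounds available today once f is overloaded. This is only a sketch; the helper names `sort_with_cast` and `sort_with_lambda` are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

bool f(int lhs, int rhs) { return lhs < rhs; }
bool f(std::string lhs, std::string rhs) { return lhs < rhs; }

// Workaround 1: select one overload by casting to an exact function pointer type.
inline std::vector<int> sort_with_cast(std::vector<int> v) {
    std::sort(v.begin(), v.end(), static_cast<bool (*)(int, int)>(f));
    return v;
}

// Workaround 2: wrap the call in a lambda so that overload resolution happens
// inside the body, where the argument types are known.
inline std::vector<std::string> sort_with_lambda(std::vector<std::string> v) {
    std::sort(v.begin(), v.end(),
              [](const std::string& a, const std::string& b) { return f(a, b); });
    return v;
}
```

Both forms have to be revisited whenever the overload set or the element type changes, which is the maintenance cost described above.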

Nicol Bolas

Nov 22, 2012, 10:00:30 PM
to std-pr...@isocpp.org
On Thursday, November 22, 2012 5:24:52 PM UTC-8, DeadMG wrote:
Partly, I'd suggest that not permitting elision to work through perfect forwarding is a problem, and I think the rules for elision should be relaxed.

That's pretty much not possible. Elision works because of a contract between the caller and the callee, defined by the function signature. In order to do its part in elision, the caller doesn't need to know more about what the callee does than the signature. Similarly, all the callee knows is that it's been given some memory; it doesn't care if something was copied into that memory or not.

In a 3-part system (caller, forwarder, and callee), everyone is blind to everyone else. The caller can only see the signature of the forwarder. It cannot assume that the forwarder is a forwarder at all. It knows nothing about what the forwarder is going to do, so it must assume that the forwarder will do what its call signature says it will do. If the forwarder asks for a reference, it gets a reference.

The forwarder gets a reference, not a value. So when it calls a function that takes a value, it has no means of passing that reference through. It must copy or move from it, because the function takes a value that it expects to be a unique value of that type, shared with nobody else. The forwarder can't know that the caller created a temporary that won't be used beyond this parameter. There is no special kind of "forwarding reference" to guarantee that (no matter how much Scott Meyers wants there to be one). And without that guarantee, the forwarder cannot do its part to create elision.


In fact, you could argue that this would be an extra justification for permitting such a proposal.

There would be no function overloading either, which is another thing that confounds your idea, since overloading is something that can happen in different translation units in different ways (different TUs can see different sets of functions with the same name). Type definitions can't be spread across translation units, so if you wanted to group them all in a type definition, you'd have a problem.

No (although this is my explanation fail). The entire point is to simplify the use of overloaded functions, which currently require explicit casts and whatnot (as well as templates). After all, if I wrote the struct definition out manually, the compiler would still have to deal with that. When you use the function object, it has the same effect as calling the function with those arguments at that exact place in that exact TU. If that necessitates creating multiple types for multiple usage points, then that's what's necessary, and it is exactly what would happen if you used a lambda instead.

You're not talking about a small compiler thing anymore. Now the compiler needs to create a new type every time the function is used (or at least, frequently). This is not a tweak or a minor change; this is a fundamental rewrite of some of the most basic parts of every compiler. Not to mention the sheer amount of wording changes that need to happen in the standard to explain how this all works.

Basically all of chapter 13 would need to be massively reworked and rewritten.

This is a highly dangerous thing you're suggesting, which can massively break C++ in unpleasant and difficult-to-detect ways. I'm not saying you shouldn't pursue it, but you're talking about a feature whose complexity demands a full study group investigating it. Not to mention how it might interact with the work of other study groups, like reflection (which is probably expecting functions to be actual functions and for individual overloads to be counted like individual overloads), concepts, and the like.

Considering that you're already working on the Unicode proposal and you mentioned something about an alternate iostreams idea, it may be too much for you to develop.

grigor...@gmail.com

Nov 23, 2012, 3:14:46 AM
to std-pr...@isocpp.org
 
A year ago, in a cpp-next thread, Giovanni Deretta proposed a new syntax for passing functions to higher-order functions:
 
hof([]min, 0, 1); 
 
where []id-expression is a way of capturing an overload set. I haven't seen any further development of that idea, but it seems like a really nice way to simplify usage of functions as function parameters without fundamental changes.
 
Regards,
Gregory

DeadMG

Nov 23, 2012, 3:54:36 AM
to std-pr...@isocpp.org
Nah, I didn't get along with my university and am currently unemployed, so.

You're not talking about a small compiler thing anymore. Now the compiler needs to create a new type every time the function is used (or at least, frequently). This is not a tweak or a minor change; this is a fundamental rewrite of some of the most basic parts of every compiler. Not to mention the sheer amount of wording changes that need to happen in the standard to explain how this all works.

This is not a big deal at all. It's just a polymorphic lambda which has been given a syntactic upgrade for maintainability. We already have polymorphic lambdas, or should do. There's nothing different between

sort(begin, end, less)

and

sort(begin, end, []<class... T>(T&&... args) { return less(std::forward<T>(args)...); });

You're asking the compiler to do the exact same amount of work, and specifying them in pretty much exactly the same way (except polymorphic lambdas are more general and thus require a lot more specifying). 

This is a highly dangerous thing you're suggesting, which can massively break C++ in unpleasant and difficult-to-detect ways.

Like what? I accept that changing it for functions which are not overloads or templates would be nasty, but existing code which tries to do that is quite ill-formed, and the proposed behaviour is easily specified and quite safe. 

Nicol Bolas

Nov 23, 2012, 4:18:29 AM
to std-pr...@isocpp.org

You are talking about changing the very meaning of what a function is. This will affect every function that calls another function. Code that passed around function pointers of a known type now will suddenly start passing around functors of unknown and unknowable types. "quite safe" is not how I would describe any change with such far-reaching implications.

DeadMG

Nov 23, 2012, 4:59:30 AM
to std-pr...@isocpp.org
If the wording is changed to specify only overloaded or template functions, then existing code will not have a single shred of meaning changed.

very meaning of what a function is.

Overloading, operator overloading and templates and lambdas changed that. All I'm doing is passing it around as an object in a simple, convenient package.

Again, all we're talking about is implicitly creating a very simple polymorphic lambda. There's nothing more to it.

Xeo

Nov 23, 2012, 9:15:22 AM
to std-pr...@isocpp.org, grigor...@gmail.com
I was thinking of that exact same syntax (not having seen the cpp-next thread). In my thinking, it would be simple syntactic sugar to lift a function (set) to a function object.

With the polymorphic lambda proposal, it's implementable as a preprocessor macro:

#define LIFT_FUN(name) [&]<class... Ts>(Ts&&... vs){ return name(std::forward<Ts>(vs)...); }

with which you can write sort(f, l, LIFT_FUN(less)).
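
A version of the macro that compiles as C++14 can be written with a generic lambda in place of the template-parameter-list syntax. This is a sketch, not the proposal's wording: the overloaded function `before` and the helper `sort_lifted` are hypothetical names chosen for the demonstration.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// C++14 spelling of the lifting macro: a generic lambda that forwards to name.
#define LIFT_FUN(name)                                      \
    [&](auto&&... vs) -> decltype(auto) {                   \
        return name(std::forward<decltype(vs)>(vs)...);     \
    }

// An overloaded function, which cannot be passed to std::sort by name alone.
bool before(int lhs, int rhs) { return lhs < rhs; }
bool before(const std::string& lhs, const std::string& rhs) { return lhs < rhs; }

// The same lifted expression works for either overload.
template <typename T>
std::vector<T> sort_lifted(std::vector<T> v) {
    std::sort(v.begin(), v.end(), LIFT_FUN(before));
    return v;
}
```

One of the pitfalls is visible here: the `[&]` capture default is only allowed on a lambda at block scope, so the macro as written cannot be used to initialize a namespace-scope variable.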

But as we all know, "macros are evil" and there are probably some subtle pitfalls with the macro version, so I'd really like to see this added to the standard:

sort(f, l, []less);

DeadMG

Nov 23, 2012, 10:43:08 AM
to std-pr...@isocpp.org, grigor...@gmail.com
ADL could present another problem. If you do

sort(f, l, less) 

and there is no less in scope, even though it could be found by ADL, then it would be ill-formed. The lambda version, however, would not be ill-formed. So I think that there may not be a problem with []less, which would be fine with ADL, and if you did something like auto p = []less, then there's no confusion that you're getting a function object.
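
The ADL difference can be reproduced with today's language. In this sketch the namespace `lib`, the type `Widget`, and the functions `precedes` and `sort_widgets` are all hypothetical; the point is only that the bare name is not visible at the call site while the call inside the lambda is.

```cpp
#include <algorithm>
#include <vector>

namespace lib {
struct Widget { int id; };
// Reachable from outside lib only through argument-dependent lookup.
inline bool precedes(const Widget& a, const Widget& b) { return a.id < b.id; }
}  // namespace lib

inline std::vector<lib::Widget> sort_widgets(std::vector<lib::Widget> v) {
    // std::sort(v.begin(), v.end(), precedes);  // error: 'precedes' is not declared here
    // Inside the lambda the call has arguments of type lib::Widget, so ADL
    // looks in namespace lib and finds lib::precedes.
    std::sort(v.begin(), v.end(),
              [](const lib::Widget& a, const lib::Widget& b) { return precedes(a, b); });
    return v;
}
```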

Tony V E

Nov 24, 2012, 1:30:04 AM
to std-pr...@isocpp.org
Nice.  So that would also work with the [] operator? [][]. :-)




On 23 November, 2012 6:43 PM, grigor...@gmail.com wrote:

By the way, if we allow [] with operators, then we would have a really concise way of passing operators to functions, for example
 
accumulate(v.begin(), v.end(), 1, []*); // instead of multiplies<>

grigor...@gmail.com

Nov 24, 2012, 4:48:41 AM
to std-pr...@isocpp.org
While [][] looks funny, I see no ambiguity here. So if we allow []+ and []*, we should allow [][] as well. There is an ambiguity with [](): it looks like the beginning of a lambda, so it should not be treated as the lifted form of operator().

On Saturday, November 24, 2012 08:30:09 UTC+2, Tony V E wrote:

DeadMG

Nov 24, 2012, 6:00:13 AM
to std-pr...@isocpp.org
You would need an object on which to call operator() anyway, so []() as in "Lifted operator()" doesn't really make sense.

Nicol Bolas

Nov 24, 2012, 6:15:22 AM
to std-pr...@isocpp.org, grigor...@gmail.com
On Saturday, November 24, 2012 1:48:41 AM UTC-8, grigor...@gmail.com wrote:
While [][] looks funny, I see no ambiguity here. So if we allow []+ and []*, we should allow [][] as well. There is an ambiguity with [](): it looks like the beginning of a lambda, so it should not be treated as the lifted form of operator().

Grammatically, it may be unambiguous, but it is inconsistent with the rest of C++.

The syntax should be `[]<function name>`. The name of the + operator function is `operator +`. Therefore, it should be `[]operator +`. Plus, it allows you to do proper scoping where appropriate, such as `[]some_namespace::operator +`, which is already standard C++ grammar for the name of the + operator function in `some_namespace`.

grigor...@gmail.com

Nov 24, 2012, 7:29:20 AM
to std-pr...@isocpp.org
I was not clear. In the same thread Giovanni Deretta proposed unification of member-functions and free-functions for []function syntax.
 

Regarding the []<class>::<member> syntax, after thinking about it more, it should probably be folded into the general []id-expression syntax. I.e.:

([]id-expression)(x);
should map to the first compilable of the following
x.id-expression;
x.id-expression();
id-expression(x);
 
So, my remark was in the context of that proposal. If that were possible, then []() would be analogous to boost::apply<>.
 
 

On Saturday, November 24, 2012 13:00:13 UTC+2, DeadMG wrote:

grigor...@gmail.com

Nov 24, 2012, 7:44:11 AM
to std-pr...@isocpp.org, grigor...@gmail.com
On Saturday, November 24, 2012 13:15:22 UTC+2, Nicol Bolas wrote:
On Saturday, November 24, 2012 1:48:41 AM UTC-8, grigor...@gmail.com wrote:
While [][] looks funny, I see no ambiguity here. So if we allow []+ and []*, we should allow [][] as well. There is an ambiguity with [](): it looks like the beginning of a lambda, so it should not be treated as the lifted form of operator().

Grammatically, it may be unambiguous, but it is inconsistent with the rest of C++.

The syntax should be `[]<function name>`. The name of the + operator function is `operator +`. Therefore, it should be `[]operator +`. Plus, it allows you to do proper scoping where appropriate, such as `[]some_namespace::operator +`, which is already standard C++ grammar for the name of the + operator function in `some_namespace`.
 
Strictly speaking, even []operator+ should not work for the built-in operator +, as it is not a function. If we aim for consistency, then as long as operator + (1,2) is not a valid expression, ([]operator+)(1,2) should not be one either.
And I'm not proposing that []operator + should be disallowed. I'm saying that if we have a shorthand syntax for calling a user-declared operator, then it would be logical to have a shorthand syntax for passing an operator to other functions.
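
For comparison, the transparent function objects added in C++14 (`std::plus<>`, `std::multiplies<>`) already cover both cases, because they apply the operator expression itself rather than naming an operator function. A small sketch, with hypothetical helper names:

```cpp
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Built-in + on int: operator+(1, 2) is not a valid expression, but
// std::plus<> evaluates lhs + rhs and therefore works.
inline int sum_ints(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0, std::plus<>{});
}

// User-declared operator+ on std::string goes through the same object type.
inline std::string concat(const std::vector<std::string>& v) {
    return std::accumulate(v.begin(), v.end(), std::string{}, std::plus<>{});
}

// The multiplication case from earlier in the thread, with an initial value of 1.
inline int product(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 1, std::multiplies<>{});
}
```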

 

Scott Prager

Nov 25, 2012, 8:07:08 AM
to std-pr...@isocpp.org, grigor...@gmail.com


On Saturday, November 24, 2012 4:48:41 AM UTC-5, grigor...@gmail.com wrote:
While [][] looks funny, I see no ambiguity here. So if we allow []+ and []*, we should allow [][] as well....

How would one differentiate between multiplication and dereferencing? Or between unary plus and minus and their binary forms?
 

grigor...@gmail.com

Nov 25, 2012, 9:18:24 AM
to std-pr...@isocpp.org, grigor...@gmail.com
In a context where a unary function object is expected, []* would act as the dereference operation, and in a context where a binary function object is expected, []* would act as multiplication.
This is natural as []function essentially captures an overload set of function.
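
A hand-written object with the same behaviour might look like the sketch below. `lifted_star` is a hypothetical name; the arity of the call selects between dereference and multiplication, which is all that "capturing the overload set" amounts to here.

```cpp
#include <utility>

// Hypothetical stand-in for what []* might denote: one object, two arities.
struct lifted_star {
    // Unary call: dereference.
    template <typename T>
    auto operator()(T&& x) const -> decltype(*std::forward<T>(x)) {
        return *std::forward<T>(x);
    }
    // Binary call: multiplication.
    template <typename A, typename B>
    auto operator()(A&& a, B&& b) const
        -> decltype(std::forward<A>(a) * std::forward<B>(b)) {
        return std::forward<A>(a) * std::forward<B>(b);
    }
};
```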

On Sunday, November 25, 2012 15:07:08 UTC+2, Scott Prager wrote:

DeadMG

Nov 25, 2012, 10:10:29 AM
to std-pr...@isocpp.org, grigor...@gmail.com
It really should be []operator+, not []+. Secondly, this is really getting away from the original proposal that I've written, which is for just names.

Alex B

Feb 1, 2013, 12:19:59 AM
to std-pr...@isocpp.org
I think the intent of the original proposal is quite justified; it would be really nice to have a simpler way of passing/returning a function call with (yet) unresolved overloads.
However, I do agree with other writers that going "all-in" and making everything become a function object is quite abusive. The proposed use of an explicit []id-expression syntax is a very interesting idea.

How about going the implicit way (like in the original proposal) but only when needed? I mean a function object could be implicitly created like in the original proposal, but only when an overload could not be resolved. So current function pointers would be kept as function pointers, and only specific cases which are not compiling with the current standard (because of overload resolution) would now compile by implicitly generating what I would call an "implicit polymorphic function object".

Also, in the original post, I see a second proposal (was it intended?) which was not discussed so far and should be considered on its own:

std::sort(begin, end, (expression that yields an lvalue or rvalue x).less);
std::sort(begin, end, (expression that yields an X*)->less);

Here, I interpret it as automatically generating a function object by binding a member function with an object on which to invoke it. I think it is a very interesting idea (that was maybe raised somewhere else).