Anonymous functions in gcc - what's the point of this?

Kenny McCormack

Aug 23, 2020, 5:13:16 AM

WARNING: This is all gcc-specific...

I found this code on some website; it does what it claims to do:

#include <stdio.h>

int main(void)
{
    int (*max)(int, int) =
    ({
        int __fn__ (int x, int y) { return x > y ? x : y; }
        __fn__;
    });

    printf("max(-5,5) = %d\n",max(-5,5));
}

Now, this is cute and all, but how is it any different from just declaring
max() the usual way? Given that functions and function pointers are
basically the same thing in C (just like arrays and pointers are basically
the same thing - yes, yes, I know that technically they are not, but don't
bother writing in to tell me that), when we write:

somefunction() { ... }

we know that, really, we're just creating a block of code and setting
somefunction to be the address of that function.

So, again, what is the advantage of the above code?

--
It's possible that leasing office space to a Starbucks is a greater liability
in today's GOP than is hitting your mother on the head with a hammer.

Malcolm McLean

Aug 23, 2020, 7:21:03 AM

On Sunday, 23 August 2020 at 10:13:16 UTC+1, Kenny McCormack wrote:
> WARNING: This is all gcc-specific...
>
> I found this code on some website; it does what it claims to do:
>
> #include <stdio.h>
>
> int main(void)
> {
> int (*max)(int, int) =
> ({
> int __fn__ (int x, int y) { return x > y ? x : y; }
> __fn__;
> });
>
> printf("max(-5,5) = %d\n",max(-5,5));
> }
>
> Now, this is cute and all, but how is it any different from just declaring
> max() the usual way? Given that functions and function pointers are
> basically the same thing in C (just like arrays and pointers are basically
> the same thing - yes, yes, I know that technically they are not, but don't
> bother writing in to tell me that), when we write:
>
> somefunction() { ... }
>
> we know that, really, we're just creating a block of code and setting
> somefunction to be the address of that function.
>
> So, again, what is the advantage of the above code?
>

I don't know about the gcc function specifically.

But one use of lambdas is to pass them to functions expecting a pointer to a
trivial function, like qsort. In standard C, you have to define the comparison
function at file scope, just to sort a list of structs on an integer or
string field. You can't see the comparison function at the point of the call,
so you have to scroll up to see if the sort is ascending or descending.

With a lambda, you just write the comparison function in situ and pass it in.
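
A minimal sketch of that idea, assuming gcc and the statement-expression trick from the original post. The struct, its fields, and the function names are made up for illustration, and this is not standard C:

```c
#include <stdlib.h>

/* Hypothetical record type, used only for this illustration. */
struct item {
    int id;
    const char *name;
};

/* Sort items by id, highest first. GNU C only: the comparator is a
   nested function defined inside a statement expression. */
void sort_items_descending(struct item *items, size_t n)
{
    qsort(items, n, sizeof items[0],
          ({
              /* The ordering is visible right at the call site. */
              int by_id_desc(const void *a, const void *b)
              {
                  const struct item *x = a;
                  const struct item *y = b;
                  /* Positive when x must come after y, i.e. x is smaller. */
                  return (x->id < y->id) - (x->id > y->id);
              }
              by_id_desc;
          }));
}
```

The comparator touches no local variables of sort_items_descending, so it should behave like an ordinary function once its address has been taken.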

David Brown

Aug 23, 2020, 8:19:22 AM

On 23/08/2020 11:13, Kenny McCormack wrote:
> WARNING: This is all gcc-specific...
>
> I found this code on some website; it does what it claims to do:
>
> #include <stdio.h>
>
> int main(void)
> {
> int (*max)(int, int) =
> ({
> int __fn__ (int x, int y) { return x > y ? x : y; }
> __fn__;
> });
>
> printf("max(-5,5) = %d\n",max(-5,5));
> }
>
> Now, this is cute and all, but how is it any different from just declaring
> max() the usual way? Given that functions and function pointers are
> basically the same thing in C (just like arrays and pointers are basically
> the same thing - yes, yes, I know that technically they are not, but don't
> bother writing in to tell me that), when we write:
>
> somefunction() { ... }
>
> we know that, really, we're just creating a block of code and setting
> somefunction to be the address of that function.
>
> So, again, what is the advantage of the above code?
>

In this case, I think they are just demonstrating the syntax. (It had
not occurred to me that gcc's statement expressions could be used as a
lambda like this.)

Anonymous functions are sometimes convenient as a way of keeping
relevant code close together. For example, if you are calling "qsort"
you might use a lambda for the comparison function, instead of having to
define the comparison function somewhere physically separate in the
source file.

Bart

Aug 23, 2020, 8:57:23 AM

On 23/08/2020 10:13, Kenny McCormack wrote:
> WARNING: This is all gcc-specific...
>
> I found this code on some website; it does what it claims to do:
>
> #include <stdio.h>
>
> int main(void)
> {
> int (*max)(int, int) =
> ({
> int __fn__ (int x, int y) { return x > y ? x : y; }
> __fn__;
> });
>
> printf("max(-5,5) = %d\n",max(-5,5));
> }
>
> Now, this is cute and all, but how is it any different from just declaring
> max() the usual way? Given that functions and function pointers are
> basically the same thing in C (just like arrays and pointers are basically
> the same thing - yes, yes, I know that technically they are not, but don't
> bother writing in to tell me that), when we write:
>
> somefunction() { ... }
>
> we know that, really, we're just creating a block of code and setting
> somefunction to be the address of that function.
>
> So, again, what is the advantage of the above code?

Probably just showing off, but in a bad way since I found it puzzling
until I noticed those ({ and }).

As you said, it only works in gcc, but I found it interesting to see
whether my language could express the same thing, but more clearly,
using little-used features:

proc start=
ref function(int,int)int maxx := (
function fn(int x,y)int = {max(x,y)}
cast(fn)
)

println =maxx(-5,5)
end

(The 'cast' is needed as there is a bug in matching function ptr types.
Also, 'max' is built-in anyway, hence 'maxx'. In this language,
statements and expressions are largely interchangeable; gcc's ({...}) is
how it works anyway.)

Output is:

MAXX(-(5),5)= 5

I don't know if the C version counts as a lambda function, but mine
doesn't and would fail in a version that required access to non-static
variables of the enclosing function.

BGB

Aug 23, 2020, 11:51:39 AM

On 8/23/2020 4:13 AM, Kenny McCormack wrote:
> WARNING: This is all gcc-specific...
>
> I found this code on some website; it does what it claims to do:
>
> #include <stdio.h>
>
> int main(void)
> {
> int (*max)(int, int) =
> ({
> int __fn__ (int x, int y) { return x > y ? x : y; }
> __fn__;
> });
>
> printf("max(-5,5) = %d\n",max(-5,5));
> }
>
> Now, this is cute and all, but how is it any different from just declaring
> max() the usual way? Given that functions and function pointers are
> basically the same thing in C (just like arrays and pointers are basically
> the same thing - yes, yes, I know that technically they are not, but don't
> bother writing in to tell me that), when we write:
>
> somefunction() { ... }
>
> we know that, really, we're just creating a block of code and setting
> somefunction to be the address of that function.
>
> So, again, what is the advantage of the above code?
>

IIRC, the GCC extension also allows the nested function to capture
variables from the parent scope (by reference), effectively serving as a
"lambda" which remains valid until the parent function returns.

IIRC, they fall back to plain functions (with no restrictions on
lifespan) if no variable capture takes place.
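
A small sketch of the by-reference capture, assuming gcc (nested functions are a GNU C extension). The function and variable names are invented for the example:

```c
/* GNU C only: a nested function that reads and writes the locals of
   its parent. Nothing is copied; add() works on the parent's own
   'total' and 'scale'. */
int sum_with_scale(const int *values, int n, int scale)
{
    int total = 0;

    void add(int v) { total += v * scale; }

    for (int i = 0; i < n; i++)
        add(values[i]);

    scale = 0;    /* add() sees this change, since it shares the variable */
    add(1000);    /* contributes 1000 * 0 = 0 */

    return total;
}
```

Here add() is only ever called directly, so its address never leaves the parent function.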

Theoretically, these can be useful for callbacks, but their utility is
limited to some extent. For some contexts, it would be more useful to
have a heap-allocated lambda with by-value capture semantics.

However:
This would require a way to free the lambdas (eg: "free()");
This would make sense to have a different syntax;
Implicitly, this assumes/requires that heap memory be executable.

Or, alternately, the C library provides an alternate sub-heap for this;
making the main heap executable poses a security risk (though, similar
goes for allowing the stack to be executable).

C++ lambdas support both by-reference and by-value capture semantics,
but are handled as a complex value type rather than a plain
function pointer, so they are less useful in contexts where one needs a
function pointer.

I guess I could consider adding something like (to my C compiler):
int (*foo)(int x, int y);
int z;

foo = __func__(int x, int y):int { return (x+y)*z; };

Which would essentially allocate the lambda on the heap and capture
things by value (if needed, capture-by-reference could be done indirectly).

In effect, it would create a heap-allocated thunk which loads its own
pointer into a register and then branches to the main function body
(located within the ".text" section as-usual).

The same compiler is shared between C and another language of mine which
has lambdas which use a basically similar mechanism (just, in this
language, one can use "delete" to free lambdas, well, and/or one can
just leak them into the ether, either way, *1...).

*1: This language also has the semantics that all allocated memory gets
freed when the main heap goes out of scope (eg: the process terminates)
which on most modern targets can be implemented reasonably cheaply...

Kaz Kylheku

Aug 23, 2020, 7:42:57 PM

On 2020-08-23, Kenny McCormack <gaz...@shell.xmission.com> wrote:
> WARNING: This is all gcc-specific...
>
> I found this code on some website; it does what it claims to do:
>
> #include <stdio.h>
>
> int main(void)
> {
> int (*max)(int, int) =
> ({
> int __fn__ (int x, int y) { return x > y ? x : y; }
> __fn__;
> });
>
> printf("max(-5,5) = %d\n",max(-5,5));
> }
>
> Now, this is cute and all, but how is it any different from just declaring
> max() the usual way?

It's just a demo of some GNU C features:

- a statement block in parentheses is a kind of compound expression. The
value of the last statement in the block is the expression's value.

- GNU C has local functions, like Pascal.

Normally you cannot safely return GNU C's local functions: they can be
passed down only. This is because they capture the lexical
environment (local variables), which is not a first-class object. In C,
lexical variables are blown away when their scope terminates.

The above max function doesn't have any captured lexical scope though,
so perhaps there is a special case for such functions.
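
To illustrate "passed down only", here is a rough sketch assuming gcc. The helper for_each_int and the other names are hypothetical. As far as I know, GCC usually implements the address of a capturing nested function with a small trampoline on the stack, so this may not run on systems that forbid an executable stack:

```c
/* Calls fn once for each integer in [lo, hi). */
static void for_each_int(int lo, int hi, void (*fn)(int))
{
    for (int i = lo; i < hi; i++)
        fn(i);
}

/* GNU C only. The nested function captures 'sum' and is handed DOWN to
   for_each_int, which is fine: 'sum' is still alive during the call. */
int sum_of_squares(int n)
{
    int sum = 0;

    void accumulate(int i) { sum += i * i; }

    for_each_int(1, n + 1, accumulate);
    return sum;
}
```

Returning accumulate from sum_of_squares instead would leave the caller with a pointer to a function whose 'sum' no longer exists.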

> Given that functions and function pointers are
> basically the same thing in C (just like arrays and pointers are basically
> the same thing - yes, yes, I know that technically they are not, but don't
> bother writing in to tell me that), when we write:
>
> somefunction() { ... }
>
> we know that, really, we're just creating a block of code and setting
> somefunction to be the address of that function.
>
> So, again, what is the advantage of the above code?

One advantage of anonymous functions (or a facsimile thereof, like the
above) is in combination with macros.

Imagine we have

#define LAMBDA2(LEFT, RIGHT, EXPR) ({ \
    int __fn__ (const void *LEFT, const void *RIGHT) { \
        return EXPR; \
    } \
    __fn__; \
})

Now we can do something like

qsort(array, N, sizeof array[0], LAMBDA2(A, B, strcmp(A, B)));

This is convenient. We want to compare the elements using strcmp,
so by golly, we just write a function literal which does that
little triviality right there and then, and dispense with the
ceremony of introducing a new file-scope function.
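
Spelled out as something compilable, a sketch might look like this. It assumes gcc, declares the parameters as const void * so the pointer type matches qsort's prototype, and assumes the array holds fixed-size char arrays (so that each element pointer can go straight to strcmp). The names lambda_fn_ and sort_names are made up:

```c
#include <stdlib.h>
#include <string.h>

/* GNU C only: builds a two-argument comparison function in place and
   yields a pointer to it. */
#define LAMBDA2(LEFT, RIGHT, EXPR) ({                       \
    int lambda_fn_(const void *LEFT, const void *RIGHT) {   \
        return EXPR;                                        \
    }                                                       \
    lambda_fn_;                                             \
})

/* Sorts an array of fixed-size, NUL-terminated names in place.
   Each element is a char[16], so a pointer to an element is also a
   pointer to the first character of that string. */
void sort_names(char (*names)[16], size_t n)
{
    qsort(names, n, sizeof names[0], LAMBDA2(a, b, strcmp(a, b)));
}
```

For an array of char * the elements would be pointers to strings, and the expression would need an extra dereference rather than a bare strcmp(a, b).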

Preprocessor macros are not able to specify a piece of code to be
hoisted to the top level. Everything that comes from the macro
expansion must replace the macro call right there in that spot of the
program. So, if the only way we can generate a function is by writing a
file-scope definition, then we cannot write a macro such as the above
LAMBDA2.

By the way, here is a structural macro system for C whose macros *are*
able to specify a piece of program text to be injected into the top
level.

https://github.com/eudoxia0/cmacro

With this macro system, I think you could make a lambda macro without
using GCC extensions.

Ah yes, look: it's one of the examples.

qsort is also used, haha.

Andrey Tarasevich

Aug 23, 2020, 9:41:25 PM

On 8/23/2020 2:13 AM, Kenny McCormack wrote:
> WARNING: This is all gcc-specific...
>
> I found this code on some website; it does what it claims to do:
>
> #include <stdio.h>
>
> int main(void)
> {
> int (*max)(int, int) =
> ({
> int __fn__ (int x, int y) { return x > y ? x : y; }
> __fn__;
> });
>
> printf("max(-5,5) = %d\n",max(-5,5));
> }
>
> Now, this is cute and all, but how is it any different from just declaring
> max() the usual way?

You are looking at two different GCC-specific features, not one.

Firstly, the '({ ... })' part - statement expressions. This is a GCC extension
allowing you to embed statement-based code into expressions. This is
often useful for writing macros, for one example. Sometimes one needs to
write a function-like macro that is difficult (or impossible) to
implement as a pure standard expression. This is when GCC statement
expressions often come to the rescue.
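
One common illustration is a "max" macro that evaluates each argument only once. A sketch, assuming a compiler that supports GNU statement expressions and __typeof__; the names MAX_ONCE and larger are invented here:

```c
/* Unlike the classic ((a) > (b) ? (a) : (b)), each argument is
   evaluated exactly once, because it is first copied into a local. */
#define MAX_ONCE(a, b) ({                   \
    __typeof__(a) max_a_ = (a);             \
    __typeof__(b) max_b_ = (b);             \
    max_a_ > max_b_ ? max_a_ : max_b_;      \
})

/* Trivial wrapper showing the macro used as an ordinary expression. */
int larger(int x, int y)
{
    return MAX_ONCE(x, y);
}
```

With this, MAX_ONCE(i++, 2) increments i a single time, which the plain conditional-expression macro cannot guarantee.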

Secondly, local functions - the ability to define and call local
functions inside other functions. This is better than "the usual way"
because it makes the declaration more local and also gives it access to
the surrounding local context, i.e. to the local variables of the caller.

You can define local functions inside statement expressions as well. In
your example it is unnecessary to wrap a local function into a statement
expression. But you can, if you want to. One benefit of this combination
is that the function name does not pollute the local namespace. The
function pointed to by 'max' ends up effectively nameless. You don't have
to worry about inventing unique names for such "temporary" functions.
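
A short sketch of the naming point, assuming gcc. The inner name fn_ and the function spread are made up for the example. Both statement expressions reuse the same inner name, and neither definition is visible outside its own braces:

```c
/* GNU C only. Returns the distance between the larger and the smaller
   of x and y, using two effectively nameless local functions. */
int spread(int x, int y)
{
    int (*min)(int, int) = ({
        int fn_(int a, int b) { return a < b ? a : b; }
        fn_;
    });
    int (*max)(int, int) = ({
        int fn_(int a, int b) { return a > b ? a : b; }
        fn_;
    });

    /* fn_ is not in scope here; only the pointers min and max are. */
    return max(x, y) - min(x, y);
}
```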

One concern immediately recognizable in this code is the validity
(lifetime) of the returned function pointer. But it is OK to return a
function pointer from nested local scope into the surrounding local
scope. The same considerations apply to a pointer returned from a
statement expression. (Once the function begins to access local context,
things can easily become more complicated though...)

Returning a pointer to a local function "to the outside world" (e.g. to
the calling function) is not allowed.

--
Best regards,
Andrey Tarasevich