The problem with my earlier code might be that I passed a plain C function
pointer to a function<> object. The constructor of function<> is an
inline template (it is not constexpr), and if you pass a plain C function
pointer to it, the usual implementations don't allocate external memory;
the function pointer is simply stored inside the function<> object.
Because of that the compiler could take a shortcut, bypassing the logic
of the function<> object and emitting an ordinary C function call. I
guessed that this was the reason both calls had the same performance.
So I wrote a slightly different benchmark:
#include <iostream>
#include <functional>
#include <chrono>
#include <atomic>

using namespace std;
using namespace chrono;

int main()
{
    // runs fn once and prints the average time per round in nanoseconds
    auto bench = []<typename Fn>( Fn &&fn, double rounds )
    {
        auto start = high_resolution_clock::now();
        fn();
        cout << duration_cast<nanoseconds>( high_resolution_clock::now()
            - start ).count() / rounds << endl;
    };
    size_t const ROUNDS = 100'000'000;
    // C function pointer, reloaded from an atomic before every call
    bench(
        []()
        {
            int x = 0;
            atomic<void (*)(int *)> aCFn( []( int *pX ) { ++ *pX; } );
            for( size_t r = ROUNDS; r--; )
                aCFn.load( memory_order_relaxed )( &x );
        }, ROUNDS );
    // function<> object, reached through an atomic pointer before every call
    bench(
        []()
        {
            int x = 0;
            function<void ()> cppFn( [&]() { ++x; } );
            atomic<function<void ()> *> aCppFn( &cppFn );
            for( size_t r = ROUNDS; r--; )
                (*aCppFn.load( memory_order_relaxed ))();
        }, ROUNDS );
}
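With this benchmark the C++ part is meant to make function<> allocate
external storage for the copied lambda, although a lambda with a single
reference capture is small enough that an implementation may still store
it inline. To keep the compiler from optimizing the calls away, I store
the pointers to the C function and to the function<> object in atomics.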
I didn't expect any big differences, but surprisingly the performance
is still the same!
C++ is just such a powerful beast; I really like it.