Performance of exceptions

Bonita Montero

Sep 2, 2019, 2:56:46 AM
I posted some code in the Rust thread that measures the
performance of throwing an exception. I made a mistake at
first because I didn't account for the slow-down caused by an
x264 thread running on the same core. So the "performance" was
a bit surprising: a throw of an int took about 3 million clock
cycles. When I ran it without any load, a throw took about
8.6000 clock cycles on my Ryzen 7 1800X.
I said that I'm curious about the performance of other
implementations (although this isn't really relevant in practice).
I further improved my code to cover two cases:
1. An exception of a base class is thrown and the base class
is caught.
2. An exception of a class derived from a class derived from
the base class is thrown and the base class is caught. In
this case the difference in performance results from the
matching of the base class.
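
The matching in both cases can be illustrated with a small sketch. It is not part of the benchmark: CaughtAsBase and IntThrower are hypothetical names introduced here for illustration, while the exception classes and the other two throwers mirror the code in this post.

```cpp
#include <exception>

struct ExcBase : public std::exception {};
struct ExcDerivedA : public ExcBase {};
struct ExcDerivedB : public ExcDerivedA {};

void BaseThrower()     { throw ExcBase(); }      // case 1
void DerivedBThrower() { throw ExcDerivedB(); }  // case 2
void IntThrower()      { throw 42; }             // hypothetical: unrelated type

// Returns true if the exception thrown by thrower is caught by the
// catch( ExcBase & ) handler, false if only the catch-all handler matches
// or if nothing is thrown.
bool CaughtAsBase( void (*thrower)() )
{
    try
    {
        thrower();
    }
    catch( ExcBase & )
    {
        // reached for ExcBase itself and for any class publicly derived from it
        return true;
    }
    catch( ... )
    {
        return false;
    }
    return false;
}
```

CaughtAsBase( BaseThrower ) and CaughtAsBase( DerivedBThrower ) both return true, while CaughtAsBase( IntThrower ) returns false. In case 2 the runtime has to walk two levels of the class hierarchy before the handler matches.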
Maybe some readers didn't read that deeply into the Rust thread,
so only a few have seen my posting. So I'm trying one last time
to encourage readers to adapt the code to their compilers and
post their results.

This is the common header:

#pragma once
#include <exception>

void BaseThrower();
void DerivedBThrower();

struct ExcBase : public std::exception
{
};

struct ExcDerivedA : public ExcBase
{
};

struct ExcDerivedB : public ExcDerivedA
{
};

This is the translation unit with the throwing functions:

#include "exc.h"

void BaseThrower()
{
    throw ExcBase();
}

void DerivedBThrower()
{
    throw ExcDerivedB();
}

This is the main translation unit:

#include <windows.h>
#include <iostream>
#include <intrin.h>
#include "exc.h"

using namespace std;

typedef long long LL;
LL Benchmark( void (*thrower)(), unsigned const rep );

int main()
{
    // to get always the same TSC
    SetThreadAffinityMask( GetCurrentThread(), 1 );

    unsigned const REP = 1'000'000;
    LL ticks;

    ticks = Benchmark( BaseThrower, REP );
    cout << "BaseThrower: " << (double)ticks / REP << endl;

    ticks = Benchmark( DerivedBThrower, REP );
    cout << "DerivedBThrower: " << (double)ticks / REP << endl;
}

LL Benchmark( void (*thrower)(), unsigned const rep )
{
    LL start = (LL)__rdtsc();
    for( unsigned i = rep; i; --i )
        try
        {
            thrower();
        }
        catch( ExcBase & )
        {
        }
    return (LL)__rdtsc() - start;
}
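
Because load on the same core distorted the first measurement, one common way to make a run less sensitive to such interference is to repeat it and keep the fastest result. The following is only a sketch of that idea, not the code used for the numbers above: BenchmarkNs and BestOf are hypothetical helpers, and it times with std::chrono::steady_clock instead of __rdtsc so that it does not depend on a particular compiler.

```cpp
#include <algorithm>
#include <chrono>
#include <exception>
#include <limits>

typedef long long LL;

struct ExcBase : public std::exception {};

void BaseThrower() { throw ExcBase(); }

// One timed run: rep throws caught as ExcBase, result in nanoseconds.
LL BenchmarkNs( void (*thrower)(), unsigned rep )
{
    using namespace std::chrono;
    steady_clock::time_point start = steady_clock::now();
    for( unsigned i = rep; i; --i )
    {
        try
        {
            thrower();
        }
        catch( ExcBase & )
        {
        }
    }
    return (LL)duration_cast<nanoseconds>( steady_clock::now() - start ).count();
}

// Hypothetical helper: repeats the timed run and keeps the fastest one,
// on the assumption that interference from other threads only adds time.
// With runs == 0 it returns the largest LL value.
LL BestOf( void (*thrower)(), unsigned rep, unsigned runs )
{
    LL best = std::numeric_limits<LL>::max();
    for( unsigned r = 0; r < runs; ++r )
        best = std::min( best, BenchmarkNs( thrower, rep ) );
    return best;
}
```

A call such as BestOf( BaseThrower, 100000, 5 ) would then report the fastest of five runs.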

So I would appreciate any results.

Bonita Montero

Sep 2, 2019, 3:02:18 AM
BTW: I know that current CPUs support invariant time-stamp
counters. So even though the CPU clock might boost here, the
TSC would still count at a constant rate.
But my code will still give a rough estimate of the overhead
of throwing an exception.
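
Since an invariant TSC counts at a constant rate, its ticks are effectively units of wall-clock time rather than actual core cycles, and they can be converted to nanoseconds once the TSC frequency is known. A minimal sketch of that arithmetic follows; TicksToNanoseconds is a hypothetical helper, and the 3.6 GHz used in the example (the base clock of the Ryzen 7 1800X) is only an assumption about the rate at which the TSC runs on a given system.

```cpp
// Converts a TSC tick count to nanoseconds, given the TSC frequency in Hz.
// With an invariant TSC this frequency is constant, so the conversion does
// not depend on the current core clock.
double TicksToNanoseconds( long long ticks, double tscHz )
{
    return (double)ticks * 1.0e9 / tscHz;
}
```

For example, TicksToNanoseconds( 3600, 3.6e9 ) gives 1000 ns.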

Bonita Montero

Sep 2, 2019, 5:35:04 AM
I just modified the main source so that it no longer depends on Windows:

#include <iostream>
#include <chrono>
#include "exc.h"

using namespace std;
using namespace std::chrono;

typedef long long LL;

LL Benchmark( void (*thrower)(), unsigned const rep );

void SetAffinityToFirstThread();

int main()
{
    unsigned const REP = 1'000'000;
    LL ns;

    ns = Benchmark( BaseThrower, REP );
    cout << "BaseThrower: " << (double)ns / REP << "ns" << endl;

    ns = Benchmark( DerivedBThrower, REP );
    cout << "DerivedBThrower: " << (double)ns / REP << "ns" << endl;
}

LL Benchmark( void (*thrower)(), unsigned const rep )
{
    time_point<high_resolution_clock> start = high_resolution_clock::now();
    for( unsigned i = rep; i; --i )
        try
        {
            thrower();
        }
        catch( ExcBase & )
        {
        }
    return (LL)duration_cast<nanoseconds>( high_resolution_clock::now()
        - start ).count();
}
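
The source above declares SetAffinityToFirstThread() but shows no definition and never calls it. One possible definition is sketched below as an assumption, not as the original code: it uses SetThreadAffinityMask on Windows (as in the first version), sched_setaffinity on Linux, and does nothing elsewhere. Failures are silently ignored to keep the declared void signature.

```cpp
#if defined(_WIN32)
    #include <windows.h>
#elif defined(__linux__)
    #include <sched.h>
#endif

// Hypothetical definition: tries to pin the calling thread to the first
// logical CPU. Errors are ignored; on other platforms this is a no-op.
void SetAffinityToFirstThread()
{
#if defined(_WIN32)
    SetThreadAffinityMask( GetCurrentThread(), 1 );
#elif defined(__linux__)
    cpu_set_t set;
    CPU_ZERO( &set );
    CPU_SET( 0, &set );
    // a pid of 0 refers to the calling thread
    sched_setaffinity( 0, sizeof(set), &set );
#endif
}
```

It would be called once at the start of main(), before the first Benchmark() call.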

Bonita Montero

Sep 3, 2019, 1:15:18 AM
I just put everything together in one file. And since the benchmark
is called twice through a function pointer, there's no real need
anymore to guard against interprocedural optimizations to prevent
the compiler from optimizing away the throw.
I compiled this with an older gcc and found that gcc is about
42% faster when throwing an exception; and there's a big difference
to VC++, because gcc takes almost the same time to catch a derived
exception as to catch the base exception.

#include <iostream>
#include <chrono>
#include <exception>

using namespace std;
using namespace std::chrono;

struct ExcBase : public exception
{
};

struct ExcDerivedA : public ExcBase
{
};

struct ExcDerivedB : public ExcDerivedA
{
};

typedef long long LL;

void BaseThrower();
void DerivedBThrower();
LL Benchmark( void (*thrower)(), unsigned const rep );

int main()
{
    unsigned const REP = 1'000'000;
    LL ns;

    ns = Benchmark( BaseThrower, REP );
    cout << "BaseThrower: " << (double)ns / REP << "ns" << endl;

    ns = Benchmark( DerivedBThrower, REP );
    cout << "DerivedBThrower: " << (double)ns / REP << "ns" << endl;
}

LL Benchmark( void (*thrower)(), unsigned const rep )
{
    time_point<high_resolution_clock> start = high_resolution_clock::now();
    for( unsigned i = rep; i; --i )
        try
        {
            thrower();
        }
        catch( ExcBase & )
        {
        }
    return (LL)duration_cast<nanoseconds>( high_resolution_clock::now()
        - start ).count();
}

// the throwing functions, as in the earlier separate translation unit
void BaseThrower()
{
    throw ExcBase();
}

void DerivedBThrower()
{
    throw ExcDerivedB();
}
