[osg-users] SingleThreaded leading to whole application just running on one core

Christoph Weiss

Sep 22, 2016, 11:23:45 AM
to osg-...@lists.openscenegraph.org
Hi,

I've stumbled upon the following issue.  Our renderer runs
SingleThreaded because that is what most Qt integrations seem to use;
otherwise other issues arise.  I'd be happy to change that, of course.

This makes the whole application run on just one core. It's still
"multithreaded", as the load average still goes up when several threads
are started, but they all run on one core.

Is this a bug, or am I doing something completely wrong?

For reference, I simply create several std::future's using std::async()
and then query their results.  Only when our composite viewer is set to
the SingleThreaded threading model do all these threads run on one
core; otherwise everything works as it's supposed to.
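
(For illustration, a minimal sketch of the pattern described above; the
do_work() function is a hypothetical stand-in for the actual
computation:)

#include <future>
#include <vector>

int do_work(int i) { return i * i; }  // hypothetical stand-in workload

int main()
{
    std::vector<std::future<int>> futures;
    for (int i = 0; i < 8; ++i)
        futures.push_back(std::async(std::launch::async, do_work, i));
    for (auto& f : futures)
        f.get();  // block until each worker finishes
}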

Thanks,

Christoph

--
Best regards,

Christoph Weiß

WSoptics GmbH
we...@WSoptics.de
+49 8868 181 997 3
Zugspitzstraße 9
86972 Altenstadt

HRB 204558, place of jurisdiction: Munich, VAT ID: DE289079930
Managing directors: Florian Sepp, Dr. Christoph Weiß

Pete Black

Sep 23, 2016, 12:23:35 AM
to osg-...@lists.openscenegraph.org
This is, bizarrely, by design.

In single-threaded mode, OpenSceneGraph silently sets the CPU affinity to a single core by default. Personally I think this is incredibly intrusive for the programmer, and the reasons for this being the default behaviour are terrible, but it is what it is.

This is what I do:


Code:
#ifdef LINUX
// requires <sched.h> and <unistd.h>
// glibc fails 'all ones' cpu masks, so query the actual core count first
int numCPU = sysconf(_SC_NPROCESSORS_ONLN);
cpu_set_t mask;
CPU_ZERO(&mask);
for (int i = 0; i < numCPU; i++)
{
    CPU_SET(i, &mask);
}
// override OpenSceneGraph's CPU affinity setting
int ret = sched_setaffinity(0, sizeof(mask), &mask);
#endif




Robert Osfield

Sep 23, 2016, 3:40:21 AM
to OpenSceneGraph Users
On 23 September 2016 at 05:25, Pete Black <pe...@marchingcubes.com> wrote:
> This is, bizarrely, by design.
>
> In single-threaded mode, OpenSceneGraph silently sets the CPU affinity to a single core by default. Personally I think this is incredibly intrusive for the programmer, and the reasons for this being the default behaviour are terrible, but it is what it is.

Affinity is set by default because it provides the best performance
for the majority of OSG applications.  This might be a "terrible"
reason for you, but OSG development is not motivated by just one class
of users' needs or preferences; with the default settings we try to do
what is best for most OSG applications.

Please remember, the OSG is extensible: if you don't like the default
behaviour, in most cases you can adapt it.  In the case of threading,
ViewerBase::setUpThreading() and startThreading() are both virtual and
as such are meant to be specialized in subclasses of ViewerBase,
i.e. Viewer/CompositeViewer, and users can then subclass from these if
they wish.
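
(A minimal sketch of the subclassing route described above; it assumes
ViewerBase's protected _threadingModel member and is a sketch rather
than an official recipe:)

#include <osgViewer/Viewer>

// Override setUpThreading() so that SingleThreaded mode no longer pins
// the calling thread to a single core; other modes keep the default.
class NoAffinityViewer : public osgViewer::Viewer
{
public:
    virtual void setUpThreading()
    {
        // In SingleThreaded mode the base implementation sets the current
        // thread's affinity; skipping it leaves the inherited mask intact.
        if (_threadingModel != SingleThreaded)
        {
            osgViewer::Viewer::setUpThreading();
        }
    }
};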

Robert.

Christoph Weiss

Sep 23, 2016, 3:42:14 AM
to osg-...@lists.openscenegraph.org
Thank you so much for pointing this out, Pete. So in
ViewerBase::setUpThreading() osg decides the whole application should be
confined to a single core.

Judging by your comment, this has already been discussed and not deemed
a fault that should be fixed?

Also, I suppose there is no way to do it portably?
Thread::setProcessorAffinity() only allows restricting to a single core
and has tons of ifdefs, so I guess I need to handle every
cross-compilation target individually?

Thanks again, I really hope this behavior can be changed in a future
release!

Christoph


Christoph Weiss

Sep 23, 2016, 3:44:28 AM
to osg-...@lists.openscenegraph.org
Hi Robert,

wouldn't it be a better approach to give the user the option to change
such a setting for their application?

But thanks for pointing out the subclassing option, that's what I'll go
for!  Still, it took me half a day to track down what was going on.  (I
didn't suspect OSG at all in the beginning, and since there are quite a
few dependencies in our application, it took me considerable time to
track it down.)  I think that's sufficient reason to at least mention
it in the documentation, or better yet make it optional rather than
default behavior that the user can set?

Christoph


Robert Osfield

Sep 23, 2016, 4:02:39 AM
to OpenSceneGraph Users
On 23 September 2016 at 08:42, Christoph Weiss <we...@wsoptics.de> wrote:
> Thank you so much for pointing this out, Pete. So in
> ViewerBase::setUpThreading() osg decides the whole application should be
> confined to a single core.

When running SingleThreaded the OSG just sets the affinity of the
current thread.  That's it; that's all it's doing.

It does not explicitly change the affinity of any threads created by
user applications.  I think what Pete found under Linux is that once
the affinity of the current thread is set, all threads subsequently
created by that thread inherit that affinity, which is what his code
snippet looks to work around.

> Judging by your comment, this has already been discussed and not deemed a
> fault that should be fixed?

You are asking to change something that was done by design because one
class of usage on one platform doesn't do exactly what it wants.  There
isn't a bug to fix here; it's sub-optimal behavior for certain types of
application usage on certain platforms.

>
> Also, I suppose there is no way to do it portably?
> Thread::setProcessorAffinity() only allows restricting to a single core
> and has tons of ifdefs, so I guess I need to handle every
> cross-compilation target individually?
>
> Thanks again, I really hope this behavior can be changed in a future
> release!

I haven't yet seen a reason to change the behaviour; it's not the case
that the behaviour is wrong for all users, and it's not a bug.  Removing
the setting of affinity would leave the main thread to float around and
increase the chances of breaking cache.  The OSG is just trying to do
something sensible out of the box for most users.

In the case of Qt applications, they should be run multi-threaded; the
native windowing systems the OSG supports all work multi-threaded
without any problem.  Unfortunately Qt has created a series of problems
on the threading front that we've had to try and work around, and Qt
then moves the goal posts between releases.  It's been a real pain to
try and keep osgQt working well over the years.  If you don't need a
traditional 2D UI then it's generally best to avoid Qt, as its way of
working doesn't fit well with the needs of real-time visualization.

Robert.

Christoph Weiss

Sep 23, 2016, 4:19:54 AM
to osg-...@lists.openscenegraph.org
Hi Robert,

> When running SingleThreaded the OSG just sets the affinity of the
> current thread.  That's it; that's all it's doing.

This actually has a very bad side effect that I only noticed now.  Since
it explicitly sets the affinity to core zero, any other application that
does the same will share that core entirely with it.  In particular, if
you start a SingleThreaded OSG application twice, both instances will
share the core and have degraded performance.

I also wouldn't be surprised if the kernel favors core zero for things
that aren't properly multithreaded, but I'm just guessing here, and
maybe that's a thing of the past already, at least under Linux.

I'm not even sure why setting the CPU affinity to a single core should
improve performance at all unless the OS's scheduler performs poorly.

> It does not explicitly change the affinity of any threads created by
> user applications.  I think what Pete found under Linux is that once
> the affinity of the current thread is set, all threads subsequently
> created by that thread inherit that affinity, which is what his code
> snippet looks to work around.

Yes, they inherit the mask. It would be quite similar if OSG set up a
ulimit().

>> Judging by your comment, this has already been discussed and not deemed a
>> fault that should be fixed?
>
> You are asking to change something that was done by design because one
> class of usage on one platform doesn't do exactly what it wants.  There
> isn't a bug to fix here; it's sub-optimal behavior for certain types of
> application usage on certain platforms.

I agree, it's not a bug, it's a design choice.  However, it's a rather
strict imposition on the user.

> I haven't yet seen a reason to change the behaviour; it's not the case
> that the behaviour is wrong for all users, and it's not a bug.  Removing
> the setting of affinity would leave the main thread to float around and
> increase the chances of breaking cache.  The OSG is just trying to do
> something sensible out of the box for most users.

I totally understand that.  But I disagree with it being the most
sensible choice ;)  Subclassing CompositeViewer is a good solution.  For
me, it really does not make much of a difference whether the design is
changed or not.  I'd be surprised if I were the last to run into this
issue, though, and to spend quite some time figuring it out.

Apart from this, I think in general a library should not impose
restrictions on the calling thread without it explicitly asking for
them. Suppose a library just decided to close file descriptors 0, 1, 2
because it doesn't need them and can thus save some memory...

> In the case of Qt applications, they should be run multi-threaded; the
> native windowing systems the OSG supports all work multi-threaded
> without any problem.  Unfortunately Qt has created a series of problems
> on the threading front that we've had to try and work around, and Qt
> then moves the goal posts between releases.  It's been a real pain to
> try and keep osgQt working well over the years.  If you don't need a
> traditional 2D UI then it's generally best to avoid Qt, as its way of
> working doesn't fit well with the needs of real-time visualization.

Alas, we're quite stuck with Qt since we require a 2D UI. But thanks
for the information.

Christoph

Robert Osfield

Sep 23, 2016, 5:00:46 AM
to OpenSceneGraph Users
Hi Chris,

On 23 September 2016 at 09:19, Christoph Weiss <we...@wsoptics.de> wrote:
> This actually has a very bad side effect that I only noticed now.  Since
> it explicitly sets the affinity to core zero, any other application that
> does the same will share that core entirely with it.  In particular, if
> you start a SingleThreaded OSG application twice, both instances will
> share the core and have degraded performance.

The OSG's primary focus and expected user base is real-time
visualization; with such applications you won't run multiple
applications at the same time, and you'll be using the OSG's native
windowing, multi-threading, database paging etc.  Out of the box the
OSG tries to work as well as it can for this usage model.

The OSG isn't just about real-time visualization, even if that is its
primary focus; it supports other usage models as well.  But out of the
box I wouldn't expect all the design decisions based on the above focus
to work as well.  What is important is that one can adapt the OSG to
work well for these other usage models, even if out of the box it
doesn't do so perfectly.

In the case of Qt usage, perhaps osgQt could provide subclasses of
Viewer/CompositeViewer to help adapt the OSG to work better for a
typical Qt user's requirements.

Robert.

Pete Black

Sep 23, 2016, 8:17:19 PM
to osg-...@lists.openscenegraph.org
> Affinity is set by default because it provides the best performance
> for the majority of OSG applications.  This might be a "terrible"
> reason for you, but OSG development is not motivated by just one class
> of users' needs or preferences; with the default settings we try to do
> what is best for most OSG applications.

I have no particular desire to repeat the last discussion, but I'll say it again.

Hardcoding CPU affinity is:
a) unexpected
b) a premature optimisation
c) not consistent across platforms
d) not easily reversible
e) a performance killer outside of one specific application model.
f) in conflict with other libraries linked into the application that expect to set CPU affinity themselves


It is a terrible idea, and doing it in the context of a library is just plain wrong.

PS. Reason f) doesn't really exist, because other libraries don't do this, for reasons a, b, c, d and e.


Christoph Weiss

Sep 24, 2016, 2:19:35 AM
to osg-...@lists.openscenegraph.org
On 09/24/2016 02:18 AM, Pete Black wrote:
> I have no particular desire to repeat the last discussion, but I'll say it again.
>
> Hardcoding CPU affinity is:
> a) unexpected

In particular, it is not done by some osg::setThreadingModel() but
rather deep down in the call stack by a function that is called when a
(composite) viewer is set up!  That is not the place I'd search for
such a global, application-wide setting.

Christoph

Sebastian Messerschmidt

Sep 24, 2016, 4:33:40 AM
to osg-...@lists.openscenegraph.org
Hi,

Wow, before this escalates: OSG is setting the affinity for some of its
own threads, which is totally legitimate.  And I totally agree that it
would be nice to have an interface to control which core is used, and
whether affinity is used at all in single-threaded mode, other than
having to subclass the viewer.

If all other threads are forced onto one core (as reported) by setting
the affinity of the OSG threads, that is clearly a bug and needs further
inspection.  However, I've been working with OSG in a multi-threading
environment for several years and haven't experienced problems so far.
So creating a minimal example might help to find the problem, if there
is one.

Cheers
Sebastian

Robert Osfield

Sep 24, 2016, 11:55:58 AM
to OpenSceneGraph Users
Hi Pete,

> I have no particular desire to repeat the last discussion, but I'll say it again.

But then you'll throw in some ill-informed and emotive language anyway.

My experience from maintaining an open source project for 17 years is
that emotive language tends to be used when people are trying to be
manipulative rather than persuasive.

> It is a terrible idea, and doing it in the context of a library is just plain wrong.

The only thing terrible here is your etiquette.  Please leave the OSG
community if you don't like the software or the decisions made.  It's
not as if you've paid for the software or the support you get.

I've read everything you have had to say, before and in this thread,
and I simply don't agree with you.  Making the same point again and
adding more emotive terms doesn't change the technical reasons, or why
the OSG works the way it does.  The technical reasons haven't changed,
so my conclusions haven't changed.

If you want to stay then be polite and respectful, even if you don't
agree with everything.

If you can't be polite and feel your conduct is appropriate, then go
fork the OSG and start your own community, where you can be as friendly
or abusive as you want.

Be sure though, continuing with the tone of your recent posts is
absolutely not welcome here.

Robert.

Pete Black

Sep 24, 2016, 6:21:03 PM
to osg-...@lists.openscenegraph.org
OK,


Apologies if I caused offense.

Goodbye.


Fabian Wiesel

Sep 24, 2016, 7:42:27 PM
to OpenSceneGraph Users
Hi,

> OSG is setting the affinity for some of its own threads, which is totally legitimate.

So far I have not been able to confirm it on my Mac, but I think I
observed such behaviour in my application on Linux.  Take that with a
grain of salt, as it could be the result of some other side effect.
I will try to test it in the coming days with a simple test program,
unless you can point out a mistake and save me the trouble.

The following scenario sounds plausible to me:

If you set osgViewer::Viewer::setThreadingModel(SingleThreaded) and then call Viewer::realize(),
it will in turn call ViewerBase::setUpThreading() -> OpenThreads::SetProcessorAffinityOfCurrentThread(0) -> pthread_setaffinity_np(...).

pthread_setaffinity_np() will be called on the main thread, and it is debatable whether that counts as "OSG's own thread".

The side effect arises on Linux from the following (man page):
> A new thread created by pthread_create(3) inherits a copy of its creator's CPU affinity mask.

So all threads created either from the viewer or after Viewer::realize() will only run on the main CPU.

Given the following (pseudo-)program, I would expect the threads to run in parallel on all processors, but they likely won't on Linux.

int main(int argc, char **argv) {
    std::vector<int> myvector(1024);
    osgViewer::Viewer viewer;
    viewer.setSceneData( node );
    viewer.setThreadingModel(SingleThreaded);
    viewer.realize(); // calls ViewerBase::setUpThreading() -> OpenThreads::SetProcessorAffinityOfCurrentThread(0) -> pthread_setaffinity_np(...)

    // Create threads
    for (int i = 0; i < 100; ++i) pthread_create(...);

    viewer.run();
    pthread_join(...);
}

Cheers,
Fabian

Christoph Weiss

Sep 25, 2016, 5:45:18 AM
to osg-...@lists.openscenegraph.org
I wrote a simple sample program that produces the following output:

% g++ -std=c++14 -losgViewer -pthread a.cpp && time ./a.out
Thread 5 done
Thread 9 done
Thread 4 done
Thread 0 done
Thread 6 done
Thread 15 done
Thread 12 done
Thread 13 done
Thread 11 done
Thread 7 done
Thread 8 done
Thread 1 done
Thread 3 done
Thread 10 done
Thread 14 done
Thread 2 done
./a.out 161.06s user 0.02s system 392% cpu 41.036 total

% g++ -std=c++14 -losgViewer -pthread -DSINGLETHREADED a.cpp && time ./a.out
Thread 6 done
Thread 13 done
Thread 9 done
Thread 8 done
Thread 15 done
Thread 10 done
Thread 11 done
Thread 14 done
Thread 5 done
Thread 4 done
Thread 1 done
Thread 3 done
Thread 7 done
Thread 12 done
Thread 2 done
Thread 0 done
./a.out 121.23s user 0.02s system 99% cpu 2:01.92 total

The program looks like this:

--------
#include <future>
#include <iostream>
#include <vector>

#include <osgViewer/Viewer>
#include <osgViewer/ViewerBase>

int main() {
    constexpr int numThreads = 16;
    osgViewer::Viewer viewer;
#ifdef SINGLETHREADED
    viewer.setThreadingModel(osgViewer::ViewerBase::ThreadingModel::SingleThreaded);
#endif
    viewer.realize();

    const auto compute = [] (const int i) {
        for (volatile int n = 0; n != 1'000'000'000; ++n) {
            ++n;
            --n;
        }
        std::cout << "Thread " << i << " done\n";
    };

    std::vector<std::future<void>> futures;
    for (int i = 0; i != numThreads; ++i) {
        futures.push_back(std::async(std::launch::async, compute, i));
    }
    for (auto & future : futures) {
        future.wait();
    }

    return 0;
}
--------

Christoph

On 09/25/2016 01:42 AM, Fabian Wiesel wrote:
> int main(int argc, char **argv) {
>     std::vector<int> myvector(1024);
>     osgViewer::Viewer viewer;
>     viewer.setSceneData( node );
>     viewer.setThreadingModel(SingleThreaded);
>     viewer.realize(); // calls ViewerBase::setUpThreading() -> OpenThreads::SetProcessorAffinityOfCurrentThread(0) -> pthread_setaffinity_np(...)
>
>     // Create threads
>     for (int i = 0; i < 100; ++i) pthread_create(...);
>
>     viewer.run();
>     pthread_join(...);
> }

Fabian Wiesel

Sep 25, 2016, 5:46:45 AM
to OpenSceneGraph Users
Hi,

I can confirm the behaviour with the following test case:
https://github.com/fwiesel/vertexarrayfunctest/blob/threads/main.cpp#L92-L103
All threads run on CPU 0.

That clears up a mystery which baffled me and two of my colleagues:
after upgrading to a new Ubuntu version, our application, which makes
heavy use of the Intel Threading Building Blocks, suddenly failed to
scale with the cores.  Explicitly initialising TBB early in the program
solved the issue, so we blamed some change in TBB, and I didn't
investigate further.

It looks like OSG was previously packaged with QtThreads instead of
pthreads, making the affinity operations a no-op, while the newer
package doesn't.

In light of the fact that it affects every child thread, can I ask you
to re-consider the affinity handling, and/or maybe rename
osgViewer::ViewerBase::SingleThreaded to
osgViewer::ViewerBase::SingleThreadedCpuLocked or something similar?

I understand that it is possible to override the behaviour (which I
have now done), but that requires some internal knowledge of the
library, which you obviously have.  For me as a user, where OSG is
simply one of the libraries I use, I would not expect it to effectively
change the behaviour of a second one.

Thanks,
Fabian

Robert Osfield

Sep 25, 2016, 11:16:32 AM
to OpenSceneGraph Users
Thanks for the example, Fabian.  I'm away from my desktop system this
weekend; I will have a look during the week.

Have you tried setting the affinity of the threads that are created?

Have you tried creating the threads before the call to viewer.realize()?

The way things are behaving looks to be down to the way that Linux
threading forces child threads to inherit the affinity of the main
thread.

I don't know if there is a setting on the Linux threads side that can
change this behaviour so that it's more consistent with other
platforms.  If there is a solution then the place to apply it would be
the OpenThreads pthread implementation.
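
(As an illustration of the first suggestion -- explicitly setting the
affinity of the threads you create -- here is a minimal Linux/glibc
sketch; the helper name is hypothetical:)

#include <pthread.h>
#include <sched.h>
#include <thread>
#include <unistd.h>

// Clear the inherited affinity mask of a std::thread via its native
// pthread handle, so it may run on every online core.
void allow_all_cores(std::thread& t)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    const long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
    for (long i = 0; i < ncpus; ++i)
        CPU_SET(static_cast<int>(i), &mask);
    pthread_setaffinity_np(t.native_handle(), sizeof(mask), &mask);
}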

Robert.

Fabian Wiesel

Sep 25, 2016, 5:53:01 PM
to OpenSceneGraph Users


Hi Robert,

> Have you tried setting the affinity of the threads that are created?
> Have you tried creating the threads before the call to viewer.realize()?

Yes, both cause the threads to be distributed across the cores. That is probably also why initialising TBB early in main() helps, as it creates a pool of worker threads. For my app, you can consider it solved.
But don't you see a difficulty for OSG if you cannot use any threading library without additional setup code?

> The way things are behaving looks to be down to the way that Linux threading forces child threads to inherit the affinity of the main thread.
> I don't know if there is a setting on the Linux threads side that can change this behaviour so that it's more consistent with other platforms.

I was looking for that, and my search was fruitless.
It also seems not to be Linux-specific. FreeBSD seems to do the same, as does Windows: https://msdn.microsoft.com/es-es/library/windows/desktop/ms686223(v=vs.85).aspx
> Process affinity is inherited by any child process or newly instantiated local process

It looks more like OS X (and QNX) is the isolated case.

Cheers,

Fabian

Robert Osfield

Sep 26, 2016, 3:39:35 AM
to OpenSceneGraph Users
Hi Fabian,

On 25 September 2016 at 22:52, Fabian Wiesel <fabian...@gmail.com> wrote:
> Yes, both cause the threads to be distributed across the cores. That is
> probably also why initialising TBB early in main() helps, as it creates
> a pool of worker threads. For my app, you can consider it solved.
> But don't you see a difficulty for OSG if you cannot use any threading
> library without additional setup code?

The only "difficulty" is with a certain threading usage model; it's not
a general issue.  The issue looks to be isolated to the case where
users set up threads after the viewer has been realized.

I do agree that the inheritance of the affinity by child threads is not
ideal.  It's not something that I was aware of when adding the
processor affinity to osgViewer.  It would be good to find a way to
prevent this implicit inheritance when it is not desired.

Setting the processor affinity is an important part of making a
real-time visualization application run at a solid 60Hz; having threads
jump from core to core breaks cache and can break frame.  That's why
the code exists in osgViewer and why it doesn't make sense to just
remove it.

Being able to provide user control over affinity is something I've had
in mind for a number of years.  With the decoupled way many of the OSG
threads are set up, there isn't an easy 1:1 mapping that users can get
their hands dirty with to set things up; i.e. the user doesn't manually
create the threads.  The viewer, the database pagers and the video
plugins all create threads depending upon how they are configured and
whether they are required when certain types of data are loaded.  These
threads are assigned when possible to sensible locations, but it's not
ideal.

If there was an easy and intuitive route to user control of thread
affinity it would have been implemented already.

Robert.

Christoph Weiss

Sep 26, 2016, 6:01:48 AM
to osg-...@lists.openscenegraph.org
Dear Robert,

On 09/26/2016 09:39 AM, Robert Osfield wrote:
> The only "difficulty" is with a certain threading usage model; it's not
> a general issue.  The issue looks to be isolated to the case where
> users set up threads after the viewer has been realized.

This sounds a bit as if this were a rare thing to do, or at least a
design choice that can or should be avoided.  I think, on the contrary,
that explicit thread handling is slowly going to die out for many
applications; the future lies in futures (pun intended) and parallel
versions of the standard algorithms.  C++17 adds a lot of parallel
functionality that the user can use without caring about threads
themselves.  The code sample that I posted yesterday is a typical
std::future application.

> If there was an easy and intuitive route to user control of thread
> affinity it would have been implemented already.

What would be wrong with an osg::setThreadAffinity(...) function that a
user who knows what they're doing can use?

Christoph

Robert Osfield

Sep 26, 2016, 6:20:49 AM
to OpenSceneGraph Users
Hi Christoph,

On 26 September 2016 at 11:01, Christoph Weiss <we...@wsoptics.de> wrote:
> This sounds a bit as if this were a rare thing to do, or at least a
> design choice that can or should be avoided.  I think, on the contrary,
> that explicit thread handling is slowly going to die out for many
> applications; the future lies in futures (pun intended) and parallel
> versions of the standard algorithms.  C++17 adds a lot of parallel
> functionality that the user can use without caring about threads
> themselves.  The code sample that I posted yesterday is a typical
> std::future application.

I think you need to trust me on the scene graph performance front.
Thread affinity is important for performance.  Future threading support
in languages makes no difference to the cost of threads moving from
core to core; it's the same problem no matter what API you wrap it up
in.

>> If there was an easy and intuitive route to user control of thread
>> affinity it would have been implemented already.
>
>
> What would be wrong with an osg::setThreadAffinity(...) function that a
> user who knows what they're doing can use?

OSG users don't always know what threads are active, let alone which
ones they need to set to get the best performance.  The OSG, in its
defaults, tries to do the best it can for users.  It's not perfect, but
it at least tries.

Robert

Robert Osfield

Sep 26, 2016, 6:39:52 AM
to OpenSceneGraph Users
Hi All,

I have taken Christoph's test program and reproduced the affinity
issue on my Kubuntu 16.04 system, so this is a useful first step.

I've also reviewed the OpenThreads code, so I have a better idea of the
code in question.  Please note that I'm not the author of OpenThreads;
the project just fell on my shoulders to maintain.  Some of the code on
the OpenThreads pthread side made me curious, so I modified Christoph's
code to add a code path using OpenThreads rather than the C++11 threads
used in Christoph's code, plus command line options to toggle between
the two and to toggle between the viewer running SingleThreaded or with
the defaults (which will be DrawThreadPerContext).  Attached are a
CMakeLists.txt file and main.cpp.

For the OpenThreads pathway I get exactly the same performance when
running the viewer single-threaded or multi-threaded:

time ./threadtest --ot --SingleThreaded
Thread 0x142df50 done
Thread 0x1427750 done
Thread 0x1427690 done
Thread 0x142dcf0 done
Thread 0x142e1f0 done
Thread 0x142e0a0 done
Thread 0x142edc0 done
Thread 0x142e5e0 done
Thread 0x142ec70 done
Thread 0x142e9d0 done
Thread 0x142e340 done
Thread 0x142e490 done
Thread 0x142de20 done
Thread 0x142e880 done
Thread 0x142e730 done
Thread 0x142eb20 done

real 0m15.463s
user 1m59.612s
sys 0m0.044s


For the C++11 threads I get far worse performance when running
single-threaded vs multi-threaded:

$ time ./threadtest --SingleThreaded
Thread 7 done
Thread 11 done
Thread 10 done
Thread 13 done
Thread 12 done
Thread 6 done
Thread 9 done
Thread 8 done
Thread 14 done
Thread 15 done
Thread 5 done
Thread 4 done
Thread 3 done
Thread 2 done
Thread 1 done
Thread 0 done

real 1m22.315s
user 1m22.064s
sys 0m0.044s


So the code in OpenThreads that sets the affinity mask by default for
its own threads resolves the issue completely (note: the
OpenThreads::Thread instances created above don't set affinity, so they
are free for the OS to place).  This may also be why most users don't
see these issues.

This isolates the issue to users who use non-OpenThreads threads, where
those threads are created after the viewer has been realized and the
users don't set thread affinity for their threads themselves.  That's
three conditions, any one of which can be changed so that the problem
is entirely resolved.

While it is not ideal that this issue exists at all, at least we now
have a handle on it.  Given that the issue only occurs with a very
specific set of usage, I don't think there is any reason for sweeping
changes to the core OSG, or changes to the defaults.

Robert.
CMakeLists.txt

Sebastian Messerschmidt

Sep 26, 2016, 7:15:34 AM
to OpenSceneGraph Users


Hi Robert,


The problem, as Christoph mentioned, is threads which are created by
C++11 features such as std::async: one simply doesn't have any control
over them and cannot set any affinity whatsoever.

> While it is not ideal that this issue exists at all, at least we now
> have a handle on it.  Given that the issue only occurs with a very
> specific set of usage, I don't think there is any reason for sweeping
> changes to the core OSG, or changes to the defaults.

Would a small function to enable/disable the affinity in SingleThreaded
mode be a big breaking change?  It could default to using the affinity
mask, making it up to the advanced user to sacrifice the performance.
I would argue that the issue isn't that specific when seen in a greater
context.  In my case OSG is used behind an interface, so it might not
be possible to create threads before the viewer.
Also, it simply feels like an inappropriate side effect which is buried
away without the user having control over, or access to, it.

Cheers
Sebastian


Robert Osfield

Sep 26, 2016, 9:14:20 AM
to OpenSceneGraph Users
Hi Sebastian,

On 26 September 2016 at 12:15, Sebastian Messerschmidt
<sebastian.m...@gmx.de> wrote:
> Would a small function to enable/disable the affinity in SingleThreaded
> mode be a big breaking change?  It could default to using the affinity
> mask, making it up to the advanced user to sacrifice the performance.

I would rather avoid small code changes that don't actually offer any
path forward.  Adding a bool for the affinity would just be a
workaround for an issue which is somewhere else entirely; it would be
just another hack.

> I would argue that the issue isn't that specific when seen in a greater
> context.  In my case OSG is used behind an interface, so it might not
> be possible to create threads before the viewer.

If you are using OpenThreads for your threads then you have no problem
at all anyway; you can create them before or after viewer.realize()
without any problem.

It's only a problem if you create your threads afterwards using another
threading library that doesn't set the affinity mask in any way and so
just inherits the current affinity mask.

A number of contributors to this thread have made it out to be a
general problem; it's not, it's a very specific usage-case issue.

> Also, it simply feels like an inappropriate side effect which is buried
> away without the user having control over, or access to, it.

The side effect only occurs in certain circumstances; it's unwanted,
but it's not something that most users will likely hit up against.  You
also do have control over it: you can set the affinity mask for your
threads rather than inherit whatever mask has previously been set.

--

I would add, though, that I'm not opposed to adding more control over
affinity to OpenThreads/OSG.  What API form this extra control might
take I can't say at this point, as I haven't had time to reflect on the
issue enough.  I still have plenty of other tasks to get on with.

Christoph Weiss

Sep 26, 2016, 9:20:36 AM
to osg-...@lists.openscenegraph.org
Hi Robert,

> If you are using OpenThreads for your threads then you have no problem
> at all anyway; you can create them before or after viewer.realize()
> without any problem.
>
> It's only a problem if you create your threads afterwards using another
> threading library that doesn't set the affinity mask in any way and so
> just inherits the current affinity mask.
>
> A number of contributors to this thread have made it out to be a
> general problem; it's not, it's a very specific usage-case issue.

I strongly disagree with the last sentence. It is the most general use
case possible: using standard C++ features.

OSG's behavior changes the behavior of programs that, apart from OSG,
rely solely on the C++ specification. This is the most general problem
possible, and far more important than the behavior of some specific
libraries like pthreads or OpenThreads.

Christoph

Robert Osfield

Sep 26, 2016, 10:35:21 AM
to OpenSceneGraph Users
Hi Christoph,

On 26 September 2016 at 14:20, Christoph Weiss <we...@wsoptics.de> wrote:
> I strongly disagree with the last sentence. It is the most general use case
> possible: using standard C++ features.

Please quit the hyperbole.  It does you no favours.

I have been writing C++ programs since 1992 and have been the project
lead of the OSG since 1999; the OSG started to embrace multi-threading
in 2001.  I've interacted with thousands of developers over the years.
Strongly disagree with me you may, but think twice: between the two of
us, who might know the most general use case for the OSG?

I'm not some wet-behind-the-ears engineer who knows nothing about C++,
threading or performance; these are the things that make the OSG what
it is.

When I say we are talking about a very specific usage case, it's
because in my 17 years as project lead of the OSG my experience is that
it is a specific usage case rather than a common one.

> OSG's behavior changes the behavior of programs that, apart from OSG, rely
> solely on the C++ specification. This is the most general problem possible,
> and far more important than the behavior of some specific libraries like
> pthreads or OpenThreads.

OK, let's get some facts down.

OpenThreads pre-dates C++11 by over a decade.

Yet it still offers MORE threading control and functionality than
native C++11 threads.

Does C++11 threading have support for processor affinity?  NO.

Processor affinity is essential for any serious threading library, so
C++ threading support is a bit of a fail in this respect.

YES, you can add processor affinity via the native_handle.  You'll need
to dip into pthreads or other native threading libs for this, but that
rather defeats the point of having "portable" C++ threading.

In your own code example you make the ASSUMPTION that the affinity mask
of the thread creating your threads allows them to run on all cores.
Now, if you've never done performance-critical multi-threading, the
idea of processor affinity may be alien, but relying on that assumption
is not a robust approach to multi-threading.

Why do you make this assumption about thread affinity? Why don't you
set the thread affinity explicitly?

If you want to make your own code more robust then remove the
assumptions; it'll also make your code more portable.

I've put in the effort to re-create the problem, and I've found a
number of different solutions, none of which require any modifications
to the core OSG.  If there were a flaw on the OSG side then we wouldn't
be able to provide these solutions.

Robert.

Christoph Weiss

Sep 26, 2016, 10:49:41 AM
to osg-...@lists.openscenegraph.org
Hi Robert,

On 09/26/2016 04:35 PM, Robert Osfield wrote:
> Hi Christoph,
>
> On 26 September 2016 at 14:20, Christoph Weiss <we...@wsoptics.de> wrote:
>> I strongly disagree with the last sentence. It is the most general use case
>> possible: using standard C++ features.
>
> Please quit the hyperbole.  It does you no favours.
>
> I have been writing C++ programs since 1992 and have been the project
> lead of the OSG since 1999; the OSG started to embrace multi-threading
> in 2001.  I've interacted with thousands of developers over the years.
> Strongly disagree with me you may, but think twice: between the two of
> us, who might know the most general use case for the OSG?
>
> I'm not some wet-behind-the-ears engineer who knows nothing about C++,
> threading or performance; these are the things that make the OSG what
> it is.
>
> When I say we are talking about a very specific usage case, it's
> because in my 17 years as project lead of the OSG my experience is that
> it is a specific usage case rather than a common one.

Robert, I'm not quite sure what prompts you to bring personal
attributes into this.  Nor do I see any hyperbole to begin with.  I
stated that I strongly disagree; you can in turn disagree.  It never
was my intention to make this personal, and quite frankly I do not see
how I could have.

I did not question your authority on software development in general,
or on OSG specifically.  If you think I put this on a personal level,
please point out where.  Otherwise, please refrain from arguments from
authority; they do you no favors.

My only interest here is to improve OSG.  For me, it doesn't matter at
all whether the behavior of OSG is changed or not -- I already fixed it
in our code base using your suggestion, for which I'm grateful.  But I
can quite clearly see future users running into the same issues that I
did, and I'd like to see OSG improved so that doesn't happen.

However, since it feels as if this is turning into something about
who's right rather than about technical aspects, I'll quit the entire
discussion now.

Christoph

Robert Osfield

Sep 26, 2016, 11:38:26 AM
to OpenSceneGraph Users
Hi Christoph,

On 26 September 2016 at 15:49, Christoph Weiss <we...@wsoptics.de> wrote:
> Robert, I'm not quite sure what prompts you to bring personal
> attributes into this.  Nor do I see any hyperbole to begin with.  I
> stated that I strongly disagree; you can in turn disagree.  It never
> was my intention to make this personal, and quite frankly I do not see
> how I could have.

When it comes to perspectives on what is right or wrong for the OSG, it
absolutely has everything to do with my experience.  More than anyone
else in the community, I witness how the OSG gets used across the broad
spectrum of users.  There is no one else who provides more support to
users, fixes more bugs, or works with users and clients across more
professional domains and usage cases.

When I say something is a "very specific usage case" then it's based on
this experience.  When you then say you "strongly disagree" with this,
it leaves me wondering how you can speak with greater authority on how
OSG users use the OSG.

What you can say is that you know your usage model for your application
better than I do.  I can understand how this, to you, is the most
important thing from your perspective; this is exactly what I expect of
OSG users, and it is a positive thing.  However, where I feel you've
overstepped the mark is in making broad statements that your usage case
is the general case from the OSG perspective.

It's my job as project lead to look at all the usage cases and issues
that OSG users have and to guide the OSG in the right direction.  I
take note of usage cases like your own, but they are just one part of a
wider project.  In just the same way, I also need to know the wider
context of where the OSG sits; it's not something that is static, it
evolves over time.  An important thing for a project lead is that you
do not take to whims and move the rudder of the ship left then right
randomly in response to the latest, greatest opinion.

--

Repeatedly in this thread, people like yourself have strongly asserted
how wrongly the OSG is doing things.

Having spent the time investigating the issue, I've found that this
standpoint has actually been based entirely on the assumptions made by
those having problems.

The assumption that the thread affinity will not be set -- and, for
some, it seems it should never be set -- is building your house on
sand.  If you are writing a multi-threaded application you should be
aware of, and make conscious decisions about, thread affinity.

The OSG and OpenThreads don't assume that the thread affinity will be
correct in its default, inherited state; where appropriate they
explicitly set the affinity.  They may not do a perfect job of this,
but they do at least try to do what they can.

From what I have learnt, this is far more an education issue than a
technical one.  Partly it is about educating what the OSG does and why
it does it, but it's also more general than this: the assumptions about
thread affinity being made in end users' applications are clearly
insufficient.  The lack of education on this latter issue is far bigger
than the OSG project.  Sadly, the lack of education on threading issues
is not helped by the lack of affinity functionality in C++11 threading,
as an issue ignored is not one that is solved.

Robert.

Temia Eszteri

Sep 26, 2016, 2:41:20 PM
to osg-...@lists.openscenegraph.org
Hey Robert,

Just let it drop!

Love, Temia

On Mon, 26 Sep 2016 16:38:19 +0100
Robert Osfield <robert....@gmail.com> wrote:

> Hi Christoph,
--
Temia Eszteri <lami...@cleverpun.com>

Robert Osfield

Sep 26, 2016, 2:42:24 PM
to OpenSceneGraph Users
When doing a code review of the affinity code in OpenThreads today, I
noticed that OpenThreads doesn't fully expose all the functionality
available for setting or resetting the thread affinity; OpenThreads
just allowed you to specify a single CPU number for a thread to have
affinity with.

To improve the flexibility of setting the Thread processor affinity I
have introduced a new function:

/**
 * Set the processor affinity mask of the current thread. If you want to
 * allow the thread to run on any processor core use ~0ul for the cpumask.
 */
extern OPENTHREAD_EXPORT_DIRECTIVE int SetProcessorAffinityMaskOfCurrentThread(unsigned long cpumask);


And a new Thread method:

/**
 * Set the processor affinity mask of the thread. If you want this
 * thread to run on any processor core then use a cpumask of ~0ul.
 * This call must be made before start() or startThread() and has no
 * effect after the thread has been running. Returns 0 on success, the
 * implementation's error on failure, or -1 if ignored.
 */
int setProcessorAffinityMask(unsigned long cpumask);

I have implemented these for pthreads so far, creating a git branch
openthreads_affinity_mask:

https://github.com/openscenegraph/OpenSceneGraph/tree/openthreads_affinity_mask

It's experimental, the API isn't settled yet, but it's a start. I
will need to boot into Windows tomorrow and implement the Windows
side. Members of the community are welcome to dive in and beat me to
this.

As a test of this new API I have modified the thread testing example
that Christoph wrote and I extended earlier today; the addition simply
is:

if (arguments.read("--reset"))
{
    OpenThreads::SetProcessorAffinityMaskOfCurrentThread(~0);
}

What this does is set the affinity of the main thread so that it can
run on all cores.  With this change even the troublesome C++11 threads
now work fine, even if the OSG viewer initially sets the affinity to
CPU 0.

This addresses a serious shortcoming of C++11 threads without needing
to move all code across to OpenThreads; the call above is sufficient to
work around the lack of affinity support in C++ threads.  Since OSG
applications that might use C++11 threads will be using OpenThreads
anyway, this doesn't add any extra dependencies.

This could be used more widely as well; for instance, if you want to
set the affinity to just four of the CPU cores, this can easily be done
by setting up the mask appropriately.
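
(For example, with the call above, a mask with the low four bits set
restricts the calling thread to cores 0-3:)

// Restrict the current thread to CPU cores 0-3 (bits 0..3 set in the mask).
OpenThreads::SetProcessorAffinityMaskOfCurrentThread(0x0Ful);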

I'm tempted to tweak the previous affinity API to:

OpenThreads::SetProcessorAffinityOfCurrentThread(unsigned int basecpunum, unsigned int numcores=1)

which would allow you to, say, set the affinity for cores 4,5,6,7 via:

OpenThreads::SetProcessorAffinityOfCurrentThread(4, 4)

I'll sleep on this bit of the API.

These changes in some form will make it into OSG-3.6.  They could also
potentially be utilized in osgViewer.

Robert.
main.cpp

Robert Osfield

Sep 26, 2016, 3:04:31 PM
to OpenSceneGraph Users
Hi Temia,

On 26 September 2016 at 19:41, Temia Eszteri <lami...@cleverpun.com> wrote:
> Just let it drop!

Drop what?  The project leadership?  Doing support for free?

Investigating issues and getting to the bottom of them?  If you've
actually followed this thread and read what's been written, you'll see
I've done far more than talk: I've been reproducing issues and finding
solutions, both for the inadequacies of C++ threads and for members'
code.

Asking members of the community to avoid being rude or manipulative?

I include your post in this; your comment is wholly lacking in
technical contribution.  Did you not consider that it would come across
as obnoxious/insulting?  Of all the posts in this thread, yours is the
most inappropriate.

I have had to put up with a lot of non-technical crap in this thread:
lots of opinions paraded as "facts" that on investigation have proven
to be pretty unsubstantiated or misjudged.  The main issue in this
thread is entirely down to users making broad assumptions; when these
assumptions are proven to not always hold, they don't get the results
they expected based on their assumptions of how things "should" work.

Of course I'm going to get irritated when not everyone is contributing
positively to a thread; you know I'm human.

Robert Osfield

Sep 27, 2016, 6:30:13 AM
to OpenSceneGraph Users
Hi all,

I have refined the affinity support in OpenThreads further, introducing
an OpenThreads::Affinity class that stores a set of CPU core numbers
that the affinity should be set for; this is then used as input when
setting the native threading API's CPU affinity mask.

The Affinity class can be set up with simple constructors or by adding
the CPU numbers one by one.  Since the new class wraps up the various
ways of setting up the affinity, there isn't any need for multiple
functions/methods that handle different parameters.  It will also scale
to any number of CPUs on your system, so if you have something crazy
like 4096 cores then it'll handle that just fine.

The class looks like:

/**
 * @class Affinity
 * @brief Simple container for specifying which CPUs a thread should have affinity with.
 * An empty Affinity.activeCPUs / default constructed Affinity signifies that a thread
 * should not have any specific affinity and should be able to run on all available CPUs.
 */
class Affinity
{
public:

    Affinity() {}

    Affinity(unsigned int cpuNumber) { activeCPUs.insert(cpuNumber); }

    Affinity(unsigned int cpuNumber, unsigned int cpuCount) { while(cpuCount>0) { activeCPUs.insert(cpuNumber++); --cpuCount; } }

    Affinity(const Affinity& rhs) : activeCPUs(rhs.activeCPUs) {}

    Affinity& operator = (const Affinity& rhs) { if (&rhs!=this) { activeCPUs = rhs.activeCPUs; } return *this; }


    /** add a specified cpu core to the list to have affinity to. */
    void add(unsigned int cpuNumber) { activeCPUs.insert(cpuNumber); }

    /** remove a specified cpu core from the list to have affinity to. */
    void remove(unsigned int cpuNumber) { activeCPUs.erase(cpuNumber); }

    /** return true if affinity has been provided for specific CPU cores. */
    operator bool () const { return !activeCPUs.empty(); }

    typedef std::set<unsigned int> ActiveCPUs;

    /** Set of CPUs that a thread should have affinity to. */
    ActiveCPUs activeCPUs;
};

I have implemented this for pthreads so far, this afternoon I'll boot
into Windows and implement support for it. The changes are checked
into the openthreads_affinity_mask branch.

I have also modified the thread test code previously posted in this
thread so that it illustrates/tests the new API; file attached.  The
relevant parts are:

if (arguments.read("--reset"))
{
    OpenThreads::SetProcessorAffinityOfCurrentThread(OpenThreads::Affinity());
}

unsigned int cpuNumber, cpuCount;
if (arguments.read("--cpus", cpuNumber, cpuCount))
{
    OpenThreads::SetProcessorAffinityOfCurrentThread(OpenThreads::Affinity(cpuNumber, cpuCount));
}

if (arguments.read("--cpu", cpuNumber))
{
    OpenThreads::SetProcessorAffinityOfCurrentThread(cpuNumber);
}

Note that the old usage of just passing a uint for the cpuNumber to
SetProcessorAffinityOfCurrentThread() still works, as it maps onto the
constructors of Affinity, so old OSG/OpenThreads code that sets
affinity to a specific core will still work just fine.

For users that are having problems when using C++11 threads that are
set up after the OSG's viewer is realized and has set the affinity to a
specific core: you can use the default constructed Affinity() shown in
the --reset branch above, as this will effectively disable the affinity
(an empty Affinity means no affinity) and allow the threads to run on
all cores.

However, for graphics performance reasons I'd recommend setting
affinity for your additional threads as well.  The right way to manage
threading in a graphics application is to make sure the thread that
runs the frame loop, and any graphics threads, run with defined
affinity, to prevent threads jumping cores, breaking CPU cache and
pushing up the bandwidth load on memory.  Scene graph traversals are
primarily CPU cache and bandwidth limited, so this is crucial for
performance.  This means it's important that scene graph threads have
affinity assigned to them, but it's also important that your other
threads that are doing work don't get dropped on the same cores as the
ones handling the scene graph; otherwise the sharing of cores between
multiple threads will create competition for cache and bandwidth, which
would reduce performance and increase the likelihood of frame drops.

The way to avoid your own threads interfering with the main scene graph
threads would be to allocate your additional threads to cores that
don't overlap.  So for an application running on an i7 that has cores 0
and 1 dedicated to scene graph traversal/rendering, use something like:

OpenThreads::SetProcessorAffinityOfCurrentThread(OpenThreads::Affinity(2, 6));

Then set up your threads so they all run on cores 2,3,4,5,6,7.  After
you have set up all your threads and they are running, you'll need to
reset the main thread affinity before running the frame loop so the OSG
viewer's threads are left running on the correct cores:

OpenThreads::SetProcessorAffinityOfCurrentThread(0);

If you set up all your additional threads before the OSG sets up its
threads then you won't need this last step, as the osgViewer will set
up the affinity for you.

Robert.
main.cpp

Robert Osfield

Sep 27, 2016, 12:10:24 PM
to OpenSceneGraph Users
Hi All,

I have now implemented the new OpenThreads::Affinity class on Windows;
this is now checked into the openthreads_affinity_mask branch.

At the same time I spotted a missing feature.  I'm not the author of
OpenThreads and hadn't needed to touch the Win32 implementation before,
so I hadn't looked at it (I'm a Linux dev 95% of the time).  The
missing feature was setting the affinity for the current thread when
the current thread isn't an OpenThreads thread.  This missing feature
meant that unix (pthreads) vs Windows behaved differently w.r.t. the
osgViewer setting the affinity of the main thread.  Now that I've
implemented this missing feature we should see the same behaviour.

Another inconsistency that I will be investigating next is the setting
of the affinity of the main thread when running the viewer
multi-threaded vs single-threaded.  Both by default should do exactly
the same thing and set the affinity of the main thread to core 0.  This
is needed for performance reasons.

One approach I'm considering is putting an OpenThreads::Affinity object
into osgViewer::ViewerBase so that it can be used to set the affinity
of the viewer's main thread.  I'm also considering doing something
similar for osg::GraphicsContext::Traits and osgDB::DatabasePager to
provide a means for users to specify what affinity they want.  In the
case of the database pager you would typically want a pool of database
paging threads that you give the same affinity to, such as having them
all sit on cores 4,5,6,7 while the main viewer threads sit on cores
0,1,2,3, etc.  I'm just giving examples of how it could be laid out.

If the above is viable to implement, then potentially one could set the
Affinity object for the viewer to allow the main thread to run on all
cores and therefore allow the usage case discussed earlier in this
thread.  While it might be possible to get the OSG to work this way,
it's not something I would recommend at all.  Not setting affinity is
bad for performance, and I wouldn't want to encourage users to end up
with poorly performing applications just because we provide the
opportunity to totally ignore processor affinity.  My goal is to get
the OSG to perform well out of the box as far as possible; this is what
the OSG has been about since its inception, and using sensible affinity
settings out of the box is part of this.

Robert Osfield

Sep 28, 2016, 3:52:48 PM
to OpenSceneGraph Users
I have now implemented support for manually setting the Affinity of the
various threads that the viewer manages -- the main thread, the
cameras' graphics threads and the database pager threads.  You can now
set these via:

osg::Camera::setProcessorAffinity(OpenThreads::Affinity);
osg::GraphicsContext::Traits::setProcessorAffinity(OpenThreads::Affinity);
osgDB::DatabasePager::setProcessorAffinity(OpenThreads::Affinity);
osgViewer::Viewer::setProcessorAffinity(OpenThreads::Affinity);

I have also added a ViewerBase::configureAffinity() method that sets
these up automatically based on the viewer's combination of threading
model, cameras, graphics contexts and database pagers.  This new method
does a better job of setting up affinity than the previous code, as it
also handles the DatabasePager affinity, making sure that the
DatabasePager threads don't sit on any of the other viewer threads.
This will be really helpful for applications that use paged databases,
as it'll avoid the possibility of a loader bumping a viewer thread
during the frame and so reduce the likelihood of a frame break.

There is also a ViewerBase::setUseConfigureAffinity(bool flag) method
that users can call before ViewerBase::realize() to specify whether the
ViewerBase::setUpThreading() method calls the
ViewerBase::configureAffinity() method.  If you set this property to
false then setUpThreading() will not call configureAffinity() and will
instead set up the thread affinity based on the Camera,
GraphicsContext::Traits and DatabasePager ProcessorAffinity properties.
By default all the Affinity settings are empty, which signifies no CPU
affinity, so the associated threads will run on all available CPUs.
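
(A hypothetical usage sketch of the experimental branch API described
above; the core assignments are illustrative only:)

// Take affinity into our own hands: skip the automatic configureAffinity()
// pass and assign per-component affinities manually.
osgViewer::Viewer viewer;
viewer.setUseConfigureAffinity(false);
viewer.setProcessorAffinity(OpenThreads::Affinity(0, 2));                      // main thread: cores 0-1
viewer.getDatabasePager()->setProcessorAffinity(OpenThreads::Affinity(4, 4)); // pager threads: cores 4-7
viewer.realize();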

If you do want to disable the call to configureAffinity() then thread
affinity is entirely in your hands.  If you are a power user and know
exactly what affinity settings to use for all the threads then it will
be a very useful mode; however, I'd caution against using it without
due consideration, as you can easily end up with an application that
breaks frame or runs at a more erratic and lower frame rate.  For the
majority of users I would recommend just using the defaults and letting
configureAffinity() do its magic.

For those who kicked off this thread: setting setUseConfigureAffinity
to false with the ViewerBase's ProcessorAffinity left as the default
will mean the main thread will be allowed to run on all available CPUs,
so any threads you set up after the viewer's realize() will also run
across all cores.  This might be what you are asking for... BUT... I
would strongly recommend against this approach.  Not setting affinity
in a highly threaded visualization application will lead to a higher
risk of frame drops and erratic, lower framerates.  The correct thing
to do is have the viewer use configureAffinity() or manually set the
Affinity, and then set your additional threads to run on entirely
separate cores so they don't interfere with the rendering and vice
versa; you'll end up with a better performing application.

The power and flexibility of handling affinity is now entirely in your
hands if you want it, but think twice about it: if you are doing
threading then you really need to understand the topic, including the
importance of processor affinity.

All these changes are checked into the openthreads_affinity_mask
branch.  All the functionality is now in place, so it's now just a case
of testing it out across various platforms and usage patterns.  There
remains some debugging output to the console that I'll remove once more
testing has been done.  Once things have settled down on this branch
and everything looks good I'll merge it with master, so it'll be part
of the upcoming OSG-3.6 release.

Robert.

Christian Buchner

Sep 29, 2016, 5:14:19 AM
to OpenSceneGraph Users
Hi Robert,

Thank you very much for spending so much effort on the threading affinity (re)design. It is very much appreciated.

Christian

Chris Djali / AnyOldName3

Mar 30, 2020, 8:04:57 PM
to OpenSceneGraph Users
Hi Robert (and anyone else who's paying attention),

I'm sorry to bump a controversial thread from four years ago, especially as what I'm about to say isn't on strictly the same topic, but I came across this thread while looking for something else and then read it anyway, and I had a couple of thoughts about it. They may well have been addressed in the intervening time, so I'm completely okay with a concise "it's not a concern" reply if they're things that have already been considered.

I fully accept that with a fixed-frequency CPU, provided nothing else has affinity locked to the same core, the caching benefits of thread affinity are real. However, processors from the last decade are very rarely fixed-frequency, sometimes a core runs multiple hardware threads, and there are other complicating factors, too.

I know of some software that performs badly with fixed thread affinity due to variable frequency. If you lock it to one core, that core gets hotter than the others, then the processor goes to a lower power state (e.g. turning off Turbo Boost) to let it cool down. If you don't lock it to one core, the OS scheduler attempts to balance keeping temperatures down against degrading cache, and the CPU stays in a higher-power state more of the time. Any speedup from improved cache usage is going to be dwarfed by a 33% drop in clock speeds, so I'd suggest testing if a CPU-bound OSG application still performs better with affinity set on a range of contemporary-ish hardware, as it may no longer be a valid assumption.

As for multiple threads per core (simultaneous multithreading, e.g. Intel's Hyperthreading), it adds complications to assigning affinity, as if you put two intensive threads on logical threads that are implemented on the same physical core, they're both going to contend for the same resources and performance will be degraded compared to if they'd been assigned logical threads on different cores, or it had been up to the OS to assign cores as it saw fit. It's even more complex than that, though. For the first couple of weeks after AMD's Ryzen launch, performance was worse than it is now as OS schedulers were making assumptions that held for Intel's Hyperthreading, but not for AMD's SMT. Some of this was also due to Ryzen's architecture, with inter-core communication delays being different for different cores, too - software needs to assign threads that talk a lot to the same CCX.

It's still possible to come up with an affinity assignment system that correctly optimises for all the constraints (otherwise hardware manufacturers wouldn't risk adding so many constraints) but I'm unconvinced it's a reasonable use of development time as it's a moving target and OS schedulers are pretty good these days. I'd be surprised if modernish hardware still saw a performance uplift from OSG managing affinity, and if it did, I'd be even more surprised if it was major. I'd feel more comfortable knowing that this has been tested at some point in the last few years.

I could be wrong, but it's always a good idea to re-evaluate whether your assumptions still hold as time goes on.

Cheers,

Chris