
Threads - When?


dd.fo...@googlemail.com

Dec 22, 2006, 5:42:43 PM
How long will it be before c++ has anything to say about threads?
Richard Kavanagh


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Mathias Gaunard

Dec 23, 2006, 3:58:54 AM
dd.fo...@googlemail.com wrote:
> How long will it be before c++ has anything to say about threads?

2 to 3 years.

Le Chaud Lapin

Dec 23, 2006, 4:03:10 AM
dd.fo...@googlemail.com wrote:
> How long will it be before c++ has anything to say about threads?
> Richard Kavanagh

A Very Long Time, hopefully.

I think that it is possible to use C++ as it is effectively in a
multi-threaded environment, provided that good primitives to support
such an environment have been created. In particular, the
synchronization primitives are critical (no pun intended).

I know, for example, that the design of the threading model is not
arbitrary. If one is to write code that is portable and
multi-threaded, there are fundamental principles that are to be
discovered when devising the thread framework, let alone the other
classes like events, mutexes, and semaphores. In other words, simply
wrapping the OS's thread function in a class and "getting on with it"
is not sufficient.

I guess what I am trying to say is that, before C++ is designed to
support threads, there remains the task of completing the model of a
thread itself, not at a low level but at a higher level, within any
language.

To give two hints:

1. There was a post not long ago where a programmer wanted to
manually roll back the stack of another thread, essentially duplicating
what exception handling code would do.

2. There is also the recurring scenario where thread B is in the middle
of a lengthy process and thread A wants it to stop right away, but in a
clean way also.

I believe that both of these problems, if solved elegantly, would force
a fundamental reconsideration of the basic principles underlying thread
usage, and a framework supporting the new model could be implemented
entirely in C++, relatively portably, with no changes to the language.

-Le Chaud Lapin-

Joe Seigh

Dec 23, 2006, 2:16:17 PM
Le Chaud Lapin wrote:
> dd.fo...@googlemail.com wrote:
>
>>How long will it be before c++ has anything to say about threads?
>>Richard Kavanagh
>
>
> A Very Long Time, hopefully.
>
> I think that it is possible to use C++ as it is effectively in a
> multi-threaded environment, provided that good primitives to support
> such an environment have been created. In particular, the
> synchronization primitives are critical (no pun intended).
>
> I know, for example, that the design of the threading model is not
> arbitrary. If one is to write code that is portable and
> multi-threaded, there are fundamental principles that are to be
> discovered when devising the thread framework, let alone the other
> classes like events, mutexes, and semaphores. In other words, simply
> wrapping the OS's thread function in a class and "getting on with it"
> is not sufficient.
>
> I guess what I am trying to say is that, before C++ is design to
> support threads, there remains the task of completing the model of a
> thread itself, not at a low-level, but a higher-level, within any
> language.
>

Unfortunately, they seem to be concentrating on low level design
right now, specifically the memory model. The problem is they
don't know how to do high level specification so they're creating
a low level specification and will do the higher level constructs
in the form of a meta implementation. It turns out that things
like deadlock are actually artifacts of implementation. There's
nothing inherent in locking that requires deadlock to happen. But
if your meta implementation can allow deadlock to occur then all
your actual lock implementations will be required to deadlock
under certain conditions.

There might be a loophole where you could claim if the specification
has non-observable behavior, you could ignore it. But with the fondness
for post condition definitions in C++, they will likely make behavior that
is non-observable in a multi-threaded environment deterministically
observable in a single threaded environment.


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Gianni Mariani

Dec 23, 2006, 2:14:27 PM
Le Chaud Lapin wrote:
...

>
> I believe that both of these problems, if solved elegantly, would force
> a fundamental reconsideration of the basic principles underlying thread
> usage, and a framework supporting the new model could be implemented
> entirely in C++, relatively portably, with no changes to the language.

Many threaded C++ libraries are available without having solved the
problems you state. While they might be nice to solve, there is a large
problem space that does not require solving them.

IMHO, the fact that there are so many threaded C++ applications means
that standard support is sorely missing.

Gianni Mariani

Dec 24, 2006, 6:03:43 AM
Joe Seigh wrote:
...

>
> There might be a loophole where you could claim if the specification
> has non-observable behavior, you could ignore it. But with the fondness
> for post condition definitions in C++, they will likely make behavior that
> is non-observable in a multi-threaded environment deterministically
> observable in a single threaded environment.

I think I know what you're saying but I don't think many others do.

I hadn't thought about post condition definitions in threaded
environments until just now but you do bring up a very interesting point.

There was one time when a colleague had created a primitive for local
static initialization, and I remember he spent some time debugging it
only to find there was no way to test it using post conditions, since the
optimal solution, while deterministic, was not testable. The test would
fail after many, many hours of "stress testing", but apart from the test
failure there could be no undesirable behaviour. This was somewhat
convoluted, and it turned out that the test was invalid. But the point
is: if one very experienced and smart programmer can lose many hours on
an invalid test, I wonder how much time will be wasted on post-conditions
that become invalid in a threaded environment.

--

Le Chaud Lapin

Dec 24, 2006, 6:03:10 AM
Hi Joe,

Joe Seigh wrote:
> Unfortunately, they seem to be concentrating on low level design
> right now, specifically the memory model. The problem is they
> don't know how to do high level specification so they're creating
> a low level specification and will do the higher level constructs
> in the form of a meta implementation.

What do you mean here? Are they concocting new low-level C++
primitives that presume the existence of hardware instructions to
support those new primitives?

> It turns out that things
> like deadlock are actually artifacts of implementation.

Implementation of the programmer's problem?

If so, I agree. Deadlock is an engineering (design) issue. If a
system deadlocks, it deadlocks because the programmer has constructed
the program in a manner in which it should not have been constructed.
The fundamental questions then become:

1. Was there a lack of C++ or OS primitives that made deadlock
inevitable?

2. Was the use of those primitives improper?

3. Were the primitives present, but in a form that obscures the path to
good system structure?

It has been my experience that, at least on industrial-strength
operating systems like Microsoft Windows, the problem is never #1.
Programmers who appreciate their art will learn what they need to know,
so the problem is often not #2 either. The problem is most often #3. I
could go to Google right now and do a search on "thread class" and
"C++", and find a bunch of examples of people who have wrapped threads
in classes. I think this is the wrong approach. Wrong because there
is a certain mood that a programmer should be in when using any
framework, and that mood, IMO, is not achievable by wrapping a thread
in a class. That mood is, however, achievable by wrapping the _other_
synchronization primitives in classes.

IBM has written a series of excellent articles that show how to port
applications to Linux from Windows. Reading these articles helps to
convince oneself that there is a defect in our current expectations of
the thread function, considering many scenarios:
http://www.ibm.com/Search/?q=threading+mutex+Windows&v=14&lang=en&cc=us&en=utf&Search.x=0&Search.y=0&Search=Search

These synchronization primitives do not exist in a form that is
portable, of course, but at least on many industrial-strength operating
systems, they are present. The only qualm I have with Microsoft's Win32
API for synchronization is the critical section. It is the only primitive
that is impossible to wrap in a truly portable (no #define's)
C++ wrapper facade. Its raw presentation is a Windows-native struct,
as opposed to a HANDLE, which is actually a pointer type. All the other
primitives can be made more or less portable.
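
To make the qualm concrete, here is a rough sketch (the class and file
names are mine, not from any real library) of why the HANDLE-based
primitives are easy to hide and the critical section is not:

// mutex.h -- sketch only; nothing Windows-specific appears here,
// because a HANDLE is just a pointer type underneath.
class Mutex {
public:
    Mutex();
    ~Mutex();
    void acquire();
    void release();
private:
    void *handle_;                      // really a Win32 HANDLE
    Mutex(const Mutex &);               // not copyable
    Mutex &operator=(const Mutex &);
};

// mutex_win32.cpp -- the only file that sees <windows.h>.
#include <windows.h>
#include "mutex.h"

Mutex::Mutex() : handle_(CreateMutexW(0, FALSE, 0)) {}
Mutex::~Mutex() { CloseHandle(handle_); }
void Mutex::acquire() { WaitForSingleObject(handle_, INFINITE); }
void Mutex::release() { ReleaseMutex(handle_); }

// A CRITICAL_SECTION cannot be hidden the same way: it is a struct held
// by value, so the class declaration itself would have to pull in
// <windows.h> (short of hiding it behind a pointer to an incomplete type).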

But again, what I learned by wrapping these primitives is that the
problem with synchronization has nothing to do with defects in C++. It
has to do with how one thinks about concurrent programming. In
particular, to find a regular framework for multi-threading, it helps
to make certain fundamental assumptions about the thread function. An
analogy would be a stack for recursion. One does not really need a
stack to do recursion, but the assumption and expectation that it is
available makes life easier, and no modern programmer would think of
using recursive functions on any other basis.

The same applies to threading. Do we need a breakthrough? No. A
modification in expectations? Yes. For threading to feel natural, we
need to have confidence that complexity will remain constant no
matter how many threads or locked global objects we have. This is a
matter of elegance. Take a look at this link, and imagine having 80 or
so invocations of this function and its friends in one program (as I do
now), and ask yourself if you would feel confident that your code would
not deadlock.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/waitformultipleobjects.asp

I wouldn't.

> There's
> nothing inherent in locking that requires deadlock to happen. But
> if your meta implementation can allow deadlock to occur then all
> your actual lock implementations will be required to deadlock
> under certain conditions.

What do you mean by this? :)

> There might be a loophole where you could claim if the specification
> has non-observable behavior, you could ignore it. But with the fondness
> for post condition definitions in C++, they will likely make behavior that
> is non-observable in a multi-threaded environment deterministically
> observable in a single threaded environment.
>

And this?

-Le Chaud Lapin-


--

Mathias Gaunard

Dec 24, 2006, 6:02:44 AM
Le Chaud Lapin wrote:

> I think that it is possible to use C++ as it is effectively in a
> multi-threaded environment

The C++ standard doesn't guarantee that C++ is usable in a
multi-threaded environment.

That is what is being corrected.

Joe Seigh

Dec 27, 2006, 12:19:45 PM
Le Chaud Lapin wrote:
> Hi Joe,
>
> Joe Seigh wrote:
>
>>There's
>>nothing inherent in locking that requires deadlock to happen. But
>>if your meta implementation can allow deadlock to occur then all
>>your actual lock implementations will be required to deadlock
>>under certain conditions.
>
>
> What do you mean by this? :)

A meta implementation is one way to specify a design. If your
meta implementation is inherently prone to deadlock, then your
actual implementations must be prone to deadlock as well or else
they're not compliant implementations. Deadlock would be a
required behavior.

Be careful what you ask for. You just might get it.


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.


James Kanze

Dec 31, 2006, 3:18:20 PM
Le Chaud Lapin wrote:
> Joe Seigh wrote:
> > Unfortunately, they seem to be concentrating on low level design
> > right now, specifically the memory model. The problem is they
> > don't know how to do high level specification so they're creating
> > a low level specification and will do the higher level constructs
> > in the form of a meta implementation.

> What do you mean here? Are they concocting new low-level C++
> primitives that presume the existence of hardware instructions to
> support those new primitives?

No. They're defining what it means to access memory from more
than one thread at a time. Until this is done, any
multithreaded program in C++ works either by chance (which is
probably the case of most multithreaded programs), or because it
is based on additional guarantees given by the
implementation---there are "intuitive" mappings for much of the
Posix C model to C++, for example, even if they do leave a
number of C++ issues unanswered. (Note that on the platform I
most frequently program on, Sun CC and g++ use incompatible
mappings, which means that a program which is correct with
regards to threading for Sun CC may not be with g++, and vice
versa.)

--
James Kanze (Gabi Software) email: james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

Dec 31, 2006, 3:19:05 PM
Le Chaud Lapin wrote:
> dd.fo...@googlemail.com wrote:
> > How long will it be before c++ has anything to say about threads?

> A Very Long Time, hopefully.

In other words, you prefer the current situation, where it is
impossible to use threads at all in portable C++. (Even on the
same system: under Solaris, code written for Sun CC does not
work with g++, and vice versa, even though both use the same
underlying Posix implementation.)

> I think that it is possible to use C++ as it is effectively in a
> multi-threaded environment, provided that good primitives to support
> such an environment have been created. In particular, the
> synchronization primitives are critical (no pun intended).

You need more than primitives. As long as accessing memory from
two different threads is undefined behavior, you cannot write
multithreaded programs. And the current sequence point model
simply doesn't work in a multithreaded environment.
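
A minimal sketch of the point, using standard POSIX calls for
concreteness (the program itself is mine and purely illustrative):
nothing in the current C++ standard says what this program means.

#include <pthread.h>

int counter = 0;                     // shared, unsynchronized

extern "C" void *worker(void *) {
    for (int k = 0; k < 100000; ++k)
        ++counter;                   // no rule covers the other thread's accesses
    return 0;
}

int main() {
    pthread_t a, b;
    pthread_create(&a, 0, worker, 0);
    pthread_create(&b, 0, worker, 0);
    pthread_join(a, 0);
    pthread_join(b, 0);
    // The standard assigns no meaning to the final value of counter;
    // whatever you observe is an accident of compiler and hardware.
    return 0;
}

Whether the increments interleave, tear, or sit in a register for the
whole loop is exactly what the memory model work has to pin down.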

[...]
> To give two hints:

> 1. There was a post not long about where a programmer wanted to
> manually rollback the stack of another thread, essentially duplicating
> what exception handling code would do.

> 2. There is also the recurring scenario where thread B is in the middle
> of a lengthy process and thread A wants it to stop right away, but in a
> clean way also.

> I believe that both of these problems, if solved elegantly, would force
> a fundamental reconsideration of the basic principles underlying thread
> usage, and a framework supporting the new model could be implemented
> entirely in C++, relatively portably, with no changes to the language.

C++ has a requirement that it be implementable under current
operating systems. Under current operating systems---at least
under Solaris and Linux, and I think under Windows---there is no
way to implement a general solution to these problems; both
require significant collaboration in the manipulated thread. So
it would be extremely surprising if the C++ standard required a
solution to them. And useless---a standard that cannot be
implemented under Windows or Unix isn't of any interest to
anyone.

--
James Kanze (Gabi Software) email: james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Le Chaud Lapin

Dec 31, 2006, 8:55:57 PM
James Kanze wrote:
> Le Chaud Lapin wrote:

> > I think that it is possible to use C++ as it is effectively in a
> > multi-threaded environment, provided that good primitives to support
> > such an environment have been created. In particular, the
> > synchronization primitives are critical (no pun intended).
>
> You need more than primitives. As long as accessing memory from
> two different threads is undefined behavior, you cannot write
> multithreaded programs. And the current sequence point model
> simply doesn't work in a multithreaded environment.

It's not undefined in my framework. I use old-fashioned
locking/unlocking.


>
> C++ has a requirement that it be implementable under current
> operating systems. Under current operating systems---at least
> under Solaris and Linux, and I think under Windows---there is no
> way to implement a general solution to these problems; both
> require significant collaboration in the manipulated thread. So
> it would be extremely surprising if the C++ standard required a
> solution to them. And useless---a standard that cannot be
> implemented under Windows or Unix isn't of any interest to
> anyone.

I have Windows and Linux covered, meaning that if you use only the
interfaces of my synchronization primitives (Mutex, Event, etc.), it
will cross-compile, though the implementation is not portable, of course.
But the _code_ that uses these primitives is
portable in the strictest sense, meaning it is not possible to look at
the code and tell which OS it is for. This hints that there are actually
two problems:

1. Having available the basic primitives of synchronization
2. Constructing a framework in such a way that the use of that framework
does not drive the programmer insane.

These are two separate issues.

The first problem is the domain of the OS people. For example, if
there were no atomic operations like test-and-set, or even an
ALU-managed XOR to external store, there would be nothing the C++
library writer could do to guarantee safeness. Luckily, that's not the
case. We not only have that, we have a lot more. It's just that some
OS vendors have not paid attention to those who have already crossed
the finish line.

Once the first part is done, then the second part can be finished, but
not before.

This is what I said about the OS people doing their part. If such a
framework is to have any minimal standard of elegance, it has to be
pure C++ to start with, the model itself has to make some sense, the
usage pattern has to make some sense, and there cannot be any
unexpected quirkiness because you had to compensate for the lack of a
kernel-mode primitive on a particular OS (like spawning a new thread
just to get a WaitableTimer on Windows CE).

In my experience, despite Win32's bland synchro API, it is wrappable,
and what is surprising is that if you ignore the limitation on number
of handles for WaitForMultipleObjects, and the fact that Windows CE
does not support Waitable Timers, all the primitives to provide closure
for synchronization are present. IBM also did a great job showing that
the same model can be realized on Linux.

The most important thing I think is coming out of all of this
activity is the realization that certain primitives are
fundamental, and not really optional, something discovered long ago
with Dijkstra's execution stack: it became generally accepted that an
execution stack is not something he simply dreamed up and thought would
be nice to have, but fundamental.

Also, to remain honest, I did not do critical sections, which means I
use Mutexes exclusively for locking, with a performance hit, but that
will be fixed. My gut feeling is that Microsoft implemented them using
nothing more than test-and-set with failover to a real mutex on
Windows. I neglected them because it was hard to make something
portable if your C++ class must include in its declaration something
that #include's from Windows.h.

So to summarize: The OS people can make the primitives as raw and
unsightly as they want as long as the primitives are present and
complete. From that, it becomes easy to write a clean, C++ wrapper
framework to support robust multi-threading.
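
As a sketch of what I mean by "clean" (the names below are mine and
purely illustrative; the Mutex is assumed to be a facade of the kind I
described, exposing acquire() and release()), the user-facing layer can
be as small as a scope guard:

// Hypothetical names, not a real library: a scope guard over any
// primitive that exposes acquire()/release().
template <class Lockable>
class ScopedLock {
public:
    explicit ScopedLock(Lockable &l) : l_(l) { l_.acquire(); }
    ~ScopedLock() { l_.release(); }          // released on every exit path
private:
    Lockable &l_;
    ScopedLock(const ScopedLock &);          // not copyable
    ScopedLock &operator=(const ScopedLock &);
};

// Usage: nothing OS-specific is visible, and the lock cannot be
// forgotten on an early return or an exception.
//
//     Mutex g_table_lock;
//
//     void insert(Table &t, const Record &r) {
//         ScopedLock<Mutex> guard(g_table_lock);
//         t.add(r);
//     }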

-Le Chaud Lapin-

Mathias Gaunard

Jan 1, 2007, 10:48:12 AM
Le Chaud Lapin wrote:

> It's not undefined in my framework. I use old-fashioned
> locking/unlocking.

Locking for reading is not that efficient.
Yes, even reading is undefined behaviour.


> The first problem is the domain of the OS people. For example, if
> there were no atomic operations like test-and-set, or even an
> ALU-managed XOR to external store, there would be nothing the C++
> library writer could do to guarantee safeness.

There is a proposal for adding atomic operations like this to the
language. I suppose that if they're available in hardware they will be
used as is, and otherwise they would be emulated in software with mutexes.

James Kanze

Jan 2, 2007, 10:35:47 AM
Le Chaud Lapin wrote:
> James Kanze wrote:
> > Le Chaud Lapin wrote:

> > > I think that it is possible to use C++ as it is effectively in a
> > > multi-threaded environment, provided that good primitives to support
> > > such an environment have been created. In particular, the
> > > synchronization primitives are critical (no pun intended).

> > You need more than primitives. As long as accessing memory from
> > two different threads is undefined behavior, you cannot write
> > multithreaded programs. And the current sequence point model
> > simply doesn't work in a multithreaded environment.

> It's not undefined in my framework. I use old-fashioned
> locking/unlocking.

So you don't work under Windows, Linux (at least on a PC) or
Solaris. Or more likely, you're counting on some additional
guarantees from the implementation. Some of those guarantees
are pretty wide spread. Others not. The problem is that even
the widespread ones aren't part of the C++ standard (which
doesn't recognize threads) nor Posix (which doesn't recognize
C++). And not all are well documented, or even really
guaranteed from one release of the compiler to the next.

> > C++ has a requirement that it be implementable under current
> > operating systems. Under current operating systems---at least
> > under Solaris and Linux, and I think under Windows---there is no
> > way to implement a general solution to these problems; both
> > require significant collaboration in the manipulated thread. So
> > it would be extremely surprising if the C++ standard required a
> > solution to them. And useless---a standard that cannot be
> > implemented under Windows or Unix isn't of any interest to
> > anyone.

> I have Windows and Linux covered meaning that, if you use only the
> interfaces of my synchronization primitives (Mutex, Event, etc.) it
> will cross compile, but the implementation is not portable of course.

I'm less concerned about the interface -- I can always wrap
that. I'm more concerned about the actual synchronization
issues. My own experience is that code written for g++ doesn't
work correctly with Sun CC, and vice versa, because the actual
guarantees given by each compiler are different.

> But if you compile the _code_ that uses these primitives, that code is
> portable in the strictest sense, meaning, it is not possible to look at
> the code and tell which OS is for. This hints that there are actually
> two problems:

> 1. Having available the basic primitives of synchronization
> 2. Constructing a framework in such way that the use of that framework
> does not drive the programmer insane.

> These are two separate issues.

There is a fundamental problem with 1: knowing what the basic
primitives actually guarantee.

> The first problem is the domain of the OS people.

Partially, although not completely. Typically, the OS people
define a C interface, and you have to depend on an
implementation specific mapping to C++. Thus:

std::string s ;     // Initialized from argv...

void f()
{
    static std::string t( "xxx" ) ;    // OK with g++, not with Sun CC
    if ( s[ 0 ] == t[ 0 ] ) {          // OK with Sun CC, not with g++
    }
}

(I think that the g++ developers consider this latter case an
error, and plan to fix it. There are other issues, like the
handling of pthread_cancel, however, where it is less than
obvious.)

> For example, if
> there were no atomic operations like test-and-set, or even an
> ALU-managed XOR to external store, there would be nothing the C++
> library writer could do to guarantee safeness.

Actually, test and set or an atomic xor are far from sufficient.
(They're also not typically available without actually writing
inline assembler.)

> Luckily, that's not the
> case. We not only have that, we have a lot more. It's just that some
> OS vendors have not paid attention to those who have already crossed
> the finish line.

> Once the first part is done, then the second part can be finished, but
> not before.

You can't do the first part until you define exactly what you
mean by synchronization. That's about the point where C++ is
now. And it's difficult, because you do want something that is
implementable (with reasonable overhead) on mainline systems.

[...]


> Also, to remain honest, I did not do critical sections, which means I
> use Mutex's exclusively for locking, with performance hit, but that
> will fixed.

I presume here that you are confusing the name of the system
request with the actual functionality it provides. I know of no
system today which provides a pure critical section; what
Windows calls a critical section is in fact a mutex.

> My gut feeling is that Microsoft implemented them using
> nothing more than test-and-set with failover to a real mutex on
> Windows.

My timing measurements on Solaris suggest that
pthread_mutex_lock only calls the system when it has to,
regardless of the threading model being used. Since I don't
think Sun have some miracle technology unavailable to others, I
rather suspect that most systems do this.

> I neglected them because it was hard to make something
> portable if your C++ class must include in its declaration something
> that #include's from Wnidows.h].

At some point, you have to. There are no synchronization
primitives in any of the standard C++ headers. The usual
solution is to create a wrapper class, with system-dependent
implementations. (If the wrapper classes use the pimpl idiom,
there is no need for user code to depend on system specific
headers.)
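
For instance, a sketch along these lines (the names are mine, nothing
standard) keeps even a by-value type like Windows' CRITICAL_SECTION out
of user code:

// lock.h -- no system header needed here.
class Lock {
public:
    Lock();
    ~Lock();
    void acquire();
    void release();
private:
    struct Impl;                    // defined only in the system-specific file
    Impl *impl_;
    Lock(const Lock &);             // not copyable
    Lock &operator=(const Lock &);
};

// lock_win32.cpp -- one implementation file per system.
#include <windows.h>
#include "lock.h"

struct Lock::Impl { CRITICAL_SECTION cs; };

Lock::Lock() : impl_(new Impl) { InitializeCriticalSection(&impl_->cs); }
Lock::~Lock() { DeleteCriticalSection(&impl_->cs); delete impl_; }
void Lock::acquire() { EnterCriticalSection(&impl_->cs); }
void Lock::release() { LeaveCriticalSection(&impl_->cs); }

Of course, this only hides the interface; it says nothing about what
guarantees the operations actually give, which is the harder problem.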

> So to summarize: The OS people can make the primitives as raw and
> unsightly as they want as long as the primitives are present and
> complete. From that, it becomes easy to write a clean, C++ wrapper
> framework to support robust multi-threading.

The problem is that there is no portable set of guarantees you
can count on.

--
James Kanze (Gabi Software) email: james...@gmail.com

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Le Chaud Lapin

Jan 2, 2007, 11:23:17 AM
Mathias Gaunard wrote:
> Locking for reading is not that efficient.
> Yes, even reading is undefined behaviour.

I am not trying to cause a big discussion on locking mechanisms but...

Locking is efficient if an atomic test-and-set operation is used _with_
old-fashioned kernel-mode wait-state models. This presumes that the
acquire() is satisfied most of the time. If there are times that it is
not and a wait-state has to be entered, it is not efficient, but that
would have happened anyway.
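
Roughly like this, as a sketch using Win32 calls for concreteness (the
class is hypothetical, and the waiter bookkeeping a real lock would need
is deliberately left out):

#include <windows.h>

class HybridLock {
public:
    HybridLock() : state_(0), event_(CreateEventW(0, FALSE, FALSE, 0)) {}
    ~HybridLock() { CloseHandle(event_); }

    void acquire() {
        // Fast path: the uncontended case costs one atomic instruction
        // and never enters the kernel.
        if (InterlockedExchange(&state_, 1) == 0)
            return;
        // Contended: enter a kernel wait-state instead of spinning.
        while (InterlockedExchange(&state_, 1) != 0)
            WaitForSingleObject(event_, INFINITE);
    }

    void release() {
        InterlockedExchange(&state_, 0);
        SetEvent(event_);   // a real implementation would skip this call
                            // when there are no waiters
    }

private:
    volatile LONG state_;   // 0 = free, 1 = held
    HANDLE event_;          // auto-reset, used only under contention
    HybridLock(const HybridLock &);
    HybridLock &operator=(const HybridLock &);
};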

> > The first problem is the domain of the OS people. For example, if
> > there were no atomic operations like test-and-set, or even an
> > ALU-managed XOR to external store, there would be nothing the C++
> > library writer could do to guarantee safeness.
>
> There is a proposal for adding atomic operations like this to the
> language.

You cannot simply add it to the language. There must be atomic
operations in the processor that support the software. Either these
atomic operations are present, or they are not. If they are present,
then the system ultimately relies on them to implement
synchronization. If they are not, then it should be intuitively
obvious that you are out of luck.

> I suppose that if they're available on hardware they will be
> used as is and otherwise they would be software emulated with mutexes.

Software mutexes, events, semaphores, etc. are all backed by atomic
operations. Even the most primitive form of spin-locking requires
atomic operations. If you ever see a paper talking about about
"lock-free" operations, quickly scan the paper for two things:

1. Use of atomic hardware instruction
2. Spinning (note that even this requires atomic reads and writes)

Once you have #2, all bets are off, as spinning is an extremely
inefficient form of waiting. Kernel-mode device drivers use it often
because "they know it is ok to do so". For example, if you are
processing an Ethernet frame that just arrived over the MAC, and you
are holding a lock that another kernel-mode component needs, it might
be better for that component to simply spin and wait it out than to go
through the trouble of entering a full-blown wait state.

Here is a paper where the author hand-waves a bit on the spinning issue.
He claims that, while a process is trying to (effectively) acquire a
lock, it should "DoSomethingUseful()" or "just wait or spin".

http://www.ddj.com/dept/cpp/189401457?pgno=3

1. You cannot just "DoSomethingUseful()". Generalized software does not
have tasks sitting around "waiting to be done".

2. "wait" is more like it, but processes are not human. They cannot
say, "Hey CPU, I am in a tight while loop at the moment. Could you
just not give me any quanta please? BUT!! I do not want to enter a
wait state. I am trying to prove that lock-free operations work.
Just get me off the CPU quickly each time around and give my quanta
to someone else."

3. "spin" is more like it, because that is precisely what he is doing.
And if you are spinning, you are not in a wait-state, as far as the
CPU is concerned. And if you are not in a wait-state, the CPU will
allow the full-quanta. The machine will become intolerably slow.

-Le Chaud Lapin-

Dilip

Jan 2, 2007, 2:05:48 PM
James Kanze wrote:
> Le Chaud Lapin wrote:
> > Also, to remain honest, I did not do critical sections, which means I
> > use Mutex's exclusively for locking, with performance hit, but that
> > will fixed.
>
> I presume here that you are confusing the name of the system
> request with the actual functionality it provides. I know of no
> system today which provides a pure critical section; what
> Windows calls a critical section is in fact a mutex.

No. The OP is right in this regard. Critical sections are a
lightweight mechanism to synchronize access between two threads *within*
the same process. Grabbing a critical section is probably an atomic
test-and-set instruction. Mutexes, on the other hand, have an
independent existence under Windows. However, critical sections do have
to drop down to kernel mode if a lock is under contention. If you are
interested you can take a look at this link:
http://msdn.microsoft.com/msdnmag/issues/03/12/CriticalSections/default.aspx
It delves into great depth.
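
For anyone who has not used both, the two APIs look roughly like this in
the simplest case (a sketch; error handling is omitted and the mutex
name is just an example):

#include <windows.h>

// Intra-process: a CRITICAL_SECTION lives in your own memory; one
// interlocked operation when uncontended, a kernel wait when contended.
CRITICAL_SECTION g_cs;

void with_critical_section() {
    EnterCriticalSection(&g_cs);
    // ... touch shared state ...
    LeaveCriticalSection(&g_cs);
}

// Inter-process: a kernel mutex object has its own HANDLE and, if named,
// can be opened from other processes as well.
void with_kernel_mutex() {
    HANDLE m = CreateMutexW(0, FALSE, L"ExampleLock");  // name is illustrative
    WaitForSingleObject(m, INFINITE);
    // ... touch shared state ...
    ReleaseMutex(m);
    CloseHandle(m);
}

int main() {
    InitializeCriticalSection(&g_cs);
    with_critical_section();
    with_kernel_mutex();
    DeleteCriticalSection(&g_cs);
    return 0;
}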

Dilip

Jan 3, 2007, 8:46:42 AM
Le Chaud Lapin wrote:
>
> 1. Use of atomic hardware instruction
> 2. Spinning (note that even this requires atomic reads and writes)
>
> Once you have #2, all best are off, as spinning is an extremely
> inefficient form of waiting. Kernel-mode device drivers use it often
> because "they know it is ok to do so". For example, if you are
> processing an Ethernet frame that just arrived over the MAC, and you
> are holding a lock that another kernel-mode component needs, it might
> be better for that component to simply spin and wait it out that go
> through the trouble of enter a full-blow wait state.
>
> Here is paper where author hand-waves a bit on the spinning issue. He
> claims that, while a process is trying to (effectively) acquire a lock
> it should "DoSomethingUseful()" or "just wait or spin".

[snipped]

> 3. "spin" is more like it, because that is precisely what he is doing.
> And if you are spinning, you are not in a wait-state, as far as the
> CPU is concerned. And if you are not in a wait-state, the CPU will
> allow the full-quanta. The machine will become intolerably slow.

This is getting wildly OT so I will confine myself to correcting a
small misconception. Spinning need not cause any kind of "intolerable"
slowness, as you put it. For an example of how a spin-lock can be
implemented efficiently under Windows, take a look at this:
http://msdn.microsoft.com/msdnmag/issues/05/10/ConcurrentAffairs/#S3

Le Chaud Lapin

Jan 3, 2007, 8:57:12 AM
James Kanze wrote:
> Le Chaud Lapin wrote:
> So you don't work under Windows, Linux (at least on a PC) or
> Solaris.

Don't know where you got that. My code works on all three for the most
part. What I said was that some OS's are just getting their
kernel-mode work into a state of closure, and Microsoft Windows is one
OS where it is more closed than not.

> Or more likely, you're counting on some additional
> guarantees from the implementation. Some of those guarantees
> are pretty wide spread. Others not. The problem is that even
> the widespread ones aren't part of the C++ standard (which
> doesn't recognize threads) nor Posix (which doesn't recognize
> C++). And not all are well documented, or even really
> guaranteed from one release of the compiler to the next.

I do not think C++ should be thread-aware any more than it should be
"device driver" aware. I lose not one second of sleep knowing that, if
I have a multi-threaded application where two threads attack the same
global variable, there is potential for a race-condition. This is
obvious, and IMO, has nothing to do with C++.

> I'm less concerned about the interface -- I can always wrap
> that. I'm more concerned about the actual synchronization
> issues. My own experience is that code written for g++ doesn't
> work correctly with Sun CC, and vice versa, because the actual
> guarantees given by each compiler are different.

The kernel-mode people should be concerned with synchronization issues,
not the compiler writer, IMO.

> > The first problem is the domain of the OS people.
>
> Partially, although not completely. Typically, the OS people
> define a C interface, and you have to depend on an
> implementation specific mapping to C++. Thus:
>
> std::string s ; // Initialized from argv...
>
> void f()
> {
> static std::string t( "xxx" ) ;    // OK with g++, not with Sun CC
> if ( s[ 0 ] == t[ 0 ] ) {          // OK with Sun CC, not with g++
> }
> }

What is not OK?

> (I think that the g++ developpers consider this latter case an
> error, and plan to fix it. There are other issues, like the
> handling of pthread_cancel, however, where it is less than
> obvious.)

> Actually, test and set or an atomic xor are far from sufficient.


> (They're also not typically available without actually writing
> inline assembler.)

And that is a good thing, and is entirely congruent, for example, with
not having access to the instruction that forces a task switch, or an
instruction that flushes the translation look-aside buffer (TLB).

> You can't do the first part until you define exactly what you
> mean by synchronization. That's about the point where C++ is
> now. And it's difficult, because you do want something that is
> implementable (with reasonable overhead) on mainline systems.

I mean synchronization in the sense of Andy Tanenbaum's operating
systems book, and I do not think it is difficult. I am always surprised
that C++ programmers think "there is a lack of support for threads" in
C++. There is
a lack of support for threads in any language. The support comes from
a few hardware primitives that guarantee atomicity and kernel-mode
software techniques that have been well-understood for almost 40 years.
I do not understand what magic C++ programmers expect the language to
perform. All paths will ultimately lead to whether you have a hardware
test-and-set or something like it, and the basics of kernel-mode
synchronization.

Now if people are dismayed that we do not have a *library* that
interfaces to hardware/kernel-mode primitives, then that is
understandable, but it should be recognized that, if you expect such a
library, you had better expect a piece of hardware or an OS that has
such primitives. If you do not have atomicity of fundamental
operations, there is *nothing* you can do. This is intuitively
obvious, and I would imagine that someone somewhere made it his thesis
to prove it.

> I presume here that you are confusing the name of the system
> request with the actual functionality it provides. I know of no
> system today which provides a pure critical section; what
> Windows calls a critical section is in fact a mutex.

There is no confusion. I wrote my first kernel-mode threading system
almost 20 years ago, one that let you define C functions to be called
back from the kernel in user mode. This critical section is most likely
implemented as a spin-lock with failover to a mutex. There is really
no other way to do it. In fact, after finishing this post, I am going
to write a program to use one, break into the debugger, and check.

> My timing measurements on Solaris suggest that
> pthread_mutex_wait only calls the system when it has to,
> regardless of the threading model being used. Since I don't
> think Sun have some miracle technology unavailable to others, I
> rather suspect that most systems do this.

Right, so if they are calling this a mutex, that's their choice. But
if it were a system-wide mutex, where multiple threads in different
processes needed to wait on the mutex, then there would be no user-mode
test-and-set (spin-locking), as the address spaces would be separate.
In that case, it would go directly to the kernel, first time, every
time. This is simply a matter of choice of terminology. I did not
study the primitives on Solaris, but my gut feeling is that Microsoft
did some serious thinking in this area and got all their socks-on-feet
and gloves-on-hands. I would hope that Sun did the same thing. I
could not imagine that they did not.

Does Sun have inter-process (named or unnamed) mutexes?

> > So to summarize: The OS people can make the primitives as raw and
> > unsightly as they want as long as the primitives are present and
> > complete. From that, it becomes easy to write a clean, C++ wrapper
> > framework to support robust multi-threading.
>
> The problem is that there is no portable set of guarantees you
> can count on.

Yes, that is hardly surprising, and no C++ compiler writer is going to
change that. They can influence that by having constructive
conversations with the people who will - the CPU manufacturers and
kernel-mode OS writers. Beyond the assumption that they have access to
atomic operations on a particular CPU, there is nothing that a C++
compiler writer is going to achieve to help C++ "support threads",
beyond implementing some kind of spin-lock.

Perhaps I am wrong, but I get the feeling that some people (not
necessarily you James) are not yet convinced that there is nothing you
can do inside the language proper to get synchronization to "work".
There is not! You might as well be having a conversation about how to
implement a portable cryptographically secure pseudo-random number
generator (CSPRNG) without asking the hardware and kernel-mode people
if every system upon which the CSPRNG is to be implemented will have a
source of entropy. If *hardware* or "kernel-mode" support is not
available, all bets are off.

Finally, I think it is delusional to think "We'll just tweak a little
bit here and there for the sake of global variables..." meaning that,
knowing that most hardware will probably provide some form of atomic
test-and-set/swap, etc., you try to create a system where programmers
could expect global variables to be "covered".

You get two options: to wait or not to wait. There is no intermediate
state. You cannot block on access to a variable and simultaneously
pretend you are not waiting. You either spin, or you enter a
wait-state. There is nothing in between.

-Le Chaud Lapin-

Rupert Kittinger

Jan 3, 2007, 8:53:11 AM
{ extraneous line breaks removed. -mod }

Dilip wrote:


> James Kanze wrote:
>> Le Chaud Lapin wrote:
>>> Also, to remain honest, I did not do critical sections, which means I
>>> use Mutex's exclusively for locking, with performance hit, but that
>>> will fixed.
>> I presume here that you are confusing the name of the system
>> request with the actual functionality it provides. I know of no
>> system today which provides a pure critical section; what
>> Windows calls a critical section is in fact a mutex.
>
> No. The OP is right in this regard. Critical Sections are a
> lightweight mechanism to synchronize access between 2 threads *within*
> the same process. Grabbing a critical section is probably an atomic
> test-and-set instruction. Mutexes on the other hand, have an
> independant existence under windows. However critical sections do have
> to drop down to kernel mode if a lock is under contention. If you are
> interested you can take a look at this link:
>
http://msdn.microsoft.com/msdnmag/issues/03/12/CriticalSections/default.aspx
> It delves into great depths.
>
>

It may have been better to write something like "what Windows calls a
critical section has the same properties as what posix calls a mutex
(and what Windows calls a mutex corresponds to a posix semaphore.)"

Fortunately, it is clear what is meant by "condition_variable" since
they are not available in the Win32 API :-)

Rupert

Lourens Veen

Jan 3, 2007, 2:29:28 PM
Le Chaud Lapin wrote:

> James Kanze wrote:
>
>> Or more likely, you're counting on some additional
>> guarantees from the implementation. Some of those guarantees
>> are pretty wide spread. Others not. The problem is that even
>> the widespread ones aren't part of the C++ standard (which
>> doesn't recognize threads) nor Posix (which doesn't recognize
>> C++). And not all are well documented, or even really
>> guaranteed from one release of the compiler to the next.
>
> I do not think C++ should be thread-aware any more than it should be
> "device driver" aware. I lose not one second of sleep knowing that,
> if I have a multi-threaded application where two threads attack the
> same
> global variable, there is potential for a race-condition. This is
> obvious, and IMO, has nothing to do with C++.

The problem is not whether it will work or not, the problem is that
there is currently no way of writing a (guaranteed) portable C++
programme that uses threads. Currently, anything to do with threads
is undefined behaviour.

The C++ language semantics are specified in terms of an
abstract "virtual machine" to make the language specification
independent of any OS. If we want to implement a threading library,
then we have to specify its semantics in terms of that virtual
machine, just like with the rest of the language. And to be able to
define a mutex and say "No more than one thread shall own a mutex at
the same time", the virtual machine needs to support multithreading.

So, the question is how to extend the C++ virtual machine to support
threads, and do it in a way that keeps it compatible with all the
mainstream operating systems in use today. Only when we have that can
we describe primitives and/or a standard library extension in terms
of it.

Lourens

Dilip

Jan 3, 2007, 2:30:53 PM
{ to mods: I swear this will be my last OT post :-) }

Rupert Kittinger wrote:
> Fortunately, it is clear what is meant by "condition_variable" since
> they are not available in the Win32 API :-)

Now they are, but only in Vista. See the 2nd point in this post:
http://www.bluebytesoftware.com/blog/PermaLink,guid,17433c64-f45e-40f7-8772-dedb69ab2190.aspx

Le Chaud Lapin

Jan 4, 2007, 8:56:39 AM
Dilip wrote:
> Le Chaud Lapin wrote:
> This is getting wildly OT so I will confine myself to correcting a
> small misconception. Spinning need not cause any kind of "intolerable"
> slowless as you put it. For an example how a spin-lock can be
> implemented efficiently under windows, take a look at this:
> http://msdn.microsoft.com/msdnmag/issues/05/10/ConcurrentAffairs/#S3

IMO, this is precisely on topic. There seem to be a lot of C++
programmers who think that they will be able to wield some special
magic to violate what I believe are fundamental principles of OS
design. I am glad we are discussing this, because I intend to show
that there is no magic, that there is nothing that can be done in
code to make us able to code-and-forget.

First, spin-locks, from a C++ programmer's point of view, cannot be
implemented efficiently. I will get to what I mean by that in a
moment, but first, some notes about the article you posted. Jeffrey
Richter writes:

"No programmer in his right mind wants to think about thread
synchronization as he is coding his app. This is because thread
synchronization has little to do with making the program do what its
true intention is."

These two sentences prey upon what I believe is a false notion that
programmers are going to be able to avoid multi-threading issues
altogether and "just code". This is false. It is possible to build a
system where every single access to state is protected by a mutex, and
that would work, but short of that, you are going to have to do some
thinking.

He continues to write:

"Obviously, a thread that stops running is exhibiting very bad
performance. "

This is misleading. A thread that "stops running" does not have bad
performance. A thread that stops running has almost "no" performance.
Saying that the thread has "bad performance" gives the impression that
CPU cycles are being wasted, which is simply not true (in general). If
a thread is in a wait-state, then it is in that wait-state for a
reason, and the most likely reason is that, either it has nothing to
do, or it is logistically impossible for it to have been doing anything
anyway. There is a multiple-CPU situation where it is not ideal to
block if there is anything to do for a reader-writer pair, but that is
a special case.

And now to the "cannot be implemented efficiently" statement. I will
first quote what Jeffrey Richter wrote:

"It's impossible to know for sure if spinning in user mode would yield
better performance than transitioning to kernel mode, context-switching
to another thread, and returning back to user mode. It depends on how
much time the other thread needed to complete its task and this
information cannot be determined. Furthermore, instrumenting your code
in order to attempt to determine it would negatively impact
performance, which is what you're trying to improve. There is just no
way to guarantee a win with this. Also, there are many factors that
impact this: CPU make and model, CPU speed (GHz), hyper-threading,
dual-core, other threads running on the system, and so forth."

The first few words of the preceding quote are the thesis of what I have
been saying, and it is something that anyone who has ever had to write a
high-performance device driver already knows. They *know* when to use
spin-locking and when not to. If they use it, they already know, in
advance, that it is going to be efficient. If they do not know in
advance, while designing the software, that it is going to be efficient,
they will *not* use it. The *context* of the use of the spin-lock is
critical (no pun intended). People who write little user-mode spin-locks
hoping to get lucky will see their machine quickly enter rigor mortis.
Without that context, without knowing that it is OK to spin, you *will*
be executing a user-to-kernel-mode transition and a context switch.

Jeffrey Richter writes:

"If the number of CPUs is 1, then Enter calls SwitchToThread before
looping around again."

And there it is. Got a guarantee, spin. No guarantee, go to the kernel.
SwitchToThread goes to the kernel.

-Le Chaud Lapin-

Le Chaud Lapin

Jan 4, 2007, 8:57:43 AM
Lourens Veen wrote:
> The problem is not whether it will work or not, the problem is that
> there is currently no way of writing a (guaranteed) portable C++
> programme that uses threads. Currently, anything to do with threads
> is undefined behaviour.

And IMO, that is the fault of the OS people and library writers, not
the language proper.

> The C++ language semantics are specified in terms of an
> abstract "virtual machine" to make the language specification
> independent of any OS. If we want to implement a threading library,
> then we have to specify its semantics in terms of that virtual
> machine, just like with the rest of the language. And to be able to
> define a mutex and say "No more than one thread shall own a mutex at
> the same time", the virtual machine needs to support multithreading.

I would agree with this point of view under one condition: absolutely
no changes to C++ would be made. Note that I am not requesting that no
changes to C++ be made, I am trying to prove a point, which is...

Threading is an OS issue, not a language issue. It is OK to have
expectations of what the OS supports. I do not think it is OK to assume
that the "power of the language" will make it all better. In other words,
I would be entirely happy if the OS people provided a full set of
primitives, and the C++ community built a portable library on top of
those primitives.

> So, the question is how to extend the C++ virtual machine to support
> threads, and do it in a way that keeps it compatible with all the
> mainstream operating systems in use today. Only when we have that can
> we describe primitives and/or a standard library extension in terms
> of it.

Write multi-threading applications first, *try* to build a multi-OS,
portable threading facade, and once you feel that you have a good idea
of the set of primitives that are required, go back to the OS people
(not the C++ standards committee), and ask them to implement those
primitives on their OS's. Once that is done, normalize the threading
library, and then you are done.

I get the feeling that C++ programmers are, perhaps, expecting some new
keywords to be added to the language to make threading easier, or
whatever. Those keywords will end up being nothing more than library
calls in disguise. And most importantly, the _same_ attention to
detail will have to be exercised in the employment of those keywords as
would be required in the employment of thread-oriented library calls.

-Le Chaud Lapin-

Francis Glassborow

Jan 4, 2007, 4:05:45 PM
In article <ef645$459bc545$8259a2fa$78...@news1.tudelft.nl>, Lourens Veen
<lou...@rainbowdesert.net> writes

>So, the question is how to extend the C++ virtual machine to support
>threads, and do it in a way that keeps it compatible with all the
>mainstream operating systems in use today. Only when we have that can
>we describe primitives and/or a standard library extension in terms
>of it.

Yes, we need to redesign/specify the virtual machine. The problem is
that the current virtual machine is a linear one and assumes that only
one processor is operating. That means that conventional MT assumes that
processing will be sequential even if there are periodic switches
between threads. This simply is not good enough for the hardware now
coming on line. We need a virtual machine that supports concurrent
processing. Before the version of C++ we are working on is
superseded (i.e. circa 2019), I anticipate that standard machines will
have at least 256 cores (a version of Moore's law based on the idea that
the number of cores will double every 18 months).


--
Francis Glassborow ACCU
Author of 'You Can Do It!' and "You Can Program in C++"
see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

James Kanze

Jan 4, 2007, 4:02:52 PM
Dilip wrote:
> James Kanze wrote:
> > Le Chaud Lapin wrote:
> > > Also, to remain honest, I did not do critical sections, which means I
> > > use Mutex's exclusively for locking, with performance hit, but that
> > > will fixed.

> > I presume here that you are confusing the name of the system
> > request with the actual functionality it provides. I know of no
> > system today which provides a pure critical section; what
> > Windows calls a critical section is in fact a mutex.

> No. The OP is right in this regard. Critical Sections are a
> lightweight mechanism to synchronize access between 2 threads *within*
> the same process.

They implement what has commonly been called a mutex in the
threading community. Not a critical section. The two are
fundamentally different (although you can use a mutex to
implement critical sections, and vice versa). Depending on the
implementation of threads, in case of contention, you need a
system call to suspend the thread.

> Grabbing a critical section is probably an atomic
> test-and-set instruction.

Acquiring a mutex which is currently free is almost certainly a
single atomic instruction (with the necessary fences and/or
memory barriers, of course) in any modern OS. But that's
irrelevant. Mutexes and critical sections do different things:
a mutex protects data, where as a critical section protects a
part of the code. The semantics of a CriticalSection, under
Windows, are exactly those of a mutex, and quite different from
those of a critical section. (I'm most familiar with critical
sections from the work of Per Brinch-Hansen, but I think Hoare
was involved with their definition as well.)

> Mutexes on the other hand, have an
> independant existence under windows. However critical sections do have
> to drop down to kernel mode if a lock is under contention. If you are
> interested you can take a look at this link:
> http://msdn.microsoft.com/msdnmag/issues/03/12/CriticalSections/default.aspx
> It delves into great depths.

Don't get Microsoft's names for their API confused with the
underlying concepts, which have been around for a lot longer
than Windows. (Per Brinch-Hansen's monitor processes date from
the 1970's. And a mutex is a special case of a semaphore, which
was invented by Dijkstra, and is an even older concept.)

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

Jan 4, 2007, 4:08:25 PM
Le Chaud Lapin wrote:
> Dilip wrote:
> > Le Chaud Lapin wrote:
> > This is getting wildly OT so I will confine myself to correcting a
> > small misconception. Spinning need not cause any kind of "intolerable"
> > slowless as you put it. For an example how a spin-lock can be
> > implemented efficiently under windows, take a look at this:
> > http://msdn.microsoft.com/msdnmag/issues/05/10/ConcurrentAffairs/#S3

> IMO, this is precisely on topic. There seems to be a lot of C++
> programmers who think that they will be able to wield some special
> magic to violate what I believe are fundamental principles in OS
> design.

You keep claiming this, but you've presented no evidence for it.
And of course, the fact that bad programmers can abuse a feature
has never stopped it from being adopted into C++.

For what it's worth, I don't know of anyone on the committee who
isn't aware of the additional complexities that threading
implies. On the other hand, there are many cases where it is
the appropriate solution, and some where it is the only
solution. Ignoring it effectively either bans C++ from such
applications, or forces programmers to depend on implementation
definitions of undefined behavior.

> I am glad we are discussing this, because, I intend to show
> that that there is no magic, that there is nothing that will be done in
> code to make us able to code-and-forget.

But everyone already knows that. There's no silver bullet.

> First, spin-locks, from a C++ programmers point of view, cannot be
> implemented efficiently.

Who cares? As far as I know, there's no proposal to add
spin-locks to the language.

> Jeffrey
> Richter writes:

> "No programmer in his right mind wants to think about thread
> synchronization as he is coding his app. This is because thread
> synchronization has little to do with making the program do what its
> true intention is."

At which point, I stopped reading, because the author obviously
doesn't know much about threading and thread safety.

I think we agree that threading is hard, and threading issues
(who is responsible for what, etc.) must be addressed at the
design level.

> These two sentences prey upon what I believe is a false notion that
> programmers are going to be able to avoid multi-threading issues
> altogether and "just code". This is false. It is possible to build a
system where every single access to state is protected by a mutex, and
> that would work,

Probably not. The granularity is too small.

There is no silver bullet. Threading must be part of the high
level design.

--
James Kanze (GABI Software) email:james...@gmail.com


Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Lourens Veen

unread,
Jan 4, 2007, 4:10:46 PM1/4/07
to
Le Chaud Lapin wrote:

> Lourens Veen wrote:
>> The problem is not whether it will work or not, the problem is that
>> there is currently no way of writing a (guaranteed) portable C++
>> programme that uses threads. Currently, anything to do with threads
>> is undefined behaviour.
>
> And IMO, that is the fault of the OS people and library writers, not
> the language proper.

Consider the following:

    #include <iostream>

    int i(42);

    struct A {
        A() : a(new int) {}
        ~A() {
            delete a;
        }
        A(const A &);   // declared but never defined; A is never copied

        int * a;
    };

    void f() {          // void: the function has nothing to return
        A a;
        ++i;
    }

    int main() {
        f();
        std::cout << i << std::endl;
    }

Would an optimising compiler be allowed to move the increment of i in
f() to before the construction of a if that is more efficient? Well,
yes, since there is no interaction between the two, so the result
would be the same.

Now imagine A is a lock class rather than a useless one, and that
there's another thread calling f(). And then explain to me how the
library people and the OS people are supposed to prevent this from
happening.
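
To make that concrete, here is a sketch of what I mean (the Lock class
and the use of a Posix mutex underneath are just one possible
stand-in, not a proposal):

    #include <pthread.h>

    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    int i(42);

    // RAII lock, in the spirit of A above, but doing real work.
    class Lock {
    public:
        Lock()  { pthread_mutex_lock(&m); }
        ~Lock() { pthread_mutex_unlock(&m); }
    };

    void f() {
        Lock guard;   // meant to be acquired before the increment
        ++i;          // but nothing in the standard relates the two
    }

As far as the standard's single-threaded abstract machine is
concerned, the construction of guard and the increment of i are
unrelated, so a sufficiently clever optimiser that could see that the
lock functions never touch i would be within its rights to reorder
them.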

> Threading is an OS issue, not a language issue. It is OK to have
> expectations of what the OS supports. I do not think it is ok that
> the "power of the language" will make all better. In other words, I
> would be entirely happy if the OS people provided a full set of
> primitives, and the C++ built a portable library on top of those
> primitives.

But how do you define what the primitives in this library do?

Lourens

James Kanze

unread,
Jan 4, 2007, 4:10:02 PM1/4/07
to
Le Chaud Lapin wrote:
> Lourens Veen wrote:
> > The problem is not whether it will work or not, the problem is that
> > there is currently no way of writing a (guaranteed) portable C++
> > programme that uses threads. Currently, anything to do with threads
> > is undefined behaviour.

> And IMO, that is the fault of the OS people and library writers, not
> the language proper.

If the language says it's undefined, no one else can make it
portably defined.

Let's take two simple cases from real life:

G++ (like everyone else, I think) uses objects with static
lifetime in its exception handling. Before 3.0, it used them in
such a way that if two threads raised an exception at the same
time, the exception object was corrupted. Please explain how
this is the fault of the OS people and library writers.

Also g++ pre-3.0: every constructor of std::string modified
(incremented, and often later decremented) a single global
counter object. You don't really expect users to use an
explicit lock every time they construct a string, do you? Of
course, you might consider this a library issue, and not a
language issue, but std::string is defined in the C++ standard,
as part of the language.
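
To illustrate the kind of problem (a sketch of my own, not the actual
g++ code, using the Posix interface directly):

    #include <pthread.h>
    #include <cstdio>

    long useCount = 0 ;     // stand-in for the library's hidden counter

    extern "C" void* worker( void* )
    {
        for ( int k = 0 ; k < 1000000 ; ++ k ) {
            ++ useCount ;   // unsynchronized read-modify-write:
                            // increments from the two threads can be lost
        }
        return 0 ;
    }

    int main()
    {
        pthread_t t1, t2 ;
        pthread_create( &t1, 0, worker, 0 ) ;
        pthread_create( &t2, 0, worker, 0 ) ;
        pthread_join( t1, 0 ) ;
        pthread_join( t2, 0 ) ;
        std::printf( "%ld\n", useCount ) ;  // often less than 2000000
    }

The user never sees the counter, so he cannot protect it himself; the
library has to do it (or avoid it).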

> > The C++ language semantics are specified in terms of an
> > abstract "virtual machine" to make the language specification
> > independent of any OS. If we want to implement a threading library,
> > then we have to specify its semantics in terms of that virtual
> > machine, just like with the rest of the language. And to be able to
> > define a mutex and say "No more than one thread shall own a mutex at
> > the same time", the virtual machine needs to support multithreading.

> I would agree with this point of view under one condition: absolutely
> no changes to C++ would be made.

But adding such statements would be a change. In fact, analysis
has shown that the sequence point model doesn't adapt at all to
a multithreaded context. So a major part of the definition of
expression semantics, and what is or is not conforming, will
change.

> Note that I am not requesting that no
> changes to C++ be made, I am trying to prove a point, which is...

> Threading is an OS issue, not a language issue.

Threading is an issue at all levels. The OS can provide the
best primitives in the world, but if your code misuses them,
it's all for naught. At the lowest level, threading is a
hardware issue---if hardware does speculative loads, and pushes
loads forward (as modern hardware does), and doesn't provide the
necessary fence or membar instructions to inhibit this behavior
on command, there's no way an OS can offer the necessary
primitives. But of course, the OS has to play its part; you
don't want to have to implement all of the primitives in each
application, and typically, you probably can't, because some
things will require privileged mode. After that, the compiler
has to ensure that these primitives are actually available to
you, and that it doesn't automatically do anything forbidden in
the generated code. Intel processors have very few registers,
and so often have to spill in complex expressions. If the
compiler spills to static memory, it doesn't matter what the
processor and the OS offers. And of course, having a *standard*
interface, defined in the standard library, certainly helps
portability.
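
For concreteness, this is roughly what issuing such a fence by hand
looks like on one processor family (an x86 sketch in g++ inline
assembler; Sparc has its membar instruction instead, and other
compilers have other syntaxes):

    // Full memory fence on x86. The "memory" clobber also stops the
    // compiler itself from moving loads and stores across this point.
    inline void
    fullFence()
    {
        __asm__ __volatile__ ( "mfence" : : : "memory" ) ;
    }

Without something like this, available at the right level, neither the
library nor the application can make the necessary guarantees.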

None of which, of course, means that the programmer can ignore
the issues, and not pay attention to threads. All it means is
that when I write a program, I only have to analyse its thread
safety once, against the standard guarantees, rather than for
each platform, against the guarantees given by that platform.

[...]


> I get the feeling that C++ programmers are, perhaps, expecting some new
> keywords to be added to the language to make threading easier, or
> whatever.

From where do you get this feeling? My impression is just the
opposite, that there is an enormous resistance to adding any new
keywords for threading; that no one wants any new keywords for
thread support.

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Sean Kelly

unread,
Jan 4, 2007, 4:42:58 PM1/4/07
to
Le Chaud Lapin wrote:
> Lourens Veen wrote:
> > The problem is not whether it will work or not, the problem is that
> > there is currently no way of writing a (guaranteed) portable C++
> > programme that uses threads. Currently, anything to do with threads
> > is undefined behaviour.
>
> And IMO, that is the fault of the OS people and library writers, not
> the language proper.

Perhaps I'm missing something, but aren't multithreading libraries
typically written in C/C++? And if so, then the library programmer
must currently rely on compiler-specific knowledge of how to prevent
optimizations across critical code barriers. Isn't this something
better suited to standardization? I think there's a definite appeal to
be able to write multithreaded code in a language and to have its
behavior be well-defined. The actual synchronization mechanism used
(if any) is immaterial.
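
As a concrete example of the sort of compiler-specific knowledge I
mean (a g++/GCC idiom, and only a sketch; other compilers need other
incantations, e.g. _ReadWriteBarrier() with Microsoft's compiler):

    // A "compiler barrier": it generates no instructions at all, but
    // tells the optimizer not to cache values in registers or reorder
    // memory accesses across this point. It says nothing about what
    // the processor itself may reorder; that needs a hardware fence.
    #define COMPILER_BARRIER() __asm__ __volatile__ ( "" : : : "memory" )

    int sharedFlag = 0;

    void publish() {
        // ... write the data that sharedFlag guards ...
        COMPILER_BARRIER();   // keep those writes before the flag write
        sharedFlag = 1;
    }

A standard way of expressing "don't optimize across this point" would
remove the need for each library to know every compiler's trick.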

> Threading is an OS issue, not a language issue. It is OK to have
> expectations of what the OS supports. I do not think it is ok that the
> "power of the language" will make all better. In other words, I would
> be entirely happy if the OS people provided a full set of primitives,
> and the C++ built a portable library on top of those primitives.

Same here, but as above, I still think that it is beneficial for these
primitives to have well-defined behavior in whatever language they're
implemented. Assuming they're just pure assembler, I suppose the C/C++
standardization issue need not apply beyond ensuring that the compiler
will not try to optimize across the calls (which seems like an implicit
guarantee anyway, given that they would be opaque).

> > So, the question is how to extend the C++ virtual machine to support
> > threads, and do it in a way that keeps it compatible with all the
> > mainstream operating systems in use today. Only when we have that can
> > we describe primitives and/or a standard library extension in terms
> > of it.
>
> Write multi-threading applications first, *try* to build a multi-OS,
> portable threading facade, and once you feel that you have a good idea
> of the set of primitives that are required, go back to the OS people
> (not C++ standards committee), and ask them to implement those
> primitives on their OS's. Once that is done, normalize the threading
> library, and then you are done.

Isn't this essentially what the POSIX thread definition is?

> I get the feeling that C++ programmers are, perhaps, expecting some new
> keywords to be added to the language to make threading easier, or
> whatever. Those keywords will end up being nothing more than library
> calls in disguise.

Not necessarily true. Some keywords may indicate to the compiler what
optimizations are allowable on or across particular data. Sure, much
of this can be faked by library calls to assembler code now, but I
would think that some language support beyond the ill-defined
"volatile" keyword might be useful.

> And most importantly, the _same_ attention to
> detail will have to be exercised in the employment of those keywords as
> would be required in the employment of thread-oriented library calls.

Certainly.


Sean

Mathias Gaunard

unread,
Jan 4, 2007, 4:42:35 PM1/4/07
to
Le Chaud Lapin wrote:
> Mathias Gaunard wrote:
>> Locking for reading is not that efficient.
>> Yes, even reading is undefined behaviour.
>
> I am not trying to cause a big discussion on locking mechanisms but...
>
> Locking is efficient

You don't seem to understand.
On most architectures, you don't need mutual exclusion to read
constant data. Many threads should be able to read the same data at the
same time without any kind of locking.
However, this is not allowed by the C++ standard.


> You cannot simply add it to the language. There must be atomic
> operations in the processor that support the software.

While atomic operations make for more efficient locks, locks can be
built without them.
Dekker's or Peterson's algorithms are possible substitutes if atomic
locking operations are not available.
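
For example, here is a sketch of Peterson's algorithm for two threads.
Note that it assumes loads and stores are performed in program order;
on a modern compiler and processor it would additionally need fences,
which is exactly the kind of guarantee the language cannot currently
express:

    // Peterson's algorithm, two threads with ids 0 and 1.
    volatile bool wants[2] = { false, false };
    volatile int  turn     = 0;

    void lock(int id) {
        int other = 1 - id;
        wants[id] = true;       // announce interest
        turn = other;           // give priority to the other thread
        while (wants[other] && turn == other) {
            // busy-wait until the other thread is not interested,
            // or it is our turn
        }
    }

    void unlock(int id) {
        wants[id] = false;
    }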

Le Chaud Lapin

unread,
Jan 5, 2007, 2:58:50 PM1/5/07
to

James Kanze wrote:
> Le Chaud Lapin wrote:
> You keep claiming this, but you've presented no evidence for it.
> And of course, the fact that bad programmers can abuse a feature
> has never stopped it from being adopted into C++.

There are things that I could argue about all day. This isn't one of
them. It is patently obvious (to me) the nature of FSM's, atomic
operations, etc. I fail to see where the mystery lies. What more needs
to be known? What is the committee looking for? And for the record, I
never said that threading was not to be used because it can be abused.
I use multi-threading each day, every day, and have been for about 7
years.

> For what it's worth, I don't know of anyone on the committee who
> isn't aware of the additional complexities that threading
> implies. On the other hand, there are many cases where it is
> the appropriate solution, and some where it is the only
> solution. Ignoring it effectively either bans C++ from such
> applications, or forces programmers to depend on implementation
> definitions of undefined behavior.

Great. multi-threading is good. I agree.

> > First, spin-locks, from a C++ programmers point of view, cannot be
> > implemented efficiently.
>
> Who cares? As far as I know, there's no proposal to add
> spin-locks to the language.

Then what is there to discuss? What is expected to be found? What is
the problem that is being addressed?

> > Jeffrey


>
> I think we agree that threading is hard, and threading issues
> (who is responsible for what, etc.) must be addressed at the
> design level.

Well, in my case, it is not hard any more. I want to make it clear
that I *never* expected any magic from C++ to help me in
multi-threading. I knew that any problems that I would have with
threading would be to

1. my own ignorance of the primitives available
2. deficiency in the primitive set
3. carelessness in the application of existing primitives.

Personally, I have fixed #1, at least enough that I have found closure
for big system design. I use most of the Windows synchronization
primitives except for atomic inter-locked exchange and friends and
fibers, which are not really threads. I also discovered, in my own
view, that #2 does not exist. The Windows team (old VMS guard) did a
great job here. As far as #3 goes, I got seriously burned twice, which
was enough. I am at the point now where, if it happens again, I go
into a mode of discipline to stop it. I haven't had any more problems
since.

> There is no silver bullet. Threading must be part of the high
> level design.

In any language. Which is why I see little reason for tweakage of the
language.

To reiterate, I do think it would be useful to ask:

1. Did the OS people provide closure in the set of primitives offered?
2. Can we write a nice, clean library around those primitives?

Neither of these tasks requires tweakage of C++ proper.

-Le Chaud Lapin-

Thomas Richter

unread,
Jan 5, 2007, 3:05:45 PM1/5/07
to
James Kanze wrote:

> there are "intuitive" mappings for much of the
> Posix C model to C++, for example, even if they do leave a
> number of C++ issues unanswered. (Note that on the platform I
> most frequently program on, Sun CC and g++ use incompatible
> mappings, which means that a program which is correct with
> regards to threading for Sun CC may not be with g++, and vice
> versa.)

Would you mind elaborating a bit on this? The reason is that I'm
working on multithreaded code that is targeted at both the g++
and the Sun CC compilers. It would be helpful to know
some of the quirks.

Thanks,
Thomas

Lourens Veen

unread,
Jan 5, 2007, 3:03:49 PM1/5/07
to
James Kanze wrote:
> Le Chaud Lapin wrote:
>> Jeffrey
>> Richter writes:
>
>> "No programmer in his right mind wants to think about thread
>> synchronization as he is coding his app. This is because thread
>> synchronization has little to do with making the program do what
>> its true intention is."
>
> At which point, I stopped reading, because the author obviously
> doesn't know much about threading and thread safety.
>
> I think we agree that threading is hard, and threading issues
> (who is responsible for what, etc.) must be addressed at the
> design level.

Personally, if there was a way to do it, I'd much rather not think
about thread synchronisation while I'm coding my app. I'd much rather
say "X needs to be done" and let the compiler figure out how to
divide that task across the (potentially dozens or hundreds of)
available cores.

C++ covers a wide range of abstraction levels, from all the way down
to raw pointers and bit fields to very abstract things like
Boost.spirit, and everything in between. At lower levels threading is
definitely an issue to be reckoned with, but it would be nice if we
could eradicate it from the higher levels of abstraction and use a
more declarative style of programming there.

Generic programming can help there, and C++ is good at that. The STL
already allows parallel implementations of applicable algorithms on
containers for example.

Lourens

Le Chaud Lapin

unread,
Jan 5, 2007, 3:34:46 PM1/5/07
to
James Kanze wrote:
> Le Chaud Lapin wrote:
> > Lourens Veen wrote:
> > > The problem is not whether it will work or not, the problem is that
> > > there is currently no way of writing a (guaranteed) portable C++
> > > programme that uses threads. Currently, anything to do with threads
> > > is undefined behaviour.
>
> > And IMO, that is the fault of the OS people and library writers, not
> > the language proper.
>
> If the language says it's undefined, no one else can make it
> portably defined.
>
> Let's take two simple cases from real life:
>
> G++ (like everyone else, I think) uses objects with static
> lifetime in its exception handling. Before 3.0, it used them in
> such a way that if two threads raised an exception at the same
> time, the exception object was corrupted. Please explain how
> this is the fault of the OS people and library writers.

The person who wrote that part of the G++ library probably expected it
to be used in a single-threaded application. If it had been intended to
be used in a multi-threaded application, the library writer should have
written it for multi-threaded applications. If he did not write a
multi-threaded version, then that is the fault of the library writer.

If I write a program that contains 10 threads, and they all invoke
operator new() simultaneously, and cause the heap to become corrupted,
that will be either my fault, or the library writer's fault. If I am
using the single-threaded library to call against the heap, it will be
my fault. If I am using the multi-threaded library to call against the
heap, and it still happens, then it will be the library writer's fault.
If I am doing what I do today, calling the multi-threaded version in a
multi-threaded application without incident, then there is no fault.

> Also g++ pre-3.0: every constructor of std::string modified
> (incremented, and often later decremented) a single global
> counter object. You don't really expect users to use an
> explicit lock every time they construct a string, do you? Of
> course, you might consider this a library issue, and not a
> language issue, but std::string is defined in the C++ standard,
> as part of the language.

Library issue. I have a string class too. It does not have this
problem.

There is a programmer currently writing a very nice Big Integer class
that I would hope would someday become part of the standard. Two weeks
ago, he stated, "I am going to get rid of static dependencies so that
it can be thread-safe." So he knows.

(BTW, the first time someone told me that the STL was doing what you
mentioned above (in several places), I was appalled. He also said
there were similar issues with the standard containers.)

> But adding such statements would be a change. In fact, analysis
> has shown that the sequence point model doesn't adapt at all to
> a multithreaded context. So a major part of the definition of
> expression semantics, and what is or is not conforming, will
> change.

Statements or library functions? Note that it is entirely conceivable
that, at the end of the day, code that uses "standard" C++
multi-threading support by way of libraries will not look much
different from code that gets the same support by way of added
keywords. The thesis of what I am saying is that, in my opinion,
based on my own experience, changes to the language are not necessary,
any more than it is necessary to change the language to grab random
numbers from the architecture, for example.

> Threading is an issue at all levels. The OS can provide the
> best primitives in the world, but if your code misuses them,
> it's all for naught. At the lowest level, threading is a
> hardware issue---if hardware does speculative loads, and pushes
> loads forward (as modern hardware does), and doesn't provide the
> necessary fence or membar instructions to inhibit this behavior
> on command, there's no way an OS can offer the necessary
> primitives. But of course, the OS has to play its part; you
> don't want to have to implement all of the primitives in each
> application,

You cannot implement them in the application. They have to be part of
the harware/OS environment. There is no programmer in the world who is
going to wield any type of magic to implement atomicity if it is not
already present in the hardware infrastructure. Incidentally, I spoke
to a friend who helps design the Power microprocessors for IBM to
confirm this this morning (random back-off tricks notwithstanding).

>and typically, you probably can't, because some
> things will require privileged mode. After that, the compiler
> has to ensure that these primitives are actually available to
> you, and that it doesn't automatically do anything forbidden in
> the generated code. Intel processors have very few registers,
> and so often have to spill in complex expressions. If the
> compiler spills to static memory, it doesn't matter what the
> processor and the OS offers. And of course, having a *standard*
> interface, defined in the standard library, certainly helps
> portability.

???? Spill to static memory? What about the stack? Even Microsoft
publish the rules governing the use of registers for their compilers.
I have been writing assembly language on Intel platforms since the
8088. I have never known any programmer to go to static memory because
of the paucity of register space, unless it was a situation where it
was conceptually appropriate to do so (frame buffer operations).
Needless to say, those functions would not be reentrant, but no one would
expect them to be.

If compilers do this, they break recursion, in both trivial and
non-trivial cases.

> None of which, of course, means that the programmer can ignore
> the issues, and not pay attention to threads. All it means is
> that when I write a program, I only have to analyse its thread
> safety once, against the standard guarantees, rather than for
> each platform, against the guarantees given by that platform.

I do sympathize with the "mood" that you seek. I am 100% in agreement
with having the programmer feel good while using the primitives for
multi-threading. Again, I think that "feel good" can be achieved
entirely using libraries without perturbations to the language itself.

> From where do you get this feeling? My impression is just the
> opposite, that there is an enormous resistance to adding any new
> keywords for threading; that no one wants any new keywords for
> thread support.

Well, that makes me feel a lot better.

I get the feeling from what others have been writing about CSP,
efficiencies, or lack thereof, with blocking, how it is bad that two
threads operating against an unprotected global variable can cause that
global variable to become corrupt...??? Isn't this obvious????

The last complaint is particularly important. It gives the impression
that, if the problem is being complained about, then those who are
doing the complaining do not know what the solution is. Otherwise,
they would not be complaining. And if they do not know the solution,
they are not yet aware that these problems can be easily fixed using
existing primitives (libraries) that are entirely orthogonal to the
language itself (as is my opinion and that of others). And if they
complain in the context of c++.moderated, there is the impression that,
not only are libraries not enough, something more must be done with
C++. But if the libraries are not enough, the only thing left to be
done with C++ is to change the language itself.

-Le Chaud Lapin-

Le Chaud Lapin

unread,
Jan 5, 2007, 3:33:47 PM1/5/07
to

Lourens Veen wrote:
> int i(42);
>
> struct A {
> A() : a(new int) {}
> ~A() {
> delete a;
> }
> A(const A &);
>
> int * a;
> };
>
> void f() {
> A a;
> ++i;
> }
>
> int main() {
> f();
> std::cout << i << std::endl;
> }
>
> Would an optimising compiler be allowed to move the increment of i in
> f() to before the construction of a if that is more efficient? Well,
> yes, since there is no interaction between the two, so the result
> would be the same.

I would expect the value that is printed out to be 43.

I do not agree that the compiler writer would be allowed to reorder the
statements so that i is incremented first. For example, there is only
the declaration of the copy constructor of A above. There is no
guarantee that the definition of the copy constructor does not depend
upon the value of i. So in a world were there were no such thing as
multi-threading, IMHO, a reordering of the code would be incorrect.

> Now imagine A is a lock class rather than a useless one, and that
> there's another thread calling f(). And then explain to me how the
> library people and the OS people are supposed to prevent this from
> happening.

I do not understand what you mean by a lock class. Do you mean it
acquires a lock?

If there is a potential for deadlock because some function is calling
another function that contains an auto object that will block on an
attempt to call a lock, that is the fault of the programmer, not the
language. The perennial question is, as always, :

"What was the programmer trying to do when he did that?"

> > Threading is an OS issue, not a language issue. It is OK to have
> > expectations of what the OS supports. I do not think it is ok that
> > the "power of the language" will make all better. In other words, I
> > would be entirely happy if the OS people provided a full set of
> > primitives, and the C++ built a portable library on top of those
> > primitives.
>
> But how do you define what the primitives in this library do?

Ah. This is a catch-22. It is difficult to develop an intuitive
notion of what primitives are necessary without gaining much experience
with multi-threading. But it is difficult to gain much experience with
multi-threading if one spends too much time trying to debug race
conditions and deadlocks.

There are two solutions to this dilemma:

1. Study first then do. I imagine that a programmer's experience with
synchronization often comes when there is a need to implement
multi-threading in an application. This might not be the best approach
to learning. Perhaps it would be better to experiment first, not on a
production system, but a toy. Play much. Get a good feel for what
everything is for. Do not make the same mistake I did and assume that
much of what you see is superfluous, for example, in the case of
Microsoft's implementation. The day might come where you change your
mind. After playing and learning, _then_ apply to real systems, but do
not use the API in the raw. It is too unsightly for that. Use C++
wrapper classes to wrap. DuplicateHandle and its equivalent are
necessary.

2. Talk to experts, the kind that spend 6 hours a day in the kernel.
It seems that the C++ community is not yet aware of the right question
to ask about threading. If this is the case, it takes only a simple
conversation with, say, 10 experts, who have been working with
synchronization for 30 years, to at least convey to them the visions
we have, no matter how vague they are. Those experts will be able to
complete the picture, and indicate if we are asking for something that
we think we need but do not. If the opposite is the case, then they
will indicate that also.

If #1 is the approach, after learning, write multi-threaded
applications. Write many. Get burned once or twice (a burn can last
several days if it is really hot). There will come a point when the
same primitives recur repeatedly. Not surprisingly, they are the ones
you would expect from reading any book on OS design: Events,
Semaphores, Mutexes. Then, if you do any real-time or device driver
work, you can see where spin-locks make sense. Then you might do some
timer-related work, such as processing any of N queues that are
scheduled for processing at different times. Waitable timers shine
here. Then you can see that system-wide mutexes are inefficient if you
are using them in the same process (related group of threads on
Windows), so the addition of user-mode spin-locks with mutex failover
becomes a very-nice-to-have, and you come to appreciate what Microsoft
chose to call critical sections.

There are other primitives both in user-mode and kernel-mode for atomic
test-and-set, pointer swapping, etc. Bus-locking has been around
forever. These are useful of course, but the ones that the average C++
programmer needs to write a (very large) multi-threaded system are the
bread-and-butter primitives: events, semaphores, mutexes, waitable
timers, critical sections.

Finally, the really, really big one, the one that ties everything
together, and makes you feel like you do not have to struggle with
managing what would otherwise be overwhelming complexity, is none other
than WaitForMultipleObjects and its equivalents. This is a surprisingly
useful function, one that I had originally placed in the
"Hmmm... interesting... not sure why someone would want that..." category
long ago when I first saw it. The day I discovered what it is really
used for, I was ready to kiss the feet of the Microsoft engineer who
wrote it. This function is crucial for large, complex, multi-threaded
applications!

Again, I do not use these primitives in the raw. I have a set of
wrapper classes, which makes using them more pleasurable than the raw
Microsoft API.

Basic primitives that you might need to write large multi-threaded
applications:

1. Events
2. Mutexes
3. Semaphores
4. Waitable Timers (block until a point in time occurs. *not* the same
as sleeping)
5. Critical Sections (spin in user mode, drop to kernel if spin did not
work)
6. WaitForSingleObject
7. WaitForMultipleObjects (importance should not be underestimated,
IMO)

Things like spin-locks, asynchronous procedure calls, condition
variables, timer queues, atomic operations, fibers...these can be
useful in other circumstances...but I would think that a C++ programmer
who wants to have something relatively complete without too much fuss
could get by with these, all wrapped of course.
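
To give a flavour of what I mean, here is a rough sketch of the kind
of worker loop where WaitForMultipleObjects earns its keep (the names
are mine, error handling omitted; in real code this would of course go
through the wrapper classes):

    #include <windows.h>

    // Wait simultaneously for "please stop" and "work is available".
    void worker_loop(HANDLE stop_event, HANDLE work_ready_event)
    {
        HANDLE handles[2] = { stop_event, work_ready_event };
        for (;;) {
            DWORD r = WaitForMultipleObjects(2, handles, FALSE, INFINITE);
            if (r == WAIT_OBJECT_0) {               // stop requested
                return;
            }
            else if (r == WAIT_OBJECT_0 + 1) {      // work available
                // ... pull an item off the queue and process it ...
            }
        }
    }

One thread, blocked on several unrelated conditions at once, with no
polling: that is what makes the function so valuable.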

-Le Chaud Lapin-

Gerhard Menzl

unread,
Jan 5, 2007, 4:16:19 PM1/5/07
to
Le Chaud Lapin wrote:

> Threading is an OS issue, not a language issue. It is OK to have
> expectations of what the OS supports. I do not think it is ok that
> the "power of the language" will make all better. In other words, I
> would be entirely happy if the OS people provided a full set of
> primitives, and the C++ built a portable library on top of those
> primitives.

If you believe, as it seems to be the case, that a library built on OS
primitives is all it takes to make any C++ program thread-safe, then
please consider:

struct A { A(); };

void f()
{
static A a;
}

and explain how you would prevent the variable a from being constructed
more than once, using library calls only, and without any thread-related
guarantees by the language.


--
Gerhard Menzl

Non-spammers may respond to my email address, which is composed of my
full name, separated by a dot, followed by at, followed by "fwz",
followed by a dot, followed by "aero".

Le Chaud Lapin

unread,
Jan 6, 2007, 11:46:28 AM1/6/07
to
Gerhard Menzl wrote:
> Le Chaud Lapin wrote:

> If you believe, as it seems to be the case, that a library built on OS
> primitives is all it takes to make any C++ program thread-safe, then
> please consider:
>
> struct A { A(); };
>
> void f()
> {
> static A a;
> }
>
> and explain how you would prevent the variable a from being constructed
> more than once, using library calls only, and without any thread-related
> guarantees by the language.

Hi Gerhard,

I was hoping for this example. This is one of the cases where the C++
standards bodies should have been well aware of the consequences. I
remember reading that the initialization of a is done once, and only
once, and thinking, "Ok, but...surely they know the consequences of
making that guarantee." Furthermore, there is an issue that is
actually unrelated to multi-threading here: the compiler has to make
sure there is a way to know whether a has already been initialized
(extra variable).

So yes, to answer your question, this is one of the examples where a
race condition, in principle, can occur. But I place this in the
"Warning: this is not a recursive function" category. Note that, even
if you were guaranteed that a were somehow magically initialized once,
you would still have the problem that it is a global variable in
disguise. Multiple threads would still have to work against this
single, fundamentally global, variable.

My position:

Given that there is really only one variable for multiple threads, and
that, if the function is expected to operate against that global
variable, then multiple initializations is only one problem - there
remains the problem of mutual exclusion.

Redesign the code, make the variable global (just outside the function,
if necessary), and put a mutex on it.

-Le Chaud Lapin-



Le Chaud Lapin

unread,
Jan 6, 2007, 11:47:42 AM1/6/07
to
Lourens Veen wrote:
> Personally, if there was a way to do it, I'd much rather not think
> about thread synchronisation while I'm coding my app. I'd much rather
> say "X needs to be done" and let the compiler figure out how to
> divide that task across the (potentially dozens or hundreds of)
> available cores.
>
> C++ covers a wide range of abstraction levels, from all the way down
> to raw pointers and bit fields to very abstract things like
> Boost.spirit, and everything in between. At lower levels threading is
> definitely an issue to be reckoned with, but it would be nice if we
> could eradicate it from the higher levels of abstraction and use a
> more declarative style of programming there.
>
> Generic programming can help there, and C++ is good at that. The STL
> already allows parallel implementations of applicable algorithms on
> containers for example.

Hi Lourens,

You certainly have my empathy regarding ease-of-use.

Yet, if I thought that there were something that the language could do
to help with threading by way of, say, adding keywords or operators, I
would support it enthusiastically. But as I mentioned in another post,
I think if such keywords were "invented", there would be an extremely
strong parallel between the use of those keywords and the use of
library calls that do the same thing. Note that implicit in my view
is that the "I just need something to happen" mentality will not work
with multi-threading. Threading is not just about being able to spawn
and kill new threads. Nor is it only about preventing simultaneous
access to state. It goes far beyond that.

What many in this group are doing (including myself sometimes) is
mixing up "multi-threading" with synchronization.

Multi-threading means just that. You can have multiple threads. But
this does not necessarily imply synchronization. You could have 10
threads working on 10 global objects, 1 object per thread, and never
need to think about synchronization. No mutexes needed.

Or you could have only one thread in an application, one that blocks
until a GUI message is placed into the window message queue of
WIN32K.SYS in Microsoft Windows. Here, there is only one thread, but
you need an
event, and the part of the system that is not yours, the part that
controls the queues, has a spin-lock backed by a mutex.

Or you can have a system that mixes both threading and synchronization.

One thing that I know for myself is that threading/synchronization
*requires* conscious design. This is a system design issue. That your
poor global variable got trampled on because you did not protect it
with a mutex is not simply a practical "inconvenience". People who use
semaphores on queues and mutexes on global state are not simply trying
to serialize access. They are organizing processes, as in flows of
execution, deliberately coordinated with deliberate control. The same
thing goes on in a chemical factory - you _must_ think from a systems
mentality. You must be cognizant of the reason you have multiple
threads in the first place.

I would not call my threading experience spectacular, but I wrote my
first kernel-mode preemptive multi-tasker on an iAPX 80386 in assembler
and C in 1989, and I have been using Windows threading and
synchronization daily since 1998, and I can tell you from my own
experience: there is no magic to be had. I think the only way to share
my point of view is to write a lot of threading applications. I am not
saying that programmers here have not. Frankly, I have no idea what
threading experience people here have, but based on posts about
multiple CPU's, the post about spin-locking being efficient, etc...I
get the feeling that it is not something that they do every day.

As far as multiple processors and multiple cores go, that is handled
automatically. The engineers at Intel, Freescale, ARM, IBM, National
Semiconductor, MIPs...when they design these chips, they spend much
time anticipating how they will be used, both by the general programmer
and by the OS writer who has to support all the things that a
multi-tasking OS is expected to support (optimal switching,
mutual-exclusion, paging, life-time management, kernel-mode exceptions,
double-preemption, the list is endless). If you look at the original
architecture of the 80386, for example, you will see that they not only
anticipated how things should be, but also accommodated the legacy
system by providing the option of segmentation and paging, the register
set, the curiously convenient word that could be used to locate a page
in the swap file on hard disk that just happened to be *EXACTLY* the
right size you would want....these engineers know what we need. They
anticipate _very_ well, so the result is that, if you write against the
API that is provided by your compiler, you will hardly ever have to
worry about these things. They know that, if it happens that their CPU
is being used in a 4-way or 8-way system, the average programmer would
really rather not care about that. If it *does* happen that you have to
start consciously thinking about cores, they are not going to let you
find out the hard way. They will simply tell you, in the reference
manuals.

To give another example, I am typing this post on a dual core machine.
Another machine I had a few months ago was a multiway AMD. Not once
did I give a thought about the cores. Multi-threading, the kind that
90%+ of programmers want, is old hat. The need to be conscious of the
design of system-wide synchronization is not going to be obviated by
the language.

My suggestion: Write some multi-threading code (if you have not
already) :)

If you have a multi-threading application, feel free to send me a
description (even something vague would be OK), and I will post here a
design description that shows that it can be done elegantly and
robustly in the context of the existing language using synchronization
primitives.

-Le Chaud Lapin-

Chris Vine

unread,
Jan 6, 2007, 12:14:41 PM1/6/07
to
Thomas Richter wrote:

> James Kanze wrote:
>
>> there are "intuitive" mappings for much of the
>> Posix C model to C++, for example, even if they do leave a
>> number of C++ issues unanswered. (Note that on the platform I
>> most frequently program on, Sun CC and g++ use incompatible
>> mappings, which means that a program which is correct with
>> regards to threading for Sun CC may not be with g++, and vice
>> versa.)
>
> Would you mind elaborating a bit on this? The reason is that I'm
> working on multithreaded code that is targeted at both the g++
> and the Sun CC compilers. It would be helpful to know
> some of the quirks.

The OP may be thinking of something else, but the main cases where different
implementations take different approaches seem to be in relation to thread
cancellation. Because Posix only specifies how a C program is to behave,
it does not specify, for example, if and how local objects in scope with
non-trivial destructors are to be dealt with (in particular, whether the
stack is to be unwound). NPTL (rather than g++ itself) takes a rather
idiosyncratic approach to this.

A related issue is how uncaught exceptions are to propagate in
multi-threaded programs (should they propagate to thread termination, or do
what the standard requires which is terminate the entire process, which
implies crossing thread boundaries).

As it happens, on a literal construction Posix imposes unusual obligations
with respect to memory synchronization in paragraph 4.10 of General
Concepts which I do not think any practical implementation would want to
follow, but that is not a C->C++ issue.

Chris

--
To reply by e-mail, remove the "--nospam--"

Gerhard Menzl

unread,
Jan 8, 2007, 11:50:25 AM1/8/07
to
Le Chaud Lapin wrote:

> Gerhard Menzl wrote:
>
>> struct A { A(); };
>>
>> void f()
>> {
>> static A a;
>> }

> I was hoping for this example.

Then I wonder why you didn't bring it up yourself. *g*

> My position:
>
> Given that there is really only one variable for multiple threads, and
> that, if the function is expected to operate against that global
> variable, then multiple initializations is only one problem - there
> remains the problem of mutual exclusion.
>
> Redesign the code, make the variable global (just outside the
> function, if necessary), and put a mutex on it.

Why should I be forced to make a variable global when its scope is local
by nature? I am sorry, but to me this is just a cop-out. I think that
this sort of well-don't-do-that attitude is not enough to make C++ stay
competitive. There has been wide agreement in another thread here
recently (I don't remember which) that there are far too many cases of
undefined behaviour in the language. This, like multithreading in
general, is one of them, so it should be addressed by the standard.

Note that this does not mean that synchronization primitives go into the
core language, as in Java. It simply means that the definition of the
abstract machine acknowledges concurrent paths of execution. None of
your points about treating multiple threads as a high-level design issue
is invalidated by this.


--
Gerhard Menzl

Non-spammers may respond to my email address, which is composed of my
full name, separated by a dot, followed by at, followed by "fwz",
followed by a dot, followed by "aero".


Lourens Veen

unread,
Jan 8, 2007, 12:42:00 PM1/8/07
to
Le Chaud Lapin wrote:

>
> Lourens Veen wrote:
>> int i(42);
>>
>> struct A {
>> A() : a(new int) {}
>> ~A() {
>> delete a;
>> }
>> A(const A &);
>>
>> int * a;
>> };
>>
>> void f() {
>> A a;
>> ++i;
>> }
>>
>> int main() {
>> f();
>> std::cout << i << std::endl;
>> }
>>
>> Would an optimising compiler be allowed to move the increment of i
>> in f() to before the construction of a if that is more efficient?
>> Well, yes, since there is no interaction between the two, so the
>> result would be the same.
>
> I would expect the value that is printed out to be 43.
>
> I do not agree that the compiler writer would be allowed to reorder
> the
> statements so that i is incremented first. For example, there is
> only the declaration of the copy constructor of A above.

I intentionally did not define it, because I didn't want any A objects
to be copied. The idiom is to make the copy constructor private, I
think; I should have done that to make it more obvious. Anyway, what
you see
there is the entire programme. There are no A's being copied at all.
In fact, the A being constructed when entering f() could be optimised
away completely, since it has no visible effects.

> There is no
> guarantee that the definition of the copy constructor does not
> depend
> upon the value of i. So in a world were there were no such thing as
> multi-threading, IMHO, a reordering of the code would be incorrect.

There is no copy constructor being called anywhere, so this is
irrelevant.

>> Now imagine A is a lock class rather than a useless one, and that
>> there's another thread calling f(). And then explain to me how the
>> library people and the OS people are supposed to prevent this from
>> happening.
>
> I do not understand what you mean by a lock class. Do you mean it
> acquires a lock?

I mean a class like the boost::mutex::scoped_lock class. It
encapsulates a lock; when you construct an object it acquires the
lock, and when destructed it releases the lock. It's simply RAII
applied to locking.
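
In sketch form, assuming a Posix mutex underneath (the real boost
class is more general than this, of course):

    #include <pthread.h>

    // RAII applied to locking: acquire in the constructor, release in
    // the destructor, so the lock is released on every path out of the
    // scope, including exceptions.
    class scoped_lock {
    public:
        explicit scoped_lock(pthread_mutex_t & m) : m_(m) {
            pthread_mutex_lock(&m_);
        }
        ~scoped_lock() {
            pthread_mutex_unlock(&m_);
        }
    private:
        pthread_mutex_t & m_;
        scoped_lock(const scoped_lock &);             // non-copyable
        scoped_lock & operator=(const scoped_lock &);
    };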

> If there is a potential for deadlock because some function is
> calling another function that contains an auto object that will
> block on an attempt to call a lock, that is the fault of the
> programmer, not the
> language. The perennial question is, as always, :
>
> "What was the programmer trying to do when he did that?"

He was trying to lock a mutex _before_ writing to a shared variable,
but the compiler, having no concept of threading and no way to see
how the lock would be related to the shared variable, moved the write
to before the locking operation, thus creating an incorrect
programme.



>> > Threading is an OS issue, not a language issue. It is OK to have
>> > expectations of what the OS supports. I do not think it is ok
>> > that the "power of the language" will make all better. In other
>> > words, I would be entirely happy if the OS people provided a full
>> > set of primitives, and the C++ built a portable library on top of
>> > those primitives.
>>
>> But how do you define what the primitives in this library do?
>
> Ah. This is a catch-22. It is difficult to develop an intuitive
> notion of what primitives are necessary without gaining much
> experience
> with multi-threading. But it is difficult to gain much experience
> with multi-threading if one spends too much time trying to debug
> race conditions and deadlocks.

I'm not asking for an intuitive notion. I'm asking what the C++
standard should say about the std::acquire_lock() function (assume
for a moment that we want to have one). It can hardly
say "std::acquire_lock() does what you would intuitively expect it to
do."

Lourens

Le Chaud Lapin

unread,
Jan 8, 2007, 3:29:59 PM1/8/07
to

Gerhard Menzl wrote:
> Le Chaud Lapin wrote:
>
> > Gerhard Menzl wrote:
> >
> >> struct A { A(); };
> >>
> >> void f()
> >> {
> >> static A a;
> >> }
>
> > I was hoping for this example.
>
> Then I wonder why you didn't bring it up yourself. *g*
>
> > My position:
> >
> > Given that there is really only one variable for multiple threads, and
> > that, if the function is expected to operate against that global
> > variable, then multiple initializations is only one problem - there
> > remains the problem of mutual exclusion.
> >
> > Redesign the code, make the variable global (just outside the
> > function, if necessary), and put a mutex on it.
>
> Why should I be forced to make a variable global when its scope is local
> by nature?

Because it is actually global in nature.

> I am sorry, but to me this is just a cop-out. I think that
> this sort of well-don't-do-that attitude is not enough to make C++ stay
> competitive.

Then be prepared to change many things about C++. If a function F(x)
returned the digits of Pi on each call, but also happened to launch a
ballistic missile on each invocation, you could not blame the compiler
for the side effect of superfluous launchings caused by multiple use of
the function name in an expression. There are many things in C++ that
presume that the programmer has context about the way things are.
This is simply one of them, IMO.

>There has been wide agreement in another thread here
> recently (I don't remember which) that there are far too many cases of
> undefined behaviour in the language. This, like multithreading in
> general, is one of them, so it should be addressed by the standard.

Well, as I have said before, when I look at that code, I never had any
illusions that it was not a global variable. The only thing I was
curious about was whether the compiler would initialize the global
variable before call to main() or wait until the first invocation of
the function. So I checked my TCPPPL, and saw that it was on first
invocation, which immediately lead to the question, "How does it know
it's the first time", which mean a global variable, which immediately
leads one to think, "Ok, I guess that works, but in a multi-threaded
program, it is going to be an issue."

But not one that is easily circumvented.

> Note that this does not mean that synchronization primitives go into the
> core language, as in Java. It simply means that the definition of the
> abstract machine acknowledges concurrent paths of execution. None of
> your points about treating multiple threads as a high-level design issue
> is invalidated by this.

Yes, you're right. I definitely support discussing multi-threading
in the context of C++. Perhaps I was wrong to assume that
there are people who feel like the solution is to augment C++ with a
bunch of synchronization-related keywords, or worse, magically
ascertain the intent of the programmer in a complex, multi-threaded
application.

-Le Chaud Lapin-



James Kanze

unread,
Jan 8, 2007, 11:37:53 PM1/8/07
to
Le Chaud Lapin wrote:
> James Kanze wrote:
> > Le Chaud Lapin wrote:
> > You keep claiming this, but you've presented no evidence for it.
> > And of course, the fact that bad programmers can abuse a feature
> > has never stopped it from being adopted into C++.

> There are things that I could argue about all day. This isn't one of
> them. It is patently obvious (to me) the nature of FSM's, atomic
> operations, etc. I fail to see where the mystery lies.

That's because you don't understand threading. It's not a
mystery, but it's not trivial, either.

> What more needs to be known? What is the committee looking

> for?

To start with, a definition of the language which isn't based on
a single threaded model. Sequence points, for example, don't
mean anything in a multithreaded environment.

> And for the record, I never said that threading was not
> to be used because it can be abused. I use multi-threading
> each day, every day, and have been for about 7 years.

> > For what it's worth, I don't know of anyone on the committee who
> > isn't aware of the additional complexities that threading
> > implies. On the other hand, there are many cases where it is
> > the appropriate solution, and some where it is the only
> > solution. Ignoring it effectively either bans C++ from such
> > applications, or forces programmers to depend on implementation
> > definitions of undefined behavior.

> Great. multi-threading is good. I agree.

> > > First, spin-locks, from a C++ programmers point of view, cannot be
> > > implemented efficiently.

> > Who cares? As far as I know, there's no proposal to add
> > spin-locks to the language.

> Then what is there to discuss? What is expected to be found? What is
> the problem that is being addressed?

For starters, define what it means in the context of C++ for a
multi-threaded program to run.

> > > Jeffrey

> > I think we agree that threading is hard, and threading issues
> > (who is responsible for what, etc.) must be addressed at the
> > design level.

> Well, in my case, it is not hard any more. I want to make it clear
> that I *never* expected any magic from C++ to help me in
> multi-threading.

Nobody is expecting magic. On the other hand, you can't very
well have undefined behavior the second you invoke
pthread_create (which is the current status, at least as far as
the standard is concerned).

The standard defines C++ in terms of an execution model, and
that execution model is single threaded.

> I knew that any problems that I would have with
> threading would be due to

> 1. my own ignorance of the primitives available
> 2. deficiency in the primitive set
> 3. carelessness in the application of existing primitives.

> Personally, I have fixed #1, at least enough that I have found closure
> for big system design. I use most of the Windows synchronization
> primitives except for atomic inter-locked exchange and friends and
> fibers, which are not really threads. I also discovered, in my own
> view, that #2 does not exist. The Windows team (old VMS guard) did a
> great job here.

Actually, it's known and recognized that one of the most
important primitives, conditions, isn't present in Windows (or
rather wasn't---it's recently been added).

But you keep returning to "primitives". There's a lot more to
threading than that.

--
James Kanze (Gabi Software) email: james...@gmail.com


Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

unread,
Jan 8, 2007, 11:38:43 PM1/8/07
to
Lourens Veen wrote:
> James Kanze wrote:
> > Le Chaud Lapin wrote:
> >> Jeffrey
> >> Richter writes:

> >> "No programmer in his right mind wants to think about thread
> >> synchronization as he is coding his app. This is because thread
> >> synchronization has little to do with making the program do what
> >> its true intention is."

> > At which point, I stopped reading, because the author obviously
> > doesn't know much about threading and thread safety.

> > I think we agree that threading is hard, and threading issues
> > (who is responsible for what, etc.) must be addressed at the
> > design level.

> Personally, if there was a way to do it, I'd much rather not think
> about thread synchronisation while I'm coding my app. I'd much rather
> say "X needs to be done" and let the compiler figure out how to
> divide that task across the (potentially dozens or hundreds of)
> available cores.

I'm familiar with Fortran compilers which do this. It's more
difficult with C/C++ because of aliasing. One of the major
motivations for adding the restrict keyword to C was to allow
it.

But that's only one use of threading. (One that isn't well
supported by the existing Posix functions, as it happens.) In
my current work, I don't have any large loops or arrays which
would be susceptible to this sort of optimization. On the other
hand, I have a couple of hundred client connections; processing
a client request generally involves a number of blocking
operations, and I need to be able to advance other client
requests while waiting. In an earlier application, a GUI, I
needed to be able to redraw the screen while waiting for the
server to respond. In these cases, it's possible to write the
application without threads, by maintaining a lot of explicit
state, but it's much easier to use threads, and keep the state
implicitly in local variables, on the stack.

--
James Kanze (Gabi Software) email: james...@gmail.com


Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

unread,
Jan 8, 2007, 11:37:18 PM1/8/07
to
Le Chaud Lapin wrote:
> Lourens Veen wrote:
> > int i(42);

> > struct A {
> > A() : a(new int) {}
> > ~A() {
> > delete a;
> > }
> > A(const A &);

> > int * a;
> > };

> > void f() {
> > A a;
> > ++i;
> > }

> > int main() {
> > f();
> > std::cout << i << std::endl;
> > }

> > Would an optimising compiler be allowed to move the increment of i in
> > f() to before the construction of a if that is more efficient? Well,
> > yes, since there is no interaction between the two, so the result
> > would be the same.

> I would expect the value that is printed out to be 43.

And that's almost all that the standard requires. (The presence
of new and delete complicates things, since the user can replace
them, and the replacement could cause different observable
behavior depending on the value of i. But a compiler could very
easily determine whether the replacement functions access i or
not.)

A good compiler might even replace the entire program with the
equivalent of:

    int
    main()
    {
        std::cout << "43\n" ;
        std::cout.flush() ;
    }

And be done with it. (And some compilers are this good.)

> I do not agree that the compiler writer would be allowed to reorder the
> statements so that i is incremented first. For example, there is only
> the declaration of the copy constructor of A above. There is no
> guarantee that the definition of the copy constructor does not depend
> upon the value of i.

The code never calls the copy constructor, so what it does is
irrelevant. If it did call the copy constructor, there must be
a definition of it somewhere, so the compiler can know whether
it affects the value of i or not.

> So in a world where there was no such thing as
> multi-threading, IMHO, a reordering of the code would be incorrect.

You don't seem to understand how good optimization works. The
best optimizers today look at the entire program. (I think that
even VC++ does this now.) The C++ standard defines the
semantics of the entire program, not just individual functions;
given program x, and input y, the observable behavior is z. The
compiler can do anything it wishes, as long as the observable
behavior is z. In fact, there can be several different
legal observable behaviors, and all the compiler has to do is
ensure that the actual observable behavior corresponds to one of
them. And the compiler has the right to assume that the program
contains no undefined behavior.

[...]


> Finally, the really, really big one, the one that ties everything
> together, and makes you feel like you do not have to struggle with
> managing what would otherwise be overwhelming complexity, is none other
> than WaitForMultipleObjects and its equivalents. This is a surprisingly
> useful function that I had originally placed in the
> "Hmmm...interesting...not sure why someone would want that..." category
> long ago when I first saw it.

Curious. I felt it intuitively natural, and implemented it in
my very first OS (back in 1980). On the other hand, it's fairly
easy to simulate: just wait for each event in a separate thread,
and have them feed into a common message queue.

It's actually not unusual for all external events to feed into a
single dispatcher thread, which in turn fans them out to
the different processing threads.
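
(To make that concrete: a minimal sketch of the queue-based
simulation, assuming the std::thread and std::condition_variable
facilities that were standardized later; all names are invented for
illustration. Each event source gets its own thread that blocks on
that source and pushes a tag into one queue; the consumer waits on
the single queue instead of on multiple handles.)

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class EventQueue {
public:
    void push(int tag) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(tag);
        }
        cv_.notify_one();
    }
    int pop() {                        // blocks until an event arrives
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        int tag = q_.front();
        q_.pop();
        return tag;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
};

int main() {
    EventQueue queue;
    std::vector<std::thread> waiters;
    // Two stand-in event sources; real code would block on a socket,
    // a timer, etc. before pushing.
    for (int tag = 0; tag < 2; ++tag) {
        waiters.emplace_back([&queue, tag] { queue.push(tag); });
    }
    for (int received = 0; received < 2; ++received) {
        std::cout << "event from source " << queue.pop() << '\n';
    }
    for (std::thread& t : waiters) {
        t.join();
    }
}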

> The day I discovered what it is really
> used for, I was ready to kiss the feet of the Microsoft engineer who
> wrote it. This function is crucial for large, complex, multi-threaded
> applications!

I'd say that it's far more useful in small applications, things
simple enough not to need central dispatching. But of course,
you implement it once, and then reuse the implementation.
(There are also tricks under Posix using pipes: for every event
in the queue, you write a single byte in a pipe; you can then
wait on multiple pipes using select.)
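
(A minimal sketch of that pipe trick, using only the documented
Posix pipe/read/write/select calls; error handling is omitted, and
the two pipes simply stand in for two event queues.)

#include <sys/select.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int a[2];
    int b[2];
    if (pipe(a) != 0 || pipe(b) != 0) {
        return 1;
    }

    write(a[1], "x", 1);              // one byte per queued event

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(a[0], &readable);
    FD_SET(b[0], &readable);
    int maxfd = (a[0] > b[0] ? a[0] : b[0]) + 1;

    // Blocks until at least one of the queues has an event pending.
    if (select(maxfd, &readable, 0, 0, 0) > 0) {
        char c;
        if (FD_ISSET(a[0], &readable)) {
            read(a[0], &c, 1);
            std::puts("event on queue a");
        }
        if (FD_ISSET(b[0], &readable)) {
            read(b[0], &c, 1);
            std::puts("event on queue b");
        }
    }
}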

> Again, I do not use these primitives in the raw. I have a set of
> wrapper classes, which makes using them more pleasurable than the raw
> Microsoft API.

> Basic primitives that you might need to write large multi-threaded
> applications:

> 1. Events
> 2. Mutexes
> 3. Semaphores
> 4. Waitable Timers (block until a point in time occurs. *not* the same
> as sleeping)
> 5. Critical Sections (spin in user mode, drop to kernel if spin did not
> work)

That's a mutex, not a critical section. A critical section
protects a block of code, not data, and as far as I know, is not
actually implemented in the more modern OS's. (It's not present
in either Windows or Unix, for example.)

> 6. WaitForSingleObject
> 7. WaitForMultipleObjects (importance should not be underestimated,
> IMO)

You definitely want conditions, or something along those lines.

> Things like spin-locks, asynchronous procedure calls, condition
> variables, timer queues, atomic operations, fibers...these can be
> useful in other circumstances...but I would think that a C++ programmer
> who wants to have something relatively complete without too much fuss
> could get by with these, all wrapped of course.

It depends on the application domain. In my own work, all I
need is message queues and mutexes; message queues are
implemented in terms of conditions. For people doing numeric
work, however, where the purpose of multi-threading is to spread
the work out over a large number of CPUs, these are too
heavyweight, and something simpler and lighter is called for.

The problem with defining the user interface is precisely that
threading serves many different purposes, in different
communities.
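
(For concreteness, the sort of wrapper class mentioned above can be
quite thin. A minimal sketch over the Posix mutex follows; the class
names are invented and error handling is omitted.)

#include <pthread.h>

class Mutex {
public:
    Mutex()  { pthread_mutex_init(&m_, 0); }
    ~Mutex() { pthread_mutex_destroy(&m_); }
    void lock()   { pthread_mutex_lock(&m_); }
    void unlock() { pthread_mutex_unlock(&m_); }
private:
    pthread_mutex_t m_;
    Mutex(const Mutex&);             // non-copyable (pre-C++11 idiom)
    Mutex& operator=(const Mutex&);
};

class ScopedLock {
public:
    explicit ScopedLock(Mutex& m) : m_(m) { m_.lock(); }
    ~ScopedLock() { m_.unlock(); }
private:
    Mutex& m_;
    ScopedLock(const ScopedLock&);
    ScopedLock& operator=(const ScopedLock&);
};

Mutex gMutex;
int gCounter = 0;

void bump() {
    ScopedLock guard(gMutex);        // released even if the body throws
    ++gCounter;
}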

--
James Kanze (Gabi Software) email: james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

JohnQ

unread,
Jan 9, 2007, 2:56:29 AM1/9/07
to

"Le Chaud Lapin" <jaibu...@gmail.com> wrote in message
news:1168284389.3...@q40g2000cwq.googlegroups.com...

>> Note that this does not mean that synchronization primitives go into the
>> core language, as in Java. It simply means that the definition of the
>> abstract machine acknowledges concurrent paths of execution. None of
>> your points about treating multiple threads as a high-level design issue
>> is invalidated by this.
>
> Yes, you're right. I definitely support discussing multi-threading
> in the context of C++. Perhaps I was wrong to assume that
> there are people who feel like the solution is to augment C++ with a
> bunch of synchronization-related keywords, or worse, magically
> ascertain the intent of the programmer in a complex, multi-threaded
> application.

IOW, "oh, nevermind". This thread has been enlightening in exposing
some of the low level issues which could possibly affect MT code and
served also to bring awareness to the unsuspecting (me for one). It
leaves me wondering if writing MT code is even to be pursued at this
point in time, or at least "where's the guidebook to read is that will
facilitate the writing of MT code without engaging compiler-level
gotchas?". Can the low level issues be dealt with in the short term
by following certain rules, perhaps platform-specific ones, or is robust
MT code not a possibility at all at this time? Are the low level issues
just ones of portability and all those MT Windows programs developed
over the last few years can be considered probably OK?

John

Gerhard Menzl

unread,
Jan 9, 2007, 9:15:26 PM1/9/07
to
Le Chaud Lapin wrote:

> Gerhard Menzl wrote:
>> Why should I be forced to make a variable global when its scope is
>> local by nature?
>
> Because it is actually global in nature.

No, it isn't. The way the C++ language is specified, static local
variables have local scope, hence the name. If the semantics of my
program are best expressed via function-level visibility and program
lifetime, then a static local is what I am going to use. What the
compiler does behind the scenes is of no interest to me. All I am
interested in is the guarantees I get from the compiler, and if
thread-safety is not among these guarantees, I am hosed, and C++ cannot
be used reliably and portably with multiple threads.

> Then be prepared to change many things about C++. If a function F(x)
> returned the digits of Pi on each call, but also happened to launch a
> ballistic missile on each invocation, you could not blame the compiler
> for the side effect of superfluous launchings caused by multiple use
> of the function name in an expression. There are many things in C++
> that presume that the programmer has context about the way things
> are. This is simply one of them, IMO.

I am sorry, but I don't see what contrived missile launches have
to do with what we are discussing here.

> Well, as I have said before, when I look at that code, I never had any
> illusions that it was not a global variable. The only thing I was
> curious about was whether the compiler would initialize the global
> variable before the call to main() or wait until the first invocation of
> the function. So I checked my TCPPPL, and saw that it was on first
> invocation, which immediately led to the question, "How does it know
> it's the first time", which means a global variable, which immediately
> leads one to think, "Ok, I guess that works, but in a multi-threaded
> program, it is going to be an issue."

I don't care about how the compiler does it, I only care whether it does
it correctly. That's why I am using a compiler, not an assembler.

> But not one that is easily circumvented.

Nobody has claimed that making C++ thread-aware is easy.

--
Gerhard Menzl

Non-spammers may respond to my email address, which is composed of my
full name, separated by a dot, followed by at, followed by "fwz",
followed by a dot, followed by "aero".


James Kanze

unread,
Jan 9, 2007, 7:46:52 PM1/9/07
to
Le Chaud Lapin wrote:

[concerning a local static variable with a non-trivial
initializer...]


> Well, as I have said before, when I look at that code, I never had any
> illusions that it was not a global variable. The only thing I was
> curious about was whether the compiler would initialize the global
> variable before the call to main() or wait until the first invocation of
> the function. So I checked my TCPPPL, and saw that it was on first
> invocation, which immediately led to the question, "How does it know
> it's the first time", which means a global variable, which immediately
> leads one to think, "Ok, I guess that works, but in a multi-threaded
> program, it is going to be an issue."

That depends on the guarantees given by the compiler. IIRC, g++
simply puts the initialization in a pthread_once, and lets the
system take care of ensuring that it is only executed once. (At
least, that is the obvious implementation under Posix, where
you have a system primitive precisely for this sort of thing.)

> But not one that is easily circumvented.

boost::once ? Under Posix, it maps directly to pthread_once,
but they've also implemented it under Windows.
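
(For reference, the underlying Posix idiom looks roughly like this;
a minimal sketch in which the names widget and init_widget are
invented and error handling is omitted.)

#include <pthread.h>
#include <cstdio>

namespace {
    pthread_once_t widget_once = PTHREAD_ONCE_INIT;
    int* widget_instance = 0;

    void init_widget() {                // runs exactly once, even if
        widget_instance = new int(42);  // several threads race to call it
    }
}

int& widget() {
    // Every caller funnels through pthread_once; the first call runs
    // init_widget, later callers wait until it has completed.
    pthread_once(&widget_once, init_widget);
    return *widget_instance;
}

int main() {
    std::printf("%d\n", widget());
}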

> > Note that this does not mean that synchronization primitives go into the
> > core language, as in Java. It simply means that the definition of the
> > abstract machine acknowledges concurrent paths of execution. None of
> > your points about treating multiple threads as a high-level design issue
> > is invalidated by this.

> Yes, you're right. I definitely support discussing multi-threading
> in the context of C++. Perhaps I was wrong to assume that
> there are people who feel like the solution is to augment C++ with a
> bunch of synchronization-related keywords, or worse, magically
> ascertain the intent of the programmer in a complex, multi-threaded
> application.

Finally. (There probably are such people, but they're not
involved in the C++ standardization effort.)

--
James Kanze (GABI Software) email:james...@gmail.com


Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

unread,
Jan 9, 2007, 7:45:45 PM1/9/07
to
Le Chaud Lapin wrote:
> James Kanze wrote:
> > Le Chaud Lapin wrote:
> > > Lourens Veen wrote:
> > > > The problem is not whether it will work or not, the problem is that
> > > > there is currently no way of writing a (guaranteed) portable C++
> > > > programme that uses threads. Currently, anything to do with threads
> > > > is undefined behaviour.

> > > And IMO, that is the fault of the OS people and library writers, not
> > > the language proper.

> > If the language says its undefined, no one else can make it
> > portably defined.

> > Let's take two simple cases from real life:

> > G++ (like everyone else, I think) uses objects with static
> > lifetime in its exception handling. Before 3.0, it used them in
> > such a way that if two threads raised an exception at the same
> > time, the exception object was corrupted. Please explain how
> > this is the fault of the OS people and library writers.

> The person who wrote that part of the G++ library probably expected it
> to be used in a single-thread application.

Exactly. (And it's not part of the library, at least not
formally. Exceptions are part of the language.)

That's what I've been saying all along. You need some
guarantees from the compiler.

> If it had been intended to
> be used in a multi-threaded application, the library writer should have
> written it for multi-threaded applications. If he did not write a
> multi-threaded version, then that is the fault of the library writer.

There's no "fault". The standard doesn't recognize multiple
threads. The compiler implemented the standard.

> If I write a program that contains 10 threads, and they all invoke
> operator new() simultaneously, and cause the heap to become corrupted,
> that will be either my fault, or the library writer's fault.

In the absence of a contract, it's your fault. And the only
contracts today are from the compiler vendors, and they vary
from one compiler to the next.

[...]


> > But adding such statements would be a change. In fact, analysis
> > has shown that the sequence point model doesn't adapt at all to
> > a multithreaded context. So a major part of the definition of
> > expression semantics, and what is or is not conforming, will
> > change.

> Statements or library functions?

Statements, but does it matter? The only thing that matters, in
the end, is that it is code that you didn't write directly, but
that is provided for you in one way or the other by the compiler
(or a third party vendor---the issues are the same, but third
party software doesn't normally come under the scope of the
language specification).

> Note that it is entirely conceivable
> that, at the end of the day, the code that uses "standard" C++
> multi-threading support by way of libraries does not have much visual
> distinction from that which is supported by way of added keywords. The
> thesis of what I am saying is that it is my opinion, based on my own
> experience, that changes to the language are not necessary, any more
> than it is necessary to change the language to grab random numbers from
> the architecture, for example.

Well, you certainly need guarantees from somewhere. We've
pointed out no end of reasons here. And from where else, if not
from the language standard? Leaving it to the implementors
means every compiler gives a different set of guarantees, and
even the simplest portable code is chancy.

[...]


> > and typically, you probably can't, because some
> > things will require privileged mode. After that, the compiler
> > has to ensure that these primitives are actually available to
> > you, and that it doesn't automatically do anything forbidden in
> > the generated code. Intel processors have very few registers,
> > and so often have to spill in complex expressions. If the
> > compiler spills to static memory, it doesn't matter what the
> > processor and the OS offers. And of course, having a *standard*
> > interface, defined in the standard library, certainly helps
> > portability.

> ???? Spill to static memory?

Compilers have been known to do it.

> What about the stack?

On a modern IA-32 processor, that's the obvious place. On other
processors, it varies; it's the obvious place on most modern
architectures, but accessing stack based memory can be very
expensive on some older architectures.

My point is simply that you need a guarantee that the generated
code is thread-safe if you are going to use the compiler for a
multithreaded program. You have been denying that the compiler
has anything to do with it, but it does. At the lowest level,
it is the compiler which writes the machine code, not you, and
it is the code at the machine code level which ultimately
determines whether your function is thread-safe or not.

> > None of which, of course, means that the programmer can ignore
> > the issues, and not pay attention to threads. All it means is
> > that when I write a program, I only have to analyse its thread
> > safety once, against the standard guarantees, rather than for
> > each platform, against the guarantees given by that platform.

> I do sympathize with the "mood" that you seek. I am 100% in agreement
> with having the programmer feel good while using the primitives for
> multi-threading. Again, I think that "feel good" can be achieved
> entirely using libraries without perturbations to the language itself.

It has nothing to do with "feel good". In order to write thread
safe code, you need certain guarantees from the compiler. I and
others here continue to show examples as to why, and you
continue to ignore them. Unless you have a guarantee that the
machine code generated by the compiler is thread safe, you
cannot safely use the compiler for multithreaded applications.
And unless this guarantee is present in some standard document,
you cannot use it portably: different vendors give different
guarantees. Today.

[...]


> The last complaint is particularly important. It gives the impression
> that, if the problem is being complained about, then those who are
> doing the complaining do not know what the solution is.

There are two aspects to the solution. The experts all know
that we need to add support to the language for threading. If
not, we remain in the current situation, where portable
multi-threaded code is not possible (and I'm not talking only
about the names given to the primitives). That's the
"solution". There is a lack of consensus with regards to
exactly what should be guaranteed, at least with regards to
certain details. (I think that there is consensus on the
fundamental points, e.g. that concurrent access to an object is
undefined behavior if any thread attempts to modify the object,
but legal and defined otherwise, and that with a few exceptions
(e.g. bit fields), accessing different objects may take place
concurrently, regardless of the type of access. And that the
only accesses you have to worry about are those in your
code---if a compiler spills to static memory, it has to
generate whatever synchronization is necessary. I don't
remember hearing the issue discussed, but I hope that there is
also consensus that you can throw concurrently from several
threads. I don't think that there's any consensus yet with
regards to the construction of local static variables,
however---at present, g++ allows it, Sun CC (and I think VC++)
doesn't. And there is definitely discussion concerning the
meaning of volatile, and the implications of causality.)

> Otherwise,
> they would not be complaining.

People are complaining today because it is impossible in C++
today to write portable, correct code for a multithreaded
environment.

> And if they do not know the solution,
> they are not yet aware that these problems can be easily fixed using
> existing primitives (libraries) that are entirely orthogonal to the
> language itself (as is my opinion and that of others).

And that is simply false. You keep repeating yourself, and
ignoring the many examples others have posted indicating why it
isn't sufficient.

> And if they
> complain in the context of c++.moderated, there is the impression that,
> not only are libraries not enough, something more must be done with
> C++.

Which is a proven fact.

> But if the libraries are not enough, the only thing left to be
> done with C++ is to change the language itself.

Correct.

Note that changing the language does NOT necessarily mean
introducing new keywords. It does mean, for example, redefining
the concept of sequence points so that it has some meaning in a
multi-threaded context. And ensuring that things like spilling
to static memory, or using unsynchronized modifications to
static memory in exception handling, result in some way in a
non-conforming implementation. And either explicitly requiring
the user to externally synchronize when calling a function with
a local static variable with non-trivial initialization, or
explicitly saying that such a function can be called
concurrently. And adding text which forbids some implicit
parallelization: off hand, for example, I think that the
standard will have to guarantee that the initialization of
static objects (at namespace scope) does not take place
concurrently. (To be frank, I'm not even sure we have that
guarantee today.) And no doubt a lot of other things I haven't
mentioned.

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

carton

unread,
Jan 10, 2007, 3:14:39 PM1/10/07
to


On Jan 6, 12:47 pm, "Le Chaud Lapin" <jaibudu...@gmail.com> wrote:

> But as I mentioned in another post,
> I think if such keywords were "invented", there would be an extremely
> strong parallel between the use of those keywords and the use of
> library calls that do the same thing.

I think you're misunderstanding the reasons why people want to update
the standard.

Let's consider for a moment a single thread of execution in a multi-
threaded environment. If access to shared memory isn't properly
synchronized, we know that the behavior (in practice) can be undefined;
according to the C++ standard, however, the behavior may be perfectly
well defined, because the standard doesn't recognize the possibility
that another thread might be modifying the memory.

The point of updating the standard is to 'fix' it so that it recognizes
a multi-threaded environment and gives us useful rules about what is
defined behavior and what is undefined. In order to do that we need
to add the concept of threads and atomic operations to the language
and give the compiler rules about how they must behave and what
operations can be re-ordered, etc.
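
(A minimal sketch of the kind of guarantee being asked for, using
the std::atomic and std::thread facilities that were eventually
standardized and did not exist when this was written: the language,
not the OS, defines what the second thread may observe.)

#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;
std::atomic<bool> ready(false);

void producer() {
    payload = 42;                                  // ordinary write
    ready.store(true, std::memory_order_release);  // publish it
}

void consumer() {
    while (!ready.load(std::memory_order_acquire)) {
        // spin until published (sketch only; a real program would wait)
    }
    assert(payload == 42);  // visible thanks to the release/acquire pair
}

int main() {
    std::thread t1(producer);
    std::thread t2(consumer);
    t1.join();
    t2.join();
}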

Inventing keywords might be required to do this, but that's not the
point.

Chris Carton
