
Threading in new C++ standard

13 views

Dann Corbit

Apr 15, 2008, 6:28:43 PM
Rather than create a new way of doing things:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
why not just pick up ACE into the existing standard:
http://www.cse.wustl.edu/~schmidt/ACE.html
the same way that the STL (and subsequently BOOST) have been subsumed?
Since it already runs on zillions of platforms, they have obviously worked
most of the kinks out of the generalized threading and processes idea (along
with many other useful abstractions).

Even more interesting than generalized threading would be generalized
software transactions. The Intel compiler has an experimental version that
does this:
http://softwarecommunity.intel.com/articles/eng/1460.htm

As we scale to larger and larger numbers of CPUs, the software transaction
model is the one that gains traction. This document is very illuminating in
that regard:
http://internap.dl.sourceforge.net/sourceforge/libltx/spaa05_submitted.pdf


** Posted from http://www.teranews.com **

Thomas J. Gritzan

Apr 15, 2008, 7:12:36 PM
Dann Corbit wrote:
> Rather than create a new way of doing things:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> why not just pick up ACE into the existing standard:
> http://www.cse.wustl.edu/~schmidt/ACE.html
> the same way that the STL (and subsequently BOOST) have been subsumed?
> Since it already runs on zillions of platforms, they have obviously worked
> most of the kinks out of the generalized threading and processes idea (along
> with many other useful abstractions).

It's not a new way. It's from Boost.

Click on the link for n2497, scroll down and read:

"Acknowledgments
The overall design of this threading library is based on William Kempf's
Boost.Thread Library, as refined by literally hundreds of other Boost users
and contributors."

--
Thomas
http://www.netmeister.org/news/learn2quote.html
"Some folks are wise, and some otherwise."

Sam

Apr 15, 2008, 7:34:42 PM
Dann Corbit writes:

> Rather than create a new way of doing things:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> why not just pick up ACE into the existing standard:
> http://www.cse.wustl.edu/~schmidt/ACE.html

Perhaps because not everyone shares your enthusiasm for ACE?

> the same way that the STL (and subsequently BOOST) have been subsumed?

Boost is not universally liked either. As the saying goes: just because
everyone decides to jump off a cliff, should you follow?


James Kanze

Apr 16, 2008, 3:42:38 AM
On Apr 16, 12:28 am, "Dann Corbit" <dcor...@connx.com> wrote:
> Rather than create a new way of doing things:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> why not just pick up ACE into the existing standard:
> http://www.cse.wustl.edu/~schmidt/ACE.html
> the same way that the STL (and subsequently BOOST) have been subsumed?

There are serious problems with the model ACE uses.

> Since it already runs on zillions of platforms, they have
> obviously worked most of the kinks out of the generalized
> threading and processes idea (along with many other useful
> abstractions).

It's not actually used on "zillions of platforms"---as far as I
know, it supports Windows and Unix, and that's about it.

ACE is somewhat dated, and uses what was the classical model for
thread objects at the time it was developed. The general
consensus today is that this model isn't really that
appropriate, at least for a language with value semantics like
C++.

The second aspect is that ACE (and Boost, which represents a
more "state of the art" model) only addresses one of the reasons
threading is desired, and doesn't effectively support uses like
parallelization, which are becoming important with the diffusion
of multicore machines.

> Even more interesting than generalized threading would be
> generalized software transactions. The Intel compiler has an
> experimental version that does
> this: http://softwarecommunity.intel.com/articles/eng/1460.htm

I think we're still largely at the beginning when it comes to
knowing the best idioms to support threading. Rather than
standardize something experimental, which later turns out to be
far less than ideal, I think we should only standardize an
essential minimum, on which people can build and experiment.

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

Apr 16, 2008, 3:45:36 AM
On Apr 16, 1:12 am, "Thomas J. Gritzan" <Phygon_ANTIS...@gmx.de>
wrote:

> Dann Corbit wrote:
> > Rather than create a new way of doing things:
> >http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> > why not just pick up ACE into the existing standard:
> >http://www.cse.wustl.edu/~schmidt/ACE.html
> > the same way that the STL (and subsequently BOOST) have been subsumed?
> > Since it already runs on zillions of platforms, they have obviously worked
> > most of the kinks out of the generalized threading and processes idea (along
> > with many other useful abstractions).

> It's not a new way. It's from Boost.

> Click on the Link for n2497, scroll down and read:

> "Acknowledgments
> The overall design of this threading library is based on
> William Kempf's Boost.Thread Library, as refined by literally
> hundreds of other Boost users and contributors."

Boost only represents a small part of threading in the standard.
Significant parts of the interface will doubtlessly profit from
the experience we've gotten with Boost; other parts are designed
to address lower level threading issues. And of course, Boost
more or less ignored the language issues (you get whatever the
platform gives), which are the most important in a standard.

Douglas C. Schmidt

Apr 16, 2008, 8:39:57 AM
Hi James,

>There are serious problems with the model ACE uses.

ACE supports several threading models. Can you please elaborate on
these problems?

>> Since it already runs on zillions of platforms, they have
>> obviously worked most of the kinks out of the generalized
>> threading and processes idea (along with many other useful
>> abstractions).
>
>It's not actually used on "zillions of platforms"---as far as I
>know, it supports Windows and Unix, and that's about it.

ACE has been ported to many more platforms than Windows and Unix,
though not surprisingly those are the most common platforms on which
it's used. Please see

http://www.dre.vanderbilt.edu/~schmidt/DOC_ROOT/ACE/ACE-INSTALL.html#platforms

for more info on platforms that ACE has been ported to and the
status of these ports. The status of the builds on the most commonly
used platforms is available in real-time from

http://www.dre.vanderbilt.edu/scoreboard/integrated.html

Thanks,

Doug
--
Dr. Douglas C. Schmidt Professor and Associate Chair
Electrical Engineering and Computer Science TEL: (615) 343-8197
Vanderbilt University WEB: www.dre.vanderbilt.edu/~schmidt
Nashville, TN 37203 NET: d.sc...@vanderbilt.edu

Ke Jin

Apr 16, 2008, 12:21:47 PM
On Apr 15, 3:28 pm, "Dann Corbit" <dcor...@connx.com> wrote:
> Rather than create a new way of doing things:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> why not just pick up ACE into the existing standard:
> http://www.cse.wustl.edu/~schmidt/ACE.html
> the same way that the STL (and subsequently BOOST) have been subsumed?
> Since it already runs on zillions of platforms, they have obviously worked
> most of the kinks out of the generalized threading and processes idea (along
> with many other useful abstractions).
>

First, the ACE thread support is a library-based solution. There are
many discussions of why the C++0x threading facility chose a new
language-based approach instead (see
<http://www.artima.com/forums/flat.jsp?forum=226&thread=180936> and
Boehm's article "Threads Cannot Be Implemented as a Library").

Second, the ACE folks tried to sell it to standards bodies a few years
ago. However, instead of going to a language or platform standards
organization (such as the relevant committees in ANSI, ISO, IEEE, and
the Open Group), they went to the OMG (see the discussion in this thread:
http://groups.google.com/group/comp.object.corba/browse_thread/thread/c12b3a5630b1882a).

Ke

>
> Even more interesting than generalized threading would be generalized
> software transactions. The Intel compiler has an experimental version that
> does this: http://softwarecommunity.intel.com/articles/eng/1460.htm
>
> As we scale to larger and larger numbers of CPUs, the software transaction
> model is the one that gains traction. This document is very illuminating in

> that regard: http://internap.dl.sourceforge.net/sourceforge/libltx/spaa05_submitte...
>
> ** Posted from http://www.teranews.com **

Ke Jin

Apr 16, 2008, 12:58:26 PM
On Apr 16, 5:39 am, schm...@tango.dre.vanderbilt.edu (Douglas C.
Schmidt) wrote:
> Hi James,
>
> >There are serious problems with the model ACE uses.
>
> ACE supports several threading models. Can you please elaborate on
> these problems?
>

See http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf

Ke

Szabolcs Ferenczi

Apr 16, 2008, 8:35:50 PM
On Apr 16, 12:28 am, "Dann Corbit" <dcor...@connx.com> wrote:
> Rather than create a new way of doing things:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> ...

Are there some examples available in this new notation?

Can someone demonstrate here some canonical examples in this newly
proposed notation?

- Producer-consumer problem (bounded buffer)
- Readers and writers problem
- Dining philosophers problem
- Sleeping barber problem
- Santa Claus problem (http://stwww.weizmann.ac.il/g-cs/benari/articles/santa-claus-problem.pdf)

It would be interesting to see how the proposal works out.

Best Regards,
Szabolcs

Szabolcs Ferenczi

Apr 17, 2008, 12:40:42 PM
On Apr 16, 12:28 am, "Dann Corbit" <dcor...@connx.com> wrote:
> Rather than create a new way of doing things:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> ...

I like that some higher-level construct is also there at last (see
condition_variable):

<quote>
template <class Predicate>
void wait(unique_lock<mutex>& lock, Predicate pred);

Effects:
As if:
    while (!pred())
        wait(lock);
</quote>

Now programming the get operation of a bounded buffer would look
something like this, I guess (note that wait() takes a callable
predicate, so the condition has to be wrapped in a lambda):

queue<int> BB::q;
mutex BB::m;
condition_variable BB::empty;
condition_variable BB::full;

int BB::get()
{
    unique_lock<mutex> lock(m);
    full.wait(lock, [this]{ return !q.empty(); });
    int b = q.front();
    q.pop();
    empty.notify_one();
    return b;
}

Although there is still some duplication in `full.wait(lock, [this]{
return !q.empty(); });', it is much closer to this simple pseudo code
(where `when' is a keyword for a conditional critical region
statement) than a pthread version:

shared queue BB::q;

int BB::get()
{
    int b;
    when (!q.empty()) {
        b = q.get();
    }
    return b;
}

Best Regards,
Szabolcs

Douglas C. Schmidt

Apr 17, 2008, 3:25:30 PM
Hi James,

Thanks for providing some additional info.

> Or no model; like most other libraries, it's just a wrapper for
> whatever the implementation gives you.

That's not entirely the case; e.g., the ACE_Thread_Manager is more
than just a wrapper over the underlying implementation. Naturally, if
the OS doesn't support threads at all, it's hard for a library-based
implementation to do much about this.

> Seriously, what I was referring to was the fact that you have (or
> had, at least) a Thread class, from which the user had to derive.

I assume you're referring to ACE_Task here? It implements an Active
Object model that's similar to Java's Thread class + a synchronized
message queue.

> This was, of course, the accepted practice when ACE first appeared,
> but we've found better solutions since then.

Can you please point to some descriptions of these solutions?

> There's also the problem, IIRC, that there is no real distinction
> between detached threads and joinable threads.

It's not clear what you mean by this - can you please explain? The
ACE_Thread_Manager implements both detached and joinable threads atop
all the underlying operating systems, not just POSIX pthreads.

Dann Corbit

Apr 17, 2008, 4:10:55 PM
"Chris Thomasson" <cri...@comcast.net> wrote in message
news:ELSdnc_64ahEypvV...@comcast.com...
> "Dann Corbit" <dco...@connx.com> wrote in message
> news:52e98$48052c1c$22...@news.teranews.com...

>> Rather than create a new way of doing things:
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
>> why not just pick up ACE into the existing standard:
>> http://www.cse.wustl.edu/~schmidt/ACE.html
>> the same way that the STL (and subsequently BOOST) have been subsumed?
>> Since it already runs on zillions of platforms, they have obviously
>> worked
>> most of the kinks out of the generalized threading and processes idea
>> (along
>> with many other useful abstractions).
>>
>> Even more interesting than generalized threading would be generalized
>> software transactions. The Intel compiler has an experimental version
>> that
>> does this:
>> http://softwarecommunity.intel.com/articles/eng/1460.htm
>
> Check this out:
>
> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/e5e27f5e0570910d
>
>
> Here is how to completely live-lock their STM:
>
> http://groups.google.com/group/comp.programming.threads/msg/e1a8ed18e97130b2
>
>
> It's nowhere near ready for prime time. STM has issues with scalability
> and live-lock.

>
>> As we scale to larger and larger numbers of CPUs, the software
>> transaction
>> model is the one that gains traction. This document is very illuminating
>> in
>> that regard:
>> http://internap.dl.sourceforge.net/sourceforge/libltx/spaa05_submitted.pdf
>
> This is _very_ misleading to say the least. They only talk about live-lock
> in terms of transactions getting invalidated by other low-priority
> transactions. This is definitely not the only way to live-lock a STM.
>
> Here is just one of many ways:
>
> http://groups.google.com/group/comp.programming.threads/msg/3a356c832812213b
>
> STM have a whole lot of scaling issues. Here are some of them:
>
> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/8bef56a090c052d9
>
>
> Its no silver-bullet indeed. I would go with distributed message-passing
> over STM any day.

I was not aware of the shortfalls of the STM approach, having only read
glowing reviews. Thanks for the pointers. I was going to download and try
the Intel compiler (but was foiled by the Java interface) so I think I will
halt efforts in that direction until things shake out more thoroughly.

Chris Thomasson

Apr 17, 2008, 5:31:05 PM
"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:2427d471-dad0-4b26...@d1g2000hsg.googlegroups.com...

On Apr 16, 12:28 am, "Dann Corbit" <dcor...@connx.com> wrote:
> Rather than create a new way of doing things:
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> ...

> I like that some higher level construction is also there at last (at
> condition_variable):

[...]

Indeed. That's a major plus for me. The really cool thing, for me at least,
is that the next C++ will allow me to create a 100% standard implementation
of my AppCore library <http://appcore.home.comcast.net>, which currently uses
POSIX Threads and x86 assembly. The fact that C++ will have condition
variables, mutexes, atomic operations and fine-grain memory barriers means a
lot to me. Finally, there will be such a thing as 100% portable non-blocking
algorithms. I think that's so neat.

Chris Thomasson

Apr 17, 2008, 5:40:49 PM
"Dann Corbit" <dco...@connx.com> wrote in message
news:57136$4807aed5$29...@news.teranews.com...

> "Chris Thomasson" <cri...@comcast.net> wrote in message
> news:ELSdnc_64ahEypvV...@comcast.com...
[...]

>> STM have a whole lot of scaling issues. Here are some of them:
[...]

>>
>> Its no silver-bullet indeed. I would go with distributed message-passing
>> over STM any day.
>
> I was not aware of the shortfalls of the STM approach, having only read
> glowing reviews. Thanks for the pointers.

No problem. Unfortunately, 99% of the STM research papers seem to
conveniently leave out a list of caveats. IMHO, STM is a good tool to have
around, but it's not the "silver bullet" that most of the papers hint at. You
can make some good use out of an STM, but you have to be really careful, and
know exactly what you're doing inside those atomic blocks. If you make poor
decisions in those transactional atomic blocks, you're going to get burned by
scaling issues and sporadic periods of live-lock during periods of high, or
even moderate, load on the system.


> I was going to download and try the Intel compiler (but was foiled by the
> Java interface) so I think I will halt efforts in that direction until
> things shake out more thoroughly.

Yeah. I would lurk around their forum and study some of the discussions
going on in there. One post in particular caught my eye:

http://softwarecommunity.intel.com/isn/Community/en-US/forums/thread/30248420.aspx


Here is the exact response that Haris Volos got from Jim Cownie who works in
the lab where this experimental development is being conducted:


"What semantics are you expecting from transactions?
If you consider them to have the same semantics as a single global lock,
then this code clearly deadlocks, so it is no big surprise that it can also
deadlock with TM..."


He is telling the truth indeed.

Szabolcs Ferenczi

Apr 17, 2008, 6:49:05 PM
On Apr 16, 6:21 pm, Ke Jin <kjin...@gmail.com> wrote:

> First, ACE thread is a library based solution. There are many
> discussions on why C++0x thread choose a new language based approach
> instead
> (See <http://www.artima.com/forums/flat.jsp?forum=226&thread=180936>
> and Boehm's article of "threads cannot be
> implemented as a library").

Is not "Multi-threading Library for Standard C++" just another
library?

I cannot see the language features for concurrency but only the
library ones. For instance, you can instantiate a thread from a
library class. There is no language element for defining a thread, is
there?

There is no keyword in C++0x to mark shared data. Consequently, the
compiler will not be able to check whether shared data is accessed
properly inside a critical region or not.

The low-level library means such as mutex and condition variable are
there, but there is no higher-level language means to define a
critical region, conditional critical region, or monitor (see the
`shared class' as it was originally proposed). I mean, at the language
level you have nothing new compared to the library level. At least I
cannot see anything like that in the attached "Multi-threading Library
for Standard C++ (Revision 1)":
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html#thread.threads

Perhaps someone can help me out.

Best Regards,
Szabolcs

gpderetta

Apr 17, 2008, 7:29:13 PM
On Apr 18, 12:49 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 16, 6:21 pm, Ke Jin <kjin...@gmail.com> wrote:
>
> > First, ACE thread is a library based solution. There are many
> > discussions on why C++0x thread choose a new language based approach
> > instead
> > (See <http://www.artima.com/forums/flat.jsp?forum=226&thread=180936>
> > and Boehm's article of "threads cannot be
> > implemented as a library").
>
> Is not "Multi-threading Library for Standard C++" just another
> library?

Did you read Boehm's article?

>
> I cannot see the language features for concurrency but only the
> library ones.

The biggest advancement that C++0x will add to C++ in the area of
threading (other than acknowledging that it exists, of course) is the
introduction of a (quite advanced, BTW) memory model, which is a
*pure* language feature.
Also it introduces atomic operations, which must be implemented at
the language level.

Everything else is pretty much just standardizing the status quo.

> For instance, you can instantiate a thread from a
> library class. There is no language element for defining a thread, is
> there?

So what?

>
> There is no keyword in C++0x to mark the shared data.

No need for that if you use locks. For small atomic POD objects there
is the atomic<> template class which uses the appropriate magic to
make access atomic without locks.

> Consequently the
> compiler will not be able to check whether the shared data is accessed
> properly inside critical region or not.
>

The memory model prohibits any optimizations (very few actually) that
could break properly locked (or fenced) code.

> The low level library means such as mutex and condition variable are
> there but there is no higher level language means to define a critical
> region

scoped_lock? Why do you need a language-level means when a library
approach works just fine and is more in line with the 'C++ Way'?

> or conditional critical region or monitor (see the `shared
> class' as it was originally proposed). I mean at the language level
> you have nothing new compared to the library level. At least I cannot
> see anything like that in the attached "Multi-threading Library for

> Standard C++ (Revision 1)"http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html#th...
>


Higher level abstractions can (and will) be built on top of the low
level infrastructure as pure library extensions. The standards
committee mostly standardizes what already exists and tries to
introduce as few (untested) innovations as possible.

--
Giovanni P. Deretta

James Kanze

Apr 18, 2008, 4:52:54 AM
On Apr 17, 9:25 pm, schm...@tango.dre.vanderbilt.edu (Douglas C.
Schmidt) wrote:
> > Or no model; like most other libraries, it's just a wrapper for
> > whatever the implementation gives you.

> That's not entirely the case, e.g., the ACE_Thread_Manager is
> more than just a wrapper over the underlying implementation.
> Naturally, if the OS doesn't support threads at all it's hard
> for a library-based implementation to do much about this..

What I meant is that you don't really define when external
synchronization is needed, and when it isn't. Under Posix,
external synchronization is needed when Posix says it's needed,
and under Windows, when Windows says it's needed.

This is the aspect which really has to be treated at the
language level: what does a "sequence point" mean in a
multithreaded environment? There are also library aspects: when
do I need external synchronization when using library components
in two different threads.

> > Seriously, what I was referring to was the fact that you
> > have (or had, at least) a Thread class, from which the user
> > had to derive.

> I assume you're referring to ACE_Task here? It implements an
> Active Object model that's similar to Java's Thread class + a
> synchronized message queue.

And Java's Thread class is an example of a poorly designed
interface for threading. At least today; it was pretty much
standard practice when it appeared, but we've learned a lot
since then. And are still learning: I'm not sure that even
today, threading technology is mature enough to be technically
ready for standardization. (But politically, we don't have a
choice. If the next version of the C++ standard doesn't address
threading, the language is probably dead.)

> > This was, of course, the accepted practice when ACE first
> > appeared, but we've found better solutions since then.

> Can you please point to some descriptions of these solutions?

Boost offers parts of them. The whole concept of functional
objects changes the situation considerably.

IMHO, too, what I've learned from practical experience is that
joinable threads and detached threads are two very different
beasts. For detached threads, the best solution I've found so
far is a template function, something more or less like:

template< typename Function >
void
startDetachedThread( Function f )
{
    boost::thread( f ) ;
}

Fire and forget, in sum, with copy semantics handling all of the
memory management issues.

For joinable threads, you still want the thread class to take
the functional object as an argument (so that you're ensured
that it fully constructed before the thread starts). In Java,
for example, deriving from java.lang.Thread is a common source
of errors (which don't occur if you pass a Runnable to the
constructor of Thread). In C++, the issue is further
complicated by memory management issues, since there's no
garbage collection. (I don't think that they're necessarily
difficult issues, and I suspect that there are several
acceptable solutions. But they do have to be addressed.)

I'm not really too sure how ACE handles this; it's been a long
time since I last looked at it (and it may have evolved since
then).

I'm also not too sure how (or if) ACE handles propagation of an
exception across a join. It's important that the standard
offer some sort of support for this as well. Although I don't
think it should be the default, and it doesn't bother me if it
requires some programming effort to get, it is currently
impossible to implement in Boost, and I suspect in any other
threading system (since it also requires some language support,
which isn't there at present).

> > There's also the problem, IIRC, that there is no real
> > distinction between detached threads and joinable threads.

> It's not clear what you mean by this - can you please explain?
> The ACE_Thread_Manager implements both detached and joinable
> threads atop all the underlying operating systems, not just
> POSIX pthreads.

But do you even need a thread manager for detached threads?

Note that I'm talking here about the lowest level; the one which
must be in the standard. There's definitely place for more
elaborate (and less general) models at a higher level.

With regards to standardization, what ACE has going for it, of
course, is the fact that it probably has more users than all
other systems combined, and we have a lot of concrete experience
with it. What it has going against it is its age (it doesn't
take into account the changes in programming style of recent
years), and the fact that no one on the committee is
championing it. (The Boost group are very active in
standardization. You and others working on ACE don't seem to be
so.)

What all existing libraries have going against them (IMHO) is
that the "existing practice" with regards to threads is that
most multi-threaded programs don't actually work, or won't if
the active thread changes at the wrong moment, or if they're
ported to a multi-processor system, or if any one of a number of
other things occur. From a technical point of view, this means
that it's probably too early to standardize threading this go
round. From a political point of view, however, I don't think
we have a choice.

James Kanze

Apr 18, 2008, 4:59:52 AM
On Apr 17, 11:31 pm, "Chris Thomasson" <cris...@comcast.net> wrote:
> "Szabolcs Ferenczi" <szabolcs.feren...@gmail.com> wrote in message

> [...]

That was, I believe, a do or die requirement from some of the
participants.

I haven't mentioned it much here because it's not really
relevant to the work I do, and I'm not too familiar with it.
But it is definitely one reason why just adopting an existing
library isn't sufficient. Threading is used for different
purposes, at different levels, and someone who's interested in
parallelizing array operations on a multi-core system will have
different requirements than someone who just needs a worker
thread in a GUI.

Szabolcs Ferenczi

Apr 18, 2008, 6:23:21 PM
> On Apr 18, 12:49 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
>
> > On Apr 16, 6:21 pm, Ke Jin <kjin...@gmail.com> wrote:
>
> > > First, ACE thread is a library based solution. There are many
> > > discussions on why C++0x thread choose a new language based approach
> > > instead
> > > (See <http://www.artima.com/forums/flat.jsp?forum=226&thread=180936>
> > > and Boehm's article of "threads cannot be
> > > implemented as a library").
>
> > Is not "Multi-threading Library for Standard C++" just another
> > library?
>
> Did you read Bohem's article?

Sure I did. What else?

> > I cannot see the language features for concurrency but only the
> > library ones.
>
> The biggest advancement that C++0x will add to C++ in the are of
> threading (other than acknowledge that it exists of course)

Whether C++0x acknowledges threading or not, that does not affect
threading. Do you seriously think that threading will die out if C++
does not deal with it?

> is the
> introduction of a (quite advanced, BTW) memory model, which is a
> *pure* language feature.

Is that all? The argument went that other solutions are not good
because they are library based whereas the C++0x provides solution at
the language level. What are the language level elements then?

A memory model "is a *pure* language feature?" Sure? By the way: do
you know any programming languages which are designed for concurrency
at the language level rather than at the library level? Do they have
anything to do with any "memory model"? (Hint: try to think of
something other than Java.)

> Also it introduces atomic operations which must be implemented at
> the language level.

They are just part of yet another *library* in C++0x. So how do you
connect them to the language level?

> Everything else is pretty much just standardizing pretty the status
> quo.

The status quo of what? Pthreads library? Boost library? ACE library?
or what?---since C++ has currently nothing to do with concurrent
programming.

> > For instance, you can instantiate a thread from a
> > library class. There is no language element for defining a thread, is
> > there?
>
> So what?

So, it is a *library* issue and not any language issue. Just that.

> > There is no keyword in C++0x to mark the shared data.
>
> No need for that if you use locks. For small atomic POD objects there
> is the atomic<> template class which uses the appropriate magic to
> make access atomic without locks.

If you use locks, you are at the library level. If you claim that, in
contrast to other solutions, which are library-level solutions, C++0x
provides language-level ones, what are they?

> > Consequently the
> > compiler will not be able to check whether the shared data is accessed
> > properly inside critical region or not.
>
> The memory model prohibits any optimizations (very few actually) that
> could
> break properly locked (or fenced) code.

I was not talking about optimisation, but about the fact that a
concurrent language must provide some means to mark the shared
resource, and that the compiler *must* then check whether the shared
resource is really accessed only within the critical region. This
cannot be substituted by any memory model or optimisation.

Anyway, a high level language is not about optimisations. Optimisation
is a technical issue for the compiler. However, the compiler is not
there, in the first place, to optimise but to translate one notation
to another.

Once upon a time an eminent scholar said: "There are two views of
programming. In the old view it is regarded as the purpose of our
programs to instruct our machines; in the new one it will be the
purpose of our machines to execute our programs."
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD512.html

Clearly, times have changed: The current trend is that we must write
programs for the compiler to optimise. So we might adapt the
saying: "There are two views of programming. In the old view it is
regarded as the purpose of our programs to BE EASILY OPTIMISED BY
COMPILERS; in the new one it will be the purpose of our COMPILERS to
COMPILE our programs." I rather support the latter view.

> > The low level library means such as mutex and condition variable are
> > there but there is no higher level language means to define a critical
> > region
>
> scoped_lock? Why do you need a language level means when a library
> approach works just fine and is more in line with the 'C++ Way'?

First of all some people here claimed that C++0x provides language
level solutions to multi-threading as opposed to the library
solutions. Well, scoped lock belongs to the library level. Is it
really the 'C++ Way'?

On the other hand, in concurrent programming it is even more important
to stay closer to the problem set rather than to the hardware or
library level. There is one important issue most of the people who
think there was no life before the pthread library forget about: The
compiler is not there to optimise your code. The compiler must help
you to write correct code and bridge the gap between the high level
human oriented notation and the low level machine language. In case of
concurrent programs it is even more important than it was in case of
sequential programs that the compiler should catch concurrent
programming errors at compile time, if possible. The compiler can only
do that if you use redundant high level language to express what you
want to achieve. If the compiler does the job, however, the translated
code will not be redundant.

If you use library calls, you are at the low level, the compiler
cannot help you much, as is explained in the referenced paper: Boehm
"Threads Cannot be Implemented as a Library" By the way, did you read
it at least?

> > or conditional critical region or monitor (see the `shared
> > class' as it was originally proposed). I mean at the language level
> > you have nothing new compared to the library level. At least I cannot
> > see anything like that in the attached "Multi-threading Library for
> > Standard C++ (Revision 1)" http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html#th...
>
> Higher level abstractions can (and will) be built on top of the low
> level infrastructure as pure library extensions.

If you do that in the user code, the compiler support is lost. On the
other hand, you can do that in any library, you do not need C++0x at
all for that. However, this way you will never have high level
abstractions at the language level, since you are going to stay at
library level "as pure library extensions."

Best Regards,
Szabolcs

James Kanze

unread,
Apr 19, 2008, 3:25:06 AM4/19/08
to
Szabolcs Ferenczi a écrit :

> > On Apr 18, 12:49 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:

> > > On Apr 16, 6:21 pm, Ke Jin <kjin...@gmail.com> wrote:

[...]


> > > I cannot see the language features for concurrency but only the
> > > library ones.

> > The biggest advancement that C++0x will add to C++ in the
> > area of threading (other than acknowledging that it exists, of
> > course)

> Whether C++0x acknowledges threading or not, that does not
> affect threading. Do you seriously think that threading will
> die out if C++ will not deal with it?

No, but C++ will gradually die out if it doesn't acknowledge
threading.

> > is the introduction of a (quite advanced, BTW) memory model,
> > which is a *pure* language feature.

> Is that all? The argument went that other solutions are not
> good because they are library based whereas the C++0x provides
> solution at the language level. What are the language level
> elements then?

> A memory model "is a *pure* language feature?" Sure? By the
> way: Do you know any programming languages which are designed
> for concurrency at the language level rather than at the
> library level? Do they have anything to do with any "memory
> model"? (Hint: Try to think of something other than Java.)

Posix C. Posix defines more or less how memory accesses work in
the presence of threads.

And Ada, and Eiffel, and probably a number of others.

[...]


> > Everything else is pretty much just standardizing the
> > status quo.

That's not quite true. The status quo is often that you don't
know what guarantees you have. The standard seeks to alleviate
this.

> The status quo of what? Pthreads library? Boost library? ACE
> library? or what?---since C++ has currently nothing to do
> with concurrent programming.

The main issue isn't the library interface (although that is
important too). The main issue involves language rules: when
must a compiler ensure that a write is visible in other threads,
for example.

[...]


> > > Consequently the compiler will not be able to check
> > > whether the shared data is accessed properly inside
> > > critical region or not.

> > The memory model prohibits any optimizations (very few
> > actually) that could break properly locked (or fenced) code.

> I did not talk about optimisation, but about the requirement that
> a concurrent language must provide some means to mark the shared
> resource; the compiler *must* then check whether the shared
> resource is really accessed within the critical region only. This
> cannot be substituted by any memory model or optimisation.

You may not have talked about optimisation, but it is certainly
an important issue. As is visibility.

> Anyway, a high level language is not about optimisations.
> Optimisation is a technical issue for the compiler. However,
> the compiler is not there, in the first place, to optimise but
> to translate one notation to another.

A language must define which optimizations are legal, and which
ones aren't.

> Once upon a time an eminent scholar said: "There are two views
> of programming. In the old view it is regarded as the purpose
> of our programs to instruct our machines; in the new one it
> will be the purpose of our machines to execute our programs."
> https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD512.html

Certainly. But if the language doesn't define what a statement
means, you can't use it at either level. C++ currently doesn't
define what a statement means in a multithreaded environment.
Doing so is a language issue.

Szabolcs Ferenczi

unread,
Apr 19, 2008, 2:03:28 PM4/19/08
to
On Apr 19, 9:25 am, James Kanze <james.ka...@gmail.com> wrote:

> The main issue isn't the library interface (although that is
> important too).  The main issue involves language rules: when
> must a compiler ensure that a write is visible in other threads,
> for example.

It is a good example. Visibility is quite a straightforward issue in
any decent concurrent programming language: At the end of the critical
region a write that happened inside the critical region is visible to
the other threads.

However, if the language is missing any language level means to define
a critical region, that is a problem. It is typically the case when a
language offers library level means for concurrency.

In C++ it could have been solved with a single keyword: `shared'. A
shared class is basically a monitor (note that originally the monitor
was introduced as a shared class). Here is an example:

shared class A {
    int x;
public:
    int s(const int n) { const int t = x; x = n; return t; }
};

The `x' is a critical resource which can be accessed inside a critical
region only. The compiler can implement the critical region with the
most appropriate low level means on a given platform. The language
level, however, is platform independent. Here the visibility is clear
and simple.
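The hypothetical `shared' keyword can be approximated today as a hand-written monitor. This is my own sketch, not code from the proposal: it wraps the post's class A around a std::mutex so that s() is a critical region by construction.

```cpp
#include <mutex>

// Sketch: emulating the proposed `shared class A' with the C++0x
// library facilities. The compiler does not enforce the critical
// region here; the discipline is encoded in the class by hand.
class A {
    std::mutex m_;  // guards x_
    int x_ = 0;     // the critical resource from the post's example
public:
    // Swap in a new value and return the old one, entirely inside
    // one critical region; the unlock makes the write visible to
    // the next caller on any thread.
    int s(const int n) {
        std::lock_guard<std::mutex> lock(m_);
        const int t = x_;
        x_ = n;
        return t;
    }
};
```

Unlike the proposed keyword, nothing stops other code from adding an unlocked accessor to x_, which is exactly the compiler-checking gap being debated.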

Best Regards,
Szabolcs

Chris Thomasson

unread,
Apr 19, 2008, 4:46:46 PM4/19/08
to
"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:5bce933b-5e59-4464...@c58g2000hsc.googlegroups.com...

On Apr 19, 9:25 am, James Kanze <james.ka...@gmail.com> wrote:

> > The main issue isn't the library interface (although that is
> > important too). The main issue involves language rules: when
> > must a compiler ensure that a write is visible in other threads,
> > for example.

> It is a good example. The visibility is quite straightforward issue in
> any decent concurrent programming language: At the end of the critical
> region a write that happened inside the critical region is visible to
> the other threads.

> However, if the language is missing any language level means to define
> a critical region, that is a problem. It is typically the case when a
> language offers library level means for concurrency.

POSIX Threads puts restrictions on C compilers such that threading is
defined without any need for new keywords; it's not a problem.


> In C++ it could have been solved with a single keyword: `shared'. A
> shared class is basically a monitor (note that originally the monitor
> was introduced as a shared class). Here is an example:

> shared class A {
>     int x;
> public:
>     int s(const int n) { const int t = x; x = n; return t; }
> };

[...]

That's at too high a level for C++. What if you wanted to atomically
modify x without using any locks? What if you wanted fine-grain control over
memory barriers involved? What if you wanted to be able to place the
barriers exactly where you want them? How does a shared keyword help? What
if I wanted to use plain loads and stores with membars instead of
interlocked rmw instructions?
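The fine-grained control asked about here is what the C++0x atomics were designed for. A sketch with illustrative names (not from the thread): plain loads and stores with acquire/release ordering instead of interlocked RMW instructions.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> ready(false);
int payload = 0;  // ordinary, non-atomic data

void producer() {
    payload = 42;                                  // plain store
    ready.store(true, std::memory_order_release);  // publish with a release barrier
}

int consumer() {
    // The acquire load pairs with the release store above, so once
    // ready is seen true, the write to payload is visible as well.
    while (!ready.load(std::memory_order_acquire))
        std::this_thread::yield();
    return payload;
}

int run() {
    ready.store(false);
    payload = 0;
    std::thread t(producer);
    int v = consumer();
    t.join();
    return v;
}
```

The point is that the barrier placement is explicit and chosen by the programmer, which a single `shared' keyword could not express.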

James Kanze

unread,
Apr 20, 2008, 4:33:42 AM4/20/08
to
On 19 avr, 20:03, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 19, 9:25 am, James Kanze <james.ka...@gmail.com> wrote:

> > The main issue isn't the library interface (although that is
> > important too). The main issue involves language rules: when
> > must a compiler ensure that a write is visible in other threads,
> > for example.

> It is a good example. The visibility is quite straightforward
> issue in any decent concurrent programming language: At the
> end of the critical region a write that happened inside the
> critical region is visible to the other threads.

Which is a language issue.

> However, if the language is missing any language level means
> to define a critical region, that is a problem.

Why?

> It is typically the case when a language offers library level
> means for concurrency.

Posix does it for C, without any real problems. It means that
language issues and the library are somewhat mixed, but there's
nothing new there.

> In C++ it could have been solved with a single keyword:
> `shared'. A shared class is basically a monitor (note that
> originally the monitor was introduced as a shared class).
> Here is an example:

> shared class A {
>     int x;
> public:
>     int s(const int n) { const int t = x; x = n; return t; }
> };

> The `x' is a critical resource which can be accessed inside a
> critical region only.

Which covers about 10% of the reasonable uses. I rather prefer
the way it is being handled, where I have more options when I'm
building my higher level abstractions.

Szabolcs Ferenczi

unread,
Apr 20, 2008, 7:42:30 AM4/20/08
to
On Apr 20, 10:33 am, James Kanze <james.ka...@gmail.com> wrote:
> On 19 avr, 20:03, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
> ...

> > However, if the language is missing any language level means
> > to define a critical region, that is a problem.
>
> Why?

Well, because visibility becomes a problem only if the concept of
Critical Region is missing from the language level. If the Critical
Region is part of the language, there is no issue like visibility.

Visibility and memory model are very low level concerns created by the
library approach itself.

Note that this discussion thread has just started with the claim that
a library is not appropriate (some referred to the Boehm article) but
C++0x handles concurrency at the language level. E.g. ACE has been
rejected with this claim.

On the contrary, what we can see is that C++0x provides a purely
library-based approach to threading; on the other hand, C++0x makes
a great effort to (re)solve problems created by the library
approach.

Any decent concurrent programming language must provide some language
means to:

(1) mark the shared resources
(2) mark the associated Critical Regions

This way the compiler can and must check whether the shared resource
is accessed inside Critical Regions only. In an object-oriented
language the two issues (i.e. the shared resource and the associated
Critical Region) are naturally combined in the concept of the shared
class (see also Monitor).

The checking function of the compiler can prevent a good deal of
concurrent programming errors and that is one of the main differences
between library level and language level. This essential difference is
not discussed in the Boehm paper which is a golden reference here
somehow.

If you claim that you are going to tackle multi-threading at language
level in C++0x, it is unavoidable to incorporate Critical Regions and
Conditional Critical Regions. Well, you should do so if the language
level is the concern. Otherwise you might play at low level such as
libraries, visibility and memory models but you should be aware of it
too and not claiming the language level.

After all, it seems that in C++0x you were trying to collect the
elements of the state of the art library-based approaches to multi-
threading and, consequently, you had to extend C++ at the language
level to deal with the problems generated by the library-based
approach. However, threading itself remains at the library-level. I
think we can summarise it that way, can't we?

Best Regards,
Szabolcs

P.S.: Once you know the details of the new proposal so well, would you
be so kind as to publish some canonical concurrency problems solved
in C++0x here, just to see how the proposal works out in some real
code.
http://groups.google.com/group/comp.lang.c++/msg/9607b37a3b0323f3

gpderetta

unread,
Apr 20, 2008, 9:45:06 AM4/20/08
to
On Apr 20, 1:42 pm, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 20, 10:33 am, James Kanze <james.ka...@gmail.com> wrote:
>
> > On 19 avr, 20:03, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:
> > ...
> > > However, if the language is missing any language level means
> > > to define a critical region, that is a problem.
>
> > Why?
>
> Well, because visibility becomes a problem only if the concept of
> Critical Region is missing from the language level. If the Critical
> Region is part of the language, there is no issue like visibility.
>

If you use locks, std::scoped_lock (or whatever it is called today)
will mark your critical section just fine: even if it is defined in
the standard library, it doesn't mean that the compiler can't or
doesn't have to apply appropriate magic to it.
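That point can be made concrete with a small sketch (the names are mine, not from the draft): the scoped-lock object delimits the critical section, and the compiler may not move the guarded accesses out of it.

```cpp
#include <mutex>
#include <thread>

std::mutex counter_mutex;
long counter = 0;  // shared resource, guarded by counter_mutex

void add_many(int n) {
    for (int i = 0; i < n; ++i) {
        std::lock_guard<std::mutex> lock(counter_mutex);  // critical section begins
        ++counter;                                        // guarded write
    }                                                     // ends at scope exit
}

long run_two_threads(int n) {
    counter = 0;
    std::thread t1(add_many, n);
    std::thread t2(add_many, n);
    t1.join();
    t2.join();
    return counter;  // all guarded writes are visible after the joins
}
```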

If you use low level atomic operations, then a critical section is
of very little use. The standard committee decided, IMVHO correctly,
that, as a system level language, C++ has to provide as fine
grained an approach to concurrency as possible (from relaxed atomics
to locks and condition variables).

> Visibility and memory model are very low level concerns created by the
> library approach itself.

No. Java has a high level approach to concurrency (monitors
and synchronized methods), but still defines a memory model.
In fact the C++ memory model has been heavily inspired by Java.

>
> Note that this discussion thread has just started with the claim that
> a library is not appropriate (some referred to the Boehm article)
> but C++0x handles concurrency at the language level.

A library *alone* is not sufficient. The compiler must be aware of
threading for the program to work correctly. Even POSIX threads,
apparently a pure library solution, put some requirements on the
compiler (see the recent try_lock discussion on the gcc mailing list).

> E.g. ACE has been
> rejected with this claim.
>
> On the contrary, what we can see is that C++0x provides a purely
> library-based approach for the threading,

How is the memory model a purely library based approach? How can you
implement atomic operations with pure library extensions? How can you
be sure that a try_lock will work without the compiler being
explicitly prohibited from performing speculative stores? How can you
be sure that the compiler doesn't move operations into or out of a
critical section? C++0x is a language based approach. The interface,
like most of C++0x, appears as a library feature (and the idea is
that it could be implemented as such, with as little help from the
compiler as possible, but some help is needed).

BTW, the standard allows std::vector (or anything else in the standard
library) to be implemented as language construct!

> however, on the other hand, C++0x pays a great effort
> to (re)solve problems created by the library
> approach.
>

You have to prove this claim.

> Any decent concurrent programming language must provide some language
> means to:
>
> (1) mark the shared resources
> (2) mark the associated Critical Regions
>

This is of course your opinion. Certainly C++ allows the second via
scoped_lock. You can do the first easily via encapsulation.

> This way the compiler can and must check whether the shared resource
> is accessed inside Critical Regions only.

But it won't prevent you from deadlocking or starving other threads,
or many other common multithreading problems. Letting you blow up
your own foot has always been C++ philosophy :)

> In an object-oriented
> language the two issues (i.e. the shared resource and the associated
> Critical Region) are naturally combined in the concept of the shared
> class (see also Monitor).

C++ is not (only) an OO language. And anyway, implementing a monitor
with std::condition_variable and std::mutex (or whatever their names
are) is pretty trivial.
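To illustrate how trivial it is, here is a minimal monitor built from those two primitives. The bounded buffer is the classic example; this is my sketch, not code from the proposal.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>

// A monitor: the mutex serializes entry, and the condition
// variables express the monitor's wait/signal conditions.
class BoundedBuffer {
    std::mutex m_;
    std::condition_variable not_full_, not_empty_;
    std::queue<int> q_;
    const std::size_t capacity_;
public:
    explicit BoundedBuffer(std::size_t cap) : capacity_(cap) {}

    void put(int v) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return q_.size() < capacity_; });
        q_.push(v);
        not_empty_.notify_one();
    }

    int get() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !q_.empty(); });
        int v = q_.front();
        q_.pop();
        not_full_.notify_one();
        return v;
    }
};
```

The predicate form of wait() re-checks the condition after every wakeup, which is the standard guard against spurious wakeups.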

But with the tools C++0x will provide you can also build other
abstractions; for example, in the multithreaded programming community
there is a certain interest in the Join Calculus, and it would be
interesting to see it implemented in C++.

>
> The checking function of the compiler can prevent a good deal of
> concurrent programming errors and that is one of the main differences
> between library level and language level. This essential difference is
> not discussed in the Boehm paper which is a golden reference here
> somehow.

I think this is considered a quality of implementation issue.
Producing good thread debugging tools is very hard, and the standard
does not mandate that a C++ compiler implement them. You can get them
from a third party, but it is going to cost $$!

>
> If you claim that you are going to tackle multi-threading at language
> level in C++0x, it is unavoidable to incorporate Critical Regions and
> Conditional Critical Regions.
> Well, you should do so if the language
> level is the concern. Otherwise you might play at low level such as
> libraries, visibility and memory models but you should be aware of it
> too and not claiming the language level.
>

You should be aware that just because these facilities have a library
interface in C++, it doesn't mean that they aren't part of the
language. The line between the library and the language is much
thinner in C++ than in other languages.

> After all, it seems that in C++0x you were trying to collect the
> elements of the state of the art library-based approaches to multi-
> threading and, consequently, you had to extend C++ at the language
> level to deal with the problems generated by the library-based
> approach.

The previous two major C++ thread interfaces (Win32 threads and
POSIX threads) certainly are not a pure library based approach,
but their requirements on a compiler were a bit fuzzy or nonexistent.
C++0x took care to formalize that. There doesn't seem to be any need
to go beyond that. A pure language based solution would mean: no
lock-free data structures and algorithms, no (easy) way to replace
the compiler provided thread API, any user extension would feel
second class, etc.

> However, threading itself remains at the library-level. I
> think we can summarise it that way, can't we.

Far from that.

>
> P.S.: Once you know the details of the new proposal so well, would you
> be so kind as to publish some canonical concurrency problems solved
> in C++0x here, just to see how the proposal works out in some real
> code. http://groups.google.com/group/comp.lang.c++/msg/9607b37a3b0323f3

I do not know the details, I'm not a threading exert nor do I have
the time to write 'real code' for you.

--
gpd

Szabolcs Ferenczi

unread,
Apr 20, 2008, 12:22:10 PM4/20/08
to

Yes, it is clear from your whole answer that you are "not a threading
ex[p]ert".

On top of all that you clearly intermix the language level with the
library level. E.g. you keep saying that with help of scoped lock you
can implement a Critical Region. Well, I never doubted that.
Furthermore, I can implement a Critical Region with help of semaphores
too. I can implement even scoped locks or Monitors with help of
semaphores and without any visibility problem.---However, that depends
on pure discipline and it will never be a language level solution and
all the compiler support will be missing.

No problem if you cannot write 'real code' for us, never mind. I did
not address the request to you. You can find the questions addressed
to you in:
http://groups.google.com/group/comp.lang.c++/msg/4328af93a959bf75

Besides, I asked someone else, who seems to know the details, to put
some code fragments here. If you do not know the details *when* we
come to something exact to do, hmmm...., well---despite you talked a
lot about it creating the illusion as if you were in possession of the
details.

Anyway, thank you for your efforts. If you just do not have time to do
it, let us wait for the experts then.

Best Regards,
Szabolcs

James Kanze

unread,
Apr 20, 2008, 12:36:50 PM4/20/08
to
On 20 avr, 13:42, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 20, 10:33 am, James Kanze <james.ka...@gmail.com> wrote:

> > On 19 avr, 20:03, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:
> > ...
> > > However, if the language is missing any language level means
> > > to define a critical region, that is a problem.

> > Why?

> Well, because visibility becomes a problem only if the concept
> of Critical Region is missing from the language level. If the
> Critical Region is part of the language, there is no issue
> like visibility.

You've got things backwards, I think. You're imposing a
specific solution. A language definition of a critical region
can define visibility, of course, but that's not the only
solution. And it's far from the best one.

> Visibility and memory model are very low level concerns
> created by the library approach itself.

> Note that this discussion thread has just started with the
> claim that a library is not appropriate (some referred to the
> Boehm article) but C++0x handles concurrency at the language
> level. E.g. ACE has been rejected with this claim.

No. Boehm claims that a library solution alone is not
sufficient. He never claims that the complete solution will not
contain any library elements. The programmer's API is in the
library. And since ACE just differs to the underlying platform
for this, it doesn't reject anything.

> On the contrary, what we can see is that C++0x provides a
> purely library-based approach for the threading, however, on
> the other hand, C++0x pays a great effort to (re)solve
> problems created by the library approach.

You're not making sense. C++0x doesn't (or won't) provide a
purely library-based approach. Threading issues will be
addressed in the language. C++0x makes the distinction between
library issues (the API) and language issues (the memory model
and visibility).

> Any decent concurrent programming language must provide some
> language means to:

> (1) mark the shared resources
> (2) mark the associated Critical Regions

> This way the compiler can and must check whether the shared
> resource is accessed inside Critical Regions only.

From what I understand, that's more or less the way Ada works.
The result is that everyone ends up creating larger structures
using this base, and running into the same problems.

> In an object-oriented language the two issues (i.e. the shared
> resource and the associated Critical Region) are naturally
> combined in the concept of the shared class (see also
> Monitor).

> The checking function of the compiler can prevent a good deal
> of concurrent programming errors

Funny. Concurrent programs in Ada have exactly the same
problems as those in C++.

> and that is one of the main differences between library level
> and language level. This essential difference is not discussed
> in the Boehm paper which is a golden reference here somehow.

> If you claim that you are going to tackle multi-threading at
> language level in C++0x, it is unavoidable to incorporate
> Critical Regions and Conditional Critical Regions. Well, you
> should do so if the language level is the concern. Otherwise
> you might play at low level such as libraries, visibility and
> memory models but you should be aware of it too and not
> claiming the language level.

I don't think you really understand what concurrency is all
about. Boehm's point, which has been accepted by the committee,
is that you have to address certain threading issues at the
language level. There was never a proposal for introducing the
threading API at the language level, however; that just seems
contrary to the basic principles of C++, and in practice, is
very limiting and adds nothing.

> After all, it seems that in C++0x you were trying to collect
> the elements of the state of the art library-based approaches
> to multi- threading and, consequently, you had to extend C++
> at the language level to deal with the problems generated by
> the library-based approach. However, threading itself remains
> at the library-level. I think we can summarise it that way,
> can't we.

No. The language addresses threading, as it must.

Pete Becker

unread,
Apr 20, 2008, 1:04:16 PM4/20/08
to
On 2008-04-20 12:36:50 -0400, James Kanze <james...@gmail.com> said:

> There was never a proposal for introducing the
> threading API at the language level, however; that just seems
> contrary to the basic principles of C++, and in practice, is
> very limiting and adds nothing.
>

There actually was a proposal for threading at the language level:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1875.html.
But, as you say, that is contrary to the basic principles of C++.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com)
Author of "The Standard C++ Library Extensions: a Tutorial and
Reference" (www.petebecker.com/tr1book)

Szabolcs Ferenczi

unread,
Apr 20, 2008, 1:17:23 PM4/20/08
to
On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:
> On 20 avr, 13:42, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
> ...

> > On the contrary, what we can see is that C++0x provides a
> > purely library-based approach for the threading, however, on
> > the other hand, C++0x pays a great effort to (re)solve
> > problems created by the library approach.
>
> You're not making sense.  C++0x doesn't (or won't) provide a
> purely library-based approach.  Threading issues will be
> addressed in the language.  C++0x makes the distinction between
> library issues (the API) and language issues (the memory model
> and visibility).

So, you say "threading issues will be addressed in the language."
Good. Let us see. Now, please tell me, just to make some sense:

(1) How you can start a thread of computation in C++0x? Do you have a
language element for it, or do you have just some library stuff?

(2) How can you define mutual exclusion for the concurrent threads of
computations in C++0x? Do you have a language element for it, or do
you have just some library stuff?

(3) How can you make threads synchronised with each other? Do you have
a language element for it, or do you have just some library stuff?

Please try to answer these questions. If you sincerely answer them we
can see whether C++0x addresses threading at the language level or at
the library level.

Chris Thomasson

unread,
Apr 20, 2008, 2:12:07 PM4/20/08
to
"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:cd5b98c0-5be5-46a2...@l42g2000hsc.googlegroups.com...
[...]

> P.S.: Once you know the details of the new proposal so well, would you
> be so kind as to publish some canonical concurrency problems solved
> in C++0x here, just to see how the proposal works out in some real
> code.
> http://groups.google.com/group/comp.lang.c++/msg/9607b37a3b0323f3

Here is dining philosophers problem solved with POSIX Threads and C:

http://appcore.home.comcast.net/misc/diner_c.html

It's going to be absolutely trivial to implement that in C++0x because it
offers everything one needs to do it.


Here are some full-blown reader-writer solutions in x86:

http://appcore.home.comcast.net/misc/pc_sample_h_v1.html
(Proxy GC implementation...)

http://appcore.home.comcast.net
(SMR implementation...)


Luckily C++0x is going to provide all the atomics and membars needed to
implement those algorithms. Even the first one (e.g., proxy gc)... It
requires DWCAS, however, C++0x is low-level enough for me to use alignment
and offset tricks to elude its use. The threading tools provided in C++0x
will allow one to code solutions to all the problems you listed.

Finally, C++0x will allow somebody to create 100% standard implementations
of many state-of-the-art synchronization algorithms. That's a good thing
indeed!

:^)
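As a small illustration of that claim (my own sketch, far simpler than the algorithms linked above), a portable synchronization primitive written entirely with the standard atomics:

```cpp
#include <atomic>
#include <thread>

// A tiny spinlock built from std::atomic_flag, the one lock-free
// type the standard guarantees on every implementation.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire))
            std::this_thread::yield();  // spin until the flag clears
    }
    void unlock() { flag_.clear(std::memory_order_release); }
};

int spin_test() {
    SpinLock s;
    int shared = 0;
    auto work = [&] {
        for (int i = 0; i < 1000; ++i) {
            s.lock();
            ++shared;  // protected by the spinlock
            s.unlock();
        }
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return shared;
}
```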

Chris Thomasson

unread,
Apr 20, 2008, 2:27:35 PM4/20/08
to
"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:7c97ff82-f878-4fcb...@24g2000hsh.googlegroups.com...

On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:
> > On 20 avr, 13:42, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:
> > ...
> > > On the contrary, what we can see is that C++0x provides a
> > > purely library-based approach for the threading, however, on
> > > the other hand, C++0x pays a great effort to (re)solve
> > > problems created by the library approach.
> >
> > You're not making sense. C++0x doesn't (or won't) provide a
> > purely library-based approach. Threading issues will be
> > addressed in the language. C++0x makes the distinction between
> > library issues (the API) and language issues (the memory model
> > and visibility).

> So, you say "threading issues will be addressed in the language."
> Good. Let us see.

You're not grasping the concept. POSIX Threads implementations are implemented
as a library, however, a conforming C compiler MUST comply with the
restrictions that the PThread standard imposes on it. You can read a VERY
brief description from Mr. PThread himself:

http://groups.google.com/group/comp.programming.threads/msg/729f412608a8570d

;^)

Here is what happens when a compiler does not follow the rules imposed by
POSIX Threads:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/63f6360d939612b3

This is not a legal optimization for a compiler to perform IF it claims that
it works with POSIX systems.


> Now, please tell me, just to make some sense:

> (1) How you can start a thread of computation in C++0x? Do you have a
> language element for it, or do you have just some library stuff?

std::thread


> (2) How can you define mutual exclusion for the concurrent threads of
> computations in C++0x? Do you have a language element for it, or do
> you have just some library stuff?

std::mutex


> (3) How can you make threads synchronised with each other? Do you have
> a language element for it, or do you have just some library stuff?

std::condition_variable
std::atomic


> Please try to answer these questions. If you sincerely answer them we
> can see whether C++0x addresses threading at the language level or at
> the library level.

It addresses threading at both the language and the library level.
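Putting those three answers together in one minimal sketch (illustrative names only): std::thread starts the computation, std::mutex provides the mutual exclusion, and join() synchronizes with thread completion.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Four worker threads each add their id to a shared total under a
// mutex; join() is the synchronization point with each worker.
int sum_in_parallel() {
    std::mutex m;
    int total = 0;
    std::vector<std::thread> workers;
    for (int i = 1; i <= 4; ++i) {
        workers.emplace_back([&m, &total, i] {
            std::lock_guard<std::mutex> lock(m);  // critical region
            total += i;
        });
    }
    for (auto& t : workers)
        t.join();  // synchronize with all workers before reading total
    return total;
}
```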

[...]

peter koch

unread,
Apr 20, 2008, 5:12:25 PM4/20/08
to
On 20 Apr., 19:17, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:
>

[snip]


> So, you say "threading issues will be addressed in the language."
> Good. Let us see. Now, please tell me, just to make some sense:
>
> (1) How you can start a thread of computation in C++0x? Do you have a
> language element for it, or do you have just some library stuff?
>
> (2) How can you define mutual exclusion for the concurrent threads of
> computations in C++0x? Do you have a language element for it, or do
> you have just some library stuff?
>
> (3) How can you make threads synchronised with each other? Do you have
> a language element for it, or do you have just some library stuff?

Well... instead of persevering with your questions, I really believe
that you should try to read the Böhm paper. If not for your own sake,
then for ours ;-)
The answer to your questions above is obvious (it is in a library),
but this still does not change the fact that threading for a large
part is a language issue.

/Peter

Szabolcs Ferenczi

unread,
Apr 20, 2008, 7:14:45 PM4/20/08
to
On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:
> On 20 avr, 13:42, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
> ...

> > Any decent concurrent programming language must provide some
> > language means to:
> > (1) mark the shared resources
> > (2) mark the associated Critical Regions
> > This way the compiler can and must check whether the shared
> > resource is accessed inside Critical Regions only.
>
> From what I understand, that's more or less the way Ada works.

From what you understand, maybe yes. I must tell you that there was
life before Ada in the area of concurrent programming and there are
other genuine concurrent programming languages than Ada. But Ada was
certainly the state of the art concurrent programming language at a
time.

However, the multi-core boom was not there yet.

> > In an object-oriented language the two issues (i.e. the shared
> > resource and the associated Critical Region) are naturally
> > combined in the concept of the shared class (see also
> > Monitor).
> > The checking function of the compiler can prevent a good deal
> > of concurrent programming errors
>
> Funny.  Concurrent programs in Ada have exactly the same
> problems as those in C++.

Funny indeed. C++ is a pure sequential programming language but Ada is
a high level concurrent programming language tackling concurrency at
the language level. How can you compare them from the concurrency
point of view?

Concurrent programming has nothing to do with C++ so far, since you are
just working on an enhancement of C++.

Consequently, concurrent programs in Ada CANNOT have exactly the same
problems as those in C++. It would be funny, indeed.

Best Regards,
Szabolcs

James Kanze

unread,
Apr 21, 2008, 4:45:16 AM4/21/08
to
On Apr 21, 1:14 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:

> > On 20 avr, 13:42, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:
> > ...
> > > Any decent concurrent programming language must provide some
> > > language means to:
> > > (1) mark the shared resources
> > > (2) mark the associated Critical Regions
> > > This way the compiler can and must check whether the shared
> > > resource is accessed inside Critical Regions only.

> > From what I understand, that's more or less the way Ada works.

> From what you understand, maybe yes. I must tell you that
> there was life before Ada in the area of concurrent
> programming and there are other genuine concurrent programming
> languages than Ada. But Ada was certainly the state of the art
> concurrent programming language at a time.

I'm aware of that. Since then, the state of the art has moved
on, and the Ada model is really considered a bit too
constricting.

> However, the multi-core boom was not there yet.

Maybe not in your world. I had to deal with parallel processors
in some of my work in the late 1970's. During the eighties, the
automation of parallel processing was the rage in compilers.
It's only recently that such parallelism has descended to the
PC's, but it's hardly new.

> > > In an object-oriented language the two issues (i.e. the
> > > shared resource and the associated Critical Region) are
> > > naturally combined in the concept of the shared class (see
> > > also Monitor). The checking function of the compiler can
> > > prevent a good deal of concurrent programming errors

> > Funny. Concurrent programs in Ada have exactly the same
> > problems as those in C++.

> Funny indeed. C++ is a pure sequential programming language
> but Ada is a high level concurrent programming language
> tackling concurrency at the language level. How can you
> compare them from the concurrency point of view?

Why not? You have exactly the same problems writing correct
multithreaded code in Ada as you do in C or C++ under Posix.

> Concurrent programs has nothing to do with C++ so far since
> you are just working on the enhancement of C++.

> Consequently, concurrent programs in Ada CANNOT have exactly
> the same problems as those in C++. It would be funny, indeed.

Except that they do. They suffer from deadlocks, and race
conditions, and all of the same problems C and C++ programs
suffer from.

Welcome to the real world.

James Kanze

unread,
Apr 21, 2008, 4:38:08 AM4/21/08
to
On Apr 20, 7:17 pm, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:

> > On 20 avr, 13:42, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:
> > ...
> > > On the contrary, what we can see is that C++0x provides a
> > > purely library-based approach for the threading, however, on
> > > the other hand, C++0x pays a great effort to (re)solve
> > > problems created by the library approach.

> > You're not making sense. C++0x doesn't (or won't) provide a
> > purely library-based approach. Threading issues will be
> > addressed in the language. C++0x makes the distinction between
> > library issues (the API) and language issues (the memory model
> > and visibility).

> So, you say "threading issues will be addressed in the language."
> Good. Let us see. Now, please tell me, just to make some sense:

> (1) How you can start a thread of computation in C++0x? Do you have a
> language element for it, or do you have just some library stuff?

> (2) How can you define mutual exclusion for the concurrent threads of
> computations in C++0x? Do you have a language element for it, or do
> you have just some library stuff?

> (3) How can you make threads synchronised with each other? Do you have
> a language element for it, or do you have just some library stuff?

You don't seem to be reading what I am writing. These are not
fundamental threading issues; they're simply means of accessing
the underlying thread model. Since they're just an API, they
are naturally defined in the library; it would be a serious
design error to do otherwise.

The real threading issue is rather when you need 2, and when you
need 3 (and of course, when memory is synchronized between
different threads), not how you access the primitives which
provide them. And that is a pure language issue.

gpderetta

unread,
Apr 21, 2008, 10:18:26 AM4/21/08
to
On Apr 20, 6:22 pm, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 20, 3:45 pm, gpderetta <gpdere...@gmail.com> wrote:
>
> > I do not know the details, I'm not a threading exert nor I have the
> > time
> > to write 'real code' for you.
>
> Yes, it is clear from your whole answer that you are "not a threading
> ex[p]ert".

Yes, I'm not a self-appointed threading expert.

>
> On top of all that you clearly intermix the language level with the
> library level.

In C++ the library level and the language *are* intermixed.

> E.g. you keep saying that with help of scoped lock you
> can implement a Critical Region. Well, I never doubted that.
> Furthermore, I can implement a Critical Region with help of semaphores
> too. I can implement even scoped locks or Monitors with help of
> semaphores and without any visibility problem.

How do you know that your compiler is not going to optimize around
your semaphore? I, and pretty much everybody else, claim that it
is impossible to write a correct semaphore class in ISO C++. Not
without extra help from the compiler.

> ---However, that depends
> on pure discipline and it will never be a language level solution and
> all the compiler support will be missing.

After lots of talking you still fail to explain to us how a
keyword-based interface would be better than a library-based
interface.

--
Giovanni P. Deretta

Ke Jin

unread,
Apr 21, 2008, 12:52:35 PM4/21/08
to
Szabolcs Ferenczi wrote:
> Note that this discussion thread has just started with the claim that
> > a library is not appropriate (some referred to the Boehm article) but
> > C++0x handles concurrency at the language level. E.g. ACE has been
> rejected with this claim.
>

Szabolcs,

I think you are referring to my original post. To clarify, I didn't
claim that ACE had been rejected for this reason. I was just pointing
out that the C++0x thread goes beyond ACE thread.

Second, I am not aware that ACE has ever been submitted to C++0x, let
alone that it was rejected for any reason. In fact, I pointed out in my
original post that the ACE folks did attempt to have it stamped by a
standards organization. However, instead of pursuing ANSI or ISO
(and therefore the C++0x committee) as we suggested, they went to OMG
(see the link in the original post).

Regards,
Ke

Szabolcs Ferenczi

unread,
Apr 21, 2008, 1:07:23 PM4/21/08
to
On Apr 21, 10:45 am, James Kanze <james.ka...@gmail.com> wrote:
> On Apr 21, 1:14 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
> ...

> > However, the multi-core boom was not there yet.
>
> Maybe not in your world.

Not in your world either unless you are dreaming.

> I had to deal with parallel processors
> in some of my work in the late 1970's.

I do not doubt that you did. Well done.

However, that is no argument against the fact that at the time Ada was
designed the multi-core boom had not yet arrived. That is what I said.
You are going against the facts, I must tell you.

> > > > In an object-oriented language the two issues (i.e. the
> > > > shared resource and the associated Critical Region) are
> > > > naturally combined in the concept of the shared class (see
> > > > also Monitor).  The checking function of the compiler can
> > > > prevent a good deal of concurrent programming errors
> > > Funny.  Concurrent programs in Ada have exactly the same
> > > problems as those in C++.
> > Funny indeed. C++ is a pure sequential programming language
> > but Ada is a high level concurrent programming language
> > tackling concurrency at the language level.  How can you
> > compare them from the concurrency point of view?
>
> Why not?

Just because Ada is a concurrent programming language with proper
language means for concurrency. C++ is not one---today. That is why.
Simple, isn't it?

> You have exactly the same problems writing correct
> multithreaded code in Ada as you do in C or C++ under Posix.

Well, on the same grounds you could say that the problem applies to any
meaningful concurrent programming language today. Not to C++,
however, since it is not a concurrent programming language as far as
the language level is concerned. And it does not want to be one, either.

Concurrency library + adjusted sequential language != concurrent
language

You can make programs in the combination of library and a hacked
sequential language, of course, but the compiler support is missing
for the concurrent parts. I am not talking about optimisation because
it is not compiler support. I am talking about the compiler checking
the correct use of the shared resources. For that, however, you should
be able to mark the shared resources and the critical regions at the
language level.

> > Concurrent programs has nothing to do with C++ so far since
> > you are just working on the enhancement of C++.
> > Consequently, concurrent programs in Ada CANNOT have exactly
> > the same problems as those in C++. It would be funny, indeed.
>
> Except that they do.

Today C++ is a pure sequential programming language without any
language elements for concurrency. I hope you can admit that. Ada on
the other hand is a concurrent language with proper language means for
concurrency. Consequently, programs written in the two languages cannot
have the same concurrency problems since one of the languages is
sequential.

> They suffer from deadlocks, and race
> conditions, and all of the same problems C and C++ programs
> suffer from.
>
> Welcome to the real world.

In the real world C and C++ programs only suffer from their low level
features like pointers and their type system. Concurrency at the
language level has nothing to do with either C or C++, and as it looks
like, it never will be.

That is the real world.

I hope I could help.

Best Regards,
Szabolcs

Szabolcs Ferenczi

unread,
Apr 21, 2008, 1:53:48 PM4/21/08
to
On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:
> On 20 avr, 13:42, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
> ...

> > and that is one of the main differences between library level
> > and language level. This essential difference is not discussed
> > in the Boehm paper which is a golden reference here somehow.
> > If you claim that you are going to tackle multi-threading at
> > language level in C++0x, it is unavoidable to incorporate
> > Critical Regions and Conditional Critical Regions. Well, you
> > should do so if the language level is the concern. Otherwise
> > you might play at low level such as libraries, visibility and
> > memory models but you should be aware of it too and not
> > claiming the language level.
>
> I don't think you really understand what concurrency is all
> about.

Do you think it is necessary for you to go to that level? Let me ignore
your attempt of personal insult.

> Boehm's point, which has been accepted by the committee,
> is that you have to address certain threading issues at the
> language level.

Yes, he only deals with the situation that follows when you introduce
library-level means. My point was that Boehm did not address the
important issue of what the fundamental difference is between having
something at the language level and having it at the library level.

See: "The checking function of the compiler can prevent a good deal
of concurrent programming errors and that is one of the main


differences between library level and language level. This essential
difference is not discussed in the Boehm paper which is a golden
reference here somehow."

> There was never a proposal for introducing the
> threading API at the language level, however; that just seems
> contrary to the basic principles of C++, and in practice, is
> very limiting and adds nothing.

You have already been caught out in this blunder by someone who is
involved in C++0x:
http://groups.google.com/group/comp.lang.c++/msg/1ea5fdf80fd7f461

Best Regards,
Szabolcs

Chris Thomasson

unread,
Apr 21, 2008, 2:07:05 PM4/21/08
to
"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:29ca5980-897b-4de2...@m73g2000hsh.googlegroups.com...

On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:
> > On 20 avr, 13:42, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:
> > ...
> > > and that is one of the main differences between library level
> > > and language level. This essential difference is not discussed
> > > in the Boehm paper which is a golden reference here somehow.
> > > If you claim that you are going to tackle multi-threading at
> > > language level in C++0x, it is unavoidable to incorporate
> > > Critical Regions and Conditional Critical Regions. Well, you
> > > should do so if the language level is the concern. Otherwise
> > > you might play at low level such as libraries, visibility and
> > > memory models but you should be aware of it too and not
> > > claiming the language level.
> >
> > I don't think you really understand what concurrency is all
> > about.

> Do you think it is necessary for you go to that level? Let me ignore
> your attempt of personal insult.

It sounds like you're trying to say that C++0x is no good because it won't
have very high-level keywords which deal with automatic critical sections
and the like... If that is true, well, you're a bit misguided, to say the
least...


> > Boehm's point, which has been accepted by the committee,
> > is that you have to address certain threading issues at the
> > language level.

> Yes, he only deals with the situation what follows when you introduce
> library level means. My point was that Boehm did not addressed the
> important issue what is the fundamental difference having something at
> the language level or at the library level.

He did not need to go into that. He was trying to point out that C++ can use
threads if the compilers follow some specific rules, much like the rules
POSIX Threads puts on C compilers. You're not getting the fact that the
library and the language need to have a very intimate relationship with each
other where threads are concerned. C++ does not need any new keywords to
implement threading.


> See: "The checking function of the compiler can prevent a good deal
> of concurrent programming errors and that is one of the main
> differences between library level and language level. This essential
> difference is not discussed in the Boehm paper which is a golden
> reference here somehow."

> > There was never a proposal for introducing the
> > threading API at the language level, however; that just seems
> > contrary to the basic principles of C++, and in practice, is
> > very limiting and adds nothing.

> You have been unveiled already in this blunder by someone who is
> involved in C++0x:
> http://groups.google.com/group/comp.lang.c++/msg/1ea5fdf80fd7f461

Huh? Please explain how your 'shared' keyword would allow me to program an
OS kernel which depends on asynchronous signal-safe atomic operations and
fine-grain memory barriers for use in interrupt handlers? How could I use
your 'shared' keyword to program RCU? IMHO, your keyword would break the
spirit of C++.

Szabolcs Ferenczi

unread,
Apr 24, 2008, 12:37:40 PM4/24/08
to
On Apr 21, 6:52 pm, Ke Jin <kjin...@gmail.com> wrote:
> Szabolcs Ferenczi wrote:
> > Note that this discussion thread has just started with the claim that
> > a library is not appropriate (some referred to the Boehm article) but
> > C++0x handles concurrency at the language level. E.g. ACE has been
> > rejected with this claim.
>
> Szabolcs,
>
> I think you are referring to my original post. To clarify, I didn't
> claim that ACE had been rejected for this reason. I was just pointing
> out that the C++0x thread goes beyond ACE thread.

Hi Ke,

I was not referring specifically to your post; it is just an accident that
my comment on your post has somehow attracted some forum fighters.

I remember having seen a number of arguments that day to the effect
that the library solution is not adequate (with reference to the paper
of Boehm) and that C++0x goes for a language-based approach instead.
(Unfortunately, Google may have had some problem that day, since some
posts have disappeared from the web pages.) Having heard that C++0x
tackles threading at the language level I became quite enthusiastic
but when I checked into the document
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
I had to conclude that C++0x also has a pure library approach as far
as threading is concerned.

I was also searching for some language elements for concurrency in the
document but all I could find was just library stuff. When I wanted to
learn from the forum members about the exact language _elements_ in C+
+0x, I got _elementary_ attack instead.

That C++0x will have threading at the library level is not good, but
that alone is not a big problem. It will just not be any different
from the other languages which have threading at the library level,
including Java. But then, after all, ACE is a library too.

That C++0x will have threading at the library level as well is not a
problem, at least not if people are aware of it. I could not get it
acknowledged here that threading is at library level in C++0x. Well,
some people here often refer to the development in threading in the
last ten years and claiming some state-of-the-art threading which is
missing from ACE. There is development on the hardware side, I think,
but there is hardly any development, or rather a retrogression, on the
software side as far as concurrent programming is concerned.
With the current trend in C++ it will just continue, I guess.

> Second, I am not aware of ACE has ever been submitted to C++0x, not to
> mention it was rejected for any reason. In fact, I pointed out in my
> original post that ACE folks did attempt to have it stamped by a
> standard organization. However, instead of pursuing ANSI or ISO
> (therefore C++0x committee) suggested by us, they went to OMG (see the
> link in the original post).

Perhaps ACE itself was neither submitted nor rejected; I do not
know either. I am not especially an ACE fan. I just remembered some
claim that a library by itself is not the approach C++0x takes, but what
I can see in the doc is just library stuff.

So I just wanted to learn from some experts here what are the language
elements then in C++0x with respect to threading. Sadly, I could not
find any competent person here, except for some enthusiastic forum
fighters, it seems.

Best Regards,
Szabolcs

Pete Becker

unread,
Apr 24, 2008, 1:05:47 PM4/24/08
to
On 2008-04-24 12:37:40 -0400, Szabolcs Ferenczi
<szabolcs...@gmail.com> said:

>
> I remember having seen a number of arguments that day going that the
> library solution is not adequate (with reference to the paper of
> Boehm) and that C++0x goes for a language-based approach instead.

That's not correct, as several people have told you. C++0x has
low-level language-based requirements and a higher level library that
relies on them.

> (Unfortunately, Google could have some problem that day since some
> posts have disappeared from the web pages.) Having heard that C++0x
> tackles threading at the language level I became quite enthusiastic
> but when I checked into the document
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> I had to conclude that C++0x also has a pure library approach as far
> as threading is concerned.

You're right that C++0x has a high level library. You're not right that
it's a "pure library approach". The library can't be implemented
correctly without compiler support, in the form of a stricter memory
model that makes it possible to define and to reason about the order of
operations in a multi-threaded program.


> [...]


>
>> Second, I am not aware of ACE has ever been submitted to C++0x, not to
>> mention it was rejected for any reason. In fact, I pointed out in my
>> original post that ACE folks did attempt to have it stamped by a
>> standard organization. However, instead of pursuing ANSI or ISO
>> (therefore C++0x committee) suggested by us, they went to OMG (see the
>> link in the original post).
>
> Perhaps ACE itself was not submitted and neither rejected, I do not
> know it either. I am not especially an ACE fan. I just remembered some
> claim like library itself is not an approach C++0x goes, but what I
> can see in the doc is just library stuff.

Look at the section entitled "Multi-threaded executions and data races".
Without this you cannot write a reliable, portable multi-threaded
application.

Szabolcs Ferenczi

unread,
Apr 24, 2008, 7:56:51 PM4/24/08
to
On Apr 20, 11:12 pm, peter koch <peter.koch.lar...@gmail.com> wrote:
> On 20 Apr., 19:17, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
>
> > On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:
>
> [snip]
>
> > So, you say "threading issues will be addressed in the language."
> > Good. Let us see. Now, please tell me, just to make some sense:
>
> > (1) How you can start a thread of computation in C++0x? Do you have a
> > language element for it, or do you have just some library stuff?
>
> > (2) How can you define mutual exclusion for the concurrent threads of
> > computations in C++0x? Do you have a language element for it, or do
> > you have just some library stuff?
>
> > (3) How can you make threads synchronised with each other? Do you have
> > a language element for it, or do you have just some library stuff?
>
> Well... instead of persevering with your questions,

Why don't you dare to face those simple questions?

> I really believe
> that you should try to read the Böhm paper. If not for your own sake,
> then for ours ;-)

Of course I had already read it before I made my first
contribution to this discussion thread. It is interesting that anyone
who cannot find any argument himself tells me to read that article.

Now I must tell you that Boehm's article tackles the problem in a wrong
way. He considers the situation where shared variables are not
accessed in a mutually exclusive way and tries to draw conclusions from
it. I have a good feeling that all three of his examples (Section 4)
are inadequate (buggy) from the concurrent programming point of view.

Already in the 60s it was clear, that whenever concurrent processes
share some variables, i.e. they may access them simultaneously so that
one of the processes changes it, those variables must be protected by
critical sections.

For an example see Section 3 in Boehm:

<quote>
Thus if we started in an initial state in which all variables are
zero, and one thread executes:
x = 1; r1 = y;
while another executes
y = 1; r2 = x;
either the assignment to x or the assignment to y must be executed
first, and either r1 or r2 must have a value of 1 when execution
completes.
... Essentially all realistic programming language implementations supporting
true concurrency allow both r1 and r2 to remain zero in the above
example.
</quote>

Since variables `x' and `y' are accessed in both threads, i.e. they
are shared ones, it is evident that Boehm's example simply has a
concurrent bug. As soon as the bug is fixed (by an embracing critical
region) Boehm's problem disappears.

E.g. if `cr' means critical region (or you can use mutexes):
cr{ x = 1; r1 = y; }
cr{ y = 1; r2 = x; }

You would see the same symptom in Section 4 as well (4.2 is perhaps a
corner case). The whole discussion is built on the assumption: what if
you do not know how to write a correct concurrent program and you use
shared variables without any protection.

I suggest you study concurrent programming and then re-read the
Boehm paper. You will be surprised.

> The answer to your questions above is obvious (it is in a library),

At least someone who can admit the obvious.

> but this still does not change the fact that threading for a large
> part is a language issue.

It should be a language issue but where are those language elements in
C++0x? You say it is a large part of it. It is fact, you say. Can you
show here just a small part of it? Factually.

Best Regards,
Szabolcs

Chris Thomasson

unread,
Apr 24, 2008, 8:43:22 PM4/24/08
to
> "Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
> news:68977d32-c840-4029-a3f1->
> dfe131...@r66g2000hsg.googlegroups.com...

> On Apr 20, 11:12 pm, peter koch <peter.koch.lar...@gmail.com> wrote:
> > On 20 Apr., 19:17, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:
> >
> > > On Apr 20, 6:36 pm, James Kanze <james.ka...@gmail.com> wrote:
> >
> > [snip]
> >
> > > So, you say "threading issues will be addressed in the language."
> > > Good. Let us see. Now, please tell me, just to make some sense:
> >
> > > (1) How you can start a thread of computation in C++0x? Do you have a
> > > language element for it, or do you have just some library stuff?
> >
> > > (2) How can you define mutual exclusion for the concurrent threads of
> > > computations in C++0x? Do you have a language element for it, or do
> > > you have just some library stuff?
> >
> > > (3) How can you make threads synchronised with each other? Do you have
> > > a language element for it, or do you have just some library stuff?
> >
> > Well... instead of persevering with your questions,
>
> Why don't you dare to face those simple questions?

You are completely misguided!

:^|


> > I really believe
> > that you should try to read the Böhm paper. If not for your own sake,
> > then for ours ;-)

> Of course I have read it already before I have made my first
> contribution to this discussion thread. Interesting that anyone, who
> cannot find any argument himself, calls me to read that article.

> Now I must tell you that Boehm's article tackles the problem in a wrong
> way. He considers the situation where shared variables are not
> accessed in a mutually exclusive way and tries to draw conclusions from
> it. I have a good feeling that all three of his examples (Section 4)
> are inadequate (buggy) from the concurrent programming point of view.
>
> Already in the 60s it was clear, that whenever concurrent processes
> share some variables, i.e. they may access them simultaneously so that
> one of the processes changes it, those variables must be protected by
> critical sections.

Ever heard of non-blocking algorithms?

I suggest you learn about low-level high-performance multi-threading
algorithms. C++ will have efficient support for them in the form of
standardized fine-grain atomic operations and memory barriers.

C++ is going to provide support for the state-of-the-art multi-threading
techniques. If the standards committee listened to you, well, forget about
it. We would all be back to using assembly language. C++ is going to remove
the need for asm in non-blocking algorithms. Why do you fail to understand
that?

I can finally create a 100% standard version of my AppCore library with C++.
If your input was taken into account, well, I could forget about that.
Please, take a break.

:^/


> > The answer to your questions above is obvious (it is in a library),

> At least someone who can admit the obvious.

> > but this still does not change the fact that threading for a large
> > part is a language issue.

> It should be a language issue but where are those language elements in
> C++0x? You say it is a large part of it. It is fact, you say. Can you
> show here just a small part of it? Factually.

You just don't get it. The language and the library ARE very intimate with
each other indeed!

Szabolcs Ferenczi

unread,
Apr 25, 2008, 4:33:22 AM4/25/08
to
On Apr 24, 7:05 pm, Pete Becker <p...@versatilecoding.com> wrote:
> On 2008-04-24 12:37:40 -0400, Szabolcs Ferenczi
> <szabolcs.feren...@gmail.com> said:

> ...


> > I remember having seen a number of arguments that day going that the
> > library solution is not adequate (with reference to the paper of
> > Boehm) and that C++0x goes for a language-based approach instead.
>
> That's not correct, as several people have told you.

Several people keep saying that there are language elements for
multi-threading in C++0x, but nobody has been able to enumerate a
single language element so far.

Several people did not reply to my request to point out what are the
language elements for concurrency in the proposed C++0x standard. When
I make a list for them, asking (1) what are the language elements for
starting a thread of computation, (2) by which language elements
mutual exclusion is specified, (3) what are the language elements for
making the threads synchronised---the answer is silence or ignorance
from several people. (Ok, one of them has admitted that these are at
the library level rather than at the language level.)

> C++0x has
> low-level language-based requirements and a higher level library that
> relies on them.
>
> > (Unfortunately, Google could have some problem that day since some
> > posts have disappeared from the web pages.) Having heard that C++0x
> > tackles threading at the language level I became quite enthusiastic
> > but when I checked into the document
> >http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2497.html
> > I had to conclude that C++0x also has a pure library approach as far
> > as threading is concerned.
>
> You're right that C++0x has a high level library.

I only said library. I would not consider it a high-level one. In what
respect would it be higher than, say, the ACE library?

For instance let us see the condition variable. It is an object from a
class both in ACE and in C++0x. In ACE you can associate the
corresponding mutex with the condition variable at instantiation time.
Consequently, later on you can just put:

cv.wait();

In C++0x you always have to stay at the low level and specify the
mutex again and again, which is not only low level but error prone
too:

cv.wait(lck);

In this respect I would not consider the condition variable in C++0x
high level at all. All I like about it is that there is an
additional method specified which allows you to wait for a predicate
at the higher level indeed:

template <class Predicate>
void wait(unique_lock<mutex>& lock, Predicate pred);

Still, you have to supply the lock.

> You're not right that
> it's a "pure library approach". The library can't be implemented
> correctly without compiler support,

Many libraries are implemented so far correctly (without any compiler
support), as far as I know.

The missing compiler support for algorithms written on top of those
libraries is another question. Still, correct concurrent programs must
be obeyed by the compiler otherwise there is a problem in the
compiler.

By correct concurrent program I mean one where shared variables are
accessed in a mutually exclusive way, as is expected in any correct
concurrent program.

So, what would be really necessary in any decent concurrent language
is to mark the shared variables and then the compiler could check
whether the shared variables are accessed in a mutually exclusive way
or not.

> ...


> Look at the section entitled "Multi-threaded executions and data races".

Thanks for the hint. I had a look at it. Altogether, that section
considers how the compiler should react to an incorrect concurrent
program. By incorrect concurrent program I mean again as above.

I would not have guessed before that those three pages are the new
language features for concurrency in C++0x.

> Without this you cannot write a reliable, portable multi-threaded
> application.

Hmmm... How about portable multi-threaded applications written so far?

Best Regards,
Szabolcs

James Kanze

Apr 25, 2008, 5:45:45 AM
On Apr 25, 2:43 am, "Chris Thomasson" <cris...@comcast.net> wrote:
> > "Szabolcs Ferenczi" <szabolcs.feren...@gmail.com> wrote in message
> > news:68977d32-c840-4029-a3f1->

> > > I really believe


> > > that you should try to read the Böhm paper. If not for your own sake,
> > > then for ours ;-)
> > Of course I have read it already before I have made my first
> > contribution to this discussion thread.

Maybe he should have said "understood it", then.

[...]


> > The whole discussion is built on the assumption: what if
> > you do not know how to write a correct concurrent program and you use
> > shared variables without any protection.

Not really. The thrust of Boehm's article is that specifying
when protection is needed, and when it isn't, is a language
issue. Compilers can move code around, and on modern
processors, read and write pipelines also mean that register
loads and stores don't map immediately to memory cycles unless
the compiler generates special instructions around them. The
language has to specify what the compiler is allowed to do, and
what it must do.

> > I suggest you to study concurrent programming and then
> > re-read the Boehm paper. You will be surprised.

> I suggest you learn about low-level high-performance
> multi-threading algorithms. C++ will have efficient support
> for them in the form of standardized fine-grain atomic
> operations and memory barriers.

Which is a good thing, in itself, but even without such support,
the language still has to address the fundamental issues of
memory synchronization. (Similarly, whether the programmer API
is built into the language, as in Java, or is in the library, as
will be the case in C++, doesn't fundamentally change anything
in this regard.)

James Kanze

Apr 25, 2008, 5:57:07 AM
On Apr 25, 10:33 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 24, 7:05 pm, Pete Becker <p...@versatilecoding.com> wrote:
>
> > On 2008-04-24 12:37:40 -0400, Szabolcs Ferenczi
> > <szabolcs.feren...@gmail.com> said:
> > ...
> > > I remember having seen a number of arguments that day going that the
> > > library solution is not adequate (with reference to the paper of
> > > Boehm) and that C++0x goes for a language-based approach instead.

> > That's not correct, as several people have told you.

> Several people keep saying that there are language elements

> for multi-threading in C++0x but nobody could enumerate any


> single language element so far.

The Boehm paper explained them in detail. Several people have
suggested that you read it. In the current standard, the basic
execution and memory models start from the point of view that
code is executed sequentially.

> Several people did not reply to my request to point out what
> are the language elements for concurrency in the proposed
> C++0x standard. When I make a list for them, asking (1) what
> are the language elements for starting a thread of
> computation, (2) by which language elements mutual exclusion
> is specified, (3) what are the language elements for making
> the threads synchronised---the answer is silence or ignorance
> from several people. (Ok, one of them has admitted that these
> are at the library level rather than at the language level.)

No. Several people have pointed out several times where the
language addresses concurrency issues.

[...]


> > You're not right that it's a "pure library approach". The
> > library can't be implemented correctly without compiler
> > support,

> Many libraries are implemented so far correctly (without any
> compiler support), as far as I know.

The ones I know all depend on compiler support. It's not
defined by the language standard, but it is available as an
extension in many compilers. Not always in compatible manners,
of course.

> The missing compiler support for algorithms written on top of
> those libraries is another question. Still, correct concurrent
> programs must be obeyed by the compiler otherwise there is a
> problem in the compiler.

It's still standard conforming.

> > Without this you cannot write a reliable, portable multi-threaded
> > application.

> Hmmm... How about portable multi-threaded applications written
> so far?

Do you know of any? Without some sort of platform dependencies?

Pete Becker

Apr 25, 2008, 7:43:56 AM
On 2008-04-25 04:33:22 -0400, Szabolcs Ferenczi
<szabolcs...@gmail.com> said:

>
>> ...
>> Look at the section entitled "Multi-threaded executions and data races".
>
> Thanks for the hint. I had a look at it. Alltogether, that section
> considers how the compiler should react to an incorrect concurrent
> program. By incorrect concurrent program I mean again as above.

Not just that. The terms that it defines are used in other places to
specify the meaning of a valid C++ program.

>
> I would not guessed it before that those three pages are the new
> language features for concurrency in C++0x.

They're the foundation for it.

>
>> Without this you cannot write a reliable, portable multi-threaded
>> application.
>
> Hmmm... How about portable multi-threaded applications written so far?
>

As you have pointed out, Boehm's article says that library-only
solutions, as used in applications written so far, can't guarantee
semantics. Which is why I referred to "RELIABLE, portable
multi-threaded applications".

Nick Keighley

Apr 25, 2008, 9:11:37 AM
On 25 Apr, 00:56, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>

I think the problem is that Szabolcs Ferenczi thinks "it's in a
library" means it has no effect on the core language, and "it affects
the core language" means there must be some new syntax, such as a new
keyword (hey, why not overload static! :-) )

There are many things in the C library (and hence in C++) that cannot
be implemented in pure C (or C++). The standard merely specifies that
they *can* be implemented. For instance a pragma or some formally
undefined construct could have defined behaviour **in that
particular implementation**.

Things such as offsetof(), stdarg, setjmp, signal

Even i/o and malloc() require extra-standard support from
the implementation even though their semantics are well
defined.

They appear to be specifying libraries for things
like std::semaphore, but the compiler will have to
be tweaked to support these libraries. Hence the core
language definition needs to be modified.


--
Nick Keighley

In a sense, there is no such thing as a random number;
for example, is 2 a random number?
(D.E.Knuth)

James Kanze

Apr 25, 2008, 12:55:10 PM
On Apr 25, 1:43 pm, Pete Becker <p...@versatilecoding.com> wrote:
> On 2008-04-25 04:33:22 -0400, Szabolcs Ferenczi
> <szabolcs.feren...@gmail.com> said:

> >> ...
> >> Look at the section entitled "Multi-threaded executions and
> >> data races".

> > Thanks for the hint. I had a look at it. Alltogether, that
> > section considers how the compiler should react to an
> > incorrect concurrent program. By incorrect concurrent
> > program I mean again as above.

> Not just that. The terms that it defines are used in other
> places to specify the meaning of a valid C++ program.

I think that's the key. Szabolcs keeps harping about an
"incorrect concurrent program", but without something in the
language itself, we have no means of determining whether a
program is correct or not.

Erik Wikström

Apr 25, 2008, 1:23:16 PM
On 2008-04-25 10:33, Szabolcs Ferenczi wrote:
> On Apr 24, 7:05 pm, Pete Becker <p...@versatilecoding.com> wrote:
>> On 2008-04-24 12:37:40 -0400, Szabolcs Ferenczi
>> <szabolcs.feren...@gmail.com> said:
>
>> ...
>> > I remember having seen a number of arguments that day going that the
>> > library solution is not adequate (with reference to the paper of
>> > Boehm) and that C++0x goes for a language-based approach instead.
>>
>> That's not correct, as several people have told you.
>
> Several people keep saying that there are language elements for multi-
> threading in C++0x but nobody could enumerate any single language
> element so far.

They have, you just have to realise that a language element does not
have to be a keyword.

> Several people did not reply to my request to point out what are the
> language elements for concurrency in the proposed C++0x standard. When
> I make a list for them, asking (1) what are the language elements for
> starting a thread of computation, (2) by which language elements
> mutual exclusion is specified, (3) what are the language elements for
> making the threads synchronised---the answer is silence or ignorance
> from several people. (Ok, one of them has admitted that these are at
> the library level rather than at the language level.)

"This International Standard specifies requirements for implementations
of the C++ programming language." These are the very first words in
the C++ standard document. Since the very same document specifies the
C++ standard library, I have to conclude that the library is a *part* of
the language. There are a number of things in the standard library that
can not be written in pure C++, these are things that the committee
decided would be better of in the library instead of using new keywords,
that does not make them any less a part of the language.

I understand that you are not satisfied with the syntax/semantics the
committee have decided on for solving the concurrency issues, but that
does not mean that the solution is not a language solution.

--
Erik Wikström

Szabolcs Ferenczi

Apr 25, 2008, 4:49:47 PM
On Apr 25, 7:23 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:

> I understand that you are not satisfied with the syntax/semantics the
> committee have decided on for solving the concurrency issues, but that
> does not mean that the solution is not a language solution.

Good that you mention the committee. That must be the key. So it is a
new brave multi-threaded C++ designed by a committee. Just like a
horse, which is designed by a committee. Although, to the external
observer it looks like a camel, it must be a horse because that was on
the agenda of the committee.

That explains a lot. Thanks.

Best Regards,
Szabolcs

Szabolcs Ferenczi

Apr 25, 2008, 5:04:21 PM
On Apr 25, 11:57 am, James Kanze <james.ka...@gmail.com> wrote:
> ...

> No.  Several people have pointed out several times where the
> language addresses concurrency issues.

Oh yes, you are one of them who continuously keep saying that C++0x
addresses concurrency issues at the language level but who fail to put
here just a single language element for it.

Just like the Bandar-log in The Jungle Book. "We are great. We are
free. We are wonderful. ... We all say so, and so it must be true."

The Bandar-log never complete anything just like the several people
here who keep talking about the new brave language level concurrency
elements in C++0x but nobody can point out any humble element.

Nobody is able to show a single code example here:
http://groups.google.com/group/comp.lang.c++/msg/9607b37a3b0323f3
Several times I have asked for it but nobody is able to do any exact
work except keep talking like Bandar-log.

Just keep saying the Bandar-log song: Yes, we have it, we have it at
the language level and it is true because we say so.

Bravo.

Best Regards,
Szabolcs

Chris Thomasson

Apr 25, 2008, 5:36:56 PM
"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:d7b8b86c-baa5-4770...@24g2000hsh.googlegroups.com...

Do you have ANY idea who is on that committee? Did you know that Paul
McKenney is working with them?

http://www.rdrop.com/users/paulmck

He and some others are pushing for very relaxed memory barriers. Thanks to
them, we will be able to do a standard user-space RCU implementation in C++.
Pretty darn cool if you ask me!

:^)

Chris Thomasson

Apr 25, 2008, 5:38:27 PM
"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:ffce1a28-1f8f-4887...@27g2000hsf.googlegroups.com...

On Apr 25, 11:57 am, James Kanze <james.ka...@gmail.com> wrote:
> > ...
> > No. Several people have pointed out several times where the
> > language addresses concurrency issues.

> Oh yes, you are one of them who continuously keep saying that C++0x
> addresses concurrency issues at the language level but who fail to put
> here just a single language element for it.

[...]

There is a VERY CLOSE relationship between the language and the library. C++
does not need any new keyword to define low-level high-performance threading
semantics.

Bo Persson

Apr 26, 2008, 3:48:05 AM
Szabolcs Ferenczi wrote:
> On Apr 25, 11:57 am, James Kanze <james.ka...@gmail.com> wrote:
>> ...
>> No. Several people have pointed out several times where the
>> language addresses concurrency issues.
>
> Oh yes, you are one of them who continuously keep saying that C++0x
> addresses concurrency issues at the language level but who fail to
> put here just a single language element for it.

That's the beauty of C++. :-)

Seriously, the library is part of the language specification. An
implementation has to supply both a compiler (or an interpreter) and a
complete library implementation. It also has to assure that it works
according to spec.

Is it a flaw that the standard doesn't explain exactly how this is to
be done?


Bo Persson


Szabolcs Ferenczi

Apr 26, 2008, 2:45:13 PM
On Apr 26, 9:48 am, "Bo Persson" <b...@gmb.dk> wrote:
> Szabolcs Ferenczi wrote:
> > On Apr 25, 11:57 am, James Kanze <james.ka...@gmail.com> wrote:
> >> ...
> >> No. Several people have pointed out several times where the
> >> language addresses concurrency issues.
>
> > Oh yes, you are one of them who continuously keep saying that C++0x
> > addresses concurrency issues at the language level but who fail to
> > put here just a single language element for it.
>
> That's the beauty of C++.  :-)

Really? I only hope that is not the only beauty of C++.

> Seriously, the library is part of the language specification.

"Parallel programs are particularly prone to time-dependent errors,
which either cannot be detected by program testing nor by run-time
checks. It is therefore very important that a high-level language
designed for this purpose should provide complete security against
time-dependent errors by means of a compile-time check."
C. A. R. Hoare, Towards a Theory of Parallel Programming (1971)

You cannot provide this kind of compiler support with any library-
based approach. This is one of the failures of Boehm's paper: he
completely ignored this issue and now the ignorant trendy fans take
his paper as a Bible.

> An
> implementation has to supply both a compiler (or an interpreter) and a
> complete library implementation. It also has to assure that it works
> according to spec.

An implementation is not identical with the language. Implementation
and language are two different although related issues. The
implementation may provide libraries though they never belong to the
language itself, even if many people erroneously think so.

> Is it a flaw that the standard doesn't explain exactly how this is to
> be done?

Not only that. What is claimed about C++0x does not match what is
provided. Language level multi-threading is claimed but library level
multi-threading is provided. It would be more honest to claim the
truth that C++0x provides no more than what is available now in the
average library-based approaches.

Someone mentioned that it is a committee who decides on it: So, if the
committee claims they are working on a beautiful super horse but
external observers can see an average camel coming out, well, that is
discordant. However, if the committee says they are designing a camel,
well, that is fair and honest.

Best Regards,
Szabolcs

Bo Persson

Apr 26, 2008, 8:05:49 PM
Szabolcs Ferenczi wrote:
> On Apr 26, 9:48 am, "Bo Persson" <b...@gmb.dk> wrote:
>> Szabolcs Ferenczi wrote:
>>> On Apr 25, 11:57 am, James Kanze <james.ka...@gmail.com> wrote:
>>>> ...
>>>> No. Several people have pointed out several times where the
>>>> language addresses concurrency issues.
>>
>>> Oh yes, you are one of them who continuously keep saying that C++0x
>>> addresses concurrency issues at the language level but who fail to
>>> put here just a single language element for it.
>>
>> That's the beauty of C++. :-)
>
> Really? I only hope not only that's the beauty of C++.
>
>> Seriously, the library is part of the language specification.
>
> "Parallel programs are particularly prone to time-dependent errors,
> which
> either cannot be detected by program testing nor by run-time checks.
> It is therefore very important that a high-level language designed
> for this purpose should provide complete security against
> time-dependent errors by means of a compile-time check."
> C. A. R. Hoare, Towards a Theory of Parallel Programming (1971)
>
> You cannot provide this kind of compiler support with any library-
> based approach. This is one of the failure in Boehm's paper that he
> completely ignored this issue and now the ignorant trendy fans take
> his paper as a Bible.

It isn't a library based support, it is a library interface that
requires compiler support for the implementation. The standard
document describes the interface to the features, not the
implementation.

Is that your problem?


>
>> An
>> implementation has to supply both a compiler (or an interpreter)
>> and a complete library implementation. It also has to assure that
>> it works according to spec.
>
> An implementation is not identical with the language. Implementation
> and language are two different although related issues. The
> implementation may provide libraries though they never belong to the
> language itself, even if many people erroneously think so.

The standard requires an implementation to supply some libraries.
These certainly belongs to the language.

Let's quote paragraph 1 of the standard document:

"This International Standard specifies requirements for
implementations of the C++ programming language. The first such
requirement is that they implement the language, and so this
International Standard also defines C++."


>
>> Is it a flaw that the standard doesn't explain exactly how this is
>> to be done?
>
> Not only that. What is claimed about C++0x does not match with what
> is provided. Language level multi-threading is claimed but library
> level multi-threading is provided. It would be more honest to claim
> the truth that C++0x provides no more than what is available now in
> the average library-based approaches.

It is defined as a set of interfaces, library style. It certainly will
need some compiler support, just like the type_info class of the
library does.

You have noticed, haven't you, that

#include <mutex>

will let you use mutexes in C++0x, but it doesn't require <mutex> to
be a file; it could be built into the compiler (in whole or in part).

Generally, the C++ library defines an interface for the features, not
the implementation. The features, like std::vector, are allowed, but
not required, to be implemented as a library. The same goes for the
new threading primitives - they are allowed, but not required, to be
implemented in the compiler or in the library, or as a combination.

>
> Someone mentioned that it is a committee who decides on it: So, if
> the committee claims they are working on a beautiful super horse but
> external observers can see an average camel coming out, well, that
> is discordant. However, if the committee says they are designing a
> camel, well, that is fair and honest.

The committee tries to define the common interface to a possibly
four-legged animal that can carry your goods. It doesn't prescribe the
number of humps the animal must have. Perhaps even a small mule will
work, for embedded systems?


Bo Persson


kwikius

Apr 26, 2008, 9:48:56 PM

"Erik Wikström" <Erik-w...@telia.com> wrote in message
news:8soQj.6549$R_4....@newsb.telia.net...

> On 2008-04-25 10:33, Szabolcs Ferenczi wrote:

<...>

>> Several people did not reply to my request to point out what are the
>> language elements for concurrency in the proposed C++0x standard. When
>> I make a list for them, asking (1) what are the language elements for
>> starting a thread of computation, (2) by which language elements
>> mutual exclusion is specified, (3) what are the language elements for
>> making the threads synchronised---the answer is silence or ignorance
>> from several people. (Ok, one of them has admitted that these are at
>> the library level rather than at the language level.)
>
> "This International Standard specifies requirements for implementations
> of the C++ programming language." These are the very first words in
> the C++ standard document. Since the very same document specifies the
> C++ standard library, I have to conclude that the library is a *part* of
> the language. There are a number of things in the standard library that
> cannot be written in pure C++; these are things that the committee
> decided would be better off in the library instead of using new keywords,
> that does not make them any less a part of the language.
>
> I understand that you are not satisfied with the syntax/semantics the
> committee have decided on for solving the concurrency issues, but that
> does not mean that the solution is not a language solution.

FWIW, an interesting talk on C++0x from last year, including
concurrency issues, by Lawrence Crowl

http://www.youtube.com/watch?v=ZAG5txfYnW4


regards
Andy Little


Markus Elfring

Apr 28, 2008, 3:35:48 AM
> You're right that C++0x has a high level library. You're not right that
> it's a "pure library approach". The library can't be implemented
> correctly without compiler support, in the form of a stricter memory
> model that makes it possible to define and to reason about the order of
> operations in a multi-threaded program.

I hope that a formal notation can be achieved. Were the open issues really
clarified more than we all know from the Pthreads memory model so far?
http://groups.google.de/group/comp.programming.threads/msg/61f57419ec0d87e5


> Look at the section entitled "Multi-threaded executions and data races".

Will C language committee members also agree on details from the document
"Concurrency memory model"?
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2429.htm#races

Regards,
Markus

James Kanze

Apr 28, 2008, 6:14:45 AM
On Apr 25, 11:04 pm, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 25, 11:57 am, James Kanze <james.ka...@gmail.com> wrote:

> > ...
> > No. Several people have pointed out several times where the
> > language addresses concurrency issues.

> Oh yes, you are one of them who continously keep saying that
> C++0x adresses concurrency issues at the language level but
> who fail to put here just a single language element for it.

We've been pointing out language issues constantly. The fact
that you don't read them isn't our problem.

> Just like tha Bandar-log in The Jungle Book. "We are great. We
> are free. We are wonderful. ... We all say so, and so it must
> be true."

> The Bandar-log never complete anything just like the several
> people here who keep talking about the new brave language
> level concurrency elements in C++0x but nobody can point out
> any humble element.

The definition of the memory model. The most important aspect
of multithreading.

James Kanze

Apr 28, 2008, 6:20:16 AM
On Apr 28, 9:35 am, Markus Elfring <Markus.Elfr...@web.de> wrote:
> > You're right that C++0x has a high level library. You're not right that
> > it's a "pure library approach". The library can't be implemented
> > correctly without compiler support, in the form of a stricter memory
> > model that makes it possible to define and to reason about the order of
> > operations in a multi-threaded program.

> I hope that a formal notation can be achieved. Were the open
> issues really clarified more than we all know from the
> Pthreads memory model so

> far? http://groups.google.de/group/comp.programming.threads/msg/61f57419ec...

Yes. Times change, and we've learned enough from other
languages to be able to precisely define a reasonable memory
model.

> > Look at the section entitled "Multi-threaded executions and
> > data races".

> Will C language committee members also agree on details from the document
> "Concurrency memory model"? http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2429.htm#races

I believe so. In this case, there is constant discussion
between the two committees, and the intent is that the language
elements will be acceptable to both.

Boehm, Hans

Apr 28, 2008, 2:42:21 PM
On Apr 21, 10:07 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:
>

> Concurrency library + adjusted sequential language != concurrent
> language
>
In my mind, it depends on the adjustments, and the concurrent
language.

When we started the C++ effort, there were several problems in trying
to write concurrent C++ programs. People disagree on the priorities,
but I'd personally order them as:

1) Since C++ was a single-threaded language, there was no attempt to
define the meaning of shared variables. Even if you added the Posix
spec, there was no guarantee that compilers could not introduce
"speculative" writes that add data races to the naive interpretation
of the source code, and change program behavior. This did happen very
occasionally in completely unpredictable ways. It happened more
frequently in the form of a structure field update rewriting adjacent,
concurrently modified, fields. Posix did not clearly prohibit these,
either. And for structure fields, you need a more complicated rule to
deal with bit-fields (a problem that Java doesn't have, and as far as
I can tell, Ada punts on the equivalent issue there).

The C++0x WP contains a solution to this that is simple, unless you
need the performance of the explicitly ordered ("low level") atomic
operations. You get sequential consistency (thread steps appear
simply interleaved) for data-race-free programs, and undefined
semantics for programs with data races. Data races are bascially
simultaneous accesses, one of them an update, to non-atomic scalars,
with the one complication that contiguous sequences of bit-fields
count as a single location in determining the existence of a data
race.

If you need low-level atomics, things are very tricky anyway, and you
will actually have to read the standard or the committee papers (try
N2480).

Aside from the low-level atomics, this is similar to the Ada
approach. And Ada83 probably was the first major language to
explicitly take this route. (There are some tricky issues in making
this work for try_lock(), but I couldn't find an equivalent of that in
Ada95, so this issue may simply not have arisen.) It's consistent
with the Java approach, as revised in 2005, though Java needs to
address the much harder, and in my opinion still not completely
solved, problem of giving partial semantics to programs with races.

2) There was no portable way to get access to hardware atomic
operations (including simple atomic loads and stores). In practice,
this seems to often be necessary, for everything from maintaining a
simple global counter, to the commonly used (and often misimplemented)
"double-checked-locking" idiom. This is fixed in the C++ working
paper, with the addition of atomic<T>, etc. This is roughly analogous
to Java volatiles. (Caution: In C++, increments of atomics are atomic
operations; increments of Java volatiles contain two atomic
operations, and are not atomic. I think Ada behaves like Java here.)

Ada95 also has atomic operations, but if I read the spec correctly,
they seem to have largely overlooked the memory ordering issues. In
particular, if I write the equivalent of

Thread 1:
x = 42;
x_init = true;

Thread 2:
while (!x_init);
assert(x == 42);

this can fail (and in fact is incorrect) even if x_init is atomic.
More interestingly, I think something like RCU can't be made to work
at all. Neither does passing data between threads through a message
queue implemented with atomic objects. Of course, nobody else got
this right in 1995 either.

3) There was no portable API for thread creation and synchronization.
The current WP has one that is largely a Boost descendent, allowing
portable code to create threads. Work on higher level concurrency
APIs was explicitly postponed.

I think (1) and often (2) are essential for a useful concurrent
language. But languages designed for concurrency from the start
didn't always get them right either.

Hans

Markus Elfring

Apr 28, 2008, 5:03:53 PM
> I think (1) and often (2) are essential for a useful concurrent
> language. But languages designed for concurrency from the start
> didn't always get them right either.

Are linkers also part of related considerations?
Are specifications needed to prevent unwanted instruction reordering during the
linking process?

Regards,
Markus

James Kanze

Apr 29, 2008, 3:49:09 AM

The C++ standard does not distinguish linking as a separate
operation. It's one of the "phases of translation". The C++
standard specifies behavior for a legal program, and gives the
programmer certain guarantees, without regard to who does what
in any particular implementation's translation process.

Markus Elfring

unread,
Apr 30, 2008, 1:55:21 AM4/30/08
to
> The C++ standard does not distinguish linking as a separate
> operation. It's one of the "phases of translation". The C++
> standard specifies behavior for a legal program, and gives the
> programmer certain guarantees, without regard to who does what
> in any particular implementation's translation process.

Are you generally looking for instruction reordering prevention that should be
supported by compilers and various linkers?
Are there still any dangers for the concurrency memory model because of
potential optimisations?

Regards,
Markus

Ian Collins

unread,
Apr 30, 2008, 1:59:32 AM4/30/08
to
Please retain the attributions; it's rude to snip them, and it makes
following the thread difficult.

--
Ian Collins.

Szabolcs Ferenczi

unread,
Apr 30, 2008, 6:23:58 AM4/30/08
to
On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:

Let me put forward that all your problems are coming from the facts
that in C++0x:

(1) you are still going to solve multi-threading at the library level;
and

(2) your only concern is the tuning of the OPTIMISATION of the
compilers which is developed for the SEQUENTIAL execution in the first
place.

> On Apr 21, 10:07 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
>
> > Concurrency library + adjusted sequential language != concurrent
> > language
>
> In my mind, it depends on the adjustments, and the concurrent
> language.

You are right: If the adjustments are just made for the sake of
changing the *sequential* optimisation rules of the compiler that
supports the language, it is not very adequate. In that case, no matter
how hard you try to adjust, it will always remain a sequential
language, that is, until you address concurrency at the language level.

You say it depends on the concurrent language. Well, a concurrent
language must contain language elements for the issues arising from the
concurrent, i.e. simultaneous, execution of parts of the program.
These are:

1) Defining and starting the threads of computation

2) Separating the shared resources and the resources local to a single
process

3) Synchronisation and communication of the simultaneously executed
parts
3.a) Must provide means for mutual exclusion at the language level
3.b) Must handle non-determinism at the language level

Item (1): In a procedural language this can be a kind of a parallel
statement, e.g.

parallel S;
else (int i=0..pmax) P(i);
else {Q; L;}
end

In object-oriented languages the threads of computation can be
combined with objects resulting in some form of active objects, see
for instance the language proposal Distributed Processes.
http://brinch-hansen.net/papers/1978a.pdf

If the process is marked at the language level, the compiler can check
whether the process accesses local and shared resources properly.


Item (2): In a procedural language a single keyword like `shared' or
`resource' may help as a property to the types. In object-oriented
languages the natural unit of marking with the shared property is the
class.

If the shared variables are marked, the compiler can check whether the
processes access the shared resources properly, i.e. excluding each
other.


Item (3): In a well designed concurrent language most probably you can
find an adapted form of Dijkstra's Guarded Commands to deal with non-
determinism (see:
http://www.cs.utexas.edu/users/EWD/transcriptions/EWD04xx/EWD418.html)
GC has been already adapted to message communication (see
Communicating Sequential Processes and its language realisation OCCAM)
as well as to shared memory communication (see Edison or Distributed
Processes).
http://brinch-hansen.net/papers/1981b.pdf
http://brinch-hansen.net/papers/1978a.pdf

You can find GC adapted in Ada too.

In an object-oriented language, Guarded Commands could be combined
with classes and Conditional Critical Regions (C. A. R. Hoare, Towards
a Theory of Parallel Programming, 1971), something like this:

shared class A {
    int i, k;
public:
    A() : i(0), k(0) {}
    void foo() {
        when (i > 10) {
            S;
        }
        else (i > k) {
            P;
        }
        else (k > i) {
            Q;
        }
    }
    ...
};

Class A being a shared class means that private members `i' and `k'
are shared variables and public methods are Critical Regions already.
So without classes, in a C-like language it would look something like
this:

shared int i = 0, k = 0;
void foo() {
    with (i, k) {
        when (i > 10) {
            S;
        }
        else (i > k) {
            P;
        }
        else (k > i) {
            Q;
        }
    }
}

Note that if some notations like the ones shown above are used, the
compiler can easily check whether a shared variable is accessed in a
wrong way or in a proper way.

The compiler can also optimise how it translates a Conditional
Critical Region the most optimal way on a given platform. This is,
however, not sequential optimisation any more. Neither is it about
suppressing sequential optimisations here and there. However,
sequential optimisations can be used unrestricted in parts of the
processes which parts are working on local variables only. In a
concurrent language it is clear to the compiler what parts these are.
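Lacking a `when` construct, today's C++ programmer has to encode a conditional critical region by hand with a mutex and a condition variable. A sketch (all names hypothetical); note that nothing stops other code from touching `i` or `k` without taking the mutex, which is precisely the compile-time check being argued for:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Manual encoding of a conditional critical region over shared i, k.
struct SharedIK {
    std::mutex m;
    std::condition_variable cv;
    int i = 0, k = 0;

    // "when (i > 10) { S; }": wait until the guard holds, then act.
    void foo() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return i > 10; });
        ++k;  // stand-in for S
    }

    void set_i(int v) {
        { std::lock_guard<std::mutex> lock(m); i = v; }
        cv.notify_all();  // let waiters re-evaluate their guards
    }
};

int run_ccr_demo() {
    SharedIK s;
    std::thread t(&SharedIK::foo, &s);
    s.set_i(11);   // makes the guard true
    t.join();
    return s.k;    // foo ran its body exactly once
}
```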

What I have shown above are just examples of how concurrency can be
addressed at the language level. I am not claiming that you should
include exactly these elements in C++0x. However, I do claim that
you should address concurrent programming at the language level in
C++0x, otherwise C++ will lag behind in concurrent programming.

Now, as you can see:

Concurrency library + adjusted sequential language != concurrent
language

Quod erat demonstrandum

Best Regards,
Szabolcs

James Kanze

unread,
Apr 30, 2008, 7:25:38 AM4/30/08
to

I'm not sure I understand the question. The current rules allow
considerable reordering, and don't take threading issues into
consideration, which means that you can't write concurrent code
without additional, implementation-defined guarantees.

Markus Elfring

unread,
Apr 30, 2008, 3:35:30 PM4/30/08
to
Szabolcs Ferenczi schrieb:

> Let me put forward that all your problems are coming from the facts
> that in C++0x:
> (1) you are still going to solve multi-threading at the library level;
> and
>
> (2) your only concern is the tuning of the OPTIMISATION of the
> compilers which is developed for the SEQUENTIAL execution in the first
> place.

Why does the concurrency memory model not seem to be an important part of the
software development game in your view?

Regards,
Markus

Szabolcs Ferenczi

unread,
Apr 30, 2008, 4:20:29 PM4/30/08
to
On Apr 30, 9:35 pm, Markus Elfring <Markus.Elfr...@web.de> wrote:
...

> Why does the concurrency memory model not seem to be an important part of the
> software development game in your view?

Can you summarise here the so-called concurrency memory model?

If you do so, we can talk about it but first we should agree what we
are talking about.

How does the concurrency memory model, in your view, make it possible
for the compiler to make basic checks about whether shared variables
are accessed inside or outside Critical Regions?

How does the concurrency memory model, in your view, help the compiler
determine whether a piece of code is part of a particular thread of
execution?

Best Regards,
Szabolcs

Boehm, Hans

unread,
May 1, 2008, 1:21:15 AM5/1/08
to
On Apr 30, 3:23 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:
>
> Let me put forward that all your problems are coming from the facts
> that in C++0x:
>
> (1) you are still going to solve multi-threading at the library level;
> and
No, though we still have library calls to create threads and for
synchronization operations.

>
> (2) your only concern is the tuning of the OPTIMISATION of the
> compilers which is developed for the SEQUENTIAL execution in the first
> place.
No. I am concerned about optimizations (hardware and compiler). But
even in most more commonly used concurrent languages, it is tricky to
define what optimizations are allowed, and what the user can rely on.
See the recent work on the Java memory model, for example. And, as I
said in the last message, I don't think Ada has gotten this quite
right in the presence of atomic variables.

I think you're advocating a particular kind of concurrent language
that does not provide any kind of variable that can be safely accessed
concurrently by multiple threads, i.e. nothing like Java volatiles or
C++ atomics. Everything must be protected by locks or a similar
mechanism. That does simplify things. I would argue that the result
is often impractical. If you don't believe that, try reference
counting with the count operations protected by locks.
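The reference-counting point is easy to make concrete. A sketch (not code from any proposal) contrasting the two styles: the lock-based version pays a lock/unlock pair for every count update, while the atomic version is a single hardware read-modify-write:

```cpp
#include <atomic>
#include <mutex>

// Lock-protected count: two synchronization operations per update.
struct LockedRefCount {
    std::mutex m;
    long count = 1;
    void acquire() { std::lock_guard<std::mutex> g(m); ++count; }
    bool release() { std::lock_guard<std::mutex> g(m); return --count == 0; }
};

// Atomic count: a single atomic read-modify-write per update.
struct AtomicRefCount {
    std::atomic<long> count{1};
    void acquire() { count.fetch_add(1, std::memory_order_relaxed); }
    bool release() {
        // acq_rel so the thread that drops the last reference sees
        // all writes made by the other owners before destruction.
        return count.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }
};
```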

However, even if you go this route (and for many programs that's
fine), the problem does not go away completely. Consider

Thread 1:
x = 42;
lock(l);

Thread 2:
while (trylock(l) == SUCCESS) unlock(l);
r1 = x;

Is this allowable? Does it guarantee r1 == 42? The answer can have
substantial effect on the cost of the lock() implementation. C++0x
resolves it in a new and interesting way.

You can of course make this problem go away too by moving to ever more
restrictive languages, in which you can't express something like
trylock(), or cannot even express code that might involve races. I
think neither is practical, in that it doesn't let me write code that
commonly needs to be written, unless we abandon the shared-memory,
lock-based programming model altogether. I think all widely used
concurrent languages and thread libraries allow me to write both
trylock (or a lock with a timeout) and data races.

For example, I need to be able to write code that initializes an
object in one thread without holding a lock, makes a shared pointer p
point to it, reads p in another thread, and then access the referenced
object in the second thread, again without holding a lock. (The
accesses to p may be lock protected.) This involves no data race.
But it's hard to tell that statically.

I also need to be able to protect a bunch of objects with a smaller
set of locks, by hashing the object address to an entry in an array of
locks, or do hand-over-hand locking on a linked list.
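The lock-array idiom described here can be sketched as follows (names and sizes are illustrative; the point is the address-to-lock mapping, which no race-free-by-construction syntax easily expresses):

```cpp
#include <cstddef>
#include <functional>
#include <mutex>

// A small fixed pool of locks; each object is protected by whichever
// lock its address hashes to. Many objects, few locks.
class LockPool {
    static const std::size_t N = 16;
    std::mutex locks[N];
public:
    std::mutex& lock_for(const void* obj) {
        std::size_t h = std::hash<const void*>()(obj);
        return locks[h % N];
    }
};

int shared_value = 0;
LockPool pool;

void guarded_increment(int* p) {
    // Lock the stripe that covers p, not a per-object mutex.
    std::lock_guard<std::mutex> g(pool.lock_for(p));
    ++*p;
}
```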

That does mean that we would like other tools to detect data races,
since the compiler can't do so. Unfortunately that seems to be hard
precisely because syntactic disciplines that preclude races are too
restrictive.

Hans

Owen Jacobson

unread,
May 1, 2008, 2:07:45 AM5/1/08
to
On Apr 25, 5:04 pm, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 25, 11:57 am, James Kanze <james.ka...@gmail.com> wrote:
>
> > ...
> > No.  Several people have pointed out several times where the
> > language addresses concurrency issues.
>
> Oh yes, you are one of them who continously keep saying that C++0x
> adresses concurrency issues at the language level but who fail to put
> here just a single language element for it.

If the only "language element" your world-view will admit is a new
syntactic construct, then you're right and the new C++ standard does
not contain any language elements to support threading. However,
that's an extremely limiting definition, not shared by very many
people.

A comprehensive memory model is *required* for correct threaded code,
and it's something C++ does not yet have. At the most basic level, a
memory model is a set of rules dictating what writes are and are not
allowed to affect a given read. In the following simple example:

struct foo {
    short a;
    short b;
};

foo a_foo;

a memory model provides hard and fast rules about whether or not a
read of a_foo.b is allowed to be affected by writes to a_foo.a, or to
a_foo itself. Note that this doesn't necessarily need to involve
threads in the definition: the rules will hold under all (valid)
executions of the program possible for a conforming implementation,
including multithreaded executions.
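For this struct, the rules as eventually standardized make distinct non-bit-field members distinct memory locations, so two threads may write `a` and `b` concurrently without a data race and without clobbering each other. A sketch of what that rule guarantees:

```cpp
#include <thread>

struct foo {
    short a;
    short b;
};

foo a_foo{0, 0};

// Under the adopted memory model, a and b are separate memory
// locations: concurrent writes to them by different threads are
// race-free and cannot contaminate each other (unlike adjacent
// bit-fields, which may share a location).
void run_adjacent_field_demo() {
    std::thread ta([] { for (int i = 0; i < 1000; ++i) a_foo.a = 1; });
    std::thread tb([] { for (int i = 0; i < 1000; ++i) a_foo.b = 2; });
    ta.join();
    tb.join();
}
```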

If there is a comprehensive memory model that allows it (as with the
one in the upcoming C++ standard), *then* a library can provide the
threading primitives that are correct with respect to that memory
model. Without the rules a memory model provides, you can't state
that

struct bar {
    short a;
    short b;

    pthread_mutex_t a_mtx;
    pthread_mutex_t b_mtx;
};

bar a_bar;

void thread_a () {
    pthread_mutex_lock (&a_bar.a_mtx);
    a_bar.a++;
    pthread_mutex_unlock (&a_bar.a_mtx);
}

void thread_b () {
    pthread_mutex_lock (&a_bar.b_mtx);
    a_bar.b = 5;
    std::cout << a_bar.b << std::endl;
    pthread_mutex_unlock (&a_bar.b_mtx);
}

is either correct *or* incorrect if thread_a and thread_b are called
on different threads, because nothing guarantees that the write to
a_bar.a will not affect a_bar.b.

*That's* the nature of the language support being added. It's not
about syntax: it's about semantics and rules. The tools for creating
and synchronizing threads are being added to the library because there
is no need to modify the language to support them, and because
modifying the C++ language itself is fraught with peril. The language
is being extended to provide rules that allow the library to be both
portable and correct.

-o

Markus Elfring

unread,
May 1, 2008, 10:09:06 AM5/1/08
to
Szabolcs Ferenczi schrieb:

> Can you summarise here the so-called concurrency memory model?

It specifies the rules under which actions are correctly performed on memory
locations in the context of multi-threaded executions.


> How does the concurrency memory model, in your view, make it possible
> for the compiler to make basic checks about whether shared variables
> are accessed inside or outside Critical Regions?

The tool will check if function/method implementations are compliant to the
fundamental rules.


> How does the concurrency memory model, in your view, help the compiler
> determine whether a piece of code is part of a particular thread of
> execution?

I guess that the tool can only determine this relationship if
whole-program analysis were applied.

Regards,
Markus

Szabolcs Ferenczi

unread,
May 1, 2008, 1:12:54 PM5/1/08
to
On May 1, 7:21 am, "Boehm, Hans" <hans.bo...@hp.com> wrote:
> On Apr 30, 3:23 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:> On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:
>
> > Let me put forward that all your problems are coming from the facts
> > that in C++0x:
>
> > (1) you are still going to solve multi-threading at the library level;
> > and
>
> No, though we still have library calls to create threads and for
> synchronization operations.

If you `still have library calls to create threads and for
synchronization operations' that exactly means: threading is at the
library level. (I cannot understand why you have to start your
comments with the word `no' even when you confirm what I claim.)

That you support threading at the library level is too bad since with
respect to the concurrency it is like some assembly language
programming:

MT library calls = assembly programming

In words: Threading at the library level can be compared to assembly
programming. In both cases you miss the support of a compiler. In both
cases you think you are in control of the details. In both cases the
control of the details prevents you from building large applications.

> > (2) your only concern is the tuning of the OPTIMISATION of the
> > compilers which is developed for the SEQUENTIAL execution in the first
> > place.
>
> No.  I am concerned about optimizations (hardware and compiler).

What do you deny here again with `no'? Your statement tells the same
as point (2).

> But
> even in most more commonly used concurrent languages, it is tricky to
> define what optimizations are allowed, and what the user can rely on.

You mention the commonly used concurrent languages. What are they?

Please note that Java is not designed properly for concurrency either.
So it is just false to prove something with another badly designed
language.

Let me call your attention to that not only I claim that Java is not
well designed for concurrency. See Java's Insecure Parallelism, Per
Brinch Hansen (1999).
http://brinch-hansen.net/papers/1999b.pdf

"The author concludes that Java ignores the last twenty-five years of
research in parallel programming languages."

Therefore, it is simply funny that some guys on this discussion list
refer to the state-of-the-art in concurrency and they mean the same
things Java has.

> See the recent work on the Java memory model, for example.

I am afraid that in C++0x you are trying to copy the wrong way Java
develops. Besides being badly designed for concurrency, Java brings in
this false concern what they refer to as the memory model. The same
mistake is there in Java from the concurrency point of view. Exactly
the same what you are dealing with: In Java with the so-called memory
model they try to fix a buggy concurrent program where shared
variables are accessed without any protection.

The concern about memory model is some low level consideration that
can be taken into account in the implementation of the high level
elements of a programming language.

> And, as I
> said in the last message, I don't think Ada has gotten this quite
> right in the presence of atomic variables.

Yes, you have said that really. However, you did not tell whether you
consider an erroneous Ada program or a correct one from the concurrent
programming point of view. Please write an example in Ada illustrating
what do you mean.

> I think you're advocating a particular kind of concurrent language
> that does not provide any kind of variable that can be safely accessed
> concurrently by multiple threads, i.e. nothing like Java volatiles or C
> ++ atomics.

Well, the small grained atomic operations is just a very small part of
the problem domain in concurrent programming. Java volatiles do not
provide atomicity for you except for atomic write and atomic read in
case of the most simple types. For instance the increment of a
volatile is not atomic even with the simple type integer.

The usability of the proposed C++ atomics is limited to certain kind
of problems. They cannot be regarded as general language means. Atomic
operations start to be useful when the check and the set are atomic
together. That is, however, a built-in small Critical Region. You see,
in this case the Critical Region may not be implemented with locking.
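The atomic "check and set together" described here is compare-and-swap, which the working paper exposes on atomics as compare_exchange. A minimal sketch (names of my own choosing):

```cpp
#include <atomic>

std::atomic<int> slot{0};

// Atomically: "if slot still holds `expected`, set it to `desired`".
// The test and the assignment are one indivisible step -- the small
// built-in critical region described above, with no lock involved.
bool claim(int expected, int desired) {
    return slot.compare_exchange_strong(expected, desired);
}
```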

> Everything must be protected by locks or a similar
> mechanism.  That does simplify things.  I would argue that the result
> is often impractical.

I would not say that `everything must be protected by locks'.

In fact locks should not appear at the language level. Locks are
library level elements. You do not need them at the language level.
Well, at least you do not need them in a well designed concurrent
language.

What I claim is that a concurrent language must provide some language
level means to specify that the intention of the programmer is to
define some block where he or she wants no data race but mutual
exclusion. Usually the language level means for this is the Critical
Region. A Critical Region must not necessarily be implemented by
locking. It can be at the discretion of the compiler (optimisation)
what is the most appropriate implementation of the language level
Critical Region on a given architecture.

Best Regards,
Szabolcs

Szabolcs Ferenczi

unread,
May 1, 2008, 1:54:11 PM5/1/08
to
On May 1, 7:21 am, "Boehm, Hans" <hans.bo...@hp.com> wrote:
> On Apr 30, 3:23 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:> On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:

> [...]


> However, even if you go this route (and for many programs that's
> fine), the problem does not go away completely.  Consider
>
> Thread 1:
> x = 42;
> lock(l);
>
> Thread 2:
> while (trylock(l) == SUCCESS) unlock(l);
> r1 = x;
>
> Is this allowable?

Not at all. It is not only a buggy concurrent program but a very
inefficient one too. It locks and unlocks an object while busy
waiting.

> Does it guarantee r1 == 42?

Not in all circumstances. It is timing dependent, which is a typical
concurrent bug. Who can guarantee that, just when one thread is
locking on `l', another one would not overwrite `x' with another value?
The `x' is meant to be a shared variable, but it is not accessed in a
safe way. It is a timing-dependent construction and therefore an
unsafe one.

> The answer can have
> substantial effect on the cost of the lock() implementation.

My answer is that it has a concurrency bug and it must be fixed.
Besides, I do not think that the cost of an operation should have any
effect on the functional behaviour.

> C++0x
> resolves it in a new and interesting way.

How does C++0x resolve it then? I am curious.

How will you prevent another thread changing the value `x' as I
described above?

> You can of course make this problem go away too by moving to ever more
> restrictive languages, in which you can't express something like
> trylock(), or cannot even express code that might involve races.

The lock does not belong to the language level unless you are dealing
with some kind of assembly-level programming. In a high-level language
you do not instruct the computer what to do; rather, you express the
conditions you would like to achieve. Leave it to the compiler to
instruct the machine at a low level.

"There are two views of programming. In the old view it is regarded as
the purpose of our programs to instruct our machines; in the new one
it will be the purpose of our machines to execute our programs."
E.W. Dijkstra, Comments at a Symposium (1975)
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD512.html

> I
> think neither is practical, in that it doesn't let me write code that
> commonly needs to be written, unless we abandon the shared-memory,
> lock-based programming model altogether.

What you mean by a `restrictive language' is the one that restricts
you in committing the most common concurrent programming errors.

"Parallel programs are particularly prone to time-dependent errors,
which
either cannot be detected by program testing nor by run-time checks.
It is therefore very important that a high-level language designed for
this purpose should provide complete security against time-dependent
errors by means of a compile-time check."

C. A. R. Hoare, Towards a Theory of Parallel Programming (1971)

"Well, if we cannot make concurrent programs work by proofreading
or testing, then I can see only one other effective method at the
moment:
to write all concurrent programs in a programming language that is so
structured that you can specify exactly what processes can do to
shared
variables and depend on a compiler to check that the programs satisfy
these assumptions. Concurrent Pascal is the first language that makes
this possible."
Per Brinch Hansen: The Architecture of Concurrent Programs (1977)

> I think all widely used
> concurrent languages and thread libraries allow me to write both
> trylock (or a lock with a timeout) and data races.

I would say that no decent concurrent programming language would allow
you to instruct the machine at a level as low as locking. On the other
hand, low level thread libraries do allow you to write both trylock
and data races.

> For example, I need to be able to write code that initializes an
> object in one thread without holding a lock, makes a shared pointer p
> point to it, reads p in another thread, and then access the referenced
> object in the second thread, again without holding a lock.  (The
> accesses to p may be lock protected.)  This involves no data race.
> But it's hard to tell that statically.

It does involve a data race and there is a way to solve it. So, your
requirement is that one thread constructs an object, passes its
address to another thread, and the other thread has exclusive access
to it. Let us re-state the problem in a high level language notation:

shared struct {
    Ob *x;
    bool constructed;
} res {NULL, false};

Thread 1:
with (res) {
    constructed = true;
    x = new Ob();
}

Thread 2:
with (res) {
    when (constructed == true) {
        // manipulate x
    }
}
}

Now it is up to an optimising compiler to insert locks or any other
means to implement mutual exclusion. We have just expressed by
language means what we are going to achieve. The rest is up to the
compiler.

> I also need to be able to protect a bunch of objects with a smaller
> set of locks, by hashing the object address to an entry in an array of
> locks, or do hand-over-hand locking on a linked list.

I am 100% sure that the problem to be solved does not specify that you
must use locks.

The problem might require that you arrange your algorithm in a way
that those bunch of objects should be accessed in a mutually exclusive
way. Then it is better if the language allows you to express the real
requirement and you do not have to over-specify the solution. Nothing
prevents you from applying a smaller granularity of mutual-exclusion
specification, as required by the problem you are solving.

> That does mean that we would like other tools to detect data races,
> since the compiler can't do so.  Unfortunately that seems to be hard
> precisely because syntactic disciplines that preclude races are too
> restrictive.

The compiler can do a lot of checks for you including preventing data
races provided the language is well designed with respect to
concurrency. The language must include means to mark shared variables
and Critical Regions. That is the key to it.

Best Regards,
Szabolcs

Boehm, Hans

unread,
May 1, 2008, 2:20:38 PM5/1/08
to
On May 1, 10:12 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>

wrote:
> On May 1, 7:21 am, "Boehm, Hans" <hans.bo...@hp.com> wrote:
>
> > On Apr 30, 3:23 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:> On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:
>
> > > Let me put forward that all your problems are coming from the facts
> > > that in C++0x:
>
> > > (1) you are still going to solve multi-threading at the library level;
> > > and
>
> > No, though we still have library calls to create threads and for
> > synchronization operations.
>
> If you `still have library calls to create threads and for
> synchronization operations' that exactly means: threading is at the
> library level. (I cannot understand why do you have to start your
> comments with the word `no' even if you confirm what I claim.)
We're arguing about terminology here. The point is that no matter
what thread creation etc. looks like, the language semantics, and in
particular, the semantics of shared variables, have to talk about
concurrency. In the C++0x working paper, they do. If you call that
"threading at the library level", fine.

>
> That you support threading at the library level is too bad since with
> respect to the concurrency it is like some assembly language
> programming:
>
> MT library calls = assembly programming
>
> In words: Threading at the library level can be compared to assembly
> programming. In both cases you miss the support of a compiler. In both
> cases you think you are in control of the details. In both cases the
> control of the details prevents you from building large applications.
The point is that I don't know how to get useful compiler support
without severely restricting the utility of the language. You do want
the compiler to ensure, for example, that locks are released when an
exception is thrown. But that's easily handled in C++0x without
additional language syntax. And things like avoiding data races are
hard, whether or not you add language syntax.

>
> > > (2) your only concern is the tuning of the OPTIMISATION of the
> > > compilers which is developed for the SEQUENTIAL execution in the first
> > > place.
>
> > No. I am concerned about optimizations (hardware and compiler).
>
> What do you deny here again with `no'? Your statement tells the same
> as point (2).
No, the problem is not limited to sequential languages:

>
> > But
> > even in most more commonly used concurrent languages, it is tricky to
> > define what optimizations are allowed, and what the user can rely on.
>
> You mention the commonly used concurrent languages. What are they?
>
> Please note that Java is not designed properly for concurrency either.
> So it is just false to prove something with another badly designed
> language.
Clearly Java is a concurrent language. See the original Java Language
Specification. Whether or not you think it's well designed is a
different issue. But it seems to me it's probably the most widely
used one, probably followed by C# and then Ada as a distant third.

>
> Let me call your attention to that not only I claim that Java is not
> well designed for concurrency. See Java's Insecure Parallelism, Per
> Brinch Hansen (1999).http://brinch-hansen.net/papers/1999b.pdf

>
> "The author concludes that Java ignores the last twenty-five years of
> research in parallel programming languages."
>
> Therefore, it is simply funny that some guys on this discussion list
> refer to the state-of-the-art in concurrency and they mean the same
> things Java has.
Thanks for the reference.

However, it seems to me that there is really a pretty straightforward
trade-off between expressivity and safety here. You can statically
detect data races (Brinch Hansen's approach), or you can have a
language that's roughly as expressive as the mainstream concurrent
languages, and can express things like non-nested locking, hashing of
objects to locks, reuse of existing sequential libraries inside
critical sections, etc. I wish I knew how to do both, but I don't.
If you want Concurrent Pascal, fine. But the standards committee can't
turn C++ into Concurrent Pascal.


>
> > See the recent work on the Java memory model, for example.
>
> I am afraid that in C++0x you are trying to copy the wrong way Java
> develops. Besides being badly designed for concurrency, Java brings in
> this false concern what they refer to as the memory model. The same
> mistake is there in Java from the concurrency point of view. Exactly
> the same what you are dealing with: In Java with the so-called memory
> model they try to fix a buggy concurrent program where shared
> variables are accessed without any protection.
>
> The concern about memory model is some low level consideration that
> can be taken into account in the implementation of the high level
> elements of a programming language.

At a serious cost in what you can express in the resulting language.
And not a cost that any mainstream languages have been willing to
incur.

>
> > And, as I
> > said in the last message, I don't think Ada has gotten this quite
> > right in the presence of atomic variables.
>
> Yes, you have said that really. However, you did not tell whether you
> consider an erroneous Ada program or a correct one from the concurrent
> programming point of view. Please write an example in Ada illustrating
> what do you mean.

I am not an Ada programmer, and hence won't attempt the syntax,
especially in a C++ newsgroup. As I said in my earlier message:

Thread 1:
x = 42; x_init = true;

Thread 2:
while (!x_init); assert(x == 42);

where x_init is declared atomic with the appropriate pragma, is an
interesting example. I believe the assignment in thread 1 to x_init
does not "signal" (in the sense of 9.10, Ada95 RM) the final read of
x_init in thread 2. Hence the accesses to x are not sequential, and
this is erroneous. I'd argue that this is (a) easy to implement, (b)
unexpected, and (c) fairly useless to programmers. For example, it
means that Ada atomic variables cannot be used to implement
double-checked locking. (Double-checked locking is both a sufficient
performance win and sufficiently widely used that I think it needs to
be supported without resorting to assembly code. Of course, you can't
write it in Concurrent Pascal either.)
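For reference, the double-checked locking pattern alluded to here, written against the atomics in the working paper (a sketch in the later C++11 syntax; the Widget type is a stand-in, and the singleton is never freed in this sketch):

```cpp
#include <atomic>
#include <mutex>

struct Widget { int value = 42; };

std::atomic<Widget*> instance{nullptr};
std::mutex init_mutex;

// Double-checked locking: the fast path is one acquire load with no
// lock taken; the mutex is held only during first-time initialization.
Widget* get_instance() {
    Widget* p = instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> g(init_mutex);
        p = instance.load(std::memory_order_relaxed);
        if (p == nullptr) {        // re-check under the lock
            p = new Widget;
            instance.store(p, std::memory_order_release);
        }
    }
    return p;
}
```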

[Discussion of atomics omitted. See last message.]


>
> What I claim is that a concurrent language must provide some language
> level means to specify that the intention of the programmer is to
> define some block where he or she wants no data race but mutual
> exclusion. Usually the language level means for this is the Critical
> Region. A Critical Region must not necessarily be implemented by
> locking. It can be at the discretion of the compiler (optimisation)
> what is the most appropriate implementation of the language level
> Critical Region on a given architecture.

We agree here that that's a goal. The problem is that we don't know
how to do this without sacrificing a lot of very useful (necessary if
you want to get the standard approved) flexibility. Even translating
a simple critical section containing only an increment to an atomic
increment instruction (plus fences) is nontrivial in a C++-like
setting.

Hans
>
> Best Regards,
> Szabolcs

Dmitriy V'jukov

May 1, 2008, 4:06:19 PM
On Apr 18, 01:31, "Chris Thomasson" <cris...@comcast.net> wrote:

> Indeed. That's a major plus for me. The really cool thing, for me at least,
> is that the next C++ will allow me to create a 100% standard implementation
> of my AppCore library <http://appcore.home.comcast.net> which currently uses
> POSIX Threads and X86 assembly. The fact that C++ has condition variables,
> mutexs, atomic operations and fine-grain memory barriers means a lot to me.
> Finally, there will be such a thing as 100% portable non-blocking
> algorithms. I think that's so neat.


There are no compiler ordering barriers in C++0x.
So you can't implement VZOOM object lifetime management in
autodetection mode, you need compiler_acquire_barrier and
compiler_release_barrier.
Also you can't implement SMR+RCU, asymmetric reader-writer mutex,
asymmetric eventcount etc. Basically any algorithm where you want to
eliminate all hardware barriers from fast-path.

There are no specific barriers in C++0x, like 'load-release-wrt-loads',
which you need to implement a sequence mutex (SeqLock). In C++0x you can
use a 'load-release' barrier as a replacement for 'load-release-wrt-
loads'. But 'load-release' includes #StoreLoad; 'load-release-wrt-
loads' doesn't include #StoreLoad.


Thus, as I understand it now, C++0x will be suitable only for 'basic'
synchronization algorithms. If you want to implement 'advanced'
synchronization algorithms you still will have to implement 'my own
atomic library' and port it to every compiler and hardware platform
manually. Yikes!

Dmitriy V'jukov

Szabolcs Ferenczi

May 1, 2008, 4:29:37 PM
On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:
> On Apr 21, 10:07 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
> [...]

> When we started the C++ effort, there were several problems in trying
> to write concurrent C++ programs.

I would be interested in these concurrent C++ programs.

I have invited some people on this list already to try to provide some
concurrent programs in the C++0x proposal for some canonical
problems.
http://groups.google.com/group/comp.lang.c++/msg/9607b37a3b0323f3
Everyone just keeps saying: yes, we have a concurrent language
(C++0x), but nobody has been able to provide a single line of example so far.

> People disagree on the priorities,
> but I'd personally order them as:
>
> 1) Since C++ was a single-threaded language, there was no attempt to
> define the meaning of shared variables.

That is the root of the problem. If you add a notation to mark shared
variables and Critical Regions, the whole problem with compiler
optimisation is gone.

> Even if you added the Posix
> spec,there was no guarantee that compilers could not introduce
> "speculative" writes that add data races to the naive interpretation
> of the source code, and change program behavior.  This did happen very
> occasionally in completely unpredictable ways.  It happened more
> frequently in the form of a structure field update rewriting adjacent,
> concurrently modified, fields.

If the structure is marked as shared, this is also a non-issue. As long
as it is not marked as shared, it is guaranteed that only
one thread can access it. If it is marked as shared, however, you have
to access it in a Critical Region, and thus the sequential property is back
again.

> Posix did not clearly prohibit these,
> either.

I am afraid Posix does not tackle concurrency at the right level.
However, life does not start with Posix in concurrent programming.
Posix should not be a frozen definition. After all, everything around
it is changing, so why should Posix be eternal?

> And for structure fields, you need a more complicated rule to
> deal with bit-fields  (a problem that Java doesn't have, and as far as
> I can tell, Ada punts on the equivalent issue there).
>
> The C++0x WP contains a solution to this that is simple, unless you
> need the performance of the explicitly ordered ("low level") atomic
> operations.  You get sequential consistency (thread steps appear
> simply interleaved) for data-race-free programs, and undefined
> semantics for programs with data races.

That is far too simple, really. It might work in specific cases,
though.

An atomic operation is a tiny Critical Region anyway. Note that the
notion of Critical Region refers to mutual exclusion at the language
level and is not necessarily connected with locking (trendy people
are so afraid of locking that it has become something of
a fashion to be afraid of it). If you have atomic
operations defined, that is fine to a certain degree. Your problem
starts when two or more atomic operations are related to each other.
Then you cannot avoid introducing a Critical Region or Conditional
Critical Region in one way or another.

If you introduce mutexes or semaphores at the library level plus you
hack the _sequential_ optimisation rules of the compiler, you get hand
built Critical Regions but you will not get a concurrent programming
language.

If you do not have a concurrent programming language, you lack the language
support for concurrent programming that the compiler could otherwise
provide for you. For instance, your compiler will not be able to check
whether a shared variable is accessed inside Critical
Regions only. Thus the compiler will simply allow constructions with data
races.

> Data races are bascially
> simultaneous accesses, one of them an update, to non-atomic scalars,
> with the one complication that contiguous sequences of bit-fields
> count as a single location in determining the existence of a data
> race.

Well, you call it a data race but it is a bad concurrent program where
the variable meant to be shared among concurrent processes is not
declared as such and the compiler can provide no check to reject the
incorrect program. Now, you naively write your incorrect program, try
to run it, and you conclude that there is a data race. The story does
not start with the data race, however.

Best Regards,
Szabolcs

Chris Thomasson

May 1, 2008, 5:08:43 PM
"Dmitriy V'jukov" <dvy...@gmail.com> wrote in message
news:0d49490d-2a0e-461b...@k13g2000hse.googlegroups.com...

> On Apr 18, 01:31, "Chris Thomasson" <cris...@comcast.net> wrote:

> > Indeed. That's a major plus for me. The really cool thing, for me at
> > least,
> > is that the next C++ will allow me to create a 100% standard
> > implementation
> > of my AppCore library <http://appcore.home.comcast.net> which currently
> > uses
> > POSIX Threads and X86 assembly. The fact that C++ has condition
> > variables,
> > mutexs, atomic operations and fine-grain memory barriers means a lot to
> > me.
> > Finally, there will be such a thing as 100% portable non-blocking
> > algorithms. I think that's so neat.


> There are no compiler ordering barriers in C++0x.

That's what I thought.


> So you can't implement VZOOM object lifetime management in
> autodetection mode, you need compiler_acquire_barrier and
> compiler_release_barrier.

No way could I do highly platform-dependent auto-detection with C++0x. You
can get it to work with signals, but that's a little crazy:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/9545d3e17806ccfe

;^)


> Also you can't implement SMR+RCU, asymmetric reader-writer mutex,
> asymmetric eventcount etc. Basically any algorithm where you want to
> eliminate all hardware barriers from fast-path.

Probably not.


> There are no specific barriers in C++0x. Like 'load-release-wrt-loads'
> which you need to implement sequence mutex (SeqLock). In C++0x you can
> use 'load-release' barrier as replacement for 'load-release-wrt-
> loads'. But 'load-release' includes #StoreLoad. 'load-release-wrt-
> loads' doesn't include #StoreLoad.

Perhaps you should post this over on the cpp-threads mailing list:

http://www.decadentplace.org.uk/cgi-bin/mailman/listinfo/cpp-threads


> Thus, as I understand it now, C++0x will be suitable only for 'basic'
> synchronization algorithms. If you want to implement 'advanced'
> synchronization algorithms you still will have to implement 'my own
> atomic library' and port it to every compiler and hardware platform
> manually. Yikes!

For automatic epoch detection, and all that "type" of stuff, I think you're
correct.

Szabolcs Ferenczi

May 1, 2008, 5:26:24 PM
On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:
> On Apr 21, 10:07 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:
> [...]

> If you need low-level atomics, things are very tricky anyway,

I do not need low level atomics because I can write correct concurrent
programs with Conditional Critical Regions. An optimising compiler may
implement some Critical Regions with atomics. However, there are
hackers around who are just crying for atomics and low level means.

> and you
> will actually have to read the standard or the committee papers (try
> N2480).

You have the same problem there: Your concern is not about any
concurrent language feature but about what if an _incorrect concurrent
program_ is optimised by the compiler which compiler optimises for the
_sequential execution._ You do not want to inform the compiler that it
should not optimise for sequential execution any more because it is
not a sequential program but rather a concurrent one. You could inform
the compiler if threading would be introduced at the language level
(see marking shared variables and marking critical sections at the
language level).

Besides, the concepts of `memory model' and `visibility' are false
concerns in programming languages. At the programming language level
the issues about memory are already abstracted away into the notion of
variables. Visibility is only a concern if the language has no means
to mark shared variables and Critical Regions. Otherwise, if the
language has these means, visibility simply is not an issue. What is
declared as shared variable is seen by all processes, hence, these
variables can only be accessed in an exclusive manner. All the other
variables are assigned to a single process only and thus visible for
that process only. It is as simple as this.

> Aside from the low-level atomics, this is similar to the Ada
> approach.  And Ada83 probably was the first major language to
> explicitly take this route.  (There are some tricky issues in making
> this work for try_lock(), but I couldn't find an equivalent of that in
> Ada95, so this issue may simply not have arisen.)  It's consistent
> with the Java approach, as revised in 2005, though Java needs to
> address the much harder, and in my opinion still not completely
> solved, problem of giving partial semantics to programs with races.
>
> 2) There was no portable way to get access to hardware atomic
> operations (including simple atomic loads and stores).  In practice,
> this seems to often be necessary, for everything from maintaining a
> simple global counter, to the commonly used (and often misimplemented)
> "double-checked-locking" idiom.  This is fixed in the C++ working
> paper, with the addition of atomic<T>, etc.  This is roughly analogous
> to Java volatiles.  (Caution: In C++, increments of atomics are atomic
> operations; increments of Java volatiles contain two atomic
> operations, and are not atomic.  I think Ada behaves like Java here.)
>
> Ada95 also has atomic operations, but if I read the spec correctly,
> they seem to have largely overlooked the memory ordering issues.

From your point of view it may seem as an overlook but the truth is
that there is no such problem as memory ordering in a well designed
concurrent language. Just look at the comments to your following code
fragments.

> In
> particular, if I write the equivalent of
>
> Thread 1:
> x = 42;
> x_init = true;
>
> Thread 2:
> while (!x_init);
> assert(x == 42);
>
> this can fail (and in fact is incorrect) even if x_init is atomic.

Of course it is incorrect. Here both `x' and `x_init' are shared
variables but you miss to declare them as such. Furthermore, you are
trying to access them `in sequential manner' i.e. as if they were
variables in a sequential program. However, then, why are you
wondering that an incorrect concurrent program can fail?

I guess the meaning of your fragment that you would like Thread 2 to
proceed only when shared variable becomes 42:

shared int x = 0;

Thread 1:
with (x) {x = 42;}

Thread 2:
with (x) when (x == 42) {
// do whatever you want to do when x==42
}

Let me note that if you want Thread 2 to do anything when x==42, you
should specify that action inside the Critical Region. It is because
in a concurrent programming environment the change of the shared
variable must be regarded as a non-deterministic event.

However, it is still not correct from the concurrent point of view if
shared variable `x' can be changed by other processes. Then you cannot
be sure that Thread 2 will detect the situation when for a transient
period x==42 and your Thread 2 would hang forever waiting for the
status.

Moving from sequential programming to concurrent programming is not so
easy. It needs quite another kind of thinking from the programmer. You
can just escalate this problem if you are trying to hack a sequential
programming language so that is must stay as a sequential language
first of all, but you also want concurrency just as an after thought.

> More interestingly, I think something like RCU can't be made to work
> at all.  Neither does passing data between threads through a message
> queue implemented with atomic objects.  Of course, nobody else got
> this right in 1995 either.
>
> 3) There was no portable API for thread creation and synchronization.
> The current WP has one that is largely a Boost descendent, allowing
> portable code to create threads.  Work on higher level concurrency
> APIs was explicitly postponed.
>
> I think (1) and often (2) are essential for a useful concurrent
> language.  But languages designed for concurrency from the start
> didn't always get them right either.

I think the contrary: (1) and (2) simply does not apply to a decent
language designed for concurrency. Even (3) is not an API in any well
designed concurrent language but language element (see parallel
statement). Let me note that Java is not one of them, however, the
trendy crowd thinks it is (see Java's Insecure Parallelism)

Let me draw your attention to the first concurrent language Concurrent
Pascal which already contained the notion of the shared class named a
monitor. There are other languages designed for concurrency such as
Edison, OCCAM to name a few. The latter is so genuine that sequencing
two operations one after the other by default (see the semicolon
in most languages) is missing from the language.

Finally, let me stress that I am not suggesting that you would make
either Concurrent Pascal, Edison, Ada or OCCAM out of C++. These are
just examples containing useful ideas with respect to concurrent
programming language features.

Once, C++ was a success because it could add object-oriented
programming concepts to a procedural language. Stroustrup himself
claims that it did not seem such a straightforward idea to take up
object-oriented programming at that time: "all sensible people "knew"
that OOP didn't work in the real world: It was too slow (by more than
an order of magnitude), far too difficult for programmers to use,
didn't apply to real-world problems, and couldn't interact with all
the rest of the code needed in a system."
http://ddj.com/cpp/207000124

The situation is very similar now with respect to adding concurrency
at the language level to an OOP language. All sensible people "know"
that it is inefficient to have it at the language level. All sensible
people "know" that you need a memory model and have to care about
visibility concerns.

You need a brave step for the success, though.

Otherwise C++0x will be yet another MT library-based language.

Best Regards,
Szabolcs

Boehm, Hans

May 1, 2008, 8:35:07 PM
On May 1, 2:26 pm, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:
>
> > On Apr 21, 10:07 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > wrote:
> > [...]
> > If you need low-level atomics, things are very tricky anyway,
>
> I do not need low level atomics because I can write correct concurrent
> programs with Conditional Critical Regions. An optimising compiler may
> implement some Critical Regions with atomics. However, there are
> hackers around who are just crying for atomics and low level means.
And unfortunately for good reason. If your optimizer can generate
something like the Linux RCU implementation, or even double-checked
locking, from code based on critical regions, I'm impressed. And
unfortunately, if we're writing parallel code, it's often because we
need the performance.

We have had a lot of arguments about the need for "low-level"
atomics. But I don't think there's been much question about the need
for some form of atomics to supplement critical regions / locks.
>
> ...


> > Ada95 also has atomic operations, but if I read the spec correctly,
> > they seem to have largely overlooked the memory ordering issues.
>
> From your point of view it may seem as an overlook but the truth is
> that there is no such problem as memory ordering in a well designed
> concurrent language. Just look at the comments to your following code
> fragments.
>
> > In
> > particular, if I write the equivalent of
>
> > Thread 1:
> > x = 42;
> > x_init = true;
>
> > Thread 2:
> > while (!x_init);
> > assert(x == 42);
>
> > this can fail (and in fact is incorrect) even if x_init is atomic.
>
> Of course it is incorrect. Here both `x' and `x_init' are shared
> variables but you miss to declare them as such. Furthermore, you are
> trying to access them `in sequential manner' i.e. as if they were
> variables in a sequential program. However, then, why are you
> wondering that an incorrect concurrent program can fail?

This is correct in Java and the C++0x working paper if x_init is
volatile(Java)/atomic(C++). The variable x cannot be simultaneously
accessed, hence there is no data race on x. And x_init is effectively
declared shared.

In a C-like environment, it doesn't work to declare x shared in cases
like this. Often x = 42 is really an initialization of some library
that has no idea whether it wil be called from a single thread,
possibly in a single threaded application, or from multiple threads
which serialize access to the library. The library doesn't know
whether its variables are shared; and it doesn't matter.

>
> I guess the meaning of your fragment that you would like Thread 2 to
> proceed only when shared variable becomes 42:
>
> shared int x = 0;
>
> Thread 1:
> with (x) {x = 42;}
>
> Thread 2:
> with (x) when (x == 42) {
> // do whatever you want to do when x==42
>
> }
>
> Let me note that if you want Thread 2 to do anything when x==42, you
> should specify that action inside the Critical Region. It is because
> in a concurrent programming environment the change of the shared
> variable must be regarded as a non-deterministic event.

But now you've effectively been forced to move code that cannot
participate in a race into a critical section. As a result you're
holding locks much longer than you need to, reducing concurrency, at
least in a lock-based implementation.

Hans

Boehm, Hans

May 1, 2008, 8:56:57 PM
On May 1, 2:08 pm, "Chris Thomasson" <cris...@comcast.net> wrote:
> "Dmitriy V'jukov" <dvyu...@gmail.com> wrote in message
>
> news:0d49490d-2a0e-461b...@k13g2000hse.googlegroups.com...

>
> > On Apr 18, 01:31, "Chris Thomasson" <cris...@comcast.net> wrote:
> > > Indeed. That's a major plus for me. The really cool thing, for me at
> > > least,
> > > is that the next C++ will allow me to create a 100% standard
> > > implementation
> > > of my AppCore library <http://appcore.home.comcast.net> which currently
> > > uses
> > > POSIX Threads and X86 assembly. The fact that C++ has condition
> > > variables,
> > > mutexs, atomic operations and fine-grain memory barriers means a lot to
> > > me.
> > > Finally, there will be such a thing as 100% portable non-blocking
> > > algorithms. I think that's so neat.
> > There are no compiler ordering barriers in C++0x.
>
> That's what I thought.
Currently not. There is still some low-key discussion about that. I
don't think that affects threads use, though. It does affect code
using asynchronous signals.

>
> > So you can't implement VZOOM object lifetime management in
> > autodetection mode, you need compiler_acquire_barrier and
> > compiler_release_barrier.
>
> No way could I do highly platform dependant auto-detection with C++0x. You
> can get it to work with signals, but that's a little crazy:
>

> http://groups.google.com/group/comp.programming.threads/browse_frm/th...


>
> ;^)
>
> > Also you can't implement SMR+RCU, asymmetric reader-writer mutex,
> > asymmetric eventcount etc. Basically any algorithm where you want to
> > eliminate all hardware barriers from fast-path.

At least parts of this are under active discussion. See N2556. It
turns out that this is extremely tricky to do at the source-language
level while actually getting guaranteed correctness. RCU usually relies on
the hardware enforcing dependency-based memory ordering. But
compilers like to break dependencies because it shortens critical
paths. And not all hardware agrees on what constitutes a dependency
anyway. I think we finally know a reasonable way to do this, though.


>
> Probably not.
>
> > There are no specific barriers in C++0x. Like 'load-release-wrt-loads'
> > which you need to implement sequence mutex (SeqLock). In C++0x you can
> > use 'load-release' barrier as replacement for 'load-release-wrt-
> > loads'. But 'load-release' includes #StoreLoad. 'load-release-wrt-
> > loads' doesn't include #StoreLoad.

load-release actually doesn't make much sense ...

We currently don't generally have ordering forms that apply to only
loads or only stores. See N2176 for a rationale.

>
> Perhaps you should post this over on the cpp-threads mailing list:
>
> http://www.decadentplace.org.uk/cgi-bin/mailman/listinfo/cpp-threads
>
> > Thus, as I understand it now, C++0x will be suitable only for 'basic'
> > synchronization algorithms. If you want to implement 'advanced'
> > synchronization algorithms you still will have to implement 'my own
> > atomic library' and port it to every compiler and hardware platform
> > manually. Yikes!
>
> For automatic epoch detection, and all that "type" of stuff, I think you're
> correct.

You can always implement using either sequentially consistent atomics,
or explicitly ordered atomics, by using the next stronger ordering
constraint. For sequentially consistent atomics, we know that it's
highly platform dependent how much performance you lose. There is a
hand-waving argument that on X86, it's probably not much. (The added
cost there is all in the stores, and in most cases, they are either
greatly outnumbered by loads, or typically encounter a coherence miss
anyway. This is not a 100% argument.)

I think that for explicitly ordered atomics, the performance cost is
generally quite small across most architectures. A few issues are
still under discussion. The chosen ordering constraints are a
compromise between simplicity and performance. But they were also
pruned a bit by observations that some of the other common ones are in
fact very hard to use correctly and/or describe.

Hans

Chris Thomasson

May 2, 2008, 1:08:39 AM
"Boehm, Hans" <hans....@hp.com> wrote in message
news:f4443fd5-c47a-4c00...@p25g2000pri.googlegroups.com...
[...]

One clarification. When I write about "auto-detection", I am referring to
using highly platform specific means to extract/parse synchronization
information from operating system interfaces. A full "sync-epoch" is
established after a plurality of managed threads have executed "something"
that acts like a "full" memory barrier. If you can "detect" when this event
occurs, well, you're basically already working within an environment that is
fairly "friendly" to memory-barrier-free algorithm implementations indeed.


- Joe Seigh shows how SMR can be implemented 'without' using any "stringent"
#StoreLoad memory-barrier ordering constraints:

http://atomic-ptr-plus.sourceforge.net

Basic idea: you only poll SMR after you observe an RCU synchronization epoch
(e.g., grace-period). Microsoft has the 'FlushProcessWriteBuffers()'
function:

http://msdn.microsoft.com/en-us/library/ms683148(VS.85).aspx

one caveat, it can sometimes be too "coarse" for certain workloads. In other
words, you must "actively" call this function in order to generate
synchronization information which can end up activating excessive IPI
traffic. Also, I think IBM might have this patented in one of the RCU
variants. Some methods include, but are not limited to, "passive" sync
detection using basic polling logic against platform specific outputs
(e.g.,"/proc/").

Oh yeah, there is a way to amortize the MS 'FlushProcessWriteBuffers()' API, but
I am not sure I want to post the code yet. Here is one aspect:
'FlushProcessWriteBuffers()' only ever casts IPI within the calling
process's bound CPU affinity mask, therefore one could organize
multi-threaded processes into groups bound to "local" CPU regions; think
NUMA:

http://groups.google.com/group/comp.arch/msg/6557e261be681be9

http://groups.google.com/group/comp.arch/msg/18dbf634f491f46b
(sicortex super-computer)

http://groups.google.com/group/comp.arch/msg/2e5eeaecd0e69aed
(sicortex super-computer)


Basic example:

You can define a local synchronization epoch as the time when a CPU executes
a memory-barrier. One could further build on that by defining a global sync
epoch as the point in time in which each CPU has observed a local
synchronization epoch.


BTW, I really appreciate all the time and energy you put into enhancing the
C++ Standard; thanks Hans!

:^D

Markus Elfring

May 2, 2008, 3:56:56 AM
Szabolcs Ferenczi schrieb:

> You have the same problem there: Your concern is not about any
> concurrent language feature but about what if an _incorrect concurrent
> program_ is optimised by the compiler which compiler optimises for the
> _sequential execution._ You do not want to inform the compiler that it
> should not optimise for sequential execution any more because it is
> not a sequential program but rather a concurrent one. You could inform
> the compiler if threading would be introduced at the language level
> (see marking shared variables and marking critical sections at the
> language level).

I imagine that this abstraction level will be a use case for meta-compilation.
http://en.wikipedia.org/wiki/OpenC++

What will the implementation language be for the keyword "shared" to specify a
specific class property?

I guess that proxies or interceptors can help to apply advanced properties or
features to objects.

Regards,
Markus

Szabolcs Ferenczi

May 2, 2008, 7:35:45 AM
On May 2, 2:35 am, "Boehm, Hans" <hans.bo...@hp.com> wrote:
> On May 1, 2:26 pm, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> wrote:> On Apr 28, 8:42 pm, "Boehm, Hans" <hans.bo...@hp.com> wrote:
>
> > > On Apr 21, 10:07 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
> > > wrote:

> > > [...]


> > > In
> > > particular, if I write the equivalent of
>
> > > Thread 1:
> > > x = 42;
> > > x_init = true;
>
> > > Thread 2:
> > > while (!x_init);
> > > assert(x == 42);
>
> > > this can fail (and in fact is incorrect) even if x_init is atomic.
>
> > Of course it is incorrect. Here both `x' and `x_init' are shared
> > variables but you miss to declare them as such. Furthermore, you are
> > trying to access them `in sequential manner' i.e. as if they were
> > variables in a sequential program. However, then, why are you
> > wondering that an incorrect concurrent program can fail?
>
> This is correct in Java and the C++0x working paper if x_init is
> volatile(Java)/atomic(C++).  The variable x cannot be simultaneously
> accessed, hence there is no data race on x.  And x_init is effectively
> declared shared.

First of all, can you please decide which statement of yours is what
you really hold:

1) `this can fail (and in fact is incorrect) even if x_init is
atomic.'

2) `This is correct in Java and the C++0x working paper if x_init is
volatile(Java)/atomic(C++).'

Both statements are made by you and both refer to the same code
fragment. The only difference is that the first statement is made by
you for your original code fragment and the second one is made in
response to my comments.

What do you really think about it? Is it correct or incorrect?

Besides, I can admit that if you declare `x_init' volatile(Java)/
atomic(C++), the access to `x_init' will be exclusive (as if a tiny
Critical Region were defined).

On the other hand, it is a typical concurrent bug to think that if
`x_init' is volatile(Java)/atomic(C++), that has any effect on
accessing another variable `x' subsequent to inspecting atomically the
volatile(Java)/atomic(C++) one. The atomicity is valid for one and
only one operation individually. Two atomic operations in a sequence
cannot be build on the result of each other. You need a Critical
Region for that.

In fact, in concurrent programming you can assume an arbitrarily long delay
between two subsequent operations, and if those two operations together
are not declared to be an atomic one (see Critical Region) anything may
happen in between.

Thread 2:
while (!x_init);
// <--- x_init may become false right here
// delay(1 day); <--- e.g. x=33
assert(x == 42);

In this case anything may happen with `x' during the delay because
`while (!x_init);' and `assert(x == 42);' are not inside the same
Critical Region. Thus, you cannot be sure about the assert.

On the other hand, let us consider

Thread 2:
with (x) when (x == 42) {

// delay(1 day); <--- x cannot be accessed


// do whatever you want to do when x==42
}

In this case anything may happen during the delay _except_ with `x'
since we expressed our intention that we want to do something with it
when the condition holds (see the Conditional Critical Region).

Best Regards,
Szabolcs

Erik Wikström

May 2, 2008, 9:03:33 AM

Both are correct; it is just you who failed to quote all the relevant
parts of the original message. If you had, you would have seen that the 'this
can fail...' part referred to that (kind of) code using Ada's memory
model, and the 'This is correct...' part is for the same code using
Java's and C++0x's memory model.

--
Erik Wikström

Szabolcs Ferenczi

May 2, 2008, 9:14:18 AM

Thank you for your effort. So you think both are correct. It is
correct and incorrect at the same time. Brilliant solution. Well done.

However, I think he can answer the question which was put to him.

If you are so ambitious, can you comment this "correct C++0x code"?

Thread 1:
x = 42;
x_init = true;

Thread 2:
while (!x_init);
// <--- x_init may become false right here
// delay(1 day); <--- e.g. x=33
assert(x == 42);

You are in an easy position if it is both correct and incorrect for
you.

Best Regards,
Szabolcs

Lionel B

unread,
May 2, 2008, 9:51:55 AM5/2/08
to

No. Read Erik Wikström's answer again, then go back and read the context
you snipped.

--
Lionel B

Szabolcs Ferenczi

unread,
May 2, 2008, 10:02:46 AM5/2/08
to


Thank you, Lionel. You have been most helpful. Can you in the meantime
look at the part you snipped:

If you are so ambitious, can you comment this "correct C++0x code"?

Thread 1:
x = 42;
x_init = true;

Thread 2:
while (!x_init);
// <--- x_init may become false right here
// delay(1 day); <--- e.g. x=33
assert(x == 42);

Thanks a lot.

Best Regards,
Szabolcs

Lionel B

unread,
May 2, 2008, 10:16:02 AM5/2/08
to
On Fri, 02 May 2008 07:02:46 -0700, Szabolcs Ferenczi wrote:

[snip]



> Thank you, Lionel. You have been most helpful. Can you in the meantime
> look at the part you snipped:

Sure... uh-oh, where'd it go?

> If you are so ambitious, can you comment this "correct C++0x code"?

Ambitious? Me?

[snip]

--
Lionel B

Ben Bacarisse

unread,
May 2, 2008, 2:01:00 PM5/2/08
to
Szabolcs Ferenczi <szabolcs...@gmail.com> writes:

> If you are so ambitious, can you comment this "correct C++0x code"?

OK, I know nothing about C++0x but it seems clear from what I've been
reading here that there has been a basic misunderstanding about the
purpose of Hans Boehm's example. It was posted as an example of how
atomic variables do not solve the problem (following your request for
such an example). It was never intended to be correct, but the
point it illustrates is not the one you've taken from it.

> Thread 1:
> x = 42;
> x_init = true;
>
> Thread 2:
> while (!x_init);
> // <--- x_init may become false right here
> // delay(1 day); <--- e.g. x=33
> assert(x == 42);

The point was to show that the correctness (if it is to be correct)
relies on more than the atomicity of x_init. To make the example show
what was intended we need thread 1 to exit and to assert that no other
threads are involved (so nothing else can affect x_init or x). Hence
I think the point was that, without extra guarantees:

Syntax to declare shared atomic int x_init = false;

Thread 1:
x = 42;
x_init = true;

exit_thread();

Thread 2:
while (!x_init);
assert(x == 42);

is *still* wrong since the assignment to x may be delayed, either by
the compiler or the hardware. Languages that need the above code to
work, must restrict the compiler and use a relatively harsh memory
regime to ensure that the above does what is expected.

The claim is that C++0x will take a new route. In
<954645d0-2fa8-45fd...@z24g2000prf.googlegroups.com>
Hans Boehm says:

| Consider
|
| Thread 1:
| x = 42;
| lock(l);
exit_thread(); /* Added for clarity */
|
| Thread 2:
| while (trylock(l) == SUCCESS) unlock(l);
| r1 = x;
|
| Is this allowable? Does it guarantee r1 == 42? The answer can have
| substantial effect on the cost of the lock() implementation. C++0x
| resolves it in a new and interesting way.

Again, we must assume that thread one exits (or at least does not
touch x or the lock again) and that no other threads are involved.
I don't know how C++0x resolves this, but the suggestion is that it
does so more cheaply than the obvious one.

Anyway, neither example had anything to do with x changing again. It
would have been clearer if this had been stated, but it is not
reasonable to assume that your correspondent is missing something as
obvious as your counter example.

--
Ben.

Dmitriy V'jukov

unread,
May 2, 2008, 3:11:57 PM5/2/08
to
On 2 May, 01:08, "Chris Thomasson" <cris...@comcast.net> wrote:

> > So you can't implement VZOOM object lifetime management in
> > autodetection mode, you need compiler_acquire_barrier and
> > compiler_release_barrier.
>
> No way could I do highly platform-dependent auto-detection with C++0x. You
> can get it to work with signals, but that's a little crazy:

I am not talking about the auto-detection logic itself, but about the
vz_acquire()/vz_release() functions. I think they look something like
this:

void vz_acquire(void* p)
{
    per_thread_rc_array[hash(p)] += 1;
    compiler_acquire_barrier(); // <--------------
}

void vz_release(void* p)
{
    compiler_release_barrier(); // <--------------
    per_thread_rc_array[hash(p)] -= 1;
}

The question: will you have to manually implement
compiler_acquire_barrier()/compiler_release_barrier() and port them to
every compiler?

Dmitriy V'jukov

Chris Thomasson

unread,
May 2, 2008, 4:06:44 PM5/2/08
to
"Dmitriy V'jukov" <dvy...@gmail.com> wrote in message
news:f4905e09-1344-4643...@25g2000hsx.googlegroups.com...

The implementation of the function which mutates the array is externally
compiled:

http://groups.google.com/group/comp.lang.c/browse_frm/thread/1d0b291ee41a7fb5

and I document that the link-time optimization level should be turned down,
or off... Oh well.

Boehm, Hans

unread,
May 2, 2008, 4:05:05 PM5/2/08
to
On May 2, 7:02 am, Szabolcs Ferenczi <szabolcs.feren...@gmail.com>
wrote:

> Thread 1:
> x = 42;
> x_init = true;
>
> Thread 2:
> while (!x_init);
> // <--- x_init may become false right here
> // delay(1 day); <--- e.g. x=33
> assert(x == 42);

I assumed a convention here that I should have been clearer about. In
particular, no other threads execute code relevant to this, and this
is the entire code executed in these two threads. That's admittedly a
simplification, but a convenient one that's generally used in
presenting such examples. A more realistic setting would involve
multiple threads that behave like thread 2, but only a single thread
that sets x_init, and it is set only once and never reset. Very
similar cases occur with double-checked locking, or when passing
"ownership" to an object between threads through some sort of queue.
In the latter case, the queue is implemented using something like
critical regions, but the object is accessed outside of the critical
region, because it's only accessed by one thread at a time.

Hans

Dmitriy V'jukov

unread,
May 2, 2008, 4:10:02 PM5/2/08
to
On 2 May, 04:56, "Boehm, Hans" <hans.bo...@hp.com> wrote:

> > > There are no compiler ordering barriers in C++Ox.
>
> > Thats what I thought.
>
> Currently not. There is still some low key discussion about that. I
> don't think that affects threads use, though. It does affect code
> using asynchronous signals.


There are synchronization algorithms where compiler ordering does
affect threads.

Asymmetric reader-writer mutex:
http://groups.google.ru/group/comp.programming.threads/browse_frm/thread/8c6c19698964d1f6

SMR+RCU:
http://sourceforge.net/project/showfiles.php?group_id=127837
(fastsmr package)

Those algorithms are very... exotic. They eliminate *all* hardware
fences from the fast path. But without correct compiler ordering they
are impossible.


> > > Also you can't implement SMR+RCU, asymmetric reader-writer mutex,
> > > asymmetric eventcount etc. Basically any algorithm where you want to
> > > eliminate all hardware barriers from fast-path.
>
> At least parts of this are under active discussion. See N2556. It
> turns out that this is extremely tricky to do at the source language
> level and actually get guaranteed correctness. RCU usually relies on
> the hardware enforcing dependency-based memory ordering. But
> compilers like to break dependencies because it shortens critical
> paths. And not all hardware agrees on what constitutes a dependency
> anyway. I think we finally know a reasonable way to do this, though.


I am not talking about data dependencies (std::memory_order_consume).
Consider a 'classical' SMR implementation:

// pseudo-code
void* smr_acquire_reference(void** shared_object, void** hazard_pointer)
{
    for (;;)
    {
        void* object = *shared_object;
        *hazard_pointer = object;
        hardware_store_load_fence();
        void* object2 = *shared_object;
        if (object == object2)
        {
            hardware_acquire_fence();
            return object;
        }
    }
}

void smr_release_reference(void** hazard_pointer)
{
    hardware_release_fence();
    *hazard_pointer = 0;
}

In SMR+RCU they look like this:

// pseudo-code
void* smr_rcu_acquire_reference(void** shared_object, void** hazard_pointer)
{
    for (;;)
    {
        void* object = *shared_object;
        *hazard_pointer = object;
        compiler_store_load_fence();
        void* object2 = *shared_object;
        if (object == object2)
        {
            compiler_acquire_fence();
            return object;
        }
    }
}

void smr_release_reference(void** hazard_pointer)
{
    compiler_release_fence();
    *hazard_pointer = 0;
}

All hardware fences are eliminated. But instead one needs:
compiler_store_load_fence()
compiler_acquire_fence()
compiler_release_fence()

As far as I understand, with the current C++0x one has to revert to
assembly/compiler-specific things and port them manually to every platform.

In gcc it's "__asm__ __volatile__ ("" : : :"memory")"
In msvc it's _ReadWriteBarrier().

It would be great if one were able to write:

std::atomic_int x;
int l = x.load(std::memory_order_relaxed_but_compiler_acquire);
x.store(0, std::memory_order_relaxed_but_compiler_release);

> > > There are no specific barriers in C++Ox. Like 'load-release-wrt-loads'
> > > which you need to implement sequence mutex (SeqLock). In C++0x you can
> > > use 'load-release' barrier as replacement for 'load-release-wrt-
> > > loads'. But 'load-release' includes #StoreLoad. 'load-release-wrt-
> > > loads' doesn't inlcude #StoreLoad.
>
> load-release actually doesn't make much sense ...


I think that Sequence Lock (SeqLock):
http://en.wikipedia.org/wiki/Seqlock
must be implemented this way:

bool sequence_lock_rdunlock(seqlock* lock, int prev_seq)
{
    int seq = lock->seq.load(std::memory_order_load_release_wrt_loads);
    return seq == prev_seq;
}

On x86 memory_order_load_release_wrt_loads is a no-op (plain load). But
if I use the 'next stronger ordering constraint', i.e.
memory_order_acq_rel, then it will be a locked RMW operation or an mfence
instruction. I think in most C++ implementations it will be a locked RMW
operation. And a locked RMW operation means ownership of the cache line.
This basically kills the whole idea of SeqLock...


> We currently don't generally have ordering forms that apply to only
> loads or only stores. See N2176 for a rationale.


Yes, there are things of which most developers (including me) aren't
even aware :)
But I think (hope) that my example with sequence_lock_rdunlock() is
still correct, because SeqLock allows only 'purely read-only'
transactions on the reader side.

I don't argue that it's easy stuff. It's extremely hard stuff. I don't
even hope that I got it all right. I just want to figure out whether one
can forget about assembly and manual porting altogether, or whether one
will still have to revert to assembly and manual porting for the most
advanced things.


Dmitriy V'jukov

Chris Thomasson

unread,
May 2, 2008, 4:41:36 PM5/2/08
to

"Dmitriy V'jukov" <dvy...@gmail.com> wrote in message
news:d90142c2-ef0a-4eaf...@34g2000hsf.googlegroups.com...
[...]

> There are synchronization algorithms where compiler ordering affects
> exactly threads.

[...]

> I don't argue that it's easy stuff. It's extremely hard stuff. I don't
> even hope that I got it all right. I just want to figure out whether one
> can forget about assembly and manual porting altogether, or whether one
> will still have to revert to assembly and manual porting for the most
> advanced things.

I think you're still going to have to use assembly and manual porting for
efficient implementations of algorithms like SMR+RCU. Unless C++ provides
an 'rcu_synchronize()' type function, I am not sure how you can get
passive sync-epoch detection. You could bind a thread to each processor
and have a single polling thread send a message to each one and wait
for a response. Once all responses are in, a synchronization epoch has
occurred involving all the CPUs in the message broadcast. That is, each
CPU has executed something analogous to a store/load style memory barrier.
This has been proposed before:

http://groups.google.com/group/comp.lang.c++/msg/878a80ea30b2e849

Oh well. BTW, I really do like the idea of providing fine-grained compiler
barriers...

Szabolcs Ferenczi

unread,
May 2, 2008, 5:13:40 PM5/2/08
to
On May 2, 8:01 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

> Szabolcs Ferenczi <szabolcs.feren...@gmail.com> writes:
> > If you are so ambitious, can you comment this "correct C++0x code"?
>
> OK, I know nothing about C++0x but it seems clear from what I've been
> reading here that there has been a basic misunderstanding about the
> purpose of Hans Boehm's example.  It was posted as an example of how
> atomic variables do not solve the problem (following your request for
> an such an example).

I think it will be best if Hans Boehm himself corrects you, but I
never requested such an example from him or from anyone else.

I would never request an example for something that has been clear in
concurrent programming since 1965, namely that if you have variables
with atomic access (atomic read and atomic write) you cannot derive a
general synchronisation between N processes for critical sections. See
E.W. Dijkstra, Cooperating sequential processes
http://www.cs.utexas.edu/users/EWD/transcriptions/EWD01xx/EWD123.html

Besides, what I requested was that someone show solutions in C++0x for
some canonical problems in concurrent programming:

http://groups.google.com/group/comp.lang.c++/msg/9607b37a3b0323f3

That, however, despite all the talk from the wise guys, has not
happened so far: no one in this discussion list has published any
example in the C++0x notation for any of the canonical concurrent
problems.

Best Regards,
Szabolcs

Ben Bacarisse

unread,
May 2, 2008, 8:17:29 PM5/2/08
to
Szabolcs Ferenczi <szabolcs...@gmail.com> writes:

> On May 2, 8:01 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:
>> Szabolcs Ferenczi <szabolcs.feren...@gmail.com> writes:
>> > If you are so ambitious, can you comment this "correct C++0x code"?
>>
>> OK, I know nothing about C++0x but it seems clear from what I've been
>> reading here that there has been a basic misunderstanding about the
>> purpose of Hans Boehm's example.  It was posted as an example of how
>> atomic variables do not solve the problem (following your request for
>> such an example).
>
> I think it will be the best if Hans Boehm himself corrects you but I
> never requested such an example from him nor from anyone else.

I see he has elsewhere in this thread.

> I would never request any example for something that is clear in
> concurrent programming from 1965 on,

Obviously, and I never said you did. You did ask for *an* example.

When it was given you could either choose to assume the author had
made a basic mistake that one would be embarrassed to make as a
student, or you could assume that is was illustrating a more subtle
point. It is possible to take the code fragments and put them in a
context in which they make sense -- you just need to assume that
nothing else happens. You chose to suggest that a beginner's mistake
had been made. I don't think that helped move this interesting
discussion forwards.

--
Ben.

Ian Collins

unread,
May 2, 2008, 8:36:26 PM5/2/08
to
Ben Bacarisse wrote:
>
> When it was given you could either choose to assume the author had
> made a basic mistake that one would be embarrassed to make as a
> student, or you could assume that is was illustrating a more subtle
> point. It is possible to take the code fragments and put them in a
> context in which they make sense -- you just need to assume that
> nothing else happens. You chose to suggest that a beginner's mistake
> had been made. I don't think that helped move this interesting
> discussion forwards.
>
He created a long drawn-out thread on c.p.threads by being equally rude
and condescending to everyone foolish enough to join it.

--
Ian Collins.
