C++1x: lock-free algos and blocking

54 views
Skip to first unread message

Dmitriy Vyukov

unread,
Oct 18, 2009, 6:50:45 AM10/18/09
to
C++1x provides a complete toolset for lock-based programming: mutex
for ensuring consistency and condition variable for blocking/
signaling.
However, as for lock-free (non-blocking) programming C++1x provides
only atomics for ensuring consistency. And what about thread blocking/
signaling? Condition variables are of no help here, because they
require the state to be protected by a mutex. Yes, some lock-free
algorithms do not require blocking (for example, hash map). But things
like producer-consumer queues, work-stealing deques, etc, do require
some primitives for thread blocking/signaling.
I would like to hear from some people close to the committee on this.
How it's intended I must handle this?
Yes, I can build a semaphore from mutex+condvar+counter. And then
build blocking/signaling logic on top of the semaphore. However IMHO
it's (1) a kind of suboptimal from performance point of view and (2)
everybody will have to build it's own semaphore.
For example, Java provides thread park/unpark primitives for that
purpose, which is a kind of semaphore associated with every thread
with sticky signals and a maximum value of 1. IMHO, it's not the way
to go for C++1x, because one will have to pass thread objects
everywhere.
Thank you.
--
Dmitriy V'jukov

Chris M. Thomasson

unread,
Oct 18, 2009, 1:03:13 PM10/18/09
to
"Dmitriy Vyukov" <dvy...@gmail.com> wrote in message
news:fcbdaa78-a0f5-4ffc...@v25g2000yqk.googlegroups.com...

Humm. Well, the "simplest" algorithm I know of is basically equal to the
original eventcount algorithm I posted here, except it uses different
waitbit value and increments the counter by an even number:


<pseudo-code typed in newsreader; please forgive any typos>
___________________________________________________________________
class eventcount
{
public:
typedef unsigned long key_type;

private:
mutable std::atomic<key_type> m_count;
std::mutex m_mutex;
std::condition_variable m_cond;

private:
void prv_signal(key_type key)
{
if (key & 1)
{
m_mutex.lock($);

while (! m_count($).compare_exchange_weak(
key,
(key + 2) & ~1,
std::memory_order_seq_cst));

m_mutex.unlock($);

m_cond.notify_all($);
}
}

public:
eventcount()
: m_count(0)
{

}

public:
key_type get() const
{
return m_count($).fetch_or(1, std::memory_order_acquire);
}


void signal()
{
prv_signal(m_count($).fetch_add(0, std::memory_order_seq_cst));
}


void signal_relaxed()
{
prv_signal(m_count($).load(std::memory_order_relaxed));
}


void wait(key_type cmp)
{
m_mutex.lock($);

key_type cur = m_count($).load(std::memory_order_seq_cst);

if ((cur & ~1) == (cmp & ~1))
{
m_cond.wait(m_mutex, $);
}

m_mutex.unlock($);
}
};
___________________________________________________________________


Should something like this be standardized? I am not so sure here simply
because one can rather easily use C++1x to create a 100% portable
implementation of an eventcount... Humm...

Dmitriy Vyukov

unread,
Oct 19, 2009, 1:16:14 AM10/19/09
to

> Should something like this be standardized? I am not so sure here simply
> because one can rather easily use C++1x to create a 100% portable
> implementation of an eventcount... Humm...


Chris, probably your are right in the sense that if one builds lock-
free algorithms then he must be able to build blocking/signaling logic
too. And std::condition_variable+std::mutex is a sufficient toolset
for that.

Since you've mentioned an eventcount, I know what you will hear next -
"write your proposal" :)

Btw, here is "a more C++" interface for an eventcount, I've designed
for a contribution to Intel TBB. It supports RAII waiting, and
signaling with arbitrary predicate, with which you may encode for
example "wakeup only thread that waits for exactly this
element" (useful for producer-consumer queues), or "wakeup all threads
that satisfy this condition but at least 1 thread" (useful for work-
stealing environments, condition means data affinity): such fine-
grained signaling avoids unnecessary thread wakeups and thundering
herd, the idea is that it's better to do some more work on producer
thread but avoid boatload of senseless thread wakeups/blocks:

class eventcount
{
public:
eventcount()

void prepare_wait(void* ctx = 0);
void commit_wait();
void cancel_wait();

void notify_one();
void notify_one_relaxed();

void notify_all()
void notify_all_relaxed();

template<typename predicate_t> void notify(predicate_t pred);
template<typename predicate_t> void notify_relaxed(predicate_t
pred)

class wait_guard;
};

class eventcount::wait_guard
{
public:
wait_guard(eventcount& ec, void* ctx = 0);
void commit_wait();
};

And predicate for notify must be of the form:
bool pred(void* ctx, size_t thread_count, size_t thread_index);
// ctx is a arbitrary values passed to prepare_wait (can be allocated
on a stack of a blocked thread)
// return value determines unblock thread or not


--
Dmitriy V'jukov

Anthony Williams

unread,
Oct 19, 2009, 4:52:14 AM10/19/09
to
Dmitriy Vyukov <dvy...@gmail.com> writes:

> C++1x provides a complete toolset for lock-based programming: mutex
> for ensuring consistency and condition variable for blocking/
> signaling.

FYI: the codename for the next release of C++ is still C++0x (even
though it won't be finalized before late 2010). C++1x is the current
codename for the subsequent release (probably around 2020).

> However, as for lock-free (non-blocking) programming C++1x provides
> only atomics for ensuring consistency. And what about thread blocking/
> signaling? Condition variables are of no help here, because they
> require the state to be protected by a mutex.

Well, std::condition_variable_any only requires a "lockable" object, so
you could pass a class that implements lock/try_lock/unlock as empty
functions. However, the implementation probably just uses an internal
mutex in this case, so you don't necessarily gain anything.

> Yes, some lock-free
> algorithms do not require blocking (for example, hash map). But things
> like producer-consumer queues, work-stealing deques, etc, do require
> some primitives for thread blocking/signaling.
> I would like to hear from some people close to the committee on this.
> How it's intended I must handle this?

Currently there is no good way, other than condition variables and
mutexes.

This is a result of the way the thread library has evolved. The "high
level" stuff comes mainly from boost, which does not have a semaphore as
this was considered too hard to use for general development back when
boost.thread was first introduced.

On the other hand, the "low-level" stuff focused on the atomics. Nobody
has campaigned for a lightweight blocking/waking mechanism as far as I
am aware.

I can certainly see the benefits of standardizing a lightweight
mechanism such as a semaphore or event count or even a futex, but that
will have to wait for TR2 as it is currently far too late for getting
new proposals in for C++0x.

Anthony
--
Author of C++ Concurrency in Action | http://www.manning.com/williams
just::thread C++0x thread library | http://www.stdthread.co.uk
Just Software Solutions Ltd | http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

Dmitriy Vyukov

unread,
Oct 19, 2009, 5:40:41 AM10/19/09
to
On Oct 19, 12:52 pm, Anthony Williams <anthony....@gmail.com> wrote:


Thank you, Anthony. I get the reasoning.

Btw, regarding:


> Well, std::condition_variable_any only requires a "lockable" object, so
> you could pass a class that implements lock/try_lock/unlock as empty
> functions. However, the implementation probably just uses an internal
> mutex in this case, so you don't necessarily gain anything.

First thought was Yeah cool.
Then - it's illegal because such mutex does not provide mutual
exclusion.
Then - it's legal, because "(mutex) may grant simultaneous ownership
to one or many threads"
Then - it's illegal, because "This operation (unlock) synchronizes
with subsequent lock operations that obtain ownership on the same
object".
std::condition_variable_any probably does not use that property, but
it seems that "empty" mutex does not meet the requirements for a Mutex
type.
It's a bit inconvenient, because "empty" mutex is sometimes required.
Intel TBB just added null_mutex for the purpose of parametrizing
classes with it when multithreaded access is not intended.
Hmmm... does read_unlock operation has to synchronize-with subsequent
read_lock operation for rw_mutex... I think that No... however it's a
kind of non-observable as to whether read_unlock synchronizes-with
read_lock or not (because one can't determine whether read_lock is
subsequent to read_unlock or not). So rw_mutex can not ensure that
synchronization, while still meet the Mutex requirements.


--
Dmitriy V'jukov

Dmitriy Vyukov

unread,
Oct 19, 2009, 5:46:52 AM10/19/09
to
On Oct 19, 1:40 pm, Dmitriy Vyukov <dvyu...@gmail.com> wrote:

> Btw, regarding:
>
> > Well, std::condition_variable_any only requires a "lockable" object, so
> > you could pass a class that implements lock/try_lock/unlock as empty
> > functions. However, the implementation probably just uses an internal
> > mutex in this case, so you don't necessarily gain anything.
>
> First thought was Yeah cool.
> Then - it's illegal because such mutex does not provide mutual
> exclusion.
> Then - it's legal, because "(mutex) may grant simultaneous ownership
> to one or many threads"
> Then - it's illegal, because "This operation (unlock) synchronizes
> with subsequent lock operations that obtain ownership on the same
> object".
> std::condition_variable_any probably does not use that property, but
> it seems that "empty" mutex does not meet the requirements for a Mutex
> type.
> It's a bit inconvenient, because "empty" mutex is sometimes required.
> Intel TBB just added null_mutex for the purpose of parametrizing
> classes with it when multithreaded access is not intended.

Stop. For an empty mutex we have the same story as for a reader-writer
mutex - one can't detect whether unlock synchronizes-with subsequent
lock or not, because one don't know whether the lock is subsequent to
the unlock. So empty mutex can state that it provides that
synchronization w/o actually providing it.
So IMHO empty mutex meets all Mutex requirements.

--
Dmitriy V'jukov

Anthony Williams

unread,
Oct 19, 2009, 6:09:31 AM10/19/09
to
Dmitriy Vyukov <dvy...@gmail.com> writes:

> On Oct 19, 12:52 pm, Anthony Williams <anthony....@gmail.com> wrote:
> Btw, regarding:
>> Well, std::condition_variable_any only requires a "lockable" object, so
>> you could pass a class that implements lock/try_lock/unlock as empty
>> functions. However, the implementation probably just uses an internal
>> mutex in this case, so you don't necessarily gain anything.
>
> First thought was Yeah cool.
> Then - it's illegal because such mutex does not provide mutual
> exclusion.
> Then - it's legal, because "(mutex) may grant simultaneous ownership
> to one or many threads"
> Then - it's illegal, because "This operation (unlock) synchronizes
> with subsequent lock operations that obtain ownership on the same
> object".
> std::condition_variable_any probably does not use that property, but
> it seems that "empty" mutex does not meet the requirements for a Mutex
> type.

I think you can argue (as you have) that you cannot tell whether or not
an empty mutex synchronizes-with anything, so this is not important for
this case. However, the Mutex requirements are wrong. They are a mix of
requirements for the std::yyy_mutex types, and requirements for a
user-supplied mutex type, which should be separate. I thought there was
an issue for this. I'll have a check, and raise one if there isn't
already.

Pete Becker

unread,
Oct 19, 2009, 11:48:41 AM10/19/09
to
Dmitriy Vyukov wrote:
> C++1x provides a complete toolset for lock-based programming: mutex
> for ensuring consistency and condition variable for blocking/
> signaling.

Sorry, but what is C++1x? The current standardization effort is working
toward C++0x. Is 1x some unofficial discussion of possible futures?

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Dave Butenhof

unread,
Oct 19, 2009, 11:57:57 AM10/19/09
to
Pete Becker wrote:
> Dmitriy Vyukov wrote:
>> C++1x provides a complete toolset for lock-based programming: mutex
>> for ensuring consistency and condition variable for blocking/
>> signaling.
>
> Sorry, but what is C++1x? The current standardization effort is working
> toward C++0x. Is 1x some unofficial discussion of possible futures?

The ongoing C++ standard revision has taken the shorthand "0x" in
expectation that the work would complete balloting before 2010.

I'm not sure whether it's official, but given that we're now in mid
October of 2009 it seems very nearly impossible that there will be a
final balloted revision of the C++ standard in 2009. Some people at
least have revised the shorthand to "1x". Most likely x == 0, and
there'll be a 2010 edition; but one never knows... ;-)

Pete Becker

unread,
Oct 19, 2009, 12:14:34 PM10/19/09
to

Yes, thanks, I'm well aware of all that. My comment was tongue-in-cheek.

Dmitriy Vyukov

unread,
Oct 19, 2009, 12:15:59 PM10/19/09
to
On Oct 19, 7:48 pm, Pete Becker <p...@versatilecoding.com> wrote:
> Dmitriy Vyukov wrote:
> > C++1x provides a complete toolset for lock-based programming: mutex
> > for ensuring consistency and condition variable for blocking/
> > signaling.
>
> Sorry, but what is C++1x? The current standardization effort is working
> toward C++0x. Is 1x some unofficial discussion of possible futures?

I just did not think that that postfixes have any official ISO
meaning, so I just used '1x' as 'probable release year'. I am sure
I've seen C++1x several times somewhere. Sorry for any confusion.
My question is about C++0x.

--
Dmitriy V'jukov

Chris M. Thomasson

unread,
Oct 19, 2009, 12:34:30 PM10/19/09
to
"Dmitriy Vyukov" <dvy...@gmail.com> wrote in message
news:9ca49d0d-5704-4863...@k19g2000yqc.googlegroups.com...

On 18 О©╫О©╫О©╫, 21:03, "Chris M. Thomasson" <n...@spam.invalid> wrote:

> > > C++1x provides a complete toolset for lock-based programming: mutex
> > > for ensuring consistency and condition variable for blocking/
> > > signaling.
> > > However, as for lock-free (non-blocking) programming C++1x provides
> > > only atomics for ensuring consistency. And what about thread blocking/
> > > signaling?

[...]


> > Humm. Well, the "simplest" algorithm I know of is basically equal to the
> > original eventcount algorithm I posted here, except it uses different
> > waitbit value and increments the counter by an even number:
> >
> > Should something like this be standardized? I am not so sure here simply
> > because one can rather easily use C++1x to create a 100% portable
> > implementation of an eventcount... Humm...


> Chris, probably your are right in the sense that if one builds lock-
> free algorithms then he must be able to build blocking/signaling logic
> too. And std::condition_variable+std::mutex is a sufficient toolset
> for that.

> Since you've mentioned an eventcount, I know what you will hear next -
> "write your proposal" :)

:^)


> Btw, here is "a more C++" interface for an eventcount, I've designed
> for a contribution to Intel TBB. It supports RAII waiting, and
> signaling with arbitrary predicate, with which you may encode for
> example "wakeup only thread that waits for exactly this
> element" (useful for producer-consumer queues), or "wakeup all threads
> that satisfy this condition but at least 1 thread" (useful for work-
> stealing environments, condition means data affinity): such fine-
> grained signaling avoids unnecessary thread wakeups and thundering
> herd, the idea is that it's better to do some more work on producer
> thread but avoid boatload of senseless thread wakeups/blocks:

Totally agreed. The ability to signal a single thread is definitely useful.
Furthermore, the ability to signal a subset of waiting threads which satisfy
some conditions is also very useful. You're implementation that is based on
TLS makes this easy. If you're fine with `SCHED_OTHER' then a complete naive
implementation with linked-list of waiters per-eventcount is fine. FWIW, you
can get single thread signaling with a tweaked form of the eventcount
algorithm I created. It goes something like:


_________________________________________________________
class eventcount
{
public:
typedef unsigned long key_type;

private:
mutable std::atomic<key_type> m_count;

unsigned int m_waiters;


std::mutex m_mutex;
std::condition_variable m_cond;

private:
void prv_signal(key_type key, bool broadcast)
{
if (key & 1)
{
m_mutex.lock();

unsigned int waiters = m_waiters;

key_type xchg;

do
{
xchg = key + 2;

if (waiters < 2 || broadcast)
{
xchg &= ~1;
}
}

while (! m_count($).compare_exchange_weak(
key,
xchg,
std::memory_order_seq_cst));

m_mutex.unlock();

if (waiters)
{
if (waiters == 1 || ! broadcast)
{
m_cond.notify_one();
}

else
{
m_cond.notify_all();
}
}
}
}

public:
eventcount()
: m_count(0),
m_waiters(0)
{

}

public:
key_type get() const
{
return m_count.fetch_or(1, std::memory_order_acquire);
}


void signal()
{
prv_signal(m_count.fetch_add(0, std::memory_order_seq_cst), false);
}


void signal_relaxed()
{
prv_signal(m_count.load(std::memory_order_relaxed), false);
}


void broadcast()
{
prv_signal(m_count.fetch_add(0, std::memory_order_seq_cst), true);
}


void broadcast_relaxed()
{
prv_signal(m_count.load(std::memory_order_relaxed), true);
}


void wait(key_type cmp)
{
m_mutex.lock();

key_type cur = m_count.load(std::memory_order_seq_cst);

if ((cur & ~1) == (cmp & ~1))
{

++m_waiters;

m_cond.wait(m_mutex);

--m_waiters;
}

m_mutex.unlock();
}
};
_________________________________________________________


Of course this is not sufficient for a conditional signaling technique.


> class eventcount::wait_guard
> {
> public:
> wait_guard(eventcount& ec, void* ctx = 0);
> void commit_wait();
> };

> And predicate for notify must be of the form:
> bool pred(void* ctx, size_t thread_count, size_t thread_index);
> // ctx is a arbitrary values passed to prepare_wait (can be allocated
> on a stack of a blocked thread)
> // return value determines unblock thread or not

That's definitely useful!

:^)

Pete Becker

unread,
Oct 19, 2009, 12:54:30 PM10/19/09
to

<g> C++0x is the name of the revision that's currently being worked on.
There is no C++1x.

kibun

unread,
Oct 26, 2009, 10:01:28 AM10/26/09
to

> ___________________________________________________________________
> class eventcount
> {
> public:
>     typedef unsigned long key_type;
>
> private:
>     mutable std::atomic<key_type> m_count;
>     std::mutex m_mutex;
>     std::condition_variable m_cond;
>
> private:
>     void prv_signal(key_type key)
>     {
>         if (key & 1)
>         {
>             m_mutex.lock($);

I know it may be RTFM... but where is this '$' syntax documented. I
feel like I missed something important on the cpp-threads mailing
list...

thanks in advance

Dmitriy Vyukov

unread,
Oct 26, 2009, 10:30:22 AM10/26/09
to
On Oct 26, 5:01 pm, kibun <davide.rosse...@gmail.com> wrote:

> >     void prv_signal(key_type key)
> >     {
> >         if (key & 1)
> >         {
> >             m_mutex.lock($);
>
> I know it may be RTFM... but where is this '$' syntax documented. I
> feel like I missed something important on the cpp-threads mailing
> list...


It's the syntax of Relacy Race Detector - synchronization algorithm
verifier:
http://groups.google.com/group/relacy

--
Dmitriy V'jukov

Reply all
Reply to author
Forward
0 new messages