really simple portable eventcount...

336 views
Skip to first unread message

SenderX

unread,
Mar 6, 2005, 1:54:59 AM3/6/05
to
This is real fast in the signals without waiters case:


// pseudo-code
class ec
{

pthread_mutex_t m_mutex;
pthread_cond_t m_cond;
int m_waiters;
int m_ec;


public:

ec() : m_waiters( 0 ), m_ec( 0 )
{
// init mutex and cond
}


public:

int get()
{
int local = m_ec;
atomic_bitset( &m_ec, 0x80000000 );
return local & 0x7FFFFFFF;
}


int signal()
{
int local = m_ec;

if ( local & 0x80000000 )
{
pthread_mutex_lock( &m_mutex );

while ( ! cas( &m_ec,
local,
( local + 1 ) & 0x7FFFFFFF )
{
local = m_ec
}

local = m_waiters
m_waiters = 0;

pthread_mutex_unlock( &m_mutex );

if ( local )
{
pthread_cond_broadcast( &m_cond );
}
}
}


int wait( int cmp )
{
int local = m_ec & 0x7FFFFFFF;

if ( local == cmp )
{
pthread_mutex_lock( &m_mutex );

local = m_ec & 0x7FFFFFFF;

if ( local == cmp )
{
++m_waiters;

pthread_cond_wait( &m_cond, &m_mutex );
}

pthread_mutex_unlock( &m_mutex );
}
}

};


Can be used as a simple signal for lock-free algorithms.


Joe Seigh

unread,
Mar 6, 2005, 8:27:56 AM3/6/05
to
On Sat, 5 Mar 2005 22:54:59 -0800, SenderX <x...@xxx.com> wrote:

> This is real fast in the signals without waiters case:

I was working on something somewhat similar but used a different
technique that turned out not to work. But anyway I know what
the issues are here. Some of the membars you need aren't so
nice.

>
>
> // pseudo-code
> class ec
> {
>
> pthread_mutex_t m_mutex;
> pthread_cond_t m_cond;
> int m_waiters;
> int m_ec;
>
>
> public:
>
> ec() : m_waiters( 0 ), m_ec( 0 )
> {
> // init mutex and cond
> }
>
>
> public:
>
> int get()
> {
> int local = m_ec;
> atomic_bitset( &m_ec, 0x80000000 );

You need a store/load + load/load rather than just a load/load that you normally
use when just reading an eventcount.

> return local & 0x7FFFFFFF;
> }
>
>
> int signal()
> {

You probably need a store/load here. The lock-free collection
you're signaling about probably has a release memory barrier but
it's in the wrong place. The more conventional method would be
a store/store before incrementing the count with a store/load
before checking for waiters so it's probably not a real hit
relatively speaking.

> int local = m_ec;
>
> if ( local & 0x80000000 )
> {
> pthread_mutex_lock( &m_mutex );
>
> while ( ! cas( &m_ec,
> local,
> ( local + 1 ) & 0x7FFFFFFF )
> {
> local = m_ec
> }
>
> local = m_waiters
> m_waiters = 0;
>
> pthread_mutex_unlock( &m_mutex );
>
> if ( local )
> {
> pthread_cond_broadcast( &m_cond );
> }
> }
> }
>
>
> int wait( int cmp )
> {

load/load here of course.

> int local = m_ec & 0x7FFFFFFF;
>
> if ( local == cmp )
> {
> pthread_mutex_lock( &m_mutex );
>
> local = m_ec & 0x7FFFFFFF;
>
> if ( local == cmp )
> {
> ++m_waiters;
>
> pthread_cond_wait( &m_cond, &m_mutex );
> }
>
> pthread_mutex_unlock( &m_mutex );
> }
> }
>
> };
>
>
> Can be used as a simple signal for lock-free algorithms.
>
>
>

How you structure everything depends on what the relative frequency
of the functions that you expect to used. You can shift the burden
on either the signaling or the waiting depending on which you think
will be more frequent. In this case you are trying to make the
signaling almost free while waiting has a cost with setting the
waiters bit. So you want to reduce that with a particular usage
paradigm. That was that double checked waiting thing I mentioned
previously. For example if you had a lock free queue, you use the
following logic

while ((item = queue_pop()) == NULL) {
wc = get(); // get current waitcount
if ((item = queue_pop()) != NULL)
break;
wait(wc); // wait for signal
}

If you go this way then the likelyhood that if you set the waiters bit
that you are going to wait is extremely high and you can dispense
with the waitcount as nearly redundant and just wasting cycles.

--
Joe Seigh

Joe Seigh

unread,
Mar 6, 2005, 9:19:35 AM3/6/05
to
On Sat, 5 Mar 2005 22:54:59 -0800, SenderX <x...@xxx.com> wrote:

Plus, I don't think you need to optimize wait in any case. It's already
going to be slow. So the compare of the eventcount outside of the locked
region can be removed. It's a pretty small window between the test of
the condition and the test of the eventcout. The eventcount is unlikely
to have been set in that amount of time so that test doesn't buy you
that much. The later test is necessary since you need to test it before
potentially blocking.

>
>
> int wait( int cmp )
> {
> int local = m_ec & 0x7FFFFFFF;
>
> if ( local == cmp )
> {
> pthread_mutex_lock( &m_mutex );
>
> local = m_ec & 0x7FFFFFFF;
>
> if ( local == cmp )
> {
> ++m_waiters;
>
> pthread_cond_wait( &m_cond, &m_mutex );
> }
>
> pthread_mutex_unlock( &m_mutex );
> }
> }
>
> };
>


--
Joe Seigh

Reply all
Reply to author
Forward
0 new messages