Dekker's Algorithm on atomics with memory ordering

mvor...@gmail.com

unread,

Feb 22, 2019, 8:37:28 AM2/22/19

to

Can you please review my implementation of Dekker's algorithm and tell me if my memory ordering on atomic operations is correct. I think it is but I need to double check...

#include <iostream>
#include <atomic>
#include <thread>
using namespace std;

const int COUNT = 10;

int main(int argc, char** argv)
{
atomic_bool f1{false}, f2{false};

thread t1([&]() {
for(int i = 0; i < COUNT;)
{
f1.store(true, memory_order_relaxed);
if(f2.load(memory_order_acquire) == false)
{
cout << "T1 in critical section" << endl;
f1.store(false, memory_order_relaxed);
++i;
}
else
{
f1.store(false, memory_order_relaxed);
this_thread::yield();
}
}
});

thread t2([&]() {
for(int i = 0; i < COUNT;)
{
f2.store(true, memory_order_relaxed);
if(f1.load(memory_order_acquire) == false)
{
cout << "T2 in critical section" << endl;
f2.store(false, memory_order_relaxed);
++i;
}
else
{
f2.store(false, memory_order_relaxed);
this_thread::yield();
}
}
});

t1.join();
t2.join();

return 1;
}

Chris M. Thomasson

unread,

Feb 22, 2019, 5:20:04 PM2/22/19

to

On 2/22/2019 5:37 AM, mvor...@gmail.com wrote:
> Can you please review my implementation of Dekker's algorithm and tell me if my memory ordering on atomic operations is correct. I think it is but I need to double check...

Are you taking a computer science class?

mvor...@gmail.com

unread,

Feb 22, 2019, 5:21:35 PM2/22/19

to

lol no; i'm studying in my own free time ;)

Chris M. Thomasson

unread,

Feb 22, 2019, 5:24:59 PM2/22/19

to

On 2/22/2019 5:37 AM, mvor...@gmail.com wrote:

> Can you please review my implementation of Dekker's algorithm and tell me if my memory ordering on atomic operations is correct. I think it is but I need to double check...

[...]

The reason why I ask, is because you asked about Dekker. Give Lamport's
mutual exclusion work a try. :^)

Chris M. Thomasson

unread,

Feb 22, 2019, 5:33:13 PM2/22/19

to

Okay. Well, keep in mind that Dekker's algorihtm requires MFENCE on x86.
It needs seq_cst. Keep in mind that an x86 can reorder a store followed
by a load to another location. ;^)

mvor...@gmail.com

unread,

Feb 22, 2019, 5:35:54 PM2/22/19

to

thank you. yes i realized the mistake and rewrote it with seq_cst.

Chris Vine

unread,

Feb 23, 2019, 6:46:42 AM2/23/19

to

On Fri, 22 Feb 2019 14:35:46 -0800 (PST)

mvor...@gmail.com wrote:
> On Friday, February 22, 2019 at 5:33:13 PM UTC-5, Chris M. Thomasson wrote:
> > On 2/22/2019 2:21 PM, mvor...@gmail.com wrote:
> > > On Friday, February 22, 2019 at 5:20:04 PM UTC-5, Chris M. Thomasson wrote:
> > >> On 2/22/2019 5:37 AM, mvor...@gmail.com wrote:
> > >>> Can you please review my implementation of Dekker's algorithm and tell me if my memory ordering on atomic operations is correct. I think it is but I need to double check...
> > >>
> > >> Are you taking a computer science class?
> > >
> > > lol no; i'm studying in my own free time ;)
> > >
> >
> > Okay. Well, keep in mind that Dekker's algorihtm requires MFENCE on x86.
> > It needs seq_cst. Keep in mind that an x86 can reorder a store followed
> > by a load to another location. ;^)
>
> thank you. yes i realized the mistake and rewrote it with seq_cst.

Dekker's algorithm is OK to learn for pedagogical purposes but don't
use it in real life. Even though sequential consistency is the default
in C++ for atomics, to avoid unnecessary fence instructions you should
where possible prefer acquire/release semantics, particularly on x86.

std::memory_order_seq_cst is needed where a thread must see the same
order of modification of two or more atomic variables, some of which it
might not modify, as does another thread. Two threads co-operating with
respect to a single atomic variable do not need it (nor where
co-operating with more than one atomic variable where program action
does not depend on their modification ordering). Most algorithms can
be (re)written to void cases of sequential consistency.

Chris M. Thomasson

unread,

Feb 23, 2019, 3:35:59 PM2/23/19

to

Dekker algorihtm basically needs seq_cst wrt its store followed by a
load to another location. It needs an explicit MFENCE or dummy LOCK'ed
atomic, even on x86.

Well, there is the more exotic asymmetric version, but that uses another
type of barrier that is not provided by C++11.

Chris M. Thomasson

unread,

Feb 23, 2019, 5:47:22 PM2/23/19

to

On 2/22/2019 5:37 AM, mvor...@gmail.com wrote:

> Can you please review my implementation of Dekker's algorithm and tell me if my memory ordering on atomic operations is correct. I think it is but I need to double check...
>
>
>
> #include <iostream>
> #include <atomic>
> #include <thread>
> using namespace std;
>
> const int COUNT = 10;
>
> int main(int argc, char** argv)
> {
> atomic_bool f1{false}, f2{false};

[...]

You are using a different algorihtm than Dekker. Afaict, you are
omitting the turn variable:

https://en.wikipedia.org/wiki/Dekker%27s_algorithm

Chris M. Thomasson

unread,

Feb 23, 2019, 6:15:33 PM2/23/19

to

On 2/22/2019 5:37 AM, mvor...@gmail.com wrote:

> Can you please review my implementation of Dekker's algorithm and tell me if my memory ordering on atomic operations is correct. I think it is but I need to double check...

[...]

Fwiw, I can only get Dekker's algorihtm to work with the following
membars. This is a Relacy test unit, take a deep look at ct_dekker:
__________________________________
// Dekker Test
// Relacy Test By Chris M. Thomasson
//_______________________________________________

//#define RL_DEBUGBREAK_ON_ASSERT
//#define RL_MSVC_OUTPUT
//#define RL_FORCE_SEQ_CST
//#define RL_GC

#include <relacy/relacy_std.hpp>
#include <iostream>

#define THREADS (2) // Only 2 threads will work here...

struct ct_dekker
{
std::atomic<bool> f1;
std::atomic<bool> f2;

ct_dekker() : f1(false), f2(false) {}

void p0_lock()
{
for (;;)
{
f1.store(true, std::memory_order_seq_cst);

if (f2.load(std::memory_order_seq_cst) == false)
{
break;
}

f1.store(false, std::memory_order_relaxed);
rl::yield(1, $);
}
}

void p0_unlock()
{
f1.store(false, std::memory_order_release);
}

void p1_lock()
{
for (;;)
{
f2.store(true, std::memory_order_seq_cst);

if (f1.load(std::memory_order_seq_cst) == false)
{
break;
}

f2.store(false, std::memory_order_relaxed);
rl::yield(1, $);
}
}

void p1_unlock()
{
f2.store(false, std::memory_order_release);
}
};

// Relacy Dekker
struct ct_dekker_test
: rl::test_suite<ct_dekker_test, THREADS>
{
VAR_T(long) g_shared_state;
ct_dekker g_dekker;

void before()
{
VAR(g_shared_state) = 0;
}

void after()
{

}

void thread(unsigned int tidx)
{
if (tidx < 1)
{
g_dekker.p0_lock();
++VAR(g_shared_state);
g_dekker.p0_unlock();
}

else
{
g_dekker.p1_lock();
++VAR(g_shared_state);
g_dekker.p1_unlock();
}
}
};

// Test away... Or fly? Humm...
int main()
{
{
rl::test_params p;

p.iteration_count = 10000000;
//p.execution_depth_limit = 33333;
//p.search_type = rl::sched_bound;
//p.search_type = rl::fair_full_search_scheduler_type;
//p.search_type = rl::fair_context_bound_scheduler_type;

rl::simulate<ct_dekker_test>(p);
}

return 0;
}
__________________________________