Question about synchronization on Intel x86

Genom Genom

Aug 3, 2011, 10:18:20 AM

to Scalable Synchronization Algorithms

Hello,

I recently asked about this in an earlier post but did not receive any
answers. If this duplicate post does not fit the rules, please do
whatever you think is best.

So, to summarize, I wrote this piece of code:

=====

#include <iostream>
#include <thread>
#include <assert.h>

volatile int x = 0;
volatile int y = 0;
int z = 0;

void store()
{
    y = 1;

    x = 1;
}

void load()
{
    while (!x)
        ;

    if (y == 1)
        z = 1;
}

int main()
{
    std::thread t1(store);
    std::thread t2(load);

    t1.join();
    t2.join();

    assert(z == 1);
}

=====

From what I've seen in this thread (
http://groups.google.com/group/lock-free/browse_thread/thread/a2d9d6fc5fa4ffa0
), the x86 architecture doesn't need fences for this to work.

Could someone explain why? I think I missed something.

Sincerely

Nicola Bonelli

Aug 3, 2011, 3:17:03 PM

to lock...@googlegroups.com

Hello,

The compiler does not reorder operations that involve volatile
variables with respect to other volatile variables. Note that when
mixing normal and volatile variables, memory fences are required
instead (compiler fences are enough on Intel arch, since x86 hardware
does not reorder stores with other stores or loads with other loads).

The writes to x and y cannot be rearranged because both are volatile.

This should explain why that assert does not fire on x86.
For proper synchronization between threads you should use <atomic>
anyway: volatile gives no inter-thread ordering guarantees in standard C++.
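
A minimal sketch of what an <atomic> version of the code above could look like, assuming release/acquire ordering on the flag is what is wanted. The names `run_message_passing`, `data`, `ready` and `seen` are illustrative only and do not come from the original code:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs one writer/reader pair and returns the value
// the reader observed in `data` after it saw the flag.
int run_message_passing()
{
    int data = 0;                    // plain payload (plays the role of y)
    std::atomic<bool> ready{false};  // flag (plays the role of x)
    int seen = -1;                   // written by the reader, read after join()

    std::thread writer([&] {
        data = 1;
        // Release: the write to `data` becomes visible to any thread that
        // reads `true` from `ready` with an acquire load.
        ready.store(true, std::memory_order_release);
    });
    std::thread reader([&] {
        while (!ready.load(std::memory_order_acquire))
            ;  // spin until the flag is set
        seen = data;
    });

    writer.join();
    reader.join();
    return seen;
}
```

Calling it as `assert(run_message_passing() == 1);` from `main` should hold on any conforming implementation, not only on x86, because the ordering comes from the release/acquire pair rather than from the hardware.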

Nicola

--
The difference between theory and practice is bigger in practice than in theory.

jeremy menetrier

Aug 4, 2011, 5:48:20 AM

to Scalable Synchronization Algorithms

Damn! That was pretty obvious...

Thank you for taking the time to answer me!

Regards

> >http://groups.google.com/group/lock-free/browse_thread/thread/a2d9d6f...

genom...@gmail.com

Aug 5, 2011, 5:36:18 PM

to lock...@googlegroups.com

Hello,

Instead of opening a new post for each basic question I may have, I think it is more appropriate to continue in this thread.

I'm continuing with the basics, now with Dekker's algorithm: http://www.justsoftwaresolutions.co.uk/threading/implementing_dekkers_algorithm_with_fences.html

For the moment, I'm studying the top part of the algorithm:

=======

void p0()
{
    flag0.store(true,std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    while (flag1.load(std::memory_order_relaxed))
    {
    }
}

=======

In Anthony's analysis, p0 starts first. Let's try a simulation:

p0 -> flag0.store(true,std::memory_order_relaxed); // p0.flag0 = true, p0.flag1 = false;

p0 -> std::atomic_thread_fence(std::memory_order_seq_cst); // flush store buffer / invalidate queues p0.flag0 = true, p0.flag1 = false;

////////

p1 -> flag1.store(true,std::memory_order_relaxed); // p1.flag0 = false, p1.flag1 = true;

p1 -> std::atomic_thread_fence(std::memory_order_seq_cst);

p1 -> while (flag0.load(std::memory_order_relaxed)) // p1.flag0 = true, p1.flag1 = true;

At this point, my understanding (with the help of the analysis) is that flag0's value is seen because of the preceding thread_fence. Is that right?

////////

p0 -> while (flag1.load(std::memory_order_relaxed)) // p0.flag0 = true, p0.flag1 = false;

Here is the part I do not fully understand: "On the other side, there is no such guarantee for the read from flag1 in p0, so p0 may or may not enter the while loop. If p0 reads the value of false for flag1, it will not enter the while loop, and will instead enter the critical section, but that is OK since p1 has entered the while loop."

With the thread_fence before it, shouldn't p0 be able to retrieve the correct version of flag1?

I've read "Memory Barriers: a Hardware View for Software Hackers" and understood it, but I think I need some hints.

Thank you !

Anthony Williams

Aug 8, 2011, 10:48:41 AM

to lock...@googlegroups.com

On 05/08/11 22:36, genom...@gmail.com wrote:
> I'm continuing with the basics, now with Dekker's algorithm:
> http://www.justsoftwaresolutions.co.uk/threading/implementing_dekkers_algorithm_with_fences.html
>
> For the moment, I'm studying the top part of the algorithm:
>
> =======
> void p0()
> {
>     flag0.store(true,std::memory_order_relaxed);
>     std::atomic_thread_fence(std::memory_order_seq_cst);
>     while (flag1.load(std::memory_order_relaxed))
>     {
>     }
> }
> =======
>
> In Anthony's analysis, p0 starts first. Let's try a simulation:
>
> p0 -> flag0.store(true,std::memory_order_relaxed); // p0.flag0 = true,
> p0.flag1 = false;
>
> p0 -> std::atomic_thread_fence(std::memory_order_seq_cst); // flush
> store buffer / invalidate queues p0.flag0 = true, p0.flag1 = false;
>
> ////////
>
> p1 -> flag1.store(true,std::memory_order_relaxed); // p1.flag0 = false,
> p1.flag1 = true;
>
> p1 -> std::atomic_thread_fence(std::memory_order_seq_cst);
>
> p1 -> while (flag0.load(std::memory_order_relaxed)) // p1.flag0 =
> true, p1.flag1 = true;
>
>
> At this point, my understanding (with the help of the analysis) is that
> flag0's value is seen because of the preceding thread_fence. Is that right?

Yes.

> ////////
>
> p0 -> while (flag1.load(std::memory_order_relaxed)) // p0.flag0 = true,
> p0.flag1 = false;
>
> Here is the part I do not fully understand: "On the other side, there
> is no such guarantee for the read from flag1 in p0, so p0 may or
> may not enter the while loop. If p0 reads the value of false for
> flag1, it will not enter the while loop, and will instead enter the
> critical section, but that is OK since p1 has entered the while loop."
>
> With the thread_fence before it, shouldn't p0 be able to retrieve the
> correct version of flag1?

No. The fence in p0 is before the fence in p1 in the SC ordering.
Therefore there is no guarantee that anything from p1 is visible in p0
at this point.
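
What the two seq_cst fences do guarantee is that p0 and p1 cannot both read false from the other's flag: whichever fence comes later in the SC ordering makes the earlier thread's store visible to the later thread's load. A rough sketch to check this, assuming only the entry part of each thread is exercised; `count_both_false`, `r0` and `r1` are hypothetical names used for illustration:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the entry protocol `rounds` times and counts the
// rounds in which BOTH threads read `false` from the other thread's flag.
int count_both_false(int rounds)
{
    int both_false = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<bool> flag0{false}, flag1{false};
        bool r0 = true, r1 = true;  // what p0 / p1 read from the other flag

        std::thread p0([&] {
            flag0.store(true, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r0 = flag1.load(std::memory_order_relaxed);
        });
        std::thread p1([&] {
            flag1.store(true, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = flag0.load(std::memory_order_relaxed);
        });
        p0.join();
        p1.join();

        // One thread reading false is allowed (its fence came first in the
        // SC ordering); both reading false would break mutual exclusion.
        if (!r0 && !r1)
            ++both_false;
    }
    return both_false;
}
```

With the fences in place, `assert(count_both_false(500) == 0);` should never fire. Rounds where exactly one of `r0` or `r1` is false correspond to the case discussed above, where the thread whose fence came first sees nothing from the other thread.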

Anthony
--
Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/
just::thread C++0x thread library http://www.stdthread.co.uk
Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976

genom...@gmail.com

Aug 8, 2011, 5:04:14 PM

to lock...@googlegroups.com

Hello Anthony,

Thank you for your post, and for your answer on your blog too.

Just to be sure I have understood everything:

In the example, we know p0's fence is before p1's fence. So p0 will go through the fence, but may not have any information on the state of p1's flag. Since p1's fence comes after p0's fence (as in your comment, p0 may have completed its cycle), p1 will surely know everything about the state of p0's flag.

Is that correct?

Sincerely

Anthony Williams

Aug 8, 2011, 6:27:25 PM

to lock...@googlegroups.com

On 08/08/11 22:04, genom...@gmail.com wrote:
> Thank you for your post, and for your answer on your blog too.

You're welcome.

> Just to be sure I have understood everything:
>
> In the example, we know p0's fence is before p1's fence. So p0 will go
> through the fence, but may not have any information on the state of p1's
> flag. Since p1's fence comes after p0's fence (as in your comment, p0 may
> have completed its cycle), p1 will surely know everything about the state
> of p0's flag.
>
> Is that correct?

Yes.
