
thread concurrency


Popping mad

Dec 21, 2016, 3:49:05 PM
to le...@nylxs.com
:(

I don't know. I'm very confused about the behavior of this test program I've been writing.
I'm trying to move data by worker threads from one blob of memory to another, and the debugger
is saying that the pointers beg and canvas_index are jumping 10 bytes between these two lines

for(int i = 0; i < 10; i++)
{
    t[i] = std::thread([this]{ readin(beg, canvas_index); });

and I'm not sure why. I lost the concurrency somewhere, but I can't seem to figure out what I did wrong.


#include <iostream>
#include <thread>
#include <mutex>

std::mutex medco;
std::mutex master;

namespace testing{
std::thread t[10];
class PIC
{
public:
    PIC():beg{&source[0]}
    {
        canvas_index = canvas;
        std::cout << "Testing Source" << std::endl;
        for(int i = 0; i < 100; i++)
        {
            std::cout << i << " " << source[i] << std::endl;
        }
        for(int i = 0; i < 10; i++)
        {
            t[i] = std::thread([this]{ readin(beg, canvas_index); });
            std::cerr << i << ": Making a thread" << std::endl;
            sync_canvas_and_input();
        }
    };

    void sync_canvas_and_input()
    {
        std::cout << "**LOCKING**" << std::endl;
        std::lock_guard<std::mutex> turn(medco);
        beg += 10;
        canvas_index += 10;
    }

    ~PIC()
    {
        std::cerr << "In the destructor" << std::endl;
        for(int i = 0; i < 10; i++)
        {
            t[i].join();
            std::cerr << i << ": Joining a thread" << std::endl;
        }
    };

    void readin(char * start, char * loc_canvas_index)
    {
        for( int i = 9; i >= 0; i-- )
        {
            *loc_canvas_index = start[i];
            std::cerr << i << ": Copy " << start[i] << std::endl;
            std::cerr << i << ": Copied to loc_canvas_index " << reinterpret_cast<char>(*loc_canvas_index) << std::endl;
            loc_canvas_index++;
        }

Paavo Helde

Dec 21, 2016, 4:40:09 PM
On 21.12.2016 22:48, Popping mad wrote:
> :(
>
> I don't know. I'm very confused about the behavior of this test program I've been writing.
> I'm trying to move data by worker threads from one blob of memory to another, and the debugger
> is saying that the pointers beg and canvas_index are jumping 10 bytes between these two lines
>
> for(int i = 0; i < 10;i++)
> {
> t[i] = std::thread([this]{ readin(beg, canvas_index); });
>
> and I'm not sure why. I lost the concurrency somewhere, but I can't seem to figure out what I did wrong.

Who knows. Debuggers sometimes also get confused and display wrong data.

Your example is incomplete (truncated?), cannot be compiled, and even the
types of beg and canvas_index are not known. So not much can be said
about the code.

From the visible code I can only infer that you have not understood why
and how to protect data with mutexes (the mutex is locked in one thread
only, and the data apparently protected by the lock (beg, canvas_index)
is not even accessed under the lock in the other threads).

Also, access to the shared data buffer (canvas) creates a lot of
false sharing, as the slice size of 10 is most probably not a multiple of
the cache line size, thus causing potentially significant performance
penalties (probably not important for your toy example, but worth
mentioning).
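
One way to avoid that is to give each worker a cache-line-sized slice.
A minimal untested sketch, assuming a 64-byte cache line (the names and
the constant here are made up, not from your code):

#include <cstddef>

constexpr std::size_t kCacheLine = 64;   // assumed cache line size

// Each worker owns one aligned slice, so a write by one thread never
// dirties a cache line that another thread is using.
struct alignas(kCacheLine) Slice
{
    char data[kCacheLine];
};

Slice canvas_slices[10];   // thread i touches only canvas_slices[i]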

ruben safir

Dec 21, 2016, 11:48:30 PM

How is this for a fix ;)


/*
* =====================================================================================
*
* Filename: test.cpp
*
* Description: Threading Experiment
*
* Version: 1.0
* Created: 12/18/2016 12:46:51 PM
* Revision: none
* Compiler: gcc
*
* Author: Ruben Safir (mn), ru...@mrbrklyn.com
* Company: NYLXS Inc
*
* =====================================================================================
*/

#include <iostream>
#include <thread>
#include <mutex>

std::mutex medco;
std::mutex master;

namespace testing{
std::thread t[10];
class PIC
{
public:
    PIC():source_index{&source[0]}
    {
        canvas_index = canvas;
        std::cout << "Testing Source" << std::endl;
        for(int i = 0; i < 100; i++)
        {
            std::cout << i << " " << source[i] << std::endl;
        }

        for(int i = 0; i < 10; i++)
        {
            t[i] = std::thread([this]{ readin(); });
            std::cerr << i << ": Making a thread" << std::endl;
        }
    }

    ~PIC()
    {
        std::cerr << "In the destructor" << std::endl;
        for(int i = 0; i < 10; i++)
        {
            t[i].join();
            std::cerr << i << ": Joining a thread" << std::endl;
        }
    }

    void readin()
    {
        char * loc_canvas_index;
        char * loc_source_index;
        sync_canvas_and_input(loc_canvas_index, loc_source_index);

        for( int i = 9; i >= 0; i-- )
        {
            *loc_canvas_index = loc_source_index[i];
            std::cerr << i << ": Copy " << loc_source_index[i] << std::endl;
            std::cerr << i << ": Copied to loc_canvas_index " << *loc_canvas_index << std::endl;
            loc_canvas_index++;
        }
    }

    void sync_canvas_and_input(char * &loc_canvas_index, char * &loc_source_index)
    {
        std::cout << "**LOCKING**" << std::endl;
        std::lock_guard<std::mutex> turn(medco);
        loc_canvas_index = canvas_index;
        loc_source_index = source_index;
        source_index += 10;
        canvas_index += 10;
    }

    char * get_canvas()
    {
        return canvas;
    }
    char * get_canvas_index()
    {
        return canvas_index;
    }
    char * get_source_index()
    {
        return source_index;
    }

private:
    char * canvas = new char[100]{
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z',
        'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z', 'z'
    };
    char * canvas_index;
    char source[100] = {
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'
    };
    char * source_index;
};
}//end namespace

int main(int argc, char** argv)
{
    testing::PIC fido;
    for(int i = 0; i < 100; i++)
    {
        std::cout << i << " Canvas Position " << fido.get_canvas()[i] << std::endl;
    }

    return 0;
}

ruben safir

Dec 21, 2016, 11:49:21 PM
On 12/21/2016 04:39 PM, Paavo Helde wrote:
> ... false sharing, as the slice size of 10 is most probably not a
> multiple of the cache line size, thus causing potentially significant
> performance penalties (probably not important for your toy example,
> but worth mentioning).


That is a VERY cool observation!!!

Thank You

Paavo Helde

Dec 22, 2016, 5:32:29 AM
On 22.12.2016 6:48, ruben safir wrote:
>
> How is this for a fix ;)

Yes, this makes a bit more sense; at least the mutex is really used for
mutual exclusion. A couple of remarks:

The functions get_canvas_index() and get_source_index() are lacking the
mutex lock. Any access to data which is concurrently modified needs a
mutex lock.

The function get_canvas() returns a pointer to the shared array without
any synchronization. This means race conditions. Instead, it should wait
on a std::condition_variable until all threads have fulfilled the task;
upon completion, each thread should increment a counter of completed
tasks under another mutex lock and notify the condition variable. (In
this toy example this waiting could be replaced by just joining all the
threads, but in real life you typically don't want to terminate your
service threads after each task.)
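
Something like this untested sketch of the waiting scheme (all names
here are made up):

#include <condition_variable>
#include <mutex>

std::mutex done_mutex;
std::condition_variable done_cv;
int tasks_done = 0;   // protected by done_mutex

// Each worker calls this once its slice is copied.
void task_finished()
{
    std::lock_guard<std::mutex> lock(done_mutex);
    ++tasks_done;
    done_cv.notify_one();
}

// The reader calls this before touching the canvas.
void wait_for_all(int task_count)
{
    std::unique_lock<std::mutex> lock(done_mutex);
    done_cv.wait(lock, [&]{ return tasks_done == task_count; });
}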

The mutex is defined in a different place (as a global!) from the
protected data. This is insane. The mutex should live together with the
protected data, preferably in its own section of private variables,
clearly documented:

private: // data members protected by medco
    std::mutex medco;
    char * canvas_index;
    char * source_index;

private: // other data
    // ...

The thread array t is also a global, needlessly, with no protection, etc.
As soon as you create another instance of PIC it gets garbled.
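
It could be a member instead; an untested sketch:

#include <array>
#include <thread>

class PIC
{
public:
    ~PIC()
    {
        for (std::thread& th : t)   // each instance joins its own workers
            if (th.joinable())
                th.join();
    }
private:
    std::array<std::thread, 10> t;  // per-instance, instead of a global array
};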

HTH
Paavo

Sal LO

Dec 22, 2016, 10:04:54 AM

https://youtu.be/Al9bmstjUjc?t=2s

ruben safir

Dec 22, 2016, 11:01:07 AM

>
> https://youtu.be/Al9bmstjUjc?t=2s
>

I'm deaf and can't hear videos, but thanks.

ruben safir

Dec 22, 2016, 12:13:38 PM
On 12/22/2016 05:32 AM, Paavo Helde wrote:
> The functions get_canvas_index() and get_source_index() are lacking the
> mutex lock. Any access to data which is concurrently modified needs a
> mutex lock.


Wouldn't that cause a deadlock condition?

Paavo Helde

Dec 22, 2016, 1:29:32 PM
Deadlock means that at least 2 threads wait for each other. In your
example there is only one function, called in a single thread only,
that ever waits for anything (the destructor, in the thread join
operations), so there is no possibility of deadlocks.

Or alternatively, if you look at the code of sync_canvas_and_input()
then you see that when it has locked the mutex it will release it again
a microsecond or so later, unconditionally. Mutex locks in
get_canvas_index() and get_source_index() are needed so that these
functions would not return incompatible or garbage information during
this microsecond. This is the whole point of "mutual exclusion".
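
For instance, an untested sketch of such a getter (hypothetical, added
as a member of your PIC class next to sync_canvas_and_input()), which
returns both indexes under one lock so a caller can never observe a
half-updated pair:

// needs #include <utility> for std::pair
std::pair<char*, char*> get_indexes()
{
    std::lock_guard<std::mutex> turn(medco);
    return { canvas_index, source_index };   // consistent snapshot
}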





Scott Lurndal

Dec 23, 2016, 9:31:17 AM
Paavo Helde <myfir...@osa.pri.ee> writes:
>On 22.12.2016 6:48, ruben safir wrote:
>>
>> How is this fix ;)
>
>Yes, this makes a bit more sense, at least the mutex is really used for
>mutual exclusion. A couple of remarks:
>
>The functions get_canvas_index() and get_source_index() are lacking the
>mutex lock. Any access to data which is concurrently modified needs a
>mutex lock.

Not necessarily - atomics are a viable alternative.
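
For a single concurrently modified variable, something like this
untested sketch (names made up):

#include <atomic>

std::atomic<char*> canvas_index{nullptr};  // instead of a mutex-protected char*

char* get_canvas_index()   // lock-free read
{
    return canvas_index.load(std::memory_order_acquire);
}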

Paavo Helde

Dec 23, 2016, 1:19:39 PM
Yes, atomics are fine in many situations, but not here, in the OP
example. The two index pointers are updated and read together by the
sync_canvas_and_input() function. If it were using 2 atomics instead,
the values returned would not be guaranteed to be in sync, which would
cause interesting bugs (swapped slices in the result).
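
A single atomic holding a shared offset would work, though; an untested
sketch (names made up) where each thread claims its slice with one
fetch_add, so the source and canvas positions can never get out of step:

#include <atomic>
#include <cstddef>

std::atomic<std::size_t> next_offset{0};

void readin(char* source, char* canvas)
{
    std::size_t off = next_offset.fetch_add(10);   // claim a 10-byte slice
    for (int i = 0; i < 10; ++i)
        canvas[off + i] = source[off + i];
}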

Scott Lurndal

Dec 23, 2016, 5:39:56 PM
Paavo Helde <myfir...@osa.pri.ee> writes:
>On 23.12.2016 16:31, Scott Lurndal wrote:
>> Paavo Helde <myfir...@osa.pri.ee> writes:

>>> Any access to data which is concurrently modified needs a
>>> mutex lock.
>>
>> Not necessarily - atomics are a viable alternative.
>
>Yes, atomics are fine in many situations, but not here, in the OP

You stated, as above, "Any access to data which is concurrently
modified needs a mutex lock". To which I replied as shown. I
made no claim about the OP's code.

Chris M. Thomasson

Dec 23, 2016, 6:19:49 PM
Agreed. False sharing is the root of all evil wrt
multi-threading/multi-processing. What good is a slick use of atomics if
the data structures are not properly padded out to, and allocated on,
properly aligned cache lines?

False sharing will totally ruin any chance of a possible speedup wrt
atomics.
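
E.g. an untested sketch of that padding (64 bytes assumed as the line
size; C++17 also offers std::hardware_destructive_interference_size):

#include <atomic>

struct alignas(64) PaddedCounter
{
    std::atomic<long> value{0};   // sizeof(PaddedCounter) rounds up to 64
};

PaddedCounter counters[10];       // one per worker, one cache line each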

Paavo Helde

Dec 23, 2016, 6:31:10 PM
On 24.12.2016 0:39, Scott Lurndal wrote:
> Paavo Helde <myfir...@osa.pri.ee> writes:
>> On 23.12.2016 16:31, Scott Lurndal wrote:
>>> Paavo Helde <myfir...@osa.pri.ee> writes:
>
>>>> Any access to data which is concurrently modified needs a
>>>> mutex lock.
>>>
>>> Not necessarily - atomics are a viable alternative.
>>
>> Yes, atomics are fine in many situations, but not here, in the OP
>
> You stated, as above, "Any access to data which is concurrently
> modified needs a mutex lock". To which I replied as shown. I
> made no claim about the OP's code.

Right, I should have worded my claims more carefully!

Merry Christmas!
p

