Tricky ...

Bonita Montero

Sep 20, 2021, 4:43:58 PM
In some code I have to check different states for which an exception might
be thrown. These states are all true or false, so I combined them into a
four-bit pattern which I use as an index into a table that stores the
strings of the out_of_range exception to be thrown, or a nullptr if the
state isn't associated with an out-of-range condition. This helps me
to avoid some if-else cascades and speeds up the code.
So this is the complete function, but I put two empty lines around the
code I mentioned.

void dual_monitor::wait( bool b )
{
    uint64_t cmp = m_flagAndCounters.load( memory_order_relaxed ), chg;
    uint32_t visitors, aWaiters, bWaiters, waiters;
    assert(m_tid == thread_id::self());
    m_tid = thread_id();
    do
    {
        assert(isOwned( cmp ) && !m_recCount);
        visitors = getVisitors( cmp );
        aWaiters = getAWaiters( cmp );
        // getBWaiters is assumed by symmetry with getAWaiters
        bWaiters = getBWaiters( cmp );
        assert(visitors >= aWaiters + bWaiters);
        waiters = !b ? aWaiters : bWaiters;

        // index bits, most significant first: b, maxVisitors, maxAWaiters, maxBWaiters
        static
        char const *const throwTable[] =
        {
            nullptr /* 0000 */, nullptr /* 0001 */, nullptr /* 0010 */,
            nullptr /* 0011 */, nullptr /* 0100 */, nullptr /* 0101 */,
            "monitor::wait() - number of visitors and a-waiters too large", // 0110
            "monitor::wait() - number of visitors and a-waiters too large", // 0111
            nullptr /* 1000 */, nullptr /* 1001 */, nullptr /* 1010 */,
            nullptr /* 1011 */, nullptr /* 1100 */,
            "monitor::wait() - number of visitors and b-waiters too large", // 1101
            nullptr, // 1110
            "monitor::wait() - number of visitors and b-waiters too large", // 1111
        };
        size_t maxVisitors = visitors == COUNTER_MASK,
               maxAWaiters = aWaiters == COUNTER_MASK,
               maxBWaiters = bWaiters == COUNTER_MASK;
        size_t throwIdx = (size_t)b << 3 | maxVisitors << 2 | maxAWaiters << 1
                        | maxBWaiters;
        if( throwTable[throwIdx] )
        {
            m_tid = thread_id::self();
            throw out_of_range( throwTable[throwIdx] );
        }

        if( visitors > waiters )
            // waiters < COUNTER_MASK because visitors > waiters
            // taking over visitor-counter from another thread
            chg = cmp + (!b ? WAITER_A_VALUE : WAITER_B_VALUE);
        else
            // visitors == aWaiters + bWaiters
            chg = (cmp & ~OWNER_MASK) + (!b ? WAITER_A_VALUE : WAITER_B_VALUE)
                + VISITOR_VALUE;
    } while( !m_flagAndCounters.compare_exchange_weak( cmp, chg,
                 memory_order_release, memory_order_relaxed ) );
    if( visitors > waiters )
    {
#if defined(_MSC_VER)
        while( !SetEvent( (HANDLE)m_xhEvtVisit ) );
#elif defined(__unix__)
        sysv_semaphore::sembuf semBuf;
        do
            semBuf( VISITOR_SEM, 1, 0 );
        while( m_sems( &semBuf, 1, false ) == -1 );
#endif
    }
#if defined(_MSC_VER)
    HANDLE ahWait[2] = { (HANDLE)m_xhEvtVisit,
                         (HANDLE)(!b ? m_xhSemWaitA : m_xhSemWaitB) };
    for( DWORD dwWait;
         (dwWait = WaitForMultipleObjects( 2, ahWait, TRUE, INFINITE )) < WAIT_OBJECT_0
             || dwWait > WAIT_OBJECT_0 + 1; );
#elif defined(__unix__)
    sysv_semaphore::sembuf semBufs[2];
    do
        semBufs[0]( VISITOR_SEM, -1, 0 ),
        semBufs[1]( !b ? WAITER_A_SEM : WAITER_B_SEM, -1, 0 );
    while( m_sems( semBufs, 2, false ) == -1 );
#endif
    m_tid = thread_id::self();
}
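
The table lookup in the middle of wait() can be tried out in isolation. The following is a minimal sketch, not the original class: overflowMessage and checkCounters are hypothetical helper names, the messages are shortened, and the four flags are passed in as plain bools instead of being derived from COUNTER_MASK.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: returns the message to throw for a combination of
// saturated counters, or nullptr if the combination is harmless.
inline char const *overflowMessage( bool b, bool maxVisitors,
                                    bool maxAWaiters, bool maxBWaiters )
{
    static char const *const A = "visitors and a-waiters too large";
    static char const *const B = "visitors and b-waiters too large";
    // index bits, most significant first: b, maxVisitors, maxAWaiters, maxBWaiters
    static char const *const table[16] =
    {
        nullptr, nullptr, nullptr, nullptr, // 0000 .. 0011
        nullptr, nullptr, A,       A,       // 0100 .. 0111
        nullptr, nullptr, nullptr, nullptr, // 1000 .. 1011
        nullptr, B,       nullptr, B        // 1100 .. 1111
    };
    std::size_t idx = std::size_t( b ) << 3 | std::size_t( maxVisitors ) << 2
                    | std::size_t( maxAWaiters ) << 1 | std::size_t( maxBWaiters );
    return table[idx];
}

// One table lookup replaces the if-else cascade.
inline void checkCounters( bool b, bool maxVisitors,
                           bool maxAWaiters, bool maxBWaiters )
{
    if( char const *msg = overflowMessage( b, maxVisitors, maxAWaiters, maxBWaiters ) )
        throw std::out_of_range( msg );
}
```

Whether this is faster than two or three branches depends on the compiler and the target; the sketch only shows that the sixteen states map onto the same four throwing combinations as in the posted table.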

Chris M. Thomasson

Sep 23, 2021, 1:39:10 PM
On 9/20/2021 1:43 PM, Bonita Montero wrote:
> In some code I have to check different states for which an exception might
> be thrown. These states are all true or false, so I combined them into a
> four-bit pattern which I use as an index into a table that stores the
> strings of the out_of_range exception to be thrown, or a nullptr if the
> state isn't associated with an out-of-range condition. This helps me
> to avoid some if-else cascades and speeds up the code.
> So this is the complete function, but I put two empty lines around the
> code I mentioned.
>
> void dual_monitor::wait( bool b )
[...]

Please, try it out in a race detector? Create several programs that use
your code in high load scenarios, and see what happens in the detector.
Do we all have the time to examine your "tricky" code and test it out
for ourselves? Well, not really. I am fairly busy with some fractal work
right now. Sorry Bonita.

Relacy is a good one, ThreadSanitizer is nice. My friend created Relacy,
iirc, he is also working on ThreadSanitizer.
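
One way to follow this suggestion is a small stress harness built with ThreadSanitizer enabled (for GCC or Clang, the -fsanitize=thread option). The sketch below is an assumption about what such a test could look like, not code from the thread: stressCounter is a hypothetical helper and std::mutex stands in for the monitor, since only wait() was posted. Any type with lock() and unlock() can be substituted.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Possible build line (file name is a placeholder):
//   g++ -std=c++17 -g -fsanitize=thread -pthread stress.cpp

// Hammers a plain, non-atomic counter from several threads. The counter is
// protected only by the lock under test, so a broken lock shows up as a
// wrong total and as a data-race report from the sanitizer.
template<typename Lock>
std::uint64_t stressCounter( Lock &lock, unsigned threads, unsigned iterations )
{
    std::uint64_t counter = 0;
    std::vector<std::thread> pool;
    for( unsigned t = 0; t != threads; ++t )
        pool.emplace_back( [&]
        {
            for( unsigned i = 0; i != iterations; ++i )
            {
                std::lock_guard<Lock> guard( lock );
                ++counter;
            }
        } );
    for( std::thread &th : pool )
        th.join();
    return counter;
}
```

With a correct lock the result equals threads * iterations. A passing run is not a proof of correctness, it only means this particular schedule did not expose a problem.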

Bonita Montero

Sep 23, 2021, 2:29:12 PM
On 23.09.2021 at 19:38, Chris M. Thomasson wrote:
> On 9/20/2021 1:43 PM, Bonita Montero wrote:
>> In some code I have to check different states for which an exception might
>> be thrown. These states are all true or false, so I combined them into a
>> four-bit pattern which I use as an index into a table that stores the
>> strings of the out_of_range exception to be thrown, or a nullptr if the
>> state isn't associated with an out-of-range condition. This helps me
>> to avoid some if-else cascades and speeds up the code.
>> So this is the complete function, but I put two empty lines around the
>> code I mentioned.
>>
>> void dual_monitor::wait( bool b )
> [...]
>
> Please, try it out in a race detector? Create several programs that use
> your code in high load scenarios, and see what happens in the detector.
> Do we all have the time to examine your "tricky" code and test it out
> for ourselves? Well, not really. I am fairly busy with some fractal work
> right now. Sorry Bonita.

Not necessary. My monitor works perfectly.
I've implemented something completely new: a wait_poll operation.
You simply supply a check lambda that checks whether there is valid state,
and my code repeatedly tries to take the lock and check whether the
condition is true. The number of spins is recalculated at each call
according to the recalculation pattern of glibc.

template<typename RetType, typename Predicate>
    requires requires( Predicate pred )
    {
        { pred() } -> std::same_as<std::pair<bool, RetType>>;
    }
RetType monitor::wait_poll( Predicate const &pred )
{
    using namespace std;
    using retpair_t = pair<bool, RetType>;
    // spin budget: twice the running estimate plus 10, capped
    uint32_t maxSpinT = (uint32_t)m_nextPollSpinCount * 2 + 10;
    uint16_t maxSpin = maxSpinT <= m_maxPollSpinCount ?
                 (uint16_t)maxSpinT : m_maxPollSpinCount,
             spinCount = 0;
    bool notify = false;
    uint64_t cmp = m_flagAndCounters.load( memory_order_relaxed );
    for( ; !notify && spinCount < maxSpin; ++spinCount )
        if( isOwned( cmp ) )
        {
            cpu_pause( PAUSE_ITERATIONS );
            cmp = m_flagAndCounters.load( memory_order_relaxed );
            continue;
        }
        else if( m_flagAndCounters.compare_exchange_weak( cmp, cmp | OWNER_MASK,
                     memory_order_acquire, memory_order_relaxed ) )
        {
            retpair_t ret = move( pred() );
            uint64_t chg;
            do
            {
                uint32_t visitors = getVisitors( cmp ),
                         waiters = getWaiters( cmp );
                assert(visitors >= waiters);
                chg = cmp & ~OWNER_MASK;
                if( notify = visitors > waiters )
                    chg -= VISITOR_VALUE;
            } while( !m_flagAndCounters.compare_exchange_weak( cmp, chg,
                         memory_order_relaxed, memory_order_relaxed ) );
            if( notify )
#if defined(_MSC_VER)
                while( !SetEvent( (HANDLE)m_xhEvtVisit ) );
#elif defined(__unix__)
                while( m_sems( { sembuf( VISITOR_SEM, 1, 0 ) }, false ) == -1 );
#endif
            if( !ret.first )
            {
                if( !notify )
                    continue;
                else
                    break;
            }
            if( !notify )
                m_nextPollSpinCount += (int16_t)(((int32_t)spinCount -
                    (int32_t)m_nextPollSpinCount) / 8);
            return move( ret.second );
        }
        else
            cpu_pause( PAUSE_ITERATIONS );
    if( !notify )
        m_nextPollSpinCount += (int16_t)(((int32_t)spinCount -
            (int32_t)m_nextPollSpinCount) / 8);
    // spinning did not succeed: fall back to the blocking path
    lock();
    try
    {
        for( ; ; )
        {
            retpair_t ret = move( pred() );
            if( ret.first )
            {
                unlock();
                return move( ret.second );
            }
            wait();
        }
    }
    catch( ... )
    {
        unlock();
        throw;
    }
}
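
The spin-count recalculation in wait_poll can be looked at separately from the locking. A minimal sketch, assuming a 16-bit signed estimate as the casts in the posted code suggest; SpinEstimator and its member names are hypothetical, and the cap of 1000 is an arbitrary placeholder:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone version of the spin-count bookkeeping.
struct SpinEstimator
{
    std::int16_t next = 0;        // smoothed estimate of the spins needed
    std::int16_t maxSpins = 1000; // hard cap, placeholder value

    // Budget for the coming attempt: twice the estimate plus a constant,
    // never more than the cap.
    std::int32_t budget() const
    {
        return std::min<std::int32_t>( std::int32_t( next ) * 2 + 10, maxSpins );
    }

    // Move the estimate one eighth of the way towards the observed count.
    void update( std::int32_t spinsUsed )
    {
        next += (std::int16_t)((spinsUsed - next) / 8);
    }
};
```

Starting from zero, an observed count of 80 moves the estimate to 10 and the next budget to 30, so a single outlier changes the budget only moderately.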

Bonita Montero

Sep 23, 2021, 2:30:39 PM
Oh, I just thought that those two moves can be dropped since they're
implicit.

Chris M. Thomasson

Sep 24, 2021, 9:48:34 PM
On 9/23/2021 11:29 AM, Bonita Montero wrote:
> while( !SetEvent( (HANDLE)m_xhEvtVisit ) );

Humm....

Chris M. Thomasson

Sep 24, 2021, 9:56:30 PM
For some damn reason, this reminds me of some broken condvar impls that
used the good ol' PulseEvent:

https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-pulseevent

A quote: "This function is unreliable and should not be used. It exists
mainly for backward compatibility. For more information, see Remarks."

Bonita Montero

Sep 25, 2021, 1:24:22 AM
What I do is reliable.

Branimir Maksimovic

Sep 25, 2021, 8:28:57 AM
This is because you SAY SO?
>


--
7-77-777
Evil Sinner!

Bonita Montero

Sep 25, 2021, 9:58:05 AM
Inspect my code and say why it is not instead of making
arbitrary assumptions.

Branimir Maksimovic

Sep 25, 2021, 9:59:13 AM
Show your code, I missed it :(


--
7-77-777
Evil Sinner!

Bonita Montero

Sep 25, 2021, 10:09:13 AM
On 25.09.2021 at 15:59, Branimir Maksimovic wrote:
> On 2021-09-25, Bonita Montero <Bonita....@gmail.com> wrote:
>>>> What I do is reliable.
>>>
>>> This is because you SAY SO?
>>
>> Inspect my code and say why it is not instead of making
>> arbitrary assumptions.
>>
> Show your code, I missed it :(

I showed my wait_poll function. It's something completely new that improves
most of the usual monitor handling, typically in producer-consumer patterns
where the mutex part of the monitor is held only very briefly.

Branimir Maksimovic

Sep 25, 2021, 1:13:37 PM
And you use unreliable function for that?
--

7-77-777
Evil Sinner!

Bonita Montero

Sep 25, 2021, 1:15:47 PM
On 25.09.2021 at 19:13, Branimir Maksimovic wrote:
> On 2021-09-25, Bonita Montero <Bonita....@gmail.com> wrote:
>> On 25.09.2021 at 15:59, Branimir Maksimovic wrote:
>>> On 2021-09-25, Bonita Montero <Bonita....@gmail.com> wrote:
>>>>>> What I do is reliable.
>>>>>
>>>>> This is because you SAY SO?
>>>>
>>>> Inspect my code and say why it is not instead of making
>>>> arbitrary assumptions.
>>>>
>>> Show your code, I missed it :(
>>
>> I showed my wait_poll function. It's something completely new that improves
>> most of the usual monitor handling, typically in producer-consumer patterns
>> where the mutex part of the monitor is held only very briefly.
>
> And you use unreliable function for that?

Show me where my code is unreliable.
I think you're simply a poor troll.

Branimir Maksimovic

Sep 25, 2021, 1:22:15 PM
I am just asking because you are building something
on shaky ground. Why do you think it is reliable?
Simple question.

--

7-77-777
Evil Sinner!

Chris M. Thomasson

Sep 25, 2021, 4:34:16 PM
What's up with the:

while( !SetEvent( (HANDLE)m_xhEvtVisit ) );

Shouldn't you use GetLastError to find out why SetEvent failed?

Also, you are using:

cpu_pause( PAUSE_ITERATIONS );

In several places? Reeks of a spinlock. Some kind of backoff?

Bonita Montero

Sep 25, 2021, 11:46:38 PM
Because the code is simple.

Bonita Montero

Sep 25, 2021, 11:49:06 PM
On 25.09.2021 at 22:34, Chris M. Thomasson wrote:
> On 9/24/2021 10:24 PM, Bonita Montero wrote:
>> On 25.09.2021 at 03:56, Chris M. Thomasson wrote:
>>> On 9/24/2021 6:48 PM, Chris M. Thomasson wrote:
>>>> On 9/23/2021 11:29 AM, Bonita Montero wrote:
>>>>> while( !SetEvent( (HANDLE)m_xhEvtVisit ) );
>>>>
>>>> Humm....
>>>
>>> For some damn reason, this reminds me of some broken condvar impls
>>> that used the good ol' PulseEvent:
>>>
>>> https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-pulseevent
>>>
>>>
>>> A quote: "This function is unreliable and should not be used. It
>>> exists mainly for backward compatibility. For more information, see
>>> Remarks."
>>
>> What I do is reliable.
>>
>
> What's up with the:
>
> while( !SetEvent( (HANDLE)m_xhEvtVisit ) );
>
> Shouldn't you use GetLastError to find out why SetEvent failed?

No, once I have incremented the waiter or visitor count I can't set it back,
because it might already have been reduced by someone awakening me. So I
have to spin or terminate the program. But this doesn't matter, since
SetEvent fails under conditions where the machine has almost no resources.

Chris M. Thomasson

Sep 27, 2021, 3:18:38 AM
On 9/20/2021 1:43 PM, Bonita Montero wrote:
[...]

Humm... For some reason I feel the need for std::memory_order_acq_rel
here wrt the cas. Humm... I need to port your algorithm over to a form
that Relacy can understand. It's been a while! I just got a strange
feeling. Waiting usually implies acquire semantics. Humm... Sorry, need to
examine it further, and port it over. Then run it in certain scenarios.
Relacy has the capability to crack it wide open if there are any issues.

I should have some time later on tomorrow. I am busy with my fractal
software right now:

https://fractalforums.org/gallery/1612-270921004032.jpeg

http://siggrapharts.ning.com/photo/alien-anatomy

lol. ;^)

Bonita Montero

Sep 27, 2021, 8:44:44 AM
> here wrt the cas. ...

No, you only write nonsense. When I wait the lock is released,
so it's release-consistency.

You always write nonsense.

red floyd

Sep 27, 2021, 1:54:47 PM
On 9/27/2021 5:44 AM, Bonita Montero wrote:
[redacted]
>
> No, you only write nonsense. When I wait the lock is released,
> so it's release-consistency.
>
> You always write nonsense.

If you're going to ask for opinions and then reject any criticism
as nonsense, then why the heck are you even bothering to post your
code?


Chris M. Thomasson

Sep 27, 2021, 5:06:39 PM
On 9/27/2021 5:44 AM, Bonita Montero wrote:
> On 27.09.2021 at 09:18, Chris M. Thomasson wrote:
>> On 9/20/2021 1:43 PM, Bonita Montero wrote:
>>> In some code I have to check different states for which an exception might
>>> be thrown. These states are all true or false, so I combined them into a
>>> four-bit pattern which I use as an index into a table that stores the
>>> strings of the out_of_range exception to be thrown, or a nullptr if the
>>> state isn't associated with an out-of-range condition. This helps me
>>> to avoid some if-else cascades and speeds up the code.
>>> So this is the complete function, but I put two empty lines around the
>>> code I mentioned.
>>>
>>> void dual_monitor::wait( bool b )
[...]
>>>      } while( !m_flagAndCounters.compare_exchange_weak( cmp, chg,
>>> memory_order_release, memory_order_relaxed ) );
>> [...]
>>
>> Humm... For some reason I feel the need for std::memory_order_acq_rel
>> here wrt the cas. ...
>
> No, you only write nonsense. When I wait the lock is released,
> so it's release-consistency.
>
> You always write nonsense.

Decrementing a semaphore requires acquire semantics. Incrementing a
semaphore requires release semantics. Trying to do both at once in a
single atomic operation requires acquire/release semantics.

I still need to port it to Relacy, but it seems like you need acq_rel here.
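
The rule stated here for semaphores can be written down as a small spinning semaphore. This is a generic sketch of that rule, not a port of dual_monitor; SpinSemaphore and its member names are hypothetical, and it yields instead of blocking in the kernel.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Minimal counting semaphore illustrating where the orderings go.
class SpinSemaphore
{
public:
    explicit SpinSemaphore( unsigned initial = 0 ) : m_count( initial ) {}

    // Increment: everything written before post() must be visible to the
    // thread that later consumes this unit, hence release.
    void post()
    {
        m_count.fetch_add( 1, std::memory_order_release );
    }

    // Decrement: on success the caller must see what the poster wrote,
    // hence acquire. A failed attempt synchronizes with nothing.
    bool try_wait()
    {
        unsigned cur = m_count.load( std::memory_order_relaxed );
        while( cur != 0 )
            if( m_count.compare_exchange_weak( cur, cur - 1,
                    std::memory_order_acquire, std::memory_order_relaxed ) )
                return true;
        return false;
    }

    void wait()
    {
        while( !try_wait() )
            std::this_thread::yield();
    }

private:
    std::atomic<unsigned> m_count;
};
```

An operation that gives up one resource and takes another in the same atomic step would combine both roles, which is the case for which std::memory_order_acq_rel is being suggested above.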

Chris M. Thomasson

Sep 27, 2021, 5:10:29 PM
Yeah, no shi%. Wow. Fwiw, it's been a while since I have worked on such
things. I just need to port Bonita's code over to Relacy, and give it a
go in the simulator. It can find obscure memory order issues pretty damn
fast. The problem is that I need to find the time. I mean, Bonita is not
paying me. ;^)

Chris M. Thomasson

Sep 27, 2021, 7:09:37 PM
Think about it... Waiting for something, implies acquire semantics.

Bonita Montero

Sep 28, 2021, 12:56:08 AM
On 27.09.2021 at 23:06, Chris M. Thomasson wrote:
> On 9/27/2021 5:44 AM, Bonita Montero wrote:
>> On 27.09.2021 at 09:18, Chris M. Thomasson wrote:
>>> On 9/20/2021 1:43 PM, Bonita Montero wrote:
>>>> In some code I have to check different states for which an exception might
>>>> be thrown. These states are all true or false, so I combined them into a
>>>> four-bit pattern which I use as an index into a table that stores the
>>>> strings of the out_of_range exception to be thrown, or a nullptr if the
>>>> state isn't associated with an out-of-range condition. This helps me
>>>> to avoid some if-else cascades and speeds up the code.
>>>> So this is the complete function, but I put two empty lines around the
>>>> code I mentioned.
>>>>
>>>> void dual_monitor::wait( bool b )
> [...]
>>>>      } while( !m_flagAndCounters.compare_exchange_weak( cmp, chg,
>>>> memory_order_release, memory_order_relaxed ) );
>>> [...]
>>>
>>> Humm... For some reason I feel the need for std::memory_order_acq_rel
>>> here wrt the cas. ...
>>
>> No, you only write nonsense. When I wait the lock is released,
>> so it's release-consistency.
>>
>> You always write nonsense.
>
> Decrementing a semaphore requires acquire semantics.

No, I could have modified data before, so I need release semantics.
You are such a n00b.

Bonita Montero

Sep 28, 2021, 12:57:43 AM
No, when I release a lock and could have modified data,
I need release-semantics.

Chris M. Thomasson

Sep 28, 2021, 1:27:44 AM
For some reason when I saw the word "wait", I think of acquire. Are you
sure you know all about memory barriers? I have worked with them in the
past.

Chris M. Thomasson

Sep 28, 2021, 1:28:40 AM
So, how can one utilize dual_monitor::wait? Does it wait on something?

Chris M. Thomasson

Sep 28, 2021, 1:47:06 AM
Terminate? Humm... If there are conditions arising in the system in
which SetEvent fails, you go into infinite loop here, right? Try to
handle this scenario. Find out why it failed, GetLastError.

Chris M. Thomasson

Sep 28, 2021, 1:49:47 AM
Waiting on something is acquire by nature. Show me where it's not?
Actually, show me where to place the acquire and release barriers in a
semaphore? Can you do it? Should be a piece of cake, right?

Bonita Montero

Sep 28, 2021, 2:33:18 AM
> sure you know all about memory barriers? ...

Everything necessary here.

Bonita Montero

Sep 28, 2021, 2:34:06 AM
It's like waiting on a condvar, where the waiter might have modified any
data beforehand.

Bonita Montero

Sep 28, 2021, 2:34:57 AM
If the system doesn't have the resources to satisfy a simple WaitForSingle...,
spinning is quite o.k.

Chris M. Thomasson

Sep 28, 2021, 3:03:47 AM
I was thinking along those lines. However, waiting on a condvar involves
several steps. It also involves reacquiring the mutex...

Chris M. Thomasson

Sep 28, 2021, 3:05:47 AM
Spinning indeed. However, you have no backoff. You are spinning full speed!


while( !SetEvent( (HANDLE)m_xhEvtVisit ) );

Oh shit. When shit hits the fan, so does this! Try to not bleed the
system when it's in dire straits?
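
A retry loop with backoff, as asked for here, could look like the following. This is a portable sketch under assumptions: retryWithBackoff is a hypothetical helper, the operation is any callable returning true on success, and the start delay and cap are arbitrary placeholder values.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <thread>

// Retries op() until it reports success, sleeping between attempts and
// doubling the delay up to a cap. Returns the number of attempts made.
template<typename Op>
unsigned retryWithBackoff( Op op,
    std::chrono::microseconds first = std::chrono::microseconds( 1 ),
    std::chrono::microseconds cap = std::chrono::milliseconds( 10 ) )
{
    unsigned attempts = 1;
    for( std::chrono::microseconds delay = first; !op(); ++attempts )
    {
        std::this_thread::sleep_for( delay );
        delay = std::min<std::chrono::microseconds>( delay * 2, cap );
    }
    return attempts;
}
```

On Windows the call could look like retryWithBackoff( [&] { return SetEvent( hEvent ) != FALSE; } ), with hEvent being whatever handle the caller owns. The loop still never gives up, which matches the constraint stated earlier in the thread that the counter cannot be rolled back; it only stops burning a full core while the system is short of resources.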

Chris M. Thomasson

Sep 28, 2021, 3:06:39 AM
For sure?

Bonita Montero

Sep 28, 2021, 6:05:43 AM
With what I do the reacquisition is done by the side that unlocks the
mutex. It simply keeps the locked-flag while unlocking and sets the
visitor-event.

Bonita Montero

Sep 28, 2021, 6:06:46 AM
That doesn't matter under such conditions.
Am I talking to a complete moron here?

David Brown

Sep 28, 2021, 10:06:07 AM
Chris, why do you keep trying to communicate with Bonita? She is
clearly so convinced of her own superiority to everyone else that it is
pointless. She does not post code for people to check, she posts it
because she thinks she is showing off and impressing people. This kind
of code is difficult to get right, and any capable developer is going to
want to discuss it with people to check the details. Since Bonita
responds with nothing but insults and arrogance, I think it is safe to
assume her code is flawed and leave her with it.

Bonita Montero

Sep 28, 2021, 12:46:59 PM
Chris doesn't understand my code. He doesn't even understand that before
any wait operation you use atomic operations with release consistency,
and afterwards you use atomic operations with acquire consistency. He is
totally confused.
And with the above code: once a visitor has incremented the visitor
counter, it can't decrement it again, because the counter might already
have been decremented by the code wanting to wake up a visitor. So under
extreme conditions where the system doesn't have the resources to satisfy
even such a simple operation, spinning is quite o.k. until the conditions
become better. The system isn't usable anyway then.
It's simply frustrating to explain such simple things. Chris is simply
a n00b who lacks basic knowledge.

Chris M. Thomasson

Sep 28, 2021, 5:02:57 PM
Ummm... WOW. You are confused, Big time. I already asked you to show me
exactly where to place the memory barriers necessary to implement a
semaphore. Can you do it? I can. Show me... I am interested to see if
you can do it. From what you are writing, I don't think you can. Oh
well. Shit happens.


> And with the above code: once a visitor has incremented the visitor
> counter, it can't decrement it again, because the counter might already
> have been decremented by the code wanting to wake up a visitor. So under
> extreme conditions where the system doesn't have the resources to satisfy
> even such a simple operation, spinning is quite o.k. until the conditions
> become better. The system isn't usable anyway then.
> It's simply frustrating to explain such simple things. Chris is simply
> a n00b who lacks basic knowledge.

lol. Humm... Perhaps I should just leave you with it. You refuse to run
it through a race detector... Why? I was going to port it over to
Relacy, but now... Argh. Screw it. You do it.

Chris M. Thomasson

Sep 28, 2021, 5:42:18 PM
Yeah. Damn. She does not seem to know a whole lot about memory barriers,
those tricky bastards! ;^) I was trying to help her. Then I get spit on.
I asked her a simple question: Where do the memory barriers go for a
semaphore. So far, no answer. She is confused. Calling me a newb is
strange because I have been working with these types of things for decades.

I would love to see Bonita program a SPARC in RMO mode. That would be
amusing... Actually Sun gave me a SunFire T2000 after winning the first
round in their CoolThreads contest. My project was called vZOOM. That
damn server sounded like several vacuum cleaners running when I was
working with it.

I wonder if she knows what a #LoadStore | #LoadLoad barrier is. That
MEMBAR instruction was fun on the SPARC. ;^)

So, I will just shut up, and let her do her thing.

Oh well.

red floyd

Sep 28, 2021, 8:05:18 PM
On 9/28/2021 2:42 PM, Chris M. Thomasson wrote:

> working with it.
>
> I wonder if she knows what a #LoadStore | #LoadLoad barrier is. That
> MEMBAR instruction was fun on the SPARC. ;^)
>

Not quite a membar, but on PowerPC, the EIEIO instruction was also fun.

Chris M. Thomasson

Sep 28, 2021, 10:28:48 PM
Yup. Iirc, it was for IO. It's been a while since I programmed a PPC. I
wrote about some of the pitfalls of using LL/SC on it a while back on
comp.arch:

https://groups.google.com/g/comp.arch/c/yREvvvKvr6k/m/nRZ5tpLwDNQJ

Wow, this was way back in 2005! Jeeze! ;^o

Branimir Maksimovic

Sep 28, 2021, 10:32:01 PM
semaphore wait then acquire

--

7-77-777
Evil Sinner!

Branimir Maksimovic

Sep 28, 2021, 10:32:51 PM
She is clever, but newb. Forgive her for that.

--

7-77-777
Evil Sinner!

Chris M. Thomasson

Sep 28, 2021, 11:15:11 PM
Yup! You got it. I like how C++ has standalone membars via
std::atomic_thread_fence. Makes me reminisce about the SPARC where all
atomic ops are naked, or relaxed in C++ terms. An acquire on the SPARC
was MEMBAR #LoadStore | #LoadLoad, ah the good ol' days. ;^)
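
The standalone fences mentioned here can be shown with a one-slot mailbox in which all atomic operations are relaxed and the ordering comes only from std::atomic_thread_fence. A minimal sketch; Mailbox and its member names are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// One-shot message passing with relaxed atomics plus standalone fences.
struct Mailbox
{
    int payload = 0;                  // plain data, ordered by the fences
    std::atomic<bool> ready{ false };

    // Producer: the release fence keeps the payload write ahead of the
    // relaxed flag store.
    void publish( int value )
    {
        payload = value;
        std::atomic_thread_fence( std::memory_order_release );
        ready.store( true, std::memory_order_relaxed );
    }

    // Consumer: the acquire fence after the relaxed flag load keeps the
    // payload read behind it.
    bool tryConsume( int &out )
    {
        if( !ready.load( std::memory_order_relaxed ) )
            return false;
        std::atomic_thread_fence( std::memory_order_acquire );
        out = payload;
        return true;
    }
};
```

The acquire fence sits after the load, in the same position as the MEMBAR #LoadStore | #LoadLoad described above.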

David Brown

Sep 29, 2021, 2:35:35 AM
On 29/09/2021 02:05, red floyd wrote:
> On 9/28/2021 2:42 PM, Chris M. Thomasson wrote:
>
>> working with it.
>>
>> I wonder if she knows what a #LoadStore | #LoadLoad barrier is. That
>> MEMBAR instruction was fun on the SPARC. ;^)
>>

The most challenging architecture, AFAIK, was the Alpha. (I have no
experience of it myself.)

>
> Not quite a membar, but on PowerPC, the EIEIO instruction was also fun.
>

It is certainly the best named instruction around!

It was useful on early PPC microcontroller cores, to avoid unwanted
reordering or buffering of accesses to hardware registers. Later cores
had an MPU that supported setting up memory areas for direct unbuffered
accesses, which is a more convenient and safer method.


Bonita Montero

Sep 29, 2021, 6:41:26 AM
On 28.09.2021 at 23:42, Chris M. Thomasson wrote:

> Yeah. Damn. She does not seem to know a whole lot about memory barriers,

I use them a lot and correctly.

Bonita Montero

Sep 29, 2021, 6:42:39 AM
On 29.09.2021 at 08:35, David Brown wrote:

> The most challenging architecture, AFAIK, was the Alpha.

The Alpha isn't challenging just because it has the most relaxed memory
ordering of all CPUs: if there were a C++11 compiler for the Alpha, you
would use conventional membars and the code would execute virtually like
on every other CPU.

Chris M. Thomasson

Sep 29, 2021, 3:04:41 PM
Oh my. You cannot even implement the read side of RCU without a damn
membar on an Alpha! You don't think that is challenging at all? I don't
think you ever implemented RCU. Also, have you implemented SMR? That
actually requires an explicit membar on an x86! Joe Seigh came up with a
way to combine RCU and SMR to get rid of the membar by putting SMR in a
read side RCU critical section. It improved the performance of loading a
SMR pointer by orders of magnitude. Well done Joe.

:^)

Chris M. Thomasson

Sep 29, 2021, 3:05:25 PM
A wait on a semaphore does not need release semantics. It only needs
acquire.

Chris M. Thomasson

Sep 29, 2021, 3:05:40 PM
She is clever.

Chris M. Thomasson

Sep 29, 2021, 3:10:11 PM
On 9/20/2021 1:43 PM, Bonita Montero wrote:
> In some code I have to check different states for which an exception might
> be thrown. These states are all true or false, so I combined them into a
> four-bit pattern which I use as an index into a table that stores the
> strings of the out_of_range exception to be thrown, or a nullptr if the
> state isn't associated with an out-of-range condition. This helps me
> to avoid some if-else cascades and speeds up the code.
> So this is the complete function, but I put two empty lines around the
> code I mentioned.
>
> void dual_monitor::wait( bool b )
[...]

Have you posted the whole dual_monitor code? If you did, I missed it.
Sorry.

Bonita Montero

Sep 29, 2021, 9:34:02 PM
> A wait on a semaphore does not need release semantics. ...

It needs them because I could have modified data before releasing the
lock, which happens at the same time as I start waiting.

Chris M. Thomasson

Sep 29, 2021, 9:44:14 PM
So, it needs acquire release, right?

Chris M. Thomasson

Sep 29, 2021, 10:47:16 PM
On 9/29/2021 6:33 PM, Bonita Montero wrote:
Have you posted all of your code for dual_monitor? Or just
dual_monitor::wait?

Chris M. Thomasson

Sep 30, 2021, 12:57:35 AM
Where are your example use cases for dual_monitor? I cannot find all of
the code for dual_monitor anyway. So, it's difficult for me to port to a
simulator and use... Just make a new thread, with an example program
that uses dual_monitor.

Bonita Montero

Sep 30, 2021, 2:16:54 AM
> Where are your example use cases for dual_monitor? ...

It's usable when two threads communicate bidirectionally.

David Brown

Sep 30, 2021, 3:11:00 AM
If you release a lock, you need release semantics. If you acquire a
lock, you need acquire semantics.
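
The simplest instance of this rule is a test-and-set spinlock. A minimal sketch with a hypothetical class name, unrelated to the monitor code posted earlier:

```cpp
#include <atomic>
#include <cassert>

// Test-and-set spinlock: acquire when taking the lock, release when
// giving it up.
class SpinLock
{
public:
    // Acquire: accesses inside the critical section cannot move above
    // the point where the lock was taken.
    bool try_lock()
    {
        return !m_flag.test_and_set( std::memory_order_acquire );
    }

    void lock()
    {
        while( !try_lock() )
            ; // busy-wait; real code would pause or yield here
    }

    // Release: accesses inside the critical section cannot move below
    // the point where the lock is given up.
    void unlock()
    {
        m_flag.clear( std::memory_order_release );
    }

private:
    std::atomic_flag m_flag = ATOMIC_FLAG_INIT;
};
```

A wait operation on a monitor does both: it releases the lock on entry and acquires it again before returning, which is why both orderings come up in this thread.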

There is a lot of fiddly stuff in getting memory barriers and
synchronisation right - and a lot of /really/ tough stuff in getting it
right and optimally efficient on big processors. But "acquire" and
"release" semantics are helpfully named!

Chris M. Thomasson

Sep 30, 2021, 4:19:40 AM
That's a bit vague. One can create highly efficient bidirectional
communication between two threads using two wait-free
single-producer/single-consumer queues without using any atomic RMW's,
just atomic loads, stores, and some cleverly placed membars. Now,
if one needs to be able to wait on them, well that's a different story.
This makes me think about eventcounts; I have plenty of experience with
those. God I am getting older. Actually, I am wondering if you happen to
be familiar with the good ol' two lock queue? I first saw it decades ago:

https://www.cs.rochester.edu/~scott/papers/1996_PODC_queues.pdf

A classic! The queues in that paper are well known in lock-free
circles... Beware... There are subtle memory lifetime issues that can
occur in the lock-free version. Basically, the same horror show that can
occur in a lock-free stack. It's not just ABA, it's making sure that the
memory is valid for a certain operation. Iirc, Windows does something
funky with SEH to get this to work in their SList API.

;^)
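
The single-producer/single-consumer queue described at the start of this message can be sketched as a bounded ring buffer that uses only atomic loads and stores, no read-modify-write operations. SpscQueue and its member names are hypothetical; one such queue per direction would give the bidirectional channel.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Bounded wait-free queue for exactly one producer and one consumer.
// One slot stays empty, so the capacity is N - 1.
template<typename T, std::size_t N>
class SpscQueue
{
public:
    bool push( T const &value ) // producer thread only
    {
        std::size_t head = m_head.load( std::memory_order_relaxed );
        std::size_t next = (head + 1) % N;
        if( next == m_tail.load( std::memory_order_acquire ) )
            return false; // full
        m_slots[head] = value;
        m_head.store( next, std::memory_order_release ); // publish the slot
        return true;
    }

    bool pop( T &out ) // consumer thread only
    {
        std::size_t tail = m_tail.load( std::memory_order_relaxed );
        if( tail == m_head.load( std::memory_order_acquire ) )
            return false; // empty
        out = m_slots[tail];
        m_tail.store( (tail + 1) % N, std::memory_order_release ); // free the slot
        return true;
    }

private:
    T m_slots[N];
    std::atomic<std::size_t> m_head{ 0 }, m_tail{ 0 };
};
```

When pop returns false the consumer has to poll or block by other means; the sketch does not address waiting.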

Horsey...@the_stables.com

Sep 30, 2021, 5:35:46 AM
On Wed, 29 Sep 2021 12:05:24 -0700
"Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:
Clever in a small area. However, she seems to have limited knowledge of
development and OS methodologies that aren't commonly used in Windows.

Bonita Montero

Sep 30, 2021, 6:39:47 AM
> That's a bit vague. One can create highly efficient bidirectional
> communication between two threads using two wait-free
> single-producer/single-consumer queues without using any atomic RMW's,
> just atomic loads, stores, and some cleverly placed membars.

When you use wait-free or lock-free algorithms and there's no data,
you have to poll. That makes wait-free and lock-free algorithms
out of the question.
The only lock-free algorithms which are practicable are lock-free
stacks. You can use them for pooling objects or to give back items
to the thread which allocated them, so that there's no need
for a common lock. All modern memory allocators give back blocks
freed by foreign threads to the thread to whose pool the blocks
belong.
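
The give-back pattern described here can be sketched as a lock-free stack on which any thread may push, while only the owning thread detaches the whole list at once. Restricting the consumer to a single exchange sidesteps the ABA and node-lifetime problems of a general pop. Node and ReturnStack are hypothetical names, and the sketch is not taken from any particular allocator.

```cpp
#include <atomic>
#include <cassert>

struct Node
{
    Node *next = nullptr;
    int value = 0;
};

// Multi-producer stack for handing items back to their owner.
class ReturnStack
{
public:
    // Any thread: link the node in front of the current head.
    void push( Node *node )
    {
        Node *head = m_head.load( std::memory_order_relaxed );
        do
            node->next = head;
        while( !m_head.compare_exchange_weak( head, node,
                   std::memory_order_release, std::memory_order_relaxed ) );
    }

    // Owner thread: take the whole list in one step. Returns nullptr if
    // nothing was handed back.
    Node *popAll()
    {
        return m_head.exchange( nullptr, std::memory_order_acquire );
    }

private:
    std::atomic<Node *> m_head{ nullptr };
};
```

The owner walks the returned list at its leisure, newest item first, without any further synchronization.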

Chris M. Thomasson

Sep 30, 2021, 4:01:12 PM
In the meantime, I am working on another fractal set to music. Like
these I did last year:

https://youtu.be/DSiWvF5QOiI

https://youtu.be/QFPSYsYUKBA

Chris M. Thomasson

Sep 30, 2021, 4:20:46 PM