Tricky ...

Bonita Montero

Sep 20, 2021, 4:43:58 PM

In some code I have to check different states for which an exception might
be thrown. These states are all true or false, so I combined them into a
four-bit pattern which I use as an index into a table that stores the
strings of the out_of_range exception to be thrown, or a nullptr if the
state isn't associated with an out-of-range condition. This helps me
avoid some if-else cascades and speeds up the code.
So this is the complete function, but I put two empty lines around the
code I mentioned.

void dual_monitor::wait( bool b )
{
    uint64_t cmp = m_flagAndCounters.load( memory_order_relaxed ), chg;
    uint32_t visitors, aWaiters, bWaiters, waiters;
    assert(m_tid == thread_id::self());
    m_tid = thread_id();
    do
    {
        assert(isOwned( cmp ) && !m_recCount);
        visitors = getVisitors( cmp );
        aWaiters = getAWaiters( cmp );
        // assumes a getBWaiters() accessor analogous to getAWaiters()
        bWaiters = getBWaiters( cmp );
        assert(visitors >= aWaiters + bWaiters);
        waiters = !b ? aWaiters : bWaiters;

        // index bits, high to low: b, maxVisitors, maxAWaiters, maxBWaiters
        static
        char const *const throwTable[] =
        {
            nullptr /* 0000 */, nullptr /* 0001 */, nullptr /* 0010 */,
            nullptr /* 0011 */, nullptr /* 0100 */, nullptr /* 0101 */,
            "monitor::wait() - number of visitors and a-waiters too large", // 0110
            "monitor::wait() - number of visitors and a-waiters too large", // 0111
            nullptr /* 1000 */, nullptr /* 1001 */, nullptr /* 1010 */,
            nullptr /* 1011 */, nullptr /* 1100 */,
            "monitor::wait() - number of visitors and b-waiters too large", // 1101
            nullptr, // 1110
            "monitor::wait() - number of visitors and b-waiters too large", // 1111
        };
        size_t maxVisitors = visitors == COUNTER_MASK,
               maxAWaiters = aWaiters == COUNTER_MASK,
               maxBWaiters = bWaiters == COUNTER_MASK;
        size_t throwIdx = (size_t)b << 3 | maxVisitors << 2 | maxAWaiters << 1
                          | maxBWaiters;
        if( throwTable[throwIdx] )
        {
            // restore the owner id before reporting the overflow
            m_tid = thread_id::self();
            throw out_of_range( throwTable[throwIdx] );
        }

        if( visitors > waiters )
            // waiters < COUNTER_MASK because visitors > waiters
            // taking over visitor-counter from another thread
            chg = cmp + (!b ? WAITER_A_VALUE : WAITER_B_VALUE);
        else
            // visitors == aWaiters + bWaiters
            chg = (cmp & ~OWNER_MASK) + (!b ? WAITER_A_VALUE : WAITER_B_VALUE) +
                  VISITOR_VALUE;
    } while( !m_flagAndCounters.compare_exchange_weak( cmp, chg,
                 memory_order_release, memory_order_relaxed ) );
    if( visitors > waiters )
    {
#if defined(_MSC_VER)
        while( !SetEvent( (HANDLE)m_xhEvtVisit ) );
#elif defined(__unix__)
        sysv_semaphore::sembuf semBuf;
        do
            semBuf( VISITOR_SEM, 1, 0 );
        while( m_sems( &semBuf, 1, false ) == -1 );
#endif
    }
#if defined(_MSC_VER)
    HANDLE ahWait[2] = { (HANDLE)m_xhEvtVisit, (HANDLE)(!b ? m_xhSemWaitA :
                         m_xhSemWaitB) };
    for( DWORD dwWait; (dwWait = WaitForMultipleObjects( 2, ahWait, TRUE,
             INFINITE )) < WAIT_OBJECT_0
         || dwWait > WAIT_OBJECT_0 + 1; );
#elif defined(__unix__)
    sysv_semaphore::sembuf semBufs[2];
    do
        semBufs[0]( VISITOR_SEM, -1, 0 ),
        semBufs[1]( !b ? WAITER_A_SEM : WAITER_B_SEM, -1, 0 );
    while( m_sems( semBufs, 2, false ) == -1 );
#endif
    m_tid = thread_id::self();
}
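
The table trick on its own, separated from the monitor, might look like the following minimal sketch. The names CounterState, throwIndex, overflowMessage and checkCounters are hypothetical and not part of the posted code; only the bit layout (b, visitors, a-waiters, b-waiters) and the table contents follow the post.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the four boolean conditions in the post.
struct CounterState {
    bool waitOnB;      // which waiter group the caller joins
    bool maxVisitors;  // visitor counter is saturated
    bool maxAWaiters;  // a-waiter counter is saturated
    bool maxBWaiters;  // b-waiter counter is saturated
};

// Packs the flags into a 4-bit index: b, visitors, a-waiters, b-waiters.
inline std::size_t throwIndex(CounterState const &s) {
    return (std::size_t)s.waitOnB << 3 | (std::size_t)s.maxVisitors << 2 |
           (std::size_t)s.maxAWaiters << 1 | (std::size_t)s.maxBWaiters;
}

// Returns the error text for a state, or nullptr if the state is fine.
inline char const *overflowMessage(CounterState const &s) {
    static char const *const aMsg = "visitors and a-waiters too large";
    static char const *const bMsg = "visitors and b-waiters too large";
    static char const *const table[16] = {
        nullptr, nullptr, nullptr, nullptr,  // 0000..0011
        nullptr, nullptr, aMsg,    aMsg,     // 0100..0111
        nullptr, nullptr, nullptr, nullptr,  // 1000..1011
        nullptr, bMsg,    nullptr, bMsg      // 1100..1111
    };
    std::size_t idx = throwIndex(s);
    assert(idx < 16);
    return table[idx];
}

// One table lookup replaces the if-else cascade.
inline void checkCounters(CounterState const &s) {
    if (char const *msg = overflowMessage(s))
        throw std::out_of_range(msg);
}
```

Whether this is actually faster than a short if-else chain depends on the compiler and the branch pattern; the sketch only shows the indexing.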

Chris M. Thomasson

Sep 23, 2021, 1:39:10 PM

On 9/20/2021 1:43 PM, Bonita Montero wrote:
> In some code I have to check different states for which an exception might
> be thrown. These states are all true or false, so I combined them into a
> four-bit pattern which I use as an index into a table that stores the
> strings of the out_of_range exception to be thrown, or a nullptr if the
> state isn't associated with an out-of-range condition. This helps me
> avoid some if-else cascades and speeds up the code.
> So this is the complete function, but I put two empty lines around the
> code I mentioned.
>
> void dual_monitor::wait( bool b )

[...]

Please, try it out in a race detector? Create several programs that use
your code in high load scenarios, and see what happens in the detector.
Do we all have the time to examine your "tricky" code and test it out
for ourselves? Well, not really. I am fairly busy with some fractal work
right now. Sorry Bonita.

Relacy is a good one, ThreadSanitizer is nice. My friend created Relacy,
iirc, he is also working on ThreadSanitizer.

Bonita Montero

Sep 23, 2021, 2:29:12 PM

On 23.09.2021 at 19:38, Chris M. Thomasson wrote:
> On 9/20/2021 1:43 PM, Bonita Montero wrote:
>> In some code I have to check different states for which an exception might
>> be thrown. These states are all true or false, so I combined them into a
>> four-bit pattern which I use as an index into a table that stores the
>> strings of the out_of_range exception to be thrown, or a nullptr if the
>> state isn't associated with an out-of-range condition. This helps me
>> avoid some if-else cascades and speeds up the code.
>> So this is the complete function, but I put two empty lines around the
>> code I mentioned.
>>
>> void dual_monitor::wait( bool b )
> [...]
>
> Please, try it out in a race detector? Create several programs that use
> your code in high load scenarios, and see what happens in the detector.
> Do we all have the time to examine your "tricky" code and test it out
> for ourselves? Well, not really. I am fairly busy with some fractal work
> right now. Sorry Bonita.

Not necessary. My monitor works perfectly.
I've implemented something completely new: a wait_poll operation.
You simply supply a check lambda that checks whether there is valid state,
and my code repeatedly tries to lock the lock and check whether the
condition is true. The number of spins is recalculated at each call
according to the recalculation pattern of glibc.

template<typename RetType, typename Predicate>
    requires requires( Predicate pred )
    {
        { pred() } -> std::same_as<std::pair<bool, RetType>>;
    }
RetType monitor::wait_poll( Predicate const &pred )
{
    using namespace std;
    using retpair_t = pair<bool, RetType>;
    // spin budget: twice the running estimate plus 10, capped at the maximum
    uint32_t maxSpinT = (uint32_t)m_nextPollSpinCount * 2 + 10;
    uint16_t maxSpin = maxSpinT <= m_maxPollSpinCount ?
                 (uint16_t)maxSpinT : m_maxPollSpinCount,
             spinCount = 0;
    bool notify = false;
    uint64_t cmp = m_flagAndCounters.load( memory_order_relaxed );
    for( ; !notify && spinCount < maxSpin; ++spinCount )
        if( isOwned( cmp ) )
        {
            cpu_pause( PAUSE_ITERATIONS );
            cmp = m_flagAndCounters.load( memory_order_relaxed );
            continue;
        }
        else if( m_flagAndCounters.compare_exchange_weak( cmp, cmp | OWNER_MASK,
                     memory_order_acquire, memory_order_relaxed ) )
        {
            // lock acquired: evaluate the predicate, then unlock again
            retpair_t ret = move( pred() );
            uint64_t chg;
            do
            {
                uint32_t visitors = getVisitors( cmp ),
                         waiters = getWaiters( cmp );
                assert(visitors >= waiters);
                chg = cmp & ~OWNER_MASK;
                if( (notify = visitors > waiters) )
                    chg -= VISITOR_VALUE;
            } while( !m_flagAndCounters.compare_exchange_weak( cmp, chg,
                         memory_order_relaxed, memory_order_relaxed ) );
            if( notify )
            {
                // wake one waiting visitor
#if defined(_MSC_VER)
                while( !SetEvent( (HANDLE)m_xhEvtVisit ) );
#elif defined(__unix__)
                while( m_sems( { sembuf( VISITOR_SEM, 1, 0 ) }, false ) == -1 );
#endif
            }
            if( !ret.first )
            {
                if( !notify )
                    continue;
                else
                    break;
            }
            if( !notify )
                m_nextPollSpinCount += (int16_t)(((int32_t)spinCount -
                    (int32_t)m_nextPollSpinCount) / 8);
            return move( ret.second );
        }
        else
            cpu_pause( PAUSE_ITERATIONS );
    if( !notify )
        m_nextPollSpinCount += (int16_t)(((int32_t)spinCount -
            (int32_t)m_nextPollSpinCount) / 8);
    // spinning did not succeed: fall back to the blocking path
    lock();
    try
    {
        for( ; ; )
        {
            retpair_t ret = move( pred() );
            if( ret.first )
            {
                unlock();
                return move( ret.second );
            }
            wait();
        }
    }
    catch( ... )
    {
        unlock();
        throw;
    }
}
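
The spin-count recalculation used above can be looked at in isolation. This is a minimal sketch with a hypothetical SpinEstimator type; the two formulas (budget = estimate * 2 + 10, estimate += (observed - estimate) / 8) are taken from the posted function, while the limit of 1000 is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the posted monitor class.
struct SpinEstimator {
    std::int16_t next = 0;       // running estimate of the spins needed
    std::uint16_t limit = 1000;  // assumed upper bound for the budget

    // Spin budget for the upcoming call: twice the estimate plus 10, capped.
    std::uint16_t budget() const {
        std::uint32_t b = (std::uint32_t)next * 2 + 10;
        return (std::uint16_t)std::min<std::uint32_t>(b, limit);
    }

    // Moves the estimate one eighth of the way toward the observed count.
    void record(std::uint16_t spins) {
        next += (std::int16_t)(((std::int32_t)spins - (std::int32_t)next) / 8);
    }
};
```

Because the division truncates toward zero, the estimate changes slowly and never drops below zero when the observed counts are non-negative.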

Bonita Montero

Sep 23, 2021, 2:30:39 PM

Oh, I just thought that those two moves can be dropped since they're
implicit.

Chris M. Thomasson

Sep 24, 2021, 9:48:34 PM

On 9/23/2021 11:29 AM, Bonita Montero wrote:
> while( !SetEvent( (HANDLE)m_xhEvtVisit ) );

Humm....

Chris M. Thomasson

Sep 24, 2021, 9:56:30 PM

For some damn reason, this reminds me of some broken condvar impls that
used the good ol' PulseEvent:

https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-pulseevent

A quote: "This function is unreliable and should not be used. It exists
mainly for backward compatibility. For more information, see Remarks."

Bonita Montero

Sep 25, 2021, 1:24:22 AM

What I do is reliable.

Branimir Maksimovic

Sep 25, 2021, 8:28:57 AM

This is because you SAY SO?

--
7-77-777
Evil Sinner!

Bonita Montero

Sep 25, 2021, 9:58:05 AM

Inspect my code and say why it is not instead of making
arbitrary assumptions.

Branimir Maksimovic

Sep 25, 2021, 9:59:13 AM

Show your code, I missed it :(


Bonita Montero

Sep 25, 2021, 10:09:13 AM

On 25.09.2021 at 15:59, Branimir Maksimovic wrote:
> On 2021-09-25, Bonita Montero <Bonita....@gmail.com> wrote:
>>>> What I do is reliable.
>>>
>>> This is because you SAY SO?
>>
>> Inspect my code and say why it is not instead of making
>> arbitrary assumptions.
>>
> Show your code, I missed it :(

I showed my wait_poll function. It's something completely new that improves
most typical monitor handling, usually producer-consumer patterns
where the mutex part of the monitor is held only very briefly.

Branimir Maksimovic

Sep 25, 2021, 1:13:37 PM

And you use unreliable function for that?

Bonita Montero

Sep 25, 2021, 1:15:47 PM

On 25.09.2021 at 19:13, Branimir Maksimovic wrote:
> On 2021-09-25, Bonita Montero <Bonita....@gmail.com> wrote:
>> On 25.09.2021 at 15:59, Branimir Maksimovic wrote:
>>> On 2021-09-25, Bonita Montero <Bonita....@gmail.com> wrote:
>>>>>> What I do is reliable.
>>>>>
>>>>> This is because you SAY SO?
>>>>
>>>> Inspect my code and say why it is not instead of making
>>>> arbitrary assumptions.
>>>>
>>> Show your code, I missed it :(
>>
>> I showed my wait_poll function. It's something completely new that improves
>> most typical monitor handling, usually producer-consumer patterns
>> where the mutex part of the monitor is held only very briefly.
>
> And you use unreliable function for that?

Show me where my code is unreliable.
I think you're simply a poor troll.

Branimir Maksimovic

Sep 25, 2021, 1:22:15 PM

I am just asking because you are building something
on shaky ground. Why do you think it is reliable?
Simple question.


Chris M. Thomasson

Sep 25, 2021, 4:34:16 PM

What's up with the:

while( !SetEvent( (HANDLE)m_xhEvtVisit ) );

Shouldn't you use GetLastError to find out why SetEvent failed?

Also, you are using:

cpu_pause( PAUSE_ITERATIONS );

In several places? Reeks of a spinlock. Some kind of backoff?

Bonita Montero

Sep 25, 2021, 11:46:38 PM

Because the code is simple.

Bonita Montero

Sep 25, 2021, 11:49:06 PM

On 25.09.2021 at 22:34, Chris M. Thomasson wrote:
> On 9/24/2021 10:24 PM, Bonita Montero wrote:
>> On 25.09.2021 at 03:56, Chris M. Thomasson wrote:
>>> On 9/24/2021 6:48 PM, Chris M. Thomasson wrote:
>>>> On 9/23/2021 11:29 AM, Bonita Montero wrote:
>>>>> while( !SetEvent( (HANDLE)m_xhEvtVisit ) );
>>>>
>>>> Humm....
>>>
>>> For some damn reason, this reminds me of some broken condvar impls
>>> that used the good ol' PulseEvent:
>>>
>>> https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-pulseevent
>>>
>>>
>>> A quote: "This function is unreliable and should not be used. It
>>> exists mainly for backward compatibility. For more information, see
>>> Remarks."
>>
>> What I do is reliable.
>>
>
> What's up with the:
>
> while( !SetEvent( (HANDLE)m_xhEvtVisit ) );
>
> Shouldn't you use GetLastError to find out why SetEvent failed?

No, once I have incremented the waiter or visitor count I can't set it back,
because it might have been reduced by someone awakening me. So I have to
spin or terminate the program. But this doesn't matter, since SetEvent fails
only under conditions where the machine has almost no resources.

Chris M. Thomasson

Sep 27, 2021, 3:18:38 AM

On 9/20/2021 1:43 PM, Bonita Montero wrote:

[...]

Humm... For some reason I feel the need for std::memory_order_acq_rel
here wrt the cas. Humm... I need to port your algorithm over to a form
that Relacy can understand. It's been a while! I just got a strange
feeling. Waiting usually has acquire semantics. Humm... Sorry, need to
examine it further, and port it over. Then run it in certain scenarios.
Relacy has the capability to crack it wide open if there are any issues.

I should have some time later on tomorrow. I am busy with my fractal
software right now:

https://fractalforums.org/gallery/1612-270921004032.jpeg

http://siggrapharts.ning.com/photo/alien-anatomy

lol. ;^)

Bonita Montero

Sep 27, 2021, 8:44:44 AM

> here wrt the cas. ...

No, you only write nonsense. When I wait the lock is released,
so it's release-consistency.

You always write nonsense.

red floyd

Sep 27, 2021, 1:54:47 PM

On 9/27/2021 5:44 AM, Bonita Montero wrote:
[redacted]

>
> No, you only write nonsense. When I wait the lock is released,
> so it's release-consistency.
>
> You always write nonsense.

If you're going to ask for opinions and then reject any criticism
as nonsense, then why the heck are you even bothering to post your
code?

Chris M. Thomasson

Sep 27, 2021, 5:06:39 PM

On 9/27/2021 5:44 AM, Bonita Montero wrote:

> On 27.09.2021 at 09:18, Chris M. Thomasson wrote:
>> On 9/20/2021 1:43 PM, Bonita Montero wrote:
>>> In some code I have to check different states for which an exception might
>>> be thrown. These states are all true or false, so I combined them into a
>>> four-bit pattern which I use as an index into a table that stores the
>>> strings of the out_of_range exception to be thrown, or a nullptr if the
>>> state isn't associated with an out-of-range condition. This helps me
>>> avoid some if-else cascades and speeds up the code.
>>> So this is the complete function, but I put two empty lines around the
>>> code I mentioned.
>>>
>>> void dual_monitor::wait( bool b )

[...]

>>> } while( !m_flagAndCounters.compare_exchange_weak( cmp, chg,
>>> memory_order_release, memory_order_relaxed ) );
>> [...]
>>
>> Humm... For some reason I feel the need for std::memory_order_acq_rel
>> here wrt the cas. ...
>
> No, you only write nonsense. When I wait the lock is released,
> so it's release-consistency.
>
> You always write nonsense.

Decrementing a semaphore requires acquire semantics. Incrementing a
semaphore requires release semantics. Trying to do both at once in a
single atomic operation requires acquire/release semantics.

I still need to port it to Relacy, but it seems like you need acq_rel here.
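
To make the acquire/release placement concrete, here is a minimal sketch of a spinning counting semaphore built on std::atomic. SpinSemaphore is a hypothetical type, not code from the thread and not a replacement for the posted monitor; it only shows the rule stated above: release on increment, acquire on decrement.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical illustration type.
class SpinSemaphore {
    std::atomic<int> count_;
public:
    explicit SpinSemaphore(int initial) : count_(initial) { assert(initial >= 0); }

    // Increment: release, so writes made before post() become visible to
    // the thread whose wait() consumes this unit.
    void post() { count_.fetch_add(1, std::memory_order_release); }

    // Decrement: acquire on success, pairing with the release in post().
    bool try_wait() {
        int c = count_.load(std::memory_order_relaxed);
        while (c > 0)
            if (count_.compare_exchange_weak(c, c - 1,
                    std::memory_order_acquire, std::memory_order_relaxed))
                return true;
        return false;
    }

    // Spins (yielding) until a unit is available.
    void wait() {
        while (!try_wait())
            std::this_thread::yield();
    }
};
```

An operation that both publishes earlier writes and consumes a unit in one atomic step would use std::memory_order_acq_rel on that single read-modify-write.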

Chris M. Thomasson

Sep 27, 2021, 5:10:29 PM

Yeah, no shi%. Wow. Fwiw, it's been a while since I have worked on such
things. I just need to port Bonita's code over to Relacy, and give it a
go in the simulator. It can find obscure memory order issues pretty damn
fast. The problem is that I need to find the time. I mean, Bonita is not
paying me. ;^)

Chris M. Thomasson

Sep 27, 2021, 7:09:37 PM

Think about it... Waiting for something, implies acquire semantics.

Bonita Montero

Sep 28, 2021, 12:56:08 AM

On 27.09.2021 at 23:06, Chris M. Thomasson wrote:
> On 9/27/2021 5:44 AM, Bonita Montero wrote:
>> On 27.09.2021 at 09:18, Chris M. Thomasson wrote:
>>> On 9/20/2021 1:43 PM, Bonita Montero wrote:
>>>> In some code I have to check different states for which an exception might
>>>> be thrown. These states are all true or false, so I combined them into a
>>>> four-bit pattern which I use as an index into a table that stores the
>>>> strings of the out_of_range exception to be thrown, or a nullptr if the
>>>> state isn't associated with an out-of-range condition. This helps me
>>>> avoid some if-else cascades and speeds up the code.
>>>> So this is the complete function, but I put two empty lines around the
>>>> code I mentioned.
>>>>
>>>> void dual_monitor::wait( bool b )
> [...]
>>>> } while( !m_flagAndCounters.compare_exchange_weak( cmp, chg,
>>>> memory_order_release, memory_order_relaxed ) );
>>> [...]
>>>
>>> Humm... For some reason I feel the need for std::memory_order_acq_rel
>>> here wrt the cas. ...
>>
>> No, you only write nonsense. When I wait the lock is released,
>> so it's release-consistency.
>>
>> You always write nonsense.
>
> Decrementing a semaphore requires acquire semantics.

No, I could have modified data before, so I need release semantics.
You are such a n00b.

Bonita Montero

Sep 28, 2021, 12:57:43 AM

No, when I release a lock and could have modified data,
I need release-semantics.

Chris M. Thomasson

Sep 28, 2021, 1:27:44 AM

For some reason when I saw the word "wait", I think of acquire. Are you
sure you know all about memory barriers? I have worked with them in the
past.

Chris M. Thomasson

Sep 28, 2021, 1:28:40 AM

So, how can one utilize dual_monitor::wait? Does it wait on something?

Chris M. Thomasson

Sep 28, 2021, 1:47:06 AM

Terminate? Humm... If there are conditions arising in the system in
which SetEvent fails, you go into infinite loop here, right? Try to
handle this scenario. Find out why it failed, GetLastError.

Chris M. Thomasson

Sep 28, 2021, 1:49:47 AM

Waiting on something is acquire by nature. Show me where it's not?
Actually, show me where to place the acquire and release barriers in a
semaphore? Can you do it? Should be a piece of cake, right?

Bonita Montero

Sep 28, 2021, 2:33:18 AM

> sure you know all about memory barriers? ...

Everything necessary here.

Bonita Montero

Sep 28, 2021, 2:34:06 AM

It's like waiting on a condvar, where you might have modified any data
beforehand.

Bonita Montero

Sep 28, 2021, 2:34:57 AM

If the system hasn't any resources to satisfy a simple WaitForSingle...
spinning is quite o.k..

Chris M. Thomasson

Sep 28, 2021, 3:03:47 AM

I was thinking along those lines. However, waiting on a condvar involves
several steps. It also involves reacquiring the mutex...

Chris M. Thomasson

Sep 28, 2021, 3:05:47 AM

spinning indeed. However, you have no backoff. You are spinning full speed!

while( !SetEvent( (HANDLE)m_xhEvtVisit ) );

Oh shit. When shit hits the fan, so does this! Try to not bleed the
system when it's in dire straits?
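
One way to avoid spinning at full speed is a bounded retry loop that sleeps between attempts. The helper below is a hypothetical, portable sketch and not code from the thread; on Windows the callable would wrap the SetEvent call, and after the final failure the caller could inspect GetLastError to decide what to do.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <thread>

// Calls op() until it returns true or maxAttempts is exhausted.
// Sleeps between attempts, doubling the delay up to maxDelay.
template <typename Op>
bool retryWithBackoff(Op op, unsigned maxAttempts,
                      std::chrono::microseconds firstDelay = std::chrono::microseconds(1),
                      std::chrono::microseconds maxDelay = std::chrono::microseconds(1000)) {
    assert(maxAttempts > 0);
    std::chrono::microseconds delay = firstDelay;
    for (unsigned attempt = 0; attempt < maxAttempts; ++attempt) {
        if (op())
            return true;
        if (attempt + 1 < maxAttempts) {
            // Give the rest of the system room before trying again.
            std::this_thread::sleep_for(delay);
            delay = std::min<std::chrono::microseconds>(delay * 2, maxDelay);
        }
    }
    return false;  // the caller decides: report, terminate, or keep waiting
}
```

The attempt limit and the delays are arbitrary; a caller that really cannot give up, as argued earlier in the thread, could pass a very large limit and still benefit from the sleep.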

Chris M. Thomasson

Sep 28, 2021, 3:06:39 AM

For sure?

Bonita Montero

Sep 28, 2021, 6:05:43 AM

With what I do the reacquisition is done by the side that unlocks the
mutex. It simply keeps the locked-flag while unlocking and sets the
visitor-event.

Bonita Montero

Sep 28, 2021, 6:06:46 AM

That doesn't matter under such conditions.
Am I talking to a complete moron here ?

David Brown

Sep 28, 2021, 10:06:07 AM

Chris, why do you keep trying to communicate with Bonita? She is
clearly so convinced of her own superiority to everyone else that it is
pointless. She does not post code for people to check, she posts it
because she thinks she is showing off and impressing people. This kind
of code is difficult to get right, and any capable developer is going to
want to discuss it with people to check the details. Since Bonita
responds with nothing but insults and arrogance, I think it is safe to
assume her code is flawed and leave her with it.

Bonita Montero

Sep 28, 2021, 12:46:59 PM

Chris doesn't understand my code. He doesn't even understand that before
any wait operation you use atomic operations with release consistency
and afterwards you use atomic operations with acquire consistency. He is
totally confused.
And with the above code: once a visitor has incremented the visitor
counter it can't decrement it, because it might have been decremented by
the code wanting to wake up a visitor. So under extreme conditions,
where the system doesn't have even the minimal resources to satisfy such a
simple operation, spinning is quite o.k. until the conditions become
better. The system isn't usable anyway then.
It's simply frustrating to explain such simple things. Chris is simply
a n00b who lacks basic knowledge.

Chris M. Thomasson

Sep 28, 2021, 5:02:57 PM

Ummm... WOW. You are confused, Big time. I already asked you to show me
exactly where to place the memory barriers necessary to implement a
semaphore. Can you do it? I can. Show me... I am interested to see if
you can do it. From what you are writing, I don't think you can. Oh
well. Shit happens.

> And with the above code: once a visitor has incremented the visitor
> counter it can't decrement it, because it might have been decremented by
> the code wanting to wake up a visitor. So under extreme conditions,
> where the system doesn't have even the minimal resources to satisfy such a
> simple operation, spinning is quite o.k. until the conditions become
> better. The system isn't usable anyway then.
> It's simply frustrating to explain such simple things. Chris is simply
> a n00b who lacks basic knowledge.

lol. Humm... Perhaps I should just leave you with it. You refuse to run
it through a race detector... Why? I was going to port it over to
Relacy, but now... Argh. Screw it. You do it.

Chris M. Thomasson

Sep 28, 2021, 5:42:18 PM

Yeah. Damn. She does not seem to know a whole lot about memory barriers,
those tricky bastards! ;^) I was trying to help her. Then I get spit on.
I asked her a simple question: Where do the memory barriers go for a
semaphore. So far, no answer. She is confused. Calling me a newb is
strange because I have been working with these types of things for decades.

I would love to see Bonita program a SPARC in RMO mode. That would be
amusing... Actually Sun gave me a SunFire T2000 after winning the first
round in their CoolThreads contest. My project was called vZOOM. That
damn server sounded like several vacuum cleaners running when I was
working with it.

I wonder if she knows what a #LoadStore | #LoadLoad barrier is. That
MEMBAR instruction was fun on the SPARC. ;^)

So, I will just shut up, and let her do her thing.

Oh well.

red floyd

Sep 28, 2021, 8:05:18 PM

On 9/28/2021 2:42 PM, Chris M. Thomasson wrote:

> working with it.
>
> I wonder if she knows what a #LoadStore | #LoadLoad barrier is. That
> MEMBAR instruction was fun on the SPARC. ;^)
>

Not quite a membar, but on PowerPC, the EIEIO instruction was also fun.

Chris M. Thomasson

Sep 28, 2021, 10:28:48 PM

Yup. Iirc, it was for IO. It's been a while since I programmed a PPC. I
wrote about some of the pitfalls of using LL/SC on it a while back on
comp.arch:

https://groups.google.com/g/comp.arch/c/yREvvvKvr6k/m/nRZ5tpLwDNQJ

Wow, this was way back in 2005! Jeeze! ;^o

Branimir Maksimovic

Sep 28, 2021, 10:32:01 PM

semaphore wait then acquire


Branimir Maksimovic

Sep 28, 2021, 10:32:51 PM

She is clever, but newb. Forgive her for that.


Chris M. Thomasson

Sep 28, 2021, 11:15:11 PM

Yup! You got it. I like how C++ has standalone membars via
std::atomic_thread_fence. Makes me reminisce about the SPARC where all
atomic ops are naked, or relaxed in C++ terms. An acquire on the SPARC
was MEMBAR #LoadStore | #LoadLoad, ah the good ol' days. ;^)
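
For readers who have not used standalone fences, here is a minimal sketch of relaxed atomic operations combined with std::atomic_thread_fence. Mailbox is a hypothetical type; the pattern is the usual flag-plus-payload handoff, with the fences supplying the ordering that the relaxed operations do not.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical illustration type: one producer publishes, one consumer reads.
struct Mailbox {
    int data = 0;                    // plain, non-atomic payload
    std::atomic<bool> ready{false};  // only touched with relaxed operations

    void publish(int value) {
        data = value;
        // Release fence: keeps the payload write ahead of the flag store.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    }

    bool try_consume(int &out) {
        if (!ready.load(std::memory_order_relaxed))
            return false;
        // Acquire fence: keeps the payload read behind the flag load.
        std::atomic_thread_fence(std::memory_order_acquire);
        out = data;
        return true;
    }
};
```

The acquire fence sits after the load, which is the same position the MEMBAR #LoadStore | #LoadLoad mentioned above would take after a naked load.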

David Brown

Sep 29, 2021, 2:35:35 AM

On 29/09/2021 02:05, red floyd wrote:
> On 9/28/2021 2:42 PM, Chris M. Thomasson wrote:
>
>> working with it.
>>
>> I wonder if she knows what a #LoadStore | #LoadLoad barrier is. That
>> MEMBAR instruction was fun on the SPARC. ;^)
>>

The most challenging architecture, AFAIK, was the Alpha. (I have no
experience of it myself.)

>
> Not quite a membar, but on PowerPC, the EIEIO instruction was also fun.
>

It is certainly the best named instruction around!

It was useful on early PPC microcontroller cores, to avoid unwanted
reordering or buffering of accesses to hardware registers. Later cores
had an MPU that supported setting up memory areas for direct unbuffered
accesses, which is a more convenient and safer method.

Bonita Montero

Sep 29, 2021, 6:41:26 AM

On 28.09.2021 at 23:42, Chris M. Thomasson wrote:

> Yeah. Damn. She does not seem to know a whole lot about memory barriers,

I use them a lot and correctly.

Bonita Montero

Sep 29, 2021, 6:42:39 AM

On 29.09.2021 at 08:35, David Brown wrote:

> The most challenging architecture, AFAIK, was the Alpha.

The Alpha isn't challenging because it has the most relaxed memory
ordering of all CPUs: if there were a C++11 compiler for the
Alpha, you would use conventional membars and the code would execute
virtually like on every other CPU.

Chris M. Thomasson

Sep 29, 2021, 3:04:41 PM

Oh my. You cannot even implement the read side of RCU without a damn
membar on an Alpha! You don't think that is challenging at all? I don't
think you ever implemented RCU. Also, have you implemented SMR? That
actually requires an explicit membar on an x86! Joe Seigh came up with a
way to combine RCU and SMR to get rid of the membar by putting SMR in a
read side RCU critical section. It improved the performance of loading a
SMR pointer by orders of magnitude. Well done Joe.

:^)

Chris M. Thomasson

Sep 29, 2021, 3:05:25 PM

A wait on a semaphore does not need release semantics. It only needs
acquire.

Chris M. Thomasson

Sep 29, 2021, 3:05:40 PM

She is clever.

Chris M. Thomasson

Sep 29, 2021, 3:10:11 PM

On 9/20/2021 1:43 PM, Bonita Montero wrote:
> In some code I have to check different states for which an exception might
> be thrown. These states are all true or false, so I combined them into a
> four-bit pattern which I use as an index into a table that stores the
> strings of the out_of_range exception to be thrown, or a nullptr if the
> state isn't associated with an out-of-range condition. This helps me
> avoid some if-else cascades and speeds up the code.
> So this is the complete function, but I put two empty lines around the
> code I mentioned.
>
> void dual_monitor::wait( bool b )
[...]

Have you posted the whole dual_monitor code? If you did, I missed it.
Sorry.

Bonita Montero

Sep 29, 2021, 9:34:02 PM

> A wait on a semaphore does not need release semantics. ...

It does, because I could have modified data before releasing the
lock I'm waiting for at the same time.

Chris M. Thomasson

Sep 29, 2021, 9:44:14 PM

So, it needs acquire release, right?

Chris M. Thomasson

Sep 29, 2021, 10:47:16 PM

On 9/29/2021 6:33 PM, Bonita Montero wrote:

Have you posted all of your code for dual_monitor? Or just
dual_monitor::wait?

Chris M. Thomasson

Sep 30, 2021, 12:57:35 AM

Where are your example use cases for dual_monitor? I cannot find all of
the code for dual_monitor anyway. So, it's difficult for me to port to a
simulator and use... Just make a new thread, with an example program
that uses dual_monitor.

Bonita Montero

Sep 30, 2021, 2:16:54 AM

> Where are your example use cases for dual_monitor? ...

It's usable when two threads communicate bidirectionally.

David Brown

Sep 30, 2021, 3:11:00 AM

If you release a lock, you need release semantics. If you acquire a
lock, you need acquire semantics.

There is a lot of fiddly stuff in getting memory barriers and
synchronisation right - and a lot of /really/ tough stuff in getting it
right and optimally efficient on big processors. But "acquire" and
"release" semantics are helpfully named!

Chris M. Thomasson

Sep 30, 2021, 4:19:40 AM

That's a bit vague. One can create highly efficient bidirectional
communication between two threads using two wait-free
single-producer/single-consumer queues without using any atomic RMW's,
just atomic loads, stores, and some cleverly placed membars. Now,
if one needs to be able to wait on them, well that's a different story.
This makes me think about eventcounts; I have plenty of experience with
those. God I am getting older. Actually, I am wondering if you happen to
be familiar with the good ol' two lock queue? I first saw it decades ago:

https://www.cs.rochester.edu/~scott/papers/1996_PODC_queues.pdf

A classic! The queues in that paper are well known in lock-free
circles... Beware... There are subtle memory lifetime issues that can
occur in the lock-free version. Basically, the same horror show that can
occur in a lock-free stack. It's not just ABA, it's making sure that the
memory is valid for a certain operation. Iirc, Windows does something
funky with SEH to get this to work in their SList API.

;^)
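
A minimal sketch of the kind of single-producer/single-consumer queue described above, using only atomic loads and stores with acquire/release ordering and no read-modify-write operations. SpscRing is a hypothetical bounded ring buffer, not code from the thread; two instances, one per direction, would give the bidirectional channel. It has no blocking: push and pop simply report failure when the ring is full or empty.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical bounded SPSC ring; holds at most N - 1 elements.
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N >= 2, "ring needs at least two slots");
    T slots_[N];
    std::atomic<std::size_t> head_{0};  // next slot to read, stored by the consumer only
    std::atomic<std::size_t> tail_{0};  // next slot to write, stored by the producer only
public:
    bool push(T const &v) {  // call from the producer thread only
        std::size_t t = tail_.load(std::memory_order_relaxed);
        std::size_t next = (t + 1) % N;
        if (next == head_.load(std::memory_order_acquire))
            return false;  // full
        slots_[t] = v;
        tail_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    bool pop(T &out) {  // call from the consumer thread only
        std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire))
            return false;  // empty
        out = slots_[h];
        head_.store((h + 1) % N, std::memory_order_release);  // free the slot
        return true;
    }
};
```

Because the storage is a fixed array, this sketch sidesteps the memory lifetime problems mentioned above for node-based lock-free structures.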

Horsey...@the_stables.com

Sep 30, 2021, 5:35:46 AM

On Wed, 29 Sep 2021 12:05:24 -0700

"Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:

Clever in a small area. She seems to have limited knowledge of development
and OS methodologies that aren't commonly used in Windows, however.

Bonita Montero

Sep 30, 2021, 6:39:47 AM

> That's a bit vague. One can create highly efficient bidirectional
> communication between two threads using two wait-free
> single-producer/single-consumer queues without using any atomic RMW's,
> just atomic loads, stores, and some cleverly placed membars.

When you use wait-free or lock-free algorithms and there's no data
you have to poll. That makes wait-free and lock-free algorithms
out of the question.
The only lock-free algorithms which are practicable are lock-free
stacks. You can use them for pooling objects or to give back items
to the thread which allocated the items, so that there's no need
for a common lock - all modern memory allocators give back blocks
freed by foreign threads to the thread to whose pool the blocks
belong.

Chris M. Thomasson

Sep 30, 2021, 4:01:12 PM

In the mean time, I am working on another fractal set to music. Like
these I did last year:

https://youtu.be/DSiWvF5QOiI

https://youtu.be/QFPSYsYUKBA

Chris M. Thomasson

Sep 30, 2021, 4:20:46 PM

On 9/30/2021 3:39 AM, Bonita Montero wrote:
>> That's a bit vague. One can create highly efficient bidirectional
>> communication between two threads using two wait-free
>> single-producer/single-consumer queues without using any atomic RMW's,
>> just atomic loads, stores, and some cleverly placed membars.
>
> When you use wait-free or lock-free algorithms and there's no data
> you have to poll.

Why?

[...]

Bonita Montero

Sep 30, 2021, 9:32:18 PM

Because you don't have to wait in the kernel.

Radica...@theburrow.co.uk

Oct 1, 2021, 5:17:27 AM

On Thu, 30 Sep 2021 13:00:53 -0700

"Chris M. Thomasson" <chris.m.t...@gmail.com> wrote:

Very impressive. Could use maybe a few more colours though.

Öö Tiib

Oct 1, 2021, 7:43:53 AM

What does a thread do when its input queue is empty?

Bonita Montero

Oct 1, 2021, 8:59:26 AM

If it hasn't anything to do other than "waiting" for a new entry, it
spins. Lock-free and wait-free data structures are idiocy, except
for lock-free stacks.

Branimir Maksimovic

Oct 1, 2021, 9:07:28 AM

An error occurred. Please try again later. (Playback ID: pHR4vgIqqp13ECHz)
Learn More


Branimir Maksimovic

Oct 1, 2021, 9:08:57 AM

If you don't spin, it is not a lock :P
The problem is that the only way you don't synchronize
is when things are *independent* from each other :P


Branimir Maksimovic

Oct 1, 2021, 9:11:02 AM

On 2021-10-01, Bonita Montero <Bonita....@gmail.com> wrote:

Poof. you spin me around like a record? :P


Chris M. Thomasson

Oct 1, 2021, 4:04:42 PM

Huh? Ever heard of a futex, or an eventcount? Take a lock-free dynamic
stack into account. When it's empty we can use, say, a futex to wait on
it. This would be the slow-path. A fast-path is when the stack is not
empty. There is a big difference between a fast and a slow path. The
former is lock-free, the latter might have to wait.

This even works for wait-free structures. In this case, the fast-path
would be wait-free.
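
A rough sketch of that fast-path/slow-path split in portable C++, under some assumptions: the futex is replaced by a mutex and condition variable that are only touched on the slow path, and the eventcount below is a simplified one written for this example, not a canonical implementation. All names (lf_stack, eventcount, consume_sum) are hypothetical. The stack is drained with a single exchange, which sidesteps the ABA problem a node-by-node pop would have.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

struct node { node* next; int value; };

class lf_stack {
    std::atomic<node*> m_head{nullptr};
public:
    void push(node* n) { // lock-free: CAS retry loop, never blocks
        n->next = m_head.load(std::memory_order_relaxed);
        while (!m_head.compare_exchange_weak(n->next, n,
                   std::memory_order_release, std::memory_order_relaxed)) {}
    }
    node* flush() { // wait-free: one exchange takes every node at once
        return m_head.exchange(nullptr, std::memory_order_acquire);
    }
};

class eventcount {
    std::atomic<std::uint32_t> m_waiters{0};
    std::atomic<std::uint32_t> m_epoch{0};
    std::mutex m_mtx;
    std::condition_variable m_cv;
public:
    std::uint32_t prepare_wait() { // announce intent to wait, then take a ticket
        m_waiters.fetch_add(1, std::memory_order_seq_cst);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        return m_epoch.load(std::memory_order_seq_cst);
    }
    void cancel_wait() { m_waiters.fetch_sub(1, std::memory_order_seq_cst); }
    void wait(std::uint32_t key) { // slow path: may block in the kernel
        std::unique_lock<std::mutex> lk(m_mtx);
        m_cv.wait(lk, [&] { return m_epoch.load(std::memory_order_relaxed) != key; });
        m_waiters.fetch_sub(1, std::memory_order_seq_cst);
    }
    void notify() {
        std::atomic_thread_fence(std::memory_order_seq_cst);
        if (m_waiters.load(std::memory_order_seq_cst) == 0) return; // fast path: nobody waits
        { std::lock_guard<std::mutex> lk(m_mtx); m_epoch.fetch_add(1, std::memory_order_seq_cst); }
        m_cv.notify_all();
    }
};

// Producer pushes 1..count; the consumer blocks (no spinning) when the stack is empty.
long consume_sum(int count) {
    assert(count >= 0);
    lf_stack stack;
    eventcount ec;
    std::vector<node> nodes(static_cast<std::size_t>(count));
    std::thread producer([&] {
        for (int i = 0; i < count; ++i) {
            nodes[i].value = i + 1;
            stack.push(&nodes[i]);
            ec.notify();
        }
    });
    long sum = 0;
    int seen = 0;
    while (seen < count) {
        node* n = stack.flush();
        if (!n) {
            std::uint32_t key = ec.prepare_wait();
            n = stack.flush(); // re-check after announcing the wait
            if (!n) { ec.wait(key); continue; }
            ec.cancel_wait();
        }
        for (; n; n = n->next) { sum += n->value; ++seen; }
    }
    producer.join();
    return sum;
}
```

When the stack is non-empty, neither side takes the mutex: push and flush are plain atomic operations, and notify returns after one load. The kernel is only involved when the consumer finds the stack empty twice in a row.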

> Lock-free and wait-free data structures are idiocy, except
> for lock-free stacks.

Sure. Whatever you say Bonita. Cough.... Cough... ;^)

Chris M. Thomasson

Oct 1, 2021, 4:08:52 PM

That's odd. The links I posted work for me. Just checked them. Humm...

Chris M. Thomasson

Oct 1, 2021, 4:11:38 PM

Thank you very much for the kind comment! :^)

> Could use maybe a few more colours though.

Yeah, they could. Humm... I am working on another animation using a more
robust coloring algorithm for the field lines. Here is an example:

https://fractalforums.org/gallery/1612-300921002353.png

The coloring for that one is more dynamic. Also, I am thinking about
cycling the colors during animation.

Chris M. Thomasson

Oct 1, 2021, 4:19:47 PM

She seems to make absolute statements based on her rather narrow views.
Iirc, she even said something to the effect that a mutex implementation
must use compare-and-swap. I showed her how to do one using only exchange.
Recently, in this thread she said a lock-free structure, say a lock-free
stack, has to use spin-polling to wait for an empty state to become
non-empty. Well, she is wrong, yet again.

Chris M. Thomasson

Oct 1, 2021, 4:20:27 PM

On 10/1/2021 5:59 AM, Bonita Montero wrote:

So a lock-free stack is okay with you, but not a lock-free queue? Why?

Branimir Maksimovic

Oct 1, 2021, 4:53:25 PM

Watch out that "wait-free" isn't just spinning with a more complex algorithm,
doing nothing useful :P
Here is my mutex:
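format elf64

FUTEX_WAIT equ 0
FUTEX_WAKE equ 1
FUTEX_PRIVATE_FLAG equ 128

public futex_acquire
public futex_release

; futex_acquire: rdi = address of the 32-bit lock word
futex_acquire:
        push rbx
        push r15
;       push r10
        mov r15,rdi                     ; keep the lock address across the syscall
.L0:
        mov ebx,1
        xor eax,eax
        lock cmpxchg [r15],ebx          ; if [r15] == 0 then [r15] = 1; eax = old value
        test eax,eax
        jz .L1                          ; old value was 0: lock acquired
        mov eax, 202                    ; futex syscall number on x86-64 Linux
        mov rdi, r15
        mov rsi, FUTEX_WAIT or FUTEX_PRIVATE_FLAG
        mov edx, 1                      ; sleep only while the lock word is still 1
        xor r10,r10                     ; no timeout
        syscall
        jmp .L0                         ; woken up or value changed: retry
.L1:;   pop r10
        pop r15
        pop rbx
        ret

; futex_release: rdi = address of the 32-bit lock word
futex_release:
        lock and dword[rdi],0           ; clear the lock word
        mov eax,202                     ; futex syscall number
;       mov rdi, sema
        mov rsi, FUTEX_WAKE or FUTEX_PRIVATE_FLAG
        mov edx,1                       ; wake at most one waiter
        syscall
        ret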

>
>
>> Lock-free and wait-free datastructures are idiocracy except from lock-free
>> stacks.
>
> Sure. Whatever you say Bonita. Cough.... Cough... ;^)

Heh, rsyncing something, looking bloat and have no patience :p

--

7-77-777
Evil Sinner!

Branimir Maksimovic

Oct 1, 2021, 4:54:27 PM

Safari, iCloud private relay.... ipv6

--

7-77-777
Evil Sinner!

Chris M. Thomasson

Oct 1, 2021, 5:24:25 PM

True! Imvvvho, wait-free should be loopless. Now, I always got nervous
over trying to implement wait-free with LL/SC, an optimistic atomic
primitive. I prefer to implement them using pessimistic atomic ops like
an atomic exchange or fetch-and-add. However, then again atomic exchange
can be implemented by LL/SC, or CAS. This is not Kosher in my mind.
Quite fond of the pessimistic x86 XADD or XCHG atomic OPS when it comes
to wait-free because there is no looping involved. Actually, CAS in x86
can be used for a state machine without looping because it always
returns a result. If CAS on x86 fails we have the reason why. This is
different than LL/SC that can spuriously fail.
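
To make the distinction concrete, a small sketch in standard C++ (the function names are invented for this example). fetch_add is a single pessimistic RMW with no retry loop in the source; on x86 compilers typically lower it to one LOCK-prefixed instruction such as LOCK XADD, while on LL/SC machines it may itself become a loop. The CAS version retries until it wins, so it is lock-free but not wait-free.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Pessimistic RMW: always succeeds, nothing to retry.
inline void add_with_xadd(std::atomic<unsigned long>& c) {
    c.fetch_add(1, std::memory_order_relaxed);
}

// Optimistic RMW: a thread can lose the race and has to try again.
inline void add_with_cas(std::atomic<unsigned long>& c) {
    unsigned long cur = c.load(std::memory_order_relaxed);
    while (!c.compare_exchange_weak(cur, cur + 1, std::memory_order_relaxed)) {
        // on failure, cur has been reloaded with the current value
    }
}

// Hypothetical driver: runs `add` from several threads and returns the total.
template <typename Fn>
unsigned long hammer(Fn add, unsigned threads, unsigned long iters) {
    assert(threads > 0);
    std::atomic<unsigned long> counter{0};
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (unsigned long i = 0; i < iters; ++i) add(counter);
        });
    for (auto& th : pool) th.join();
    return counter.load();
}
```

Both hammer(add_with_xadd, 4, 50000) and hammer(add_with_cas, 4, 50000) end up at 200000; the difference is only in the progress guarantee each thread gets along the way.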

Need to look at it, should work. Fwiw, here is an example using a futex
on windows. Yes it actually has one, lol! Btw, this is using XCHG only.
______________________________________
#include <iostream>
#include <thread>
#include <functional>

#define WIN32_LEAN_AND_MEAN
#include <Windows.h>

// WaitOnAddress/WakeByAddressSingle are exported by Synchronization.lib
// (MSVC auto-link pragma; link the library explicitly with other toolchains).
#pragma comment(lib, "Synchronization.lib")

#define CT_L2_ALIGNMENT 128
#define CT_THREADS 32
#define CT_ITERS 6666666

struct ct_futex_mutex
{
    // 0 = unlocked, 1 = locked, 2 = locked and possibly contended
    alignas(CT_L2_ALIGNMENT) ULONG m_state;

    ct_futex_mutex() : m_state(0)
    {

    }

    void lock()
    {
        // fast path: a single XCHG takes an uncontended lock
        if (InterlockedExchange(&m_state, 1))
        {
            // slow path: mark the lock contended, sleep while it stays 2
            while (InterlockedExchange(&m_state, 2))
            {
                ULONG cmp = 2;
                WaitOnAddress(&m_state, &cmp, sizeof(ULONG), INFINITE);
            }
        }
    }

    void unlock()
    {
        // only enter the kernel if somebody may be waiting
        if (InterlockedExchange(&m_state, 0) == 2)
        {
            WakeByAddressSingle(&m_state);
        }
    }
};

struct ct_shared
{
    ct_futex_mutex m_mtx;
    unsigned long m_count;

    ct_shared() : m_count(0) {}

    ~ct_shared()
    {
        if (m_count != 0)
        {
            std::cout << "counter is totally fubar!\n";
        }
    }
};

void ct_thread(ct_shared& shared)
{
    for (unsigned long i = 0; i < CT_ITERS; ++i)
    {
        shared.m_mtx.lock();
        ++shared.m_count;
        shared.m_mtx.unlock();

        shared.m_mtx.lock();
        --shared.m_count;
        shared.m_mtx.unlock();
    }
}

int main()
{
    std::thread threads[CT_THREADS];

    std::cout << "Starting up...\n";

    {
        ct_shared shared;

        for (unsigned long i = 0; i < CT_THREADS; ++i)
        {
            threads[i] = std::thread(ct_thread, std::ref(shared));
        }

        std::cout << "Running...\n";

        for (unsigned long i = 0; i < CT_THREADS; ++i)
        {
            threads[i].join();
        }
    }

    std::cout << "Completed!\n";

    return 0;
}
______________________________________

;^)

>>
>>
>>> Lock-free and wait-free datastructures are idiocracy except from lock-free
>>> stacks.
>>
>> Sure. Whatever you say Bonita. Cough.... Cough... ;^)
> Heh, rsyncing something, looking bloat and have no patience :p

;^)

Branimir Maksimovic

Oct 1, 2021, 6:14:04 PM

Practically the same thing, except for the Linux futex, which is the same thing.
Interestingly, Darwin does not have it, and I am really interested in
how Apple implements pthread_mutex.

--

7-77-777
Evil Sinner!

Chris M. Thomasson

Oct 1, 2021, 8:14:40 PM

Ohhhh... Good question. I am not sure about Darwin. Actually, there is a
way to implement the mutex I showed you using binary semaphores for the
slow path. Iirc, it went like this, with the futex part commented out,
and the initialization of the binary sema to zero, auto-reset event on
windoze, also out:
____________________________

struct ct_futex_mutex
{
    // 0 = unlocked, 1 = locked, 2 = locked and possibly contended
    alignas(CT_L2_ALIGNMENT) ULONG m_state;
    HANDLE m_event; // auto-reset event used as a binary semaphore; its creation is left out here

    ct_futex_mutex() : m_state(0)
    {

    }

    void lock()
    {
        if (InterlockedExchange(&m_state, 1))
        {
            while (InterlockedExchange(&m_state, 2))
            {

                //ULONG cmp = 2;
                //WaitOnAddress(&m_state, &cmp, sizeof(ULONG), INFINITE);

                WaitForSingleObject(m_event, INFINITE);

            }
        }
    }

    void unlock()
    {
        if (InterlockedExchange(&m_state, 0) == 2)
        {

            //WakeByAddressSingle(&m_state);

            SetEvent(m_event);
        }
    }
};
____________________________

That works as well. Humm...
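
For anyone without a Windows box, roughly the same thing can be sketched in portable C++. This is an illustration under assumptions, not the code from the post: the auto-reset event is emulated with a std::mutex and a std::condition_variable, and the names auto_reset_event, xchg_mutex and contended_count are invented here. The lock word is still driven by nothing but atomic exchange.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for a Windows auto-reset event, i.e. a binary semaphore.
class auto_reset_event {
    std::mutex m_mtx;
    std::condition_variable m_cv;
    bool m_set = false;
public:
    void set() {
        { std::lock_guard<std::mutex> lk(m_mtx); m_set = true; }
        m_cv.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lk(m_mtx);
        m_cv.wait(lk, [&] { return m_set; });
        m_set = false; // consume the signal
    }
};

class xchg_mutex {
    // 0 = unlocked, 1 = locked, 2 = locked and possibly contended
    std::atomic<unsigned> m_state{0};
    auto_reset_event m_event;
public:
    void lock() {
        if (m_state.exchange(1)) {          // fast path failed
            while (m_state.exchange(2)) {   // slow path: mark contended
                m_event.wait();
            }
        }
    }
    void unlock() {
        unsigned prev = m_state.exchange(0);
        assert(prev != 0);                  // unlocking an unlocked mutex is a bug
        if (prev == 2) m_event.set();       // wake one possible waiter
    }
};

// Hypothetical stress driver: the result must equal threads * iters.
unsigned long contended_count(unsigned threads, unsigned long iters) {
    xchg_mutex mtx;
    unsigned long count = 0;                // protected by mtx
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (unsigned long i = 0; i < iters; ++i) {
                mtx.lock();
                ++count;
                mtx.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return count;
}
```

A thread that acquires through the slow path leaves the state at 2, so its unlock always signals. That is why a signal that gets merged with an earlier one does not strand a waiter.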

https://github.com/apple/darwin-libpthread/blob/main/man/pthread_mutex_lock.3

https://github.com/apple/darwin-libpthread/blob/main/src/pthread_mutex.c

Need to examine this! There is another way to create a nice FIFO mutex
using a fast semaphore. Iirc, it was called a benaphore:

https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26

Chris M. Thomasson

Oct 1, 2021, 8:16:38 PM

On 10/1/2021 5:14 PM, Chris M. Thomasson wrote:
> On 10/1/2021 3:13 PM, Branimir Maksimovic wrote:
>> On 2021-10-01, Chris M. Thomasson <chris.m.t...@gmail.com> wrote:

[...]

> There is another way to create a nice FIFO mutex
> using a fast semaphore. Iirc, it was called a benaphore:
>
> https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26
>

Basically, it's adding the ability to avoid the kernel using a fast-path
on the semaphore logic. Also, it's wait-free on the fast-path because it
can be implemented using XADD.
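
A minimal benaphore sketch in portable C++, assuming a plain counting semaphore built from a mutex and a condition variable (simple_semaphore, benaphore and benaphore_count are names chosen for this example). The atomic counter is the fast path; the semaphore is only touched when there is contention. Whether waiters are served in FIFO order depends on the underlying semaphore, and a condition variable makes no such promise.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class simple_semaphore {
    std::mutex m_mtx;
    std::condition_variable m_cv;
    unsigned long m_count = 0;
public:
    void post() {
        { std::lock_guard<std::mutex> lk(m_mtx); ++m_count; }
        m_cv.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lk(m_mtx);
        m_cv.wait(lk, [&] { return m_count > 0; });
        --m_count;
    }
};

class benaphore {
    std::atomic<long> m_count{0}; // owner plus number of waiters
    simple_semaphore m_sem;
public:
    void lock() {
        // fast path: one fetch_add; previous value 0 means we own the lock
        if (m_count.fetch_add(1) > 0) m_sem.wait();
    }
    void unlock() {
        long prev = m_count.fetch_sub(1);
        assert(prev > 0);
        // previous value above 1 means somebody is waiting: hand over
        if (prev > 1) m_sem.post();
    }
};

// Hypothetical stress driver: the result must equal threads * iters.
unsigned long benaphore_count(unsigned threads, unsigned long iters) {
    benaphore mtx;
    unsigned long count = 0; // protected by mtx
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (unsigned long i = 0; i < iters; ++i) {
                mtx.lock();
                ++count;
                mtx.unlock();
            }
        });
    for (auto& th : pool) th.join();
    return count;
}
```

In the uncontended case, lock and unlock are one atomic add each and the semaphore is never used.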

Chris M. Thomasson

Oct 1, 2021, 9:00:57 PM

They are named nicely. Back on the SPARC acquire is:

MEMBAR #LoadStore | #LoadLoad

Release is:

MEMBAR #LoadStore | #StoreStore

Ahhh, then we have hardcore: #StoreLoad. SMR required this on the SPARC.

Chris M. Thomasson

Oct 1, 2021, 9:02:06 PM

Heck, it even required it on the x86! LOCK'ed atomic or MFENCE.
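
In C++ terms, an acquire fence roughly corresponds to #LoadLoad | #LoadStore, a release fence to #LoadStore | #StoreStore, and only a seq_cst fence also orders an earlier store against a later load (#StoreLoad). A hedged sketch of where SMR (hazard pointers) needs that last one; protect and protect_demo are illustrative names, not an API from any library:

```cpp
#include <atomic>
#include <cassert>

// Publish a hazard pointer for the object currently referenced by `src`.
// The store to `hazard` must become visible before `src` is re-read,
// which is a store followed by a load: the #StoreLoad case.
template <typename T>
T* protect(const std::atomic<T*>& src, std::atomic<T*>& hazard) {
    T* p = src.load(std::memory_order_relaxed);
    for (;;) {
        hazard.store(p, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst); // StoreLoad barrier
        T* q = src.load(std::memory_order_acquire);
        if (q == p) return p; // still current, and now protected
        p = q;                // src changed in between: try again
    }
}

// Single-threaded smoke test of the helper above.
bool protect_demo() {
    int value = 42;
    std::atomic<int*> shared{&value};
    std::atomic<int*> hazard{nullptr};
    int* p = protect(shared, hazard);
    return p == &value && hazard.load() == &value && *p == 42;
}
```

On x86 the seq_cst fence is typically emitted as MFENCE or a LOCK-prefixed instruction, which matches the remark above.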

Branimir Maksimovic

Oct 1, 2021, 9:44:55 PM

On 2021-10-02, Chris M. Thomasson <chris.m.t...@gmail.com> wrote:
>>>> FUTEX_WAKE equ 1

>> Practically the same thing, except for the Linux futex, which is the same thing.
>> Interestingly, Darwin does not have it, and I am really interested in
>> how Apple implements pthread_mutex.
>>
>
> Ohhhh... Good question. I am not sure about Darwin. Actually, there is a
> way to implement the mutex I showed you using binary semaphores for the
> slow path. Iirc, it went like this, with the futex part commented out,
> and the initialization of the binary sema to zero, auto-reset event on
> windoze, also out:

The problem is that Apple acts like student newbies. They don't care about
API stability and code in general. It is like pouring from the empty into
the void; you have to waste a lot of time when programming for macOS...

> ____________________________
> struct ct_futex_mutex
> {
> ULONG alignas(CT_L2_ALIGNMENT) m_state;
>
> ct_futex_mutex() : m_state(0)
> {
>
> }
>
> void lock()
> {
> if (InterlockedExchange(&m_state, 1))
> {
> while (InterlockedExchange(&m_state, 2))
> {
> //ULONG cmp = 2;
> //WaitOnAddress(&m_state, &cmp, sizeof(ULONG), INFINITE);
>
> WaitForSingleObject(m_event, INFINITE);
> }
> }
> }
>
> void unlock()
> {
> if (InterlockedExchange(&m_state, 0) == 2)
> {
> //WakeByAddressSingle(&m_state);
>
> SetEvent(m_event);
> }
> }
> };
> ____________________________
>
> That works as well. Humm...

yeah...

>
> https://github.com/apple/darwin-libpthread/blob/main/man/pthread_mutex_lock.3
>
> https://github.com/apple/darwin-libpthread/blob/main/src/pthread_mutex.c
>
> Need to examine this! There is another way to create a nice FIFO mutex
> using a fast semaphore. Iirc, it was called a benaphore:
>
> https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26

I'll look...

--

7-77-777
Evil Sinner!

Branimir Maksimovic

Oct 1, 2021, 10:15:03 PM

Here is Apple version:
#include <locale>
#include <iostream>
#include <thread>
#include <semaphore.h>
#include <functional>

#define CT_L2_ALIGNMENT 128
#define CT_THREADS 32
#define CT_ITERS 666666

using ULONG = unsigned long;

struct ct_futex_mutex
{
    sem_t sema;
    // 0 = unlocked, 1 = locked, 2 = locked and possibly contended
    alignas(CT_L2_ALIGNMENT) ULONG m_state;

    ct_futex_mutex() : m_state(0)
    {
        // Caveat: macOS declares sem_init() but does not implement unnamed
        // POSIX semaphores (the call fails with ENOSYS); there a
        // dispatch_semaphore_t or a named semaphore from sem_open() is
        // needed for the slow path to actually block.
        sem_init(&sema,0,0);
    }

    void lock()
    {
        // __sync_swap is Clang's atomic exchange builtin
        if (__sync_swap(&m_state, 1))
        {
            while (__sync_swap(&m_state, 2))
            {
                sem_wait(&sema);
            }
        }
    }

    void unlock()
    {
        if (__sync_swap(&m_state, 0) == 2)
        {
            sem_post(&sema);

        }
    }
};

struct ct_shared
{
    ct_futex_mutex m_mtx;
    unsigned long m_count;

    ct_shared() : m_count(0) {}

    ~ct_shared()
    {
        if (m_count != 0)
        {
            std::cout << "counter is totally fubar!\n";
        }
    }
};

void ct_thread(ct_shared& shared)
{
    for (unsigned long i = 0; i < CT_ITERS; ++i)
    {
        shared.m_mtx.lock();
        ++shared.m_count;
        shared.m_mtx.unlock();

        shared.m_mtx.lock();
        --shared.m_count;
        shared.m_mtx.unlock();
    }
}

int main()
{

    std::locale mylocale("");
    std::cout.imbue(mylocale); // use locale number formatting style
    std::thread *threads = new std::thread[CT_THREADS];

    std::cout << "Starting up...\n";

    {
        ct_shared shared;

        for (unsigned long i = 0; i < CT_THREADS; ++i)
        {
            threads[i] = std::thread(ct_thread, std::ref(shared));
        }

        std::cout << "Running...\n";

        for (unsigned long i = 0; i < CT_THREADS; ++i)
        {
            threads[i].join();
        }
    }

    std::cout << "Completed!\n";

    delete[] threads; // release the heap-allocated thread array

    return 0;
}

/**

;^)

>>
>>
>>> Lock-free and wait-free datastructures are idiocracy except from lock-free
>>> stacks.
>>
>> Sure. Whatever you say Bonita. Cough.... Cough... ;^)
> Heh, rsyncing something, looking bloat and have no patience :p

;^)

*/

--

7-77-777
Evil Sinner!

Chris M. Thomasson

Oct 1, 2021, 10:34:11 PM

[...]

Excellent! That is basically identical to the bin-sema version! For what
it's worth, look up a man by the name of Alexander Terekhov!
And look at the code in pthreads-win32 sources. I used to converse with
this genius way back on comp.programming.threads. He was a pleasure to
talk to. Actually, I am SenderX way back here:

https://groups.google.com/g/comp.programming.threads/c/KepRbFWBJA4/m/pg83oJTzPUIJ

;^)

Bonita Montero

Oct 2, 2021, 12:40:03 AM

On 01.10.2021 at 22:04, Chris M. Thomasson wrote:
> On 10/1/2021 5:59 AM, Bonita Montero wrote:
>> On 01.10.2021 at 13:43, Öö Tiib wrote:
>>> On Friday, 1 October 2021 at 04:32:18 UTC+3, Bonita Montero wrote:
>>>> On 30.09.2021 at 22:20, Chris M. Thomasson wrote:
>>>>> On 9/30/2021 3:39 AM, Bonita Montero wrote:
>>>>>>> That's a bit vague. One can create highly efficient bidirectional
>>>>>>> communication between two threads using two wait-free
>>>>>>> single-producer/single-consumer queues without using any atomic
>>>>>>> RMW's, just atomic loads, stores, and some cleverly placed
>>>>>>> membars.
>>>>>>
>>>>>> When you use wait-free or lock-free algorithms and there's no data
>>>>>> you have to poll.
>>>>>
>>>>> Why?
>>>> Because you don't have to wait in the kernel.
>>>
>>> What does a thread do when its input queue is empty?
>>
>> If it has nothing to do other than "waiting" for a new entry, it
>> spins.
>
> Huh? Ever heard of a futex, or an eventcount?

Lock-free is without any kernel-structures and polling only.
No futex.

Bonita Montero

Oct 2, 2021, 12:40:55 AM

Because the use cases of lock-free stacks are such that the
stacks are never polled.

Chris M. Thomasson

Oct 2, 2021, 1:12:44 AM

Lock-free on the fast-path... Ever heard of such a thing? Wow.

Chris M. Thomasson

Oct 2, 2021, 1:13:06 AM

Huh? What are you talking about?

Chris M. Thomasson

Oct 2, 2021, 1:14:20 AM

Never polled? A slow-path on a lock-free stack can be waited on.

Bonita Montero

Oct 2, 2021, 1:35:50 AM

There is no slow path with lock-free structures.

Bonita Montero

Oct 2, 2021, 1:36:37 AM

Then it isn't lock-free.
I think I'm talking to a complete moron here.

Chris M. Thomasson

Oct 2, 2021, 1:49:44 AM

Ummmm... You are just trolling me right? A slow path would be what to do
when one needs to wait, on say, an empty condition? Humm... Why do you
troll?

Chris M. Thomasson

Oct 2, 2021, 1:55:18 AM

Fast-path lock/wait-free, slow-path might have to hit the kernel to wait
on certain conditions. You seem to have never differentiated between the
two possible paths?

Bonita Montero

Oct 2, 2021, 2:00:26 AM

Lock-free is when there's no kernel-locking involved.
And a slow-path involves kernel-locking.

Bonita Montero

Oct 2, 2021, 2:00:54 AM

Chris M. Thomasson

Oct 2, 2021, 2:08:24 AM

One can use lock-free in user-space! A futex or eventcount can be used
to help it out. You know, if it must wait on an empty condition, well,
it can without spinning around. If the queue/stack is empty, we can
wait! Or not. Up to the programmer.

Chris M. Thomasson

Oct 2, 2021, 2:12:37 AM

A fast-path can be 100% pure lock/wait-free, so yes; indeed. A slow-path
can involve kernel locking. Are you late to the game?

> And a slow-path involves kernel-locking.

Yes. So, a lock/wait-free fast-path in user-space can be realized,
right? Think about it, then think about it again. So can a slow-path.
The target usage paradigm is to use a lot more fast-paths, than slow
ones... ;^)