blocking drain with lock-free Qs based on "Adventures with Memory Barriers and Seastar on Linux"

321 views
Skip to first unread message

Francesco Nigro

unread,
Jun 20, 2018, 4:34:19 AM6/20/18
to mechanica...@googlegroups.com
Hi guys!

I've recently read Adventures with Memory Barriers and Seastar on Linux of Avi K. and I was trying to use its pseudo-code to implement a batch drain method with optional wait/sleep when no items are detected.
Sadly, I wasn't able to use it without getting a deadlock, hence probably there is something I'm missing on it :(
On the left side there is the version that is deadlocking and is pretending to use the same pattern on the article, while on the right side there is a "fixed" version (based on take blocking strategy of JCtools) that is not deadlocking:


The offer side instead is quite simple:



I'm not able to understand why the version that just use the memory barriers provided by volatile set/get and queue.isEmpty/poll/offer seems not correct (at least in Java)...any idea?

Obviously the right side version could be "enhanced" by adding a !task.isEmpty check BEFORE the lock to avoid entering the lock, without creating any deadlock.

Consider that if I replace the volatile boolean with an atomic reference holding a potentially parked thread and the condition usage with LockSupport::unpark/park, the version on the left is working as expected...


One important note on the q: JCtools MpscArrayQueue has a full barrier on offer side just while moving the producer sequence and not while populating the slot, but q.isEmpty/q.poll is consistent with it (@nitsanw am I correct? ),

so the full memory barrier required while sending messages is embedded into the q offer.


Cheers,

Franz


Avi Kivity

unread,
Jun 20, 2018, 8:52:59 AM6/20/18
to mechanica...@googlegroups.com, Francesco Nigro



On 2018-06-20 11:34, Francesco Nigro wrote:
Hi guys!

I've recently read Adventures with Memory Barriers and Seastar on Linux of Avi K. and I was trying to use its pseudo-code to implement a batch drain method with optional wait/sleep when no items are detected.

It's good to see someone putting it into practice! The technique is somewhat esoteric.


Sadly, I wasn't able to use it without getting a deadlock, hence probably there is something I'm missing on it :(
On the left side there is the version that is deadlocking and is pretending to use the same pattern on the article, while on the right side there is a "fixed" version (based on take blocking strategy of JCtools) that is not deadlocking:




I don't understand the if (tasks.isEmpty()) { continue; } block. Shouldn't it be inverted? You want to sleep if there is nothing to do, not go back and poll again.


The offer side instead is quite simple:



I'm not able to understand why the version that just use the memory barriers provided by volatile set/get and queue.isEmpty/poll/offer seems not correct (at least in Java)...any idea?

One important note: JCtools MpscArrayQueue has a full barrier on offer side just while moving the producer sequence and not while populating the slot, but q.isEmpty/q.poll is consistent with it (@nitsanw am I correct? ),

so the full memory barrier required while sending messages is embedded into the q offer.


Cheers,

Franz


--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Francesco Nigro

unread,
Jun 20, 2018, 9:02:29 AM6/20/18
to mechanical-sympathy
Thanks Avi for the quick response and the article!!

It's good to see someone putting it into practice! The technique is somewhat esoteric.

The final objective would be to provide barrier "injection" into the producer from the consumer, but I have to think carefully if in Java it is somehow possible (maybe with mprotect or a file lock? not sure how to handle the signal or if Java will raise an exeception)...

I don't understand the if (tasks.isEmpty()) { continue; } block. Shouldn't it be inverted? You want to sleep if there is nothing to do, not go back and poll again.

You're right, I have already updated the post, but via email has been received only the first "stale" version...
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.

Francesco Nigro

unread,
Jun 20, 2018, 9:10:11 AM6/20/18
to mechanical-sympathy
@Avi 

With both Semaphore and park/unpark it seems to work as expected: only condition variables seems to behave differently and a unlucky "timing" between signal/await will cause the deadlock (signal happening before await) :(
Maybe is a newbie question, but I'm surprised that it hasn't worked...

Avi Kivity

unread,
Jun 20, 2018, 9:25:57 AM6/20/18
to mechanica...@googlegroups.com, Francesco Nigro



On 2018-06-20 16:02, Francesco Nigro wrote:
Thanks Avi for the quick response and the article!!

It's good to see someone putting it into practice! The technique is somewhat esoteric.

The final objective would be to provide barrier "injection" into the producer from the consumer, but I have to think carefully if in Java it is somehow possible (maybe with mprotect or a file lock? not sure how to handle the signal or if Java will raise an exeception)...


The mprotect() trick (which we since updated to use the more efficient madvise(MADV_DONTNEED) based on feedback on the article), and sys_membarrier() should also work from Java.


I don't understand the if (tasks.isEmpty()) { continue; } block. Shouldn't it be inverted? You want to sleep if there is nothing to do, not go back and poll again.

You're right, I have already updated the post, but via email has been received only the first "stale" version...


Are you sure that volatile reads insert a full barrier? The offer-side if (!running) should have an MFENCE before it. Look at the generated assembly if you can.


Il giorno mercoledì 20 giugno 2018 14:52:59 UTC+2, Avi Kivity ha scritto:



On 2018-06-20 11:34, Francesco Nigro wrote:
Hi guys!

I've recently read Adventures with Memory Barriers and Seastar on Linux of Avi K. and I was trying to use its pseudo-code to implement a batch drain method with optional wait/sleep when no items are detected.

It's good to see someone putting it into practice! The technique is somewhat esoteric.

Sadly, I wasn't able to use it without getting a deadlock, hence probably there is something I'm missing on it :(
On the left side there is the version that is deadlocking and is pretending to use the same pattern on the article, while on the right side there is a "fixed" version (based on take blocking strategy of JCtools) that is not deadlocking:




I don't understand the if (tasks.isEmpty()) { continue; } block. Shouldn't it be inverted? You want to sleep if there is nothing to do, not go back and poll again.


The offer side instead is quite simple:



I'm not able to understand why the version that just use the memory barriers provided by volatile set/get and queue.isEmpty/poll/offer seems not correct (at least in Java)...any idea?

One important note: JCtools MpscArrayQueue has a full barrier on offer side just while moving the producer sequence and not while populating the slot, but q.isEmpty/q.poll is consistent with it (@nitsanw am I correct? ),

so the full memory barrier required while sending messages is embedded into the q offer.


Cheers,

Franz


--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.

Francesco Nigro

unread,
Jun 20, 2018, 9:46:56 AM6/20/18
to mechanica...@googlegroups.com
 
Are you sure that volatile reads insert a full barrier? The offer-side if (!running) should have an MFENCE before it. Look at the generated assembly if you can.

On offer side MpscArrayQueue::offer *should* have a full barrier after it, that result in (pseudo-code):

q.offer
//StoreLoad
boolean run = running;
//LoadStore + LoadLoad
If(!run){
  //...
}

To be sure that it wasn't the reason of the deadlock I have put manually a full barrier between offer and if(!running) with no success: the only way 
I have found to avoid deadlocks is checking the queue emptyness on consumer side while holding the lock, before waiting.


 

Avi Kivity

unread,
Jun 20, 2018, 10:43:53 AM6/20/18
to mechanica...@googlegroups.com, Francesco Nigro



On 2018-06-20 16:46, Francesco Nigro wrote:
 
Are you sure that volatile reads insert a full barrier? The offer-side if (!running) should have an MFENCE before it. Look at the generated assembly if you can.

On offer side MpscArrayQueue::offer *should* have a full barrier after it, that result in (pseudo-code):

q.offer
//StoreLoad
boolean run = running;
//LoadStore + LoadLoad
If(!run){
  //...
}

To be sure that it wasn't the reason of the deadlock I have put manually a full barrier

How did you do that?

between offer and if(!running) with no success: the only way 
I have found to avoid deadlocks is checking the queue emptyness on cosumer side while holding the lock, before waiting.



Francesco Nigro

unread,
Jun 20, 2018, 10:58:44 AM6/20/18
to Avi Kivity, mechanica...@googlegroups.com

I have used the "dirty" way ie Unsafe::fullFence, but there are cleaner way to do that too: for some detail about the implementation (http://openjdk.java.net/jeps/171)

Avi Kivity

unread,
Jun 20, 2018, 12:42:49 PM6/20/18
to Francesco Nigro, mechanica...@googlegroups.com

Well, that looks correct, so I don't know why you deadlock.

Francesco Nigro

unread,
Jun 20, 2018, 1:01:20 PM6/20/18
to Avi Kivity, mechanica...@googlegroups.com
That's why I'm shocked...let's wait some of the other Java folks on the list 
to hear other opinions, maybe is just the behaviour of the condition variable in Java to be dependent by timing (differently from park/unpack)...
Anywah, thanks to have looked into it :)

Francesco Nigro

unread,
Jun 21, 2018, 9:31:40 AM6/21/18
to mechanical-sympathy
I have taken a look on the disruptor WaitStrategy and probably Mike Barker has played with condition variables and batch draining (although checking sequences, not queues): maybe he has some hints on it :)

John Hening

unread,
Jun 21, 2018, 5:43:38 PM6/21/18
to mechanical-sympathy
Don't think about Java Memory Model in terms of memory barriers because very often it can lead to incorrect conclusions. For example, it is allowed that your 16-19 lines at left side will be executed as:


boolean x = !tasks.isEmpty();
running
= false;

if(x){
 
continue;

}



From the other hand, I am not sure what do you mean by a deadlock. Deadlock is not possible here because you have only one reentrant lock. If you mean that polling thread can wait infinitelty on await, yes it can.

Francesco Nigro

unread,
Jun 21, 2018, 6:17:25 PM6/21/18
to mechanica...@googlegroups.com
TBH I'm not sure about it: running is declared as volatile and that means that while storing it is enforced a sequential consistent memory ordering with the subsequent statements ie isEmpty evaluation can't happen before it.
About the deadlock probably you're right: generally a deadlock could be defined as a situation in which 2 parties (offer and drain in this case) can't get an agreement and none can progress..
I think that's the case but effectively most computer science definitions involves more precise mechanics that this case seems to not belong.
I would say instead that the drain seems blocked.



Il gio 21 giu 2018, 23:43 John Hening <goci...@gmail.com> ha scritto:


Don't think about Java Memory Model in terms of memory barriers because very often it can lead to incorrect conclusions. For example, it is allowed that your 16-19 lines at left side will be executed as:




boolean x = !tasks.isEmpty();
running = false;

if(x){
 continue;

}




Please note: https://shipilev.net/blog/2014/jmm-pragmatics/#_jmm_interpretation_roach_motel



From the other hand, I am not sure what do you mean by a deadlock. Deadlock is not possible here because you have only one reentrant lock. If you mean that polling thread can wait infinitelty on await, yes it can.









W dniu środa, 20 czerwca 2018 10:34:19 UTC+2 użytkownik Francesco Nigro napisał:

Hi guys!


I've recently read Adventures with Memory Barriers and Seastar on Linux of Avi K. and I was trying to use its pseudo-code to implement a batch drain method with optional wait/sleep when no items are detected.
Sadly, I wasn't able to use it without getting a deadlock, hence probably there is something I'm missing on it :(
On the left side there is the version that is deadlocking and is pretending to use the same pattern on the article, while on the right side there is a "fixed" version (based on take blocking strategy of JCtools) that is not deadlocking:






The offer side instead is quite simple:





I'm not able to understand why the version that just use the memory barriers provided by volatile set/get and queue.isEmpty/poll/offer seems not correct (at least in Java)...any idea?


Obviously the right side version could be "enhanced" by adding a !task.isEmpty check BEFORE the lock to avoid entering the lock, without creating any deadlock.
Consider that if I replace the volatile boolean with an atomic reference holding a potentially parked thread and the condition usage with LockSupport::unpark/park, the version on the left is working as expected...


One important note on the q: JCtools MpscArrayQueue has a full barrier on offer side just while moving the producer sequence and not while populating the slot, but q.isEmpty/q.poll is consistent with it (@nitsanw am I correct? ),


so the full memory barrier required while sending messages is embedded into the q offer.


Cheers,
Franz






--

You received this message because you are subscribed to a topic in the Google Groups "mechanical-sympathy" group.

To unsubscribe from this topic, visit https://groups.google.com/d/topic/mechanical-sympathy/yKQNVFAjui0/unsubscribe.

To unsubscribe from this group and all its topics, send an email to mechanical-symp...@googlegroups.com.

John Hening

unread,
Jun 22, 2018, 2:16:33 AM6/22/18
to mechanica...@googlegroups.com
running is declared as volatile and that means that while storing it is enforced a sequential consistent memory ordering with the subsequent statements ie isEmpty evaluation can't happen before it.

It doesn't. Please look at https://shipilev.net/blog/2014/jmmpragmatics/#_jmm_interpretation_roach_motel (it is a great presentation)

I would say instead that the drain seems blocked.

What do you mean by 'blocked' exactly?

W dniu środa, 20 czerwca 2018 10:34:19 UTC+2 użytkownik Francesco Nigro napisał:

Francesco Nigro

unread,
Jun 22, 2018, 3:33:15 AM6/22/18
to mechanica...@googlegroups.com
It doesn't. Please look at https://shipilev.net/blog/2014/jmmpragmatics/#_jmm_interpretation_roach_motel (it is a great presentation)

Thanks John, I have already read it in the past, but giving it a refresh is not bad, considering how complicate is the subject :)

I will use the JLS itself to explain it: q::isEmpty is making uses of several volatile loads while running = false is a volatile store.
As https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.2 states the volatile writes and reads are synchronization actions and 
https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.4.4 states that for each thread the sycnhronization order implied by using volatile writes/reads must
be consistent with the program order ie the JVM cannot reorder synchronization actions on the same thread.
I think that it explains in JMM term that such optimization isn't possible AFAIK.
I don't know if my answer is complete enough, probably Gil Tene and other super-expert folks on the list could give a better and more detailed (less dirty!) explanation :)

What do you mean by 'blocked' exactly?

I mean that the poller thread is calling condition::wait without never being wake up.

John Hening

unread,
Jun 22, 2018, 4:23:14 AM6/22/18
to mechanica...@googlegroups.com
Francesco, you right.
I think that it explains in JMM term that such optimization isn't possible AFAIK.
If q::isEmpty contains synchronization access they will be executed in program order (I didn't know that isEmpty() has synchronization accesses).

I mean that the poller thread is calling condition::wait without never being wake up.


Why do you think that poller should be waken up, do you call offer side?

Are you sure that, for example?
poller await on the condition. offer tries to add something but it fails because the queue is full.


W dniu środa, 20 czerwca 2018 10:34:19 UTC+2 użytkownik Francesco Nigro napisał:

Francesco Nigro

unread,
Jun 22, 2018, 5:07:40 AM6/22/18
to mechanica...@googlegroups.com
Why do you think that poller should be waken up, do you call offer side? 

Please takes a look to the Adventures with Memory Barriers and Seastar on Linux that explain why if the poller is going to sleep it is supposed to be signaled by the offer side.
TLDR the ordering of operations imply that on every running = false on poller side there is a signaling on offer side; the problem seems that Condition variable, differently from park/unpark,
are sensitive on ordering between notify/wait. 
Indeed if you write a single threaded program that notify and right after wait it won't finish, while if you do it with unpark and park you will see it working without blocking.
What bugged me is why putting into the lock/unlock region the !q::isEmpty check to avoid going sleep it will save the day.
TBH if you take a look to the JMH bench there is a "intuitive" answer to that: the offer side will wait that each event has been executed (in my code) and if the poller will check the emptiness while in the locked region
it can avoid being blocked if there is at least an event to be executed (eg the one that the offer side is waiting to be finished), hence avoiding to rely on any notify that could be already heppened (right after running = false) to be awaken.

John Hening

unread,
Jun 22, 2018, 6:16:25 AM6/22/18
to mechanical-sympathy
Ok,

so I see. This is why I ask you why do you think that offer should awake poller. As you noticed it is possible that queue is not empty and poller went to sleep. In result the poller sleep to the end of world (or spurious wakeup).



W dniu środa, 20 czerwca 2018 10:34:19 UTC+2 użytkownik Francesco Nigro napisał:

Vitaly Davidovich

unread,
Jun 22, 2018, 7:05:10 AM6/22/18
to mechanical-sympathy
Condition is likely a poor choice for this, as you've observed.  It is more like a rendezvous point/synchronous handoff - there must be a waiter when a signal happens, or else the signal is lost.  LockSupport (park/unpark), on the other hand, is sticky - an unpark() will set a bit (permit) that persists until the next park() call.  As such, the producer can race ahead of the consumer and mark the permit available (via unpark()) and the consumer will observe it on its next park().

Given the above, the easiest case to reason about is the following in the consumer:
running = false;
if (!q.isEmpty()) { continue; }
// Assume consumer gets here, and now gets preempted by the OS scheduler
// Producer runs offer() to completion, but its signal is gone
// Consumer wakes up and goes to sleep until the next time a producer offers something but if that never happens, you're stuck.

By moving the q.isEmpty() check into the locked region, you enforce that one of two things happen:
1) Either you observe that q.isEmpty() == false, and then don't care about the signal at all - this helps if the producer raced ahead and signaled into the ether
2) You observe that q.isEmpty() == true (legitimately no items yet, let's say) and then producer will eventually signal you because you're guaranteed to have been put on the waiters list of the lock - producer cannot signal ahead of you, no matter the scheduling, because you're holding the lock all the way until you await().


--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.

Francesco Nigro

unread,
Jun 22, 2018, 7:47:49 AM6/22/18
to mechanica...@googlegroups.com
Thanks Vitaly: that's exactly the explanation that makes me clear what's happening!
So the problem is that such kind of algorithm were relying on a rendezvous point/synchronous handoff-like behaviour: I suppose that on of the version used on ScyllaDb was making uses of something 
similar instead of condition variables. With "barriers injection" I think is the same, because the barrier can't be put on the producer until the consumer will inject it (right before going to sleep), makinig 
impossible that a producer can lost a signal.
What is interesting IMO is that this mechanics (the simpler one I have implemented in Java) seems to be effective if you rely on synchronization actions of q::offer and q::isEmpty ie by reading its code: 
it makes me clear that adding such informations on the JavaDoc is quite important if these guarantees need to be provided externally.

Il giorno venerdì 22 giugno 2018 13:05:10 UTC+2, Vitaly Davidovich ha scritto:
Condition is likely a poor choice for this, as you've observed.  It is more like a rendezvous point/synchronous handoff - there must be a waiter when a signal happens, or else the signal is lost.  LockSupport (park/unpark), on the other hand, is sticky - an unpark() will set a bit (permit) that persists until the next park() call.  As such, the producer can race ahead of the consumer and mark the permit available (via unpark()) and the consumer will observe it on its next park().


Given the above, the easiest case to reason about is the following in the consumer:
running = false;
if (!q.isEmpty()) { continue; }
// Assume consumer gets here, and now gets preempted by the OS scheduler
// Producer runs offer() to completion, but its signal is gone
// Consumer wakes up and goes to sleep until the next time a producer offers something but if that never happens, you're stuck.


By moving the q.isEmpty() check into the locked region, you enforce that one of two things happen:
1) Either you observe that q.isEmpty() == false, and then don't care about the signal at all - this helps if the producer raced ahead and signaled into the ether
2) You observe that q.isEmpty() == true (legitimately no items yet, let's say) and then producer will eventually signal you because you're guaranteed to have been put on the waiters list of the lock - producer cannot signal ahead of you, no matter the scheduling, because you're holding the lock all the way until you await().




On Fri, Jun 22, 2018 at 10:16 AM John Hening <goci...@gmail.com> wrote:


Ok,


so I see. This is why I ask you why do you think that offer should awake poller. As you noticed it is possible that queue is not empty and poller went to sleep. In result the poller sleep to the end of world (or spurious wakeup).






W dniu środa, 20 czerwca 2018 10:34:19 UTC+2 użytkownik Francesco Nigro napisał:
Hi guys!


I've recently read Adventures with Memory Barriers and Seastar on Linux of Avi K. and I was trying to use its pseudo-code to implement a batch drain method with optional wait/sleep when no items are detected.
Sadly, I wasn't able to use it without getting a deadlock, hence probably there is something I'm missing on it :(
On the left side there is the version that is deadlocking and is pretending to use the same pattern on the article, while on the right side there is a "fixed" version (based on take blocking strategy of JCtools) that is not deadlocking:






The offer side instead is quite simple:





Reply all
Reply to author
Forward
0 new messages