Explicit Lock Doubt


Francesco Nigro

Jun 11, 2014, 11:34:20 AM
to mechanica...@googlegroups.com
Hi, during my studies of concurrency in Java (my sources were the JSR-133 Cookbook and the related FAQ, all the articles on the Mechanical Sympathy blog, the book Java Concurrency in Practice, and many others) I've gained a lot of confusion and new doubts alongside the knowledge I have acquired...
The last one (and maybe the most important for me) was born when I tried to develop an explicit reentrant lock for IPC using a memory-mapped file:
in the implementation I've aligned the memory inside the mapped file to obtain atomicity even on old x86 architectures, and I used a 4-byte region to implement my CAS to simulate the tryLock/lock and unlock semantics of the intrinsic lock that we hate/love in Java...
But I don't want to explain more of my implementation; I prefer to reason about what happens between the (usual) calls:

lock();
try {
    ...do something..
} finally {
    unlock();
}
the "...do something.." part specifically.
How can any explicit lock implementation (not only mine) ensure that the JVM does not try to reorder the instructions inside that region of code?
If I use a CAS on a memory region or on a volatile variable (with Unsafe), I'm ensuring that all the instructions that use it in any form are correctly ordered with respect to each other, but why should this affect the order of the others?
It's (almost) clear to me that with such an explicit lock every "guarded" region would be executed by only one thread at a time, but why should I be able to rely on the visibility of the executed code?
Please heeeeeeeelp... my head is bursting :(

Olivier Bourgain

Jun 12, 2014, 10:20:52 AM
to mechanica...@googlegroups.com
Hi,




On Wednesday, June 11, 2014 at 17:34:20 UTC+2, Francesco Nigro wrote:
Hi, during my studies of concurrency in Java (my sources were the JSR-133 Cookbook and the related FAQ, all the articles on the Mechanical Sympathy blog, the book Java Concurrency in Practice, and many others) I've gained a lot of confusion and new doubts alongside the knowledge I have acquired...
The last one (and maybe the most important for me) was born when I tried to develop an explicit reentrant lock for IPC using a memory-mapped file:
in the implementation I've aligned the memory inside the mapped file to obtain atomicity even on old x86 architectures, and I used a 4-byte region to implement my CAS to simulate the tryLock/lock and unlock semantics of the intrinsic lock that we hate/love in Java...
But I don't want to explain more of my implementation; I prefer to reason about what happens between the (usual) calls:

lock();
try {
    ...do something..
} finally {
    unlock();
}

the "...do something.." part specifically.
How can any explicit lock implementation (not only mine) ensure that the JVM does not try to reorder the instructions inside that region of code?
Instructions inside the block can be reordered, but under the standard rules of reordering: single-thread behavior must remain the same. Additionally, they cannot be moved outside of the block. CAS/volatile and some other operations provide what is called a memory barrier.
 
If i use a CAS over a memory region or over a volatile variable (with Unsafe) i'm ensuring that all the instructions that use it in any form are correctly ordered each other but why this should affect the order of the others?
Because these instructions cannot be reordered across the barrier.
 
For me it's (almost) clear the with such an explicit lock every "guarded" region would be executed only by one thread at a time but why i should rely on the visibility of the executed code?
I am not sure I understand this question. You rely on the visibility of the memory operations performed under the same lock before you acquired it, and on the guarantee that your memory operations will be visible to any other thread that acquires the lock after you release it.
 
Please heeeeeeeelp....My head is bursting :( 

Mine too, often.

Francesco Nigro

Jun 12, 2014, 11:00:22 AM
to mechanica...@googlegroups.com
Thanks Olivier,

maybe it's time for me to search for a good source of information (for a newbie like me :)) about memory barriers/fences (with a loooot of examples too)...

Prior to your post I was convinced that the effect of such barriers (at the compiler/runtime level... not the CPU) was to mark a couple (literally!) of instructions that act on the same memory address, so that the compiler/runtime must build its "happens-before" graph and ensure the correct (re)ordering of only those two instructions and not the others... maybe the effects of these fences are completely different from what I've understood...
If any of you has a good source of information about this topic, it is welcome... because I'm encountering a lot of difficulty thinking about the effects of these kinds of instructions at every level of abstraction concurrently (the memory model of Java + the memory model of the CPU) while programming...
The Mechanical Sympathy blog has ruined my life... now I CANNOT ignore what happens under the hood :P

Vitaly Davidovich

Jun 12, 2014, 11:36:58 AM
to mechanica...@googlegroups.com

Short answer is that fences affect both compiler and CPU ordering, not just compiler code motion.

Read this, it's very good: https://www.kernel.org/doc/Documentation/memory-barriers.txt

Sent from my phone

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Francesco Nigro

Jun 13, 2014, 3:36:57 AM
to mechanica...@googlegroups.com
Thanks... I'm starting to read it!

:)

Francesco

Olivier Bourgain

Jun 13, 2014, 5:30:44 AM
to mechanica...@googlegroups.com
The JSR-133 Cookbook is interesting. But be careful not to take everything in it as true, as it describes a conservative way of implementing the JMM; real implementations may differ.

I really like Preshing's blog: while C/C++ oriented, it is both well explained and very deep.

Francesco Nigro

Jun 15, 2014, 6:29:19 AM
to mechanica...@googlegroups.com
Hi Olivier,

I'm literally devouring Preshing's blog, and I'm realizing that my ignorance about all this "low level" stuff is more extensive than I thought... That's good! :)

A couple of months ago I wrote an email to Martin Thompson asking whether I need to learn C++/assembly to improve my knowledge of what the computer does at every level, in order to produce low-latency code more easily... and all of your answers seem to lead to the same conclusion, even collaterally... I'm using Java in my everyday job (producing ERP/CRM portals and other similar stuff) and the "low-latency pressure" is only my own concern, so the time spent learning will not be directly repaid, not in money at least...
I know it sounds immature, but sometimes I wish I had 50-hour days so I'd have the time to learn what I like most :)

Gil Tene

Jun 15, 2014, 10:52:09 AM
to mechanica...@googlegroups.com


On Wednesday, June 11, 2014 11:34:20 AM UTC-4, Francesco Nigro wrote:
... in the implementation I've aligned the memory inside the mapped file to obtain atomicity even on old x86 architectures, and I used a 4-byte region to implement my CAS to simulate the tryLock/lock and unlock semantics of the intrinsic lock that we hate/love in Java...

 
But I don't want to explain more of my implementation; I prefer to reason about what happens between the (usual) calls:

lock();
try {
    ...do something..
} finally {
    unlock();
}

A tip about something to be VERY careful about when implementing your lock/tryLock/unlock on top of a mapped file in Java: you may be tempted to use ordered operations (like an Unsafe CAS) only for the lock and tryLock operations, and to use a regular (unfenced) store for the unlock (e.g. a buffer put operation, or an Unsafe putInt()). Resist this temptation.



The cause of this common roll-your-own-lock mistake usually stems from looking only at the atomicity requirements, without considering the ordering requirements. The lock word's atomicity can be safely established by using a CAS for the lock and a regular unordered store for the unlock, because the unlock can only be performed by the owning thread. However, without the unlocking store operation establishing volatile-store-equivalent ordering, loads and stores inside the locked region can (and will) be scheduled past the unlocking store in the code, and lose the presumed protection of the lock. This reordering will happen even when the CPU's ordering rules do not allow it (e.g. x86 will maintain LoadStore and LoadLoad order for the instruction sequence it sees), because the reordering will often be done by the JIT compiler, before the CPU ever sees the instructions.

So make sure your unlock operation carries at least volatile-store semantics. Since MappedByteBuffer does not provide a direct way to do this, you can use Unsafe.putIntVolatile(), or issue a volatile store to another globally visible Java field ahead of the regular [unordered] unlocking store to the lock address in your mapped file.
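As an on-heap illustration of Gil's tip (a sketch with names of my choosing; the real lock word lives in the mapped file and would be addressed through Unsafe or a ByteBuffer), lock() uses a CAS and unlock() is a full volatile store:

```java
import java.util.concurrent.atomic.AtomicInteger;

// On-heap stand-in for the mapped-file lock word (illustrative sketch only).
// The point of the tip above: lock() uses a CAS, and unlock() must be a full
// volatile store, which AtomicInteger.set() provides.
final class FileLockSketch {
    private final AtomicInteger lockWord = new AtomicInteger(0); // 0 = free, 1 = held

    boolean tryLock() {
        return lockWord.compareAndSet(0, 1);
    }

    void lock() {
        while (!tryLock()) {
            Thread.onSpinWait(); // spin hint (Java 9+); a plain busy-wait otherwise
        }
    }

    void unlock() {
        // Volatile store: loads and stores in the critical section cannot be
        // scheduled past this release. A plain or lazy store here would be
        // exactly the roll-your-own-lock bug described above.
        lockWord.set(0);
    }
}
```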

Francesco Nigro

Jun 16, 2014, 3:50:00 AM
to mechanica...@googlegroups.com
Hi Gil, and thanks for the advice :),

For the unlock part I have used Unsafe.putOrderedInt, on the basis that in a normal on-heap implementation of an explicit lock I would use AtomicXXX.lazySet... I'm hoping that the ordering guarantee remains the same as putIntVolatile, but maybe it induces more failed CASes in a subsequent lock call, because the retrieval of the current state of the lock (a getIntVolatile) in the CAS loop could observe stale values of the lock... I'm not sure whether the StoreStore barrier emitted on the unlock is enough... am I wrong?

Gil Tene

Jun 16, 2014, 7:27:00 AM
to mechanica...@googlegroups.com
That's exactly what the tip/warning was warning against...


On Monday, June 16, 2014 3:50:00 AM UTC-4, Francesco Nigro wrote:
Hi Gil,and thanks for the advices :),

For the unlock part I have used Unsafe.putOrderedInt, on the basis that in a normal on-heap implementation of an explicit lock I would use AtomicXXX.lazySet... I'm hoping that the ordering guarantee remains the same as putIntVolatile, but maybe it induces more failed CASes in a subsequent lock call, because the retrieval of the current state of the lock (a getIntVolatile) in the CAS loop could observe stale values of the lock... I'm not sure whether the StoreStore barrier emitted on the unlock is enough... am I wrong?

Yup. Unfortunately you are wrong... Unsafe.putOrderedInt is not strong enough for an unlock. 

Unsafe.putOrderedInt will prevent stores done before the operation from flowing past the operation (as you note, it is equivalent to placing a StoreStore barrier ahead of the operation), so stores in the locked region are prevented from moving past the unlock.

But Unsafe.putOrderedInt does not imply or include the equivalent of a LoadStore barrier, so any load that happens inside the locked region can move past your "unlock", which means your lock does not protect any reads (that are not otherwise ordered against other stores inside the lock)...

And before people jump in with "but on x86 there is an implicit LoadStore order on everything" (which is correct), remember that I'm not talking about the CPU. The JIT compiler will move those loads out of the locked region before the CPU code is created.

For a specific example, the following code:

// Read coherent x and y values from point object:

lock(point); // Implemented with CAS loop
try {
  x = point.x;
  y = point.y;
} finally {
  unlock(point); // wrongly implemented using only lazySet() or unsafe.putOrderedInt()
}

can (and will often) be legitimately reordered to:

// FAIL TO Read coherent x and y values from point object:

lock(point); // Implemented with CAS loop
unlock(point); // wrongly implemented using only lazySet() or unsafe.putOrderedInt()
x = point.x;
y = point.y;

Martin Thompson

Jun 16, 2014, 7:33:03 AM
to mechanica...@googlegroups.com
This is why we need Unsafe.loadFence() in Java 8 to make StampedLock work efficiently.
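For context, the optimistic-read pattern that StampedLock enables looks roughly like this (a sketch with class and field names of my choosing; validate() is where the load fence matters, so the racy reads above it cannot be ordered after the validation):

```java
import java.util.concurrent.locks.StampedLock;

// Sketch of StampedLock's optimistic read. validate(stamp) must issue a load
// fence so the plain reads of x and y cannot be reordered past the validation.
final class Point {
    private final StampedLock sl = new StampedLock();
    private double x, y;

    void move(double dx, double dy) {
        long stamp = sl.writeLock();
        try { x += dx; y += dy; } finally { sl.unlockWrite(stamp); }
    }

    double distanceFromOrigin() {
        long stamp = sl.tryOptimisticRead(); // no store, just a stamp
        double cx = x, cy = y;               // racy plain reads...
        if (!sl.validate(stamp)) {           // ...made safe by the load fence here
            stamp = sl.readLock();           // fall back to a pessimistic read lock
            try { cx = x; cy = y; } finally { sl.unlockRead(stamp); }
        }
        return Math.hypot(cx, cy);
    }
}
```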




Gil Tene

Jun 16, 2014, 8:07:23 AM
to mechanica...@googlegroups.com
Yeah, but Unsafe.loadFence() is "too strong" for what we need here, as it includes an unneeded LoadLoad fence that not even a volatile store would force. I wish the API had finer-grained ones (loadLoadFence(), loadStoreFence(), storeStoreFence(), storeLoadFence()). Perhaps we'll get that in Java SE 9...

Strictly speaking, the combination of Unsafe.loadFence() with Unsafe.putOrdered() is also not enough for unlock semantics, since it is missing the StoreLoad fence that a monitorExit implies, and that would be "intuitively" implied by any unlock or release semantics. The issue here is subtle: while it is OK to move subsequent regular loads "backwards" into a locked region, it is NOT OK to do the same with volatile loads, as moving volatile loads backwards past unlocks or volatile stores can have "surprising" behavioral effects. The StoreLoad barrier is there to guard against that move of a volatile load (and only that move). In the case of an unlock, it's not really the moving of the volatile load past the unlocking store that's the problem (if it moved to "just before the unlock store", there wouldn't really be a detectable semantic difference). It's the fact that once in there, it can also move backwards past regular loads and stores in the locked region. And since we tend to think of regular stores and loads as being ordered by the lock (against other lock-protected regular loads and stores, and against outside-the-lock volatile loads and stores), this move can create "surprising" side effects.

Most runtimes choose to satisfy this ordering requirement (volatile loads not crossing backwards into a locked region, or backwards past a volatile store) by placing StoreLoad barriers at the monitorExit and volatile store operations (rather than ahead of each volatile load). As a result, missing the StoreLoad in a roll-your-own unlock will still make it buggy.

Together, this means that your valid choices (even with the Java 8 Unsafe fence APIs) are:
1. Put an unsafe.loadFence() *and* an unsafe.storeFence() before the unlock's store operation.
2. Put an unsafe.fullFence() before the unlock's store operation (which is basically the same as #1).
3. Use an unsafe.putIntVolatile() (or equivalent) for the unlock.

Of these, #3 is the cheapest: #1 and #2 include all the ordering that #3 does, but #3 avoids the unnecessary LoadLoad barrier that would prevent regular loads from floating backward into the locked region in some optimizations.
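For readers on Java 9+, a hedged sketch of option #3 in VarHandle terms (VarHandle has since replaced most Unsafe usage; class and field names here are mine):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Post-Java-9 sketch of option #3: compareAndSet implements the lock, and
// setVolatile is the putIntVolatile-equivalent release store.
final class LockWordSketch {
    private volatile int state; // stand-in for the 4-byte lock word; 0 = free, 1 = held

    private static final VarHandle STATE;
    static {
        try {
            STATE = MethodHandles.lookup()
                    .findVarHandle(LockWordSketch.class, "state", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    boolean tryLock() {
        return STATE.compareAndSet(this, 0, 1);
    }

    void unlock() {
        STATE.setVolatile(this, 0); // full volatile store, per option #3
        // STATE.setRelease(this, 0) would be the lazySet/putOrderedInt
        // analogue that this thread warns about using for an unlock.
    }
}
```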


Francesco Nigro

Jun 16, 2014, 10:37:11 AM
to mechanica...@googlegroups.com
Thanks!! Changed to Unsafe.putIntVolatile :)

Now I have to "solve" (self-creating the requirements :D):

- Reentrancy (I could take inspiration from Doug Lea's ReentrantLock implementation)
- A sort of "transactional" approach while writing the value of the lock, to detect (is it possible?) if any process/thread crashes before releasing the lock


Nitsan Wakart

Jun 20, 2014, 8:38:53 AM
to mechanica...@googlegroups.com
Gil: "Unsafe.putOrderedInt will prevent stores done before the operation from flowing past the operation (as you note, it is equivalent to placing a StoreStore barrier ahead of the operation), so stores in the locked region are prevented from moving past the unlock."

This is my understanding, and that of every expert I have talked to, yet the JSR-133 Cookbook definition is:
"StoreStore Barriers
The sequence: Store1; StoreStore; Store2
ensures that Store1's data are visible to other processors (i.e., flushed to memory) before the data associated with Store2 and all subsequent store instructions. In general, StoreStore barriers are needed on processors that do not otherwise guarantee strict ordering of flushes from write buffers and/or caches to other processors or main memory."

Seems to say the reverse.


Vitaly Davidovich

Jun 20, 2014, 9:56:40 AM
to mechanica...@googlegroups.com

Hmm, they seem to say the same thing. What part(s) do you think are reversed?

Sent from my phone


Nitsan Wakart

Jun 22, 2014, 7:51:25 AM
to mechanica...@googlegroups.com
"will prevent stores <note the plural> done before the operation from flowing past the operation": to me this means that stores which happen before the barrier are visible when the barrier operation is visible (they are joined, in putOrdered, into a store that is 'glued' to the barrier). I.e. if we have the order s1, s2, s3, so4, s5, then s1, s2, s3 are visible with so4 (the ordered store), so preceding stores are blocked by the barrier. It implies s5 MAY float up past so4.

"Store1's data are visible to other processors (i.e., flushed to memory) before the data associated with Store2 and all subsequent store instructions": I read this to mean the reverse relationship: stores after the barrier happen after the barrier. I.e. given s1, so2, s3, s4, s5 we now have s1 allowed to flow past the barrier, but s3, s4, s5 not allowed to move back. My interpretation of the phrasing is that the Store1 in the definition is the store made by so2 in my example. In particular, the Cookbook definition says nothing about the stores done before Store1.

Am I failing English comprehension here?

Gil Tene

Jun 22, 2014, 10:37:02 AM
to mechanica...@googlegroups.com
You are right that the two statements (mine and the StoreStore barrier's) are not the same, as the plural on the subsequent stores does make a difference in what is guaranteed. My statement says nothing about what can or can't happen to subsequent stores, while the StoreStore barrier statement does. So the StoreStore statement guarantees more than what I say. It is stronger and inclusive, not opposite. It guarantees everything that my statement guarantees (and more).

For the purposes of the stores in question, however (the ones that are "inside" the lock and expect its protection), the two statements guarantee the same thing. Mine is simply not stronger than is needed for the argument at hand.

There are reasons I avoid the stronger statement:

The first reason is that the StoreStore barrier is easy to talk about with respect to the barrier itself (stores on either side can't cross). But it is harder to talk the same way about lazySet or putOrdered: those are equivalent to a StoreStore barrier followed by the operation's store, but you can only easily talk about prior stores not being able to float past the store operation. Stores that follow the lazySet or putOrdered *can* flow backwards past it; they just can't flow backwards past prior stores (because they can't float back past the StoreStore barrier). The StoreStore barrier in a lazySet does not have to occur "immediately before" the lazySet. It is simply guaranteed to exist between any prior stores and the lazySet's store, so other things can float into that gap... This basically means that stores outside of the lock can float into the lock. Which is OK to have happen, btw.

The second reason is that in figuring out what is required, I like to use a form of conservation of energy, or intentional laziness: try to use the weakest thing that satisfies my requirements. I find that it helps to state the actual requirements when reversing logic back and forth later. When the requirements are not stated precisely (as strong as needed but not stronger), logic reversal can have dangerous effects. E.g. the JSR-133 Cookbook is sufficient but not required, which makes it safe to follow for JVM and compiler implementors, but not safe to follow for Java programmers.

Nitsan Wakart

Jun 22, 2014, 11:53:15 AM
to mechanica...@googlegroups.com
The quoted statement says less about the subsequent stores and more about the preceding ones; the JMM Cookbook says more about the subsequent and less about the preceding. Let me illustrate with one example and apply both rules.
If B1, B2, ... are all stores before Store1 and the barrier, A1, A2, ... are all stores after Store2, and Store1, StoreStore, Store2 are as the Cookbook describes: B1, B2 ... Store1 StoreStore Store2 A1, A2 ...
"will prevent stores done before the operation from flowing past the operation" -> A1..N can float before Store1; the B stores are prohibited from floating after Store1.
"Store1's data are visible to other processors before the data associated with Store2 and all subsequent store instructions" -> B1..N can float after Store2; the A stores are prohibited from floating before Store2.

Vitaly Davidovich

Jun 22, 2014, 2:17:31 PM
to mechanica...@googlegroups.com

The JMM Cookbook wording is more restrictive (it prohibits stores after Store2 from moving before the barrier), but I always interpreted Store1 as including all prior stores as well. So the B1..N stores cannot move after the barrier, and I believe that's actually how this barrier is implemented in the JVM (HotSpot, at least). You typically don't care about stores after Store2, because the places where StoreStore is used are there to order the target store and preceding stores (e.g. the target is publishing some state directly queried by other threads, and you want to ensure that when that target store is visible, prior stores are as well; sort of piggybacking). Technically, if stores after Store2 are also visible at that point, you don't really care, because they're unrelated to the happens-before edge you're inserting. If those subsequent stores did matter, the placement of the StoreStore would've been different.

Sent from my phone
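The piggybacking pattern Vitaly describes can be sketched as follows (an illustrative single-writer example with names of my choosing; lazySet is the ordered store):

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative single-writer publication: the plain data store cannot be
// reordered past the ordered store of the sequence (the StoreStore barrier
// sits between them), so a reader that observes the new sequence via a
// volatile load also observes the data.
final class SingleWriterCell {
    private final long[] data = new long[1];          // plain stores, single writer
    private final AtomicLong seq = new AtomicLong(0); // publication sequence

    void publish(long value) {
        data[0] = value;                // plain store
        seq.lazySet(seq.get() + 1);     // ordered store; data[0] cannot float past it
    }

    // Returns null until the expected sequence number has been published.
    Long tryRead(long expectedSeq) {
        if (seq.get() < expectedSeq) {  // volatile load orders the data read below
            return null;
        }
        return data[0];
    }
}
```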


Nitsan Wakart

Jun 22, 2014, 3:04:36 PM
to mechanica...@googlegroups.com
Vitaly: "I always interpreted Store1 as including all prior stores as well. "
Indeed that is the reasonable thing to do and implement. I merely wish the wording were a bit more precise. In particular, if the definition finds it relevant to discuss subsequent stores, it seems strange not to mention the stores 'behind' Store1.
In any case, I hope the next JMM is more clearly defined :-)

Vitaly Davidovich

Jun 22, 2014, 3:07:44 PM
to mechanica...@googlegroups.com

Agreed, the wording could elaborate a bit more. Perhaps send an email to concurrency-interest so that Doug & co. can opine and put it on their todo list (maybe they're already aware of this, I don't know)?

Sent from my phone

Gil Tene

Jun 22, 2014, 3:35:19 PM
to mechanica...@googlegroups.com
The JMM says nothing about StoreStore or lazySet (and yes, it should, and hopefully will).

In the case of StoreStore (not the JMM), I think it actually does cover things clearly, because of program order. When describing the program sequence Store1, StoreStore, Store2, Store1 refers to any store before the StoreStore barrier, and Store2 refers to any store after the barrier. The StoreStore does not discuss any specific store; it's just a line in the sand in the stated program order. It obviously does not guarantee any ordering within the preceding stores, or within the subsequent stores, but it guarantees ordering between the two sets.

In the case of lazySet, and putOrdered, the documentation we have is the contract on the defined methods:

- lazySet() Javadocs say: "Eventually sets to the given value." Which is nearly useless on its own, but the java.util.concurrent.atomic package documentation gives more detail: "lazySet has the memory effects of writing (assigning) a volatile variable except that it permits reorderings with subsequent (but not previous) memory actions that do not themselves impose reordering constraints with ordinary non-volatile writes. Among other usage contexts, lazySet may apply when nulling out, for the sake of garbage collection, a reference that is never accessed again."

- putOrderedObject (if you are inclined to believe unsafe documentation) says: "Version of #putObjectVolatile(Object, long, Object) that does not guarantee immediate visibility of the store to other threads. This method is generally only useful if the underlying field is a Java volatile (or if an array cell, one that is otherwise only accessed using volatile accesses)." This is similar in meaning to what java.util.concurrent.atomic has to say about lazySet().

Neither lazySet nor putOrdered guarantees laying down a StoreStore barrier. Doing so is a valid (and likely) implementation, but unlike a StoreStore barrier, the contracts only guarantee that the specific store in question will not be reordered with previous stores. They say nothing about subsequent stores' ability to float back past this store and previous ones... So in this sense, the guarantee is weaker than a StoreStore guarantee. [Obviously in other senses, like ordering against previous loads, and ordering against subsequent volatile loads, it is stronger than a StoreStore barrier.]

Martin Thompson

Jun 23, 2014, 8:02:53 AM
to mechanica...@googlegroups.com
On 22 June 2014 20:35, Gil Tene <g...@azulsystems.com> wrote:

In the case of StoreStore (not the JMM), I think it actually does cover things clearly, because of program order. When describing the program sequence Store1, StoreStore, Store2, Store1 refers to any store before the StoreStore barrier, and Store2 refers to any store after the barrier. The StoreStore does not discuss any specific store; it's just a line in the sand in the stated program order. It obviously does not guarantee any ordering within the preceding stores, or within the subsequent stores, but it guarantees ordering between the two sets.

Gil, do you really mean "program order" here? Program order is about out-of-order execution on a single thread, from what I understand. Do you mean "synchronizes-with" to achieve the "happens-before" across threads?
 

Gil Tene

Jun 23, 2014, 11:05:16 AM
to <mechanical-sympathy@googlegroups.com>


Sent from my iPad
I do mean "program order", and I'm only concerned with the semantics stated for the thread executing the barrier. Placing a StoreStore (or something that includes it) has semantic meaning in program order that cannot be ignored or replaced with synchronizes-with statements. It creates a required semantic execution and visibility order between variables that may have no other dependencies, and that could be legitimately reordered in execution and visibility if the barrier was not stated. You can think of it as preventing OOE-like reinterpretation of the StoreStore barrier by the JIT and/or the CPU. It is necessary to interpret StoreStore (and other) barriers in program order; otherwise, when you say:

X.a = b;
StoreStore
Y.c = d;

(With nothing being volatile and no synchronized-with semantics)

It could be legitimately interpreted by the compiler as e.g.:

Y.c = d;
X.a = b;
StoreStore

It is the program order (and nothing else) that prevents this reinterpretation.


Alexandru Nedelcu

Jun 24, 2014, 7:43:05 AM
to mechanica...@googlegroups.com

On Mon, Jun 16, 2014 at 5:37 PM, Francesco Nigro <nigr...@gmail.com> wrote:

Now i have to "solve" (self-creating the requirements :D) :

- Reentrancy (i could inspire me with the Doug Lea's Reentrant Lock implementation)

Not sure if it helps, but a simple solution, if you want lock() to be reentrant, is to keep a ThreadLocal count of the number of times the lock has been acquired by the current thread, decrement it on unlock(), and release the lock only after it reaches 0. Or, if you want to avoid the ThreadLocal usage overhead (it does have some overhead), you could keep the current thread in a non-volatile variable, along with a counter also stored in a non-volatile variable, and check them before doing the CAS on lock(). You also need to reset them before the final unlock() that does the volatile store.
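The owner-plus-counter scheme just described can be sketched roughly like this (illustrative names, not the actual implementation under discussion; an AtomicReference stands in for the CAS-able lock word):

```java
import java.util.concurrent.atomic.AtomicReference;

// Rough sketch of owner-plus-counter reentrancy. Only the owning thread ever
// touches holdCount, so it can stay a plain (non-volatile) field; the CAS and
// the volatile store on owner provide the acquire/release ordering.
final class ReentrantSpinLockSketch {
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private int holdCount;

    void lock() {
        Thread me = Thread.currentThread();
        if (owner.get() == me) { holdCount++; return; } // reentrant acquire
        while (!owner.compareAndSet(null, me)) {
            Thread.onSpinWait();
        }
        holdCount = 1;
    }

    void unlock() {
        if (owner.get() != Thread.currentThread()) {
            throw new IllegalMonitorStateException();
        }
        if (--holdCount == 0) {
            owner.set(null); // volatile store releases the lock
        }
    }

    boolean isAcquiredByCurrentThread() {
        return owner.get() == Thread.currentThread();
    }
}
```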

You also may not need the lock() method to be reentrant per se. Simply exposing an isAcquiredByCurrentThread method might do the trick (which would save you from incrementing and decrementing a counter). I could get away with that in a simple implementation of a Lock I have.

In my implementation, lock() in itself is not reentrant, which is OK, because it can be used by means of a Scala [macro](https://github.com/monifu/monifu/blob/v0.13.0/monifu-core/src/main/scala/monifu/concurrent/locks/Lock.scala#L249):

lock.enter {
  // stuff here
}

The above is translated at compile time ("enter" is a macro) to approximately something like this, so it doesn't add overhead and provides the API safety of intrinsic locks:

  boolean shouldAcquire = ! lock.isAcquiredByCurrentThread
  if (shouldAcquire) lock.lock()

  try {
    // stuff here
  } 
  finally {
    if (shouldAcquire) lock.unlock()
  }

Too bad Java doesn’t have macros.
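Without macros, a rough Java approximation is a higher-order helper (hypothetical; a lambda stands in for the macro body, at the cost of a possible capture allocation):

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical Java analogue of the Scala macro above: enter() skips
// re-acquisition when the current thread already holds the lock, and the
// try/finally guarantees the matching unlock.
final class Locks {
    static void enter(ReentrantLock lock, Runnable body) {
        boolean shouldAcquire = !lock.isHeldByCurrentThread();
        if (shouldAcquire) lock.lock();
        try {
            body.run();
        } finally {
            if (shouldAcquire) lock.unlock();
        }
    }
}
```

ReentrantLock is of course already reentrant, so here the isHeldByCurrentThread check only saves a nested acquire/release; with a non-reentrant lock exposing an equivalent query, the same shape provides reentrancy at the call site.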

--
Alexandru Nedelcu
www.bionicspirit.com

PGP Public Key:
https://bionicspirit.com/key.aexpk