On 10/29/2016 10:13 AM, Peter Veentjer wrote:
> So you get something like this:
>
> public class Counter {
>
> private final AtomicLong c = new AtomicLong();
>
> public void inc(){
> long newValue = c.get()+1;
> c.lazySet(newValue);
> }
>
> public long get(){
> return c.get();
> }
> }
Does not work. With non-atomic updates, you set yourself for losing a
virtually unbounded number of updates. See e.g:
https://shipilev.net/blog/2016/close-encounters-of-jmm-kind/#pitfall-no-sync
> But in this case you still need to do a volatile read to read the actual
> value. In theory one could optimize this to:
>
> public class Counter {
>
> private final AtomicLong c = new AtomicLong();
> private long local;
>
> public void inc(){
> long newValue = local+1;
> this.local = newValue;
> c.lazySet(newValue);
> }
>
> public long get(){
> return c.get();
> }
> }
I guess get() was meant to have "return local"? If so, this does not
work. With no ordering implied for the reads, you set yourself up for
this trouble:
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Peter's wording after that snippet implied he was trying to remove a volatile read on the reader, so I think Aleksey is right that get() was intended to return the local.
The main "issue" here is readers will take cache misses and coherence traffic as their cacheline will be invalidated when the writer stores the new value.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsubscribe...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Sent from my phone
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsubscribe...@googlegroups.com.
Peter's wording after that snippet implied he was trying to remove a volatile read on the reader, so I think Aleksey is right that get() was intended to return the local.I'm not so sure -- you'll notice in the last example he provided he does avoid a volatile read in inc() ... and since inc() will be run on a single thread and all reads are on the atomic/volatile var, I think it all works out. He's basically using `local` as a "thread-local" cache of the atomic value on the writer so he doesn't have to read the atomic.
Peter, perhaps it's best if you can clarify but assuming that was your intent: whether there's a performance change on x86 is another question entirely. You're much better off benchmarking, but here's some wild speculation for you to consider [and for others to rubbish :)]:I would expect no obvious benefit on x86 between the `local` code and the `lazySet` version assuming this code is running in isolation, no context switches and/or there is nothing else invalidating cache lines. Assuming a single writer thread, the cache line should never leave the L1 cache on the writer thread after the first write. As a result, on all but the first call to inc() you should read the current value from the L1 cache. IIRC reads from L1 are roughly as fast as a read from a register, so I think the writer thread's "volatile read" should actually be quite fast.
Vitaly, whatever his intent the other details in your email were interesting all the same -- I'm not especially familiar with the newer C++ memory ordering stuff. One question:The main "issue" here is readers will take cache misses and coherence traffic as their cacheline will be invalidated when the writer stores the new value.If I'm reading what you're saying RE: getOpaque correctly, seems like the cache line invalidation performance hit would be unavoidable except on more relaxed archs [than x86], right?
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-sympathy+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
> A counter could be incremented >100k times a second and we don't want a
> counter to cause too much of a performance degradation. A counter will
> always be incremented by the same thread.
There is a subtle difference between "a single writer to Counter", and
"each thread gets its own Counter". I would venture to guess that
(false) sharing in your case should be the concern, not the cost of
volatile ops.
Having a referenced AtomicLong instead of padded out "local" field would
set you up for false sharing very nicely :)
Thanks,
-Aleksey
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Thanks,
-Aleksey
--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.