> I am writing a class to calculate average of prices. We use it to generate buy/sell signal for some financial instruments therefore latency is crucial. Prices are sent via different threads therefore class needs to be thread-safe.
>
> Am I correct in understanding that I have to use a lock to make the two operations (adding and incrementing) atomic? I have looked at AtomicLong/LongAdder/LongAccumulator but looks like they can only sum the numbers atomically.
>
> In other words, there is no way to do this in a lock-free manner?
I am not much of an expert here, but I have done exactly this. The mean
and count were packed into a single 128-bit value, which was then
updated with the x86_64 DWCAS operation (LOCK CMPXCHG16B). It was slower
than a locking version, though. I think that is because 1) 128-bit
atomics are more limited than 64-bit ones on x86_64, i.e. to load the
128-bit value you still have to do a CAS instead of a simple load,
resulting in a more expensive implementation; 2) under high
concurrency this implementation trades a contended lock for a
contended 128-bit atomic, gaining nothing in scalability. To scale,
I'd look into per-thread averages with occasional global aggregation,
but I'm not sure your use case allows this.
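
For what it's worth, here is a minimal Java sketch of the per-thread
idea, not something I have benchmarked. The class name PerThreadAverage
and its add()/average() methods are made up for illustration, and I am
assuming prices arrive as doubles. Each thread writes only to its own
cell, so there is no lock and no CAS on the hot path; the reader sums
the cells when it wants a value.

```java
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: per-thread (sum, count) cells, aggregated on read.
final class PerThreadAverage {
    // Immutable pair, so a reader always sees a matching sum and count
    // for any single cell.
    private static final class Snapshot {
        final double sum;
        final long count;
        Snapshot(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    // One cell per writer thread; only the owning thread ever writes it.
    private static final class Cell {
        volatile Snapshot snapshot = new Snapshot(0.0, 0L);
    }

    // Registry of all cells; modified only when a new thread first calls add().
    private final CopyOnWriteArrayList<Cell> cells = new CopyOnWriteArrayList<>();

    private final ThreadLocal<Cell> localCell = ThreadLocal.withInitial(() -> {
        Cell c = new Cell();
        cells.add(c);
        return c;
    });

    // Hot path: a plain read plus one volatile write, no lock and no CAS.
    // Assumption: allocating a small object per update is acceptable.
    public void add(double price) {
        Cell c = localCell.get();
        Snapshot old = c.snapshot;
        c.snapshot = new Snapshot(old.sum + price, old.count + 1);
    }

    // Occasional global aggregation. Each cell is read consistently, but
    // the cells are not read at one common instant, so under concurrent
    // writes the result is a slightly stale average, not a linearizable one.
    public double average() {
        double sum = 0.0;
        long count = 0L;
        for (Cell c : cells) {
            Snapshot s = c.snapshot;
            sum += s.sum;
            count += s.count;
        }
        return count == 0L ? Double.NaN : sum / count;
    }
}
```

The trade-offs are the ones I would check first: every update allocates
a small object, cells of finished threads stay registered (which keeps
their prices in the average), and average() gets more expensive as the
number of writer threads grows.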
--
Laurynas