stamped lock in Go?

169 views
Skip to first unread message

Robert Engels

unread,
Jan 31, 2025, 6:14:24 AM1/31/25
to golang-nuts
Hi,

Do you think it is possible to implement a stamped lock in Go https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/locks/StampedLock.html ?

It would seem that the Go race detector would always report the “optimistic read” mode as a data race?

(The docs state for Java that the values can be wildly inconsistent when the optimistic read fails).

Ideas on how to implement in Go?

Bruno Albuquerque

unread,
Jan 31, 2025, 11:32:27 AM1/31/25
to Robert Engels, golang-nuts
This seemed expected to me but I went ahead and created a Go implementation (which might not be 100% correct so take it for what it will) and I was surprised that the race detector did not really complain about anything.

https://go.dev/play/p/R1alMCc-xN9

-Bruno


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/golang-nuts/FBEB055A-2B84-4738-AE22-C22ABAC8C4A9%40ix.netcom.com.

Bruno Albuquerque

unread,
Jan 31, 2025, 11:52:30 AM1/31/25
to Robert Engels, golang-nuts
Ops. There was a bug due to left over of me testing. Here is a fixed version:

https://go.dev/play/p/UuIWVlV0UTN

Also, don't try to run in the playground as this runs forever (it could be changed but I am lazy).

-Bruno

robert engels

unread,
Jan 31, 2025, 11:53:24 AM1/31/25
to Bruno Albuquerque, golang-nuts
The code you provided doesn’t compile - wrong link?

Bruno Albuquerque

unread,
Jan 31, 2025, 12:01:16 PM1/31/25
to Robert Engels, golang-nuts
I wonder if the TryLock() call in the TryOptimisticRead() (used here as an acquire barrier/half fence) is making the race detector happy enough.

-Bruno

robert engels

unread,
Jan 31, 2025, 12:06:21 PM1/31/25
to Bruno Albuquerque, golang-nuts
Your code seems substantially different than the Java implementation. For instance, in your tryOptimisticRead(), you take a lock - the Java version has no locking in that method (on a volatile, aka atomic read) - although I am not sure why you are taking the lock.

func (sl *StampedLock) TryOptimisticRead() uint64 {
stamp := atomic.LoadUint64(&sl.stamp)
if sl.mu.TryLock() {
sl.mu.Unlock()
return stamp
}
return stamp
}

// Java - state is a “volatile” integer

public long tryOptimisticRead() {
        long s;
        return (((s = state) & WBIT) == 0L) ? (s & SBITS) : 0L;
}


Bruno Albuquerque

unread,
Jan 31, 2025, 12:14:34 PM1/31/25
to robert engels, golang-nuts
I did not look at the java code (I went for the description on the link you posted).

The idea in TryOptimisticRead() is that the TryLock() helps avoiding reordering/cache effects. Consider this:

Reader 1 calls TryOptimisticRead() and reads the stamp (let's say it's 100).

Writer acquires the write lock, increments data, and increments the stamp to 101.

Writer releases the write lock.

Reader 1 proceeds to read the shared data. Crucially, due to memory reordering or caching effects, it might still see the old value of data, even though the writer has already updated it.

Reader 1 calls Validate(100). The stamp is now 101, so Validate() returns false.

The second to last step is hypothetical and I did not test it. In fact, I just removed the TryLock() and returned the result of Adduint32 directly and the race detector still did not complain (not sure if the results are the expected ones. More testing would be needed). In other words, this still seems to work:


func (sl *StampedLock) TryOptimisticRead() uint64 {
    return atomic.LoadUint64(&sl.stamp)
}

robert engels

unread,
Jan 31, 2025, 12:20:48 PM1/31/25
to Bruno Albuquerque, golang-nuts
But the lock shouldn’t be necessary I don’t think? The atomic read provides a happens before relationship - which is why removing it didn’t change anything.

I am just surprised the race detector does not complain - given my knowledge of how the race detector works it seems it should.

Thanks for writing all of the code - interesting problem. The stamped lock is super efficient for many readers, rare writers - without resorting to copy on write or mutexes.

Bruno Albuquerque

unread,
Jan 31, 2025, 12:39:11 PM1/31/25
to robert engels, golang-nuts
I believe you are most likely right about the TryLock(0 not being needed but I am not 100% sure. I guess eventually I will investigate it.

I am also puzzled by the race detector behavior here, FWIIW. :)

And no worries, this sounded like something useful to write. It goes without saying but feel free to use it as you see fit.

-Bruno

Michael Knyszek

unread,
Jan 31, 2025, 12:55:18 PM1/31/25
to golang-nuts
This is a well-known pattern (use an atomic counter to check if anyone else modified some data in a critical section and retry or fall back if it happened), but as the extensive Java docs imply, any such abstraction is going be fairly leaky. Java has a very thoroughly defined memory model that I suspect comes into play here to make the abstraction more usable (but don't quote me on that).

You *could* do a very close read of the Go memory model and try to do something similar but you end up pushing a lot of these subtleties onto the caller. This would also require //go:norace annotations on the caller to not fail in the race detector. Neither I nor the memory model docs recommend this.

The pattern is still useful and possible to apply in Go, however I have two suggestions:
1. Take the abstraction one level higher. Using the Java doc's Point example, what I mean is to only provide Point, not StampedLock. It's not too hard to implement a bespoke version of this sort of thing when you need it, and I find that doing these sorts of tricks benefits from breaking down the abstraction internally anyway, since it gives you more flexibility when adding new features.
2. Read and write the optimistically-read variables using the sync/atomic package. In the Point example from the Java docs, the optimistic read path would load x and y atomically (i.e. two separate Loads) and any write paths would store x and y atomically (i.e. two separate Stores). That's going to be far easier to reason about and more portable, though it might not achieve maximum performance on your hardware. If you're doing mostly optimistic reads (which I suspect is really when you'd want to consider this anyway), the optimistic reads themselves should scale just as well and you won't need to apply any //go:norace annotations. This suggestion is in the same vein as the "don't be clever" part of the memory model documentation.

robert engels

unread,
Jan 31, 2025, 1:10:49 PM1/31/25
to Michael Knyszek, golang-nuts
Yes, Java has a very detailed and explicit memory model which makes things like this doable.

Although the ideas you propose are workable, the primary purpose of this is performance based - otherwise you would simply use a mutex. The problem with making every shared variable an atomic is that you pay the atomic/fence cost on every access - read and write. Whereas in the Java case, there is no flushing of the cache at all if there has been no writer activity, and even if there is, it is a single cache flush per “bulk update”.

There reason StampedLock is provided rather than modifying Point is DRY. If you need a highly concurrent shared data system (i know, don’t share data in Go…) which is often the case for HPC/HFT systems, StampedLock can be useful.

Using channels and locks is simply too slow - see github.com/robaho/go-concurrency-test (old, but I think still applicable).

I think a more viable performant replacement in Go is to use copy on write with an atomic object reference - which in most cases is going to be more performant than using a StampedLock.

-- 
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.

robert engels

unread,
Jan 31, 2025, 1:13:14 PM1/31/25
to Michael Knyszek, golang-nuts
To clarify, my op was to understand if this could be done in Go, which Bruno showed it can, and know I’d like to know why the race detector is not complaining…

I wasn’t making any attempt at creating a more specified Go memory model. It is what it is.

Michael Knyszek

unread,
Jan 31, 2025, 1:58:39 PM1/31/25
to robert engels, golang-nuts
I think I got a little overzealous with my opinions, it wasn't my intent to be particularly prescriptive. I also did not get the impression that you were suggesting the memory model should be different.

Bottom line: yes, it's possible to implement but more fragile in Go due to the less-thoroughly-specified memory model. (I think we're in agreement there.)

(Minor point, but I do think making the shared variables atomic is good enough for some cases (like perhaps the overly-simplified Point example). If you've got *only* reads, even atomic loads should still keep everything in a shared state across all caches, so no flushing. Especially true on amd64 with TSO where atomic loads compile down to exactly the same instruction, just with less reordering allowed at the compiler level. Writes are of course more expensive for both the reader and writer. Though yeah, if the shared state is big enough, I'd probably take a different approach altogether, same as you suggest.)

RE: the race detector, I can get https://go.dev/play/p/UuIWVlV0UTN to trip the race detector if:
1. I remove the Println calls.
2. I remove the Sleep call.

(The copy of `data` that's read by each reader goroutine is written to some independent global variable by each reader goroutine to satisfy the compiler.)

I suspect what's happening is that the locking inside Println combined with the relatively low frequency of writes is making it so that the race detector does not actually ever observe a racy read/write. It's also a single int that's being loaded and stored, so the race window is quite small.

Michael Knyszek

unread,
Jan 31, 2025, 2:05:44 PM1/31/25
to golang-nuts
On Friday, January 31, 2025 at 1:58:39 PM UTC-5 Michael Knyszek wrote:
I think I got a little overzealous with my opinions, it wasn't my intent to be particularly prescriptive. I also did not get the impression that you were suggesting the memory model should be different.

Bottom line: yes, it's possible to implement but more fragile in Go due to the less-thoroughly-specified memory model. (I think we're in agreement there.)
(By more fragile I mean more fragile for the caller to use in the way it's used in Java. The StampedLock implementation itself isn't fragile.)

Robert Engels

unread,
Jan 31, 2025, 2:07:04 PM1/31/25
to Michael Knyszek, golang-nuts
That makes sense - I forgot about the locking in output. I changed it to loop repeatedly and still couldn’t get the detector to trip. Thanks for the help. 

On Jan 31, 2025, at 12:58 PM, Michael Knyszek <mkny...@google.com> wrote:


Reply all
Reply to author
Forward
0 new messages