Upgradable RLock


Diego Augusto Molina

Jan 29, 2023, 9:34:42 PM
to golang-nuts
From time to time I write a scraper or some other tool that authenticates to a service and then uses the auth result to do work concurrently. But when the auth expires, I need to synchronize all my goroutines, have a single one do the re-auth process, check the status, etc., and then arrange for all goroutines to go back to work using the new auth result.

To generalize the problem: multiple goroutines read a cached value that expires at some point. When it does, they should all block while a single goroutine performs the I/O operation needed to renew the cached value; then they unblock and use the new value.

I have solved this in the past in a number of ways: having a single goroutine own the cache and hand out the value through a channel, or using sync.Cond (which, btw, every time I decide to use it I need to carefully re-read its docs and do lots of tests, because I never get it right at first). But what I have come to do lately is implement an upgradable lock and have every goroutine do:

<code>
func (x implem) getProtectedValue() (someType, error) {
    // Acquire a read lock that can be upgraded.
    lock := x.upgradableLock.UpgradableRLock()
    // The Unlock method of the returned lock does the right thing
    // even if we later upgrade the lock.
    defer lock.Unlock()

    // Here we have read access to x.protectedValue.

    // If the cached value is no longer valid, try to upgrade the
    // lock and update it.
    if !isValid(x.protectedValue) && lock.TryUpgrade() {
        // Within this if, we know we *do* have write access
        // to x.protectedValue.
        x.protectedValue, x.protectedValueError = newProtectedValue()
    }

    // Here we can only say that everyone has read access to
    // x.protectedValue (the goroutine that won the upgrade could
    // still write here, but as this code is shared, we would have to
    // check the result of the earlier lock.TryUpgrade() again).

    return x.protectedValue, x.protectedValueError
}
</code>

The implementation is quite simple:
<code>
// Upgradable implements all methods of sync.RWMutex, plus a new
// method to acquire a read lock that can optionally be upgraded
// to a write lock.
type Upgradable struct {
    readers sync.RWMutex
    writers sync.RWMutex
}

func (m *Upgradable) RLock() { m.readers.RLock() }
func (m *Upgradable) TryRLock() bool { return m.readers.TryRLock() }
func (m *Upgradable) RUnlock() { m.readers.RUnlock() }
func (m *Upgradable) RLocker() sync.Locker { return m.readers.RLocker() }

func (m *Upgradable) Lock() {
    m.writers.Lock()
    m.readers.Lock()
}

func (m *Upgradable) TryLock() bool {
    if m.writers.TryLock() {
        if m.readers.TryLock() {
            return true
        }
        m.writers.Unlock()
    }
    return false
}

func (m *Upgradable) Unlock() {
    m.readers.Unlock()
    m.writers.Unlock()
}

// UpgradableRLock returns a read lock that can optionally be
// upgraded to a write lock.
func (m *Upgradable) UpgradableRLock() *UpgradableRLock {
    m.readers.RLock()
    return &UpgradableRLock{
        m:          m,
        unlockFunc: m.RUnlock,
    }
}

// UpgradableRLock is a read lock that can be upgraded to a write
// lock. It is acquired by calling (*Upgradable).UpgradableRLock.
type UpgradableRLock struct {
    mu         sync.Mutex
    m          *Upgradable
    unlockFunc func()
}

// TryUpgrade attempts to upgrade the acquired read lock to a write
// lock and reports whether it succeeded. If it did not succeed, it
// blocks until the goroutine that did succeed calls Unlock. After
// unblocking, the read lock is still valid until Unlock is called.
//
// TryUpgrade panics if called more than once or if called after
// Unlock.
func (u *UpgradableRLock) TryUpgrade() (ok bool) {
    u.mu.Lock()
    defer u.mu.Unlock()

    if u.m == nil {
        panic("TryUpgrade can only be called once and not after Unlock")
    }

    if ok = u.m.writers.TryLock(); ok {
        u.m.readers.RUnlock()
        u.m.readers.Lock()
        u.unlockFunc = u.m.Unlock

    } else {
        u.m.readers.RUnlock()
        u.m.writers.RLock()
        u.unlockFunc = u.m.writers.RUnlock
    }

    u.m = nil

    return
}

// Unlock releases the lock, whether it is still a read lock or was
// upgraded to a write lock by TryUpgrade.
//
// Unlock panics if called more than once.
func (u *UpgradableRLock) Unlock() {
    u.mu.Lock()
    defer u.mu.Unlock()

    if u.unlockFunc == nil {
        panic("Unlock can only be called once")
    }

    u.unlockFunc()
    u.unlockFunc = nil
    u.m = nil
}
</code>

I obviously try to use it only for protecting values that require potentially long I/O operations to refresh (in my case, re-authenticating to a web service) and that have a lot of interested goroutines. The good thing is that most of the time this pattern only requires one RLock, but (*UpgradableRLock).Unlock needs an additional lock/unlock to guard against misuse of the upgradable lock (I could potentially get rid of it, but preferred to err on the side of caution). Another thing I don't like is that I need to allocate a new UpgradableRLock for each acquisition. I'm thinking of re-defining UpgradableRLock as a defined type of Upgradable and converting the receiver pointer to *Upgradable whenever needed (also getting rid of the lock that prevents misuse).

I wanted to ask the community what you think of this pattern, its pros and cons, and whether it is overkill and could be better solved with something obvious I'm missing (even though I did my best searching).

The reason I decided to use this pattern is that all the other options seem to make my code more complex and less readable. I generally implement a single method that does all the value-getting, and it ends up being quite readable and easy to follow:
  1. Get an upgradable lock
  2. If the value is stale and I get to upgrade the lock, then update the value and store the error as well. The "update" part is generally implemented as a separate method or func called "updateThingLocked".
  3. Return the value and the error.

Kind regards.

burak serdar

Jan 29, 2023, 9:56:59 PM
to Diego Augusto Molina, golang-nuts


So how about using sync.Once:

type CacheEntry struct {
  Auth AuthInfo 
  once sync.Once
}

Cache would wrap a map[CacheKey]*CacheEntry guarded by a sync.RWMutex. When the AuthInfo expires, you simply replace the entry with a new one:

func (c *Cache) Expire(key CacheKey) {
  c.Lock()
  defer c.Unlock()
  c.cache[key]=&CacheEntry{}
}

func (c *Cache) Get(key CacheKey) AuthInfo {
  c.RLock()
  entry := c.cache[key]
  c.RUnlock()
  if entry == nil {
    c.Lock()
    entry = c.cache[key]
    if entry == nil {
      entry = &CacheEntry{}
      c.cache[key] = entry
    }
    c.Unlock()
  }
  entry.once.Do(func() {
    // Authenticate and fill in entry.Auth
  })
  return entry.Auth
}
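
For illustration, a caller might combine Expire and Get roughly like this (callService and errAuthExpired are made-up names standing in for the real service call and its expiry error):

<code>
// Hypothetical worker: on an auth-expired error, expire the cache
// entry and retry once with a freshly authenticated entry.
func doWork(c *Cache, key CacheKey) error {
  err := callService(c.Get(key))
  if errors.Is(err, errAuthExpired) {
    c.Expire(key)
    err = callService(c.Get(key))
  }
  return err
}
</code>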

 

Ian Lance Taylor

Jan 30, 2023, 1:49:40 PM
to Diego Augusto Molina, golang-nuts


We have historically rejected this kind of adjustable lock. There is
some previous discussion at https://go.dev/issue/4026,
https://go.dev/issue/23513, https://go.dev/issue/38891,
https://go.dev/issue/44049.

For a cache where checking that the cached value is valid (not
stale) and fetching the cached value are quick, you will in general
be better off using a plain Mutex rather than an RWMutex. RWMutex
is more complicated and therefore slower. An RWMutex is only useful
when the read case is both contended and relatively slow. If the
read case is fast then the simpler Mutex will tend to be faster,
and then you don't have to worry about upgrading the lock.

Ian
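
For illustration, a minimal sketch of the plain-Mutex version of the original getProtectedValue (reusing the someType/isValid/newProtectedValue placeholders from the first post); the lock upgrade disappears because the single mutex already serializes the refresh:

<code>
// Sketch only: the same cache guarded by a plain sync.Mutex, so no
// lock upgrade is needed.
type plainCache struct {
    mu    sync.Mutex
    value someType
    err   error
}

func (c *plainCache) get() (someType, error) {
    c.mu.Lock()
    defer c.mu.Unlock()
    if !isValid(c.value) {
        // Only the goroutine holding the mutex refreshes; the others
        // block on Lock above and then see the renewed value.
        c.value, c.err = newProtectedValue()
    }
    return c.value, c.err
}
</code>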

Robert Engels

Jan 30, 2023, 2:27:13 PM
to Ian Lance Taylor, Diego Augusto Molina, golang-nuts
I don’t think that is true. An RW lock is always better when reader activity is far greater than writer activity - simply because in a good implementation the read lock can be acquired without blocking/scheduling activity.


Ian Lance Taylor

Jan 30, 2023, 3:16:14 PM
to Robert Engels, Diego Augusto Molina, golang-nuts

The best read lock implementation is not going to be better than the
best plain mutex implementation. And with current technology any
implementation is going to require atomic memory operations which
require coordinating cache lines between CPUs. If your reader
activity is so large that you get significant contention on a plain
mutex (recalling that we are assuming the case where the operations
under the read lock are quick) then you are also going to get
significant contention on a read lock. The effect is that the read
lock isn't going to be faster anyhow in practice, and your program
should probably be using a different approach.

Ian

Robert Engels

Jan 30, 2023, 4:00:47 PM
to Ian Lance Taylor, Diego Augusto Molina, golang-nuts
Pure readers do not need any mutex on the fast path. It is an atomic CAS - which is faster than a mutex as it allows concurrent readers. On the slow path - fairness with a waiting or active writer - it degenerates in performance to a simple mutex.

The issue with a mutex is that you need to acquire it whether reading or writing - and this is slow… (at least compared to an atomic CAS).


Robert Engels

Jan 30, 2023, 4:06:19 PM
to Ian Lance Taylor, Diego Augusto Molina, golang-nuts
A quick search found this https://yizhang82.dev/lock-free-rw-lock which describes the algo pretty well. 


Ian Lance Taylor

Jan 30, 2023, 7:23:08 PM
to Robert Engels, Diego Augusto Molina, golang-nuts

The fast path of a mutex is also an atomic CAS.

Robert Engels

Jan 30, 2023, 7:42:31 PM
to Ian Lance Taylor, Diego Augusto Molina, golang-nuts
Yes but only for a single reader - any concurrent reader is going to park/deschedule.

There’s a reason RW locks exist - and I think it is pretty common - but agree to disagree :)


Ian Lance Taylor

Jan 30, 2023, 9:30:10 PM
to Robert Engels, Diego Augusto Molina, golang-nuts
On Mon, Jan 30, 2023 at 4:42 PM Robert Engels <ren...@ix.netcom.com> wrote:
>
> Yes but only for a single reader - any concurrent reader is going to park/deschedule.

If we are talking specifically about Go, then it's more complex than
that. In particular, the code will spin briefly trying to acquire the
mutex, before queuing.

> There’s a reason RW locks exist - and I think it is pretty common - but agree to disagree :)

Sure: read-write locks are fine and appropriate when the program holds
the read lock for a reasonably lengthy time. As I said, my analysis
only applies when code holds the read lock briefly, as is often the
case for a cache.

robert engels

Feb 4, 2023, 11:49:39 AM
to Ian Lance Taylor, Diego Augusto Molina, golang-nuts
I took some time to put this to a test. The Go program here https://go.dev/play/p/378Zn_ZQNaz uses a VERY short holding of the lock - but a large % of runtime holding the lock.

(You can’t run it on the Playground because of how long it takes.) You can comment/uncomment lines 28-31 to test the different mutexes.

It simulates a common system scenario (most web services) - lots of readers of the cache, but the cache is updated infrequently.

On my machine the RWMutex is > 50% faster - taking 22 seconds vs 47 seconds using a simple Mutex.

It is easy to understand why - you get no parallelization of the readers when using a simple Mutex. 

Kurtis Rader

Feb 4, 2023, 2:00:25 PM
to robert engels, golang-nuts
FWIW, using an RCU (https://en.wikipedia.org/wiki/Read-copy-update) algorithm is often the best choice for caches where reads happen several orders of magnitude more often than writes. RCU usually avoids the need for any mutex by readers. I was working at Sequent Computer Systems when this was implemented for the network ARP and routing tables, and was friends with the engineers who designed and patented RCU. I remember reading their patent and looking at the kernel implementation and thinking to myself how brilliant the concept was.



--
Kurtis Rader
Caretaker of the exceptional canines Junior and Hank

Robert Engels

Feb 4, 2023, 2:26:00 PM
to Kurtis Rader, golang-nuts
Agreed. Copy-on-write is also very good if the shared structure is small. 
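
For illustration, a rough copy-on-write sketch along those lines (the names and the map type are made up for the example; assumes Go 1.19+ for atomic.Pointer). Readers only do an atomic load; writers rebuild the structure and swap the pointer under a mutex:

<code>
var (
    cowMap  atomic.Pointer[map[string]string]
    writeMu sync.Mutex
)

func init() {
    empty := map[string]string{}
    cowMap.Store(&empty)
}

// lookup never takes a lock: a single atomic load of the current map.
func lookup(k string) (string, bool) {
    v, ok := (*cowMap.Load())[k]
    return v, ok
}

// set copies the whole map, so this only pays off when the structure
// is small or writes are rare.
func set(k, v string) {
    writeMu.Lock()
    defer writeMu.Unlock()
    old := *cowMap.Load()
    next := make(map[string]string, len(old)+1)
    for kk, vv := range old {
        next[kk] = vv
    }
    next[k] = v
    cowMap.Store(&next)
}
</code>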




Ian Lance Taylor

Feb 4, 2023, 5:59:42 PM
to robert engels, Diego Augusto Molina, golang-nuts

Thanks for the benchmark. You're right: if you have hundreds of
goroutines doing nothing but acquiring a read lock, then an RWMutex
can be faster. The key there is that there are always multiple
goroutines waiting for the lock.

I still stand by my statement for more common use cases.

Robert Engels

Feb 4, 2023, 6:12:17 PM
to Ian Lance Taylor, Diego Augusto Molina, golang-nuts
I think with server processes - with possibly 100k+ connections - the contention on a “read mainly” cache is more than you think. This test only uses 500 readers with little work to simulate the 100k case.


Ian Lance Taylor

Feb 4, 2023, 6:19:26 PM
to Robert Engels, Diego Augusto Molina, golang-nuts

Not to get too far into the weeds, but if I were expecting that kind
of load I would use an atomic.Pointer anyhow, rather than any sort of
mutex.

Ian

Robert Engels

Feb 4, 2023, 6:24:56 PM
to Ian Lance Taylor, Diego Augusto Molina, golang-nuts
That only works if what it is pointing to is cheap to copy. If it is a large multi-layer structure a RW lock is usually more efficient.



Ian Lance Taylor

Feb 4, 2023, 7:14:57 PM
to Robert Engels, Diego Augusto Molina, golang-nuts

No significant copying is required; you just get a pointer to the
value. Then you have some way to determine whether it is up to date.
If not, you create a new value and store a pointer to it back in the
atomic.Pointer.
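
For illustration, a minimal sketch of that approach using Go 1.19's atomic.Pointer (authInfo, stillValid and reauthenticate are made-up names). Note that without extra coordination, two goroutines that both see a stale value may refresh concurrently:

<code>
var current atomic.Pointer[authInfo]

func getAuth() (*authInfo, error) {
    if v := current.Load(); v != nil && stillValid(v) {
        return v, nil // fast path: one atomic load, no lock
    }
    v, err := reauthenticate() // may be duplicated under contention
    if err != nil {
        return nil, err
    }
    current.Store(v) // last writer wins
    return v, nil
}
</code>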

robert engels

Feb 4, 2023, 7:48:45 PM
to Ian Lance Taylor, Diego Augusto Molina, golang-nuts
That is similar to how sync.Map works, but it involves much more complex code.

More importantly, though, if multiple entries need to be kept in sync, that technique doesn't work - at least not directly/easily. This is a common need with associated caches.

Even copy-on-write isn't always suitable. Assume you have a map (cache) that is 1GB in size. It is mainly read, but you need to update an entry every once in a while.

With copy-on-write, the "create a new value" step needs to build a new map and copy over the existing one - very expensive. Then you atomically replace the reference.

With multiple writers this can be even more expensive, since you need a secondary lock to keep each writer from making an expensive copy and then failing the CAS (that lock being no more expensive than the write lock in an RWMutex).


Bakul Shah

Feb 5, 2023, 1:42:29 AM
to Robert Engels, Ian Lance Taylor, Diego Augusto Molina, golang-nuts
You can implement your own multiple-reader-single-writer-lock using what Go gives you.
For example: https://eli.thegreenplace.net/2019/implementing-reader-writer-locks/
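
For context in the exchange below, the article's first, reader-counting variant looks roughly like this (reconstructed from memory, so it may differ in details from the article; see the link for the real code):

<code>
type ReaderCountRWLock struct {
    m           sync.Mutex
    readerCount int
}

func (l *ReaderCountRWLock) RLock() {
    l.m.Lock()
    l.readerCount++
    l.m.Unlock()
}

func (l *ReaderCountRWLock) RUnlock() {
    l.m.Lock()
    l.readerCount--
    l.m.Unlock()
}

// Lock spins until there are no readers, then keeps holding m for
// the duration of the write, which blocks new readers.
func (l *ReaderCountRWLock) Lock() {
    for {
        l.m.Lock()
        if l.readerCount == 0 {
            return
        }
        l.m.Unlock()
    }
}

func (l *ReaderCountRWLock) Unlock() {
    l.m.Unlock()
}
</code>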

Juliusz Chroboczek

Feb 5, 2023, 7:34:36 AM
to golan...@googlegroups.com
>> I took some time to put this to a test. The Go program here
>> https://go.dev/play/p/378Zn_ZQNaz uses a VERY short holding of the
>> lock - but a large % of runtime holding the lock.

> Thanks for the benchmark. You're right: if you have hundreds of
> goroutines doing nothing but acquiring a read lock, then an RWMutex
> can be faster. The key there is that there are always multiple
> goroutines waiting for the lock.

Could you please explain that? You previously implied that the
required atomic operation is going to make the difference between the
two kinds of mutex irrelevant; why does the argument not apply here?

Thanks,

-- Juliusz

ren...@ix.netcom.com

Feb 5, 2023, 11:09:15 AM
to golang-nuts
The article is very suspect. In the first section "A simple implementation" the code is badly broken. You can't get a reader lock if the writer has the write lock - which the code doesn't test:

func (l *ReaderCountRWLock) RLock() {
    l.m.Lock()
    l.readerCount++
    l.m.Unlock()
}

A single-writer RWMutex implementation can be more efficient than the general-case RWMutex, since you don't need to maintain a list of waiting writers. Usually, though, you would use a specialized data structure designed for a single writer and multiple readers, which can be lock-free for both reading and writing.

robert engels

Feb 5, 2023, 11:12:22 AM
to golang-nuts
Also, to remind the readers - the OP's question is about an “upgradable RW lock”, where a reader thread holding the read lock can request an upgrade to become a writer. This is a more complex scenario.

robert engels

Feb 5, 2023, 11:31:19 AM
to golang-nuts
I take it back. Apologies. The writer will hold the lock on the mutex for the duration - so the code is fine.

Ian Lance Taylor

Feb 5, 2023, 11:45:38 AM
to Juliusz Chroboczek, golang-nuts
If there are enough concurrent lockers to overwhelm the plain mutex spin lock, then the read-write mutex will work better. My argument is that in real programs that is an unlikely case if the lock is held for a short period of time.

Ian

drc...@google.com

Feb 13, 2023, 2:45:11 PM
to golang-nuts
Could you use an applicative data structure? (e.g., a balanced binary tree where you allocate a new spine for each insertion/deletion)
That has log N overhead to read, log N storage allocated per write, but I think if you CAS the writes, the reads can proceed with a lightweight barrier.
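
For illustration, a rough sketch of that idea (made-up types, balancing omitted; assumes Go 1.19+ for atomic.Pointer): each insert copies only the spine and then tries to CAS the new root into place, so readers just do an atomic load of the root:

<code>
// Persistent (applicative) tree sketch: nodes are never mutated once
// published; writers allocate a new spine and CAS the root pointer.
type node struct {
    key, val    string
    left, right *node
}

var root atomic.Pointer[node]

func insert(key, val string) {
    for {
        old := root.Load()
        if root.CompareAndSwap(old, withKey(old, key, val)) {
            return
        }
        // Another writer won the race; rebuild against the new root.
    }
}

// withKey returns a copy of the tree with key set to val, sharing all
// untouched subtrees with the original.
func withKey(n *node, key, val string) *node {
    if n == nil {
        return &node{key: key, val: val}
    }
    c := *n
    switch {
    case key < n.key:
        c.left = withKey(n.left, key, val)
    case key > n.key:
        c.right = withKey(n.right, key, val)
    default:
        c.val = val
    }
    return &c
}
</code>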

Robert Engels

Feb 13, 2023, 3:03:50 PM
to drc...@google.com, golang-nuts
ConcurrentHashMap works that way in Java. Multiple write locks on portions of the table and CAS/atomic reading. 
