Write lock-free w/ proxy collectors

Joe Seigh

unread,

Nov 14, 2006, 11:31:49 AM11/14/06

to

It turns out the scheme is fairly simple if you have a proxy collector.
If you remember the scheme from back here
http://groups.google.com/group/comp.programming.threads/msg/d78d43a1d0181f2c

Relabeling that diagram gets you

tail-> x <- x <- x ...

which is tail end of a fifo queue. You can use Michael and
Scott's lock-free fifo queue logic to enqueue onto the tail.

Using the proxy collector for read lock-free access has already
been described. Using it for write lock-free involves the
following steps.

1) acquire a reference to the current collector object (read lock)
2) perform write lock-free modification of data structure. Proxy
read lock prevents nodes from being deallocated out from under
you and any ABA problems.
3) release proxy reference (unlock)

if you have one or more removed nodes

4) reacquire a reference to the current proxy object
5) attempt lock-free enqueueing of the new proxy object
(having a proxy lock prevents ABA problems here)
6) release proxy reference.
7) repeat steps 4-7 if enqueue failed

The order of lock-free modifications to the data structure
and the enqueueing of proxy objects does not have to be the
same. The reacquiring of the proxy, which you need to do
to get the current proxy object, protects any prior proxy
protected read accesses to the data structure.

This should work with any proxy collector scheme, e.g. refcounted,
hazard pointers, etc...

I don't know if I'll get around to modifiying rcpc since there's
three different algorithms ifdef'd together and it'd be a little
tricky. I might to a hazared pointer version as an example maybe.

Also there's the fact that I thought up another lock-free scheme
unrelated to this so I have more schemes than resources to
implement them all right now.

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

Chris Thomasson

unread,

Nov 14, 2006, 7:52:10 PM11/14/06

to

"Joe Seigh" <jsei...@xemaps.com> wrote in message
news:JMmdnaYcYId6c8TY...@comcast.com...

> It turns out the scheme is fairly simple if you have a proxy collector.
> If you remember the scheme from back here
> http://groups.google.com/group/comp.programming.threads/msg/d78d43a1d0181f2c
>
> Relabeling that diagram gets you
>
> tail-> x <- x <- x ...
>
> which is tail end of a fifo queue. You can use Michael and
> Scott's lock-free fifo queue logic to enqueue onto the tail.
>
> Using the proxy collector for read lock-free access has already
> been described. Using it for write lock-free involves the
> following steps.
>
> 1) acquire a reference to the current collector object (read lock)
> 2) perform write lock-free modification of data structure. Proxy
> read lock prevents nodes from being deallocated out from under
> you and any ABA problems.
> 3) release proxy reference (unlock)
>
> if you have one or more removed nodes
>
> 4) reacquire a reference to the current proxy object
> 5) attempt lock-free enqueueing of the new proxy object
> (having a proxy lock prevents ABA problems here)
> 6) release proxy reference.
> 7) repeat steps 4-7 if enqueue failed
>

The way I do it sounds a little different than yours?

Read-Only Iteration: lock-free lifo
---------
ref = acquire proxy-ref
traverse lock-free stack ...
release proxy-ref(ref)

Writer Produce :
lock-free lifo push
---------
normal writer lock-free stack push ...

Writer Consume :
lock-free lifo pop
---------
ref = acquire proxy-ref
do writer lock-free stack ...
which implys reads...
to pop an objX...
ref->defer_next_epoch(objX)...
release proxy-ref(ref)...

I have you read your posts in more detail...

Joe Seigh

unread,

Nov 14, 2006, 11:21:12 PM11/14/06

to

Chris Thomasson wrote:
>>following steps.

>>
>
>
>
> The way I do it sounds a little different than yours?
>
>
> Read-Only Iteration: lock-free lifo
> ---------
> ref = acquire proxy-ref
> traverse lock-free stack ...
> release proxy-ref(ref)
>
>
>
>
> Writer Produce :
> lock-free lifo push
> ---------
> normal writer lock-free stack push ...
>
>
>
> Writer Consume :
> lock-free lifo pop
> ---------
> ref = acquire proxy-ref
> do writer lock-free stack ...
> which implys reads...
> to pop an objX...
> ref->defer_next_epoch(objX)...
> release proxy-ref(ref)...
>
>

That's better. Just hold onto the same proxy lock
through out

1) acquire a reference to the current collector object (read lock)
2) perform write lock-free modification of data structure. Proxy
read lock prevents nodes from being deallocated out from under
you and any ABA problems.

if you have one or more removed nodes

3) lock-free enqueue of the new proxy object

(having a proxy lock prevents ABA problems here)

finally

4) release proxy reference (unlock)

The fifo ordering is only required for the one form of RCU+SMR
and for some reason I thought the proxy collector needed it also.
Which means that kung fu monkey through the trees swap trick
I had at one point actually will work also. I hate when that
happens. :)

So with that trick.

1) acquire a reference to the current collector object (read lock)
2) perform write lock-free modification of data structure. Proxy
read lock prevents nodes from being deallocated out from under
you and any ABA problems.
3) release proxy reference (unlock)

if you have one or more removed nodes

4) lock-free enqueue of the new proxy object
(no ABA problem with swap)

finally

Joe Seigh

unread,

Nov 14, 2006, 11:34:01 PM11/14/06

to

Oh and step 5) defer free of the new proxy object if you're
not using reference counting or something like that. If
you're using reference counting, decrement the reference
count by 1 (initial value for the new proxy node was 3,
1 for the proxy anchor, 1 for the previous proxy node, and
one held by the current thread).

Joe Seigh

unread,

Nov 15, 2006, 12:13:25 AM11/15/06

to

Joe Seigh wrote:
>
> Oh and step 5) defer free of the new proxy object if you're
> not using reference counting or something like that. If
> you're using reference counting, decrement the reference
> count by 1 (initial value for the new proxy node was 3,
> 1 for the proxy anchor, 1 for the previous proxy node, and
> one held by the current thread).
>
>

Ack. I'm trying to remember from appc (atomic_ptr proxy collector)
except I didn't have to explicitly manage refcounts in that.
I'll illustrate what the queue looks like with refcount values
for each node

queue before pushing new collector node

C -> 2 <- 1 <- 1 <- ... <- ?

queue after swap in of new node w/ its refcount initialized to 2

C -> 2 2 <- 1 <- 1 <- ... <- ?

after linking old collector node to point to new one

C -> 2 <- 2 <- 1 <- 1 <- ... <- ?

after decrementing refcount of old collector node

C -> 2 <- 1 <- 1 <- 1 <- ... <- ?

Chris Thomasson

unread,

Nov 15, 2006, 2:42:23 AM11/15/06

to

"Joe Seigh" <jsei...@xemaps.com> wrote in message

news:pLydnQ43JYS0CMfY...@comcast.com...

> Chris Thomasson wrote:
>>>following steps.
>> The way I do it sounds a little different than yours?

[...]

> That's better. Just hold onto the same proxy lock
> through out

[...]

> The fifo ordering is only required for the one form of RCU+SMR
> and for some reason I thought the proxy collector needed it also.
> Which means that kung fu monkey through the trees swap trick
> I had at one point actually will work also.

I can't decide if I would use a method that just does a lock-free push onto
the proxy object's stack of deferred objects in the same critical-region as
the write, or swap a completely new collector in at a later time... Humm...

> I hate when that happens. :)

;)

Something like this comes to mind:

Chris Thomasson

unread,

Nov 15, 2006, 2:44:57 AM11/15/06

to

>
>> I hate when that happens. :)
>
> ;)
>
>
> Something like this comes to mind:

Can't find the link. darn... Anyway I would just do stuff inside the
critical-region, unless I was creating a more generic lock-free container
class that could return pointers to its nodes. Humm...

Joe Seigh

unread,

Nov 15, 2006, 8:50:12 AM11/15/06

to

Chris Thomasson wrote:
> "Joe Seigh" <jsei...@xemaps.com> wrote in message
> news:pLydnQ43JYS0CMfY...@comcast.com...

>>The fifo ordering is only required for the one form of RCU+SMR
>>and for some reason I thought the proxy collector needed it also.
>>Which means that kung fu monkey through the trees swap trick
>>I had at one point actually will work also.
>
>
> I can't decide if I would use a method that just does a lock-free push onto
> the proxy object's stack of deferred objects in the same critical-region as
> the write, or swap a completely new collector in at a later time... Humm...
>

The swap has simpler logic, just one atomic op. It doesn't need the
lock-free forward progress guarantees that the Michael and Scott
fifo queue needs since the dequeuing of proxy elements can't
happen until the proxy lock is released by the writer thread anyway.

But using a conventional lock is simpler especially when you consider
how few write lock-free data structures with some form of traversal
required there are.

Joe Seigh

unread,

Nov 15, 2006, 10:59:02 AM11/15/06

to

Joe Seigh wrote:
> But using a conventional lock is simpler especially when you consider
> how few write lock-free data structures with some form of traversal
> required there are.
>

I'm thinking of read only traversal. Actually, all of the lock-free
algorthms require some form of memory management. That includes
lock-free fifo queue and lock-free lifo queue. Even if you use
versioning with the latter you need to make sure the memory never
gets unmapped.

Joe Seigh

unread,

Nov 15, 2006, 12:31:38 PM11/15/06

to

You refcount the nodes you return pointers to. The refcounting
would be protected by the proxy lock so you can use a simpler
form similar to rcuref in the linux kernel.

Steve Watt

unread,

Nov 16, 2006, 3:02:17 PM11/16/06

to

In article <ZbKdnZ7fNNldXMfY...@comcast.com>,
Chris Thomasson <cri...@comcast.net> wrote:
[ ... ]

>Something like this comes to mind:
>
>
>

I.e. nothing comes to mind? :)

(Yeah, I know, you probably hit send too fast, but it's such an easy
straght line. Guess I've been reading too much alt.humor lately.)
--
Steve Watt KD6GGD PP-ASEL-IA ICBM: 121W 56' 57.5" / 37N 20' 15.3"
Internet: steve @ Watt.COM Whois: SW32-ARIN
Free time? There's no such thing. It just comes in varying prices...

Chris Thomasson

unread,

Nov 17, 2006, 11:20:59 PM11/17/06

to

"Joe Seigh" <jsei...@xemaps.com> wrote in message

news:_9SdnZApC7gZh8bY...@comcast.com...

> Chris Thomasson wrote:
>> "Joe Seigh" <jsei...@xemaps.com> wrote in message
>> news:pLydnQ43JYS0CMfY...@comcast.com...
>>>The fifo ordering is only required for the one form of RCU+SMR
>>>and for some reason I thought the proxy collector needed it also.
>>>Which means that kung fu monkey through the trees swap trick
>>>I had at one point actually will work also.
>>
>>
>> I can't decide if I would use a method that just does a lock-free push
>> onto the proxy object's stack of deferred objects in the same
>> critical-region as the write, or swap a completely new collector in at a
>> later time... Humm...
>>
> The swap has simpler logic, just one atomic op. It doesn't need the
> lock-free forward progress guarantees that the Michael and Scott
> fifo queue needs

[...]

Yes. I don't need Michael and Scott FIFO queuing either... Just a SWAP for
the consume, and a CAS for the produce... I use fetch-and-add (e.g.m, XADD
on IA-32) for the reference count adjustments... It all lock-free*... This
is directly referring to a proxy collector objects private stack of deferred
objects...

-- For some more detailed information, and a complete working prototype, I
would advise anybody who is interested to take a look at the following
posts, and do follow all of the links:

http://groups.google.com/group/comp.arch/browse_frm/thread/b2f751d5dad6c07b

-- If anybody is interested and studies the working protoype code, you can
make safe use of this and actually implement some of the stuff that Joe and
I have been discussing... This particular PDR code I invented does not rely
on FIFO, or any ordering for that matter, on the actual deferred objects...
The collectors themselves are processed in FIFO, but the order of their
individual private stacks of deferred objects can be processed in an
undefined order...

What do ya' think of this?

* -- lock-free but not Virtually Zero-Overhead... vZOOM, and Joes work can
address this issue... Joes work is tied to RCU+SMR+Joes+Nice-Work... vZOOM
is my personal improvements on SMR and RCU, plus a neat virtually
zero-overhead reference counting technique that is compatible with Joes
work, my work, RCU, SMR, Pass-The-Buck, Single-Target rip-off of Joes
atomic_ptr, my mostly lock-free atomic ptr, ect...

Any thoughts?

;^)