Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

A race-condition in a SUN paper by Detlefs on reference counting...

6 views
Skip to first unread message

Chris Thomasson

unread,
Apr 4, 2008, 9:55:13 AM4/4/08
to
Here is the paper in question:

http://citeseer.ist.psu.edu/cache/papers/cs/25408/http:zSzzSzwww.sun.comzSzresearchzSzpeoplezSzdetlefszSzpaperszSz01-podc.pdf/detlefs01lockfree.pdf

on page 8 they present pseudo-code. Here is the function:
___________________________________________________________
void LFRCLoad(SNode **A, SNode **dest) {
1 SNode *a, *olddest = *dest;
2 long r;
3 while (true) {
4 a = *A;
5 if (a == Null) {
6 *dest = Null;
7 break;
}
8 r = a->rc;
9 if (DCAS(A, &a->rc, a, r, a, r+1)) {
10 *dest = a;
11 break;
}
}
12 LFRCDestroy(olddest);
}
___________________________________________________________


AFAICT, this will not work at all. The reference count can be dropped to
zero right before 'line 8'. Then it goes BOOM! I wonder if this is why SUN
has patent application 20060037026? They needed to use Joe Seighs atomic
pointer:

http://atomic-ptr-plus.sourceforge.net

So they took a patent application out on it. Yikes!


--
Chris M. Thomasson
http://appcore.home.comcast.net

Chris Thomasson

unread,
Apr 4, 2008, 10:17:42 AM4/4/08
to
"Chris Thomasson" <cri...@comcast.net> wrote in message
news:wpudnRpPHc-armva...@comcast.com...

> Here is the paper in question:
>
> http://citeseer.ist.psu.edu/cache/papers/cs/25408/http:zSzzSzwww.sun.comzSzresearchzSzpeoplezSzdetlefszSzpaperszSz01-podc.pdf/detlefs01lockfree.pdf
>
> on page 8 they present pseudo-code. Here is the function:
> ___________________________________________________________
[...]

> ___________________________________________________________
>
>
> AFAICT, this will not work at all. The reference count can be dropped to
> zero right before 'line 8'. Then it goes BOOM! I wonder if this is why SUN
> has patent application 20060037026? They needed to use Joe Seighs atomic
> pointer:
>
> http://atomic-ptr-plus.sourceforge.net
>
> So they took a patent application out on it. Yikes!

Here is some more information on that:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/f2c94118046142e8

Chris Thomasson

unread,
Apr 4, 2008, 10:52:37 AM4/4/08
to
"Chris Thomasson" <cri...@comcast.net> wrote in message
news:wpudnRpPHc-armva...@comcast.com...
[...]

>
>
> AFAICT, this will not work at all. The reference count can be dropped to
> zero right before 'line 8'. Then it goes BOOM!

[...]

Here is a more recent version of the paper:

http://research.sun.com/scalable/pubs/LFRC-DC02.pdf

Unfortunately, it still seems to have the same race-condition. Before I send
an e-mail to Mark Moir and/or David, I was wondering if there is an errata
posted for this that I am missing?


Here is how the race can occur:
_________________________________________________________________


void LFRCLoad(SNode **A, SNode **dest) {
1 SNode *a, *olddest = *dest;
2 long r;
3 while (true) {
4 a = *A;
5 if (a == Null) {
6 *dest = Null;
7 break;
}
8 r = a->rc;
9 if (DCAS(A, &a->rc, a, r, a, r+1)) {
10 *dest = a;
11 break;
}
}
12 LFRCDestroy(olddest);
}


// a shared pointer to a SNode object
static SNode* g_sptr = NULL;


R0: Thread A - Creates new SNode object A.
R2: Thread A - Calls LFRCStore(&g_sptr, A).
R3: Thread A - Calls LFRCDestroy(A).
R4: Thread B - Calls LFRCLoad(&g_sptr, ...) and gets to line 5.
R5: Thread A - Calls LFRCDCAS(&g_sptr, &A::rc, A, A::rc, 0, 0)
R6: Thread B - Executes line 8 and reads from the dead object!
_________________________________________________________________


In steps R0-R2 ThreadA creates a new ObjectA and stores it into the shared
pointer g_sptr making the reference count of ObjectA 2. In step r3 ThreadA
destroys its reference to objectA thus rendering its reference count 1. In
step R4 ThreadB tries to load from the shared pointer g_sptr by deferenceing
it and getting a pointer to objectA whose reference count is 1. ThreadB gets
to line 5 and is pre-empted. In step R5 ThreadA swaps out the objectA
pointed to by g_sptr and destroys the reference which drops the reference
count of ObjectA to zero and it gets deleted. ObjectA is completely
destroyed at this point. Now we move on to step R6... ThreadB still has a
pointer to ObjectA so it merrily executes line 8 in the pseudo-code which
tries to read the reference count of ObjectA... End result is seg-fault.
BOOM!


There has to be an errata I am missing. Humm.

0 new messages