Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Implementing thread-safe shared_ptr/weak_ptr using atomic counts

13 views
Skip to first unread message

Niklas Matthies

unread,
Mar 21, 2003, 3:53:05 AM3/21/03
to

I need a sanity check on my shared_ptr/weak_ptr implementation.
See Boost for the basic concepts and a reference implementation.

<motivation>

In my application, there are weak_ptrs that will never be destructed,
since they need to be accessible from destructors of possibly static
objects, regardless of destruction order.

The Boost implementation uses mutex locking to make operations atomic,
which means that the weak_ptr's associated mutex object would never be
destructed, resulting in a potential resource leak, in particular when
dynamic loading/unloading of libraries is involved.

For this reason, I looked into implementing a shared_ptr/weak_ptr
combination using atomic (= thread-safe) counts that don't require
any resources beyond the storage of the count variable itself.

</motivation>

Assume that there is a class atomic_count which provides both prefix
and postfix atomic increment and decrement, as well as an atomic
compare-and-increment operation, and whose destructor is a no-op.
(Under Win32, the Interlocked*() family of functions can be used to
implement this.)

The shared_ptr and weak_ptr objects that refer to one object share
a common reference count object (cf. boost::counted_base) which
maintains two distinct counts. In my implementation the class is
called shared_count (not to be confused with boost::shared_count,
which has a different purpose) and has the following interface:

class shared_count
: nocopy // prevents copying
{
public:

shared_count(long initial_strongcount = 1);
shared_count(long initial_strongcount, long initial_refcount);

void add_ref();
void release_ref();

void add_strong();
void release_strong();

virtual ~shared_count();

private:

virtual void v_dispose() = 0;

private:

atomic_count m_refcount;
atomic_count m_strongcount;
};

Here, m_refcount is the total reference count--i.e. both shared_ptrs
and weak_ptrs added together--which serves to keep the shared_count
object alive, and m_strongcount is the number of shared_ptrs, keeping
the actual object alive that the shared/weak_ptrs refer to (when
m_strongcount drops to zero, v_dispose() is called).

The implementation of add_ref()/release_ref() is pretty trivial:

inline void shared_count::add_ref()
{
++m_refcount;
}

inline void shared_count::release_ref()
{
if (--m_refcount == 0) delete this;
}

Things become interesting with add_strong()/release_strong(), since
they need to update both counts.

First, release_strong()'s implementation:

void shared_count::release_strong()
{
if (--m_strongcount == 0) v_dispose();
release_ref();
}

The fact that the two decrements are not a single atomic operation is
not a problem--it's equivalent to having an extra weak_ptr between both
decrements.

The tricky issue with add_strong() is that it needs to check whether
m_strongcount is zero, which means that the referred-to object has
already been disposed, hence creation of a strong reference must be
prevented in that case. We want to throw an exception then, as Boost
does.

The concrete situation where this can happen is when a shared_ptr is
created from a weak_ptr. The exception tells the client that the
weak_ptr is a dangling pointer, and therefore no shared_ptr can be
created from it.

The implementation of add_strong() cannot just increment m_strongcount
and then if the result is 1 reset it to 0 and throw the exception,
because while m_strongcount is temporarily set to 1, a different
thread could successfully create a shared_ptr from a weak_ptr.

So I implemented add_strong() as follows:

void shared_count::add_strong()
{
long n;

do
{
n = m_strongcount; // [1]
if (n == 0) throw some_exception;
}
while (n != m_strongcount.increment_if_equals(n)); // [2]

add_ref();
}

In case it is not obvious, atomic_count is convertible to long, hence
line [1] performs an atomic read of the count.

In line [2], increment_if_equals() atomically increments the count iff
it is equal to n, and then returns its prior value. This equality test
is necessary since m_strongcount could have been modified by a different
thread (in particular: have dropped to zero) between [1] and [2].

So we repeat until m_strongcount has the same non-zero value at [2] as
was determined at [1]. Obviously this is more than we need, since all
we actually care about is that m_strongcount is non-zero at [2],
regardless of the exact it had value at [1].

But, since the Interlocked*() functions under Win32 do not facilitate
an increment_unless_equals() function to be implemented, this is the
best I could come up with.

There is a minimal risk that the loop never terminates. This may
happen if other threads continuously and endlessly modify the count in
a tight loop, and thread scheduling is such that this always happens
between [1] and [2]. While not totally impossible, I think it is
resonable to assume that this will never happen in practice.

Lastly, the fact that add_ref() is only invoked at the end of
add_strong() is again no problem, since it is implied that any caller
of add_strong()--typically a shared_ptr or weak_ptr member function--
already has another (weak) reference, hence m_refcount cannot drop
to zero during add_strong(), so everything is well.

My questions are:

1) Is this really safe, or is there some race condition I have missed?
2) Did I overlook some way to improve add_strong(), given the above
atomic_count primitives?

-- Niklas Matthies

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

Alexander Terekhov

unread,
Mar 21, 2003, 6:59:04 PM3/21/03
to

Niklas Matthies wrote:
[...]

> But, since the Interlocked*() functions under Win32 do not facilitate
> an increment_unless_equals() function to be implemented,

Yeah. You might want to take a look at this:

http://lists.boost.org/MailArchives/boost/msg43813.php
(Subject: [boost] Re: Weak ref. via atomicity (RE: Smart pointers:...))

> this is the best I could come up with.

FWIW, thus far, the best I could come up with (that I can show
publicly) is this:

http://lists.boost.org/MailArchives/boost/msg43729.php
http://www.terekhov.de/pthread_refcount_t/draft-edits.txt
http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.h
http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.c

regards,
alexander.

Joseph Seigh

unread,
Mar 22, 2003, 4:40:43 AM3/22/03
to

Niklas Matthies wrote:
>
....


> So I implemented add_strong() as follows:
>
> void shared_count::add_strong()
> {
> long n;
>
> do
> {
> n = m_strongcount; // [1]
> if (n == 0) throw some_exception;
> }
> while (n != m_strongcount.increment_if_equals(n)); // [2]
>
> add_ref();
> }
>

....


>
> But, since the Interlocked*() functions under Win32 do not facilitate
> an increment_unless_equals() function to be implemented, this is the
> best I could come up with.
>

Actually you can do that with this or something like it.

void shared_count::add_strong()
{
long cmp;
long n;

cmp = n = m_strongcount;

while (n != 0) {
cmp = n;
n = InterlockedCompareExchange(&this.m_strongcount, cmp + 1, cmp);
if (n == cmp)
break;
}

if (n == 0)
throw some exception;

add_ref();
}

Joe Seigh

Joseph Seigh

unread,
Mar 22, 2003, 3:27:19 PM3/22/03
to

I wrote:
....


> n = InterlockedCompareExchange(&this.m_strongcount, cmp + 1, cmp);

Umm... that should of course just be

n = InterlockedCompareExchange(&m_strongcount, cmp + 1, cmp);

A little bit of a mix of C pseudo objects and Java in there. I'm suprised I didn't
manage to throw in a perlism.

I've been trying to come up with an all interlocked increment/decrement solution since
you might get better performance on some platforms with native interlocked increment/decrement
and this is, I gather, performance sensitive code. Here is an almost all interlocked increment/decrement
solution.

void shared_count::add_strong() {
if (InterlockedIncrement(&m_strongcount) < 0) {
InterlockedDecrement(&m_strongcount); // good idea. LONG_MIN isn't negative infinity
throw some exception;
}
add_ref();
}

void shared_count::release_strong() {
if (InterlockedDecrement(&m_strongcount) == 0)
// attempt to set strongcount to LONG_MIN. If attempt fails, some other
// thread has incremented strongcount so object is still alive or already disposed of.
if (InterlockedCompareExchange(&m_strongcount, LONG_MIN, 0) == 0) v_dispose();
release_ref();
}


Some comments. The exception throwing path I think is not performance critical since the
action on getting the exception should be to set the weakptr to something else or null. So
the exception path shouldn't get used a lot.

If you have native interlocked increment/decrement, then this code is totally loop free.
Not that small compare and swap loops aren't usually a problem, but still it's nice.
Of course the typical heap manager still uses locks so we can't claim total lock-free
here.

Also, you might consider packing both refcount and strongcount into the same word and
manipulating both with a single interlocked operation to reduce the use of interlocked
instructions to a minimum. For 32 bit words this would constrain your max refcounts to
64000 or so. Maybe it would give you an excuse to implement a refcount overflow exception. :)

Final comment to Alexander. This is why I don't think you can come up something like
pthread_refcount_t. There are too many ways to implement things here. I'd think you'd be
better off trying to implement a more generic interlocked api with performance hints so
the various applications could customize optimal solutions for various platforms.

Niklas Matthies

unread,
Mar 22, 2003, 3:43:34 PM3/22/03
to
On 2003-03-22 09:40, Joseph Seigh <jsei...@xemaps.com> wrote:
> Niklas Matthies wrote:
[...]

> > But, since the Interlocked*() functions under Win32 do not facilitate
> > an increment_unless_equals() function to be implemented, this is the
> > best I could come up with.
> >
>
> Actually you can do that with this or something like it.
>
> void shared_count::add_strong()
> {
> long cmp;
> long n;
>
> cmp = n = m_strongcount;
>
> while (n != 0) {
> cmp = n;
> n = InterlockedCompareExchange(&this.m_strongcount, cmp + 1, cmp);
> if (n == cmp)
> break;
> }
>
> if (n == 0)
> throw some exception;
>
> add_ref();
> }

That's a nice improvement, thanks!

-- Niklas Matthies

Niklas Matthies

unread,
Mar 22, 2003, 3:45:16 PM3/22/03
to
On 2003-03-21 23:59, Alexander Terekhov <tere...@web.de> wrote:
> Niklas Matthies wrote:
> [...]
>> But, since the Interlocked*() functions under Win32 do not facilitate
>> an increment_unless_equals() function to be implemented,
>
> Yeah. You might want to take a look at this:
>
> http://lists.boost.org/MailArchives/boost/msg43813.php
> (Subject: [boost] Re: Weak ref. via atomicity (RE: Smart pointers:...))
>
>> this is the best I could come up with.
>
> FWIW, thus far, the best I could come up with (that I can show
> publicly) is this:
>
> http://lists.boost.org/MailArchives/boost/msg43729.php
> http://www.terekhov.de/pthread_refcount_t/draft-edits.txt
> http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.h
> http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.c

I'm afraid this doesn't help, sind this refcount implementation seems
to require some cleanup (pthread_refcount_destroy()). I'm specifically
implementing a resource-less refcount.

-- Niklas Matthies

Alexander Terekhov

unread,
Mar 22, 2003, 9:24:23 PM3/22/03
to

Niklas Matthies wrote:
[...]

> > http://www.terekhov.de/pthread_refcount_t/draft-edits.txt
> > http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.h
> > http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.c
>
> I'm afraid this doesn't help, sind this refcount implementation seems
> to require some cleanup (pthread_refcount_destroy()). I'm specifically
> implementing a resource-less refcount.

"resource-less refcount". Interesting. Would you please provide some
kinda-real example illustrating the problem you are trying to solve
using a "resource-less refcount"? TIA.

regards,
alexander.

Niklas Matthies

unread,
Mar 23, 2003, 4:08:21 PM3/23/03
to
On 2003-03-22 20:27, Joseph Seigh <jsei...@xemaps.com> wrote:
[...]

> I've been trying to come up with an all interlocked
> increment/decrement solution since you might get better performance
> on some platforms with native interlocked increment/decrement and
> this is, I gather, performance sensitive code.

A few spins are probably fine, but I'd prefer to have some definite
guaranteed upper bound.

> Here is an almost all interlocked increment/decrement solution.
>
> void shared_count::add_strong() {
> if (InterlockedIncrement(&m_strongcount) < 0) {
> InterlockedDecrement(&m_strongcount); // good idea. LONG_MIN isn't negative infinity
> throw some exception;
> }
> add_ref();
> }
>
> void shared_count::release_strong() {
> if (InterlockedDecrement(&m_strongcount) == 0)
> // attempt to set strongcount to LONG_MIN. If attempt fails, some other
> // thread has incremented strongcount so object is still alive or already disposed of.
> if (InterlockedCompareExchange(&m_strongcount, LONG_MIN, 0) == 0) v_dispose();
> release_ref();
> }

Now that's truly cute. :)

> Some comments. The exception throwing path I think is not
> performance critical since the action on getting the exception
> should be to set the weakptr to something else or null. So the
> exception path shouldn't get used a lot.

There are situations where it might get used a lot, but performance is
not critical there.

> If you have native interlocked increment/decrement, then this code
> is totally loop free. Not that small compare and swap loops aren't
> usually a problem, but still it's nice. Of course the typical heap
> manager still uses locks so we can't claim total lock-free here.

Only when you use the shared_ptr to actually manage heap objects. :)

> Also, you might consider packing both refcount and strongcount into
> the same word and manipulating both with a single interlocked
> operation to reduce the use of interlocked instructions to a
> minimum. For 32 bit words this would constrain your max refcounts
> to 64000 or so. Maybe it would give you an excuse to implement a
> refcount overflow exception. :)

It's tempting, but I'd rather have add_ref() remain nothrow.
It might also be difficult to do the compare-exchange operation then,
since it has to work independently of changes to the refcount.

-- Niklas Matthies

Niklas Matthies

unread,
Mar 23, 2003, 4:13:30 PM3/23/03
to
On 2003-03-23 02:24, Alexander Terekhov <tere...@web.de> wrote:
> Niklas Matthies wrote:
> [...]
>> > http://www.terekhov.de/pthread_refcount_t/draft-edits.txt
>> > http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.h
>> > http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.c
>>
>> I'm afraid this doesn't help, sind this refcount implementation seems
>> to require some cleanup (pthread_refcount_destroy()). I'm specifically
>> implementing a resource-less refcount.
>
> "resource-less refcount". Interesting. Would you please provide some
> kinda-real example illustrating the problem you are trying to solve
> using a "resource-less refcount"? TIA.

I already explained this in thread's initial posting. In essence, I
need to reliably detect that some singleton has already been destroyed,
even from within destructors of statically allocated objects, and I
can't rely on some "resource-ful" mutex or refcount object to still be
alive.

-- Niklas Matthies

Alexander Terekhov

unread,
Mar 24, 2003, 5:16:19 AM3/24/03
to

Niklas Matthies wrote:
[...]

> > "resource-less refcount". Interesting. Would you please provide some
> > kinda-real example illustrating the problem you are trying to solve
> > using a "resource-less refcount"? TIA.
>
> I already explained this in thread's initial posting. In essence, I
> need to reliably detect that some singleton has already been destroyed,
> even from within destructors of statically allocated objects,

Okay.

> and I
> can't rely on some "resource-ful" mutex or refcount object to still be
> alive.

Why? Please illustrate.

regards,
alexander.

Niklas Matthies

unread,
Mar 24, 2003, 3:18:24 PM3/24/03
to
On 2003-03-24 10:16, Alexander Terekhov <tere...@web.de> wrote:
> Niklas Matthies wrote:
> [...]
> > > "resource-less refcount". Interesting. Would you please provide some
> > > kinda-real example illustrating the problem you are trying to solve
> > > using a "resource-less refcount"? TIA.
> >
> > I already explained this in thread's initial posting. In essence, I
> > need to reliably detect that some singleton has already been destroyed,
> > even from within destructors of statically allocated objects,
>
> Okay.
>
> > and I
> > can't rely on some "resource-ful" mutex or refcount object to still be
> > alive.
>
> Why? Please illustrate.

Simply because I don't control the order of destruction of static
objects (including the order of shared library unloading).

-- Niklas Matthies

0 new messages