
Why is boost::shared_ptr so much slower?


chris

Aug 19, 2009, 6:53:29 PM
I've just done a performance test comparing shared_ptr with a native
pointer; the results show shared_ptr is at least 3 times slower once the
iteration count exceeds 10000. This is the code snippet:

#include <vector>
#include <iostream>
#include <boost/shared_ptr.hpp>

using namespace std;
using namespace boost;

class Thing
{
public:
    Thing()
    {
    }

    void method(void)
    {
        int i = 5;
    }
};

typedef boost::shared_ptr<Thing> ThingPtr;

void processThing(Thing* thing)
{
    thing->method();
}

//loop1 and loop2 test shared_ptr in the vector container
void loop1(long long num)
{
    vector<ThingPtr> thingPtrs;

    for (long long i = 0; i < num; i++) {
        ThingPtr p1(new Thing);
        thingPtrs.push_back(p1);
    }
}

void loop2(long long num)
{
    vector<Thing> thingPtrs;
    for (long long i = 0; i < num; i++) {
        Thing thing;
        thingPtrs.push_back(thing);
    }
    thingPtrs.clear();
}

//loop3 and loop4 test shared_ptr vs. a raw pointer without a container
void loop3(long long num)
{
    for (long long i = 0; i < num; i++) {
        ThingPtr p1(new Thing);
        processThing(p1.get());
    }
}

void loop4(long long num)
{
    for (long long i = 0; i < num; i++) {
        Thing* p1 = new Thing();
        processThing(p1);
        delete p1;
    }
}

The results are the following:
CPU: Intel Core2 Quad CPU Q8200
RAM: 4G
OS: Windows XP SP2
Compiler: Visual Studio 2005

loop1 vs loop2: 100000 times
loop1 elapsed 390 msec
loop2 elapsed 93 msec

loop5 vs loop6: 100000 times
loop5 elapsed 171 msec
loop6 elapsed 78 msec


If we heavily use the boost::shared_ptr in a large project, is that a
big performance problem?


chris

Sam

Aug 19, 2009, 7:40:23 PM
chris writes:

> If we heavily use the boost::shared_ptr in a large project, is that a
> big performance problem?

I always thought that Boost was generally crap. I never understood that
library's popularity. And shared_ptr was its cream of the crap. shared_ptr
is a structure with two pointers: one to the original object, and a second
pointer to a separately-allocated memory block that stores the reference
count.

So, passing a shared_ptr around, say, as a parameter to a function call,
means pushing two pointers on the stack, versus one. 200% overhead. Plus,
the reference count is kept in a separate, small object. Each time you
allocate an object, you end up allocating another object, to hold the
reference count. Then, with all the shared_ptrs flying around, you now have
a whole bunch of heap churn going on, allocating and deallocating small
blocks of memory, for all the reference counts. Much higher heap memory
fragmentation.

Horrible.


Message has been deleted

Brian Wood

Aug 19, 2009, 9:55:47 PM
On Aug 19, 6:40 pm, Sam <s...@email-scan.com> wrote:
> chris writes:
> > If we heavily use the boost::shared_ptr in a large project, is that a
> > big performance problem?
>
> I always thought that Boost was generally crap. I never understood that
> library's popularity. And shared_ptr was its cream of the crap. shared_ptr
> is a structure with two pointers: one to the original object, and a second
> pointer to a separately-allocated memory block that stores the reference
> count.

I wouldn't say Boost is generally poor quality. Some of the libs
are pretty good and some are less so. One Boost library that I
think highly of is the Boost Intrusive library --
http://www.boost.org/doc/libs/1_39_0/doc/html/intrusive.html


Brian Wood
www.webEbenezer.net

Ian Collins

Aug 20, 2009, 3:29:33 AM
Sam wrote:
> chris writes:
>
>> If we heavily use the boost::shared_ptr in a large project, is that a
>> big performance problem?
>
> I always thought that Boost was generally crap. I never understood that
> library's popularity. And shared_ptr was its cream of the crap.
> shared_ptr is a structure with two pointers: one to the original object,
> and a second pointer to a separately-allocated memory block that stores
> the reference count.

It's a reference counted smart pointer, so what else do you expect?

--
Ian Collins

James Kanze

Aug 20, 2009, 5:17:01 AM

Performance isn't really the problem---it's obviously easier to
find examples where boost::shared_ptr is slower than a raw
pointer, because it does more. If you need that "more", the
question is whether boost::shared_ptr is more efficient than
other solutions to the same problem you're trying to solve.
From what I've seen, boost::shared_ptr is an excellent
implementation of its design specification, and when used
appropriately, does the job quite well.

What worries me in the original posting is the "we heavily use
the boost::shared_ptr". There aren't really that many cases
where it's appropriate, and the "heavily use" sounds like
they're counting on it to be a silver bullet for all object
lifetime and memory management issues. Which it isn't.

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Sam

Aug 20, 2009, 7:08:21 AM
Ian Collins writes:

Someone needs to think about it before cranking out the code. shared_ptr may
be an adequate academic implementation, but it belongs in a classroom, not in
the real world. And part of the classroom study would be how it fragments
the heap, with bazillions of tiny chunks of allocated memory.

Pete Becker

Aug 20, 2009, 9:03:13 AM
joseph cook wrote:

> On Aug 19, 7:40 pm, Sam <s...@email-scan.com> wrote:
>> chris writes:
> <snip> Then, with all the shared_ptrs flying around, you now have

>> a whole bunch of heap churn going on, allocating and deallocating small
>> blocks of memory, for all the reference counts. Much higher heap memory
>> fragmentation.
>>
>> Horrible.
>>
>
> I have to agree, at least partially. The boost::smart_ptr<> is
> allocating memory off the heap (twice!), which on some systems might
> make it 100X slower or more than passing around a raw pointer.
>

Whoa, that's mixing a couple of different things.


There is one allocation off the heap when you create an object. There is
one allocation off the heap when you create a boost::shared_ptr object
to manage that object. So you get two allocations when you create an
object and manage it with a boost::shared_ptr, as opposed to one
allocation when you create an object and manage it with a raw pointer.
There are no additional allocations when you pass a boost::shared_ptr
object around.

The speed difference between a raw pointer and a shared_ptr is mostly in
managing the shared_ptr's reference count: increment when you create a
new shared_ptr object, decrement when you destroy it.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of
"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Pete Becker

Aug 20, 2009, 9:05:01 AM
Sam wrote:
> And part of the classroom study would be how it
> fragments the heap, with bazillions of tiny chunks of allocated memory.

And that, in turn, depends heavily on the heap manager that you're using.

Jorgen Grahn

Aug 20, 2009, 10:01:48 AM
On Wed, 19 Aug 2009 18:40:23 -0500, Sam <s...@email-scan.com> wrote:
...

> I always thought that Boost was generally crap. I never understood that
> library's popularity. And shared_ptr was its cream of the crap. shared_ptr
> is a structure with two pointers: one to the original object, and a second
> pointer to a separately-allocated memory block that stores the reference
> count.
>
> So, passing a shared_ptr around, say, as a parameter to a function call,
> means pushing two pointers on the stack, versus one. 200% overhead.

Two-instead-of-one is a 100% overhead. Also note that 200% of almost
nothing is still almost nothing, and that far from all function calls
push arguments onto a stack.

> Plus,
> the reference count is kept in a separate, small object. Each time you
> allocate an object, you end up allocating another object, to hold the
> reference count. Then, with all the shared_ptrs flying around, you now have
> a whole bunch of heap churn going on, allocating and deallocating small
> blocks of memory, for all the reference counts. Much higher heap memory
> fragmentation.

I would have expected them to pool these allocations somehow. But I
haven't read the source code, and more importantly I haven't profiled
boost.shared_ptr against another solution in real code. I have never
needed shared pointers badly enough.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Maxim Yegorushkin

Aug 20, 2009, 10:01:10 AM
chris wrote:
> I've just done a performance test on the shared_ptr compared to the
> native pointer case, the results show at lease 3 times slower when the
> iteration number > 10000, this is the code snippet:
>
> #include <vector>
> #include <iostream>
> #include <boost/shared_ptr.hpp>
>
> using namespace std;
> using namespace boost;
>
> class Thing
> {
> public:
> Thing()
> {
> }
>
> void method (void)
> {
> int i = 5;
> }
> };
>
> typedef boost::shared_ptr<Thing> ThingPtr;
>
> void processThing(Thing* thing)
> {
> thing->method();
> }
>
> //loop1 and loop2 test shared_ptr in the vector container
> void loop1(long long num)
> {
> vector<ThingPtr> thingPtrs;
>
> for(int i=0; i< num; i++) {
> ThingPtr p1(new Thing);
> thingPtrs.push_back(p1);
> }
> }

loop1's body does up to three memory allocations: one in 'new Thing', another
in 'ThingPtr p1(<raw-pointer>)' (for the reference count), and the last one in
vector::push_back().

> void loop2(long long num)
> {
> vector<Thing> thingPtrs;
> for(int i=0; i< num; i++) {
> Thing thing;
> thingPtrs.push_back(thing);
> }
> thingPtrs.clear();
> }

loop2 does at most one memory allocation, in vector::push_back(), plus a copy
of thing. So, given that Thing is small, loop2 always wins.

> //loop3 and loop4 test shared_ptr in the vector container
> void loop3(long long num)
> {
> for(int i=0; i< num; i++) {
> ThingPtr p1(new Thing);
> processThing(p1.get());
> }
> }
>
> void loop4(long long num)
> {
> for(int i=0; i< num; i++) {
> Thing* p1 = new Thing();
> processThing(p1);
> delete p1;
> }
> }
>
> The results are the following:
> CPU: Intel Core2 Quad CPU Q8200
> RAM: 4G
> OS: Windows XP SP2
> Compiler: Visual Studio 2005
>
> loop1 vs loop2: 100000 times
> loop1 elapsed 390 msec
> loop2 elapsed 93 msec
>
> loop5 vs loop6: 100000 times
> loop5 elapsed 171 msec
> loop6 elapsed 78 msec

Well, you posted results of loop1 vs loop2, which compare quite different
things, and of loop5 vs. loop6, for which you did not provide any source code.
Your question cannot be answered, since no relevant facts were provided.

--
Max

Noah Roberts

Aug 20, 2009, 12:21:23 PM
Pete Becker wrote:
> joseph cook wrote:
>> On Aug 19, 7:40 pm, Sam <s...@email-scan.com> wrote:
>>> chris writes:
>> <snip> Then, with all the shared_ptrs flying around, you now have
>>> a whole bunch of heap churn going on, allocating and deallocating small
>>> blocks of memory, for all the reference counts. Much higher heap memory
>>> fragmentation.
>>>
>>> Horrible.
>>>
>>
>> I have to agree, at least partially. the boost::smart_ptr<> is
>> allocating memory off the heap (twice!), which on some systems might
>> make it 100X slower or more than passing around a raw pointer.
>>
>
> Whoa, that's mixing a couple of different things.
>
>
> There is one allocation off the heap when you create an object. There is
> one allocation off the heap when you create a boost::shared_ptr object
> to manage that object. So you get two allocations when you create an
> object and manage it with a boost::shared_ptr, as opposed to one
> allocation when you create an object and manage it with a raw pointer.
> There are no additional allocations when you pass a boost::shared_ptr
> object around.
>
> The speed difference between a raw pointer and a shared_ptr is mostly in
> managing the shared_ptr's reference count: increment when you create a
> new shared_ptr object, decrement when you destroy it.
>

Which is a good reason to pass them around as const references and let
the function create a copy if it needs it.

Pete Becker

Aug 20, 2009, 12:57:30 PM

Which pretty much defeats their purpose. If you've measured the overhead
of reference counting and you can't afford it, then you shouldn't be
using it.

Noah Roberts

Aug 20, 2009, 1:57:41 PM
There's no reason to make a copy of a shared_ptr unless you actually
need it to be shared. Just as we don't pass strings around willy-nilly
by value, we shouldn't pass shared_ptrs that way either. If the object
being called wants a copy, it will make one.

Pete Becker

Aug 20, 2009, 2:17:33 PM

There's no reason to make a copy of a raw pointer unless you actually
need it to be shared. Except, of course, that passing pointers by
reference is not what people usually do. shared_ptr is supposed to look
like a pointer, not like a string.

Oh, and since you're micro-optimizing, don't forget to measure the
performance impact of passing shared_ptr objects by reference.

Juha Nieminen

Aug 20, 2009, 2:47:38 PM
Sam wrote:
> So, passing a shared_ptr around, say, as a parameter to a function call,
> means pushing two pointers on the stack, versus one. 200% overhead.
> Plus, the reference count is kept in a separate, small object. Each time
> you allocate an object, you end up allocating another object, to hold
> the reference count. Then, with all the shared_ptrs flying around, you
> now have a whole bunch of heap churn going on, allocating and
> deallocating small blocks of memory, for all the reference counts. Much
> higher heap memory fragmentation.
>
> Horrible.

Since you don't like how the boost shared_ptr is implemented, it
sounds like you have a better idea. Let's hear it. How would you
implement such a shared_ptr?

Naturally it should have the same basic properties:

- Non-intrusive. We want to be able to use it for existing object types
(and even basic types if so desired).

- Since it's a smart pointer to a shared object, and it should
automatically destroy the object when the last pointer is destroyed, it
has to be reference-counted somehow.

- Works with incomplete types (the only place where the type must be
complete is when constructing the smart pointer).

- Thread-safe.

Noah Roberts

Aug 20, 2009, 2:49:41 PM
Pete Becker wrote:
> Noah Roberts wrote:
>> Pete Becker wrote:
>>>
>>> Which pretty much defeats their purpose. If you've measured the
>>> overhead of reference counting and you can't afford it, then you
>>> shouldn't be using it.
>>>
>> There's no reason to make a copy of a shared_ptr unless you actually
>> need it to be shared. Just as we don't pass strings around
>> willy-nilly by value, we shouldn't pass shared_ptrs that way either.
>> If the object being called wants a copy, it will make one.
>
> There's no reason to make a copy of a raw pointer unless you actually
> need it to be shared. Except, of course, that passing pointers by
> reference is not what people usually do. shared_ptr is supposed to look
> like a pointer, not like a string.
>
> Oh, and since you're micro-optimizing, don't forget to measure the
> performance impact of passing shared_ptr objects by reference.
>

Just like you don't prematurely optimize...don't prematurely pessimize.
If there's no reason to pass objects by value...don't do it. This
isn't a micro-optimization issue.

Pete Becker

Aug 20, 2009, 3:15:28 PM

Once again: shared_ptr is supposed to look like a pointer, not like a
string. If you don't pass pointers by reference, don't pass shared_ptrs by reference.

Ian Collins

Aug 20, 2009, 3:23:05 PM

Which would lead into the class on heap manager design.

--
Ian Collins

Alf P. Steinbach

Aug 20, 2009, 4:09:00 PM
* Pete Becker:

> Noah Roberts wrote:
>> Pete Becker wrote:
>>> Noah Roberts wrote:
>>>> Pete Becker wrote:
>>>>>
>>>>> Which pretty much defeats their purpose. If you've measured the
>>>>> overhead of reference counting and you can't afford it, then you
>>>>> shouldn't be using it.
>>>>>
>>>> There's no reason to make a copy of a shared_ptr unless you actually
>>>> need it to be shared. Just as we don't pass strings around
>>>> willy-nilly by value, we shouldn't pass shared_ptrs that way
>>>> either. If the object being called wants a copy, it will make one.
>>>
>>> There's no reason to make a copy of a raw pointer unless you actually
>>> need it to be shared. Except, of course, that passing pointers by
>>> reference is not what people usually do. shared_ptr is supposed to
>>> look like a pointer, not like a string.
>>>
>>> Oh, and since you're micro-optimizing, don't forget to measure the
>>> performance impact of passing shared_ptr objects by reference.
>>>
>>
>> Just like you don't prematurely optimize...don't prematurely
>> pessimize. If there's no reason to pass objects by value...don't do
>> it. This isn't a micro-optimization issue.
>
> Once again: shared_ptr is supposed to look like a pointer, not like a
> string. If you don't pass pointers by reference, don't pass shared_ptr's
> by reference.

I'm with Noah in principle, but with you in practice. :-) I've never felt the
need for const& with shared_ptr. It's just more to write for no particular
(significant) gain, and it may also convey the misleading impression that the
shared_ptr's reference count or deleter can't be modified. In principle, though,
enforcing on oneself a convention of const& for all but basic types could remove
some inefficiencies -- it's just that it's not that important.


Cheers,

- Alf

Squeamizh

Aug 20, 2009, 4:13:49 PM
On Aug 20, 12:15 pm, Pete Becker <p...@versatilecoding.com> wrote:

> Noah Roberts wrote:
> > Just like you don't prematurely optimize...don't prematurely pessimize.
> >  If there's no reason to pass objects by value...don't do it.  This
> > isn't a micro-optimization issue.
>
> Once again: shared_ptr is supposed to look like a pointer, not like a
> string. If you don't pass pointers by reference, don't pass shared_ptr's
> by reference.

This is not a very convincing argument.


Keith H Duggar

Aug 20, 2009, 5:33:16 PM

Boost shared_ptr is not "thread safe" by any standard that is
usually meant by the term:

http://www.boost.org/doc/libs/1_39_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety

ie "the same level of thread safety as built-in types" means
not "thread safe". I don't know why this misconception about
boost shared_ptr is so common. Was an earlier version of it
"thread safe" with some lock blah hidden in it? That would
hit the performance even more.

KHD

Sam

Aug 20, 2009, 5:55:37 PM
Ian Collins writes:

Understood. Boost is free to implement whatever cockamamie algorithm it
wants. Efficient runtime performance is someone else's problem.


Juha Nieminen

Aug 20, 2009, 5:56:11 PM
Keith H Duggar wrote:
> Boost shared_ptr is not "thread safe" by any standard that is
> usually meant by the term:
>
> http://www.boost.org/doc/libs/1_39_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety

First you claim that it's not thread-safe, and then you point to the
documentation which says that it is.

> ie "the same level of thread safety as built-in types" means
> not "thread safe".

You are confusing two different types of thread-safety.

Instances of shared_ptr are exactly as thread-safe as, for example,
regular pointers. If you want to pass a regular pointer from one thread
to another, you will need to use some synchronization mechanism in order
for this to work (in other words, for the target thread to know when it
can use the pointer it was given by the source thread). shared_ptr is in
no way different: If you want to pass an instance from one thread to
another, you will need some kind of synchronization, in the exact same
way as with *any* type.

What makes shared_ptr thread-safe compared to other, naive
reference-counting smart pointers, is that if an object is being shared
by more than one thread, the shared_ptr instances inside one thread can
be safely copied, assigned and destroyed without it affecting the
validity of the shared_ptrs in the other thread which point to the same
object.

A naive reference-counting smart pointer implementation which does not
take into account thread-safety cannot be used in this way. There will
be mutual exclusion problems if two threads, which share the same
object, copy, assign or destroy instances of this smart pointer pointing
to that object. That's because they are not only sharing the object,
they are also sharing the reference counter: Changing it must be
thread-locked.

> I don't know why this misconception about
> boost shared_ptr is so common.

I think it's you who is having a misconception here.

Juha Nieminen

Aug 20, 2009, 5:59:10 PM
Sam wrote:
> Understood. Boost is free to implement whatever cockamamie algorithm it
> wants. Efficient runtime performance is someone else's problem.

I'm still waiting to see your better solution to this.

If you don't have a better solution to offer, stop bitching.

Sam

Aug 20, 2009, 6:13:04 PM
Juha Nieminen writes:

> Sam wrote:
>> So, passing a shared_ptr around, say, as a parameter to a function call,
>> means pushing two pointers on the stack, versus one. 200% overhead.
>> Plus, the reference count is kept in a separate, small object. Each time
>> you allocate an object, you end up allocating another object, to hold
>> the reference count. Then, with all the shared_ptrs flying around, you
>> now have a whole bunch of heap churn going on, allocating and
>> deallocating small blocks of memory, for all the reference counts. Much
>> higher heap memory fragmentation.
>>
>> Horrible.
>
> Since you don't like how the boost shared_ptr is implemented, it
> sounds like you have a better idea. Let's hear it. How would you
> implement such a shared_ptr?

This problem has been solved already. The reference count simply needs to be
a virtual superclass, and all reference-counted objects are derived from the
superclass.

Given that it's a virtual superclass, multiple inheritance works properly.
An object multiply inherited gets one instance of the virtual superclass.
And the actual shared pointer is a single pointer, a pointer to the object;
and not two pointers. And you only need to allocate one object from the
heap, not two.

The only real advantage to shared_ptr, in its current form, that I can see
is that you can use it directly with standard library classes and basic
types. Otherwise, you'll need to derive from the class, and the virtual
reference count superclass, in order to get the shared pointer semantics.
This is not a big deal, and a couple of templates will take care of it.

> Naturally it should have the same basic properties:
>
> - Non-intrusive. We want to be able to use it for existing object types
> (and even basic types if so desired).

Not a showstopper. Can be used with existing object types, and basic types,
by subclassing them.

> - Since it's a smart pointer to a shared object, and it should
> automatically destroy the object when the last pointer is destroyed, it
> has to be reference-counted somehow.

Done.

> - Works with incomplete types (the only place where the type must be
> complete is when constructing the smart pointer).

Done.

> - Thread-safe.

Thread safety has no relevance here. The actual reference count can be
updated using the same atomic instructions that shared_ptr uses.

The grand total:

* The shared pointer becomes just a native pointer.

* The additional memory allocation gets eliminated.

* Given that the shared pointer is now just a native pointer, it's likely
that the compiler will be able to do a better job of optimizing code. With
shared pointers all over the place, each one now requires a single CPU
register rather than two, as is the case with shared_ptr, so the compiler
will generally have more CPU registers to work with, and will likely produce
better and faster code.

Message has been deleted

Keith H Duggar

Aug 20, 2009, 7:08:59 PM
On Aug 20, 5:56 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
> Keith H Duggar wrote:
> > Boost shared_ptr is not "thread safe" by any standard that is
> > usually meant by the term:
>
> >http://www.boost.org/doc/libs/1_39_0/libs/smart_ptr/shared_ptr.htm#Th...

>
> First you claim that it's not thread-safe, and then you point to the
> documentation which says that it is.

No, I pointed to a document that says it is "as thread safe as ..."
not that it is "thread safe".

> > ie "the same level of thread safety as built-in types" means
> > not "thread safe".
>
> You are confusing two different types of thread-safety.

No, you are fundamentally missing my point. Obviously I know the
level of "thread safety" that boost::smart_ptr provides (since I
posted the reference), so you should have at least thought for a
moment about what my point is and at least recognized the point
(whether you agree or not) before launching into an elaboration
of what the document already explains.

My point (since you missed it the first time) is that "as thread
safe as a built-in type" is not, in my opinion, what one commonly
thinks of when someone says a construct is "thread safe". So when
you posted this

> Naturally it should have the same basic properties:

> ...
> - Thread-safe.

it can easily be misunderstood or deceive someone without the
qualification "as a built-in type" which the boost document is
careful to include. To put it simply

"as thread-safe as a built-in type" != "thread safe"

and it is deceptive not to qualify it appropriately when you
challenge someone to produce an alternate implementation.

> Instances of shared_ptr are exactly as thread-safe as, for example,
> regular pointers. If you want to pass a regular pointer from one thread
> to another, you will need to use some synchronization mechanism in order
> for this to work

Which is exactly why it is not "thread safe" but rather only
"as thread safe as a ...".

[snip further elaboration of already linked documentation]

> > I don't know why this misconception about
> > boost shared_ptr is so common.
>
> I think it's you who is having a misconception here.

No, you were just missing my point. I hope it's clearer now.

KHD

Pete Becker

Aug 20, 2009, 7:23:44 PM
joseph cook wrote:

> On Aug 20, 9:03 am, Pete Becker <p...@versatilecoding.com> wrote:
>> joseph cook wrote:
>>> On Aug 19, 7:40 pm, Sam <s...@email-scan.com> wrote:
>>>> chris writes:
>>> <snip> Then, with all the shared_ptrs flying around, you now have
>>>> a whole bunch of heap churn going on, allocating and deallocating small
>>>> blocks of memory, for all the reference counts. Much higher heap memory
>>>> fragmentation.
>>>> Horrible.
>>> I have to agree, at least partially. the boost::smart_ptr<> is
>>> allocating memory off the heap (twice!), which on some systems might
>>> make it 100X slower or more than passing around a raw pointer.
>> Whoa, that's mixing a couple of different things.
>>
>> There is one allocation off the heap when you create an object. There is
>> one allocation off the heap when you create a boost::shared_ptr object
>> to manage that object. So you get two allocations when you create an
>> object and manage it with a boost::shared_ptr, as opposed to one
>> allocation when you create an object and manage it with a raw pointer.
>> There are no additional allocations when you pass a boost::shared_ptr
>> object around.
>>
>> The speed difference between a raw pointer and a shared_ptr is mostly in
>> managing the shared_ptr's reference count: increment when you create a
>> new shared_ptr object, decrement when you destroy it.
>
> I disagree. In my implementation of Boost, anyway, there are 2 memory
> allocations for every shared_ptr creation, not one.

I'm not going to review the code now, but when I implemented shared_ptr
I only did one allocation. I looked pretty carefully at Boost's
implementation and I don't remember seeing two allocations.

> Also, the whole
> point of shared_ptr is that I can have, many pointers to the same
> memory location, and not worry about when it gets deleted.
>
> In a real application, maybe I would create the object once, and
> destruct it once. In the course of my application, I could have many
> pointers to this data. I want to delete the data when finally no one
> is using it (shared_ptr<>). If I were using plain pointers, I would
> always have one heap allocation, period. It wouldn't depend on how
> many intermediate objects are pointing to this data (sharing it).
>

Even if it's three allocations on initial creation, that isn't changed
when you make additional copies of that object. You only have three heap
allocations, period. Regardless of how many intermediate objects are
sharing it.

Keith H Duggar

Aug 20, 2009, 7:48:24 PM
On Aug 20, 7:08 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Aug 20, 5:56 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
>
> > Keith H Duggar wrote:
> > > Boost shared_ptr is not "thread safe" by any standard that is
> > > usually meant by the term:
>
> > >http://www.boost.org/doc/libs/1_39_0/libs/smart_ptr/shared_ptr.htm#Th...
>
> > First you claim that it's not thread-safe, and then you point to the
> > documentation which says that it is.
>
> No, I pointed to a document that says it is "as thread safe as ..."
> not that it is "thread safe".

By the way, I don't mean this as any kind of argument against boost
shared_ptr. I think the boost implementation strikes the right level
of thread-safety. A fully "thread safe" shared_ptr would not be as
useful (at least to me) as it would (most likely) suffer very high
performance penalties and support a more limited interface. I think
the two most useful levels are undefined-thread-safety (ie no extra
work for safety) and boost-level-thread-safety (for a shared_ptr).

KHD

joseph cook

Aug 20, 2009, 9:42:53 PM

OK, my intent isn't to beat up boost::shared_ptr, because I use it
frequently and find it very useful. Unfortunately, I could use it a
lot more if it didn't perform heap allocations when being constructed,
or if I could control those heap allocations better. Probably I can and
just am not aware of how.

I was mistaken about the two heap allocations. Perhaps this was true
in an older boost version?

What I meant about the intermediate objects is this. If I had a
singleton as follows, which self-destructs as soon as all users of it
go out of scope:

template<typename T>
class Single
{
public:
    static shared_ptr<T> instance()
    {
        if (!m_instance)
        {
            m_instance = shared_ptr<T>(new T);
        }
        return m_instance;
    }
private:
    static shared_ptr<T> m_instance;
};

Every time I call "instance()", a shared_ptr is being created (slow
heap allocation).

If instead I had used plain pointers:

template<typename T>
class Single
{
public:
    static T* instance()
    {
        if (!m_instance)
        {
            m_instance = new T;
        }
        return m_instance;
    }
private:
    static T* m_instance; // initialize to 0
};

I would have the initial allocation, but could call "instance" an
unlimited amount of times without additional allocations. (I'm
obviously not getting the reference counting either here)

Joe Cook

James Kanze

Aug 21, 2009, 6:57:41 AM
On Aug 20, 3:05 pm, Pete Becker <p...@versatilecoding.com> wrote:
> Sam wrote:
> > And part of the classroom study would be how it fragments
> > the heap, with bazillions of tiny chunks of allocated
> > memory.

> And that, in turn, depends heavily on the heap manager that
> you're using.

And a lot of other things. An invasive reference counted
pointer will generally be faster, but more because copying it is
only copying one pointer, rather than two, than because of the
allocations.

IIRC, at one point, very early in the development of
boost::shared_ptr, the author actually ran some benchmarks. And
found that the extra allocation wasn't very expensive; not
expensive enough, for example, to justify the added complexity
of implementing its own memory manager. (For various reasons,
most of the usual heap managers today are very efficient when it
comes to allocating a lot of small sized objects of the same
size.) In other words, the authors actually treated the project
as if it were a professional development, and behaved
professionally.

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

unread,
Aug 21, 2009, 7:00:42 AM8/21/09
to
On Aug 20, 4:01 pm, Jorgen Grahn <grahn+n...@snipabacken.se> wrote:
> On Wed, 19 Aug 2009 18:40:23 -0500, Sam <s...@email-scan.com> wrote:

[...]


> I would have expected them to pool these allocations somehow.

IIRC, the authors actually experimented with several
implementations, including one which did use a pooled allocator.
And found that pooling the allocations didn't measurably change
performance.

But of course, any time you do anything professionally, you'll
find a lot of amateurs ready to criticize, because you haven't
fallen victim to their prejudices.

James Kanze

unread,
Aug 21, 2009, 7:22:59 AM8/21/09
to
On Aug 21, 1:08 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Aug 20, 5:56 pm, Juha Nieminen <nos...@thanks.invalid> wrote:

> > Keith H Duggar wrote:
> > > Boost shared_ptr is not "thread safe" by any standard that
> > > is usually meant by the term:

> > >http://www.boost.org/doc/libs/1_39_0/libs/smart_ptr/shared_ptr.htm#Th...

> > First you claim that it's not thread-safe, and then you
> > point to the documentation which says that it is.

> No, I pointed to a document that says it is "as thread safe as
> ..." not that it is "thread safe".

> > > ie "the same level of thread safety as built-in types" means
> > > not "thread safe".

> > You are confusing two different types of thread-safety.

> No, you are fundamentally missing my point. Obviously I know
> the level of "thread safety" that boost::smart_ptr provides
> (since I posted the reference) so you should have at least
> thought for a moment about what my point is and at least
> recognize the point (whether you agree or not) before
> launching into an elaboration of what the document already
> explains.

> My point (since you missed it the first time) is that "as
> thread safe as a built-in type" is not, in my opinion, what
> one commonly thinks of when someone says a construct is
> "thread safe".

So what do you think one "commonly thinks of" when one says a
construct is "thread safe". As far as I can tell,
boost::smart_ptr meets the definitions of Posix, and offers
the maximum thread safety which one can reasonably expect.

> So when you posted this

> > Naturally it should have the same basic properties:
> > ...
> > - Thread-safe.

> it can easily be misunderstood or deceive someone without the
> qualification "as a built-in type" which the boost document is
> careful to include. To put it simply

> "as thread-safe as a built-in type" != "thread safe"

In what way?

The basic definition of "thread safe", at least as I see it used
by the experts in the field, more or less corresponds to the
Posix definition: there is no problem using different instances
of the object in different threads, regardless of what you're
doing with them; and there is no problem using the same instance in
different threads as long as none of the threads are modifying
the instance (from a logical point of view). Anything else
requires external synchronization. (I've not analysed
shared_ptr in detail, and it's possible that it doesn't meet
these guarantees. The implementation I have access to uses
__gnu_cxx::__atomic_add_dispatch, for example, and the last time
I looked, the implementation of that function could result in a
deadlock on 32 bit Sparcs under Solaris.)

Chris M. Thomasson

unread,
Aug 21, 2009, 8:06:23 AM8/21/09
to
"Alf P. Steinbach" <al...@start.no> wrote in message
news:h6kagv$b1o$1...@news.eternal-september.org...

In a multiple processor system, avoiding mutations to the reference count
does give you significant gains indeed. Why should I suffer the penalty of a
memory barrier and an atomic RMW operation when I don't have to? The memory
barrier will have a negative effect in the form of blowing away cached data
and the atomic RMW might need to lock the damn bus, or do a cache snoop.
This has MAJOR negative effects. Passing smart pointers by reference is the
way to go. You only need to increment the counter when you're passing the
pointer to another thread. Which procedure is going to be more efficient:


void func1(shared_ptr<foo> f) {

}


void func2(shared_ptr<foo>& f) {

}


?


Keep in mind that `func1' is going to have an atomic RMW + membar to
increment the count, and a membar + atomic RMW when the `f' goes out of
scope... That's four expensive operations, versus `func2' which has none of
that.


> and it may also convey a misleading impression that this shared_ptr's
> reference count or deleter can't be modified, but in principle I think
> enforcing on oneself a convention of doing const& for all but basic types
> could remove some inefficiencies -- it's just that it's not that
> important.

In the context of avoiding membars and atomic RMW's, IMO, it's EXTREMELY
important!


Please read the following which explains why traditional read/write mutex
implementations can be expensive:

http://groups.google.com/group/comp.programming.threads/msg/fdc665e616176dce

;^o

Christof Donat

unread,
Aug 21, 2009, 8:18:56 AM8/21/09
to
Hi,

> There is one allocation off the heap when you create an object. There is
> one allocation off the heap when you create a boost::shared_ptr object
> to manage that object.

No, the shared pointer object will usually be allocated on the stack.
boost::shared_ptr will allocate an intermediate object with the raw pointer
and the reference counter, which is referenced by all instances of
boost::shared_ptr that point to the same object. That is the second
allocation.

The performance could be improved by using a pool of these intermediate
objects. On the other hand you buy that with more memory consumption, since
the pool will always have to be there.

Christof


Christof Donat

unread,
Aug 21, 2009, 8:38:44 AM8/21/09
to
Hi,

> This problem has been solved already. The reference count simply needs to
> be a virtual superclass, and all reference-counted objects are derived
> from the superclass.

How does this work for boost::shared_ptr<int>? Sorry, no solution. That was
part of the requirements for boost::shared_ptr.

>> - Non-intrusive. We want to be able to use it for existing object types
>> (and even basic types if so desired).
>
> Not a showstopper. Can be used with existing object types, and basic
> types, by subclassing them.

boost::shared_ptr<std::string> sp(new std::string());

versus

boost::shared_ptr<std::string> sp(
new shared_ptrWrapper<std::string>(new std::string()));

Ah, here we have the two allocations again. Just that the user has to do it
himself.

For most people std:: is not the only library they use. All of those
libraries would either have to be changed or their users would need to use
the second allocation and do that himself.

People will start to use the shared_ptrWrapper all the time, because it makes
the use of their classes easier in cases where no shared_ptr is needed.

> * The shared pointer becomes just a native pointer.

No, it has to have some operators to make sure, the memory will be released
as soon as the reference count reaches 0.

> * The additional memory allocation gets eliminated.

Not really, see above. The second allocation has just moved from the library
into the users code.

Christof


Pete Becker

unread,
Aug 21, 2009, 8:43:22 AM8/21/09
to
Chris M. Thomasson wrote:
>
> In a multiple processor system, avoiding mutations to the reference
> count does give you significant gains indeed.

Um, there's another qualifier that got left out:

In a multiple processor system, *when shared_ptr objects
are shared between threads*, avoiding mutations to the
reference count does give you significant gains indeed.

And therein lies a great debate about whether shared_ptr should be
sharable between threads. But that's different from the argument that
objects of class type should always be passed by reference, which is
where this particular subthread started.

Pete Becker

unread,
Aug 21, 2009, 8:47:40 AM8/21/09
to
Christof Donat wrote:
> Hi,
>
>> There is one allocation off the heap when you create an object. There is
>> one allocation off the heap when you create a boost::shared_ptr object
>> to manage that object.
>
> No, the shared pointer object will usually be allocated on the stack.
> boost::shared_ptr will allocate an intermediate object with the raw pointer
> and the reference counter, which is referenced by all instances of
> boost::shared_ptr that point to the same object. That is the second
> allocation.
>

I agree with everything in the preceding paragraph except the first
word. Unfortunately, you snipped the rest of what I said, which makes it
clear that I wasn't saying that the shared_ptr object was on the heap,
but that it allocated memory on the heap.

Chris M. Thomasson

unread,
Aug 21, 2009, 9:03:40 AM8/21/09
to
"Pete Becker" <pe...@versatilecoding.com> wrote in message
news:TY2dnZJ975L2CxPX...@giganews.com...

> Chris M. Thomasson wrote:
>>
>> In a multiple processor system, avoiding mutations to the reference count
>> does give you significant gains indeed.
>
> Um, there's another qualifier that got left out:
>
> In a multiple processor system, *when shared_ptr objects
> are shared between threads*, avoiding mutations to the
> reference count does give you significant gains indeed.

Please correct me if I am wrong, but I believe that shared_ptr will be using
membars and atomic RMW on multi-processor systems regardless of whether they are
shared between threads or not. This can still have negative effects. One
simple example, think of false sharing in the reference count. I do not
believe that Boost eliminates false sharing between reference counts. In
other words, I don't think that Boost implementation pads the reference
count object to the size of a L2 cache line, and I don't think they align
said refcount object in memory on a L2 cache line boundary.


> And therein lies a great debate about whether shared_ptr should be
> sharable between threads.

In other words:

therein lies a great debate about whether shared_ptr should be thread-safe
at all right? Or am I totally misunderstanding you?

Chris M. Thomasson

unread,
Aug 21, 2009, 9:30:10 AM8/21/09
to
"Juha Nieminen" <nos...@thanks.invalid> wrote in message
news:%Djjm.223$1Y6...@read4.inet.fi...

> Keith H Duggar wrote:
>> Boost shared_ptr is not "thread safe" by any standard that is
>> usually meant by the term:
>>
>> http://www.boost.org/doc/libs/1_39_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety
>
> First you claim that it's not thread-safe, and then you point to the
> documentation which says that it is.
>
>> ie "the same level of thread safety as built-in types" means
>> not "thread safe".
>
> You are confusing two different types of thread-safety.
[...]

FWIW, Boost shared_ptr only provides basic/normal thread-safety. It does NOT
provide strong thread-safety in any way shape or form. Read the entire
following thread:


http://groups.google.com/group/comp.programming.threads/browse_frm/thread/e5167941d32340c6/1b2e1c98fa9ad7c7


You simply cannot use shared_ptr in a scenario which demands a strong
thread-safety level. For example, this will NOT work:
_______________________________________________________________
static shared_ptr<foo> global_foo;


void writer_threads() {
for (;;) {
shared_ptr<foo> local_foo(new foo);
global_foo = local_foo;
}
}


void reader_threads() {
for (;;) {
shared_ptr<foo> local_foo(global_foo);
local_foo->read_only_operation();
}
}
_______________________________________________________________

If you want to use shared_ptr this way, you need a mutex:
_______________________________________________________________
static shared_ptr<foo> global_foo;


void writer_threads() {
for (;;) {
shared_ptr<foo> local_foo(new foo);
// lock
global_foo = local_foo;
// unlock
}
}


void reader_threads() {
for (;;) {
// lock
shared_ptr<foo> local_foo(global_foo);
// unlock
local_foo->read_only_operation();
}
}
_______________________________________________________________


Christof Donat

unread,
Aug 21, 2009, 9:39:21 AM8/21/09
to
Hi,

>>> There is one allocation off the heap when you create an object. There is
>>> one allocation off the heap when you create a boost::shared_ptr object
>>> to manage that object.
>>
>> No, the shared pointer object will usually be allocated on the stack.
>> boost::shared_ptr will allocate an intermediate object with the raw
>> pointer and the refference counter, that is referenced by all instances
>> of boost::shared_ptr that point to the same object. That is the second
>> allocation.
>>
>
> I agree with everything in the preceding paragraph except the first
> word. Unfortunately, you snipped the rest of what I said, which makes it
> clear that I wasn't saying that the shared_ptr object was on the heap,
> but that it allocated memory on the heap.

Sorry, actually I had not read your complete post, because after the
beginning I was sure that you had a misconception there. I retract my first
word ;-)

Christof


Chris M. Thomasson

unread,
Aug 21, 2009, 9:59:26 AM8/21/09
to
"Chris M. Thomasson" <n...@spam.invalid> wrote in message
news:h6m7gt$2f6k$1...@news.ett.com.ua...

[...]

Here is a pre-alpha implementation of my experimental strongly thread-safe
reference counting algorithm:

http://webpages.charter.net/appcore/vzoom/refcount


Here is some very crude documentation of the C API:

http://webpages.charter.net/appcore/vzoom/refcount/doc


I proposed this algorithm for Boost a while ago, but I never really followed
up on it and did not make a formal proposal. Here are the relevant posts:

http://search.gmane.org/?query=&group=gmane.comp.lib.boost.devel&author=chris%20thomasson

Here is a fairly detailed description of exactly how the "strong
thread-safety aspect" of the algorithm works:

http://article.gmane.org/gmane.comp.lib.boost.devel/149803

Enjoy!

Juha Nieminen

unread,
Aug 21, 2009, 10:30:03 AM8/21/09
to
Keith H Duggar wrote:
> No, you are fundamentally missing my point.

And you missed mine.

My original post was in reply to someone complaining that
boost::shared_ptr is slow. I asked if that person would have a better
solution for a reference-counting smart pointer which would have the
same features as boost::shared_ptr.

One of these features is that it's thread-safe: You can have
shared_ptr instances in different threads pointing to the same object,
and inside that thread you can copy, assign and destroy these instances
without malfunction (in other words, even if the other thread is doing
that as well).

This is in contrast to most naive smart pointer implementations out
there which are *not* thread safe and cannot be used to share an object
among several threads. The reason for this is simple: These naive smart
pointers don't lock accesses to the reference counter and thus will
invariably malfunction when used with objects shared among threads.

Juha Nieminen

unread,
Aug 21, 2009, 10:37:27 AM8/21/09
to
Chris M. Thomasson wrote:
> FWIW, Boost shared_ptr only provides basic/normal thread-safety. It does
> NOT provide strong thread-safety in any way shape or form.

I don't even understand the difference between "basic/normal" and
"strong".

boost::shared_ptr is thread-safe because instances of it can be used in
different threads even if these instances point to the same object.
boost::shared_ptr doesn't malfunction if two threads manipulate these
instances at the same time.

You seem to be talking about thread-safety of the shared object
itself, rather than the thread-safety of boost::shared_ptr. That's a
completely different issue. Whether the shared object is thread-safe is,
of course, up to the object. Why should boost::shared_ptr have anything
to do with that? That's exactly as silly as saying that if the shared
object contains a pointer to dynamically allocated memory, it's
boost::shared_ptr's duty to free that memory as well, rather than the
object's duty.

Juha Nieminen

unread,
Aug 21, 2009, 10:45:32 AM8/21/09
to
Sam wrote:
>> - Non-intrusive. We want to be able to use it for existing object types
>> (and even basic types if so desired).
>
> Not a showstopper. Can be used with existing object types, and basic
> types, by subclassing them.

Care to show me how you subclass a basic type?

>> - Works with incomplete types (the only place where the type must be
>> complete is when constructing the smart pointer).
>
> Done.

Care to show me how you make an intrusive smart pointer work with
incomplete types?

For example, when the intrusive smart pointer needs to increment the
reference counter, how do you do it when the object type is incomplete?

> * The shared pointer becomes just a native pointer.

It's not like non-intrusive reference-counting smart pointers cannot
be made the size of one single pointer.

Chris M. Thomasson

unread,
Aug 21, 2009, 10:59:12 AM8/21/09
to
"Juha Nieminen" <nos...@thanks.invalid> wrote in message
news:Hiyjm.60$lD3...@read4.inet.fi...

> Chris M. Thomasson wrote:
>> FWIW, Boost shared_ptr only provides basic/normal thread-safety. It does
>> NOT provide strong thread-safety in any way shape or form.
>
> I don't even understand the difference between "basic/normal" and
> "strong".

Please read here:

http://www.boost.org/doc/libs/1_39_0/libs/smart_ptr/shared_ptr.htm#ThreadSafety


Please take note of the first word in the following sentence:

"Different shared_ptr instances can be "written to" (accessed using mutable
operations such as operator= or reset) simultaneously by multiple threads
(even when these instances are copies, and share the same reference count
underneath.)"

See? The SAME instance of a shared pointer CANNOT be written to by multiple
threads. The same instance of a shared pointer cannot be written to by
thread A, and simultaneously read from by thread B. This describes
a limitation of the basic/normal thread-safety level. On the other hand, this is
perfectly legal with a strong thread-safety level.

Take note of example 5:

//--- Example 5 ---

// thread A
p3.reset(new int(1));

// thread B
p3.reset(new int(2)); // undefined, multiple writes


If shared_ptr was strongly thread-safe, then example 5 works fine. In fact,
if it honored strong thread-safety, then examples 3-5 would also work fine.
I dare you to try and get the following example pseudo-code to work without
using mutual exclusion:


_______________________________________________________________
static shared_ptr<foo> global_foo;


void writer_threads() {
for (;;) {
shared_ptr<foo> local_foo(new foo);
global_foo = local_foo;
}
}


void reader_threads() {
for (;;) {
shared_ptr<foo> local_foo(global_foo);
local_foo->read_only_operation();
}
}
_______________________________________________________________

> boost::shared_ptr is thread-safe because instances of it can be used in
> different threads even if these instances point to the same object.
> boost::shared_ptr doesn't malfunction if two threads manipulate these
> instances at the same time.

NO! See, a single instance of boost::shared_ptr CANNOT be simultaneously
written to (e.g., operator = or reset()) by more than one thread. A single
instance of boost::shared_ptr CANNOT be written to and read from by more
than one thread simultaneously. This is due to the fact that shared_ptr is
not strongly thread-safe.


> You seem to be talking about thread-safety of the shared object
> itself, rather than the thread-safety of boost::shared_ptr.

I am not writing about that in any way, shape or form.


> That's a completely different issue.

Agreed.


> Whether the shared object is thread-safe is,
> of course, up to the object. Why should boost::shared_ptr have anything
> to do with that?

It should not have anything to do with that; period.


> That's exactly as silly as saying that if the shared
> object contains a pointer to dynamically allocated memory, it's
> boost::shared_ptr's duty to free that memory as well, rather than the
> object's duty.

Agreed, that would be silly!

Chris M. Thomasson

unread,
Aug 21, 2009, 11:05:40 AM8/21/09
to
"Juha Nieminen" <nos...@thanks.invalid> wrote in message
news:gqyjm.62$lD3...@read4.inet.fi...

Do you mean that the overhead of a non-intrusive reference-counting smart
pointer will be sizeof(void*)? Are you stealing some bits in the pointer to
keep the reference count or something? I must be totally misunderstanding
you. I can think of a way in which the counters are located in a static
array and objects are queued for destruction when that count goes to zero.
But that might mean non-deterministic destruction properties.

Chris M. Thomasson

unread,
Aug 21, 2009, 11:07:14 AM8/21/09
to
"Chris M. Thomasson" <n...@spam.invalid> wrote in message
news:h6m5va$2eja$1...@news.ett.com.ua...

> "Pete Becker" <pe...@versatilecoding.com> wrote in message
> news:TY2dnZJ975L2CxPX...@giganews.com...
>> Chris M. Thomasson wrote:
>>>
>>> In a multiple processor system, avoiding mutations to the reference
>>> count does give you significant gains indeed.
>>
>> Um, there's another qualifier that got left out:
>>
>> In a multiple processor system, *when shared_ptr objects
>> are shared between threads*, avoiding mutations to the
>> reference count does give you significant gains indeed.
>
> Please correct me if I am wrong, but I believe that shared_ptr will be
> using membars and atomic RMW on multi-processor systems regardless if they
> are shared between threads or not.

`BOOST_SP_DISABLE_THREADS' aside for a moment.


[...]

Pete Becker

unread,
Aug 21, 2009, 11:52:42 AM8/21/09
to
Chris M. Thomasson wrote:
> "Pete Becker" <pe...@versatilecoding.com> wrote in message
> news:TY2dnZJ975L2CxPX...@giganews.com...
>> Chris M. Thomasson wrote:
>>>
>>> In a multiple processor system, avoiding mutations to the reference
>>> count does give you significant gains indeed.
>>
>> Um, there's another qualifier that got left out:
>>
>> In a multiple processor system, *when shared_ptr objects
>> are shared between threads*, avoiding mutations to the
>> reference count does give you significant gains indeed.
>
> Please correct me if I am wrong, but I believe that shared_ptr will be
> using membars and atomic RMW on multi-processor systems regardless if
> they are shared between threads or not.

Yes, you're right. Sorry: too early in the morning.

> This can still have negative
> effects. One simple example, think of false sharing in the reference
> count. I do not believe that Boost eliminates false sharing between
> reference counts. In other words, I don't think that Boost
> implementation pads the reference count object to the size of a L2 cache
> line, and I don't think they align said refcount object in memory on a
> L2 cache line boundary.
>
>> And therein lies a great debate about whether shared_ptr should be
>> sharable between threads.
>
> In other words:
>
> therein lies a great debate about whether shared_ptr should be
> thread-safe at all right? Or am I totally misunderstanding you?
>

You're understanding me correctly. There's much to be said for not
sacrificing performance here.

Pete Becker

unread,
Aug 21, 2009, 11:53:54 AM8/21/09
to

I hate when I do that.

> I retract my first
> word ;-)
>

<g>

Howard Hinnant

unread,
Aug 21, 2009, 12:12:33 PM8/21/09
to
Replying to no one in particular:

For those concerned about needing two heap allocations to create a
shared_ptr, you might want to check out boost:: (soon to be std::)
make_shared:

struct A
{
A(int i, char c);
...
};

shared_ptr<A> p = make_shared<A>(i, c);

The A and the refcount are allocated within one heap allocation (just
like an intrusive implementation). If you still need more control
than that (e.g. allocate out of static buffer) there is:

shared_ptr<A> p = allocate_shared<A>(my_allocator<A>(), i, c);

The memory footprint is still larger than most homegrown reference
counted implementations, in order to support features such as weak_ptr
(to break cyclic ownership). But like earlier posters have implied;
shared_ptr is a good tool for when you need to share ownership. It is
not a "silver bullet" one should blindly substitute every raw pointer
for. There is no substitute for thoughtful design. Nor would I even
call shared_ptr the last reference counted pointer that should ever be
crafted. For special purpose applications, one can always create a
custom tool that will outperform a tool built for general use.

shared_ptr has (at least) 2 things going for it:

1. It is soon to be standard (std::shared_ptr). It will be the *one*
reference counted pointer people will become familiar with to share
ownership, especially across shared library boundaries. I.e. the fact
that it only has one flavor (one template parameter) is a strength.
You can customize the innards of your shared_ptr<A> however you like
(special allocators, special deallocators, allocate in one go, or in
two), but all your clients sees is shared_ptr<A>. They don't have to
know how you created this animal, or even how to destroy it. It will
be to your benefit to know how to construct and use one.

2. Reference counted pointers are notoriously difficult to get
right. They can look right for years before it gets used in a way
that will bite you. shared_ptr has this experience behind it. It is
fully vetted, and ready for you to use. I'm writing this as someone
who has written slightly buggy reference counted pointers in the past,
and as someone who has implemented the full C++0x shared_ptr spec.

All that being said, I should stress that I find unique-ownership
smart pointers just as useful (if not more so) than shared-ownership
smart pointers. They are a pain to implement and use correctly in C++
today, but that will change in C++0x (shameless plug for
std::unique_ptr - in a C++0x context - see Ion's boost intrusive
library for a high quality preview you can use today).

-Howard

Richard

unread,
Aug 21, 2009, 1:53:04 PM8/21/09
to
[Please do not mail me a copy of your followup]

Thanks for that very informative reply, Howard!
--
"The Direct3D Graphics Pipeline" -- DirectX 9 draft available for download
<http://legalizeadulthood.wordpress.com/the-direct3d-graphics-pipeline/>

Legalize Adulthood! <http://legalizeadulthood.wordpress.com>

Juha Nieminen

unread,
Aug 21, 2009, 2:23:26 PM8/21/09
to
Chris M. Thomasson wrote:
>> It's not like non-intrusive reference-counting smart pointers cannot
>> be made the size of one single pointer.
>
> Do you mean that the overhead of a non-intrusive reference-counting
> smart pointer will be sizeof(void*)?

I'm saying that sizeof(NonintrusiveSmartPointer) == sizeof(void*),
which is what the original poster seemed to be concerned about.

Of course the reference counter will still take its own memory, but it
will not make instances of the non-intrusive smart pointer any larger
than one single raw pointer.

> Are you stealing some bits in the
> pointer to keep the reference count or something?

No.

> I must be totally
> misunderstanding you. I can think of a way in which the counters are
> located in a static array and objects are queued for destruction when
> that count goes to zero. But that might mean non-deterministic
> destruction properties.

The solution is much simpler than that. Rather than storing the
pointer to the managed object in the smart pointer, store it alongside
the reference counter. The smart pointer will only have a pointer to
this structure as member.

Yes, there will be an additional level of indirection when accessing
the managed object, but at least the size of the smart pointer will be
the size of one raw pointer. This might in some cases be advantageous if
the level of sharing is significant.

Chris M. Thomasson

unread,
Aug 21, 2009, 2:35:08 PM8/21/09
to
"Juha Nieminen" <nos...@thanks.invalid> wrote in message
news:yCBjm.148$lD3...@read4.inet.fi...

Okay. I totally understand what you're getting at. There are some more
advantages when you do it this way. For instance, one can atomically mutate
the pointer to the underlying reference count object using normal word-sized
atomic operations. IIRC from a past conversation with
David Abrahams and Peter Dimov, Boost does not allow one to do this because
its size is greater than sizeof(void*). You have to use double-width atomic
operations which are not portable at all.

Alf P. Steinbach

unread,
Aug 21, 2009, 2:47:02 PM8/21/09
to
* Chris M. Thomasson:

Actually using the pointer is the most frequent operation, where an extra
indirection can be costly.


Cheers,

- Alf

Chris M. Thomasson

unread,
Aug 21, 2009, 3:06:14 PM8/21/09
to
"Alf P. Steinbach" <al...@start.no> wrote in message
news:h6mq39$thh$1...@news.eternal-september.org...
>* Chris M. Thomasson:
[...]

>> Okay. I totally understand what you're getting at. There are some more
>> advantages when you do it this way. For instance, one can atomically
>> mutate the pointer to the underlying reference count object using normal
>> word-sized atomic operations. IIRC from a past conversation with
>> David Abrahams and Peter Dimov, Boost does not allow one to do this
>> because it's size is greater than sizeof(void*). You have to use
>> double-width atomic operations which are not portable at all.
>
> Actually using the pointer is the most frequent operation, where an extra
> indirection can be costly.

Agreed. However, everything has its tradeoffs...

;^)

Chris M. Thomasson

unread,
Aug 21, 2009, 3:15:35 PM8/21/09
to
"Howard Hinnant" <howard....@gmail.com> wrote in message
news:85b57840-2047-4021...@k19g2000yqn.googlegroups.com...

> Replying to no one in particular:
>
> For those concerned about needing two heap allocations to create a
> shared_ptr, you might want to check out boost:: (soon to be std::)
> make_shared:
>
> struct A
> {
> A(int i, char c);
> ...
> };
>
> shared_ptr<A> p = make_shared<A>(i, c);
>
> The A and the refcount are allocated within one heap allocation (just
> like an intrusive implementation). If you still need more control
> than that (e.g. allocate out of static buffer) there is:
>
> shared_ptr<A> p = allocate_shared<A>(my_allocator<A>(), i, c);

That's pretty cool. I did not know about it because I am not a user of
Boost.


> The memory footprint is still larger than most homegrown reference
> counted implementations, in order to support features such as weak_ptr
> (to break cyclic ownership). But like earlier posters have implied;
> shared_ptr is a good tool for when you need to share ownership. It is
> not a "silver bullet" one should blindly substitute every raw pointer
> for. There is no substitute for thoughtful design. Nor would I even
> call shared_ptr the last reference counted pointer that should ever be
> crafted. For special purpose applications, one can always create a
> custom tool that will outperform a tool built for general use.

Agreed.


> shared_ptr has (at least) 2 things going for it:
>
> 1. It is soon to be standard (std::shared_ptr). It will be the *one*
> reference counted pointer people will become familiar with to share
> ownership, especially across shared library boundaries. I.e. the fact
> that it only has one flavor (one template parameter) is a strength.
> You can customize the innards of your shared_ptr<A> however you like
> (special allocators, special deallocators, allocate in one go, or in
> two), but all your clients sees is shared_ptr<A>. They don't have to
> know how you created this animal, or even how to destroy it. It will
> be to your benefit to know how to construct and use one.
>
> 2. Reference counted pointers are notoriously difficult to get
> right. They can look right for years before it gets used in a way
> that will bite you. shared_ptr has this experience behind it. It is
> fully vetted, and ready for you to use. I'm writing this as someone
> who has written slightly buggy reference counted pointers in the past,
> and as someone who has implemented the full C++0x shared_ptr spec.

Agreed.


> All that being said, I should stress that I find unique-ownership
> smart pointers just as useful (if not more so) than shared-ownership
> smart pointers.

Agreed.

Pete Becker

unread,
Aug 21, 2009, 3:27:11 PM8/21/09
to
Juha Nieminen wrote:
>
> The solution is much simpler than that. Rather than storing the
> pointer to the managed object in the smart pointer, store it alongside
> the reference counter. The smart pointer will only have a pointer to
> this structure as member.
>
> Yes, there will be an additional level of indirection when accessing
> the managed object,

An additional level of indirection and a (possibly expensive) conversion.

struct B { };
struct D : virtual B { };

shared_ptr<D> ptr(new D);
shared_ptr<B> ptr2(ptr);

With shared_ptr as it's typically implemented, accesses through ptr2
just go through the pointer contained in the ptr2 object. If the ptr2
object doesn't hold a pointer, then you have to go to the control block,
pick up the stored pointer, and convert it to a B*. That conversion
typically involves two more dereferences.

Thomas J. Gritzan

unread,
Aug 21, 2009, 4:23:24 PM8/21/09
to
joseph cook schrieb:
> On Aug 20, 7:23 pm, Pete Becker <p...@versatilecoding.com> wrote:
>> joseph cook wrote:
>>> In a real application, maybe I would create the object once, and
>>> destruct it once. In the course of my application, I could have many
>>> pointers to this data. I want to delete the data when finally no one
>>> is using it (shared_ptr<>). If I were using plain pointers, I would
>>> always have one heap allocation, period. It wouldn't depend on how
>>> many intermediate objects are pointing to this data (sharing it).
>> Even if it's three allocations on initial creation, that isn't changed
>> when you make additional copies of that object. You only have three heap
>> allocations, period. Regardless of how many intermediate objects are
>> sharing it.
>
> OK, my intent isn't to beat up boost::shared_ptr because I use it
> frequently, and find it very useful. Unfortunately, I could use it a
> lot more if it didn't utilize heap allocations when being constructed,
> or I could control those heap allocations better. Probably I can and
> just am not aware of how.
>
> I was mistaken about the two heap allocations. Perhaps this was true
> in an older boost version?
>
> What I meant about the intermediate objects is this. If I had a
> singleton as follows, which self-destructs as soon as all users of it
> go out of scope:
>
> template<typename T>
> class Single
> {
> static shared_ptr<T> instance()
> {
> if(!m_instance)
> {
> m_instance = shared_ptr<T>(new T);
> return m_instance;
> }

Move the "return" here.

> }
> static shared_ptr<T> m_instance;
> };
>
> Every time I call "instance()", a shared_ptr is being created (slow
> heap allocation).

Huh? The shared_ptr is copied but there's no heap allocation. It's just
the reference count that is incremented. There's only one heap
allocation for the shared_ptr state when you create the first shared_ptr
from a raw pointer.
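A quick way to see this is a small sketch (shown with std::shared_ptr, whose copy semantics match boost::shared_ptr on this point; `use_count()` reports the shared reference count):

```cpp
#include <cassert>
#include <memory>

struct Widget { int value; };

// Copying a shared_ptr only bumps the shared reference count; the one
// heap allocation for the bookkeeping state happened when p1 was
// constructed from the raw pointer.
inline long copy_count_demo() {
    std::shared_ptr<Widget> p1(new Widget());  // one bookkeeping allocation
    std::shared_ptr<Widget> p2(p1);            // no allocation, count -> 2
    std::shared_ptr<Widget> p3(p2);            // no allocation, count -> 3
    return p1.use_count();
}
```

Only the construction from the raw pointer allocates; the two copies merely increment the count.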

> If instead I had used plain pointers:
> template<typename T>
> class Single
> {
> static T* instance()
> {
> if(!m_instance)
> {
> m_instance = new T;
> }
> }
> static T* m_instance; //initialize to 0
> };
>
> I would have the initial allocation, but could call "instance" an
> unlimited amount of times without additional allocations. (I'm
> obviously not getting the reference counting either here)

I don't see a reason for a shared_ptr for a singleton either.
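For reference, a minimal sketch of the corrected singleton (std::shared_ptr shown; the only point is where the "return" goes, and that repeated calls don't allocate):

```cpp
#include <cassert>
#include <memory>

// Corrected lazy-initializing singleton: the return sits *after* the
// initialization, so every call yields the instance. Copying the
// returned shared_ptr does not allocate; it only bumps the count.
template <typename T>
class Single {
public:
    static std::shared_ptr<T> instance() {
        if (!m_instance)
            m_instance = std::shared_ptr<T>(new T());
        return m_instance;  // the "return" belongs here
    }
private:
    static std::shared_ptr<T> m_instance;
};

template <typename T>
std::shared_ptr<T> Single<T>::m_instance;

inline long singleton_use_count_demo() {
    std::shared_ptr<int> a = Single<int>::instance();
    std::shared_ptr<int> b = Single<int>::instance();
    return a.use_count();  // m_instance, a and b share ownership: 3
}
```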

--
Thomas

Sam

unread,
Aug 21, 2009, 6:11:50 PM8/21/09
to
Christof Donat writes:

> Hi,
>
>> There is one allocation off the heap when you create an object. There is
>> one allocation off the heap when you create a boost::shared_ptr object
>> to manage that object.
>
> No, the shared pointer object will usually be allocated on the stack.
> boost::shared_ptr will allocate an intermediate object with the raw pointer

That's exactly what I said: boost::shared_ptr will allocate an object off
the heap, for storing the reference count.


Sam

unread,
Aug 21, 2009, 6:17:17 PM8/21/09
to
Christof Donat writes:

> Hi,
>
>> This problem has been solved already. The reference count simply needs to
>> be a virtual superclass, and all reference-counted objects are derived
>> from the superclass.
>
> How does this work for boost::shared_ptr<int>? Sorry, no solution.

You stopped reading too soon. The rest of my message explained it.

>>> - Non-intrusive. We want to be able to use it for existing object types
>>> (and even basic types if so desired).
>>
>> Not a showstopper. Can be used with existing object types, and basic
>> types, by subclassing them.
>
> boost::shared_ptr<std::string> sp = new std::string();
>
> versus
>
> boost::shared_ptr<std::string> sp =
> new shared_ptrWrapper<std::string>(new std::string());
>
> Ah, here we have the two allocations again. Just that the user has to do it
> himself.

I have no idea what you are referring to here. You are replying to something
other than what I wrote.

>> * The shared pointer becomes just a native pointer.
>
> No, it has to have some operators to make sure, the memory will be released
> as soon as the reference count reaches 0.

That's not what I meant. In my design, the shared pointer is a single
pointer, and gets handled, by the compiled code, as a single pointer value,
rather than a class with two pointers, which is boost::shared_ptr.

Invoking methods, as part of the implementation, is orthogonal.

>> * The additional memory allocation gets eliminated.
>
> Not really, see above.

Read the message again. In my design, which I have implemented, there is no
additional memory allocation, in order to implement reference-counted
objects, a.k.a. boost::shared_ptr. One memory allocation for the object.
That's it. The pointer to the object is a class containing a single pointer,
which is not allocated from the heap at all.

Try reading a message, before replying to it, next time.

> The second allocation has just moved from the library
> into the users code.

No, it didn't.

Keith H Duggar

unread,
Aug 21, 2009, 9:23:42 PM8/21/09
to
On Aug 21, 7:22 am, James Kanze <james.ka...@gmail.com> wrote:
> On Aug 21, 1:08 am, Keith H Duggar <dug...@alum.mit.edu> wrote:

> > On Aug 20, 5:56 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
> > > Keith H Duggar wrote:
> > > > Boost shared_ptr is not "thread safe" by any standard that
> > > > is usually meant by the term:
> > > >http://www.boost.org/doc/libs/1_39_0/libs/smart_ptr/shared_ptr.htm#Th...

> > > First you claim that it's not thread-safe, and then you
> > > point to the documentation which says that it is.
> > No, I pointed to a document that says it is "as thread safe as
> > ..." not that it is "thread safe".

> > > > ie "the same level of thread safety as built-in types" means
> > > > not "thread safe".
> > > You are confusing two different types of thread-safety.
> > No, you are fundamentally missing my point. Obviously I know
> > the level of "thread safety" that boost::smart_ptr provides
> > (since I posted the reference) so you should have at least
> > thought for a a moment about what my point is and at least
> > recognize the point (whether you agree or not) before
> > launching into an elaboration of what the document already
> > explains.
> > My point (since you missed it the first time) is that "as
> > thread safe as a built-in type" is not, in my opinion, what
> > one commonly thinks of when someone says a construct is
> > "thread safe".
>
> So what do you think one "commonly thinks of" when one says a
> construct is "thread safe".

I mean that the entire type interface is "as thread-safe as a
POSIX-thread-safe function".

> As far as I can tell,
> boost::smart_ptr meets the definitions of Posix, and offers
> the maximum thread safety which one can reasonably expect.

I disagree that boost::smart_ptr meets the definition of POSIX
"thread-safe". Why? Because only a subset of, not all of, the
interface can be safely invoked concurrently by multiple threads
(example reset).

> > So when you posted this
> > > Naturally it should have the same basic properties:
> > > ...
> > > - Thread-safe.
> > it can easily be misunderstood or deceive someone without the
> > qualification "as a built-in type" which the boost document is
> > careful to include. To put it simply
> > "as thread-safe as a built-in type" != "thread safe"
>
> In what way?
>
> The basic definition of "thread safe", at least as I see it used
> by the experts in the field, more or less corresponds to the
> Posix definition: there is no problem using different instances
> of the object in different threads, regardless of what you're
> doing with them, there is no problem using the same instance in
> different threads as long as none of the threads are modifying
> the instance (from a logical point of view). Anything else

First off (please correct me if I'm wrong) POSIX only defines
"thread-safe" for functions and that definition is:

3.396 Thread-Safe
A function that may be safely invoked concurrently by multiple
threads. Each function defined in the System Interfaces volume
of IEEE Std 1003.1-2001 is thread-safe unless explicitly stated
otherwise. Examples are any "pure" function, a function which
holds a mutex locked while it is accessing static storage, or
objects shared among threads.

So first the POSIX definition must be extended to the notion of
type (ie class in C++). One way to do that is to take the simple
and common view of all member functions being free functions that
take a hidden this pointer. Then we apply the POSIX definition to
that set of functions and if *every* function (minus destructor)
is POSIX thread-safe then we can say the class is "thread-safe".
boost::shared_ptr fails this criterion; for example, "reset" fails
when supplied with identical "this" pointers.

In other words, it is what N2410

http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2007/n2410.html

proposed to call "strong thread-safe", as Chris MT has also called it
in some other posts. However, note that since "strong thread-safe"
is simply the most natural extension of POSIX "thread-safe" to
C++ types, then "thread-safe" without qualification should mean
"strong thread-safe" and that is consistent with your claim that
"the experts in the field, more or less corresponds to the Posix
definition". It's just that I don't know where you got your definition
of POSIX "thread-safe", because that's not what I recall from the
POSIX document.

Finally, I will note N2519

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2519.html

which claims that the level of thread-safe that boost::shared_ptr
provides is "not widely recognized or named in the literature" and
that is consistent with my experience as well.

KHD

James Kanze

unread,
Aug 22, 2009, 6:42:14 AM8/22/09
to
On Aug 22, 3:23 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Aug 21, 7:22 am, James Kanze <james.ka...@gmail.com> wrote:
> > On Aug 21, 1:08 am, Keith H Duggar <dug...@alum.mit.edu> wrote:

> > So what do you think one "commonly thinks of" when one says
> > a construct is "thread safe".

> I mean that the entire type interface is "as thread-safe as a
> POSIX-thread-safe function".

In other words (quoting the Posix standard): "A function that
may be safely invoked concurrently by multiple threads." All of
the member functions of boost::shared_ptr meet that requirement.

In this case, of course, we're not talking about just functions,
but about an in memory object. And the Posix definitions for
memory access say that "Applications shall ensure that access to
any memory location by more than one thread of control (threads
or processes) is restricted such that no thread of control can
read or modify a memory location while another thread of control
may be modifying it." Which is exactly what boost::shared_ptr
requires.

> > As far as I can tell, boost::smart_ptr meets the definitions
> > of Posix, and offers the maximum thread safety which one can
> > reasonably expect.

> I disagree that boost::smart_ptr meets the definition of POSIX
> "thread-safe". Why? Because only a subset of, not all of, the
> interface can be safely invoked concurrently by multiple
> threads (example reset).

I don't think so. Could you post a scenario where reset (or any
other function) fails when invoked from different threads. Or
for that matter, when any combination of functions fails when
called from different threads, providing the basic requirement
is met: if you're modifying the object (and reset() modifies
it), then if any other thread accesses the same object (the same
instance of boost::shared_ptr---client code shouldn't have to be
concerned with what different instances of the object share),
access must be synchronized. This is exactly the same
requirement as for any Posix function. (There are a very few IO
functions which give an even stronger guarantee, i.e. read and
write.) In other words, given:

boost::shared_ptr< T > p1( new T ) ;
boost::shared_ptr< T > p2( p1 ) ;

you should be able to do p1.reset() in a thread, regardless of
what other threads might be doing with p2, but if you do
p1.reset() in one thread, all accesses to p1 must be
synchronized if p1 is accessed from any other thread. If I
understand correctly, boost::shared_ptr guarantees this, and
that's all that Posix guarantees for its thread-safe functions.

> > In what way?

It also defines guarantees concerning "memory" accesses, see
above. These are perhaps even more relevant than the function
guarantees when an object is involved---a boost::shared_ptr,
after all, is a replacement for a raw pointer, not for some
function.

> So first the POSIX definition must be extended to the notion of
> type (ie class in C++). One way to do that is to take the simple
> and common view of all member functions being free functions that
> take a hidden this pointer. Then we apply the POSIX definition to
> that set of functions and if *every* function (minus destructor)
> is POSIX thread-safe then we can say the class is "thread-safe".
> boost::shared_ptr fails this criterion; for example, "reset" fails
> when supplied with identical "this" pointers.

And localtime_r fails when supplied with identical buffers.
That's a foregone. Posix doesn't require thread-safe functions
to be able to be called on the same objects from different
threads. Because that's not reasonable; if you want an extreme
example, think of setjmp (which Posix defines as thread safe).

> In other words, it is what N2410

> http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2007/n2410.html

> proposed to call "strong thread-safe" also as Chris MT has
> called in some other posts. However, note that since "strong
> thread-safe" is simply the most natural extension of POSIX
> "thread-safe" to C++ types, then "thread-safe" without
> qualification should mean "strong thread-safe" and that is
> consistent with your claim that "the experts in the field,
> more or less corresponds to the Posix definition". It's just I
> don't know where you got your definition of POSIX
> "thread-safe" because that's not what I recall from the POSIX
> document?

If you'd read the document you cite, it points out quite
clearly that the so-called "strong thread-safety" is a very
naïve meaning for thread safety. As pointed out above, Posix
doesn't require it, and in fact, no expert that I know defines
thread-safety in that way.

> Finally, I will note N2519

> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2519.html

> which claims that the level of thread-safe that
> boost::shared_ptr provides is "not widely recognized or named
> in the literature" and that is consistent with my experience
> as well.

It's curious that in the sentence immediately following this
statement, he cites a document that does "recognize" this level
of thread safety. And if this level of thread safety has no
special name, it's probably because it is what is generally
assumed by "thread-safety" by the experts in the field; I've
never seen any article by an expert in threading that spoke of
"strong thread-safety" other than to explain that this is not
what is meant by "thread-safety".

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

unread,
Aug 22, 2009, 6:48:49 AM8/22/09
to
On Aug 21, 4:37 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
> Chris M. Thomasson wrote:
> > FWIW, Boost shared_ptr only provides basic/normal
> > thread-safety. It does NOT provide strong thread-safety in
> > any way shape or form.

> I don't even understand the difference between "basic/normal"
> and "strong".

"basic/normal" thread-safety is what is normally meant by thread
safety (e.g. Posix guarantees). "Strong" thread safety is
something more: it means that the object state can be modified
(from the client point of view) from several threads
simultaneously. It can be useful in a few specific cases
(message queues, etc.), but is expensive, and not generally
useful enough to warrant the expense.

As you might imagine, when used without further qualification,
"thread-safety" means normal thread-safety.

> boost::shared_ptr is thread-safe because instance of it can be
> used in different threads even if these instances point to the
> same object. boost::shared_ptr doesn't malfunction if two
> threads manipulate these instances at the same time.

Exactly. Also, you can use the same instance in many threads
provided no thread mutates it (again, from a client point of
view).

James Kanze

unread,
Aug 22, 2009, 6:57:56 AM8/22/09
to
On Aug 21, 5:05 pm, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> "Juha Nieminen" <nos...@thanks.invalid> wrote in message
> > It's not like non-intrusive reference-counting smart
> > pointers cannot be made the size of one single pointer.

> Do you mean that the overhead of a non-intrusive
> reference-counting smart pointer will be sizeof(void*)?

Yes.

> Are you stealing some bits in the pointer to keep the
> reference count or something?

No, you introduce an additional level of indirection. (Isn't
that the solution for every problem:-)?) Something like:

template< typename T >
class SharedPtr
{
struct Impl
{
T* ptr ;
int cnt ;

Impl( T* ptr ) : ptr( ptr ), cnt( 0) {}
} ;
Impl* myImpl ;
public:
SharedPtr( T* newedPtr )
: myImpl( new Impl( newedPtr ) )
{
++ myImpl->cnt ;
}
// ...
} ;

(It's obviously a bit more complicated than that if you want to
support conversions of SharedPtr< Derived > to SharedPtr< Base
>, but you get the idea.)

> I must be totally misunderstanding you. I can think of a way
> in which the counters are located in a static array and
> objects are queued for destruction when that count goes to
> zero. But that might mean non-deterministic destruction
> properties.

You could always keep the counters in map, indexed by the object
address. I rather think that the performance of copying a
pointer would be unacceptable in such cases (since it implies a
map lookup in order to increment), but I've not actually
measured it to be sure---perhaps with a very clever
implementation of the map.
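To make the lookup cost concrete, here is a rough, single-threaded sketch of that map-keyed variant (the class name `MapCountedPtr` and its interface are illustrative, not any real library's API):

```cpp
#include <cassert>
#include <map>

// Illustrative sketch only: the counter lives in a global map keyed by
// object address, so the handle itself is one raw pointer wide. Every
// copy and destruction pays a map lookup; nothing here is thread-safe.
template <typename T>
class MapCountedPtr {
    static std::map<const void*, int>& counts() {
        static std::map<const void*, int> m;
        return m;
    }
    T* ptr_;
    MapCountedPtr& operator=(const MapCountedPtr&);  // omitted in this sketch
public:
    explicit MapCountedPtr(T* p) : ptr_(p) { if (p) counts()[p] = 1; }
    MapCountedPtr(const MapCountedPtr& o) : ptr_(o.ptr_) {
        if (ptr_) ++counts()[ptr_];                  // lookup on every copy
    }
    ~MapCountedPtr() {
        if (ptr_ && --counts()[ptr_] == 0) {         // lookup on every destroy
            counts().erase(ptr_);
            delete ptr_;
        }
    }
    T& operator*() const { return *ptr_; }
    int use_count() const { return ptr_ ? counts()[ptr_] : 0; }
};

inline int map_counted_demo() {
    MapCountedPtr<int> a(new int(7));
    MapCountedPtr<int> b(a);
    return a.use_count();  // two live handles -> 2
}
```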

Chris M. Thomasson

unread,
Aug 22, 2009, 7:37:15 AM8/22/09
to
"Juha Nieminen" <nos...@thanks.invalid> wrote in message
news:gqyjm.62$lD3...@read4.inet.fi...

> Sam wrote:
>>> - Non-intrusive. We want to be able to use it for existing object types
>>> (and even basic types if so desired).
>>
>> Not a showstopper. Can be used with existing object types, and basic
>> types, by subclassing them.
>
> Care to show me how you subclass a basic type?
>
>>> - Works with incomplete types (the only place where the type must be
>>> complete is when constructing the smart pointer).
>>
>> Done.
>
> Care to show me how you make an intrusive smart pointer work with
> incomplete types?
>
> For example, when the intrusive smart pointer needs to increment the
> reference counter, how do you do it when the object type is incomplete?

Perhaps something along the lines of:
____________________________________________________________________
struct ref_base {
unsigned m_count;
virtual ~ref_base() = 0;
};


ref_base::~ref_base() {}


template<typename T>
class ptr {
ref_base* m_ptr;

void prv_init() const {
if (m_ptr) m_ptr->m_count = 1;
}

void prv_inc() const {
if (m_ptr) ++m_ptr->m_count;
}

void prv_dec() const {
if (m_ptr) {
if (! --m_ptr->m_count) {
delete m_ptr;
}
}
}

public:
ptr(T* ptr = NULL) : m_ptr(static_cast<ref_base*>(ptr)) {
prv_init();
}

ptr(ptr const& rhs) : m_ptr(rhs.m_ptr) {
prv_inc();
}

~ptr() {
prv_dec();
}

ptr& operator = (ptr const& rhs) {
rhs.prv_inc();
prv_dec();
m_ptr = rhs.m_ptr;
return *this;
}

T* operator ->() {
return static_cast<T*>(m_ptr);
}
};
____________________________________________________________________

Chris M. Thomasson

unread,
Aug 22, 2009, 7:51:01 AM8/22/09
to
"James Kanze" <james...@gmail.com> wrote in message
news:88b62bf5-dd6e-4db9...@k30g2000yqf.googlegroups.com...

On Aug 21, 5:05 pm, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> > "Juha Nieminen" <nos...@thanks.invalid> wrote in message
> > > It's not like non-intrusive reference-counting smart
> > > pointers cannot be made the size of one single pointer.

> > Do you mean that the overhead of a non-intrusive
> > reference-counting smart pointer will be sizeof(void*)?

> Yes.

> > Are you stealing some bits in the pointer to keep the
> > reference count or something?

> No, you introduce an additional level of indirection. (Isn't
> that the solution for every problem:-)?)

:^D


> Something like:

> template< typename T >
> class SharedPtr
> {
> struct Impl
> {
> T* ptr ;
> int cnt ;
>
> Impl( T* ptr ) : ptr( ptr ), cnt( 0) {}
> } ;
> Impl* myImpl ;
> public:
> SharedPtr( T* newedPtr )
> : myImpl( new Impl( newedPtr ) )
> {
> ++ myImpl->cnt ;
> }
> // ...
> } ;

Okay. See, when I used the term overhead, I meant the total overhead
including the pointer to the private counter object `myImpl' and the amount
of memory it takes to create said object. So, that's 1 pointer + 1 pointer +
1 int. Perhaps I am in error thinking about it that way.

[...]

Sam

unread,
Aug 22, 2009, 8:59:04 AM8/22/09
to
Chris M. Thomasson writes:

> "Juha Nieminen" <nos...@thanks.invalid> wrote in message
> news:gqyjm.62$lD3...@read4.inet.fi...
>> Sam wrote:
>>>> - Non-intrusive. We want to be able to use it for existing object types
>>>> (and even basic types if so desired).
>>>
>>> Not a showstopper. Can be used with existing object types, and basic
>>> types, by subclassing them.
>>
>> Care to show me how you subclass a basic type?
>>
>>>> - Works with incomplete types (the only place where the type must be
>>>> complete is when constructing the smart pointer).
>>>
>>> Done.
>>
>> Care to show me how you make an intrusive smart pointer work with
>> incomplete types?
>>
>> For example, when the intrusive smart pointer needs to increment the
>> reference counter, how do you do it when the object type is incomplete?
>
> Perhaps something along the lines of:
>
> ____________________________________________________________________
> struct ref_base {
> unsigned m_count;
> virtual ~ref_base() = 0;
> };
>

> [ … ]

That's the general idea -- all reference-counted objects are then derived
from this superclass, virtually, and this is a rough outline of how I
implemented it, except that the virtual superclass's destructor is not
abstract, of course, nor does it need to be. The actual increment/decrement
operations are the same as with shared_ptr -- compiler-specific instructions
that compile down to atomic CPU operations, so that they are thread-safe.
There's also some code to deal with converting between references to
subclasses and superclasses, as well as support for weak references, and
destructor callbacks -- callbacks that are invoked when the object gets
destroyed after its last reference goes out of scope, and a couple of
subclasses of STL containers -- weak lists, weak maps, and weak multimaps --
which hold weak references that get automatically removed from the container
when the last strong reference to the weakly-referenced object goes out of
scope.

End result - what I believe is a much better implementation of
reference-counted objects. It does not require twice as many heap
allocations per object; just one for the object itself; also, the reference
class which contains just one native pointer, rather than two as is the case
with shared_ptr.

I think that shared_ptr could've been a much better implementation, only
with a little bit additional forethought.

Also, consider another major design flaw with shared_ptr: a class method has
no way of obtaining a reference to its own instance. Some method of class A
may want to create an instance of class B that holds a reference to the
instance of A that created it, and, say, return a reference to the newly
created instance of B. That seems to me like a reasonable, and quite common,
thing to do:

ref<B> A::method()
{
// create an instance of B

// B contains a reference to an instance of A, namely this object.

// return the initial reference to B
}

Except that shared_ptr forces you to pass the shared pointer as a parameter
to A::method. A::method has no other means of getting a reference to its own
object, in order to stuff it into B. The pointer to the reference counter is
contained in shared_ptr<A>. Invoking shared_ptr<A>->method() gives a plain
A::method() nothing that it can use. So, you're forced to add shared_ptr<A>
as a parameter to method(). Sloppy.

On the other hand, if A is derived from something like ref_base, A::method()
can simply derive a new reference from /this/, incrementing its own
reference count. A::method() does not need to get its own reference as a
parameter. The implementation of A::method() is further encapsulated. Clean,
and simple.

Pete Becker

unread,
Aug 22, 2009, 10:59:32 AM8/22/09
to
Sam wrote:

> The actual
> increment/decrement operations are the same as with shared_ptr --
> compiler-specific instructions that compile down to atomic CPU
> operations, so that they are thread-safe.

Atomic CPU operations are not automatically thread safe. Atomic CPU
operations won't tear values on a thread switch, but that's not enough.
You also have to ensure that changes made in one thread are visible to
all other threads, and that requires some form of memory barrier.
Otherwise you can end up with different values in different CPU caches.

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of

"The Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)
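To illustrate the point, here is a sketch of a refcount decrement that carries the needed ordering, using C++11 std::atomic (an assumption on my part; the poster's compiler-specific intrinsics may or may not give the same guarantees):

```cpp
#include <atomic>
#include <cassert>

// Sketch of a refcount decrement with the needed ordering (C++11).
// fetch_sub returns the previous value atomically, so exactly one
// thread observes the drop to zero and deletes; memory_order_acq_rel
// makes every earlier write to the object visible to that thread.
struct Counted {
    std::atomic<int> refs;
    Counted() : refs(1) {}
    virtual ~Counted() {}
};

inline void add_ref(Counted* p) {
    p->refs.fetch_add(1, std::memory_order_relaxed);  // increment can be relaxed
}

inline bool release(Counted* p) {  // returns true if this call deleted p
    if (p->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        delete p;
        return true;
    }
    return false;
}

inline bool release_demo() {
    Counted* c = new Counted();
    add_ref(c);                // 1 -> 2
    bool first = release(c);   // 2 -> 1, no delete
    bool second = release(c);  // 1 -> 0, deletes
    return !first && second;
}
```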

Juha Nieminen

unread,
Aug 22, 2009, 12:24:28 PM8/22/09
to
James Kanze wrote:
> You could always keep the counters in map, indexed by the object
> address. I rather think that the performance of copying a
> pointer would be unacceptable in such cases (since it implies a
> map lookup in order to increment), but I've not actually
> measured it to be sure---perhaps with a very clever
> implementation of the map.

There is also a third possibility, which is somewhere between an
intrusive and non-intrusive solution: Every time you allocate an object
to be managed, make the allocation larger by sizeof(size_t), and then
use that extra space for the "semi-intrusive" reference counter.

The advantage of this is that the smart pointer will behave basically
in the exact same way as an intrusive smart pointer, and the memory
consumption will be exactly the same.

This solution even has additional advantages over classic intrusive
smart pointers: The "semi-intrusive" smart pointer can support existing
types and even builtin types, and can work with incomplete types at no cost.

Of course there are some disadvantages over non-intrusive smart
pointers. For instance, you obviously cannot manage any object which has
been allocated in the regular way, without that extra space. Depending
on how you implement the allocation, a base class type smart pointer
might or might not support derived class objects.
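A rough sketch of this over-allocation scheme (the helper names `make_counted`/`release_counted` are made up for illustration; a production version would also have to respect the object's alignment, which this sketch only handles for types no more strictly aligned than std::size_t):

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Sketch of the "semi-intrusive" layout: over-allocate by one counter
// and placement-new the object right behind it. Single-threaded; no
// atomics on the count.
struct CountHeader { std::size_t refs; };

template <typename T>
T* make_counted() {
    void* raw = ::operator new(sizeof(CountHeader) + sizeof(T));
    CountHeader* h = static_cast<CountHeader*>(raw);
    h->refs = 1;
    return new (h + 1) T();  // object lives immediately after the header
}

template <typename T>
CountHeader* header_of(T* obj) {
    return reinterpret_cast<CountHeader*>(obj) - 1;
}

template <typename T>
void release_counted(T* obj) {
    CountHeader* h = header_of(obj);
    if (--h->refs == 0) {
        obj->~T();
        ::operator delete(h);
    }
}

inline std::size_t header_demo() {
    int* p = make_counted<int>();
    *p = 42;
    ++header_of(p)->refs;                           // simulate a second owner
    std::size_t while_shared = header_of(p)->refs;  // 2
    release_counted(p);
    release_counted(p);         // last owner: destroys and frees
    return while_shared;
}
```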

Juha Nieminen

unread,
Aug 22, 2009, 12:38:48 PM8/22/09
to
Sam wrote:
> That's the general idea -- all reference-counted objects are then
> derived from this superclass, virtually, and this is a rough outline of
> how I implemented it, except that the virtual superclass's destructor is
> not abstract, of course, nor does it need to be.

You do realize that casting from a virtual base class to the actual
object's type can incur a penalty which does not happen with regular
non-intrusive smart pointers?

> The actual
> increment/decrement operations are the same as with shared_ptr --
> compiler-specific instructions that compile down to atomic CPU
> operations, so that they are thread-safe.

Increments and decrements are in no way guaranteed to be atomic, and
in some architectures they may well not be. Even if they were, there's
still a huge mutual exclusion problem here:

if (! --m_ptr->m_count) {
delete m_ptr;
}

Guess what happens if another thread executes this same code in
between the decrement and the comparison to null in this thread, and the
counter happened to be 2 to begin with.

> I think that shared_ptr could've been a much better implementation, only
> with a little bit additional forethought.

Except that shared_ptr was designed to work with any existing type
(including builtin types) without the need to modify that type.

Sam

unread,
Aug 22, 2009, 1:39:39 PM8/22/09
to
Juha Nieminen writes:

> Sam wrote:
>> That's the general idea -- all reference-counted objects are then
>> derived from this superclass, virtually, and this is a rough outline of
>> how I implemented it, except that the virtual superclass's destructor is
>> not abstract, of course, nor does it need to be.
>
> You do realize that casting from a virtual base class to the actual
> object's type can incur a penalty which does not happen with regular
> non-intrusive smart pointers?

Casting from any superclass to a subclass incurs a penalty, even with
shared_ptr. The actual smart pointer implementation is not a factor.

There may be differences in the size of the penalty, but this does
not occur very often in practice.

>> The actual
>> increment/decrement operations are the same as with shared_ptr --
>> compiler-specific instructions that compile down to atomic CPU
>> operations, so that they are thread-safe.
>
> Increments and decrements are in no way guaranteed to be atomic, and
> in some architectures they may well not be. Even if they were, there's
> still a huge mutual exclusion problem here:
>
> if (! --m_ptr->m_count) {
> delete m_ptr;
> }

That, of course, is not how shared_ptr does it.

> Except that shared_ptr was designed to work with any existing type
> (including builtin types) without the need to modify that type.

Which, of course, is exactly what I stated in my first message in this
thread, and explained what the trade-offs are. Besides, you cannot assume
that you can arbitrarily attach a shared_ptr to some arbitrary instance of a
builtin or basic type. For all you know, it was allocated on the stack
rather than the heap; or whichever library allocated it on the heap might
decide to deallocate it, since it's a basic or builtin type that the
library owns, despite the fact that you have a shared_ptr pointing to it.

So, IMO, shared_ptr's ability to work with builtin types is somewhat
overvalued.

Pete Becker

unread,
Aug 22, 2009, 1:51:07 PM8/22/09
to
Juha Nieminen wrote:
>
> Increments and decrements are in no way guaranteed to be atomic, and
> in some architectures they may well not be. Even if they were, there's
> still a huge mutual exclusion problem here:
>
> if (! --m_ptr->m_count) {
> delete m_ptr;
> }
>
> Guess what happens if another thread executes this same code in
> between the decrement and the comparison to null in this thread, and the
> counter happened to be 2 to begin with.
>

If the decrement is atomic (not an atomic CPU instruction, but atomic in
the sense of not tearing and producing a result that's visible to all
threads that use the variable) then this works just fine. Of course, all
the other manipulations of this variable must also be similarly atomic.

SG

unread,
Aug 22, 2009, 2:24:52 PM8/22/09
to
Juha Nieminen wrote:
>   There is also a third possibility, which is somewhere between an
> intrusive and non-intrusive solution: Every time you allocate an object
> to be managed, make the allocation larger by sizeof(size_t), and then
> use that extra space for the "semi-intrusive" reference counter.
>
> The advantage of this is that the smart pointer will behave basically
> in the exact same way as an intrusive smart pointer, and the memory
> consumption will be exactly the same.

It's my understanding that the std::make_shared<T>(...) function
template does something very similar (as Howard Hinnant pointed out
already).

Sam wrote:
> Also, consider another major design flaw with shared_ptr: a class
> method has no way of obtaining a reference to its own instance.

I think that's what N2914 section 20.8.10.5 is about:

" Class template enable_shared_from_this

A class T can inherit from enable_shared_from_this<T> to
inherit the shared_from_this member functions that obtain
a shared_ptr instance pointing to *this. "


It looks like std::shared_ptr will be just as good as any intrusive
pointer w.r.t. allocation counts and "shared_from_this". A shared_ptr
implementation will probably store two pointers but I'm fine with
that. The flipside is that types don't have to derive from some
special base class for reference counting. This is IMHO a big
advantage.


Cheers!
SG

Juha Nieminen

unread,
Aug 22, 2009, 4:47:51 PM8/22/09
to
SG wrote:
> It's my understanding that the std::make_shared<T>(...) function
> template does something very similar (as Howard Hinnant pointed out
> already).

I think you mean boost::make_shared? Although I hope it will some day
be in the std namespace as well... :)

Juha Nieminen

unread,
Aug 22, 2009, 4:58:12 PM8/22/09
to
Sam wrote:
> Casting from any superclass to a subclass incurs a penalty

I don't think that's true. If the compiler can see both class
declarations and there is a simple inheritance relation between them
(with no virtual functions in the derived class), the pointer doesn't
have to be modified in any way and this decision can be done at compile
time. The cast will effectively not create any machine code whatsoever.

>> Except that shared_ptr was designed to work with any existing type
>> (including builtin types) without the need to modify that type.
>
> Which, of course, is exactly what I stated in my first message in this
> thread, and explained what the trade-offs are. Besides, you cannot
> assume that you can arbitrarly attach a shared_ptr to some arbitrary
> instance of a builtin or a basic type. For all you know, it was
> allocated on the stack, and not the heap, or whichever library allocated
> it on the heap, might decide to deallocate it, since it's a basic or a
> builtin type that the library owns it, despite the fact that you have a
> shared_ptr pointing to it.

If you want to use a shared_ptr which points to an object which must
not be destroyed by that shared_ptr, you can tell it. If the object must
be destroyed in some special way (eg. by using some custom allocator),
you can also tell it that.

Juha Nieminen

unread,
Aug 22, 2009, 5:04:30 PM8/22/09
to
Pete Becker wrote:
> Juha Nieminen wrote:
>>
>> Increments and decrements are in no way guaranteed to be atomic, and
>> in some architectures they may well not be. Even if they were, there's
>> still a huge mutual exclusion problem here:
>>
>> if (! --m_ptr->m_count) {
>> delete m_ptr;
>> }
>>
>> Guess what happens if another thread executes this same code in
>> between the decrement and the comparison to null in this thread, and the
>> counter happened to be 2 to begin with.
>>
>
> If the decrement is atomic (not an atomic CPU instruction, but atomic in
> the sense of not tearing and producing a result that's visible to all
> threads that use the variable) then this works just fine. Of course, all
> the other manipulations of this variable must also be similarly atomic.

I don't understand how that can work if the result of the decrement is
not immediately visible to all threads.

If, for example, the code waits for the next sequence point to
actually write the new value of the variable into shared RAM, you will
obviously have another problem: Two threads might see the counter as
having the value 2, they then may both decrement it to 1 at the same
time, after which both decide that the object should not be deleted,
after which both write the 1 into the shared RAM. Then they both go out
of scope and the object is never deleted, and thus leaked.

And as I said, if the decrement is immediately visible to all threads,
you end up having a mutual exclusion problem which may cause the object
to be deleted twice (and possibly other even more obscure problems). In
the best case scenario your program will simply crash.

Sam

unread,
Aug 22, 2009, 5:33:36 PM8/22/09
to
Juha Nieminen writes:

> Sam wrote:
>> Casting from any superclass to a subclass incurs a penalty
>
> I don't think that's true. If the compiler can see both class
> declarations and there is a simple inheritance relation between them
> (with no virtual functions in the derived class), the pointer doesn't

A dynamic_cast from a superclass to a subclass, a derived class, only works
if the superclass has, at least, a virtual destructor.

It's only when casting from a subclass to a superclass the compiler does not
need the vtable.

> have to be modified in any way and this decision can be done at compile
> time. The cast will effectively not create any machine code whatsoever.

If a compiler sees that you have a pointer to a superclass, and you ask for
a dynamic_cast to a subclass, the compiler still needs to check the object's
vtable to figure out whether or not the object is, indeed, the subclass.

> If you want to use a shared_ptr which points to an object which must
> not be destroyed by that shared_ptr, you can tell it.

I thought that the whole point of a shared_ptr is so that the referenced
object may be destroyed at the appropriate time. If you don't want the
object destroyed, you don't need a shared_ptr.


Chris M. Thomasson

unread,
Aug 22, 2009, 5:44:40 PM8/22/09
to
"Juha Nieminen" <nos...@thanks.invalid> wrote in message
news:y3Zjm.153$ci...@read4.inet.fi...


For a reference counting algorithm with basic thread-safety:
___________________________________________________________
void refcount_decrement()
{
MEMBAR_RELEASE();

if (ATOMIC_FAA(&m_ptr->m_count, -1) == 1)
{
MEMBAR_ACQUIRE();
delete m_ptr;
}
}
___________________________________________________________


is 100% perfectly safe. BTW, ATOMIC_FAA is Fetch-and-Add.

Chris M. Thomasson

unread,
Aug 22, 2009, 5:53:43 PM8/22/09
to
"Pete Becker" <pe...@versatilecoding.com> wrote in message
news:KsadnSFScpGWrQ3X...@giganews.com...

> Juha Nieminen wrote:
>>
>> Increments and decrements are in no way guaranteed to be atomic, and
>> in some architectures they may well not be. Even if they were, there's
>> still a huge mutual exclusion problem here:
>>
>> if (! --m_ptr->m_count) {
>> delete m_ptr;
>> }
>>
>> Guess what happens if another thread executes this same code in
>> between the decrement and the comparison to null in this thread, and the
>> counter happened to be 2 to begin with.
>>
>
> If the decrement is atomic (not an atomic CPU instruction, but atomic in
> the sense of not tearing and producing a result that's visible to all
> threads that use the variable) then this works just fine. Of course, all
> the other manipulations of this variable must also be similarly atomic.

You would also need to ensure that the read-modify-write sequence was a full
atomic operation. No compiler I have ever seen would ensure that the
following RMW was uninterruptible in the presence of multiple threads:


int i = 3;

--i;

Think if thread A read a value of 3, thread B read a value of 3 and thread C
read a value of 3. Then thread A wrote a value of 2, thread B wrote a value
of 2 and finally thread C wrote a value of 2. The end result would be 2 when
it should have been 0.

Keith H Duggar

unread,
Aug 22, 2009, 9:18:41 PM8/22/09
to
On Aug 22, 6:42 am, James Kanze <james.ka...@gmail.com> wrote:
> On Aug 22, 3:23 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> > On Aug 21, 7:22 am, James Kanze <james.ka...@gmail.com> wrote:
> > > On Aug 21, 1:08 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> > > So what do you think one "commonly thinks of" when one says
> > > a construct is "thread safe".
> > I mean that the entire type interface is "as thread-safe as a
> > POSIX-thread-safe function".
>
> In other words (quoting the Posix standard): "A function that
> may be safely invoked concurrently by multiple threads." All of
> the member functions of boost::shared_ptr meet that requirement.

[snip same old arguments ie that that some functions are only
safe when called on *different* objects and that it is this
conditional safety that "all the experts" mean when they say
"thread-safe"]

> > In other words, it is what N2410
> >http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2007/n2410.html
> > proposed to call "strong thread-safe" also as Chris MT has
> > called in some other posts. However, note that since "strong
> > thread-safe" is simply the most natural extension of POSIX
> > "thread-safe" to C++ types, then "thread-safe" without
> > qualification should mean "strong thread-safe" and that is
> > consistent with your claim that "the experts in the field,
> > more or less corresponds to the Posix definition". It's just I
> > don't know where you got your definition of POSIX
> > "thread-safe" because that's not what I recall from the POSIX
> > document?
>
> If you'd read the document you'd site, it points out quite
> clearly that the so-called "strong thread-safety" is a very

> naïve meaning for thread safety.

It points out nothing about the notion being "naive". That is
your coloration. It simply points out the likely costs.

> As pointed out above, Posix
> doesn't require it, and in fact, no expert that I know defines
> thread-saftey in that way.
>
> > Finally, I will note N2519
> >http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2519.html
> > which claims that the level of thread-safe that
> > boost::shared_ptr provides is "not widely recognized or named
> > in the literature" and that is consistent with my experience
> > as well.
>
> It's curious that in the sentence immediately following this
> statement, he cites a document that does "recognize" this level
> of thread safety. And if this level of thread safety has no
> special name, it's probably because it is what is generally
> assumed by "thread-safety" by the experts in the field; I've
> never seen any article by an expert in threading that spoke of
> "strong thread-safety" other than to explain that this is not
> what is meant by "thread-safety".

The crux of our disagreement is two fold 1) you hold that
a function can be "thread-safe" even if it conditionally
imposes some extra requirements such as different objects,
buffer pointers, etc 2) you hold that "all the experts"
agree with you.

As to 1) I simply disagree. I think something is "thread-safe"
only if it is the naive sense of "if the class works when there
is only a single-thread and is thread-safe, then it works when
there are multiple threads with *no additional synchronization
coding required* (note this was just a *toy* way of putting it,
read the article I'm about to link for a more useful wording
and discussion of it).

As to 2) well there are at least 3 experts who hold my view:
Brian Goetz, Joshua Bloch, and Chris Thomasson. Here is an
article by Brian Goetz that lays out the issues very nicely:

http://www.ibm.com/developerworks/java/library/j-jtp09263.html

Finally, as to 1) ultimately it is a matter of definition. If
you are right and we all agree to call the "as thread-safe as
a built-in type" just "thread-safe" that would be fine too.
However, as you can see, there is disagreement and my point
here was simply that one should be a bit more careful than just
saying boost::shared_ptr is "thread-safe". Indeed, one should
call it exactly what the Boost document calls it "as thread-safe
as a built-in type" or perhaps "conditionally thread-safe" so
as to be careful and avoid confusion.

And by the way, that last point is not just some pointless nitpick.
Even in the last year and a half at work, I caught (during
review) three cases of thread-unsafe code that was a result of a
boost::shared_ptr instance being shared unsafely (all were cases
of one thread writing and one reading). When I discussed the review
with the coders all three said exactly the same thing "But I thought
boost::shared_ptr was thread-safe?". Posts like Juha's that say
unconditionally "boost::shared_ptr is thread-safe" continue to
help perpetuate this common (as naive as you might say it is)
misunderstanding.

KHD

Keith H Duggar

unread,
Aug 22, 2009, 10:27:52 PM8/22/09
to
On Aug 22, 9:18 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
> As to 2) well there are at least 3 experts who hold my view:
> Brian Goetz, Joshua Bloch, and Chris Thomasson. Here is, an

Strike that. It seems Chris refers to boost::shared_ptr as having
"normal" thread-safety so I guess he doesn't agree with those other
two. Apologies, Chris, for misrepresenting what you wrote.

KHD

Chris M. Thomasson

unread,
Aug 23, 2009, 12:04:27 AM8/23/09
to
"Keith H Duggar" <dug...@alum.mit.edu> wrote in message
news:9c7269cd-48b6-4307...@c2g2000yqi.googlegroups.com...

No problem at all Keith. FWIW, one can make `boost::shared_ptr' strongly
thread-safe by adding some external synchronization. For instance, take this
simple solution to the classic reader/writer problem:
__________________________________________________________________
static boost::shared_ptr<foo> g_foo;


void writers() {
for (;;) {
boost::shared_ptr<foo> l_foo(new foo);
mutex_lock();
g_foo = l_foo;
mutex_unlock();
}
}


void readers() {
for (;;) {
mutex_lock();
boost::shared_ptr<foo> l_foo(g_foo);
mutex_unlock();
l_foo->something();
}
}
__________________________________________________________________


Otherwise, shared_ptr as-is does not have what it takes to do that on its
own. BTW, the scenario above will not result in memory consumption blowing
up, because of the deterministic nature of reference counting.

Chris M. Thomasson

unread,
Aug 23, 2009, 12:07:44 AM8/23/09
to
"James Kanze" <james...@gmail.com> wrote in message
news:c82f3460-d7eb-4b69...@32g2000yqj.googlegroups.com...

On Aug 21, 4:37 pm, Juha Nieminen <nos...@thanks.invalid> wrote:
> > Chris M. Thomasson wrote:
> > > FWIW, Boost shared_ptr only provides basic/normal
> > > thread-safety. It does NOT provide strong thread-safety in
> > > any way shape or form.

> > I don't even understand the difference between "basic/normal"
> > and "strong".

> "basic/normal" thread-safety is what is normally meant by thread
> safety (e.g. Posix guarantees). "Strong" thread safety is
> something more: it means that the object state can be modified
> (from the client point of view) from several threads
> simultaneously. It can be useful in a few specific cases
> (message queues, etc.), but is expensive, and not generally
> useful enough to warrant the expense.

FWIW, there are several ways to implement strongly thread-safe smart
pointers without using any mutual exclusion synchronization whatsoever such
that the implementation can be lock-free, or even wait-free.

Juha Nieminen

unread,
Aug 23, 2009, 3:07:27 AM8/23/09
to
Sam wrote:
> Juha Nieminen writes:
>
>> Sam wrote:
>>> Casting from any superclass to a subclass incurs a penalty
>>
>> I don't think that's true. If the compiler can see both class
>> declarations and there is a simple inheritance relation between them
>> (with no virtual functions in the derived class), the pointer doesn't
>
> A dynamic_cast from a superclass to a subclass, a derived class, only
> works if the superclass has, at least, a virtual destructor.

Since the smart pointer was told what the derived class type is, why
would it do a dynamic_cast? What would be the point?

The only difference between a dynamic_cast and a static_cast in this
case is that the former might return a null pointer. If for whatever
reason the smart pointer was told that the object type is A but in
reality it's an object of different type B, you will get buggy behavior
regardless of whether the smart pointer uses dynamic_cast or
static_cast: In the former case you will get a null pointer access, in
the latter memory trashing (as the member functions of the object are
called with the wrong type of object). Either situation is completely
erroneous.

>> If you want to use a shared_ptr which points to an object which must
>> not be destroyed by that shared_ptr, you can tell it.
>
> I thought that the whole point of a shared_ptr is so that the referenced
> object may be destroyed at the appropriate time. If you don't want the
> object destroyed, you don't need a shared_ptr.

You might not have an option. You could have, for example, a function
which takes a boost::shared_ptr as parameter, and thus you have no other
option but to give it one. However, if you don't want to allocate the
object dynamically just to call that function, but would rather use a
stack-allocated object, you can still do it, as
boost::shared_ptr supports being told not to try to destroy it.

James Kanze

unread,
Aug 23, 2009, 4:49:40 AM8/23/09
to

It doesn't use the word "naive", no. But that's a more or less
obvious interpretation of what it does say---that requiring the
"strong" guarantee is a more or less naive interpretation of
thread safety.

> > As pointed out above, Posix doesn't require it, and in fact,
> > no expert that I know defines thread-saftey in that way.

> > > Finally, I will note N2519
> > >http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2519.html
> > > which claims that the level of thread-safe that
> > > boost::shared_ptr provides is "not widely recognized or named
> > > in the literature" and that is consistent with my experience
> > > as well.

> > It's curious that in the sentence immediately following this
> > statement, he cites a document that does "recognize" this
> > level of thread safety. And if this level of thread safety
> > has no special name, it's probably because it is what is
> > generally assumed by "thread-safety" by the experts in the
> > field; I've never seen any article by an expert in threading
> > that spoke of "strong thread-safety" other than to explain
> > that this is not what is meant by "thread-safety".

> The crux of our disagreement is two fold 1) you hold that a
> function can be "thread-safe" even if it conditionally imposes
> some extra requirements such as different objects, buffer
> pointers, etc 2) you hold that "all the experts" agree with
> you.

> As to 1) I simply disagree.

Then practically speaking, thread-safety is a more or less
useless term, except in a few special cases.

As I pointed out, my meaning is the one Posix uses, which is a
pretty good start for "accepted use" of a term.

> I think something is "thread-safe" only if it is the naive
> sense of "if the class works when there is only a
> single-thread and is thread-safe, then it works when there are
> multiple threads with *no additional synchronization coding
> required* (note this was just a *toy* way of putting it, read
> the article I'm about to link for a more useful wording and
> discussion of it).

In other words, functions like localtime_r, which Posix
introduced precisely to offer a thread-safe variant, aren't
thread-safe.

> As to 2) well there are at least 3 experts who hold my view:
> Brian Goetz, Joshua Bloch, and Chris Thomasson. Here is, an
> article by Brian Goetz that lays out the issues very nicely:

> http://www.ibm.com/developerworks/java/library/j-jtp09263.html

Except for Chris, I've never heard of any of them. But
admittedly, most of my information comes from experts in Posix
threading.

> Finally, as to 1) ultimately it is a matter of definition. If
> you are right and we all agree to call the "as thread-safe as
> a built-in type" just "thread-safe" that would be fine too.
> However, as you can see, there is disagreement and my point
> here was simply that one should be a bit more careful that
> just saying boost::shared_ptr is "thread-safe". Indeed, one
> should call it exactly what the Boost document calls it "as
> thread-safe as a built-in type" or perhaps "conditionally
> thread-safe" so as to be careful and avoid confusion.

It's always worth being more precise, and I agree that when the
standard defines certain functions or objects as "thread-safe",
it should very precisely define what it means by the term---in
the end, it's an expression which in itself doesn't mean much.

Formally speaking, no object or function is required to meet its
contract unless the client code also fulfills its obligations;
formally speaking, an object or function is "thread safe" if it
defines its contractual behavior in a multithreaded environment,
and states what it requires of the client code in such an
environment. Practically speaking, I think that this would
really be the most useful definition of thread-safe as well, but
I think I'm about the only person who sees it that way. The
fact remains, however, that Posix and others do define
thread-safety in a more or less useful form, which is much less
strict than what you seem to be claiming.

> And by the way, that last point is not just some pointless
> nit- pick. Even in the last year-and-half at work, I caught
> (during review) three cases of thread-unsafe code that was a
> result of a boost::shared_ptr instance being shared unsafely
> (all were cases of one thread writing and one reading). When I
> discussed the review with the coders all three said exactly
> the same thing "But I thought boost::shared_ptr was
> thread-safe?". Posts like Juha's that say unconditionally
> "boost::shared_ptr is thread-safe" continue to help perpetuate
> this common (as naive as you might say it is)
> misunderstanding.

OK. I can understand your problem, but I don't think that the
problem is with boost::shared_ptr (or even with calling it
thread-safe); the problem is education. In my experience, the
vast majority of programmers don't understand threading issues
in general: I've seen more than a few cases of people putting
locks in functions like std::vector<>::operator[], which return
references, and claiming the strong thread-safe guarantee. Most
of the time, when I hear people equating strong thread-safety
with thread-safety in general, they are more or less at about
this level---and the word naive really does apply. Just telling
them that boost::shared_ptr is not thread safe in this sense is
treating the symptom, not the problem, and will cause problems
further down the road. (Again, IMHO, the best solution would be
to teach them that "thread-safety" means that the class has
documented its requirements with regards to threading somewhere,
and that client code has to respect those requirements, but I
fear that that's a losing battle.)

James Kanze

unread,
Aug 23, 2009, 4:55:21 AM8/23/09
to
On Aug 22, 1:51 pm, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> "James Kanze" <james.ka...@gmail.com> wrote in message

> news:88b62bf5-dd6e-4db9...@k30g2000yqf.googlegroups.com...
> On Aug 21, 5:05 pm, "Chris M. Thomasson" <n...@spam.invalid>
> wrote:

> > > "Juha Nieminen" <nos...@thanks.invalid> wrote in message
> > > > It's not like non-intrusive reference-counting smart
> > > > pointers cannot be made the size of one single pointer.
> > > Do you mean that the overhead of a non-intrusive
> > > reference-counting smart pointer will be sizeof(void*)?

> Okay. See, when I used the term overhead, I meant the total
> overhead including the pointer to the private counter object
> `myImpl' and the amount of memory it takes to create said
> object.

I wasn't sure, but the use of "sizeof" suggested very strongly
that you were talking about sizeof, which doesn't include such
overhead.

> So, that's 1 pointer + 1 pointer + 1 int. Perhaps I am in
> error thinking about it that way.

There's no simple answer, and I can imagine cases where the
extra level of indirection is the cheapest solution. For
something like shared_ptr, it's a trade-off between the cost of
copying (cheaper with the extra level of indirection) and
dereference speed (cheaper with the larger "sizeof"). In most
of my code, pointers are dereferenced a lot more than they are
copied, so the choice is obvious. For my code.

(I doubt that the memory usage of smart pointers is ever much of
an issue.)

James Kanze

unread,
Aug 23, 2009, 5:01:32 AM8/23/09
to
On Aug 22, 2:59 pm, Sam <s...@email-scan.com> wrote:
> Chris M. Thomasson writes:

[...]


> Also, consider another major design flaw with shared_ptr: a
> class method has no way of obtaining a reference to its own
> instance. Some method of class A may want to create an
> instance of class B that holds a reference to the instance of
> A that created it, and, say, return a reference to the newly
> created instance of B. That seems to me like a reasonable, and
> quite common, thing to do:

> ref<B> A::method()
> {
> // create an instance of B
> // B contains a reference to an instance of A, namely this object.
> // return the initial reference to B
> }

That, of course, is a serious problem, and is the main reason
why I'd tend to avoid using boost::shared_ptr for managing
lifetime. Several hacks have been introduced to work around it,
e.g. enable_shared_from_this, but unless you have the rule that
you never use shared_ptr except if the class derives from
enable_shared_from_this, then you're playing with fire.

James Kanze

unread,
Aug 23, 2009, 5:09:46 AM8/23/09
to
On Aug 22, 7:51 pm, Pete Becker <p...@versatilecoding.com> wrote:
> Juha Nieminen wrote:

> > Increments and decrements are in no way guaranteed to be
> > atomic, and in some architectures they may well not be. Even
> > if they were, there's still a huge mutual exclusion problem
> > here:

> > if (! --m_ptr->m_count) {
> > delete m_ptr;
> > }

> > Guess what happens if another thread executes this same code
> > in between the decrement and the comparison to null in this
> > thread, and the counter happened to be 2 to begin with.

> If the decrement is atomic (not an atomic CPU instruction, but
> atomic in the sense of not tearing and producing a result
> that's visible to all threads that use the variable) then this
> works just fine. Of course, all the other manipulations of
> this variable must also be similarly atomic.

That's fine, but on what machines is the decrement atomic? On
an Intel, only if it's done as a single instruction, preceded by
a lock prefix, and I'm not even sure then (and the compilers I
use don't generate the lock prefix, even if the expression has a
volatile qualified type). On a Sparc (and most other RISC
architectures), decrementation requires several machine
instructions, period, so is not atomic.

Pete Becker

unread,
Aug 23, 2009, 7:16:14 AM8/23/09
to
Sam wrote:
>
> A dynamic_cast from a superclass to a subclass, a derived class, only
> works if the superclass has, at least, a virtual destructor.
>

dynamic_cast requires that the type being converted from have at least
one virtual function. There is no requirement for a virtual destructor.

Pete Becker

unread,
Aug 23, 2009, 7:19:52 AM8/23/09
to
Juha Nieminen wrote:
> Sam wrote:
>> Casting from any superclass to a subclass incurs a penalty
>

To use C++ terminology: Casting from any base class to a derived class
incurs a penalty...

> I don't think that's true. If the compiler can see both class
> declarations and there is a simple inheritance relation between them
> (with no virtual functions in the derived class), the pointer doesn't
> have to be modified in any way and this decision can be done at compile
> time. The cast will effectively not create any machine code whatsoever.
>

To use dynamic_cast to convert from a base type to a derived type, the
base type must have at least one virtual function. dynamic_cast has to
check whether the type of the object is, in fact, the derived type. If
not, it returns a null pointer or throws an exception, depending on
whether the dynamic_cast is targeting a pointer or a reference.

Pete Becker

unread,
Aug 23, 2009, 7:22:10 AM8/23/09
to
Juha Nieminen wrote:
> Sam wrote:
>> Juha Nieminen writes:
>>
>>> Sam wrote:
>>>> Casting from any superclass to a subclass incurs a penalty
>>> I don't think that's true. If the compiler can see both class
>>> declarations and there is a simple inheritance relation between them
>>> (with no virtual functions in the derived class), the pointer doesn't
>> A dynamic_cast from a superclass to a subclass, a derived class, only
>> works if the superclass has, at least, a virtual destructor.
>
> Since the smart pointer was told what the derived class type is, why
> would it do a dynamic_cast? What would be the point?

To ensure that the conversion is valid. Having done that at construction
time, tr1's shared_ptr holds the converted pointer, and does not need to
do the dynamic_cast again.

>
> The only difference between a dynamic_cast and a static_cast in this
> case is that the former might return a null pointer. If for whatever
> reason the smart pointer was told that the object type is A but in
> reality it's an object of different type B, you will get buggy behavior
> regardless of whether the smart pointer uses dynamic_cast or
> static_cast: In the former case you will get a null pointer access, in
> the latter memory trashing (as the member functions of the object are
> called with the wrong type of object). Either situation is completely
> erroneous.

But you can check for a null pointer; you can't check for a bogus
conversion that you told the compiler to do.

Pete Becker

unread,
Aug 23, 2009, 7:23:47 AM8/23/09
to
Juha Nieminen wrote:
> Pete Becker wrote:
>> Juha Nieminen wrote:
>>> Increments and decrements are in no way guaranteed to be atomic, and
>>> in some architectures they may well not be. Even if they were, there's
>>> still a huge mutual exclusion problem here:
>>>
>>> if (! --m_ptr->m_count) {
>>> delete m_ptr;
>>> }
>>>
>>> Guess what happens if another thread executes this same code in
>>> between the decrement and the comparison to null in this thread, and the
>>> counter happened to be 2 to begin with.
>>>
>> If the decrement is atomic (not an atomic CPU instruction, but atomic in
>> the sense of not tearing and producing a result that's visible to all
>> threads that use the variable) then this works just fine. Of course, all
>> the other manipulations of this variable must also be similarly atomic.
>
> I don't understand how that can work if the result of the decrement is
> not immediately visible to all threads.
>

Which is why I said "producing a result that's visible to all threads..."

What I neglected to say is that the entire decrement and test must be
a single atomic operation.

Pete Becker

Aug 23, 2009, 7:24:49 AM
Chris M. Thomasson wrote:
> "Pete Becker" <pe...@versatilecoding.com> wrote in message
> news:KsadnSFScpGWrQ3X...@giganews.com...
>> Juha Nieminen wrote:
>>>
>>> Increments and decrements are in no way guaranteed to be atomic, and
>>> in some architectures they may well not be. Even if they were, there's
>>> still a huge mutual exclusion problem here:
>>>
>>> if (! --m_ptr->m_count) {
>>> delete m_ptr;
>>> }
>>>
>>> Guess what happens if another thread executes this same code in
>>> between the decrement and the comparison to null in this thread, and the
>>> counter happened to be 2 to begin with.
>>>
>>
>> If the decrement is atomic (not an atomic CPU instruction, but atomic
>> in the sense of not tearing and producing a result that's visible to
>> all threads that use the variable) then this works just fine. Of
>> course, all the other manipulations of this variable must also be
>> similarly atomic.
>
> You would also need to ensure that the read-modify-write sequence was a
> full atomic operation.

Yes. That's what I was thinking, but didn't say.

Pete Becker

Aug 23, 2009, 7:27:30 AM
James Kanze wrote:
> On Aug 22, 7:51 pm, Pete Becker <p...@versatilecoding.com> wrote:
>> Juha Nieminen wrote:
>
>>> Increments and decrements are in no way guaranteed to be
>>> atomic, and in some architectures they may well not be. Even
>>> if they were, there's still a huge mutual exclusion problem
>>> here:
>
>>> if (! --m_ptr->m_count) {
>>> delete m_ptr;
>>> }
>
>>> Guess what happens if another thread executes this same code
>>> in between the decrement and the comparison to null in this
>>> thread, and the counter happened to be 2 to begin with.
>
>> If the decrement is atomic (not an atomic CPU instruction, but
>> atomic in the sense of not tearing and producing a result
>> that's visible to all threads that use the variable) then this
>> works just fine. Of course, all the other manipulations of
>> this variable must also be similarly atomic.
>
> That's fine, but on what machines is the decrement atomic.

"... NOT AN ATOMIC CPU INSTRUCTION, BUT ..."

That clearly says that it's not necessarily an atomic instruction.

> On
> an Intel, only if it's done as a single instruction, preceded by
> a lock prefix, and I'm not even sure then (and the compilers I
> use don't generate the lock prefix, even if the expression has a
> volatile qualified type). On a Sparc (and most other RISC
> architectures), decrementation requires several machine
> instructions, period, so is not atomic.
>

You're refuting a claim that I didn't make.

Sam

Aug 23, 2009, 9:41:00 AM
Juha Nieminen writes:

> Sam wrote:
>> Juha Nieminen writes:
>>
>>> Sam wrote:
>>>> Casting from any superclass to a subclass incurs a penalty
>>>
>>> I don't think that's true. If the compiler can see both class
>>> declarations and there is a simple inheritance relation between them
>>> (with no virtual functions in the derived class), the pointer doesn't
>>
>> A dynamic_cast from a superclass to a subclass, a derived class, only
>> works if the superclass has, at least, a virtual destructor.
>
> Since the smart pointer was told what the derived class type is, why
> would it do a dynamic_cast? What would be the point?

Because if the subclass derives from multiple instances of the superclass,
only dynamic_cast can figure it out.

> The only difference between a dynamic_cast and a static_cast in this
> case is that the former might return a null pointer. If for whatever
> reason the smart pointer was told that the object type is A but in
> reality it's an object of different type B, you will get buggy behavior
> regardless of whether the smart pointer uses dynamic_cast or
> static_cast: In the former case you will get a null pointer access, in

Even before you get to this point, the reference pointer's method that did
the conversion can check for a null pointer and throw an exception
immediately.

> the latter memory trashing (as the member functions of the object are
> called with the wrong type of object). Either situation is completely
> erroneous.

Correct, but in one case you end up with a well defined failure case, that
can be easily debugged. In the other case, you'll keep plugging along for
some time before blowing up, with the debugger showing an utter mess, and
you may not be able to easily determine what the root cause of it was.

>>> If you want to use a shared_ptr which points to an object which must
>>> not be destroyed by that shared_ptr, you can tell it.
>>
>> I thought that the whole point of a shared_ptr is so that the referenced
>> object may be destroyed at the appropriate time. If you don't want the
>> object destroyed, you don't need a shared_ptr.
>
> You might not have an option. You could have, for example, a function
> which takes a boost::shared_ptr as parameter, and thus you have no other
> option but to give it one.

That's a design issue, that's solved by a redesign.

Sam

Aug 23, 2009, 9:47:40 AM
Pete Becker writes:

> Juha Nieminen wrote:
>> Pete Becker wrote:
>>> Juha Nieminen wrote:
>>>> Increments and decrements are in no way guaranteed to be atomic, and
>>>> in some architectures they may well not be. Even if they were, there's
>>>> still a huge mutual exclusion problem here:
>>>>
>>>> if (! --m_ptr->m_count) {
>>>> delete m_ptr;
>>>> }
>>>>
>>>> Guess what happens if another thread executes this same code in
>>>> between the decrement and the comparison to null in this thread, and the
>>>> counter happened to be 2 to begin with.
>>>>
>>> If the decrement is atomic (not an atomic CPU instruction, but atomic in
>>> the sense of not tearing and producing a result that's visible to all
>>> threads that use the variable) then this works just fine. Of course, all
>>> the other manipulations of this variable must also be similarly atomic.
>>
>> I don't understand how that can work if the result of the decrement is
>> not immediately visible to all threads.
>>
>
> Which is why I said "producing a result that's visible to all threads..."

gcc manual, section 5.47, under the description of atomic functions, states
the following:

In most cases, these builtins are considered a full barrier. That is, no
memory operand will be moved across the operation, either forward or
backward. Further, instructions will be issued as necessary to prevent the
processor from speculating loads across the operation and from queuing
stores after the operation.

I interpret this as stating that the results of these atomic functions will
be immediately visible to all other threads.

Pete Becker

Aug 23, 2009, 10:46:22 AM

There are ways to implement fully atomic accesses to data objects,
including guaranteeing visibility. What I was responding to was your
assertion (which has disappeared from the history):


>> The actual
>> increment/decrement operations are the same as with shared_ptr --
>> compiler-specific instructions that compile down to atomic CPU
>> operations, so that they are thread-safe.

Using "atomic CPU operations" is not sufficient, in general. Some
platforms might support them, but others don't.

Keith H Duggar

Aug 23, 2009, 11:57:43 AM
> > > naïve meaning for thread safety.

> > It points out nothing about the notion being "naive". That is
> > your coloration. It simply points out the likely costs.
>
> It doesn't use the word "naive", no. But that's a more or less
> obvious interpretation of what it does say---that requiring the
> "strong" guarantee is a more or less naive interpretation of
> thread safety.

Well in one sense it is naive because it is exactly how (in my
experience) a novice programmer interprets the term from simple
common sense alone. On the other hand, it is also used by (some)
experts as a non-naive goal to which they aim; and this results
in useful thinking, coding, etc such as even in this case with
strong thread-safe smart pointers. It's analogous to how aiming
for "immutability" often leads to very useful designs. Obviously
immutability is an even more restrictive concept nevertheless it
is still very useful.

> > > As pointed out above, Posix doesn't require it, and in fact,
> > > no expert that I know defines thread-saftey in that way.
> > > > Finally, I will note N2519
> > > >http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2519.html
> > > > which claims that the level of thread-safe that
> > > > boost::shared_ptr provides is "not widely recognized or named
> > > > in the literature" and that is consistent with my experience
> > > > as well.
> > > It's curious that in the sentence immediately following this
> > > statement, he cites a document that does "recognize" this
> > > level of thread safety. And if this level of thread safety
> > > has no special name, it's probably because it is what is
> > > generally assumed by "thread-safety" by the experts in the
> > > field; I've never seen any article by an expert in threading
> > > that spoke of "strong thread-safety" other than to explain
> > > that this is not what is meant by "thread-safety".
> > The crux of our disagreement is two fold 1) you hold that a
> > function can be "thread-safe" even if it conditionally imposes
> > some extra requirements such as different objects, buffer
> > pointers, etc 2) you hold that "all the experts" agree with
> > you.
> > As to 1) I simply disagree.
>
> Then practically speaking, thread-safety is a more or less
> useless term, except in a few special cases.

Please see above as to whether it is useful.

> As I pointed out, my meaning is the one Posix uses, which is a
> pretty good start for "accepted use" of a term.

Even after careful consideration of your points, I am still not
convinced that the conditional thread-safety the _r variants give
is the definition of POSIX thread-safe. If the _r implementations
are taken as defining the concept, then it would be; but it's rarely
a good idea for implementations to define concepts.

I suppose you might argue that they had a clear concept and _r are
just examples of that concept. However, in their design rationale
they considered other implementations for the _r variants, namely
dynamic allocation and thread-local storage, both of which would
have provided strong thread-safety. And it seems to me the choice
not to provide the thread-local storage strong solution was simply
incidental rather than fundamental. In other words, some practical
portability issues and not fundamental concepts controlled the
POSIX implementation decision.

> > I think something is "thread-safe" only if it is the naive
> > sense of "if the class works when there is only a
> > single-thread and is thread-safe, then it works when there are
> > multiple threads with *no additional synchronization coding
> > required* (note this was just a *toy* way of putting it, read
> > the article I'm about to link for a more useful wording and
> > discussion of it).
>
> In other words, functions like localtime_r, which Posix
> introduced precisely to offer a thread safe variant aren't
> thread safe.

Correct, the _r variants are "conditionally" thread-safe and I
think the implementation choice was largely incidental to the
concept of "thread-safe".

> > As to 2) well there are at least 3 experts who hold my view:
> > Brian Goetz, Joshua Bloch, and Chris Thomasson. Here is, an
> > article by Brian Goetz that lays out the issues very nicely:
> >http://www.ibm.com/developerworks/java/library/j-jtp09263.html
>
> Except for Chris, I've never heard of any of them. But
> admittedly, most of my information comes from experts in Posix
> threading.

Can you please tell us some of the experts you have in mind?

> > Finally, as to 1) ultimately it is a matter of definition. If
> > you are right and we all agree to call the "as thread-safe as
> > a built-in type" just "thread-safe" that would be fine too.
> > However, as you can see, there is disagreement and my point
> > here was simply that one should be a bit more careful that
> > just saying boost::shared_ptr is "thread-safe". Indeed, one
> > should call it exactly what the Boost document calls it "as
> > thread-safe as a built-in type" or perhaps "conditionally
> > thread-safe" so as to be careful and avoid confusion.
>
> It's always worth being more precise, and I agree that when the
> standard defines certain functions or objects as "thread-safe",
> it should very precisely define what it means by the term---in
> the end, it's an expression which in itself doesn't mean much.

Well for novice programmers it seems to mean something clear by
simple default of common sense language: if my use of it works
with a single thread then it works as-is with multiple threads.

> Formally speaking, no object or function is required to meet its
> contract unless the client code also fulfills its obligations;
> formally speaking, an object or function is "thread safe" if it
> defines its contractual behavior in a multithreaded environment,
> and states what it requires of the client code in such an
> environment. Practically speaking, I think that this would
> really be the most useful definition of thread-safe as well, but
> I think I'm about the only person who sees it that way.

This is an interesting point. However, I think you might agree
there are some common-sense limits to what those contracts can
require. For example, would you consider:

/* thread-safety: every call to foo must be wrapped in a mutex
 * lock/unlock pairing. If this requirement is met then foo is
 * thread-safe.
 */
int foo();

the foo() above to be thread-safe? I wouldn't. And yet clearly
it has a contract that defines its "thread-safety". The POSIX _r
functions have a contract like

/* thread-safety: the memory location *result must not be modified
 * by another thread until localtime_r returns. If this requirement
 * is met then localtime_r is thread-safe.
 */
struct tm *localtime_r(const time_t *restrict timer,
                       struct tm *restrict result);

which is of course more reasonable; but, I'm still thinking this
is "conditionally thread-safe" not just "thread-safe".

> The fact remains, however, that Posix and others do define
> thread-safety in a more or less useful form, which is much less
> strict than what you seem to be claiming.
>
> > And by the way, that last point is not just some pointless
> > nitpick. Even in the last year and a half at work, I caught
> > (during review) three cases of thread-unsafe code that was a
> > result of a boost::shared_ptr instance being shared unsafely
> > (all were cases of one thread writing and one reading). When I
> > discussed the review with the coders all three said exactly
> > the same thing "But I thought boost::shared_ptr was
> > thread-safe?". Posts like Juha's that say unconditionally
> > "boost::shared_ptr is thread-safe" continue to help perpetuate
> > this common (as naive as you might say it is)
> > misunderstanding.
>
> OK. I can understand your problem, but I don't think that the
> problem is with boost::shared_ptr (or even with calling it
> thread-safe); the problem is education. In my experience, the

Except that using precise terms helps to educate. So calling it
"conditionally thread-safe" would help to simultaneously educate
while just calling it "thread-safe" helps to introduce bugs.

> vast majority of programmers don't understand threading issues
> in general: I've seen more than a few cases of people putting
> locks in functions like std::vector<>::operator[], which return
> references, and claiming the strong thread-safe guarantee. Most
> of the time, when I hear people equating strong thread-safety
> with thread-safety in general, they are more or less at about
> this level---and the word naive really does apply. Just telling
> them that boost::shared_ptr is not thread safe in this sense is
> treating the symptom, not the problem, and will cause problems
> further down the road.

I'm not saying we should tell them boost::shared_ptr is "not
thread-safe" because yes it would cause other problems. I'm saying
we should tell them it is "conditionally thread-safe" or "as thread
safe as a built-in type" which is what the Boost documentation says.
Because that at least encourages a curious one to ask "what are the
conditions?" what does "as a built-in type mean?" etc.

We should reserve "thread-safe" for those structures that are
rock-solid, no-brainer safe with multiple threads, i.e. strong
thread-safe.

> (Again, IMHO, the best solution would be
> to teach them that "thread-safety" means that the class has
> documented its requirements with regards to threading somewhere,
> and that client code has to respect those requirements, but I
> fear that that's a losing battle.)

Can you please tell me what kind of limits if any you would place
on client code requirements? Is the foo() I gave earlier "thread-
safe"? Or what about a more complex where the "contract" required
synchronization between all calls of multiple functions foo(),
bar(), baz(), ...? Because at some point it seems this definition
of thread-safe would become equally useless.

KHD
