This confuses me somewhat because Lippman, Lajoie and Moo's C++ Primer
4th ed contains extensive smart-pointer code and it doesn't appear
terribly difficult.
Are there hidden problems with the RCSP code in C++ Primer 4th ed that
I'm missing?
Many Thanks.
I wouldn't know, as I haven't looked closely at the Primer's
implementation and I probably wouldn't find the bugs anyway,
as, IIRC, Scott's point was that they are quite subtle.
Some personal data: Out of two companies I have seen in the
last years where a home-grown 'shared_ptr'-lookalike was used,
two had subtle errors in it. Since one of them was derived
from an early version of boost's 'shared_ptr', I suppose that,
too, had subtle errors in it.
Here's two ways to find out more about this (please and report
back here): I know that Scott Meyers draws a lot of his wisdom
from newsgroup discussions and doesn't at all hesitate to ask
seemingly simple questions (for us to find out that the answer
is all but simple). So you could go and dig in google groups
about discussions he sparked (I'd look in comp.lang.c++moderated
and comp.standard.c++) on this.
Or you could go and sift through boost's archives for their
discussions on the topic. (Which might bring you to Scott's
postings, too, now that I think of it.)
> Many Thanks.
HTH,
Schobi
Hi Schobi,
I haven't done what you suggested, but I've thought and read more
about smart pointers since my last post. I doubt that there is
anything incorrect or bugged about the smart-pointer code in C++
Primer. However, it isn't very sophisticated (perhaps rightly so,
since it is, after all, intended as a book for novices), and the C++
Primer code does nothing to address the issue of cycles of pointers.
(Boost handles this by the weak_ptr class template.) Presumably, Scott
Meyers did address this issue, and maybe others. So, a safe bet is
that the reason Scott Meyers (according to his own account) struggled
a bit on his implementation, whereas C++ Primer handles the matter
simply, is that the Meyers approach was much more ambitious in its
scope.
I have never seen the smart pointer implementation in C++ Primer, but
I do agree that creating a good smart pointer class is a very
non-trivial problem. There are all kinds of safety and usability (and to
a lesser extent, efficiency) concerns which are often hard or laborious
to address.
Basically small pointers usually attempt to emulate regular pointers
as far as possible, but adding reference counting to the mix in order to
automate object destruction. However, regardless of all the syntactical
tools available in C++, it's still very hard, if not outright
impossible, to make a class fully emulate a pointer.
The most obvious case is that a pointer can point to either an
individual object or an array, and deleting is different in those cases.
(Of course one could argue that it's a *good* thing that smart pointers
to objects and smart pointers to arrays are kept separate and mutually
incompatible.)
Then there's the question of inheritance: A pointer of one type can
point to an object of another, inherited type. Pointers pointing to
different types like this can be assigned to each other. Moreover, you
can cast (statically or dynamically) a pointer of base class type to a
pointer of derived class type. A smart pointer desiring to emulate
regular pointers would need to address these issues.
A pointer can point to an incomplete type. Making a smart pointer
correctly support incomplete types is possible, but not glaringly
obvious. (Except in the case of intrusive smart pointers, for which it's
just not possible.)
An object might have been allocated with something else than the
default system allocator. Many naive smart pointer implementations out
there don't support deallocation using that same special allocator.
(Also, most don't support allocating their own ancillary data using a
user-specified allocator, rather than the default system allocator. I
think even the boost smart pointers don't support this.)
Most naive smart pointer implementations are not thread-safe. Making a
smart pointer thread-safe efficiently is extremely difficult.
Making a smart pointer safe to use is extremely difficult, if not
impossible. It's basically impossible to make sure that the same pointer
is not given to two different (non-instrusive) smart pointers. Or that a
pointer to something not allocated with 'new' is not given to it.
Intrusive smart pointers have their own gotchas, most of which naive
implementations get wrong. (For example most of them fail to take into
account that objects might be directly assigned to other objects, which
can mess up the reference counter if not specifically taken into account.)
> CplusplusNewbie wrote:
>> Are there hidden problems with the RCSP code in C++ Primer 4th ed
>> that I'm missing?
>
> I have never seen the smart pointer implementation in C++ Primer,
> but
> I do agree that creating a good smart pointer class is a very
> non-trivial problem. There are all kinds of safety and usability (and
> to a lesser extent, efficiency) concerns which are often hard or
> laborious to address.
There is no best universal smart pointer class. One uses what is best for
any given situation.
>
> Basically small pointers usually attempt to emulate regular pointers
> as far as possible, but adding reference counting to the mix in order
> to automate object destruction. However, regardless of all the
> syntactical tools available in C++, it's still very hard, if not
> outright impossible, to make a class fully emulate a pointer.
Agreed. But this is not the goal.
>
> The most obvious case is that a pointer can point to either an
> individual object or an array, and deleting is different in those
> cases. (Of course one could argue that it's a *good* thing that smart
> pointers to objects and smart pointers to arrays are kept separate and
> mutually incompatible.)
Pointers are needed for entity objects, and those do not come in arrays.
Value objects can be held in std::vector, for example. In 10 years of C++
I have not yet encountered a need to have a smartpointer to an array.
> Then there's the question of inheritance: A pointer of one type can
> point to an object of another, inherited type. Pointers pointing to
> different types like this can be assigned to each other. Moreover, you
> can cast (statically or dynamically) a pointer of base class type to a
> pointer of derived class type. A smart pointer desiring to emulate
> regular pointers would need to address these issues.
Right, this is nasty. Does C++0x introduce something to help here?
> A pointer can point to an incomplete type. Making a smart pointer
> correctly support incomplete types is possible, but not glaringly
> obvious. (Except in the case of intrusive smart pointers, for which
> it's just not possible.)
The worst case would be silent misbehavior upon deletion because of using
incomplete type. One solution to this is to use intrusive smart pointers.
> An object might have been allocated with something else than the
> default system allocator. Many naive smart pointer implementations out
> there don't support deallocation using that same special allocator.
> (Also, most don't support allocating their own ancillary data using a
> user-specified allocator, rather than the default system allocator. I
> think even the boost smart pointers don't support this.)
>
One solution to this is to use intrusive smartpointers.
> Most naive smart pointer implementations are not thread-safe. Making
> a
> smart pointer thread-safe efficiently is extremely difficult.
If performance is critical, then the data should not be visible in
multiple threads anyway. In this case the smartpointer can well ignore
multithreading issues.
When needed, protecting the refcounter with a mutex is trivial. Besides,
there are lot of cases when passing a smartpointer to a function is
actually not needed. If the function is not going to store the
smartpointer somewhere, one can pass a reference/raw pointer to the
smartpointer or to the original object instead, thus avoiding any
potential contention by updating the reference count. Using intrusive
smartpointers guarantees that the function can still recreate a smart
pointer if needed, even if it has a reference or raw pointer to the
original object only.
>
> Making a smart pointer safe to use is extremely difficult, if not
> impossible. It's basically impossible to make sure that the same
> pointer is not given to two different (non-instrusive) smart pointers.
> Or that a pointer to something not allocated with 'new' is not given
> to it.
You are right, one solution to this is to use intrusive smartpointers.
>
> Intrusive smart pointers have their own gotchas, most of which naive
> implementations get wrong. (For example most of them fail to take into
> account that objects might be directly assigned to other objects,
> which can mess up the reference counter if not specifically taken into
> account.)
In my experience, smartpointers are needed only for entity objects, which
usually do not support direct assignment anyway, so the problem does not
arise often. The intrusive reference counter should reside in a separate
base class anyway, implementing the proper copy/assignment behavior, so
one has to take this problem into account only once.
Best regards
Paavo
As I said: I don't have the book at hand, so I can't
possibly comment.
> However, it isn't very sophisticated (perhaps rightly so,
> since it is, after all, intended as a book for novices), and the C++
> Primer code does nothing to address the issue of cycles of pointers.
> (Boost handles this by the weak_ptr class template.) Presumably, Scott
> Meyers did address this issue, and maybe others.
No, he didn't. I think you're referring to Item 28 of
"More Effective C++" and I've just skimmed over it. I
found nothing of cycles.
> So, a safe bet is
> that the reason Scott Meyers (according to his own account) struggled
> a bit on his implementation, whereas C++ Primer handles the matter
> simply, is that the Meyers approach was much more ambitious in its
> scope.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1450.html#Implementation-difficulty
See above. His smart pointer is actually quite simple,
compared to boost's or TR1's. For the reasons you gave,
I doubt that a C++ Primer has a considerably more
ambitious implementation.
Schobi
The problem arises when you need to *share* the same array among
multiple entities (which might even reside in different threads). In
some situations it might not be completely clear which entity is the
last one to reference the array data. (Which one is the last to
reference the array might even be non-deterministic, in the case of
sharing between multiple threads.)
>> A pointer can point to an incomplete type. Making a smart pointer
>> correctly support incomplete types is possible, but not glaringly
>> obvious. (Except in the case of intrusive smart pointers, for which
>> it's just not possible.)
>
> The worst case would be silent misbehavior upon deletion because of using
> incomplete type. One solution to this is to use intrusive smart pointers.
Using intrusive smart pointers is only a "solution" in that it
completely disallows using incomplete types. (If that's the preferred
"solution", then I'm sure it would be possible to make a non-intrusive
smart pointer also give a compile-time error on incomplete types.)
It doesn't solve the problem if you would need pointers to incomplete
types for whatever reason.
>> An object might have been allocated with something else than the
>> default system allocator. Many naive smart pointer implementations out
>> there don't support deallocation using that same special allocator.
>> (Also, most don't support allocating their own ancillary data using a
>> user-specified allocator, rather than the default system allocator. I
>> think even the boost smart pointers don't support this.)
>>
> One solution to this is to use intrusive smartpointers.
I don't see how. Whether the smart pointer is intrusive or not has no
effect on how it should destroy the object. If the object was allocated
using some custom allocator, the smart pointer should use the same
allocator to destroy the object, regardless of whether the smart pointer
is intrusive or not.
>> Making a smart pointer safe to use is extremely difficult, if not
>> impossible. It's basically impossible to make sure that the same
>> pointer is not given to two different (non-instrusive) smart pointers.
>> Or that a pointer to something not allocated with 'new' is not given
>> to it.
>
> You are right, one solution to this is to use intrusive smartpointers.
Intrusive smart pointers have no protection against someone giving
them a pointer to an object not allocated dynamically.
(Basically avoiding the problem is left to the user of the smart
pointer. The smart pointer itself has no way of protecting itself
against this.)
>> Intrusive smart pointers have their own gotchas, most of which naive
>> implementations get wrong. (For example most of them fail to take into
>> account that objects might be directly assigned to other objects,
>> which can mess up the reference counter if not specifically taken into
>> account.)
>
> In my experience, smartpointers are needed only for entity objects, which
> usually do not support direct assignment anyway, so the problem does not
> arise often. The intrusive reference counter should reside in a separate
> base class anyway, implementing the proper copy/assignment behavior, so
> one has to take this problem into account only once.
Making intrusive objects safe for assignment is easy, but most people
who make naive intrusive smart pointer implementations never realize the
problem exists, and thus never implement the solution. That's why when
you see a first-timer implementing such a smart pointer, you can be
almost 100% sure it will have this precise problem.
Of course at a more general level, smart pointers have other problems
as well, which cannot be easily guarded against, at least not by the
pointer itself. Recursive referencing is the most famous one, but not
the only one.
A much subtler problem can happen also in another kind-of recursive
situation. For example, if module A owns an object B (and manages it
using a smart pointer), and a function of B calls a function of A which
might drop the object in question, the destruction can happen in the
middle of the B function execution. In other words, when the function in
A returns, and the function in B which called it continues, it might be
operating on a destroyed object. Mayhem ensues.
The only possible safeguard against this is that every function in B
which calls an outside module increments the reference counter at the
beginning of the function and decrements it at the end (assuming it has
a way of doing this). How much overhead this adds to the function
depends on the situation.
> Paavo Helde wrote:
>> Pointers are needed for entity objects, and those do not come in
>> arrays. Value objects can be held in std::vector, for example. In 10
>> years of C++ I have not yet encountered a need to have a smartpointer
>> to an array.
>
> The problem arises when you need to *share* the same array among
> multiple entities (which might even reside in different threads). In
> some situations it might not be completely clear which entity is the
> last one to reference the array data. (Which one is the last to
> reference the array might even be non-deterministic, in the case of
> sharing between multiple threads.)
I would encapsulate the array in a class, and use a smartpointer to the
class. For starters, without a class there would be no way for
encapsulation (making the data private). What's more, in multithreaded
regime there would be no place for a mutex (or equivalent) for protecting
changes to the array (a mutex in the array smartpointer would only
protect the reference count).
Oh, I misread your problem. I thought you were complaining about that the
reference counter piece is allocated by some other mechanism than the
object itself. If the reference counter is inside the object, this would
not happen.
And anyway, in case of intrusive smartpointers the deletion of the object
happens inside the object's refcount Release() method, which usually just
calls "delete this;", but can be made much more sophisticated easily if
needed. It would be harder for non-intrusive smartpointer, which would
probably require holding extra data together with the refcounter and
tight cooperation between the two classes.
When the first-timer would implement std::sort or std::vector, there
would be lots of bugs initially as well. That's why the smartpointer
classes ought to be in the standard library.
> Of course at a more general level, smart pointers have other
> problems
> as well, which cannot be easily guarded against, at least not by the
> pointer itself. Recursive referencing is the most famous one, but not
> the only one.
>
> A much subtler problem can happen also in another kind-of recursive
> situation. For example, if module A owns an object B (and manages it
> using a smart pointer), and a function of B calls a function of A
> which might drop the object in question, the destruction can happen in
> the middle of the B function execution. In other words, when the
> function in A returns, and the function in B which called it
> continues, it might be operating on a destroyed object. Mayhem ensues.
Using a smart-pointed object "without guards" (i.e. by plain pointer) is
a (possibly premature) optimization which has its own dangers. When in
doubt, don't do it.
>
> The only possible safeguard against this is that every function in B
> which calls an outside module increments the reference counter at the
> beginning of the function and decrements it at the end (assuming it
> has a way of doing this). How much overhead this adds to the function
> depends on the situation.
You are talking about passing smartpointers by value (constantly updating
the refcount). This is the normal way, everything else is an
optimization. And if one is really concerned about performance, then the
whole workflow should not contain frequent inter-thread synchronization,
otherwise the code would not run well in parallel on multicore machines,
however efficient the locking mechanism is. This means that for
performance-critical paths one should use thread-private data and thread-
private smartpointers with no locking, and extra care is needed when
passing the data structures over thread boundaries. I think this is
something where standard smartpointers are often deficient, for example
for such thread-passing the knowledge of refcount is needed (is it 1 or
more, in which case a deep copy of the structure is needed), but I think
often the refcount is private and there is no way to query it.
Hmm, you are starting to convince me that smartpointers are a
particularly tricky subject. I vaguely recall the time when I was
learning C++ and implementing them by myself, and by sure I stumbled
across every obstacle you have listed. This was so long ago I had
forgotten the compromises and solutions I adopted in order to get them
work reliably (use intrusive pointers via a base class, prohibit non-
dynamic allocation of objects, do not create cycles, ...).
So what's the solution? Garbage collection? It should probably work for
most of the smartpointer uses where the only resource is memory. However,
occasionally there may appear other resources as well, which require
deterministic destruction.
Paavo
> No, he didn't. I think you're referring to Item 28 of
> "More Effective C++" and I've just skimmed over it. I
> found nothing of cycles.
I don't know if he mentionned them, but his reference pointers
don't handle cycles.
> > So, a safe bet is that the reason Scott Meyers (according to
> > his own account) struggled a bit on his implementation,
> > whereas C++ Primer handles the matter simply, is that the
> > Meyers approach was much more ambitious in its scope.
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1450.html#Im...
I wonder if he isn't letting himself be led astray by Boost.
boost::shared_ptr attempts to do a lot more than Scott's
reference counted pointer (which is perfectly adequate, and in
fact preferable to boost::shared_ptr for most uses).
Of the three "difficulties" he explicitly mentions, the first
two concern possible exceptions from functions which in a simple
reference counted pointer are automatically no throw (since they
only involve incrementing and decrementing integers and
assigning pointers). As for the third, it's really just a
special case of a more general problem, related to self
assignment, and isn't present if you use the swap idiom for
assignment, or any of the other means for handling self
assignment (i.e. anything which avoids premature destruction of
the right hand side of the assignment).
With regards to the last example, however: reference counted
pointers are not a panacea. As a general rule, it's probably
best to avoid using reference counted pointers on objects which
themselves contain reference counted pointers.
--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
> I have never seen the smart pointer implementation in C++
> Primer, but I do agree that creating a good smart pointer
> class is a very non-trivial problem. There are all kinds of
> safety and usability (and to a lesser extent, efficiency)
> concerns which are often hard or laborious to address.
The only real problem is a design one: what problem are you
trying to solve. Reference counted pointers are not a panacea,
and should only be used where appropriate. Once you've defined
that, they aren't that difficult to implement, with one caveat.
> Basically small pointers usually attempt to emulate regular
> pointers as far as possible, but adding reference counting to
> the mix in order to automate object destruction.
That's probably the error. C++ pointers are inherited from C,
and have a lot of "features" which you generally don't want to
use.
> However, regardless of all the syntactical tools available in
> C++, it's still very hard, if not outright impossible, to make
> a class fully emulate a pointer.
And you probably don't want to.
> The most obvious case is that a pointer can point to either an
> individual object or an array, and deleting is different in
> those cases. (Of course one could argue that it's a *good*
> thing that smart pointers to objects and smart pointers to
> arrays are kept separate and mutually incompatible.)
It is a good thing. I don't think I've ever used array new, in
over 17 years of C++.
> Then there's the question of inheritance: A pointer of one
> type can point to an object of another, inherited type.
> Pointers pointing to different types like this can be assigned
> to each other. Moreover, you can cast (statically or
> dynamically) a pointer of base class type to a pointer of
> derived class type. A smart pointer desiring to emulate
> regular pointers would need to address these issues.
Or not. At one time, I implemented support for this in my
reference counted pointer, mainly to prove to myself that it
could be done. In practice, I don't think I've ever actually
used it (except in the unit tests). In practice, casting up and
down a hierarchy is fairly rare, and almost always involves full
entity objects, which have an explicit lifetime, and so
shouldn't be managed by reference counted pointers. (I'm sure
that there are exceptions. But as I said, I've not encountered
one in 17 years of C++.)
> A pointer can point to an incomplete type. Making a smart
> pointer correctly support incomplete types is possible, but
> not glaringly obvious. (Except in the case of intrusive smart
> pointers, for which it's just not possible.)
And of course, using anything but an intrusive reference counted
pointer is just folly.
> An object might have been allocated with something else than
> the default system allocator. Many naive smart pointer
> implementations out there don't support deallocation using
> that same special allocator. (Also, most don't support
> allocating their own ancillary data using a user-specified
> allocator, rather than the default system allocator. I think
> even the boost smart pointers don't support this.)
And... So there are some objects you can't manage with
reference counted pointers.
> Most naive smart pointer implementations are not thread-safe.
> Making a smart pointer thread-safe efficiently is extremely
> difficult.
This is the one caveat. Making reference counting thread safe
isn't that difficult, but it does entail a certain overhead.
And usually, the thread safety isn't necessary (so you might not
want the overhead there all the time).
> Making a smart pointer safe to use is extremely difficult, if
> not impossible. It's basically impossible to make sure that
> the same pointer is not given to two different
> (non-instrusive) smart pointers. Or that a pointer to
> something not allocated with 'new' is not given to it.
I only use intrusive reference counting; anything else IS a
folly. And the fact that a class derives from the required base
class is a fairly strong indicator to the client code that it
must be allocated dynamically. (At one time, at least, I even
had implementation dependent code in the constructor of the
required base class to ensure dynamic allocation. The following
works both under Solaris on a Sparc and under Linux on a PC:
extern int _end ;
int dummy ;
assert( reinterpret_cast< char const* >( this )
> reinterpret_cast< char const* >( &_end )
&& reinterpret_cast< char const* >( this )
< reinterpret_cast< char const* >( &dummy ) ) ;
.)
> Intrusive smart pointers have their own gotchas, most of which
> naive implementations get wrong. (For example most of them
> fail to take into account that objects might be directly
> assigned to other objects, which can mess up the reference
> counter if not specifically taken into account.)
Objects which are dynamically allocated do not normally support
assignment. But it doesn't matter: the required base class for
the intrusive reference counter declares both the copy
constructor and the assignment operator private; if the derived
class wants to support these operators, it can, but it will be
forced to do the right thing with regards to the base class.
[concerning reference counting pointers...]
> So what's the solution? Garbage collection? It should probably
> work for most of the smartpointer uses where the only resource
> is memory. However, occasionally there may appear other
> resources as well, which require deterministic destruction.
Reference counting pointers don't offer true deterministic
destruction either. I've yet to see a case where reference
counting pointers would work, but garbage collection wouldn't
provide a cleaner (simpler, more efficient) solution.
> As a general rule, it's probably
> best to avoid using reference counted pointers on objects which
> themselves contain reference counted pointers.
Can you give some rationale for this claim? Is it just a dead sure way to
avoid cycles? I'm asking because at least 90% of smartpointers in our
applications reside inside of other smartpointed objects.
Best regards
Paavo
Obviously, it is a 100% sure means of avoiding cycles.
More generally, however: what sort of object would be managed by
a reference counted pointer, and would also contain one? There
are probably exceptions, but the only types of objects I can
reasonably see containing reference counted pointers are entity
objects, and you don't want reference counted pointers to entity
objects, since entity objects normally have deterministic
lifespans. If I look at my own code, almost all of the objects
I have which are managed by reference counted pointers are
agents of some sort or another---small polymorphic objects with
no state of their own, which are created and used to do just one
special thing. Any pointers they have are for navigation, and
so are either raw pointers, or if there is some chance that the
lifetime of the object referred to ends before the agent is
finished, some sort of "observer" pointer, which will
automatically be null'ed in the destructor of the pointed to
object.
No, I'm not. You didn't understand the situation I'm talking about.
Module A owns an object of type B. A manages B using a smart pointer
(ie. A owns the smart pointer instance, which is managing the B object).
Now some function of B gets called from somewhere (eg. from inside A).
This function in B calls some function in A. This latter function drops
the B object it owns. If it was the last reference to the B object, the
B object gets destroyed.
But now when the function in A returns, it returns to the function in
B, which continues oblivious to anything that A might have done. Now the
function in B is acting on a destroyed object (ie. 'this' points to a
deleted object).
There are no smartpointers passed by value anywhere in this scenario.
The only way for the function in B to guard itself against this
scenario possibly happening is to increment the refcount at the
beginning and decrement it at the end (possibly destroying itself). If B
has no access to the refcount, this is actually not possible at all.
Alternatively the code which calls this function in B has to make the
incrementation. However, in neither case it's obvious that this has to
be done in order to protect the object from being destroyed too early.
> Hmm, you are starting to convince me that smartpointers are a
> particularly tricky subject.
They are. In general, one should design C++ programs in such way that
you don't *need* any smart pointers, while still keeping the code safe.
(In most cases your program will probably become more efficient as well,
as a side-effect.)
You have mentioned that before, and I still don't get it.
The correct solution to the problem is to create an empty copy
constructor and assignment operator in the base class (ie. they simply
do nothing). After that, assigning objects around will not mess up the
reference count.
That solution is very simple and obvious (once you know it). However,
instead of that, you prefer to write about the exact same amount of code
just to make the copy constructor and assignment operator private. For
what end? To make the user's life more difficult for no apparent purpose?
> if the derived
> class wants to support these operators, it can, but it will be
> forced to do the right thing with regards to the base class.
And what would that "right thing" be? The right thing is to *do
nothing* in the base class. How can the derived class "do nothing" any
way better than the base class?
I see. It appears we are using smartpointers in quite for different
purposes. Most our uses involve complicated data structures (containers,
arrays, tables, images), parts of which are in shared use for efficiency
reasons, and refcounted for keeping track of the shared usage count. In
principle we could always make deep copies when creating new data
structures and avoid smartpointers altogether, but this would create
excessive overhead (already now in some situations we are approaching the
virtual address space limit, for 32-bit compilations).
For most cases garbage collection would work for our code as well.
However, smartpointed object lifetime is more deterministic than in case
of garbage-collection (where it is logically infinite). If all references
to the object are gone, the object is destroyed _in situ_. In some cases
this triggers extra actions in our application, which have to be carried
out deterministically. One could possibly argue that in such cases the
ownership issues must be clear enough anyway (a human must figure it out
and be sure in which timepoint the object is destructed) so that
smartpointer approach would not be needed, but this would require extra
effort and make the data model inhomogeneous, so we are using
smartpointers also for such data objects. Maybe this is a mistake, making
our application more fragile.
Best regards
Paavo
The problem is not obvious only if the functions are not documented or
nobody is reading the documentation. As always in C++, one should know
what one is doing. Note that the similar problem can arise much more
simply as well:
// in a member function of B...
delete this;
// not turning attention the the previous line...
this->m_data = 0;
Of course this error can be catched much more easily, but that's not the
point; the point is that the programmer should know the effects of each
line of code he writes.
If the potential effect of destroying B is documented properly, then the
method of B can easily protect itself against the premature destruction
(again, only when using intrusive smartpointers!):
// in a member function of B...
SmartPtr keepalive(this);
a->DangerousFunction();
this->m_data = 0; // ok
...
You see that the object is kept alive by another temporary smartpointer,
which does exactly the things you mention: incrementing and decrementing
the refcount. Not a brilliant solution, and probably the overall
application design is convoluted if such tricks are needed, but
nevertheless this can be made to work. I have used such things in the
past, and later abandoned them after cleaning up the application design.
Paavo
Aha. You're using them to implement CoW. I'll admit that I
hadn't considered that---my pre-standard String class used CoW
(but that was before threading because common), but since then,
I haven't needed it much. But you're right, reference counted
pointers are a good solution here, provided that the structure
is guaranteed to be a tree, or at least an acyclic graph. (Come
to think about it, that's exactly how I manage the parse tree in
my RegularExpression class. I'd sort of forgotten about that,
however, since it's something I wrote around 15 years ago.)
> For most cases garbage collection would work for our code as
> well. However, smartpointed object lifetime is more
> deterministic than in case of garbage-collection (where it is
> logically infinite). If all references to the object are gone,
> the object is destroyed _in situ_.
Which in the case of arbitrary trees can be very expensive in
terms of runtime. Complicated graphs are the choice example
when one want a benchmark showing the speed of garbage
collection:-).
> In some cases this triggers extra actions in our application,
> which have to be carried out deterministically. One could
> possibly argue that in such cases the ownership issues must be
> clear enough anyway (a human must figure it out and be sure in
> which timepoint the object is destructed) so that smartpointer
> approach would not be needed, but this would require extra
> effort and make the data model inhomogeneous, so we are using
> smartpointers also for such data objects. Maybe this is a
> mistake, making our application more fragile.
It all depends. I've definitely used reference counted pointers
for triggering some specific actions (generally not in
conjunction with memory management). I would have expected that
if you're dealing with trees, etc., about the only pointer which
would have such associated actions would be the root, but I
suppose there could be specific cases of subtrees as well. In
which case, homogeneity would argue for using reference counted
pointers everywhere. Curiously, the only time I needed
something similar was in Java:-)---we defined a dispose function
in the base class, and walked the tree, calling it. Arguably,
this is a "cleaner" solution, since it provides for things like
error reporting, etc. On the other hand, it's a bit more work,
and there are certainly cases where it's clear that errors can't
occur (or would be fatal).
> You have mentioned that before, and I still don't get it.
The main reason I allocate objects dynamically, rather than
copying them, is polymorphism. Polymorphism and assignment
don't work well together.
> The correct solution to the problem is to create an empty copy
> constructor and assignment operator in the base class (ie.
> they simply do nothing). After that, assigning objects around
> will not mess up the reference count.
That's another possibility. In my case, I find that almost none
of my reference counted objects are copiable or assignable---if
they were, I wouldn't be allocating them dynamically to begin
with. So the more reasonable solution seems to be to ban copy
and assignment in the required base class. If the derived class
really wants to provided it (e.g. to support cloning), all it
has to do is write a copy constructor or an assignment operator
which doesn't invoke the ones in the base class.
> That solution is very simple and obvious (once you know it).
I considered it, and decided that blocking them was preferable.
But either is really OK.
> However, instead of that, you prefer to write about the exact
> same amount of code just to make the copy constructor and
> assignment operator private. For what end? To make the user's
> life more difficult for no apparent purpose?
It makes the client code simpler, since classes which derive
from this base class no longer have to inhibit copy and
assignment themselves.
I may have missed something, but why do you think that reference
counting doesn't work on cyclic graphs? I am using (intrusive)
reference counting in the C Object System everywhere without
experiencing problem but I may have not yet encountered problematic
cases you are talking about (i.e. it's not a naive question)...
the code handling cyclic references can be seen here, line 234 (quite
simple)
http://cos.cvs.sourceforge.net/viewvc/cos/CosBase/src/AutoRelease.c?view=markup
a+, ld.
Let's say you have a ring of objects referring to each other via
smartpointers. Let's assume that there are also two objects in this ring
which are referenced from outside. When the first reference to the ring
is dropped, the ring should remain intact, when the second is dropped,
then the whole ring should be destroyed. I do not quite see how to
accomplish this without either knowing the number and the order of
deletion of the external references, or without expensive scanning the
structures at each reference drop.
I believe some garbage collection schemes resolve this by postponing the
scan and release until some more convenient point in the future.
Paavo
This is an obvious example where the mentioned code works. As I said,
this is not a naive question.
Did you read the code? Basically it says (e.g. in the smart_ptr
destructor), using _intrusive_ reference counting as mentioned:
if (obj->ref_cnt() > 1) obj->dec_cnt();
else
if (obj->ref_cnt() == 1) { obj->dec_cnt(); delete obj; }
else
;
this approach transforms a cyclic graph into an acyclic graph
automatically from the point of view of the reference counting.
cheers,
ld.
Because it leaks memory.
> I am using (intrusive) reference counting in the C Object
> System everywhere without experiencing problem but I may have
> not yet encountered problematic cases you are talking about
> (i.e. it's not a naive question)...
> the code handling cyclic references can be seen here, line 234
> (quite
> simple)http://cos.cvs.sourceforge.net/viewvc/cos/CosBase/src/AutoRelease.c?v...
I had a quick look, but I'm not familiar with the language it's
written in (it's not C nor C++), so I can't judge. But either
it leaks memory when there are cyclic references, or it's not
simply reference counting. (I saw some references to "pools",
which suggests that it is some sort of restricted garbage
collection, rather than reference counting. But as there was
more than half of it that I didn't understand, who knows.)
[...]
> > Let's say you have a ring of objects referring to each other
> > via smartpointers. Let's assume that there are also two
> > objects in this ring which are referenced from outside. When
> > the first reference to the ring is dropped, the ring should
> > remain intact, when the second is dropped, then the whole
> > ring should be destroyed. I do not quite see how to
> > accomplish this without either knowing the number and the
> > order of deletion of the external references, or without
> > expensive scanning the structures at each reference drop.
> This is an obvious example where the mentioned code works. As
> I said, this is not a naive question.
> Did you read the code? Basically it says (e.g. in the
> smart_ptr destructor), using _intrusive_ reference counting as
> mentioned:
> if (obj->ref_cnt() > 1) obj->dec_cnt();
> else
> if (obj->ref_cnt() == 1) { obj->dec_cnt(); delete obj; }
> else
> ;
> this approach transforms a cyclic graph into an acyclic graph
> automatically from the point of view of the reference
> counting.
How? I don't see where it changes any pointers, so it couldn't
possibly change the structure of the graph. Consider:
struct Object : public RefCntObj
{
typedef RefCntPtr< Object > Ptr ;
Ptr other ;
} ;
void
f()
{
Object::Ptr p1( new Object ) ;
Object::Ptr p2( new Object ) ;
p1->other = p2 ;
p2->other = p1 ;
}
What happens at the end of f?
Off the top of my head: Any container with run-time-determined
life-span containing polymorphic objects?
Schobi
I don't think so:
"I published the code for a reference-counting smart
pointer [...], and despite [...], a small parade of
valid bug reports has trickled in for years."
This sounds more like he's had bugs in his implementation.
> Of the three "difficulties" he explicitly mentions,
Note that this is a proposal by Peter Dimov, beman Dawes,
and Greg Colvin. Scott Meyers' quote is from "Effective
STL".
However, I assume that you're referring to the three
examples given in the proposal...
> the first
> two concern possible exceptions from functions which in a simple
> reference counted pointer are automatically no throw (since they
> only involve incrementing and decrementing integers and
> assigning pointers). [...]
Given a non-intrusive smart pointer with shared ownership
semantics, you need to allocate the counter somewhere. And
using normal 'new' might result in an exception -- in which
case the ctor will leak the object passed in.
I fail to see how this could be due to an over-ambitious
implementation in the boost library.
> [...]
> With regards to the last example, however: reference counted
> pointers are not a panacea. As a general rule, it's probably
> best to avoid using reference counted pointers on objects which
> themselves contain reference counted pointers.
Mhmm. And how are you then going to implement a tree with
reference counting smart pointers?
Schobi
Yes I did, but it seemed to use a lot of macros defined somewhere else,
so it was not clear to me what it was actually doing. The usage of word
'pool' hinted about something else than just smartpointed objects, as
already noticed also by James Kanze, so maybe it deals with cycles.
> Basically it says (e.g. in the smart_ptr
> destructor), using _intrusive_ reference counting as mentioned:
>
> if (obj->ref_cnt() > 1) obj->dec_cnt();
> else
> if (obj->ref_cnt() == 1) { obj->dec_cnt(); delete obj; }
> else
> ;
This is the basic smartpointer behavior AFAICS, which would leak memory
in case of cycles. Note that there may be multiple partially overlapping
cycles, so the node's refcount can be arbitrarily high, and still the
node should be destroyed, together with all of the cycles, when the last
external reference to the cycles has been dropped.
Paavo
> I may have missed something, but why do you think that reference
> counting doesn't work on cyclic graphs? I am using (intrusive)
> reference counting in the C Object System everywhere without
> experiencing problem but I may have not yet encountered problematic
> cases you are talking about (i.e. it's not a naive question)...
The problem with reference counted cyclic structures is that the
reference count never drops below one, so the objects making up the
cycle will never be destroyed.
This can best be shown with an example.
Lets assume we have 4 objects: A, B, C and X. X is a local variable of
some function, while A, B and C are dynamically allocated. All pointers
in the code are reference counted.
We start with the situation that X points to A, A to B, B to C and C to
A. Thus the reference counts for each (dynamically allocated) object is:
A: 2
B: 1
C: 1
Now, X goes out of scope and is destroyed. This causes a reference to A
to be dropped, resulting in these reference counts:
A: 1
B: 1
C: 1
As all objects are still referenced, no further destruction takes place.
BUT, all external references to the cycle have been dropped, so the
objects in the cycle have been leaked.
A decent garbage collector would notice that the cycle is no longer
accessible and collect it, but reference counted pointers can't do that
(unless the cycle was broken artificially).
>
> the code handling cyclic references can be seen here, line 234 (quite
> simple)
>
http://cos.cvs.sourceforge.net/viewvc/cos/CosBase/src/AutoRelease.c?view=markup
That code does not handle the cycle problem described above. It just
ensures that if it starts to collect a cycle, it will destroy each
object at most once.
>
> a+, ld.
Bart v Ingen Schenau
--
a.c.l.l.c-c++ FAQ: http://www.comeaucomputing.com/learn/faq
c.l.c FAQ: http://c-faq.com/
c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/
Ok, I get it ;-) and in particular I understand why I never
encountered the problem. I use reference counting to share ownership
(hence, manual reference counting) and in your example above, you
consider that the all cycle A+B+C has the same status as the elements
A, B, C taken individually. Looks like a design flaw for me. A bit
like considering an array and the elements of the array with the same
status. So while I can share the elements (which may survive the life
of the cycle), I will always have a reference which considers the
entire cycle has an entity, and take care to destroy it.
Thanks for the answer.