
Comments on TR1 smart pointers


Gianni Mariani

Jul 6, 2003, 10:25:24 PM

I would like to discuss:

http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1450.html

I would like to see 3 more features in the libraries regarding smart
pointers, in particular reference counting smart pointers.


a) Returning or passing smart pointers.

A number of issues occur when passing or returning pointers. The most
critical one arises in a tight loop that passes a reference counted
pointer: calling a virtual method to increment and decrement the
reference count can become a significant performance issue, while
passing the raw pointer produces confusing semantics. The best solution
is to create 3 co-operating smart pointer classes. One class is intended
for objects that are semi-persistent; the other 2 are for passing or
returning pointers. Of the 2 pointer-passing templates, one is intended
to imply a policy that the receiver is being passed the responsibility
to release a reference, while the other is intended to imply that no such
responsibility exists. Passing a raw reference counted pointer would
be considered at worst an error or at best a legacy issue.

In practice, this system reduces the errors due to reference counting
leaks/bugs to virtually zero. I have implemented such a scheme where one
of the smart pointer types (the one where the implication is to pass the
obligation to call release) is instrumented and will assert on improper
usage.
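The scheme above can be sketched roughly as follows. This is a minimal,
hypothetical illustration (the names RefPtr, PassRef, and Counted are
invented for this post and come from no real library); it shows how a
"pass the obligation" pointer can adopt a reference without an extra
increment/decrement pair, and can assert when the obligation is dropped:

```cpp
#include <cassert>
#include <utility>

// Counted stands in for an intrusively reference-counted object.
struct Counted {
    inline static int live = 0;      // how many objects are alive (for the demo)
    int count = 0;
    Counted()  { ++live; }
    ~Counted() { --live; }
    void inc() { ++count; }
    bool dec() { return --count == 0; }   // true when the last ref is gone
};

// PassRef carries the obligation to release one reference.
template <class T>
class PassRef {
    T* p_;
public:
    explicit PassRef(T* p) : p_(p) { if (p_) p_->inc(); }
    PassRef(PassRef&& o) : p_(o.p_) { o.p_ = nullptr; }
    PassRef(const PassRef&) = delete;
    T* release() { T* p = p_; p_ = nullptr; return p; }
    // Instrumented: asserts if nobody took over the obligation.
    ~PassRef() { assert(p_ == nullptr && "obligation to release was dropped"); }
};

// RefPtr is the "semi-persistent" owner.
template <class T>
class RefPtr {
    T* p_ = nullptr;
public:
    explicit RefPtr(T* p) : p_(p) { if (p_) p_->inc(); }
    explicit RefPtr(PassRef<T>&& r) : p_(r.release()) {}  // adopts: no inc()
    ~RefPtr() { if (p_ && p_->dec()) delete p_; }
    T* get() const { return p_; }
    RefPtr(const RefPtr&) = delete;
    RefPtr& operator=(const RefPtr&) = delete;
};

int demo() {
    {
        RefPtr<Counted> owner(new Counted);            // count == 1
        PassRef<Counted> handoff(owner.get());         // count == 2, obligation created
        RefPtr<Counted> receiver(std::move(handoff));  // adopted: count stays 2
    }
    return Counted::live;   // everything released: no leak, no double delete
}
```

The key saving is the adopting constructor: handing ownership across a
call boundary costs one increment total, not an increment/decrement per
hop.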


b) Interface for reference counting.

There are a number of different method names/schemes used for managing
reference counts. I suggest that to make the library truly useful, it
would be best to parameterize the reference counting methods through a
template argument. This would enable the use of the smart pointer
templates with systems like MS-COM as well as inc()/dec() methods.
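One way to do this is a traits-style template parameter that names the
increment/decrement scheme. This is a hypothetical sketch (CountTraits,
CountedPtr, and ComLike are invented names; ComLike merely mimics the
shape of a COM interface and is not real COM):

```cpp
// Default traits assume inc()/dec(); specialize for other schemes.
struct ComLike {                      // stand-in for a COM-style interface
    int refs = 1;                     // arrives holding one reference
    void AddRef()  { ++refs; }
    void Release() { --refs; }        // real COM would delete itself at zero
};

struct IncDec {
    int refs = 1;
    void inc() { ++refs; }
    void dec() { --refs; }
};

template <class T>
struct CountTraits {
    static void acquire(T* p) { p->inc(); }
    static void release(T* p) { p->dec(); }
};

template <>
struct CountTraits<ComLike> {         // adapts the AddRef/Release scheme
    static void acquire(ComLike* p) { p->AddRef(); }
    static void release(ComLike* p) { p->Release(); }
};

template <class T, class Traits = CountTraits<T>>
class CountedPtr {
    T* p_;
public:
    explicit CountedPtr(T* p) : p_(p) {}                 // adopts one ref
    CountedPtr(const CountedPtr& o) : p_(o.p_) { Traits::acquire(p_); }
    CountedPtr& operator=(const CountedPtr&) = delete;   // kept minimal
    ~CountedPtr() { Traits::release(p_); }
    T* get() const { return p_; }
};

int demo() {
    ComLike c;
    {
        CountedPtr<ComLike> a(&c);    // adopts the initial reference
        CountedPtr<ComLike> b(a);     // AddRef via the traits
        (void)b;
    }                                 // two Release calls on scope exit
    return c.refs;                    // back to 0
}
```

Because the traits argument is defaulted, everyday users never see it;
only code adapting a pre-existing counting interface spells it out.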


c) Reference to pointer within.

There are older interfaces that expect a pointer to a pointer to the
object to be returned. It would be most advantageous to have a `safe' way
to access the inner pointer.


I have a working implementation of a smart pointer reference counting
library that implements all of these features. I would be more than
happy to submit it as an alternative smart pointer system for consideration.

---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]

Richard Smith

Jul 8, 2003, 2:27:26 PM
Gianni Mariani wrote:
>
> I would like to discuss :
>
> http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1450.html

[...]

> The calling of a virtual method to increment and decrement the
> reference count can become a significant performance issue.

Maybe I'm being unobservant, but can you point me to a bit of N1450 that
requires virtual functions for incrementing and decrementing the reference
count (or equivalent)? Certainly, the Boost 1.30 implementation does not
do a virtual function call in these circumstances. The boost::shared_ptr
copy constructor uses the implicitly generated copy constructor, which calls
down to boost::details::shared_count's copy constructor, which in turn calls
boost::details::sp_counted_base::add_ref, which is not virtual and ought
to be trivially inlineable. A similar call sequence happens in the
destructor, which only calls the virtual dispose function when the reference
count reaches zero.

The only use of virtual functions, or calls via function pointers, etc. in
the Boost implementation is in the handling of the deletion function object,
which is (I believe) unavoidable if the class is to be usable on an
incomplete type. Also, in my experience, the cost of a virtual function
call is relatively small compared to the cost of deleting an object.

> The best solution is to
> create 3 co-operating smart pointer classes.

You've yet to convince me there's a problem at all, and certainly not one
that would justify adding two new smart pointer types.

> One class is intended to
> be for objects that are semi-persistent, the other 2 are for passing or
> returning pointers. Of the 2 pointer passing templates one is intended
> to imply a policy that the receiver is being passed the responsibility
> to release a reference

This is what std::auto_ptr does when passed by value.

> while the other is intended to imply that no such
> responsibility exists.

And this is what passing the raw object by const reference is for.
Alternatively, passing shared_ptr by const reference (or even by value)
would work.
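The two conventions can be sketched side by side. Since std::auto_ptr is
long deprecated, this hedged illustration uses std::unique_ptr (its modern
successor) for the "receiver must release" case, and a const reference to
std::shared_ptr for the "no responsibility" case; the function names are
invented for the example:

```cpp
#include <memory>
#include <utility>

static int observed = 0;

// Pass-by-value transfer: the callee now owns the pointee and
// releases it when the parameter goes out of scope.
void sink(std::unique_ptr<int> p) {
    observed += *p;
}   // pointee destroyed here

// Const reference: no ownership change and no reference-count traffic.
void observe(const std::shared_ptr<int>& p) {
    observed += *p;
}

int demo() {
    auto u = std::make_unique<int>(1);
    sink(std::move(u));                    // obligation handed over explicitly
    auto s = std::make_shared<int>(2);
    observe(s);                            // s remains valid and owned here
    return observed + int(s.use_count());  // 1 + 2, plus s's count of 1
}
```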

> In practice, this system reduces the errors due to reference counting
> leaks/bugs to virtually zero.

Could you please give an example of the sort of leak you hope to avoid? In
my experience with boost::shared_ptr, the main source of memory leaks is
when you have cyclic references. After that, I think I can honestly say, I
get more memory leaks due to compiler bugs than due to misusing
boost::shared_ptr.
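The cyclic-reference leak mentioned above, and its usual fix, can be shown
in a few lines. This sketch uses std::shared_ptr and std::weak_ptr as
stand-ins for the Boost templates:

```cpp
#include <memory>

struct Node {
    std::shared_ptr<Node> next;   // strong link: would create a cycle
    std::weak_ptr<Node> prev;     // weak link: does not keep its target alive
    inline static int live = 0;   // alive-object counter for the demo
    Node()  { ++live; }
    ~Node() { --live; }
};

int demo() {
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;              // strong a -> b
        b->prev = a;              // weak b -> a: no cycle of owners
    }
    return Node::live;            // both nodes destroyed
}
```

Had `b` held `a` through a second shared_ptr, the two counts would never
reach zero and both nodes would leak; the weak link breaks the cycle.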

> b) Interface for reference counting.
>
> There are a number of different methods names/schemes used for managing
> reference counts. I suggest that to make the library truly useful, it
> would be best to parameterize the reference counting methods through a
> template argument. This would enable the use of the smart pointer
> templates with systems like MS-COM as well as inc()/dec() methods.

One of the design principles given in this documentation is [III.A.3]:

| No Extra Parameters
|
| Following the "as close as possible" principle, the proposed smart
| pointers have a single template parameter, the type of the pointee.
| Avoiding additional parameters ensures interoperability between
| libraries from different authors, and also makes shared_ptr easier to
| use, teach and recommend.

Although I'm not overly familiar with COM, my understanding is that by using
a suitable deletion function, a shared_ptr can be made to hold a
COM object. Presumably the game is to get the deletion function to call
obj->Release().

This avoids any need to specify the functions to increment and decrement the
reference counts.
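Concretely, the deleter approach looks something like the following. This
is a hedged sketch: FakeUnknown only mimics the shape of a COM interface
(the real IUnknown methods have different signatures and the object
deletes itself when its count hits zero), and std::shared_ptr stands in
for boost::shared_ptr:

```cpp
#include <memory>

struct FakeUnknown {
    int refs = 1;                        // object arrives holding one ref
    void AddRef()  { ++refs; }
    void Release() { --refs; }           // real COM deletes itself at zero
};

int demo() {
    FakeUnknown obj;                     // would come from a COM factory
    {
        // The deleter calls Release() exactly once, when the last
        // shared_ptr copy is destroyed.
        std::shared_ptr<FakeUnknown> p(
            &obj, [](FakeUnknown* u) { u->Release(); });
        std::shared_ptr<FakeUnknown> q = p;   // only shared_ptr's own
        std::shared_ptr<FakeUnknown> r = q;   // count changes here
        (void)r;
    }                                    // one Release() on scope exit
    return obj.refs;                     // the single COM ref was dropped
}
```

All the intermediate copies touch only shared_ptr's own (inlinable)
count; the virtual Release() is paid for once.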

> c) Reference to pointer within.
>
> There are older interfaces that expect a pointer to pointer to the
> object to be returned. It would be most advantageous for a `safe' way
> to access the inner pointer.

shared_ptr<T> my_ptr;

{
    T *tmp = my_ptr.get();
    legacy_function(&tmp);

    // Depending on legacy_function's semantics, you might need to
    assert(tmp == my_ptr.get());

    // or
    if (tmp != my_ptr.get())
        my_ptr.reset(tmp);

    // or even
    if (tmp != my_ptr.get()) {
        *my_ptr = *tmp;
        delete tmp;
    }
}

> I have a working implementation of a smart pointer reference counting
> library that implements all of these features. I would be more than
> happy to submit it as an alternative smart pointer system for
> consideration.

I think a more constructive course of action would be to explain precisely
which bits of the current proposal you are unhappy with and why, and how
exactly you would modify them. Submitting an entirely new alternative smart
pointer proposal is unlikely to be productive, especially when the existing
proposal is as popular and heavily used as the Boost one.

--
Richard Smith

Gianni Mariani

Jul 8, 2003, 5:55:31 PM
Richard Smith wrote:
> Gianni Mariani wrote:
>
>>I would like to discuss :
>>
>>http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1450.html
>
>
> [...]
>
>
>>The calling of a virtual method to increment and decrement the
>>reference count can become a significant performance issue.
>
>
> Maybe I'm being unobservant, but can you point me to a bit of N1450 that
> requires virtual functions for incrementing and decrementing the reference
> count (or equivalent)? Certainly, the Boost 1.30 implementation does not
> do a virtual function call in these circumstances. The boost::shared_ptr
> copy constructor uses the implictly generated copy constructor, which calls
> down to boost::details::shared_count's copy constructor, which in turn calls
> boost::details::sp_counted_base::add_ref, which is not virtual and all ought
> to be trivially inlineable. A similar call sequence happens in the
> destructor, which only calls the virtual dispose function when the reference
> count reaches zero.

OK - we might be on different planets here but if we do this to COM
pointers then they are virtual and can't be trivially inlineable.

It makes little sense to make the argument that on one end they are
trivially inlineable and on the other end to say they work with COM
which can't be trivially inlineable.


>
> The only use of virtual functions, or calls via function pointers, etc. in
> the Boost implementation is in the handling of the deletion function object,
> and which is (I believe) unavoidable if the class is to be useable on an
> incomplete type. Also, in my experience, the cost of a virtual function
> call is relatively small compared to the cost of deleting an object.

Small but potentially significant. One of the underlying philosophies
in C++ is that you don't pay for what you don't need. The boost
reference counted model seems (correct me if I am wrong) to require
incrementing and decrementing reference counts when reference counted
objects are trivially passed or returned. I suggest that this directly
contradicts the philosophy stated.

>
>
>>The best solution is to
>>create 3 co-operating smart pointer classes.
>
>
> You've yet to convince me there's a problem at all, and certainly not one
> that would justify adding two new smart pointer types.
>
>
>> One class is intended to
>>be for objects that are semi-persistent, the other 2 are for passing or
>>returning pointers. Of the 2 pointer passing templates one is intended
>>to imply a policy that the receiver is being passed the responsibility
>>to release a reference
>
>
> This is what std::auto_ptr does when passed by value.

I don't get what auto_ptr has to do with this discussion. Please elaborate.

>
>
>>while the other is intended to imply that no such
>>responsibility exists.
>
>
> And this is what passing the raw object by const reference is for.
> Alternatively, passing shared_ptr by const reference (or even by value)
> would work.

But passing a const pointer has nothing to do with policies regarding
obligation for decrementing reference count.

>
>
>>In practice, this system reduces the errors due to reference counting
>>leaks/bugs to virtually zero.
>
>
> Could you please give an example of the sort of leak you hope to avoid?

Yes: when a reference counted pointer is passed along with the obligation
to decrement the reference count, and no decrement is done by the callee
(or similarly when returning a reference counted pointer).

> In my experience with boost::shared_ptr, the main source of memory leaks is
> when you have cyclic references. After that, I think I can honestly say, I
> get more memory leaks due to compiler bugs than due to misusing
> boost::shared_ptr.
>

Yes, and multiple useless calls to reference counting methods.

In tight loops you want to avoid this, and if you are limited to using
raw pointers you greatly diminish the value of what the smart pointers
are doing for you.

>
>>b) Interface for reference counting.
>>
>>There are a number of different methods names/schemes used for managing
>>reference counts. I suggest that to make the library truly useful, it
>>would be best to parameterize the reference counting methods through a
>>template argument. This would enable the use of the smart pointer
>>templates with systems like MS-COM as well as inc()/dec() methods.
>
>
> One of the design principles given in this documentation is [III.A.3]:
>
> | No Extra Parameters
> |
> | Following the "as close as possible" principle, the proposed smart
> | pointers have a single template parameter, the type of the pointee.
> | Avoiding additional parameters ensures interoperability between
> | libraries from different authors, and also makes shared_ptr easier to
> | use, teach and recommend.
>
> Although I'm not overly familiar with COM, my understanding is that by using
> a suitable deletion function, a shared_ptr can be made to hold a
> COM object. Presumably the game is to get the deletion function to call
> obj->Release().
>
> This avoids any need to specify the functions to increment and decrement the
> reference counts.

I don't get it OR this is just plain FUD. You can certainly make it so
the template has default parameter values, and hence you get the benefit
described above, yet it is trivial to use where you're applying these
templates to reference counting interfaces that pre-exist (like COM).

Why limit the utility of the smart pointer classes?

>
>
>>c) Reference to pointer within.
>>
>>There are older interfaces that expect a pointer to pointer to the
>>object to be returned. It would be most advantageous for a `safe' way
>>to access the inner pointer.
>
>
> shared_ptr<T> my_ptr;
>
> {
>     T *tmp = my_ptr.get();
>     legacy_function(&tmp);
>
>     // Depending on legacy_function's semantics, you might need to
>     assert(tmp == my_ptr.get());
>
>     // or
>     if (tmp != my_ptr.get())
>         my_ptr.reset(tmp);
>
>     // or even
>     if (tmp != my_ptr.get()) {
>         *my_ptr = *tmp;
>         delete tmp;
>     }
> }
>
>

I think we're talking about different things.

>
>>I have a working implementation of a smart pointer reference counting
>>library that implements all of these features. I would be more than
>>happy to submit it as an alternative smart pointer system for
>
> consideration.
>
> I think a more constructive course of action would be to explain precisely
> which bits of the current proposal you are unhappy with and why, and how
> exactly you would modify them. Submitting an entirely new alternative smart
> pointer proposal is unlikely to be productive, especially when the existing
> proposal is as popular and heavily used as the Boost one.

Before I go and invest my time doing such a thing, I would prefer that
this discussion have some interest.

Just like std::auto_ptr, I think there is a danger that C++ developers
will find significant issues with the boost shared pointers if a more
hearty discussion does not occur.

In simple terms, I'm saying "I have some ideas that may make sense; if
you want me to explain them further, then show me you're interested in
discussing them".

So far, it seems like people (other than yourself) care very little.

G

Peter Dimov

Jul 8, 2003, 10:43:09 PM
gi2n...@mariani.ws (Gianni Mariani) wrote in message news:<beaj87$c...@dispatch.concentric.net>...

> I would like to discuss :
>
> http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1450.html
>
> I would like to see 3 more features in the libraries regarding smart
> pointers, in particular reference counting smart pointers.
>
>
> a) Returning or passing smart pointers.
>
> A number of issues occur when passing or returning pointers. The most
> critical one is when in a tight loop and passing a reference counted
> pointer. The calling of a virtual method to increment and decrement the
> reference count can become a significant performance issue.

There is no requirement that reference count updates need to be
implemented with virtual member functions.

> Passing the raw pointer produces confusing semantics.

Not necessarily. A shared_ptr parameter implies that the function
needs to be able to take ownership. If this is not the case, a
reference or a raw pointer (depending on whether NULL is a valid
argument) is more appropriate. Otherwise, passing shared_ptr by const
reference may eliminate the copy.

David Abrahams

Jul 8, 2003, 10:45:48 PM
gi2n...@mariani.ws (Gianni Mariani) writes:

> Richard Smith wrote:
>> Gianni Mariani wrote:
>>
>>>I would like to discuss :
>>>
>>>http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1450.html
>> [...]
>>
>>>The calling of a virtual method to increment and decrement the
>>>reference count can become a significant performance issue.
>>
>> Maybe I'm being unobservant, but can you point me to a bit of N1450
>> that
>> requires virtual functions for incrementing and decrementing the reference
>> count (or equivalent)? Certainly, the Boost 1.30 implementation does not
>> do a virtual function call in these circumstances. The boost::shared_ptr
>> copy constructor uses the implicitly generated copy constructor, which calls
>> down to boost::details::shared_count's copy constructor, which in turn calls
>> boost::details::sp_counted_base::add_ref, which is not virtual and all ought
>> to be trivially inlineable. A similar call sequence happens in the
>> destructor, which only calls the virtual dispose function when the reference
>> count reaches zero.
>
> OK - we might be on different planets here but if we do this to COM
> pointers then they are virtual and can't be trivially inlineable.
>
> It makes little sense to make the argument that on one end they are
> trivially inlineable and on the other end to say they work with COM
> which can't be trivially inlineable.

This seems to be a perfect argument for the Boost model. What great
advantage is there in inlining all of a shared_ptr destructor which
calls a non-inlinable COM destructor?

>> The only use of virtual functions, or calls via function pointers,
>> etc. in the Boost implementation is in the handling of the deletion
>> function object, and which is (I believe) unavoidable if the class
>> is to be useable on an incomplete type. Also, in my experience,
>> the cost of a virtual function call is relatively small compared to
>> the cost of deleting an object.
>
> Small but potentially significant.

Have you got an application where you've measured it to be
significant?

> One of the underlying philosophies in C++ is that you don't pay for
> what you don't need. The boost reference counted model seems
> (correct me if I am wrong) to require incrementing and decrementing
> reference counts when reference counted objects are trivially passed
> or returned.

Only when passed. The compiler is free to apply the RVO in most
situations where a shared_ptr is returned.

> I suggest that this is directly contradicting the philosophy stated.

It's the only safe possibility other than (arguably) auto_ptr, given
the current core language definition. If you need to not pay for
reference counting when objects are passed, you should use an auto_ptr
or a raw pointer (or show us a better alternative).

>>>while the other is intended to imply that no such responsibility
>>>exists.
>>
>> And this is what passing the raw object by const reference is for.
>> Alternatively, passing shared_ptr by const reference (or even by value)
>> would work.
>
> But passing a const pointer has nothing to do with policies regarding
> obligation for decrementing reference count.

If I write:

foo(shared_ptr<T> const&);

Then I can pass any shared_ptr<T> to foo without changing its
reference count. There is no obligation to change its reference
count.

>>>In practice, this system reduces the errors due to reference counting
>>>leaks/bugs to virtually zero.

You don't need these reductions with boost::shared_ptr because the
errors you claim to be preventing don't occur with boost::shared_ptr.

>> Could you please give an example of the sort of leak you hope to
>> avoid?
>
> Yep, when passing a reference counted pointer and the obligation to
> decrement the reference count and no decrement is done by the callee
> (or similarly when returning a reference counted pointer).

Those sorts of problems are the result of your "reference counted
pointer" model which automates too little.

>> In my experience with boost::shared_ptr, the main source of memory
>> leaks is when you have cyclic references. After that, I think I
>> can honestly say, I get more memory leaks due to compiler bugs than
>> due to misusing boost::shared_ptr.
>>
>
> Yes, and multiple useless calls to reference counting methods.
>
> In tight loops you want to avoid this and if you are limited to using
> raw pointers you diminish greatly the value of what the smart pointers
> are doing for you.

Likewise if you have to keep track of when a shared_ptr's pointee may
have become invalid, which is essentially what your proposal seems to
require. After you pass it to a function which takes one of your
reference-stealing pointers as a parameter, you can't use it anymore.

>>>b) Interface for reference counting.
>>>
>>>There are a number of different methods names/schemes used for managing
>>>reference counts. I suggest that to make the library truly useful, it
>>>would be best to parameterize the reference counting methods through a
>>>template argument. This would enable the use of the smart pointer
>>>templates with systems like MS-COM as well as inc()/dec() methods.
>>
>> One of the design principles given in this documentation is
>> [III.A.3]:
>> | No Extra Parameters
>> |
>> | Following the "as close as possible" principle, the proposed smart
>> | pointers have a single template parameter, the type of the pointee.
>> | Avoiding additional parameters ensures interoperability between
>> | libraries from different authors, and also makes shared_ptr easier to
>> | use, teach and recommend.
>>
>> Although I'm not overly familiar with COM, my understanding is that
>> by using a suitable deletion function, a shared_ptr can be made
>> to hold a COM object. Presumably the game is to get the
>> deletion function to call
>> obj->Release().
>> This avoids any need to specify the functions to increment and
>> decrement the reference counts.
>
> I don't get it OR this is just plain FUD. You can certainly make it
> so the template has default parameter values and hence you get the
> benefit described above

No you don't. If the parameter is there people will use it and you
will have many different types of smart pointer to the same T, which
don't interoperate. An extra parameter also impairs learnability.

> yet it is trivial to use where you're applying these templates to
> reference counting interfaces that pre-exist (like COM).
>
> Why limit the utility of the smart pointer classes?

There's no limitation; you don't need to use COM's slow refcounting
when you can use shared_ptr's fast refcounting.

>>>c) Reference to pointer within.
>>>
>>>There are older interfaces that expect a pointer to pointer to the
>>>object to be returned. It would be most advantageous for a `safe' way
>>>to access the inner pointer.

Ouch. The only safe pointer to the contained T* of a shared_ptr is a
T*const*. What legacy interface benefits from providing that?

>>>I have a working implementation of a smart pointer reference
>>>counting library that implements all of these features. I would be
>>>more than happy to submit it as an alternative smart pointer system
>>>for consideration.
>>
>> I think a more constructive course of action would be to explain
>> precisely which bits of the current proposal you are unhappy with
>> and why, and how exactly you would modify them. Submitting an
>> entirely new alternative smart pointer proposal is unlikely to be
>> productive, especially when the existing proposal is as popular and
>> heavily used as the Boost one.

And already accepted into the TR.

> Before I go and invest my time doing such a thing, I would prefer that
> this discussion have some interest.

You've got several responses now.

> Just like std::auto_ptr, I think there is danger that c++ developers
> will find significant issues with the boost shared pointers if a
> more hearty discussion does not occur.

Let's have it, then!

> In simple terms, I'm saying "I have some ideas that may make sense,
> if you want me to explain it further, then show me you're interested
> in discussing it".
>
> So far, it seems like people (other than yourself) care very little.

I think you mistake strong disagreement for disinterest.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Howard Hinnant

Jul 8, 2003, 10:46:28 PM
In article <bef891$c...@dispatch.concentric.net>, Gianni Mariani
<gi2n...@mariani.ws> wrote:

| So far, it seems like people (other than yourself) care very little.

Fwiw, I care a lot.

I'm intimately familiar with tr1::shared_ptr, having independently
implemented it. After reading your post, it seemed like you had not
looked closely at shared_ptr, and there was no reference to enable me
to look more closely at your design.

My day is simply too full to drop everything every time somebody says:
I've got a better idea, want to know more?

It literally took me 6 months to schedule the two weeks it took to
crawl inside of tr1::shared_ptr enough to feel like I understood it.
I'm probably slow, but that's what it takes me.

If you've got a better idea, and if you feel strongly enough about it,
post it somewhere so we can study it at our convenience. And since
std::tr1::shared_ptr is already there, it would be to your advantage to
compare and contrast with it (with code examples!) in order to shorten
your readers' learning curve.

--
Howard Hinnant
Metrowerks

Gianni Mariani

Jul 9, 2003, 2:41:35 PM
Howard Hinnant wrote:
> In article <bef891$c...@dispatch.concentric.net>, Gianni Mariani
> <gi2n...@mariani.ws> wrote:
>
> | So far, it seems like people (other than yourself) care very little.
>
> Fwiw, I care a lot.
>
> I'm intimately familiar with the tr1::shared_ptr having independently
> implemented it. After reading your post, it seemed like you had not
> looked closely at shared_ptr, and there was no reference to enable me
> to look more closely at your design.

See attachment.

>
> My day is simply too full to drop everything everytime somebody says:
> I've got a better idea, want to know more?

A hint of arrogance I hear?

>
> It literally took me 6 months to schedule the two weeks it took to
> crawl inside of tr1::shared_ptr enough to feel like I understood it.
> I'm probably slow, but that's what it takes me.

>
> If you've got a better idea, and if you feel strongly enough about it,
> post it somewhere so we can study it at our convenience. And since
> std::tr1::shared_ptr is already there, it would be to your advantage to
> compare and contrast with it (with code examples!) in order to shorten
> your reader's learning curve.
>

I have attached the header that defines the templates. If you're
interested, I'll make an alpha release of the entire library as it
stands soon.

[Attachment: at_lifetime.h]

Dave Harris

Jul 9, 2003, 2:41:52 PM
gi2n...@mariani.ws (Gianni Mariani) wrote (abridged):

> OK - we might be on different planets here but if we do this to COM
> pointers then they are virtual and can't be trivially inlineable.
>
> It makes little sense to make the argument that on one end they are
> trivially inlineable and on the other end to say they work with COM
> which can't be trivially inlineable.

There are two counts. When boost::shared_ptr wraps a COM object, it
allocates a second count, and does all its increments and decrements to
that - inline. The original COM count is only decremented (out of line)
when the boost count hits zero.

In fact, boost always allocates its own count. So you pay the time and
space costs of the allocation, even if you are wrapping a COM object, or
an object with an intrusive count, which has no need of it. This is one of
the main downsides of boost::shared_ptr, in my view.

If you're not familiar with the boost pointer, it is worth reading:
http://www.boost.org/libs/smart_ptr/sp_techniques.html

> The boost reference counted model seems (correct me if I am wrong)
> to require incrementing and decrementing reference counts when
> reference counted objects are trivially passed or returned.

I agree it would be nice if cases like:

    {
        boost::shared_ptr<int> p( new int );
        func( p );
    }

could be optimised so that p's count is never more than 1. As far as I can
tell, the main things preventing this are the use_count() and unique()
members. Without these, many increments and decrements could be omitted
under the "as if" rule. I imagine this was not an issue when shared_ptr
was a standalone library, but now that it is becoming part of the standard
perhaps the trade-offs should be reconsidered.
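The obstacle is that use_count() makes the "redundant" copy observable,
so the library cannot silently elide it. A small sketch (std::shared_ptr
standing in for boost::shared_ptr):

```cpp
#include <memory>

int demo() {
    auto p = std::make_shared<int>(7);
    auto q = p;                      // a copy a compiler might like to elide
    long during = q.use_count();     // 2 -- observable, so the increment
                                     // cannot simply be optimised away
    q.reset();
    long after = p.use_count();      // back to 1
    return int(during * 10 + after); // encodes both observations
}
```

Without use_count() and unique() in the interface, the increment and
decrement around `q` would be unobservable and eligible for removal
under the "as if" rule.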

You seem to be advocating a kind of manual optimisation - using a
different kind of pointer when passing or returning. That sounds
cumbersome and error-prone to me. I'd rather the optimisation be done
automatically, by the compiler, when it can prove it is safe.


-- Dave Harris, Nottingham, UK

David Abrahams

Jul 9, 2003, 5:25:27 PM
gi2n...@mariani.ws (Gianni Mariani) writes:

> /**
>  * MadeTransfer()
>  * --------------
>  * This method sets a member variable indicating that
>  * ownership of the pointee has been successfully
>  * transferred from this AT_LifeLine to some persistent
>  * pointer.
>  */
>
> inline void MadeTransfer() const
> {
>     Assert( ! m_hasBeenTransferred.m_value );
>     m_hasBeenTransferred.m_value = true;
> }
>

Why is setting m_hasBeenTransferred to true better than incrementing
a reference count? It doesn't seem like you can achieve much of a
savings here.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Gianni Mariani

Jul 9, 2003, 5:58:05 PM
Dave Harris wrote:

>
> You seem to be advocating a kind of manual optimisation - using a
> different kind of pointer when passing or returning. That sounds
> cumbersome and error-prone to me. I'd rather the optimisation be done
> automatically, by the compiler, when it can prove it is safe.

It may "sound" cumbersome but it actually models programmer intentions
cleanly. Hence, I think you'd be surprised as to how simple it is.

Pete Becker

Jul 9, 2003, 8:41:42 PM
Howard Hinnant wrote:
>
> It literally took me 6 months to schedule the two weeks it took to
> crawl inside of tr1::shared_ptr enough to feel like I understood it.
> I'm probably slow, but that's what it takes me.
>

No, not really slow. I'm figuring one to two weeks (not full time,
though). But not until I've finished random number generators. <g>

--
To delight in war is a merit in the soldier,
a dangerous quality in the captain, and a
positive crime in the statesman.
George Santayana

Richard Smith

Jul 9, 2003, 11:48:36 PM
Gianni Mariani wrote:
> Richard Smith wrote:
> > Gianni Mariani wrote:

[...]

> > Maybe I'm being unobservant, but can you point me to a bit of N1450 that
> > requires virtual functions for incrementing and decrementing the reference
> > count (or equivalent)? Certainly, the Boost 1.30 implementation does not


>
> OK - we might be on different planets here but if we do this to COM
> pointers then they are virtual and can't be trivially inlineable.

If you mean that IUnknown::AddRef and IUnknown::Release are virtual and
can't be inlined, then you're quite correct. Boost's reference counting
mechanism does not use this -- it has its own independent reference
count. Only when Boost's reference count drops to zero does it (virtually)
call the custom deletion function, which will call IUnknown::Release. Thus
you can copy boost::shared_ptr's around as much as you want, and you will
not incur the overhead of calls to IUnknown::Release until the object is
finally destroyed. The only overhead is from Boost's own reference counting
mechanism, which is much more efficient. (And with a bit of hand-crafted
assembler to do the atomic incrementing, etc. the mutex in the 1.30
implementation could be removed, making it even more so. Or perhaps I'm
missing something?)

> It makes little sense to make the argument that on one end they are
> trivially inlineable and on the other end to say they work with COM
> which can't be trivially inlineable.

See above -- it makes perfect sense to me. Irrespective of whether or not
you're using COM, everything other than the last destructor call (the one
that returns the reference count to zero) should be trivially inlineable.

> Small but potentially significant.

As David Abrahams asked, "have you got an application where you've measured
it to be significant"? I've just knocked together a test program that
demonstrates (I think) that it is not significant. Here it is, minus
#includes, etc.:

struct X { double w,x,y,z; };
void foo( boost::shared_ptr<X> ) {} // Doesn't get inlined

int main() {
    timer t; // my own timer class
    for ( int i(0), n(int(1E6)); i<n; ++i ) {
        boost::shared_ptr<X> x(new X);
        foo(x); foo(x);
    }
    std::cout << std::setprecision(3) << t.get_time() << "s\n";
}

(Note that I've included a call to new in the inner loop. This is justified
as a real-world inner loop is unlikely to decrement many reference counts to
zero without there being calls to new in the loop.)

I ran and compiled this on Linux as follows:

[richard@verdi test]$ g++-3.2.1 smptr.cpp -O2 -g -osmptr \
-I /usr/local/include/boost_1_30/
[richard@verdi test]$ N=20; for ((i=0;$i<$N;++i)); do \
./smptr; done | awk '{s+=$1} END{printf("%.3f\n", s/'$N')}'
0.961

Then I modified the boost/details/shared_count.hpp header so that instead of
virtually calling dispose and then destruct, it static casts up to the
relevant type and calls the functions non-virtually. (I'm happy to send the
relevant diff to anyone interested.) Compiling and running again gives a
figure of 0.914. This represents about a 5% speed improvement and seems to
be reproducible. Is this really a big enough improvement to justify
losing the ability to specify arbitrary deletion functions (needed, for
example, to support COM objects), and losing the ability to (usefully)
instantiate the class on incomplete types? In my opinion it is not.

Also bear in mind that the body of my inner loop was relatively trivial --
two uninlined calls to foo, with associated incrementing and decrementing of
the reference counts. A more realistic example (e.g. one where foo does
something) will reduce the size of the improvement.

> The boost
> reference counted model seems (correct me if I am wrong) to require
> incrementing and decrementing reference counts when reference counted
> objects are trivially passed or returned.

Don't pass by value -- pass by const reference. In many circumstances, the
compiler is allowed to make use of various return value optimisations to
remove the copy constructor calls when objects are returned. Together these
can help remove spurious reference increments and decrements.

> >>Of the 2 pointer passing templates one is intended
> >>to imply a policy that the receiver is being passed the responsibility
> >>to release a reference
> >
> >
> > This is what std::auto_ptr does when passed by value.
>
> I don't get what auto_ptr has to do with this discussion. Please
> elaborate.

The std::auto_ptr class was designed to allow ownership to be passed to or
returned from functions. It does not incur the overhead of reference
counting, and I've never seen an implementation that uses any virtual
functions. If this is what you want, you should be using auto_ptr (or
something else) for the job. The shared_ptr class is not, and isn't
intended as, a panacea. Use it when it is appropriate, but there will
always be circumstances where better choices are available.

> > And this is what passing the raw object by const reference is for.
> > Alternatively, passing shared_ptr by const reference (or even by value)
> > would work.
>
> But passing a const pointer has nothing to do with policies regarding
> obligation for decrementing reference count.

If you don't want the function being called to do anything with the
reference count, and you know the function doesn't need to store the object,
then you don't have to pass it as a reference counted object. Pass it by
raw pointer, or raw const reference, or even const reference to a
shared_ptr. None of these will cause any additional incrementing or
decrementing of the reference count. Personally, I usually pass objects by
const reference to shared_ptr. This gives the called function the
flexibility to retain a copy if necessary (and thus increment the reference
count), but avoids modifying the reference count if it doesn't need to.
I've usually found this to be a good solution.

> > Could you please give an example of the sort of leak you hope to avoid?
>
> Yep, when passing a reference counted pointer and the obligation to
> decrement the reference count and no decrement is done by the callee (or
> similary when returning a reference counted pointer).

Could you actually give an example (i.e. C++ code) that causes such a leak
using boost::shared_ptr? I don't see how it is possible without willfully
abusing the pointer's interface, in which case you deserve everything you
get.

> > In my experience with boost::shared_ptr, the main source of memory leaks
> > is when you have cyclic references. After that, I think I can honestly
> > say, I get more memory leaks due to compiler bugs than due to misusing
> > boost::shared_ptr.
> >
>
> Yes, and multiple useless calls to reference counting methods.

Well written code and a good compiler (i.e. one with good NRVO support) can
avoid a great many unnecessary increments and decrements. And as I've said,
with a good smart_ptr implementation the reference counting functions can be
very lightweight -- they do *not* routinely result in virtual function
calls.

> In tight loops you want to avoid this and if you are limited to using
> raw pointers you diminish greatly the value of what the smart pointers
> are doing for you.

There is always a trade off between efficiency and elegance / safety /
genericity. In a tight inner loop, you may want to consider using a
different type of pointer, such as boost::scoped_ptr. (It's a shame this
was removed from the proposed addition to the standard, although a const
std::auto_ptr will do the same job.) Perhaps functions called by the inner
loop should be taking the underlying object either by const reference or by
pointer.

> > Although I'm not overly familiar with COM, my understanding is that by
> > using a suitable deletion function, a shared_ptr can be made to hold a
> > COM object. Presumably the game is to get the deletion function to call
> > obj->Release().
> >
> > This avoids any need to specify the functions to increment and decrement
> > the reference counts.
>
> I don't get it OR this is just plain FUD. You can certainly make it so
> the template has default parameter values and hence you get the benefit
> described above yet it is trivial to use where you're applying these
> templates to reference counting interfaces that pre-exist (like COM).

I didn't say it wasn't possible to add a template (with default argument),
merely that it wasn't desirable. David Abrahams has already explained why,
although I'd perhaps add that with the current template template parameter
semantics (and yes, people do use these) an additional template parameter
with a default argument is not invisible.

> Why limit the utility of the smart pointer classes?

I don't believe it is.

> >>c) Reference to pointer within.
> >>
> >>There are older interfaces that expect a pointer to pointer to the
> >>object to be returned. It would be most advantageous for a `safe' way
> >>to access the inner pointer.

[...]

> I think we're talking about different things.

Perhaps you'd care to elaborate what you're talking about, then? Can you
give an example (including C++ code) of where you would want to extract a T**
pointer, and what semantics you would want it to have? In particular,

- what happens when the value of the underlying pointer (i.e. the
T*) is changed?

- what happens to the ownership of the pointer, and how is the
reference count affected?

- what is the lifetime of the T**?

> Before I go and invest my time doing such a thing, I would prefer that
> this discussion have some interest.

It already has -- you've had replies from Peter Dimov, David Abrahams and
Howard Hinnant -- all extremely highly respected members of the C++
community.

> In simple terms, I'm saying "I have some ideas that may make sense, if
> you want me to explain it further, then show me you're interested in
> discussing it".

I'm very interested. If there really are issues that could be addressed,
then I would be very interested to know exactly what they are, and how they
could be addressed.

> So far, it seems like people (other than yourself) care very little.

I think people are just skeptical rather than uninterested. If you can
produce some realistic examples (i.e. including C++ code) where the issues
you mention manifest, I think people will pay attention.

--
Richard Smith

Peter Dimov
Jul 9, 2003, 11:50:33 PM
bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20030709...@brangdon.madasafish.com>...

>
> I agree it would be nice if cases like:
>
> {
>     boost::shared_ptr<int> p( new int );
>     func( p );
> }
>
> could be optimised so that p's count is never more than 1.

If your 'func' above is

void func(shared_ptr<int> q);

and not

void func(shared_ptr<int> const & q);

then the optimization you want is not possible. Consider:

void func(shared_ptr<int> q)
{
    // side effect 1
    q.reset(); // potential side effect 2
    // side effect 3
}

Whether q is unique or not is detectable. The compiler is not allowed
to destroy p before func returns, regardless of the presence or
absence of use_count().

Richard Smith
Jul 10, 2003, 12:21:54 PM
Gianni Mariani wrote:

> Howard Hinnant wrote:
> >
> > My day is simply too full to drop everything everytime somebody says:
> > I've got a better idea, want to know more?
>
> Hint of arrogance I hear ?

I doubt it; more likely a statement of fact.

> I have attached the header that defines the templates. If you're
> interested I'll make an alpha release of the entire library soon as it
> stands soon.

Could I draw your attention to Howard Hinnant's comment "it
would be to your advantage to compare and contrast with it
(with code examples!) in order to shorten your reader's
learning curve". I think you would have been wise to have
followed it. I suspect you'll find relatively few people
will be prepared to wade through a 90k post in order to
separate the germane differences from the TR1 shared_ptr
from differences in style or syntax.

I've made some detailed comments on your suggestion, below.
For those who don't want to wade through them, here's a
summary:

1. If you want it to be considered instead of
boost::shared_ptr for C++0x, then write it in a way
that looks and feels like the rest of the STL. In
particular, where possible make expressions that work
on std::auto_ptr have the same meaning on your
pointers.

2. You seem to be advocating an intrusive shared_ptr
implementation to the exclusion of any possible
externally reference counted implementation. I've
nothing against an intrusive implementation -- indeed
I think it would be a good thing. HOWEVER, I wouldn't
want to require all shared pointers to be intrusive.

3. Your AT_LifeLine class is attempting to solve the same
problem as std::auto_ptr, and in the same way. You'd
be better off looking at std::auto_ptr, learning
thoroughly how it works (it is decidedly non-trivial)
and thinking about how it might need to be changed (if
at all) in the presence of additional standardised
smart pointers.

4. Your smart pointer classes have a great many implicit
conversions to and from raw pointers. This is
dangerous: even though an intrusive implementation
allows smart pointers to be converted to and from raw
pointers, it is very easy to get resource leaks using
this.

5. The idea of using optional run-time assertions to catch
usage errors is in principle good, but in this case
leads to poor exception safety. Consider what happens
when you construct an AT_LifeLine (your std::auto_ptr
clone) and an exception causes it to be destroyed
before relinquishing ownership. Depending on whether
or not you have debugging turned on, you'll either get
an assertion or a resource leak, either of which are
bad.

6. Your AT_LifeTime class is basically the same as
boost::shared_ptr, other than it uses an intrusive
reference count. Can you summarise exactly what you
think this class offers that boost::shared_ptr does
not. In particular, can we see example usages of both,
and, if your arguments are based on efficiency, some
reproducible figures backing up your claims.

7. I'm not at all convinced that all of the conversions
between AT_LifeTime, AT_LifeLine and AT_LifeView
increment and decrement the reference counts as
appropriate.

8. The AT_LifeView class doesn't (so far as I can see)
provide any functionality that a raw pointer or
reference doesn't provide. Therefore I don't see any
point to it.

9. AT_Pointer seems to be another attempt to reimplement
std::auto_ptr. Why bother re-implementing it? Your
version doesn't work as well as the standard one.

In short, I don't think you'll generate much enthusiasm for
your implementation, however, if you think there are
worthwhile points to be considered, do suggest them, but as
modifications to std::auto_ptr, TR1's std::shared_ptr or
std::weak_ptr. Proposing a wholesale alternative without
clear discussion on its benefits over the existing one will
not win you many supporters.


------------------------------------------------------------
Specific comments on your code follow:


> template < class w_ClassRef > class AT_ReferenceTraits;

If you're attempting to propose something as an alternative
to the TR1 shared_ptr, and hence eventually for probable
addition to the Standard, you should use the same naming
convention as the Standard, viz, lower_case class and
function names, and InterCaps for template parameters.
Oh, and no reverse Polish. This may sound pedantic, but
people used to reading the Standard library will find it
easier to parse.

If you want serious consideration to be given to your
suggestions, you have to go out of your way to make it easy
for people to quickly see its salient points. This also
means removing the extraneous clutter from the file so that
the *interface* is immediately clear. (I've done this to
the bits of your design that I've quoted.)

> /**
> * AT_LifeControl is an abstract base class for any class whose lifetime
> * is managed by means of reference counting.
> */
> class AT_LifeControl
> {
> public:
> virtual ~AT_LifeControl();
> virtual int AddRef() = 0;
> virtual int Release() = 0;
> };

You're now advocating an intrusive implementation, are you?
This is not a bad idea, as long as an externally reference
counted implementation is also possible, ideally without
exposing the difference to end user.

(Clearly there are potential differences in interface
between external and intrusive shared pointers. For
example, it could be safe to do

shared_ptr<T> a( new T ), b( a.get() );

with an intrusive shared pointer, but rarely will be with an
external one. IMHO, this is best left illegal for all
shared pointers, leaving the interfaces to intrusive and
external pointers exactly the same.)

> template <class w_ClassRef> class AT_ReferenceTraits
> {
> public:
> static inline void IncRefCount( w_ClassRef i_ptr )
> { i_ptr->AddRef(); }
> static inline void DecRefCount( w_ClassRef i_ptr )
> { i_ptr->Release(); }
> };

When an object is created, does it have a reference count of
1, or must IncRefCount() be called?

[quoting out of order]
> * If you are using your own reference-counting scheme, i.e. one not
> * based on AT_LifeControl, then you will need to define your own
> * helper class that implements the same interface as AT_ReferenceTraits.

Can you write me such a policy class that handles external
reference counting? What are the function arguments to
IncRefCount? And how is the reference count accessed from
them?

> * (Incidentally, the name "AT_LifeLine" is meant to evoke the image
> * of someone tossing a rope to someone adrift at sea, saying, in
> * effect, "Here, this is yours, you'd better take ownership of it.")

Evoking the image of a shared pointer might be more
appropriate. Seriously, though, I really don't like your
choices of names.

> * Instrumenting an AT_LifeLine object to check for memory leaks is a
> * useful debugging aid, but is expensive in terms of execution time.

This is a quality-of-implementation issue, and really isn't
relevant to this discussion. (I've removed all such
debugging hooks from the quoted text.)

> template <
> class w_ClassRef,
> class w_RefTraits = AT_ReferenceTraits< w_ClassRef >
> >
> class AT_LifeLine
> {
> w_ClassRef m_ptrVal;

So, w_ClassRef is supposed to be a pointer type. I.e.
completely at odds with almost all existing smart pointer
implementations, where the first argument is the pointee. Thus
you want to write AT_LifeLine<T*> rather than shared_ptr<T>.

> inline AT_LifeLine(
> const AT_LifeLine< w_ClassRef, w_RefTraits > & i_ptr
> )
> : m_ptrVal( i_ptr.m_ptrVal )
> {}

You avoid incrementing the reference count here, using the
principle that ownership flows from the RHS to the LHS
without ever being shared between them. A few points:-

* The RHS is being semantically modified by the
constructor, and therefore should be taken by
non-const reference.

* I know you had MadeTransfer functions to assert in
"debug" builds, but is it really your intention that a
"non-debug" build can produce subtle bugs (e.g. double
destruction) by copying such a pointer twice? I would
think that just setting m_ptrVal to NULL on the RHS
would be safest. At least that way, you'll get a SEGV
fairly rapidly when something goes wrong.

* It looks to me like you're getting very close to
reimplementing std::auto_ptr.

> inline AT_LifeLine( w_ClassRef i_ptr = 0 )
> : m_ptrVal( i_ptr )
> {}

Should be explicit.

> inline AT_LifeLine( w_ClassRef i_ptr, bool i_takeOwnership )
> : m_ptrVal( i_ptr )
> {
> if ( i_takeOwnership ) {
> if ( i_ptr ) {
> w_RefTraits::IncRefCount( i_ptr );
> }
> }
> }

What's wrong with combining this with the previous
constructor by defaulting i_takeOwnership to false?

Hang on. Is the intention that I must write

AT_LifeLine<T*>( new T, true )

to take ownership of the pointer? This will be very error
prone, as people are used to using smart pointers that use
the syntax

shared_ptr<T>( new T )

Having it *not* take ownership by default is very
counter-intuitive.

> inline ~AT_LifeLine() {}

What happens when an exception results in the pointer being
destroyed before it has transferred ownership away? Like
this it leaks resources, or with your debugging hooks, will
assert. This is very bad.

> inline AT_LifeLine & operator= ( const w_ClassRef i_ptr )
> {
> m_ptrVal = i_ptr;
> return *this;
> }

I don't think you want an operator= that has a raw pointer
on its RHS. And I note that it doesn't take ownership of
the pointer.

> inline w_ClassRef operator-> () const
> {
> return m_ptrVal;
> }

It's normal to supply an operator*() as well.

> inline operator w_ClassRef () const
> {
> return m_ptrVal;
> }

Eurgh. Implicit conversions to the raw pointer type. Have
you considered how easy it would be to implicitly convert it
to a raw pointer and thus destroy the smart pointer,
probably causing a resource leak. (Unless the destructor
asserts, which doesn't seem much better.)

> inline operator bool () const
> {
> return m_ptrVal != 0;
> }

Look at how boost::shared_ptr makes use of

    typedef T* (shared_ptr<T>::*boolean)() const;

    operator boolean() const
    { return px == 0 ? 0 : &shared_ptr<T>::get; }

This is a far better solution to this problem as your
solution implicitly allows expressions such as

AT_LifeLine a, b, c( a / b );

> template < class w_ClassRef, class w_RefTraits >
> class AT_LifeTime

Many of my comments about AT_LifeLine apply again here. I
won't bother repeating them.

> inline AT_LifeTime( w_ClassRef ptr, bool i_takeOwnership )
> : m_ptrVal( ptr )
> {
> if ( i_takeOwnership ) {
> if ( ptr ) {
> w_RefTraits::IncRefCount(ptr);
> }
> }
> }

What if w_RefTraits::IncRefCount() throws an exception?
(If it's managing an externally reference count, it might
need to allocate one as you've provided no hook for the
constructor to do that otherwise. Having said that, I'm not
at all sure it's possible to do external reference counting
at all sanely with your design.)

> template < class w_ClassRefRhs, class w_RefTraitsRhs >
> inline AT_LifeTime(
> const AT_LifeView< w_ClassRefRhs, w_RefTraitsRhs > & i_ptr
> )
> : m_ptrVal( i_ptr.m_ptrVal )
> {
> // Bump the pointee's reference count.
[...]
> template < class w_ClassRefRhs, class w_RefTraitsRhs >
> inline AT_LifeTime(
> const AT_LifeLine< w_ClassRefRhs, w_RefTraitsRhs > & i_ptr
> )
> : m_ptrVal( i_ptr.m_ptrVal )
> {}

So, if I construct a LifeTime from a LifeView, I increment
the refcount, but if I constructor a LifeTime from a
LifeLine, I don't. And, if I construct a LifeView from
a LifeLine, I don't increment the RC. Thus,

LifeLine ll;

LifeTime( LifeView(ll) ); // Increments ref count
LifeTime( ll ); // Does not increment ref count

Yet, in both cases, the LifeTime's destructor will do the
same thing. How can this be right?

> inline w_ClassRef * InnerReference()
> {
> ReleasePointee( 0 );
> return & m_ptrVal;
> }

What's wrong with just assigning the new value into the
pointer? If a legacy (e.g. C) function needs a
pointer-to-pointer, then you can always do

    myLifeTime<T*> ptr;
    T* tmp = 0;
    legacy_function( &tmp );
    ptr = tmp; // Or whatever the correct way to transfer
               // ownership is.


> inline w_ClassRef Adopt( const w_ClassRef i_ptr )

The conventional name for this method is reset. This is
what std::auto_ptr calls it.

> * The AT_LifeView smart pointer is intended to be used in a very
> * specific situation: You need to pass an object's smart pointer
> * to a function so the function can do something with the object.

Why not just pass a reference to the underlying type? (Not
even the pointer, just a plain old reference.)

> * While this isn't incorrect, it is unnecessary for this
> * kind of function, and thus adds a performance penalty for no
> * reason.

If you make the mechanism for altering the reference count
inlineable, then this becomes a trivial penalty. This is
what the Boost implementation does. See other posts in this
thread for more details.


> template < class w_ClassRef, class w_RefTraits >
> class AT_LifeView

I don't see what this class does that a raw pointer or
reference doesn't do. If you don't want to transfer
ownership, this is precisely when you should be using a raw
pointer or reference.

> * The AT_Pointer smart pointer works essentially like the auto_ptr
> * of the Standard Template Library (STL).

Well, why not just use std::auto_ptr? There are a lot of
subtleties to its design that you've overlooked. (Consider
std::auto_ptr<T>::auto_ptr_ref<U> for example. Do you
understand why that exists?)

> * Therefore, one should use
> * an AT_Pointer only if the pointee is not capable of reference
> * counting.

External reference counting allows any type to be reference
counted. If your smart pointer framework does not provide
for external reference counting, it is highly unlikely to be
considered as a possible alternative to TR1's smart pointer
framework.

> inline AT_Pointer( w_ClassRef i_ptr = 0 )

Should be explicit.

> inline AT_Pointer(
> const AT_Pointer< w_ClassRef > & i_ptr )

Should have a non-const RHS.

--
Richard Smith

Gianni Mariani
Jul 10, 2003, 5:55:20 PM
David Abrahams wrote:
> gi2n...@mariani.ws (Gianni Mariani) writes:
>
>
>> /**
>> * MadeTransfer()
>> * --------------
>> * This method sets a member variable indicating that
>> * ownership of the pointee has been successfully
>> * transferred from this AT_LifeLine to some persistent
>> * pointer.
>> */
>>
>> inline void MadeTransfer() const
>> {
>> Assert( ! m_hasBeenTransferred.m_value );
>> m_hasBeenTransferred.m_value = true;
>> }
>>
>
>
> Why is setting m_hasBeenTransferred to true better than incrementing
> a reference count? It doesn't seem like you can achieve much of a
> savings here.
>

Because the true MadeTransfer() method is :

inline void MadeTransfer() const {}

It does nothing !

There are two versions of LifeLine, one for "debugging" and the regular
version. You're looking at the debug version.

Howard Hinnant
Jul 10, 2003, 5:55:25 PM
In article <3F0CAA84...@acm.org>, Pete Becker
<peteb...@acm.org> wrote:

| But not until I've finished random number generators. <g>

You're ahead of me on that one Pete, and good job! :-)

--
Howard Hinnant
Metrowerks

Carl Daniel
Jul 10, 2003, 9:38:48 PM
"Richard Smith" wrote:

> If you mean that IUnknown::AddRef and IUnknown::Release are virtual
> and can't be inlined, then you're quite correct. Boost's reference
> counting mechanism does not use this -- it has a it's own independent
> reference count. Only when Boost's reference count drops to zero
> does it (virtually) call the custom deletion function, which will
> call IUnknown::Release. Thus you can copy boost::shared_ptr's around
> as much as you want, and you will not incur the overhead of calls to
> IUnknown::Release until the object is finally destroyed. The only
> overhead is from Boost's own reference counting mechanism, which is
> much more efficient.(And with a bit of hand-crafted assembler to do
> the atomic incrementing, etc. the mutex in the 1.30 implementation
> could be removed, making it even more so. Or perhaps I'm missing
> something?)

Perhaps I'm missing something obvious, but the use of a mutex to provide MT
safety in boost::shared_ptr guarantees that it is in fact far less efficient
than typical COM reference counting, including the virtual function call
overhead. If shared_ptr was updated to use InterlockedIncrement, then it
could be faster than the typical COM AddRef() call due to the virtual
function call in the latter.

Worse, COM objects "know" how their reference counts should be maintained.
A COM object that's "single threaded" can be used in an MT program, and yet
never use synchronization on access to the reference count (since it's an
error for multiple threads to ever enter the object). Thus, for a very
common case, the boost::shared_ptr mechanism is nearly guaranteed to be less
efficient than normal COM reference counting.

-cd

Gianni Mariani
Jul 10, 2003, 9:43:21 PM
Richard Smith wrote:
> Gianni Mariani wrote:

Richard,

Thanks, you make some good points I need to consider further. However, I
will answer some of your questions.

>
>
>>Howard Hinnant wrote:
>>
>>>My day is simply too full to drop everything everytime somebody says:
>>>I've got a better idea, want to know more?
>>
>>Hint of arrogance I hear ?
>
>
> I doubt it; more likely a statement of fact.
>

Not touching this ... :)

>
>>I have attached the header that defines the templates. If you're
>>interested I'll make an alpha release of the entire library soon as it
>>stands soon.
>
>
> Could I draw your attention to Howard Hinnant's comment "it
> would be to your advantage to compare and contrast with it
> (with code examples!) in order to shorten your reader's
> learning curve". I think you would have been wise to have
> followed it. I suspect you'll find relatively few people
> will be prepared to wade through a 90k post in order to
> separate the germane differences from the TR1 shared_ptr
> from differences in style or syntax.
>
> I've made some detailed comments on your suggestion, below.
> For those who don't want to wade through them, here's a
> summary:
>
> 1. If you want it to be considered instead of
> boost::shared_ptr for C++0x, then write it in a way
> that looks and feels like the rest of the STL. In
> particular, where possible make expressions that work
> on std::auto_ptr have the same meaning on your
> pointers.
>

Fair comment; if there is merit, this is mostly cosmetic and can be
easily done. The original intent was not for this to be a substitute for
boost::shared_ptr.

> 2. You seem to be advocating an intrusive shared_ptr
> implementation to the exclusion of any possible
> externally reference counted implementation. I've
> nothing against and intrusive implementation -- indeed
> I think it would be a good thing. HOWEVER, I wouldn't
> want to require all shared pointers to be intrusive.

Are these not two different concepts (intrusive vs. externally reference
counted)?

I have yet to come across the need for an externally reference counted
facility. I'll need to consider this further.

>
> 3. Your AT_LifeLine class is attempting to solve the same
> problem as std::auto_ptr, and in the same way. You'd
> be better off looking at std::auto_ptr, learning
> thoroughly how it works (it is decidedly non-trivial)
> and thinking about how it might need to be changed (if
> at all) in the presence of additional standardised
> smart pointers.

Yes, perhaps. I have yet to find a case where auto_ptr was satisfactory
and hence I overlooked it. Need to reconsider since it's been a while
since I gave up on it (auto_ptr that is).

>
> 4. Your smart pointer classes have a great many implicit
> conversions to and from raw pointers. This is
> dangerous: even though an intrusive implementation
> allows smart pointers to be converted to and from raw
> pointers, it is very easy to get resource leaks using
> this.

Yes, the model requires that no raw pointers are used in the code except
from a new, or when a pointer is "transferred", which admittedly is a
hang-over from a while back. That could go.

>
> 5. The idea of using optional run-time assertions to catch
> usage errors is in principle good, but in this case
> leads to poor exception safety. Consider what happens
> when you construct a AT_LifeLine (your std::auto_ptr
> clone) and an exception causes it to be destroyed
> before relinquishing owenership. Depending on whether
> or not you have debugging turned on, you'll either get
> an assertion or a resource leakk, either of which are
> bad.

This is probably the most significant issue. Since I shy away from
using exceptions due to the significant complexity, this has never been
an issue. Having said that, exception safe code is a practice I
advocate and hence your observation is correct.

I'm not as concerned about the assertion as I am with the resource leak.
The assertion is there to catch the very errors it asserts for.


>
> 6. Your AT_LifeTime class is basically the same as
> boost::shared_ptr, other than it uses an intrusive
> reference count. Can you summarise exactly what you
> think this class offers that boost::shared_ptr does
> not. In particular, can we see example usages of both,
> and, if your arguments are based on efficiency, some
> reproducible figures backing up your claims.

I can.

>
> 7. I'm not at all convinced that all of the conversions
> between AT_LifeTime, AT_LifeLine and AT_LifeView
> increment and decrement the reference counts as
> appropriate.

It has passed a set of tests ...

>
> 8. The AT_LifeView class doesn't (so far as I can see)
> provide any functionality that a raw pointer or
> reference doesn't provide. Therefore I don't see any
> point to it.

Except that the semantics of a raw pointer are ambiguous, while those of
a LifeView are not.

>
> 9. AT_Pointer seems to be another attempt to reimplement
> std::auto_ptr. Why bother re-implementing it? Your
> version doesn't work as well as the standard one.

AT_Pointer happens to be in the same file. AT_Pointer works like so:

std::list< AT_Pointer< foo * > >

auto_ptr does not.

>
> In short, I don't think you'll generate much enthusiasm for
> your implementation, however, if you think there are
> worthwhile points to be considered, do suggest them, but as
> modifications to std::auto_ptr, TR1's std::shared_ptr or
> std::weak_ptr. Proposing a wholesale alternative without
> clear discussion on its benefits over the existing one will
> not win you many supporters.
>
>
> ------------------------------------------------------------
> Specific comments on your code follow:
>
>
>
>>template < class w_ClassRef > class AT_ReferenceTraits;
>
>
> If you're attempting to propose something as an alternative
> to the TR1 shared_ptr, and hence eventually for probable
> addition to the Standard, you should use the same naming
> convention as the Standard, viz, lower_case class and
> function names, and InterCaps for template parameters.
> Oh, and no reverse Polish. This may sound pedantic, but
> people used to reading the Standard library will find it
> easier to parse.

When and if we get to that point, we'll worry about that.

>
> If you want serious consideration to be given to your
> suggestions, you have to go out of your way to make it easy
> for people to quickly see its salient points. This also
> means removing the extraneous clutter from the file so that
> the *interface* is immediately clear. (I've done this to
> the bits of your design that I've quoted.)
>
>
>>/**
>> * AT_LifeControl is an abstract base class for any class whose lifetime
>> * is managed by means of reference counting.
>> */
>>class AT_LifeControl
>>{
>> public:
>> virtual ~AT_LifeControl();
>> virtual int AddRef() = 0;
>> virtual int Release() = 0;
>>};
>
>
> You're now advocating an intrusive implementation, are you?
> This is not a bad idea, as long as an externally reference
> counted implementation is also possible, ideally without
> exposing the difference to end user.

I need to see why you would want to mix these concepts.

>
> (Clearly there are potential differences in interface
> between external and intrusive shared pointers. For
> example, it could be safe to do
>
> shared_ptr<T> a( new T ), b( a.get() );
>
> with a intrusive shared pointer, but rarely will be with an
> external one. IMHO, this is best left illegal for all
> shared pointers, leaving the interfaces to intrusive and
> external pointers exactly the same.)

What would be the purpose of designing a class that required external
reference counting? If this is primarily an issue of legacy
compatibility, then I have still not run into it.

I'd need to see what real-life problems are being solved with external
reference counting.

>
>
>>template <class w_ClassRef> class AT_ReferenceTraits
>>{
>> public:
>> static inline void IncRefCount( w_ClassRef i_ptr )
>> { i_ptr->AddRef(); }
>> static inline void DecRefCount( w_ClassRef i_ptr )
>> { i_ptr->Release(); }
>>};
>
>
> When an object is created, does it have a reference count of
> 1, or must IncRefCount() be called?

It has a reference count of 1.

>
> [quoting out of order]
>
>> * If you are using your own reference-counting scheme, i.e. one not
>> * based on AT_LifeControl, then you will need to define your own
>> * helper class that implements the same interface as AT_ReferenceTraits.
>
>
> Can you write me such a policy class that handles external
> reference counting? What are the function arguments to
> IncRefCount? And how is the reference count accessed from
> them?
>

I suspect so.

>
>> * (Incidentally, the name "AT_LifeLine" is meant to evoke the image
>> * of someone tossing a rope to someone adrift at sea, saying, in
>> * effect, "Here, this is yours, you'd better take ownership of it.")
>
>
> Evoking the image of a shared pointer might be more
> appropriate. Seriously, though, I really don't like your
> choices of names.
>

Seriously, I don't care. ... I truly don't mean to be rude; in my
career I have wasted too much time arguing about aesthetics, and I have
very little intention of wasting much more. Having said that, I could
discuss this topic ad nauseam, which is not very productive. My
threshold for accepting a better recommendation, however, is very low.
In other words, if you have a better name, I'll change it. "shared_ptr"
is probably better.

>
>> * Instrumenting an AT_LifeLine object to check for memory leaks is a
>> * useful debugging aid, but is expensive in terms of execution time.
>
>
> This is a quality-of-implementation issue, and really isn't
> relevant to this discussion. (I've removed all such
> debugging hooks from the quoted text.)
>
>
>>template <
>> class w_ClassRef,
>> class w_RefTraits = AT_ReferenceTraits< w_ClassRef >
>>
>>class AT_LifeLine
>>{
>> w_ClassRef m_ptrVal;
>
>
> So, w_ClassRef is supposed to be a pointer type. I.e.
> completely at odds with almost all existing smart pointer
> implementations, where the first argument is the pointee. Thus
> you want to write AT_LifeLine<T*> rather than shared_ptr<T>.

yes.

or

typedef X * Xp;

AT_Lifetime<Xp>

>
>
>> inline AT_LifeLine(
>> const AT_LifeLine< w_ClassRef, w_RefTraits > & i_ptr
>> )
>> : m_ptrVal( i_ptr.m_ptrVal )
>> {}
>
>
> You avoid incrementing the reference count here, using the
> principle that ownership flows from the RHS to the LHS
> without ever being shared between them. A few points:-
>
> * The RHS is being semantically modified by the
> constructor, and therefore should be taken by
> non-const reference.

hmm, ok, you've hit a point that I glossed over during implementation.
The reference count is somewhat separate from the rest of the class
implementation. Hence the reference count is mutable.

>
> * I know you had MadeTransfer functions to assert in
> "debug" builds, but is it really your intention that a
> "non-debug" build can produce subtle bugs (e.g. double
> destruction) by copying such a pointer twice? I would
> think that just setting m_ptrVal to NULL on the RHS
> would be safest. At least that way, you'll get a SEGV
> fairly rapidly when something goes wrong.

The original intent was that the LifeLine is akin to passing a raw
pointer, and the LifeLine was a way to mark such a pointer as passing
the responsibility to decrement the reference count. Your suggestion
changes that intent but deserves consideration.

>
> * It looks to me like you're getting very close to
> reimplementing std::auto_ptr.
>
>
>> inline AT_LifeLine( w_ClassRef i_ptr = 0 )
>> : m_ptrVal( i_ptr )
>> {}
>
>
> Should be explicit.

yes.

>
>
>> inline AT_LifeLine( w_ClassRef i_ptr, bool i_takeOwnership )
>> : m_ptrVal( i_ptr )
>> {
>> if ( i_takeOwnership ) {
>> if ( i_ptr ) {
>> w_RefTraits::IncRefCount( i_ptr );
>> }
>> }
>> }
>
>
> What's wrong with combining this with the previous
> constructor by defaulting i_takeOwnership to false?
>
> Hang on. Is the intention that I must write
>
> AT_LifeLine<T*>( new T, true )

No. This constructor can go.

>
> to take ownership of the pointer? This will be very error
> prone, as people are used to using smart pointers that use
> the syntax
>
> shared_ptr<T>( new T )
>

AT_LifeLine<T*>( new T )

does the same.

> Having it *not* take ownership by default is very
> counter-intuitive.
>

Right.

>
>> inline ~AT_LifeLine() {}
>
>
> What happens when an exception results in the pointer being
> destroyed before it has transferred ownership away? Like
> this it leaks resources, or with your debugging hooks, will
> assert. This is very bad.

The alternative may also be "bad".

>
>
>> inline AT_LifeLine & operator= ( const w_ClassRef i_ptr )
>> {
>> m_ptrVal = i_ptr;
>> return *this;
>> }
>
>
> I don't think you want an operator= that has a raw pointer
> on its RHS. And I note that it doesn't take ownership of
> the pointer.

AT_LifeLine<T*> p;

...

p = new T;

... I think you found a bug.

>
>
>> inline w_ClassRef operator-> () const
>> {
>> return m_ptrVal;
>> }
>
>
> It's normal to supply an operator*() as well.
>

This would provide a raw pointer?

>
>> inline operator w_ClassRef () const
>> {
>> return m_ptrVal;
>> }
>
>
> Eurgh. Implicit conversions to the raw pointer type. Have
> you considered how easy it would be to implicitly convert it
> to a raw pointer and thus destroy the smart pointer,
> probably causing a resource leak. (Unless the destructor
> asserts, which doesn't seem much better.)
>

No. In the intended use, this would never happen.

>
>> inline operator bool () const
>> {
>> return m_ptrVal != 0;
>> }
>
>
> Look at how boost::shared_ptr makes use of
>
> typedef T* (shared_ptr<T>::*boolean)() const;
>
> operator boolean() const
> { return px == 0? 0: &shared_ptr<T>::get; }
>
> This is a far better solution to this problem as your
> solution implicitly allows expressions such as
>
> AT_LifeLine a, b, c( a / b );
>

OK,

LifeTime( ll ); transfers responsibility

LifeTime( LifeView(ll) ); does not

It works.
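For what it's worth, the safe-bool idiom Richard quotes can be sketched in isolation. The `ptr` class below is a hypothetical stand-in, not boost's actual code:

```cpp
#include <cassert>

// Hypothetical smart pointer using the safe-bool idiom quoted above.
// The conversion target is a pointer-to-member-function: it works in
// boolean contexts (if, &&, !) but, unlike an implicit conversion to the
// raw pointer type, cannot take part in arithmetic such as "a / b".
template <class T>
class ptr {
    T* px;
public:
    explicit ptr(T* p = 0) : px(p) {}
    T* get() const { return px; }

    typedef T* (ptr::*boolean)() const;  // the "safe bool" type
    operator boolean() const { return px == 0 ? 0 : &ptr::get; }
};

bool is_set(const ptr<int>& p) { return p ? true : false; }
```

With this, `if (p)` and `!p` compile, while `p / q` does not.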

>
>
>> inline w_ClassRef * InnerReference()
>> {
>> ReleasePointee( 0 );
>> return & m_ptrVal;
>> }
>
>
> What's wrong with just assigning the new value into the
> pointer? If a legacy (e.g. C) function needs a
> pointer-to-pointer, then you can always do

>
> myLifeTime<T*> ptr;
> T* tmp = 0;
> legacy_function( &tmp );
> ptr = tmp; // Or whatever the correct way to transfer
> // ownership is.
>
>

legacy_function( ptr.InnerReference() );

replaces 3 of the lines above.


>
>> inline w_ClassRef Adopt( const w_ClassRef i_ptr )
>
>
> The conventional name for this method is reset. This is
> what std::auto_ptr calls it.
>
>
>
>
>> * The AT_LifeView smart pointer is intended to be used in a very
>> * specific situation: You need to pass an object's smart pointer
>> * to a function so the function can do something with the object.
>
>
> Why not just pass a reference to the underlying type? (Not
> even the pointer, just a plain old reference.)

how would you resolve the ambiguity of

T * ptr = new T;
...
shared_ptr<T> sptr = ptr;

Method( ptr );
...

void Method( T * ptr )
{
shared_ptr<T> sptr = ptr;
}

Obviously something is wrong here. *1


>
>
>> * While this isn't incorrect, it is unnecessary for this
>> * kind of function, and thus adds a performance penalty for no
>> * reason.
>
>
> If you make the mechanism for altering the refence count
> inlineable, then this becomes a trivial penalty. This is
> what the Boost implementation does. See other posts in this
> thread for more details.
>

At an inevitable cost.

>
>
>>template < class w_ClassRef, class w_RefTraits >
>>class AT_LifeView
>
>
> I don't see what this class does that a raw pointer or
> reference doesn't do. If you don't want to transfer
> ownership, this is precisely when you should be using a raw
> pointer or reference.
>

It resolves the issue *1 above.

>
>
>
>> * The AT_Pointer smart pointer works essentially like the auto_ptr
>> * of the Standard Template Library (STL).
>
>
> Well, why not just use std::auto_ptr? There are a lot of
> subtleties to its design that you've overlooked. (Consider
> std::auto_ptr<T>::auto_ptr_ref<U> for example. Do you
> understand why that exists?)

AT_Pointer was something I used to work around issues with a std::list
class I was using. It's been a while since I've looked at it. I'm not
proposing to discuss this. As far as I am concerned it's incomplete but
useful in a narrow way.

However, while we're on the topic, does:

std::list< auto_ptr<T> >

do what I expect?

>
>> * Therefore, one should use
>> * an AT_Pointer only if the pointee is not capable of reference
>> * counting.
>
>
> External reference counting allows any type to be reference
> counted. If your smart pointer framework does not provide
> for external reference counting, it is highly unlikely to be
> considered as a possible alternative to TR1's smart pointer
> framework.
>

Again, I raise the issue of mixing too many concepts in the same class.
Mixing multiple concepts inevitably reduces the utility of a class
(yes, contrary to popular opinion).

>
>> inline AT_Pointer( w_ClassRef i_ptr = 0 )
>
>
> Should be explicit.
>
>
>> inline AT_Pointer(
>> const AT_Pointer< w_ClassRef > & i_ptr )
>
>
> Should have a non-const RHS.
>
> --
> Richard Smith
>

Thanks.
G

Richard Smith

Jul 11, 2003, 1:52:34 PM
Gianni Mariani wrote:
> Richard Smith wrote:
> > Gianni Mariani wrote:

[...]

> > 1. If you want it to be considered instead of
> > boost::shared_ptr for C++0x, then write it in a way
> > that looks and feels like the rest of the STL. In
> > particular, where possible make expressions that work
> > on std::auto_ptr have the same meaning on your
> > pointers.
>
> Fair comment, if there is merit, this is mostly cosmetic and can be
> easily done. The original intent was not for a substitute for
> boost::shared_ptr.

Errr... At the very beginning of this thread you said "I
would like to see 3 more features in the libraries regarding
smart pointers, in particular reference counting smart
pointers". Now either you want your classes to replace
the existing proposal based on boost::shared_ptr, or you
want to augment it. Given that your AT_LifeTime class
serves almost exactly the same purpose as boost::shared_ptr
(except yours uses an intrusive reference count whereas
boost uses an external reference count), I can only assume
you want to replace it.

Irrespective of how you see your suggestion with respect to
the TR1 shared_ptr, the comment about making it feel more
like the STL, and in particular std::auto_ptr, still stands.
If you still think there is merit in your suggestions and
would like to see them considered for the next Standard, you
have to sell it to people ... and this will be hard work.


> > 2. You seem to be advocating an intrusive shared_ptr
> > implementation to the exclusion of any possible
> > externally reference counted implementation. I've
> > nothing against an intrusive implementation -- indeed
> > I think it would be a good thing. HOWEVER, I wouldn't
> > want to require all shared pointers to be intrusive.
>
> Are not these 2 different concepts (intrusive vs externally reference
> counted ?).

Yes, they're two different concepts, but semantically they
achieve the same things. Intrusive reference counting can
be more efficient than external reference counting as it
avoids the overhead of allocating a separate block of
memory to hold the reference count. (Good allocator
technology can help reduce this difference, though.) As far
as a user of the shared pointer is concerned, he doesn't
care whether or not it's intrusive or extrusive -- this is
purely an implementation detail. Therefore, I think it is
desirable that if both intrusive and external reference
counting are included, they should seamlessly interoperate
so I can just write

shared_ptr< foo >; // foo supports intrusive reference
// counting, so this is used

shared_ptr< bar >; // bar does not support intrusive
// reference counting, so an
// external reference count is
// allocated


> I have yet to come across the need for an externally reference counted
> facility. I'll need to consider this further.

Really? I'm very surprised. Suppose I have (for whatever
reason) a

std::map< std::string, std::vector<std::string> >

-- quite an expensive object to be copying around. Now, how
do I avoid copying it? Obviously I can pass it by reference
rather than by value, or perhaps I can heap-allocate
it and pass it by pointer. If I do this, how do I handle
its lifetime? Reference counting is the obvious way, but
std::map is not derived from AT_LifeControl, so I can't use
it with your smart pointer. With boost::shared_ptr, this is
trivial: I just write

boost::shared_ptr< std::map< ... > >

and an external reference count is allocated and handled for
me. With your scheme, my only option is to create a wrapper
class aggregating the map and deriving from AT_LifeControl,
and then I have the problem of forwarding all the functions
in the interface.
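The std::map scenario above can be made concrete with std::shared_ptr, the standardised descendant of boost::shared_ptr. The `table` typedef and `make_table` function are illustrative names, not from any library:

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// std::map derives from no reference-counting base class, yet it can be
// shared: the count lives in a separately allocated control block.  This
// is exactly the external reference counting being argued for here.
typedef std::map<std::string, std::vector<std::string> > table;

std::shared_ptr<table> make_table() {
    std::shared_ptr<table> t(new table);
    (*t)["key"].push_back("value");
    return t;  // copying the shared_ptr bumps the external count, not the map
}
```

No wrapper class and no function forwarding are needed.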

> >
> > 3. Your AT_LifeLine class is attempting to solve the same
> > problem as std::auto_ptr, and in the same way. You'd
> > be better off looking at std::auto_ptr, learning
> > thoroughly how it works (it is decidedly non-trivial)
> > and thinking about how it might need to be changed (if
> > at all) in the presence of additional standardised
> > smart pointers.
>
> Yes, perhaps. I have yet to find a case where auto_ptr was satisfactory,
> and hence I overlooked it. I need to reconsider, since it's been a while
> since I gave up on it (auto_ptr, that is).
>

> > 4. Your smart pointer classes have a great many implicit
> > conversions to and from raw pointers.

[...]


> the model requires that no raw pointers are used in the code except
> from a new, or when a pointer is "transferred", which admittedly is a
> hang-over from a while back. That could go.

Good. If you need to access a raw pointer (and from time to
time you will), provide a get() method to do it.

[...]


> Since I shy away from
> using exceptions due to the significant complexity

What "significant complexity"? Exceptions are a damn sight
easier to program with than C-style error codes, and once
you're used to RAII, writing exception-safe code is easy.
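As a minimal sketch of the RAII point (the names here are illustrative, not from any library), the destructor runs during stack unwinding, so the "resource" is released whether or not an exception is thrown:

```cpp
#include <stdexcept>

// Illustrative RAII guard: the resource here is just a live-object
// counter, incremented on construction and decremented on destruction.
struct counter_guard {
    int& live;
    explicit counter_guard(int& n) : live(n) { ++live; }
    ~counter_guard() { --live; }
};

int live_after_throw() {
    int live = 0;
    try {
        counter_guard g(live);
        throw std::runtime_error("failure mid-operation");
    } catch (const std::runtime_error&) {
        // g was destroyed during unwinding; nothing leaked
    }
    return live;  // 0 when the guard cleaned up correctly
}
```

A reference-counting smart pointer is exactly this pattern applied to a pointee.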

> I'm not as concerned about the assertion as I am with the resource leak.
> The assertion is there to catch the very errors it asserts for.

But it (the assertion) also creates the error. Consider:
somewhere an exception gets thrown -- this means "something
has gone wrong, but I think it quite possible that someone
higher up the stack can recover from it". However, if the
exception is thrown between the AT_LifeLine being created
and its transferring ownership, the assertion fires and
aborts, turning a potentially recoverable condition into a
hard failure.

> > 6. Your AT_LifeTime class is basically the same as
> > boost::shared_ptr, other than it uses an intrusive
> > reference count. Can you summarise exactly what you
> > think this class offers that boost::shared_ptr does
> > not.
>

> I can.

Well, will you then, please?

> > 8. The AT_LifeView class doesn't (so far as I can see)
> > provide any functionality that a raw pointer or
> > reference doesn't provide. Therefore I don't see any
> > point to it.
>
> Except that the semantics of a raw pointer are ambiguous while that of a
> LifeView is not.

There's no ambiguity with a reference. If you want to avoid
the overhead of reference counting on a particular function
call, you can do one of two things:

class foo {};

void fn1( const foo& );
void fn2( const boost::shared_ptr<foo>& );

int main() {
boost::shared_ptr< foo > f( new foo );
fn1(*f);
fn2(*f);
}

There's no doubt about fn1: no transfer of ownership is ever
intended. The call to fn2 also avoids the overhead of
reference counting, but allows fn2 to have ownership if it's
necessary.

> > 9. AT_Pointer seems to be another attempt to reimplement
> > std::auto_ptr. Why bother re-implementing it? Your
> > version doesn't work as well as the standard one.
>
> AT_Pointer happens to be in the same file. AT_Pointer works like so:
>
> std::list< AT_Pointer< foo * > >
>
> auto_ptr does not.

No, there is no guarantee that your AT_Pointer will work
correctly like this. Look again at its copy constructor:

inline AT_Pointer(
const AT_Pointer< w_ClassRef > & i_ptr )

: m_ptrVal( i_ptr.m_ptrVal )
{
// Zero out the pointee address in the AT_Pointer
// being copied, so that it is impossible to delete
// the pointee twice accidentally.
// ------------------------------------------------
i_ptr.m_ptrVal = 0;
}

Ownership is transferred from the RHS to the LHS, and the RHS
pointer is nullified. This is precisely the same semantics
that std::auto_ptr has, and exactly the issue that prevents
std::auto_ptr from being used in an STL container. In
particular, AT_Pointer does not model the CopyConstructible
and Assignable concepts defined in 20.1.3. According to
23.1/3, "the type of objects stored in these components must
meet the requirements of CopyConstructible". Therefore you
are not allowed to put AT_Pointers in std::lists.
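The problem can be seen in miniature with a hypothetical `transfer_ptr` mimicking the quoted copy constructor (the class and function names are ours, not from the posted code):

```cpp
#include <cstddef>

// CopyConstructible requires a copy to be equivalent to its source, but
// here copying nullifies the source -- which is why such types must not
// be stored in standard containers.
template <class T>
struct transfer_ptr {
    mutable T* p;
    explicit transfer_ptr(T* q = 0) : p(q) {}
    transfer_ptr(const transfer_ptr& o) : p(o.p) { o.p = 0; }
    ~transfer_ptr() { delete p; }
private:
    transfer_ptr& operator=(const transfer_ptr&);  // not needed for the demo
};

bool copy_nullifies_source() {
    transfer_ptr<int> a(new int(42));
    transfer_ptr<int> b(a);           // "copy" moves ownership to b...
    return a.p == 0 && *b.p == 42;    // ...leaving a unequal to its copy
}
```

A container is free to copy its elements internally, so after any such copy the original element is silently dead.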

> > You're now advocating an intrusive implementation, are you?
> > This is not a bad idea, as long as an externally reference
> > counted implementation is also possible, ideally without
> > exposing the difference to end user.
>
> I need to see why you would want to mix these concepts.

I don't see that an intrusive implementation alone can
provide a complete solution, whereas I think an external
implementation by itself could. (See my example, above,
using std::map, for the reasons.)

> What would be the purpose of designing a class that required external
> reference counting ? If this is an issue primarily of legacy
> compatability, then I have still not run into this issue.
>
> I'd need to see what real-life problems are being solved with external
> reference counting.

Not legacy code. See above.

> > Can you write me such a policy class that handles external
> > reference counting? What are the function arguments to
> > IncRefCount? And how is the reference count accessed from
> > them?
>
> I suspect so.

Well, could you show me one, please? I don't see how it can
feasibly be handled in your framework (short of having a
global map from pointer to ref count, or something equally
ugly). Demonstrate to me that I'm wrong.

[About class names:]


> Seriously, I don't care.

Well I do, and you just have to take a look at the boost
mailing list to see that lots of highly respected developers
will spend significant amounts of time finding well-chosen
names for classes.

> > So, w_ClassRef is supposed to be a pointer type. I.e.
> > completely at odds to almost all existing smart pointer
> > implementations, where the first argument is the pointee. Thus
> > you want to write AT_LifeLine<T*> rather than shared_ptr<T>.
>
> yes.
>
> or
>
> typedef X * Xp;
>
> AT_Lifetime<Xp>

I think you're missing the point: this is not the interface
that other smart pointer classes use, and I can see no
advantage to doing it differently.

> > * The RHS is being semantically modified by the
> > constructor, and therefore should be taken by
> > non-const reference.
>
> hmm, ok, you've hit a point that I glossed over during implementation.
> The reference count is somewhat separate from the rest of the class
> implementation. Hence the reference count is mutable.

The point is not whether the argument needs to be a
non-const reference to compile, but whether it semantically
should be non-const. Clearly, the copy constructor changes
the observable state of the source smart pointer.
(Initially it is a useable smart pointer, afterwards,
ownership has been transferred away and the pointer is
effectively unusable.) As it changes the state of the
source object, it should take the source by non-const
reference. The fact that this allows you to remove the
'mutable' keyword is incidental.

> >> inline ~AT_LifeLine() {}
> >
> > What happens when an exception results in the pointer being
> > destroyed before it has transfered ownership away? Like
> > this it leaks resources, or with your debugging hooks, will
> > assert. This is a very bad.
>
> The alternative may also be "bad".

Why? Why should it be "bad" to call DecRefCount on
destruction if the ownership has not been transferred away?
Can you give me a concrete example of how this might cause
problems?

> > I don't think you want an operator= that has a raw pointer
> > on its RHS. And I note that it doesn't take ownership of
> > the pointer.
>
> AT_LifeLine<T*> p;
>
> ...
>
> p = new T;

That's fine, but the problem is in accidental assignment
from a pointer where you should not take ownership. Having
a method to do this, e.g.

p.reset( new T );

makes this more explicit.

> > It's normal to supply an operator*() as well.
> >
>
> This would provide a raw pointer ?

No, a raw reference. This makes the following work:

void function( const T& );

AT_LifeLine<T*> ll;

function( *ll );


> > Eurgh. Implicit conversions to the raw pointer type. Have
> > you considered how easy it would be to implicitly convert it
> > to a raw pointer and thus destroy the smart pointer,
> > probably causing a resource leak. (Unless the destructor
> > asserts, which doesn't seem much better.)
> >
>
> No. In the intended use, this would never happen.

Well, don't provide the conversion operators then. If
they're not needed in the intended use, and are unsafe,
get rid of them.

> >> inline w_ClassRef * InnerReference()


> >
> > What's wrong with just assigning the new value into the
> > pointer?
> >

> > myLifeTime<T*> ptr;
> > T* tmp = 0;
> > legacy_function( &tmp );
> > ptr = tmp; // Or whatever the correct way to transfer
> > // ownershipe is.
>
> legacy_function( ptr.InnerReference() );
>
> replaces 3 of the lines above.

The point is that legacy code will attach many different
semantics to a T**, so there is no one correct answer. For
example, why do you release and nullify the original
pointer? Some applications might want T** to point to a
valid pointer which they possibly update. And what about
legacy code that tries to create the object using the wrong
heap (e.g. using malloc rather than new)?

Furthermore, there's no reason why this needs to be a member
function -- as I've shown above, everything can be done
using the public interface. If your legacy libraries attach
particular semantics to T** pointers, then fine, provide
your own helper function, but don't make it part of the
smart pointer's interface.

> >> * The AT_LifeView smart pointer is intended to be used in a very
> >> * specific situation: You need to pass an object's smart pointer
> >> * to a function so the function can do something with the object.
> >
> >
> > Why not just pass a reference to the underlying type? (Not
> > even the pointer, just a plain old reference.)
>
> how would you resolve the ambiguity of
>
> T * ptr = new T;
> ...
> shared_ptr<T> sptr = ptr;
>
> Method( ptr );
> ...
>
> void Method( T * ptr )
> {
>     shared_ptr<T> sptr = ptr;
> }
>
> Obviously something is wrong here. *1

For a start, with boost::shared_ptr this won't compile.
Boost's shared_ptr does not allow you to write

shared_ptr<T> sptr = ptr;

instead you must either write

shared_ptr<T> sptr( ptr );

or call reset.

Secondly, you'd be much better off passing T by reference
rather than by pointer. This would have avoided the
problem.

Thirdly, it's a bloody stupid thing to do. You should
never, ever pass anything to a boost::shared_ptr other than
the immediate return value from a new expression.
Following this and a few other simple rules avoids any
problems.
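The rule can be illustrated with std::shared_ptr, boost::shared_ptr's standardised successor (the `widget` type and function below are illustrative names):

```cpp
#include <memory>

// The raw pointer from new crosses into a shared_ptr exactly once; all
// further sharing is done by copying the smart pointer itself, so there
// is a single control block and no possibility of a double delete.
struct widget { int value; explicit widget(int v) : value(v) {} };

int shared_use_count_demo() {
    std::shared_ptr<widget> a(new widget(1));  // the one raw-pointer handoff
    std::shared_ptr<widget> b = a;             // share by copying, never via get()
    return static_cast<int>(a.use_count());    // both refer to one control block
}
```

Constructing a second shared_ptr from `a.get()` would instead create a second, independent count over the same object.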

I repeat what I said in my first post to this thread:
aside from problems with cyclic ownership, the majority of
problems I've had using boost::shared_ptr have been due to
compiler bugs. It really is a very simple class to use
correctly.


[Regarding spurious increments / decrements to the ref
count:]

> > If you make the mechanism for altering the refence count
> > inlineable, then this becomes a trivial penalty. This is
> > what the Boost implementation does. See other posts in this
> > thread for more details.
>
> At an inevitable cost.

Insignificant. How many instructions do you think it takes
to do ++*ptr->refcount when inlined?

And you still haven't said what's wrong with:

void foo( const boost::shared_ptr<T>& )

This doesn't modify the refcount at all.

> >>template < class w_ClassRef, class w_RefTraits >
> >>class AT_LifeView
> >
> >
> > I don't see what this class does that a raw pointer or
> > reference doesn't do. If you don't want to transfer
> > ownership, this is precisely when you should be using a raw
> > pointer or reference.
> >
>
> It resolves the issue *1 above.

Explain how issue 1, above, applies when you pass the raw
object by reference, e.g.

void foo( const T& );
boost::shared_ptr<T> t;
foo( *t );

> AT_Pointer was somthing I used to work around issues with a std::list
> class I was using.

It doesn't work around the problems at all. You may not
have noticed them, and it's possible that your
particular STL's std::list broadly works with your class,
but it isn't portable and probably doesn't even work
correctly in all cases in your STL.

> As far as I am concerned it's incomplete but
> useful in a narrow way.

In what way is it incomplete?

> However, while we're on the topic, does:
> std::list< auto_ptr<T> >
>
> do what I expect ?

It does exactly the same as

std::list< AT_Pointer<T> >

i.e. it invokes undefined behaviour. Don't do either.

> Again, I raise the issue of mixing too many concepts in the same class.

What I'm trying to point out, is that the problem that
you say AT_Pointer is there to solve -- namely that not all
classes can be reference counted -- is only an issue because
you've chosen to ignore external reference counting. I
really can't see a general smart pointer library being
accepted into the Standard without support for external
reference counting. As your stated aim is to get this
accepted into the Standard, you MUST consider external
reference counting.

--
Richard Smith

Richard Smith

Jul 11, 2003, 1:52:44 PM
Carl Daniel wrote:

> Perhaps I'm missing something obvious, but the use of a mutex to provide MT
> safety in boost::shared_ptr guarantees that it is in fact far less efficient
> than typical COM reference counting, including the virtual function call
> overhead. If shared_ptr was updated to use InterlockedIncrement, then it
> could be faster than the typical COM AddRef() call due to the virtual
> function call in the latter.

It's actually much more complicated than this due to the
interaction with weak_ptr. The problem can be seen in the
following code:

boost::weak_ptr<T> wp;
{
boost::shared_ptr<T> sp( new T ); wp = sp;
}
boost::shared_ptr<T> sp( wp.lock() );

In this, wp is assigned a shared_ptr which immediately goes
out of scope. The wp.lock() function needs to know that the
use count of the now dead shared_ptr is zero. This means
the use count must be held onto until all weak_ptrs have
been removed, which means keeping count of the number of
active weak_ptrs too.

Therefore, instead of the naive

template <class T> class shared_ptr {
size_t* ref_count;
T* pointer;
};

you need something like

struct ref_count_holder {
size_t shared_count; // num shared_ptrs
size_t weak_count; // num weak_ptrs + shared_ptrs
};

template <class T> class shared_ptr {
ref_count_holder* ref_counts;
T* pointer;
};

and when shared_count falls to zero, pointer is deleted,
but the ref_count_holder is not freed until weak_count
falls to zero.
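A sketch of this behaviour using the std::shared_ptr / std::weak_ptr that TR1's design eventually became (the function name is ours):

```cpp
#include <memory>

// The two-count control block described above is what lets wp.lock()
// safely report that the pointee has died, even though every shared_ptr
// is long gone.
bool weak_ptr_sees_expiry() {
    std::weak_ptr<int> wp;
    {
        std::shared_ptr<int> sp(new int(7));
        wp = sp;
        if (!wp.lock()) return false;  // still alive: shared count is 1
    }
    // Shared count is now 0, so the int has been deleted; the control
    // block survives (weak count > 0) so lock() can answer truthfully.
    return wp.expired() && !wp.lock();
}
```

The control block itself is freed only when the last weak_ptr goes away.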

Why does this complicate matters? Because now we have two
separate counts to maintain, and I know of no platforms that
provide atomic operations to manipulate a pair of variables
simultaneously -- hence the use of a mutex in the Boost
implementation.

> Worse, COM objects "know" how their reference counts should be maintained.
> A COM object that's "single threaded" can be used in an MT program, and yet
> never use synchronization on access to the reference count (since it's an
> error for multiple threads to ever enter the object). Thus, for a very
> common case, the boost::shared_ptr mechanism is nearly guaranteed to be less
> efficient than normal COM reference counting.

Quite possibly, but does the COM model allow weak pointers
in the sense used in TR1? My guess is not, but I confess to
not being particularly fluent with the details of COM.

--
Richard Smith

David Abrahams

Jul 11, 2003, 1:52:50 PM
ric...@ex-parrot.com (Richard Smith) writes:

> I've made some detailed comments on your suggestion, below.
> For those who don't want to wade through them, here's a
> summary:

Wow, Richard, that's some pretty detailed commentary! I hope you will
participate in the next Boost library review. You could make a great
contribution!

-Dave

P.S. Nice email address. A Python fan, perhaps?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

David Abrahams

Jul 11, 2003, 8:13:36 PM
cpda...@nospam.mvps.org ("Carl Daniel") writes:

> "Richard Smith" wrote:
>
>> If you mean that IUnknown::AddRef and IUnknown::Release are virtual
>> and can't be inlined, then you're quite correct. Boost's reference
>> counting mechanism does not use this -- it has its own independent
>> reference count. Only when Boost's reference count drops to zero
>> does it (virtually) call the custom deletion function, which will
>> call IUnknown::Release. Thus you can copy boost::shared_ptr's around
>> as much as you want, and you will not incur the overhead of calls to
>> IUnknown::Release until the object is finally destroyed. The only
>> overhead is from Boost's own reference counting mechanism, which is
>> much more efficient. (And with a bit of hand-crafted assembler to do
>> the atomic incrementing, etc. the mutex in the 1.30 implementation
>> could be removed, making it even more so. Or perhaps I'm missing
>> something?)
>
> Perhaps I'm missing something obvious, but the use of a mutex to
> provide MT safety in boost::shared_ptr guarantees that it is in fact
> far less efficient than typical COM reference counting, including
> the virtual function call overhead.

Of course, a mutex is only used if multithreading is enabled.

> If shared_ptr was updated to use InterlockedIncrement, then it could
> be faster than the typical COM AddRef() call due to the virtual
> function call in the latter.

Which shows that the cost is a "mere" implementation detail. Care to
submit a patch?
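The "patch" under discussion is exactly this: replace the mutex-guarded increment with an atomic one. In 2003 that meant a platform intrinsic like InterlockedIncrement; a portable version of the same idea can be sketched in today's C++ with std::atomic (illustrative only, not Boost's code):

```cpp
#include <atomic>
#include <cassert>

// Sketch of a mutex-free strong count -- roughly what replacing
// the mutex with InterlockedIncrement would achieve, written with
// the portable primitive that exists in modern C++.
class atomic_count {
    std::atomic<long> value_;
public:
    explicit atomic_count(long v) : value_(v) {}
    long increment() { return value_.fetch_add(1) + 1; }
    long decrement() { return value_.fetch_sub(1) - 1; }  // 0 => destroy
    long load() const { return value_.load(); }
};
```

Each operation is a single atomic instruction on common hardware, which is why it can beat a virtual AddRef() call, let alone a mutex acquisition.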

> Worse, COM objects "know" how their reference counts should be
> maintained. A COM object that's "single threaded"

Well isn't it actually ``a COM _type_ that's "single threaded"?''

> can be used in an MT program, and yet never use synchronization on
> access to the reference count (since it's an error for multiple
> threads to ever enter the object). Thus, for a very common case,
> the boost::shared_ptr mechanism is nearly guaranteed to be less
> efficient than normal COM reference counting.

That's an interesting point. Of course, COM's apartment model is
quite problematic in many cases, and having to decide
single/multi-threadedness per-type instead of per-object can be a
problem too, can't it?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

Edward Diener

unread,
Jul 11, 2003, 8:15:43 PM7/11/03
to
"Carl Daniel" wrote:
> "Richard Smith" wrote:
>
>> If you mean that IUnknown::AddRef and IUnknown::Release are virtual
>> and can't be inlined, then you're quite correct. Boost's reference
>> counting mechanism does not use this -- it has its own independent
>> reference count. Only when Boost's reference count drops to zero
>> does it (virtually) call the custom deletion function, which will
>> call IUnknown::Release. Thus you can copy boost::shared_ptr's around
>> as much as you want, and you will not incur the overhead of calls to
>> IUnknown::Release until the object is finally destroyed. The only
>> overhead is from Boost's own reference counting mechanism, which is
>> much more efficient. (And with a bit of hand-crafted assembler to do
>> the atomic incrementing, etc. the mutex in the 1.30 implementation
>> could be removed, making it even more so. Or perhaps I'm missing
>> something?)
>
> Perhaps I'm missing something obvious, but the use of a mutex to
> provide MT safety in boost::shared_ptr guarantees that it is in fact
> far less efficient than typical COM reference counting, including the
> virtual function call overhead. If shared_ptr was updated to use
> InterlockedIncrement, then it could be faster than the typical COM
> AddRef() call due to the virtual function call in the latter.

Perhaps you are missing the fact that operating systems other than
Windows exist, and that InterlockedIncrement is not a C++ Standard
library function.

Gianni Mariani

unread,
Jul 12, 2003, 1:30:01 AM7/12/03
to
Richard Smith wrote:
> Gianni Mariani wrote:
>
>>Richard Smith wrote:
>>
>>>Gianni Mariani wrote:
>
>
> [...]
>
>
>>> 1. If you want it to be considered instead of
>>> boost::shared_ptr for C++0x, then write it in a way
>>> that looks and feels like the rest of the STL. In
>>> particular, where possible make expressions that work
>>> on std::auto_ptr have the same meaning on your
>>> pointers.
>>
>>Fair comment, if there is merit, this is mostly cosmetic and can be
>>easily done. The original intent was not for a substitute for
>>boost::shared_ptr.
>
>
> Errr... At the very beginning of this thread you said "I
> would like to see 3 more features in the libraries regarding
> smart pointers, in particular reference counting smart
> pointers". Now either you want your classes to replace
> the existing proposal based on boost::shared_ptr, or you
> want to augment it. Given that your AT_LifeTime class
> serves almost exactly the same purpose as boost::shared_ptr
> (except yours uses an intrusive reference count whereas
> boost uses an external reference count), I can only assume
> you want to replace it.
>

You've made a significant argument regarding the exception safety of
AT_LifeLine and hence I need to reconsider my proposal.

The right answer here depends on a compromise among the following
requirements:

a) "zero overhead" principle
b) exception safety
c) correctness - or: elimination of usage errors - or: make it possible
for the compiler to pick up usage errors.
(and probably more)

Depending on which requirement I give priority I get different answers
to "what is right".

I tend to place the "zero overhead" principle above all others which is
NOT the right answer for everyone or all situations obviously but
sometimes it is. Hence when it is the overriding principle, using a
solution that does not provide the most efficient solution can be a
significant problem. Also, smart pointers, or rather object lifetime
management, is such a fundamental issue for C++ that it will become
pervasive throughout any code base and hence will become a topic of many
frustrations to come. To cap this paradox off, very few programmers
will grasp the true complexity of smart pointers (as I have already
witnessed). A typical formula for intractable issues.

So, my intention here is to open the discussion on alternatives and
discuss their merits.

The argument becomes far more complex than just smart pointers when you
start considering GC in the picture. GC schemes can be quite successful
and require very little in the way of "smart pointer" support. However
the GC behaviour can make it difficult to succeed with real-time
applications or can still lead to leaks of resources through errors in
programming, so it is not a panacea. However, if there exists an easy
method of life-time management (GC), why would you want to provide a
less than optimal solution for smart pointers?

Anyhow, I have no point to press here other than I am still considering
all the points that have been made. There are 3 options I see moving
forward.

a) abandon AT_Life* and hence eliminate the issues with LifeLine
b) augment shared_ptr with LifeLine and LifeView lookalikes and include
a template parameter to enable it for legacy apps.
c) have 2 separate smart pointer systems, one meant for optimizing speed
at the cost of potential correctness issues and shared_ptr remains for
optimizing "correctness".

.... fourth option is to explode in disgust.


> Irrespective of how you see your suggestion with respect to
> the TR1 shared_ptr, the comment about making it feel more
> like the STL, and in particular std::auto_ptr, still stands.
> If you still think there is merit in your suggestions and
> would like to see them considered for the next Standard, you
> have to sell it to people ... and this will be hard work.
>

I'll need help on the "feel" thing.

If this is more than aesthetics, please explain, otherwise my prior
comments on this topic apply.

If shared_ptr had a second template arg that allowed you to do this:

shared_ptr< bar, iunknown_shared_ptr_helper >

.... now shared_ptr would be able to use bar's legacy reference
counting semantics, it would satisfy one of my issues regarding shared_ptr.
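A traits-parameterised intrusive pointer of the kind being suggested might look roughly like this. Every name here is hypothetical (there is no iunknown_shared_ptr_helper in any real library); the point is only that the second template argument adapts whatever refcounting interface the pointee already has, whether AddRef/Release for COM or inc()/dec() elsewhere:

```cpp
#include <cassert>

// Sketch of a traits-parameterised intrusive smart pointer.
// All names are illustrative, not from any shipping library.
template <class T>
struct default_count_traits {
    static void add_ref(T* p) { p->inc(); }
    static void release(T* p) { if (p->dec() == 0) delete p; }
};

template <class T, class Traits = default_count_traits<T> >
class intrusive_ptr {
    T* p_;
public:
    explicit intrusive_ptr(T* p = 0) : p_(p) { if (p_) Traits::add_ref(p_); }
    intrusive_ptr(const intrusive_ptr& o) : p_(o.p_) {
        if (p_) Traits::add_ref(p_);
    }
    ~intrusive_ptr() { if (p_) Traits::release(p_); }
    T* operator->() const { return p_; }
    T& operator*() const { return *p_; }
private:
    intrusive_ptr& operator=(const intrusive_ptr&);  // omitted in sketch
};

// A pointee exposing an inc()/dec() style interface, for demonstration.
struct widget {
    static int live;   // tracks construction/destruction for the example
    int count;
    widget() : count(0) { ++live; }
    ~widget() { --live; }
    void inc() { ++count; }
    int dec() { return --count; }
};
int widget::live = 0;
```

A COM traits class would simply call p->AddRef() and p->Release() instead, leaving the pointer class itself unchanged.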


>
>
>>I have yet to come across the need for an externally reference counted
>>facility. I'll need to consider this further.
>
>
> Really? I'm very surprised. Suppose I have (for what ever
> reason) a
>
> std::map< std::string, std::vector<std::string> >
>
> -- quite an expensive object to be copying around. Now, how
> do I avoid copying it? Obviously I can pass it by reference
> rather than by value, or I perhaps I can heap-allocate
> it and pass it by pointer. If I do this, how do I handle
> its lifetime? Reference counting is the obvious way, but
> std::map is not derived from AT_LifeControl, so I can't use
> it with your smart pointer. With boost::shared_ptr, this is
> trivial: I just write
>
> boost::shared_ptr< std::map< ... > >
>
> and an external refernce count is allocated and handled for
> me. With your scheme, my only option is to create a wrapper
> class aggregating the map and deriving from AT_LifeControl,
> and then I have the problem of forwarding all the functions
> in the interface.
>

That map class is something I would always put in another class. One of
my design principles is that encapsulation is a low (very low) threshold
act. I would almost never create an object like the one above whose
lifetime was not managed by a class that contained it.

I could/would also argue this to be "implementation" and hence not
worthy of being passed around and managed by a smart pointer. It would
also be unlikely that I would expose such a member as a part of the
"interface".

This may be why I have never found it necessary to use
"extrusive" RC.
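For reference, Richard's std::map example needs no wrapper class at all under external counting. A minimal sketch (std::shared_ptr is used here so the example is self-contained; boost::shared_ptr of that era behaved the same way):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<std::string> > table;

// External reference counting: the count lives beside the pointer,
// so std::map needs no AT_LifeControl-style base class.
std::shared_ptr<table> make_table() {
    std::shared_ptr<table> t(new table);
    (*t)["colours"].push_back("red");
    return t;   // returned and copied without copying the map itself
}
```

This is the "real-life problem" external counting solves: reference counting types you cannot (or should not) modify.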

>
>>> 3. Your AT_LifeLine class is attempting to solve the same
>>> problem as std::auto_ptr, and in the same way. You'd
>>> be better off looking at std::auto_ptr, learning
>>> thoroughly how it works (it is decidedly non-trivial)
>>> and thinking about how it might need to be changed (if
>>> at all) in the presence of additional standardised
>>> smart pointers.
>>
>>Yes, perhaps. I have yet to find a case where auto_ptr was satisfactory
>>and hence I overlooked it. Need to reconsider since it's been a while
>>since I gave up on it (auto_ptr that is).
>>
>
>
>>> 4. Your smart pointer classes have a great many implicit
>>> conversions to and from raw pointers.
>
> [...]
>
>>the model requires that no raw pointers are used in the code except
>>from a new, or when a pointer is "transferred" which admittedly is a
>>hang-over from a while back. That could go.
>
>
> Good. If you need to access a raw pointer (and from time to
> time you will), provide a get() method to do it.
>
> [...]
>
>>Since I shy away from
>>using exceptions due to the significant complexity
>
>
> What "significant complexity"? Exceptions are a damn sight
> easier to program with than C-style error codes, and once
> you're used to RAII, writing exception safe code is easy.
>

Treating errors as exceptions seems to hide too much for less
experienced programmers to truly understand what is going on.

Treating errors as a "normal" operation tends to force programmers to
produce code that deals with errors more appropriately.

While I say this, I advocate the idea of exception safety mainly because
that means RAII is adopted and hence a "return" or a "break" placed in
the code means that things clean up correctly which leads to fewer
errors during the maintenance phase.

"Exceptions may be easier" does not necessarily mean better, simply
because considerations regarding the consequences of errors can easily
be (and regularly are) overlooked.

For the purposes of this discussion however, our differences in opinion
regarding exceptions are irrelevant.

>
>>I'm not as concerned about the assertion as I am with the resource leak.
>> The assertion is there to catch the very errors it asserts for.
>
>
> But it (the assertion) also creates the error. Consider:
> somewhere an exception gets thrown -- this means "something
> has gone wrong, but I think it quite possible that someone
> higher up the stack can recover from it". However, if the
> exception is thrown between the AT_LifeLine being created
> and it transferring ownership,


Yes, however in practice, the assertion fires only if the code has an
error. Correcting the error is trivial, to the point where the error is
no longer able to occur and hence the check is no longer needed, which is
why AT_LifeLine exists. While it is a class with a personality, it is
more aptly a description of policy or programmer intention.

>
>
>>> 6. Your AT_LifeTime class is basically the same as
>>> boost::shared_ptr, other than it uses an intrusive
>>> reference count. Can you summarise exactly what you
>>> think this class offers that boost::shared_ptr does
>>> not.
>>
>>I can.
>
>
> Well, will you then, please?
>

a) Support for any intrusive reference counting interface
b) Highly optimal implementation (at the expense of exception safety and
increased complexity)

Not implemented in the AT_Life* I posted is a 3rd feature. Support for
a thread safe AT_LifeTime (using a 3rd template parameter).

>
>>> 8. The AT_LifeView class doesn't (so far as I can see)
>>> provide any functionality that a raw pointer or
>>> reference doesn't provide. Therefore I don't see any
>>> point to it.
>>
>>Except that the semantics of a raw pointer are ambiguous while those
>>of a LifeView are not.
>
>
> There's no ambiguity with a reference. If you want to avoid
> the overhead of reference counting on a particular function
> call, you can do one of two things:
>
> class foo {};
>
> void fn1( const foo& );
> void fn2( const boost::shared_ptr<foo>& );
>
> int main() {
> boost::shared_ptr< foo > f( new foo );
> fn1(*f);
>     fn2(f);
> }
>
> There's no doubt about fn1: no transfer of ownership is ever
> intended. The call to fn2 also avoids the overhead of
> reference counting, but allows fn2 to have ownership if it's
> necessary.

I think this is where we have a fundamental disagreement. I can cite
examples where I would like to do the contrary. The notion of constness
does not indicate a notion of lifetime. These are 2 separate concepts
and should not be mixed.

You are right, but it works and will likely continue working correctly
and there would be serious implications for it to no longer work.

However, for this reason alone I would not advocate adding such a
facility into C++, but it would be highly unlikely that in the narrow
scope in which it is used it will not work.

Again, AT_Pointer is here inadvertently, it happens to be in the same
file. If you wish to discuss the merits of AT_Pointer we'll never
finish. I have also implemented an alternative to std::list that is
guaranteed to work correctly with AT_Pointer.

>
>
>>>You're now advocating an intrusive implementation, are you?
>>>This is not a bad idea, as long as an externally reference
>>>counted implementation is also possible, ideally without
>>>exposing the difference to end user.
>>
>>I need to see why you would want to mix these concepts.
>
>
> I don't see that an intrusive implementation alone can
> provide a complete solution, whereas I think an external
> implementation by itself could. (See my example, above,
> using std::map for reasons.)

In my experience, I find very few examples where intrusive RC is needed
and hence I would find it a waste of resources.

>
>
>>What would be the purpose of designing a class that required external
>>reference counting ? If this is an issue primarily of legacy
>>compatability, then I have still not run into this issue.
>>
>>I'd need to see what real-life problems are being solved with external
>>reference counting.
>
>
> Not legacy code. See above.

covered above.

>
>
>>>Can you write me such a policy class that handles external
>>>reference counting? What are the function arguments to
>>>IncRefCount? And how is the reference count accessed from
>>>them?
>>
>>I suspect so.
>
>
> Well, could you show me one, please? I don't see how it can
> feasibly be handled in your framework (short of having a
> global map from pointer to ref count, or something equally
> ugly). Demonstrate to me that I'm wrong.
>

I'll need to do this later.

> [About class names:]
>
>>Seriously, I don't care.
>
>
> Well I do, and you just have to take a look at the boost
> mailing list to see that lots of highly respected developers
> will spend significant amounts of time finding well-chosen
> names for classes.

Yes, you are quite correct. This should be an alarm for you where to
spend your time.

You're welcome to suggest new names.

>
>
>>>So, w_ClassRef is supposed to be a pointer type. I.e.
>>>completely at odds to almost all existing smart pointer
>>>implementions where the first argument is the pointee. Thus
>>>you want to write AT_LifeLine<T*> rather than shared_ptr<T>.
>>
>>yes.
>>
>>or
>>
>>typedef X * Xp;
>>
>>AT_Lifetime<Xp>
>
>
> I think you're missing the point: this is not the interface
> that other smart pointer classes use, and I can see no
> advantage to doing it differently.

Create a shared_ptr to an Xp.

>
>
>>> * The RHS is being semantically modified by the
>>> constructor, and therefore should be taken by
>>> non-const reference.
>>
>>hmm, ok, you've hit a point that I glossed over during implementation.
>>The reference count is somewhat separate from the rest of the class
>>implementation. Hence the reference count is mutable.
>
>
> The point is not whether the argument needs to be a
> non-const reference to compile, but whether it semantically
> should be non-const. Clearly, the copy constructor changes
> the observable state of the source smart pointer.
> (Initially it is a useable smart pointer, afterwards,
> ownership has been transferred away and the pointer is
> effectively unusable.) As it changes the state of the
> source object, it should take the source by non-const
> reference. The fact that this allows you to remove the
> 'mutable' keyword is incidental.
>

Well, this is a fundamentally different approach than that of LifeTime
as I described earlier. Reference counting controls the life of the
object. Constness controls the modifiability of the object. I would
think that something that is not allowed to be modified should still be
able to be referenced?

Still, I think this has little bearing on

>
>>>> inline ~AT_LifeLine() {}
>>>
>>>What happens when an exception results in the pointer being
>>>destroyed before it has transfered ownership away? Like
>>>this it leaks resources, or with your debugging hooks, will
>>>assert. This is a very bad.
>>
>>The alternative may also be "bad".
>
>
> Why? Why should it be "bad" to call DecRefCount on
> destruction if the ownership has not been transfered away?
> Can you give me a concrete example of how this might cause
> problems?
>

Because I'd rather do nothing.

>
>>>I don't think you want an operator= that has a raw pointer
>>>on its RHS. And I note that it doesn't take ownership of
>>>the pointer.
>>
>> AT_LifeLine<T*> p;
>>
>> ...
>>
>> p = new T;
>
>
> That's fine, but the problem is in accidental assignment
> from a pointer where you should not take ownership. Having
> a method to do this, e.g.
>
> p.reset( new T );
>
> makes this more explicit.

I might steal this idea. Thanks.

>
>
>>>It's normal to supply an operator*() as well.
>>>
>>
>>This would provide a raw pointer ?
>
>
> No, a raw reference. This makes the following work:
>
> void function( const T& );
>
>      AT_LifeLine<T*> ll;
>
> function( *ll );
>

OK.

>
>
>>>Eurgh. Implicit conversions to the raw pointer type. Have
>>>you considered how easy it would be to implicitly convert it
>>>to a raw pointer and thus destroy the smart pointer,
>>>probably causing a resource leak. (Unless the destructor
>>>asserts, which doesn't seem much better.)
>>>
>>
>>No. In the intended use, this would never happen.
>
>
> Well, don't provide the conversion operators then. If
> they're not needed in the intended use, and are unsafe,
> get rid of them.
>

agreed.

>
>>>> inline w_ClassRef * InnerReference()
>>>
>>>What's wrong with just assigning the new value into the
>>>pointer?
>>>
>>> myLifeTime<T*> ptr;
>>> T* tmp = 0;
>>> legacy_function( &tmp );
>>> ptr = tmp; // Or whatever the correct way to transfer
>>>            // ownership is.
>>
>> legacy_function( ptr.InnerReference() );
>>
>>replaces 3 of the lines above.
>
>
> The point is that legacy code will attach many different
> semantics to a T**, so there is no one correct answer. For
> example, why do you release and nullify the original
> pointer? Some applications might want T** to point to a
> valid pointer which they possibly update.

The intended use is simply for the creation of a new object.
The semantics are very narrow but very common in COM. I can see this
being a helper friend class/function instead of a member method to make
it appear more separate. I've also used this in an ACE construction
call. It happens frequently.

> And what about
> legacy code that tries to create the object using the wrong
> heap (e.g. using malloc rather than new)?

And what about that ?

It's up to our interface to make sure the right thing happens.

Am I missing something ?

>
> Furthermore, there's no reason why this needs to be a member
> function -- as I've shown above, everything can be done
> using the public interface. If your legacy libraries attach
> particular semantics to T** pointers, then, fine provide
> your own helper function, but don't make it part of the
> smart pointer's interface.
>

OK.

>
>>>>* The AT_LifeView smart pointer is intended to be used in a very
>>>>* specific situation: You need to pass an object's smart pointer
>>>>* to a function so the function can do something with the object.
>>>
>>>
>>>Why not just pass a reference to the underlying type? (Not
>>>even the pointer, just a plain old reference.)
>>
>>how would you resolve the ambiguity of
>>
>>T * ptr = new T;
>>....
>>shared_ptr<T> sptr = ptr;
>>
>>Method( ptr );
>>....
>>
>>void Method( T * ptr )
>>{
>> shared_ptr<T> sptr = ptr;
>>}
>>
>>Obviously something is wrong here. *1
>
>
> For a start, with boost::shared_ptr this won't compile.
> Boost's shared_ptr does not allow you to write
>
> shared_ptr<T> sptr = ptr;
>
> instead you must either write
>
> shared_ptr<T> sptr( ptr );
>
> or call reset.
>
> Secondly, you'd be much better off passing T by reference
> rather that by pointer. This would have avoided the
> problem.

Except that null is a valid T* and that null is not a valid T&.

>
> Thirdly, it's a bloody stupid thing to do. You should
> never, ever pass anything to a boost::shared_ptr other than
> the immediate return value from a new expression.
> Following this and a few other simple rules avoids any
> problems.

You're probably right and they should be removed but I need to go
through and figure out where I had problems and see if they are now
resolved with the other classes (View and Line).
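Richard's rule quoted above ("never pass anything to a boost::shared_ptr other than the immediate return value from a new expression") can be made concrete. A minimal sketch (std::shared_ptr is used here for a self-contained example; boost::shared_ptr behaves the same way), including, in comments, the double-ownership bug the rule prevents:

```cpp
#include <cassert>
#include <memory>

struct foo { int n; foo() : n(0) {} };

// Safe: the shared_ptr is constructed directly from the new
// expression, so exactly one reference count ever owns the object.
std::shared_ptr<foo> make_foo() {
    return std::shared_ptr<foo>(new foo);
}

// Unsafe pattern (do NOT do this): two independent shared_ptrs built
// from the same raw pointer each believe they own it -> double delete.
//
//   foo* raw = new foo;
//   std::shared_ptr<foo> a(raw);
//   std::shared_ptr<foo> b(raw);   // undefined behaviour on destruction
```

Sharing is then always done by copying the smart pointer itself, never by re-wrapping the raw pointer.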

>
> I repeat what I said in my first post to this thread:
> aside from problems with cyclic ownership, the majority of
> problems I've had using boost::shared_ptr have been due to
> compiler bugs. It really is a very simple class to use
> correctly.
>
>
> [Regarding spurious increments / decrements to the ref
> count:]
>
>
>>>If you make the mechanism for altering the refence count
>>>inlineable, then this becomes a trivial penalty. This is
>>>what the Boost implementation does. See other posts in this
>>>thread for more details.
>>
>>At an inevitable cost.
>
>
> Insignificant. How many instructions do you think it takes
> to do ++*ptr->refcount when inlined?

If you can inline the increment AND decrement, none. But inlining is
not what I'm concerned about.

There is no question that AT_Life* will allow the compiler to make more
significant optimizations when you cannot inline and that they can be
used to manage virtually any type of intrusive reference counted class.

If shared_ptr could do these things then I would walk away.

>
> And you still haven't said what's wrong with:
>
> void foo( const boost::shared_ptr<T>& )
>
> This doesn't modify the refcount at all.

Another level of indirection ?

>
>
>>>>template < class w_ClassRef, class w_RefTraits >
>>>>class AT_LifeView
>>>
>>>
>>>I don't see what this class does that a raw pointer or
>>>reference doesn't do. If you don't want to transfer
>>>ownership, this is precisely when you should be using a raw
>>>pointer or reference.
>>>
>>
>>It resolves the issue *1 above.
>
>
> Explain how issue 1, above, applies when you pass the raw
> object by reference, e.g.
>
> void foo( const T& );
> boost::shared_ptr<T> t;
> foo( *t );

Don't do that.

>
>
>>AT_Pointer was something I used to work around issues with a std::list
>>class I was using.
>
>
> It doesn't work around the problems at all. You may not
> have noticed them, and it's possible that your
> particular STL's std::list broadly works with your class,
> but it isn't portable and probably doesn't even work
> correctly in all cases in your STL.
>

correct.

>
>>As far as I am concerned it's incomplete but
>>useful in a narrow way.
>
>
> In what way is it incomplete?

I don't know. I use it in very limited ways so it's complete enough for
what I use it for. I mostly use AT_Life*.

>
>
>>However, while we're on the topic, does:
>> std::list< auto_ptr<T> >
>>
>>do what I expect ?
>
>
> It does exactly the same as
>
> std::list< AT_Pointer<T> >

hmm, except I couldn't get std::list< auto_ptr<T> > to do what I expected.

>
> i.e. it invokes undefined behaviour. Don't do either.

It can and does have defined behaviour in a narrow way, and since you
know a lot about what you're talking about, you can figure it out.
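The reason std::list< auto_ptr<T> > invokes undefined behaviour is that auto_ptr's "copy" transfers ownership, while the standard containers require that a copy be equivalent to its source. Elements with shared ownership satisfy that requirement trivially, which is the well-defined alternative, sketched here with std::shared_ptr for a self-contained example:

```cpp
#include <cassert>
#include <list>
#include <memory>

struct item { int v; explicit item(int n) : v(n) {} };

// std::auto_ptr's copy moves ownership out of the source, breaking
// the container requirement that copies are equivalent.  Elements
// that share ownership satisfy it, so this container is well defined.
typedef std::list<std::shared_ptr<item> > item_list;

item_list make_list() {
    item_list l;
    l.push_back(std::shared_ptr<item>(new item(1)));
    l.push_back(std::shared_ptr<item>(new item(2)));
    return l;   // copying the list copies pointers, not items
}
```

(C++11 later resolved this properly with move-aware containers and std::unique_ptr, but within the 2003 language, shared ownership was the portable answer.)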

>
>
>>Again, I raise the issue of mixing too many concepts in the same class.
>
>
> What I'm trying to point out, is that the problem that
> you say AT_Pointer is there to solve -- namely that not all
> classes can be reference counted -- is only an issue because
> you've chosen to ignore external reference counting. I
> really can't see a general smart pointer library being
> accepted into the Standard without support for external
> reference counting. As your stated aim is to get this
> accepted into the Standard, you MUST consider external
> reference counting.
>

My use of AT_Pointer is really not the topic of discussion. I certainly
do not advocate using AT_Pointer in the way I have used it. Having said
that, if you have a very narrow problem to solve it can be done. I have
found that it is the least used smart pointer by far because most things
are done with "intrusive" reference counting.

Carl Daniel

unread,
Jul 12, 2003, 12:11:13 PM7/12/03
to
"Edward Diener" wrote:

> "Carl Daniel" wrote:
>> Perhaps I'm missing something obvious, but the use of a mutex to
>> provide MT safety in boost::shared_ptr guarantees that it is in fact
>> far less efficient than typical COM reference counting, including the
>> virtual function call overhead. If shared_ptr was updated to use
>> InterlockedIncrement, then it could be faster than the typical COM
>> AddRef() call due to the virtual function call in the latter.
>
> Perhaps you are missing the fact that operating systems other
> than Windows exist, and that InterlockedIncrement is not a C++
> Standard library function.
>

No, not missing that at all. Merely commenting on a comment about
shared_ptr and COM. The fact that COM doesn't exist on many OS's in no way
detracts from the discussion. The fact that InterlockedIncrement is not a
standard library function is similarly irrelevant: the entire boost::thread
library, from which the mutex is obtained, is built on non-standard library
functions (which follows naturally, since there are no standard library
functions for threading/thread safety).

The core issue, IMO, is not even about COM, but the fact that in an MT-safe
version, boost::shared_ptr uses a mutex to protect the atomic increment of
not one but two counts (see Richard Smith's reply to my posting). This
means that copying a boost::shared_ptr in an MT program on Windows is
somewhere between 100's to 1000's of times slower than copying a plain
pointer. I'm sure the cost on a pthreads system will be similarly high.
IMO, that's far too high a price to pay, regardless of whether you're using
it with COM.

-cd

Alexander Terekhov

unread,
Jul 12, 2003, 3:44:30 PM7/12/03
to

Richard Smith wrote:
[...]

> Why does this complicate matters? Because now we have two
> separate counts to maintain, and I know of no platforms that
> provide atomic operations to manipulate a pair of variables
> simultaneously -- hence the use of a mutex in the Boost
> implementation.

You don't really need "atomic operations to manipulate a PAIR
of variables simultaneously".

http://terekhov.de/pthread_refcount_t/experimental/refcount.cpp
http://groups.google.com/groups?selm=3EC0F1F1.B78AA0DA%40web.de
(Subject: Re: Atomic ops on Intel: do they sync?)

--
http://google.com/groups?threadm=3F0EBAF7.9E76235E%40web.de
(Subject: Bah!)

Carl Daniel

unread,
Jul 12, 2003, 3:45:03 PM7/12/03
to
David Abrahams wrote:
> cpda...@nospam.mvps.org ("Carl Daniel") writes:
>> Perhaps I'm missing something obvious, but the use of a mutex to
>> provide MT safety in boost::shared_ptr guarantees that it is in fact
>> far less efficient than typical COM reference counting, including
>> the virtual function call overhead.
>
> Of course, a mutex is only used if multithreading is enabled.

In my experience, nearly all interesting programs that aren't command-line
tools are multi-threaded, so a significant cost that "only" appears in the
MT version is still a significant cost.

>
>> If shared_ptr was updated to use InterlockedIncrement, then it could
>> be faster than the typical COM AddRef() call due to the virtual
>> function call in the latter.
>
> Which shows that the cost is a "mere" implementation detail. Care to
> submit a patch?

See Richard Smith's reply to my posting. I hadn't looked through shared_ptr
in a while, but he's correct - it's updating two counts. To atomically
increment two counts requires a mutex (or equivalent). If the two counts do
need to be updated atomically, it's doomed to be 100's to 1000's of times
slower than the ST version of the same operation. It's unfortunate that
weak_ptr has such a significant cost. Perhaps it's not actually necessary
that both counts be incremented atomically - I haven't studied the code
enough to say.

>
>> Worse, COM objects "know" how their reference counts should be
>> maintained. A COM object that's "single threaded"
>
> Well isn't it actually ``a COM _type_ that's "single threaded"?''

Yes, that's correct. In typical COM programming, it's unusual to have more
than a single instance of a particular type, but it does happen.

>> Thus, for a very common case,
>> the boost::shared_ptr mechanism is nearly guaranteed to be less
>> efficient than normal COM reference counting.
>
> That's an interesting point. Of course, COM's apartment model is
> quite problematic in many cases, and having to decide
> single/multi-threadedness per-type instead of per-object can be a
> problem too, can't it?

Yes, it can. Mostly a problem for the consumer - choosing to make a
component single threaded is usually a concession to the developer of the
component, since (s)he may be working in a language (such as Visual Basic)
that supports COM development but doesn't effectively support
multi-threading.

-cd

Howard Hinnant

unread,
Jul 12, 2003, 3:45:20 PM7/12/03
to
In article <%iUPa.1119$tE5.12...@newssvr13.news.prodigy.com>, Carl
Daniel <cpda...@nospam.mvps.org> wrote:

| The core issue, IMO, is not even about COM, but the fact that in an MT-safe
| version, boost::shared_ptr uses a mutex to protect the atomic increment of
| not one but two counts (see Richard Smith's reply to my posting). This
| means that copying a boost::shared_ptr in an MT program on Windows is
| somewhere between 100's to 1000's of times slower than copying a plain
| pointer. I'm sure the cost on a pthreads system will be similarly high.
| IMO, that's far too high a price to pay, regardless of whether you're using
| it with COM.

This is drifting from the subject of the original post somewhat, but I
have been questioning the value of a "MT-safe" shared_ptr. I recently
used shared_ptr in an MT environment as part of the implementation of a
larger object. In this particular application I found it better to put
the synchronization primitive at a higher level than shared_ptr's
counts. And once placed at that higher level, an additional mutex (or
whatever) for shared_ptr's count was just an expensive redundancy.

--
Howard Hinnant
Metrowerks

Andrei Alexandrescu

Jul 12, 2003, 8:13:14 PM
"Gianni Mariani" <gi2n...@mariani.ws> wrote in message
news:bene2f$c...@dispatch.concentric.net...

> The right answer here on this depends on a compromise of the following
> requirements:
>
> a) "zero overhead" principle
> b) exception safety
> c) correctness - or: elimination of usage errors - or: make it possible
> for the compiler to pick up usage errors.
> (and probably more)
>
> Depending on which requirement I give priority I get different answers
> to "what is right".
>
> I tend to place the "zero overhead" principle above all others which is
> NOT the right answer for everyone or all situations obviously but
> sometimes it is. Hence when it is the overriding principle, using a
> solution that does not provide the most efficient solution can be a
> significant problem. Also, smart pointers, or rather object lifetime
> management, is such a fundamental issue for C++ that it will become
> pervasive throughout any code base and hence will become a topic of many
> frustrations to come. To cap this paradox off, very few programmers
> will grasp the true complexity of smart pointers (as I have already
> witnessed). A typical formula for intractable issues.
>
> So, my intention here is to open the discussion on alternatives and
> discuss their merits.

That's a laudable initiative, and its rationale is very much to my liking.
However, by skimming over the discussion, it looks like you want to replace
TR1 smart pointer's set of tradeoffs with another set of tradeoffs. That
kind of shuffling tradeoffs around doesn't buy a lot, especially given that
there seems to be evidence that the TR1 smart pointer's set of tradeoffs
tends to please many.

This being said, have you taken a look at Loki::SmartPtr? It sports a design
in which the user is in control of the tradeoff, so you can choose
performance or safety within the same framework and without having to define
new smart pointer types from scratch.


Andrei

Gianni Mariani

Jul 13, 2003, 1:37:05 AM

And the reason you make assertions on my intentions is ?

That
> kind of shuffling tradeoffs around doesn't buy a lot, especially given that
> there seems to be evidence that the TR1 smart pointer's set of tradeoffs tends
> to please many.

Are you trying to say this is a democratic decision ? I'm lost as to
what your point might be.

>
> This being said, have you taken a look at Loki::SmartPtr. It sports a design
> in which the user is in control of the tradeoff, so you can choose
> performance or safety within the same framework and without having to define
> new smart pointer types from scratch.

Admittedly, I had forgotten about Loki. Sure enough it's in my downloads
directory from about 6 months ago. I'll need to look at it further.

I still can't see the LifeLine/LifeView concepts there but I do see the
handlers for different kinds of reference counting control interfaces.

Thanks for the pointer.

Edward Diener

Jul 13, 2003, 2:00:52 AM
"Carl Daniel" wrote:
> "Edward Diener" wrote:
>> "Carl Daniel" wrote:
>>> Perhaps I'm missing something obvious, but the use of a mutex to
>>> provide MT safety in boost::shared_ptr guarantees that it is in fact
>>> far less efficient than typical COM reference counting, including
>>> the virtual function call overhead. If shared_ptr was updated to
>>> use InterlockedIncrement, then it could be faster than the typical
>>> COM AddRef() call due to the virtual function call in the latter.
>>
>> Perhaps you are missing the fact that other operating systems other
>> than Windows exists, and that InterlockedIncrement is not a C++
>> Standard library function.
>>
>
> No, not missing that at all. Merely commenting on a comment about
> shared_ptr and COM. The fact that COM doesn't exist on many OS's in
> no way detracts from the discussion.

But it does make it so much easier to say something good about how COM does
something as opposed to a multi-OS paradigm such as shared_ptr.

> The fact that
> InterlockedIncrement is not a standard library function is similarly
> irrelevant: the entire boost::thread library, from which the mutex
> is obtained, is built on non-standard library functions (which
> follows naturally, since there are no standard library functions for
> threading/thread safety).

Agreed.

My point was that to use atomic operations to avoid a mutex involves a lower
level approach for each operating system on which Boost is supported. I have
no doubt this can be done, and probably done best by creating a templated
atomic change which has to be coded differently for each OS. Mutexes are at
a slightly higher level, but your point is well-taken.

>
> The core issue, IMO, is not even about COM, but the fact that in an
> MT-safe version, boost::shared_ptr uses a mutex to protect the atomic
> increment of not one but two counts (see Richard Smith's reply to my
> posting). This means that copying a boost::shared_ptr in an MT
> program on Windows is somewhere between 100's to 1000's of times
> slower than copying a plain pointer. I'm sure the cost on a pthreads
> system will be similarly high. IMO, that's far too high a price to
> pay, regardless of whether you're using it with COM.

I don't see it as too high a price to pay because such a belief is purely
relative. Unless I am copying a very large number of shared_ptrs back and
forth, I probably won't even notice it on today's high-powered CPUs.

You have a good point that perhaps Boost needs to have cross-OS atomic
change functionality for those cases where it exists and use a mutex where
it does not.

Peter Dimov

Jul 13, 2003, 11:56:54 AM
cpda...@nospam.mvps.org ("Carl Daniel") wrote in message news:<%iUPa.1119$tE5.12...@newssvr13.news.prodigy.com>...

>
> The core issue, IMO, is not even about COM, but the fact that in an MT-safe
> version, boost::shared_ptr uses a mutex to protect the atomic increment of
> not one but two counts (see Richard Smith's reply to my posting). This
> means that copying a boost::shared_ptr in an MT program on Windows is
> somewhere between 100's to 1000's of times slower than copying a plain
> pointer. I'm sure the cost on a pthreads system will be similarly high.
> IMO, that's far too high a price to pay, regardless of whether you're using
> it with COM.

Your numbers are wrong by a factor of between 13 and 229, another way
of saying that the real win32-MT boost::shared_ptr is 4.37 to 7.6
times slower to copy than a plain pointer according to my
measurements. (*) Feel free to verify my numbers independently and
post the results if there is a significant difference. I wonder why
you consider the comparison meaningful, though.

Taking the time to measure before speculating is well worth the
investment, IMO.

(*) http://www.boost.org/libs/smart_ptr/test/shared_ptr_timing_test.cpp

Richard Smith

Jul 13, 2003, 11:57:02 AM
Carl Daniel wrote:

> If the two counts do
> need to be updated atomically, it's doomed to be 100's to 1000's of times
> slower than the ST version of the same operation.

Let's not be too pessimistic -- it's certainly not *that*
bad. Using this test code I posted earlier in this thread:

struct X { double w,x,y,z; };
void foo( boost::shared_ptr<X> ) {}

int main() {
    timer t;

    for ( int i(0), n(int(1E6)); i<n; ++i ) {
        boost::shared_ptr<X> x(new X);
        foo(x); foo(x);
    }

    std::cout << std::setprecision(3) << t.get_time() << "s\n";
}

compiled and run on Linux (with threading via pthreads) by

[richard@verdi test]$ g++-3.2.1 smptr.cpp -O2 -g -osmptr \
-I /usr/local/include/boost_1_30/
[richard@verdi test]$ N=20; for ((i=0;$i<$N;++i)); do \
./smptr; done | awk '{s+=$1} END{printf("%.3f\n", s/'$N')}'

I profiled the single-threaded version at 0.781 and the
multi-threaded version at 0.967. This corresponds to a 23%
increase in run time.

Repeating with a main() changed to read:

int main() {
    timer t;
    boost::shared_ptr<X> src(new X);

    for ( int i(0), n(int(1E6)); i<n; ++i ) {
        boost::shared_ptr<X> x(src);
        foo(x); foo(x);
    }

    std::cout << std::setprecision(3) << t.get_time() << "s\n";
}

I get values of 0.077 (single-threaded) versus 0.221
(multi-threaded), corresponding to a 187% increase in run
time.

> Perhaps it's not actually necessary
> that both counts be incremented atomically - I haven't studied the code
> enough to say.

It is possible to use standard atomic increments / decrements
on the weak count, which would improve efficiency
somewhat. (This is not an optimisation that Boost currently
does.) With the existing mechanism for creating shared_ptrs
from weak_ptrs, I can't see how to avoid the mutex. I've
discussed this in more detail in response to Alexander
Terekhov's post in this thread.

--
Richard Smith

Richard Smith

Jul 13, 2003, 1:49:57 PM
Alexander Terekhov wrote:

> You don't really need "atomic operations to manipulate a PAIR
> of variable simulatenously".
>
> http://terekhov.de/pthread_refcount_t/experimental/refcount.cpp
> http://groups.google.com/groups?selm=3EC0F1F1.B78AA0DA%40web.de
> (Subject: Re: Atomic ops on Intel: do they sync?)

This is very interesting article -- thank you for pointing
me to it.

I spent quite a while last night trying to work out how to
do it without a mutex, and got everything working except the
mechanism for converting a weak_ptr to a shared_ptr: i.e.
the following two functions

shared_ptr<T> weak_ptr<T>::lock()
shared_ptr<T>::shared_ptr(weak_ptr<T> const& )

Quoting from your mail, this is done via the following
function:

bool refs::aquire_strong_from_weak() {
    int status = pthread_refcount_increment_positive
        ( &strong_count );
    if (PTHREAD_REFCOUNT_DROPPED_TO_ZERO == status)
        return false;
    return true;
}

The pthread_refcount_increment_positive function is part of
your own proposed extension to POSIX. It does the following
in an atomic manner:

int pthread_refcount_increment_positive
    ( pthread_refcount_t *rc )
{
    if ( *rc == 0 )
        return PTHREAD_REFCOUNT_DROPPED_TO_ZERO;
    ++*rc;
    return 0;
}

Which is precisely what is needed in this situation. My
concern, however, is how this can be implemented. Looking
at your implementation at

http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.c

I notice you use a mutex to ensure synchronisation in all
the atomic functions, so we're back to square one -- all
we've succeeded in doing is moving the mutex from the
smart pointer library to the pthread_refcount library.

Atomic operations such as (++*rc) and (--*rc == 0) can often
be implemented using the atomic operations of the underlying
architecture. For example, Intel's x86 series of processors
[where x >= 4] allow the use of the LOCK prefix to
instructions such as INC and DEC, thus using hand-crafted
machine code, it is possible to write very efficient atomic
increment / decrement functions. Windows supplies
Interlocked{Increment, Decrement} functions that do
precisely this.

Unfortunately, I'm not aware of any similar way to implement
your pthread_refcount_increment_positive function. This is
the problem, and means that the TR1 smart pointer library
violates the zero-overhead principle: in threaded code, you
cannot use shared_ptr without paying an overhead for weak_ptr
support, even when you do not use weak_ptr. How large that
overhead is, and whether it's a price worth paying is
another question.

--
Richard Smith

David Abrahams

Jul 13, 2003, 10:58:24 PM

cpda...@nospam.mvps.org ("Carl Daniel") writes:

> Perhaps it's not actually necessary that both counts be incremented
> atomically - I haven't studied the code enough to say.

ric...@ex-parrot.com (Richard Smith) writes:

> Unfortunately, I'm not aware of any similar way to implement
> your pthread_refcount_increment_positive function. This is
> the problem, and means that the TR1 smart pointer library
> violates the zero-overhead principle: in threaded code, you
> cannot use shared_ptr without paying an overhead for weak_ptr
> support, even when you do not use weak_ptr. How large that
> overhead is, and whether it's a price worth paying is
> another question.

Well, I think it's easy to get zero-overhead shared_ptr copying back.
Right now we have "cheap" weak_ptr copies and "expensive" shared_ptr
copies. Shouldn't the balance be pushed in the other direction?

I might be missing something, but it seems to me that if, instead of
storing use_count, you store the difference between weak_count and
use_count, shared_ptr can be copied with an atomic increment operation
on the weak_count.

Is there a hidden race condition lurking here? I don't see one.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

Carl Daniel

Jul 14, 2003, 1:34:33 AM
Peter Dimov wrote:

>
> Your numbers are wrong by a factor of between 13 and 229, another way
> of saying that the real win32-MT boost::shared_ptr is 4.37 to 7.6
> times slower to copy than a plain pointer according to my
> measurements. (*) Feel free to verify my numbers independently and
> post the results if there is a significant difference. I wonder why
> you consider the comparison meaningful, though.
>
> Taking the time to measure before speculating is well worth the
> investment, IMO.

Yep. My bad. I committed the cardinal sin of speculating based on
incomplete and inaccurate information - my apologies.

I was taken aback by the use of the name "mutex", when in fact it doesn't
map to an OS mutex at all, but a home-grown spin-lock or CriticalSection,
depending on some #defines and the platform. My re-running of your test
confirms about a 7x performance cost.

-cd

Peter Dimov

Jul 14, 2003, 1:41:41 PM
ric...@ex-parrot.com (Richard Smith) wrote in message news:<Pine.LNX.4.55.03...@sphinx.mythic-beasts.com>...

> Alexander Terekhov wrote:
>
> > You don't really need "atomic operations to manipulate a PAIR
> > of variable simulatenously".
> >
> > http://terekhov.de/pthread_refcount_t/experimental/refcount.cpp
> > http://groups.google.com/groups?selm=3EC0F1F1.B78AA0DA%40web.de
> > (Subject: Re: Atomic ops on Intel: do they sync?)
>
> This is very interesting article -- thank you for pointing
> me to it.

[...]

> int pthread_refcount_increment_positive
> ( pthread_refcount_t *rc )
> {
> if ( *rc == 0 )
> return PTHREAD_REFCOUNT_DROPPED_TO_ZERO;
> ++*rc;
> return 0
> }
>
> Which is precisely what is needed in this situation. My
> concern, however, is how this can be implemented.

int pthread_refcount_increment_positive( long * rc )
{
    long tmp;

    do
    {
        tmp = *rc;
        if(tmp == 0) return PTHREAD_REFCOUNT_DROPPED_TO_ZERO;
    }
    while( InterlockedCompareExchange(rc, tmp + 1, tmp) != tmp );

    return 0;
}

I think.

Gianni Mariani

Jul 14, 2003, 3:00:54 PM

Having given some thought to this, I have 2 counter proposals and a few
conclusions.

Some points to consider:

a) when an application uses smart pointers, their use will be pervasive,
and hence the performance of the smart pointer implementation is likely
significant.

b) an "optimal" or "near optimal" implementation of a smart reference
counted pointer, if available, is likely to be used when the
performance/complexity tradeoff warrants the compromise.

Having said that, I consider that there is a possible alternative
implementation of AT_LifeLine that presents a very small overhead and in
some cases may be optimized to zero overhead (when fully inlined) with
exception safety included. I would propose this as possibly a 4th
member of the AT_Life* family, leaving AT_LifeLine as a less safe but
100% zero overhead solution.

Now, these could be implemented on top of boost::shared_ptr but it seems
like it's already trying to do too much. The design and intent of
shared_ptr seems like it's trying to be as safe as possible at the cost
of performance and placing such a contradictory set of requirements
(absolute 100% zero overhead and absolute 100% no issues) is likely to
cause far more issues than it's worth trying to solve. However,
interoperability between smart pointer implementations should be fairly
simple, so I see even less of an issue than I first did when I originally
posted.

Unfortunately the root cause for this IS the expressiveness of the
language. (no intention to start a flame war - honest !) I agree that
C++ already suffers a significant overload of complexity and this
situation is likely not going to change.

So I've given myself yet more homework. If anyone is interested in the
alternative AT_Life* smart pointers let me know and maybe I'll work on
it sooner.

This question should be raised, however: the TR1 suggested smart pointers
do not pass the "zero overhead principle" requirement, so why should they
be adopted as part of the C++ standard?

Pete Becker

Jul 14, 2003, 5:55:47 PM
Howard Hinnant wrote:
>
> This is drifting from the subject of the original post somewhat, but I
> have been questioning the value of a "MT-safe" shared_ptr. I recently
> used shared_ptr in an MT environment as part of the implementation of a
> larger object. In this particular application I found it better to put
> the synchronization primitive at a higher level than shared_ptr's
> counts. And once placed at that higher level, an additional mutex (or
> whatever) for shared_ptr's count was just an expensive redundancy.
>

In another context, locking inside a stream inserter or extractor
doesn't help multi-threaded code, because it doesn't guarantee that
multiple insertions will operate atomically. So the application has to
add locks around all blocks of I/O operations to the same stream, and the
locks in the inserters and extractors are then redundant.

Thread safety is a design issue. Locks (including locks in smart
pointers and stream operations) are tools. Putting them inside a library
is usually the wrong thing -- too low level. They protect the library
from corruption, but that's not really a problem that ought to be
solved, because the application still won't be correct. A correctly
written multi-threaded application typically won't need those low level
locks.

--

"To delight in war is a merit in the soldier,
a dangerous quality in the captain, and a
positive crime in the statesman."
George Santayana

"Bring them on."
George W. Bush

Andy Sawyer

Jul 14, 2003, 5:56:11 PM
In article <%qUPa.1120$rs5.12...@newssvr13.news.prodigy.com>,
on Sat, 12 Jul 2003 19:45:03 +0000 (UTC),

cpda...@nospam.mvps.org ("Carl Daniel") wrote:

> In my experience, nearly all interesting programs that aren't
> command-line tools are multi-threaded, so a significant cost, that
> "only" appears in the MT version is a significant cost.

Quake isn't multi-threaded - I found that pretty interesting :).

More significantly, I also worked in a financial trading exchange
(again, pretty interesting) where the main trade-matching program
(i.e. the heart of the exchange's whole business) was single-threaded
(primarily for legal reasons, apparently). In fact, although it ran on a
multi-processor machine, the requirements were that this application
only ever executed on a single, specified CPU. These examples may be
considered corner cases, but they do illustrate the fact that there are
plenty of complex applications which are still single
threaded. (admittedly, only one of my examples runs on Windows ;))

> > Well isn't it actually ``a COM _type_ that's "single threaded"?''
>
> Yes, that's correct. In typical COM programming, it's unusual to have
> more than a single instance of a particular type, but it does happen.

In my experience, it happens often enough that I wouldn't consider it at
all 'unusual'. I've worked in environments where there are literally
thousands of instances of a single type at any given moment.

Regards,
Andy S.
--
"Light thinks it travels faster than anything but it is wrong. No matter
how fast light travels it finds the darkness has always got there first,
and is waiting for it." -- Terry Pratchett, Reaper Man

Dave Harris

Jul 14, 2003, 5:56:32 PM
da...@boost-consulting.com (David Abrahams) wrote (abridged):

> That's an interesting point. Of course, COM's apartment model is
> quite problematic in many cases, and having to decide
> single/multi-threadedness per-type instead of per-object can be a
> problem too, can't it?

Deciding per-type is less of a problem than deciding per-program, as the
current shared_ptr<> proposal requires.

With a policy-based design, we could decide on a per-pointer basis,
perhaps with type-wide or program-wide defaults. And this is an
example of a policy which /ought/ to be part of the pointer's type, so
that passing a multi-threaded pointer to single-threaded code becomes a
compile-time error.

-- Dave Harris, Nottingham, UK

Dave Harris

Jul 14, 2003, 5:56:49 PM
pdi...@mmltd.net (Peter Dimov) wrote (abridged):
> [sometimes] the optimization you want is not possible.

OK. I tend to think of reference-counting as poor-man's garbage
collection, and I don't care if a counted object is destroyed early
if there are no reachable references to it, but I suppose C++ isn't
like that. Would a special dispensation to optimise temporary
shared_ptrs be worth considering?

There remain some important special cases, for example when all the
code is inline and the compiler can see that reset() is not called.

typedef boost::shared_ptr<int> IntPtr;

void func1( IntPtr p );

inline void func2( IntPtr p ) {
    func1( p );
}

void func3() {
    func2( IntPtr( new int ) );
}

The compiler has to increment and decrement when calling func1() for
the reason you explained, but that reason does not apply to the call
of func2(). If IntPtr were typedef'd as (int *), func2() could be a
zero-cost abstraction. It's a shame to lose that because of
use_count(), don't you think?

If use_count() must be kept, then the original poster is right and
the programmer must indicate explicitly when the optimisation is to
be used. This becomes a specific example of the general "pilfer",
"destructive copy" or "move" problem. It's probably better to adopt
a general solution, such as the "&&" reference-to-temporary
proposal, rather than address it specifically in the smart pointer
classes.

Andrei Alexandrescu

Jul 14, 2003, 5:58:16 PM
"Gianni Mariani" <gi2n...@mariani.ws> wrote in message
news:beqom2$c...@dispatch.concentric.net...

> Andrei Alexandrescu wrote:
> > That kind of shuffling tradeoffs around doesn't buy a lot, especially
> > given that there seems to be evidence that the TR1 smart pointer's set
> > of tradeoffs tends to please many.
>
> Are you trying to say this is a democratic decision ? I'm lost as to
> what your point might be.

My point will become clearer once you look at Loki::SmartPtr's design. It
fosters a new look at how you design things and decide on tradeoffs. If
shared_ptr and your design are points on a piece of paper, SmartPtr would be
the space around the paper. Ah, bad comparison.


Andrei

Alexander Terekhov

Jul 14, 2003, 5:59:22 PM

Richard Smith wrote:
[...]

> http://www.terekhov.de/pthread_refcount_t/poor-man/beta2/prefcnt.c
>
> I notice you use a mutex to ensure synchronisation in all
> the atomic functions, so we're back to square one -- all
> we've succeeded in doing is moving the mutex from the
> smart pointer library to the pthread_refcount library.

That's why I called that stuff "poor-man". ;-) A *blocking*
pthread_refcount_t is pretty much useless and was only added
to "promote portability". Non-blocking stuff is what you want
to have/use in any "serious appl". See my "experimental take"
on it... and it's written in C++, BTW. ;-)

>
> Atomic operations such as (++*rc) and (--*rc == 0) can often
> be implemented using the atomic operations of the underlying
> architecture. For example, Intel's x86 series of processors
> [where x >= 4] allow the use of the LOCK prefix to
> instructions such as INC and DEC, thus using hand-crafted
> machine code, it is possible to write very efficient atomic
> increment / decrement functions. Windows supplies
> Interlocked{Increment, Decrement} functions that do
> precisely this.

Don't get me started on silly MS-interlocked stuff, please.

>
> Unfortunately, I'm not aware of any similar way to implement
> your pthread_refcount_increment_positive function.

http://google.com/groups?selm=3EC108B8.980ACE53%40web.de
(Subject: Re: scoped static singleton question)

<quote>

int pthread_refcount_increment_positive(
    pthread_refcount_t * refcount
)
{
    std::size_t val;
    do {
        val = refcount->atomic.load(); // Naked
        if (!val) return PTHREAD_REFCOUNT_DROPPED_TO_ZERO;
        assert(PTHREAD_REFCOUNT_MAX > val);
    } while (!refcount->atomic.attempt_update(val, val+1)); // Naked
    return 0;
}

</quote>

-ALSO-

http://www.terekhov.de/pthread_refcount_t/experimental/refcount.cpp

<quote>

bool increment_if_not_min() throw() {
    numeric val;
    do {
        val = m_value.load(msync::none);
        assert(max() > val);
        if (min() == val)
            return false;
    } while (!m_value.attempt_update(val, val + 1, msync::none));
    return true;
}

</quote>

> This is
> the problem, and means that the TR1 smart pointer library
> violates the zero-overhead principle: in threaded code, you
> cannot use shared_ptr without paying an overhead for weak_ptr
> support, even when you do not use weak_ptr. How large that
> overhead is, and whether it's a price worth paying is
> another question.

The only overhead is extra space needed for the counter and one
extra decrement [initialization aside for a moment] in:

http://groups.google.com/groups?selm=3EC0F1F1.B78AA0DA%40web.de
(Subject: Re: Atomic ops on Intel: do they sync?)

<quote>

void release_strong() {
    int status = pthread_refcount_decrement(&strong_count);
    if (PTHREAD_REFCOUNT_DROPPED_TO_ZERO == status) {
        destruct_object();
        status = pthread_refcount_decrement_rel(&weak_count);
        if (PTHREAD_REFCOUNT_DROPPED_TO_ZERO == status)
            destruct_self();
    }
}

</quote>

That's it.

regards,
alexander.

--
http://lists.boost.org/MailArchives/boost/msg43729.php
(Subject: [boost] Re: Weak ref. via atomicity (RE: Smart pointers:...))

Richard Smith

Jul 14, 2003, 11:01:53 PM
David Abrahams wrote:

> Well, I think it's easy to get zero-overhead shared_ptr copying back.
> Right now we have "cheap" weak_ptr copies and "expensive" shared_ptr
> copies. Shouldn't the balance be pushed in the other direction?

Ideally, yes.

> I might be missing something, but it seems to me that if, instead of
> storing use_count, you store the difference between weak_count and
> use_count, shared_ptr can be copied with an atomic increment operation
> on the weak_count.

I think the shared_ptr<T>::shared_ptr( const weak_ptr<T>& )
constructor still causes problems. Semantically, it needs
to do the following:

if ( weak_count == difference ) throw bad_weak_ptr;
++weak_count;

Imagine at the start of this block of code, there is one
weak_ptr and one shared_ptr. What happens if between these
two lines, another thread destroys the final shared_ptr,
decrementing weak_count to be equal to difference? This
means (I think) we need a mutex to protect this block of
code. And as soon as we have a mutex here, we need one
wherever we manipulate weak_count, including the
constructors and destructor of shared_ptr.

Or perhaps I'm misunderstanding what you intended?

--
Richard Smith

Howard Hinnant

Jul 14, 2003, 11:02:53 PM
In article <memo.20030713...@brangdon.madasafish.com>, Dave
Harris <bran...@cix.co.uk> wrote:

| Would a special dispensation to optimise temporary
| share_ptrs be worth considering?

Sounds like another excellent motivation for move semantics! :-)

--
Howard Hinnant
Metrowerks

David Abrahams

Jul 15, 2003, 12:03:22 AM
bran...@cix.co.uk (Dave Harris) writes:

> da...@boost-consulting.com (David Abrahams) wrote (abridged):
>> That's an interesting point. Of course, COM's apartment model is
>> quite problematic in many cases, and having to decide
>> single/multi-threadedness per-type instead of per-object can be a
>> problem too, can't it?
>
> Deciding per-type is less of a problem than deciding per-program, as the
> current shared_ptr<> proposal requires.
>
> With a policy-based design, we could decide on a per-pointer basis,
> perhaps with type-wide or program-wide defaults.

You don't need policies for that. Policies decide per-type. To
decide per-pointer, you need a flag and runtime checks.

> And this is an example of a policy which /ought/ to be part of the
> pointer's type, so that passing a multi-threaded pointer to
> single-threaded code becomes a compile-time error.

Why should it be one? Most single-threaded code is perfectly
threadsafe. It's usually only when you start doing "threading things"
that you get in trouble.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

Gianni Mariani

Jul 15, 2003, 11:35:18 AM
Andrei Alexandrescu wrote:
> "Gianni Mariani" <gi2n...@mariani.ws> wrote in message
> news:beqom2$c...@dispatch.concentric.net...
>
>>Andrei Alexandrescu wrote:
>>> That kind of shuffling tradeoffs around doesn't buy a lot, especially
>>> given that there seems to be evidence that the TR1 smart pointer's set
>>> of tradeoffs tends to please many.
>>
>>Are you trying to say this is a democratic decision ? I'm lost as to
>>what your point might be.
>
>
> My point will become clearer once you look at Loki::SmartPtr's design. It
> fosters a new look at how you design things and decide on tradeoffs. If
> shared_ptr and your design are points on a piece of paper, SmartPtr would be
> the space around the paper. Ah, bad comparison.

Well, my first impression of Loki is that it was trying to do too much.
I really need to go back and look at it more, but 5 template parameters?

template
<
    typename T,
    class OwnershipPolicy = RefCountedWrapper,
    class ConversionPolicy = DisallowConversion,
    class CheckingPolicy = AssertCheckWrapper,
    class StoragePolicy = DefaultSPStorageWrapper
>
class SmartPtr;

OK, maybe 5 is right, I'm advocating 3 in my own design.

I did say "first impression", so I really do need to look at it more
closely. ... after I get a few other things done first.

Richard Smith

Jul 15, 2003, 11:35:50 AM
Peter Dimov wrote:

> int pthread_refcount_increment_positive( long * rc )
> {
>     long tmp;
>
>     do
>     {
>         tmp = *rc;
>         if(tmp == 0) return PTHREAD_REFCOUNT_DROPPED_TO_ZERO;
>     }
>     while( InterlockedCompareExchange(rc, tmp + 1, tmp) != tmp );
>
>     return 0;
> }

Thank you.

I had been worried that a combination of the need to
support weak_ptr and threads would give an unnecessary
overhead to simple uses of shared_ptr. I'm now happy that
this is not the case.

I don't know how many platforms provide something like
InterlockedCompareExchange, but I know all recent ix86
processors allow this (via the lock cmpxchg instruction).

--
Richard Smith

David Abrahams

Jul 15, 2003, 11:35:51 AM
ric...@ex-parrot.com (Richard Smith) writes:

>> I might be missing something, but it seems to me that if, instead of
>> storing use_count, you store the difference between weak_count and
>> use_count, shared_ptr can be copied with an atomic increment operation
>> on the weak_count.
>
> I think the shared_ptr<T>::shared_ptr( const weak_ptr<T>& )
> constructor still causes problems. Semantically, it needs
> to do the following:
>
> if ( weak_count == difference ) throw bad_weak_ptr;
> ++weak_count;
>
> Imagine at the start of this block of code, there is one
> weak_ptr and one shared_ptr. What happens if between these
> two lines, another thread destroys the final shared_ptr,
> decrementing weak_count to be equal to difference? This
> means (I think) we need a mutex to protect this block of
> code. And as soon as we have a mutex here, we need one
> where-ever we manipulate weak_count, including the
> constructors and destructor of shared_ptr.
>
> Or perhaps I'm misunderstanding what you intended?

Nope, you understood what I meant. At the risk of underestimating a
threading problem, I think it might still be solvable.

First, to clarify, let's rename some variables. In the current
design weak_count is really (#weak + #shared), so the difference
we're talking about is really (#weak + #shared) - #shared = #weak.

So, I'm renaming

"weak_count" to "total"
and
"difference" to "nweak"

The goal is to use a mutex only for weak_ptr operations, and use
atomic operations on shared_ptr. Since only weak_ptr operations need
to manipulate nweak, we hope to protect manipulations of nweak with a
mutex but use atomic operations to manipulate total.

Invariants:

total >= nweak

once the last shared_ptr to the pointee disappears, there can
never be another.

OK, back to shared_ptr<T>::shared_ptr( const weak_ptr<T>& )

Since this function manipulates a weak_ptr<T>, it is protected by a
mutex and no other weak_ptr operations on the same pointee can happen
simultaneously, and nweak will not change during this function.

Pure shared_ptr operations can happen at will, so total will be
"volatile" during the execution of this function, but it will never be
less than nweak. Because no shared_ptrs can be created once the last
one has disappeared, however, a special case holds that once there are
no shared_ptrs, total becomes non-"volatile" when the mutex is held.

If we *start* with ++total, we can prevent any concurrent shared_ptr
destructions from deleting the pointee (if any). Then, assuming the
increment also allows us to read total, we can check whether total ==
nweak + 1 and if so, we know that there are no shared_ptrs left. We
can then decrement total and throw. Otherwise we can proceed.

Any problem here?


--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

---

Beman Dawes

unread,
Jul 15, 2003, 7:49:51 PM7/15/03
to
gi2n...@mariani.ws (Gianni Mariani) wrote in message news:<beuu4n$c...@dispatch.concentric.net>...

> This question should be raised however: The TR1 suggested smart pointers
> do not pass the "zero overhead principle" requirement so why should it
> be adopted as part of the C++ standard ?

The C++ committee isn't an authoritarian organization with a rigid set
of rules or formulas determining what gets accepted and what doesn't.
Rather it is a set of independent individuals who make up their own
minds about proposals. Some are primarily users, while others are
implementers, teachers, researchers, or writers. A wide range of
viewpoints.

So there is no mechanistic reason why a proposal gets accepted. Rather
it is because the collective judgment of the committee members is that
the pros considerably outweigh the cons.

You really should look more closely at Loki's SmartPtr, and
policy-based design in general, as Andrei suggested in another post.
It may give you just the ability to tinker with the behavior and
implementation that you seem to be looking for.

--Beman Dawes

ka...@gabi-soft.fr

unread,
Jul 15, 2003, 7:50:32 PM7/15/03
to
peteb...@acm.org (Pete Becker) wrote in message
news:<3F106B87...@acm.org>...
> Howard Hinnant wrote:

> > This is drifting from the subject of the original post somewhat, but
> > I have been questioning the value of a "MT-safe" shared_ptr. I
> > recently used shared_ptr in an MT environment as part of the
> > implementation of a larger object. In this particular application I
> > found it better to put the synchronization primitive at a higher
> > level than shared_ptr's counts. And once placed at that higher
> > level, an additional mutex (or whatever) for shared_ptr's count was
> > just an expensive redundancy.

> In another context, locking inside a stream inserter or extractor
> doesn't help multi-threaded code, because it doesn't guarantee that
> multiple insertions will operate atomically. So the application has to
> add locks around all blocks of io operations to the same stream, and the
> locks in the inserters and extractors are then redundant.

> Thread safety is a design issue. Locks (including locks in smart
> pointers and stream operations) are tools. Putting them inside a
> library is usually the wrong thing -- too low level. They protect the
> library from corruption, but that's not really a problem that ought to
> be solved, because the application still won't be correct. A correctly
> written multi-threaded application typically won't need those low
> level locks.

It depends. When two separate instances transparently share data, you
can't leave locking up to the user (who is not supposed to know about
the shared data). This is a problem with COW strings, for example.

With regards to reference counted smart pointers, I don't know, but my
feeling would be that they also need some locking. Globally, I expect
the same degree of thread safety that Posix gives me for built-in
objects. And I can certainly do something like:

MyClass* p ;
MyClass* q ;
p = q ;

within a single thread, without locking, regardless of what other
pointers in other threads might happen to point to the same object as
q. I would rather expect a similar guarantee from any smart pointer.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

Randy Maddox

unread,
Jul 15, 2003, 7:50:42 PM7/15/03
to
peteb...@acm.org (Pete Becker) wrote in message news:<3F106B87...@acm.org>...
>
> In another context, locking inside a stream inserter or extractor
> doesn't help multi-threaded code, because it doesn't guarantee that
> multiple insertions will operate atomically. So the application has to
> add locks around all blocks of io operations to the same stream, and the
> locks in the inserters and extractors are then redundant.

Absolutely correct. In this particular case the correct solution is
to provide an auxiliary StreamLock class that, on insertion
to/extraction from a stream, creates a temporary object that exists
for the duration of the full i/o expression. This has been discussed
elsewhere on this ng and has been used successfully. The other
benefit of this approach is that single-threaded code does not need to
use the StreamLock and therefore pays no price for what it does not
use.

>
> Thread safety is a design issue. Locks (including locks in smart
> pointers and stream operations) are tools. Putting them inside a library
> is usually the wrong thing -- too low level. They protect the library
> from corruption, but that's not really a problem that ought to be
> solved, because the application still won't be correct. A correctly
> written multi-threaded application typically won't need those low level
> locks.
>

Absolutely correct again. A thread-safe library can protect only its
own internal data structures, which is necessary but not sufficient
for the user of that library. And once such a user has added locking
at the appropriate level, the internal locking in the library is
generally, but not always, redundant so we sometimes end up paying a
cost for no benefit.

In general, a correctly designed multi-threaded program must provide
for its own locking and synchronization at a level higher than any
library. Thus I believe that while C++ should provide portable
abstractions that enable a multi-threaded program to be designed
correctly, it is not generally appropriate to build thread-safety into
most libraries since that is the wrong level of abstraction for this.
Some simple data structures with limited interfaces, such as a
reference-counted smart pointer or a stack or queue, can provide
appropriate thread-safety, but most others simply cannot, and should
not try to do so.

Randy.

Gianni Mariani

unread,
Jul 15, 2003, 10:02:26 PM7/15/03
to
Howard Hinnant wrote:
> In article <memo.20030713...@brangdon.madasafish.com>, Dave
> Harris <bran...@cix.co.uk> wrote:
>
> | Would a special dispensation to optimise temporary
> | share_ptrs be worth considering?
>
> Sounds like another excellent motivation for move semantics! :-)
>

Please fill in the dots. Are you advocating something like the
AT_LifeView but supported by the compiler ?

G

Peter Dimov

unread,
Jul 15, 2003, 10:02:47 PM7/15/03
to
bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20030713...@brangdon.madasafish.com>...

> pdi...@mmltd.net (Peter Dimov) wrote (abridged):
> > [sometimes] the optimization you want is not possible.
>
> OK. I tend to think of reference-counting as poor-man's garbage
> collection, and I don't care if a counted object is destroyed early
> if there are no reachable references to it, but I suppose C++ isn't
> like that. Would a special dispensation to optimise temporary
> share_ptrs be worth considering?

I think that "as if" and 12.2 are all that is needed. shared_ptr
instances are like all other C++ objects, and the rules are the same.

> There remain some important special cases, for example when all the
> code is inline and the compile can see that reset() is not called.
>
> typedef boost::shared_ptr<int> IntPtr;
>
> void func1( IntPtr p );
>
> inline void func2( IntPtr p ) {
>     func1( p );
> }
>
> void func3() {
>     func2( IntPtr( new int ) );
> }
>
> The compiler has to increment and decrement when calling func1() for
> the reason you explained, but that reason does not apply to the call
> of func2(). If IntPtr were typedef'd as (int *), func2() could be a
> zero-cost abstraction. It's a shame to lose that because of
> use_count(), don't you think?

No, I don't think. If the compiler can see that there is no reset(),
the compiler can see that there is no use_count(). Even if there _is_
a use_count() call, the compiler can replace it with the proper
value, if it's known from whole program analysis. The spec says that
use_count() returns the number of shared_ptr instances sharing
ownership, not the value of the reference count. There is no reference
count as far as the spec is concerned. :-)

If the circumstances allow the compiler to undetectably alter the
number of instances because of "as if", it can also alter the value of
use_count() to account for the difference.

By the way, the temporary in your example can always be optimized away
because of 12.2.

> If use_count() must be kept, then the original poster is right and
> the programmer must indicate explicitly when the optimisation is to
> be used. This becomes a specific example of the general "pilfer",
> "destructive copy" or "move" problem. It's probably better to adopt
> a general solution, such as the "&&" reference-to-temporary
> proposal, rather than address it specifically in the smart pointer
> classes.

Move doesn't help shared_ptr much since the move cost is fairly close
to the copy cost.

Richard Smith

unread,
Jul 15, 2003, 10:03:07 PM7/15/03
to
Carl Daniel wrote:

> I was thrown aback by the use of the name "mutex", when in fact it doesn't
> map to an OS mutex at all, but a home-grown spin-lock or CriticalSection,
> depending on some #defines and the platform.

I'm not sure this is particularly relevant. Doing some
measurements using the second test program from my earlier
post, I get the following figures:

0.221 multi-threaded using a pthread_mutex
0.240 multi-threaded using a spinlock
0.077 single-threaded

i.e. that a pthread_mutex is faster than a hand-coded
spinlock. Anyhow, this is drifting off topic. The point is
that using some 'heavy-weight' mechanism to guarantee
atomicity (whether a mutex or a spinlock) imposes a
noticeable overhead when compared to the single-threaded
version.

--
Richard Smith

Peter Dimov

unread,
Jul 15, 2003, 10:03:44 PM7/15/03
to
peteb...@acm.org (Pete Becker) wrote in message news:<3F106B87...@acm.org>...
> Howard Hinnant wrote:
> >
> > This is drifting from the subject of the original post somewhat, but I
> > have been questioning the value of a "MT-safe" shared_ptr. I recently
> > used shared_ptr in an MT environment as part of the implementation of a
> > larger object. In this particular application I found it better to put
> > the synchronization primitive at a higher level than shared_ptr's
> > counts. And once placed at that higher level, an additional mutex (or
> > whatever) for shared_ptr's count was just an expensive redundancy.
> >
>
> In another context, locking inside a stream inserter or extractor
> doesn't help multi-threaded code, because it doesn't guarantee that
> multiple insertions will operate atomically. So the application has to
> add locks around all blocks of io operations to the same stream, and the
> locks in the inserters and extractors are then redundant.
>
> Thread safety is a design issue. Locks (including locks in smart
> pointers and stream operations) are tools. Putting them inside a library
> is usually the wrong thing -- too low level. They protect the library
> from corruption, but that's not really a problem that ought to be
> solved, because the application still won't be correct. A correctly
> written multi-threaded application typically won't need those low level
> locks.

shared_ptr "locks" are different. They are required to allow user code
to manipulate two different shared_ptr instances simultaneously
without corruption, i.e. to achieve the "as thread safe as an int"
level ("basic" thread safety).

Concurrent read+write, write+write access to a single shared_ptr
("strong" thread safety) still requires user space lock.

Peter Dimov

unread,
Jul 15, 2003, 10:04:40 PM7/15/03
to
cpda...@nospam.mvps.org ("Carl Daniel") wrote in message news:<VKqQa.2731$A37....@newssvr16.news.prodigy.com>...

> My re-running of your test confirms about a 7x performance cost.

The totally nonportable

http://www.pdimov.com/cpp/shared_count_x86_exp.hpp

performs even better. Now "all" we need is a formal proof that the
atomic updates work. :-)

Peter Dimov

unread,
Jul 15, 2003, 10:05:00 PM7/15/03
to
gi2n...@mariani.ws (Gianni Mariani) wrote in message news:<beuu4n$c...@dispatch.concentric.net>...
>
> This question should be raised however: The TR1 suggested smart pointers
> do not pass the "zero overhead principle" requirement so why should it
> be adopted as part of the C++ standard ?

"Zero overhead" was a requirement (once) for core language features,
when C++ needed to survive the competition with C. The goal was that a
C program, when compiled with a C++ compiler, would be just as fast.

From a pragmatic point of view, there is no such thing as "zero
overhead" (also known as "free lunch"). Someone will always have to
pay the price (not necessarily in performance, of course). For
example, exceptions are "zero overhead" in theory, but on Windows, the
non-zero overhead implementation has the advantage that exceptions can
safely travel through code compiled with different compilers (and even
languages).

Another example: many C++ types are position independent, and can be
moved around in memory with memmove. Even if you only have position
independent types in your program, you still suffer the overhead when
std::vector uses its copy/destroy loop to move elements around.
Clearly, std::vector violates the zero overhead principle. (In more
than one way, I might add.) One might say that so does std::map; if
you don't need the ordering, you still pay for it.

The difference between a standard library element and a core language
feature is, of course, that you can simply decide to not use the
standard library element if you cannot afford it.

So the real question is not "Is X zero overhead" but "Is X useful, and
if so, can we afford it?" I think that yes, we can afford shared_ptr,
and it is useful.

Howard Hinnant

unread,
Jul 16, 2003, 12:47:39 PM7/16/03
to
In article <bf0ivi$c...@dispatch.concentric.net>, Gianni Mariani
<gi2n...@mariani.ws> wrote:

| Howard Hinnant wrote:
| > In article <memo.20030713...@brangdon.madasafish.com>, Dave
| > Harris <bran...@cix.co.uk> wrote:
| >
| > | Would a special dispensation to optimise temporary
| > | share_ptrs be worth considering?
| >
| > Sounds like another excellent motivation for move semantics! :-)
| >
|
| Please fill in the dots. Are you advocating somthing like the
| AT_LifeView but supported by the compiler ?

No, I'm saying that shared_ptr could benefit from:

http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2002/n1377.htm

with something like (untested):

template <class T>
inline
shared_ptr<T>::shared_ptr(shared_ptr&& r)
    : ptr_(r.ptr_),
      s_(r.s_)
{
    r.ptr_ = 0;
    r.s_ = 0;
}

template <class T>
inline
shared_ptr<T>::shared_ptr(const shared_ptr& r)
    : ptr_(r.ptr_),
      s_(r.s_)
{
    if (s_)
        s_->attach();
}

All attach() has to do is increment the use counts (and it is not a
virtual call). But if the shared_ptr is protecting the use counts with
a thread synchronization device, then constructing from a temporary can
avoid that.

Even if there is no MT code, I believe the move constructor will still
be significantly faster than the copy constructor. Both of course are
very fast, but just roughly counting instructions, it looks to me like
the move constructor might be as much as twice as fast. In a heavily
used utility, you don't want to throw something like that away.

--
Howard Hinnant
Metrowerks

Richard Smith

unread,
Jul 16, 2003, 12:47:39 PM7/16/03
to
Gianni Mariani wrote:

> a) when an application uses smart pointers it will be pervasive and
> hence the performance of the smart pointer implementation is likely
> significant.

I don't think they necessarily will be used pervasively. I
certainly have written applications where several sorts of
smart pointer have been used sparingly through the code, but
no single smart pointer class could be described as deployed
'pervasively'.

> b) having an "optimal" or "near optimal" implementation available for a
> smart reference counted pointer is likely to be used when the
> performance/complexity tradeoff warrants the compromize.

Indeed. However I've still to be convinced that your
framework is more efficient than TR1's shared_ptr perhaps
combined with TR1's weak_ptr, std::auto_ptr and/or
boost::scoped_ptr as appropriate.

> Having said that, I consider that there is a possible alternative
> implementation of AT_LifeLine that presents a very small overhead and in
> some cases may be optimized to zero overhead (when fully inlined) with
> exception safety included.

Can I ask two quick questions about it?

1. Are AddRef() and Release() still virtual functions?
If so, it will be rare for even the best compiler to
inline them.

2. Does AT_LifeLine's destructor Release() the held
pointer if it has not transfered ownership away? If
not, I fail to see how it can be considered exception
safe in any normal sense of the term.

> I would propose this as possibly a 4th
> member of the AT_Life* members leaving AT_LifeLine as a less safe but
> 100% zero overhead solution.

If you want to go down this route, I'd advise (as several
others have already in this thread) a policy based framework
such as Andrei Alexandrescu's Loki::SmartPtr. If you want
to understand how this works, you should read Chapter 7 of
his book "Modern C++ Design"; another good discussion on
policy-based smart pointers can be found in Chapter 20 of
Vandevoorde & Josuttis' book "C++ Templates: The Complete
Guide".

> Now, these could be implemented on top of boost::shared_ptr but it seems
> like it's already trying to do too much.

Depends. Would you like a thread-safe weak_ptr to interact
correctly with your class? If so implementing it on top of
TR1's shared_ptr may well be the best solution. As I think
other parts of this thread have demonstrated, getting an
optimal thread-safe shared_ptr implementation that supports
weak_ptr is not entirely trivial.

> So I've given myself yet more homework. If anyone is interested in the
> alternative AT_Life* smart pointers let me know and maybe I'll work on
> it sooner.

The particular issue of rvalue references [see N1377] might
possibly be added to the language.

> This question should be raised however: The TR1 suggested smart pointers
> do not pass the "zero overhead principle" requirement so why should it
> be adopted as part of the C++ standard ?

In what way do you think TR1's shared_ptr violates this
principle? I can think of two possibilities:

1. The TR1 shared_ptr class requires a virtual function
call (or equivalent) to destroy the pointee. This is
due to the generic deletion function object.

There's no two ways about this: it is an additional
overhead, though not very large. (See one of my other
posts to this thread. I believe I measured a 5%
difference using quite a contrived example.)

In its defence, it allows the TR1 shared_ptr to
interoperate with other reference counted pointer
frameworks, including intrusive ones such as COM. It
also allows it to be used on incomplete types, as in
the "grin_ptr" idiom. (Incidentally, as shared_ptr
can easily be made to support this idiom, would it be
worth including wording in the proposed text to
mandate this?)

2. Because of the need to support weak_ptr, and in
particular the expired() method, shared_ptr needs to
maintain two reference counts.

Again, this is true. However maintaining two
reference counts is not, per se, any harder than
maintaining one. Many current implementations
(including the Boost one) suffer greater than
necessary overheads in multi-threaded builds because
of the way they handle exception safety. I think the
discussion elsewhere in this thread has now
demonstrated that this can be avoided.

If you think there's another way in which TR1's shared_ptr
violates the zero-overhead principle, could you let us know?

--
Richard Smith

Markus Mauhart

unread,
Jul 16, 2003, 12:47:51 PM7/16/03
to
"David Abrahams" <da...@boost-consulting.com> wrote ...

IMO yes, what happens during shared's and weak's dtor ?
In your model ...

shared::~shared()
{
    int const shared = InterlockedDecrement (&ref_counts->total);
    if (shared == 0) delete pointer;
    // problem
}
Now you would also have to check ref_counts->nweak ... cause
the last one has to delete also ref_counts.
But you may not access ref_counts anymore, cause in your model, during
creation of the shared_ptr, you only incremented total, hence after
decrementing it outside of a critical section, you have lost any right
to access refcounts. E.g. immediately after your "--total", inside
another thread, the last weak_ptr could have died, seen total==0 and
nweak==0, and therefore deleted the complete ref_counts object.

But there might be a cheap poor man's solution using ...

struct RC {
    union { UINT16 ui16[2]; UINT32 ui32; };
    UINT16& shared_count_() { return ui16[0]; } // num shared_ptrs
    UINT16& weak_count_()   { return ui16[1]; } // num weak_ptrs + shared_ptrs
};

... and "LONG InterlockedExchangeAdd (LONG* pvalue, LONG difference)",
if 2^16 is enough for you, and if InterlockedExchangeAdd() is portable
enough.


Regards,
Markus.

Gianni Mariani

unread,
Jul 16, 2003, 1:44:41 PM7/16/03
to
Beman Dawes wrote:
> gi2n...@mariani.ws (Gianni Mariani) wrote in message news:<beuu4n$c...@dispatch.concentric.net>...
>
>
>>This question should be raised however: The TR1 suggested smart pointers
>>do not pass the "zero overhead principle" requirement so why should it
>>be adopted as part of the C++ standard ?
>
>
> The C++ committee isn't an authoritarian organization with a rigid set
> of rules or formulas determining what gets accepted and what doesn't.
> Rather it is a set of independent individuals who make up their own
> minds about proposals. Some are primarily users, while others are
> implementers, teachers, researchers, or writers. A wide range of
> viewpoints.
>
> So there is no mechanistic reason why a proposal gets accepted. Rather
> it is because the collective judgment of the committee members is that
> the pros considerably outweigh the cons.

You miss the point of the question. It's not a procedural question, as
I'm familiar with the various standardization processes and have been
involved with them in past lives. This is a question on its merits. I'd
like to hear the ACTUAL pros and cons that the committee is weighing and
which ones it has decided to favour. I'd like to understand more about
how the committee is balancing the decisions and I'd like to see a more
open discussion.

>
> You really should look more closely at Loki's SmartPtr, and
> policy-based design in general, as Andrei suggested in another post.
> It may give you just the ability to tinker with the behavior and
> implementation that you seem to be looking for.

What do you think I'm looking for ?

G

Gianni Mariani

unread,
Jul 16, 2003, 1:51:17 PM7/16/03
to
Peter Dimov wrote:
> gi2n...@mariani.ws (Gianni Mariani) wrote in message news:<beuu4n$c...@dispatch.concentric.net>...
>
>>This question should be raised however: The TR1 suggested smart pointers
>>do not pass the "zero overhead principle" requirement so why should it
>>be adopted as part of the C++ standard ?
>
>
[comments on the state of the zero overhead requirement]

>
> So the real question is not "Is X zero overhead" but "Is X useful, and
> if so, can we afford it?" I think that yes, we can afford shared_ptr,
> and it is useful.

Fair comment.

There are 3 (possibly more) fundamental knobs being tweaked when making
design decisions,

a) solution complexity or programmer learning curve
b) feature set
c) performance or minimal overhead of solution

It seems that loki is an attempt to optimize on b), boost is trying to
optimize on a) while Life* is an attempt at optimizing c). (house
painter's brush being used here, don't be picky about it).

So your assessment of the "can we afford it" question is totally
subjective and in the bigger picture `could' be totally off.

It seems like there is a slightly more complex solution than boost with a
less rich feature set than loki that may provide ultimately a solution
that can replace C in all circumstances.

For example, there is a "belief" that C++ is unsuitable for kernel
development because the "overhead" of C++ is just too much. Without
getting too far into religion, the mindset is one of maximal performance
at all cost.

It seems as though the consensus in this group is "to heck with optimal
performance" what we want is a solution space with less complexity
(lower learning costs) at the expense of performance.

My ultimate concern is that abandoning performance for reducing the
learning curve is probably the wrong trade-off for C++. If I wanted a
reduced learning curve solution I'd pick a totally different language,
java, perl, python, pick your favourite lower learning curve language
here. C++ has been the language that allows a programmer to have the
full flexibility of C with a whole bunch of ways to improve type safety
and actually improve on the performance of applications written in C.

Or said in another way: whom will boost smart pointers please, and
are these the appropriate target community for C++ ? So who in the
standards committee is thinking at this level, and is there a consensus?

ka...@gabi-soft.fr

unread,
Jul 16, 2003, 1:51:19 PM7/16/03
to
pdi...@mmltd.net (Peter Dimov) wrote in message
news:<7dc3b1ea.03071...@posting.google.com>...

> The difference between a standard library element and a core language
> feature is, of course, that you can simply decide to not use the
> standard library element if you cannot afford it.

That's a tenuous difference at best. There's nothing that requires you
to use templates, or exceptions (except maybe common sense). And
there's certainly nothing to require you to use double or float -- most
of my programs don't. And I actually once ran into a C programmer who
didn't even know that there were operators like '&' or '|'.

I think about the only thing that you absolutely must use is a function
definition -- a C++ program must include a definition of the function
main.

> So the real question is not "Is X zero overhead" but "Is X useful, and
> if so, can we afford it?" I think that yes, we can afford shared_ptr,
> and it is useful.

The zero overhead has always been a bit of a misleader, I think. But
the goal has always been zero overhead if you don't use it -- not zero
overhead, absolutely. I presume that if I don't use smart pointers, my
program will not run slower or use more memory than it would otherwise.
Zero overhead if I don't use it. The fact that some more or less
small subset of the feature may have a cheaper solution is not
forbidden.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

---

Dave Harris

unread,
Jul 16, 2003, 4:11:07 PM7/16/03
to
hin...@metrowerks.com (Howard Hinnant) wrote (abridged):

> Even if there is no MT code, I believe the move constructor will still
> be significantly faster than the copy constructor. Both of course are
> very fast, but just roughly counting instructions, it looks to me like
> the move constructor might be as much as twice as fast.

There may be further gains from the destructor. Once all the code is
inlined, the compiler may be able to tell from the assignments in the move
constructor that the destructor will do nothing. And then it may be able
to remove the 0 assignments in the move constructor, too. (Obviously this
depends on details of the implementation.)

So the total cost of a move may be just 2 assignments.

-- Dave Harris, Nottingham, UK

---

Pete Becker

unread,
Jul 16, 2003, 5:27:46 PM7/16/03
to
ka...@gabi-soft.fr wrote:
>
> > Thread safety is a design issue. Locks (including locks in smart
> > pointers and stream operations) are tools. Putting them inside a
> > library is usually the wrong thing -- too low level. They protect the
> > library from corruption, but that's not really a problem that ought to
> > be solved, because the application still won't be correct. A correctly
> > written multi-threaded application typically won't need those low
> > level locks.
>
> It depends.

That's why I said "usually the wrong thing," and "typically won't need
those low level locks." I probably shouldn't have included smart
pointers here, since they seem to be distracting everyone from the point
I was making. Putting locks in smart pointers doesn't make user code
thread safe.

--

"To delight in war is a merit in the soldier,
a dangerous quality in the captain, and a
positive crime in the statesman."
George Santayana

"Bring them on."
George W. Bush

Richard Smith

unread,
Jul 16, 2003, 5:27:51 PM7/16/03
to
Gianni Mariani wrote:

> Howard Hinnant wrote:
> > Sounds like another excellent motivation for move semantics! :-)
>
> Please fill in the dots. Are you advocating somthing like the
> AT_LifeView but supported by the compiler ?

It is described in paper N1377:

http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2002/n1377.htm

--
Richard Smith

Richard Smith

unread,
Jul 16, 2003, 5:28:23 PM7/16/03
to
Gianni Mariani wrote:

> I have yet to come across the need for an externally reference counted
> facility.

A point that no-one has yet mentioned is that supporting
thread-safe weak pointers with an intrusive reference
counted pointer is very difficult. (By intrusive, I mean
that the shared count, the weak count and any syncronisation
objects such as mutexes or spinlocks are held by value in
the class. This means that the shared pointer never
dynamically allocates any memory.) Without weak pointers,
it is hard to get around the cyclic references that are
often encountered in real-world applications.

I think I have finally come up with a solution, but it
relies on some pretty nasty trickery. I think it's legal,
but I'd like a second opinion. The potentially dubious bit
can be collapsed into the following code:

#include <cassert>
#include <new>

struct foo {
    virtual ~foo() {} // virtual, so dynamic_cast<void*> below is legal
    foo() {}          // Does not initialise x
    int x;
};

class bar : public foo {};

int main() {
    foo* f = new bar;
    void* p = dynamic_cast<void*>(f);

    f->x = 42;
    f->~foo();
    assert( ++f->x == 43 ); // I think this is legal ...
    new(f) foo;
    assert( --f->x == 42 ); // ... but is this?

    f->foo::~foo();
    operator delete(p);
}

Now, by my reading of the Standard, foo's destructor
implicitly calls this->x.~int() [12.4/6]. 5.2.4/1 says that
for a non-class type, such as int, the only effect of this
is to evaluate this->x. This presumably means that the
first assert line is guaranteed to succeed. (I think
this is consistent with 12.4/14.)

The placement new operator implicitly called by the new
expression does not alter the memory pointed to by f
[18.4.1.3/3]. Next, the default constructor of foo is
called. Since it does not explicitly initialise x,
12.6.2/4, bullet 2 applies, which says "the entity is
not initialised". It seems to me that this means the memory
is unmodified -- but is this actually true? If so, the
second assert is guaranteed to succeed; otherwise it
invokes undefined behaviour.

Peter Dimov
Jul 16, 2003, 6:14:23 PM

ric...@ex-parrot.com (Richard Smith) wrote in message news:<Pine.LNX.4.55.03...@sphinx.mythic-beasts.com>...
> [...] It

> also allows it to be used on incomplete types, as in
> the "grin_ptr" idiom. (Incidentally, as shared_ptr
> can easily be made to support this idiom, would it be
> worth including wording in the proposed text to
> mandate this?)

This is an inadvertent omission, the intent has always been for
17.4.3.6/2 last bullet to not apply to shared_ptr, weak_ptr and
enable_shared_from_this. We need to remember to add this to the TR1
issues list once it's established.

Dave Harris
Jul 16, 2003, 9:44:09 PM

Discussing:

typedef boost::shared_ptr<int> IntPtr;

void func1( IntPtr p );

inline void func2( IntPtr p ) {
    func1( p );
}

void func3() {
    IntPtr p( new int );
    func2( p );
}

pdi...@mmltd.net (Peter Dimov) wrote (abridged):

> > The compiler has to increment and decrement when calling func1() for
> > the reason you explained, but that reason does not apply to the call
> > of func2(). If IntPtr were typedef'd as (int *), func2() could be a
> > zero-cost abstraction. It's a shame to lose that because of
> > use_count(), don't you think?
>
> No, I don't think. If the compiler can see that there is no reset(),
> the compiler can see that there is no use_count().

There is no reset() or use_count() in func2(), but there might be either
in func1(). Inside func1() the count will be > 1, so reset() will not
cause the int to be deleted, so the only way to detect the count (that I
can see) is to use use_count() (or unique()).


> Even if there _is_ an use_count() call, the compiler can replace it
> with the proper value, if it's known from whole program analysis.

I'd rather a system which could be optimised without whole program
analysis. Especially as there could be other calls to func1() with
different use counts.


> If the circumstances allow the compiler to undetectably alter the
> number of instances because of "as if", it can also alter the value of
> use_count() to account for the difference.

Interesting point, but I don't think it can do that here. The compiler has
to behave as if func2() were not inlined.


> By the way, the temporary in your example can always be optimized away
> because of 12.2.

Thanks - that was a mistake. Corrected in the above code.


> Move doesn't help shared_ptr much since the move cost is fairly close
> to the copy cost.

These costs are being debated elsewhere. The move cost could be 2
assignments, with no need for any kind of threading protection, and no
need to access the pointed-to memory.

But I agree it could be an interesting case study for advocates of move
semantics :-)

-- Dave Harris, Nottingham, UK

---

Andy Sawyer
Jul 16, 2003, 9:45:59 PM

In article <bf28l1$c...@dispatch.concentric.net>,
on Wed, 16 Jul 2003 17:44:41 +0000 (UTC),
gi2n...@mariani.ws (Gianni Mariani) wrote:

> You miss the point of the question. It's not a procedural question as
> I'm familiar with the various standardization processes and have been
> involved with them in past lives. This is a question on its merits.
> I'd like to hear the ACTUAL pros and cons that the committee is weighing
> and which ones it has decided to favour. I'd like to understand more
> about how the committee is balancing the decisions and I'd like to see
> a more open discussion.

The discussion IS open - it's not held in smoke-filled rooms behind
locked doors and armed guards :). Much of the discussion happens online
via email between committee meetings. If, as you say, you're familiar
with standardization processes, then you know you have the opportunity
to participate in the discussion. Join your NSB's (National Standards
Body) C++ standards group. Depending on your NSB, this may or may not
cost you money¹ - but that's outside the control of the C++ committee.
If your NSB has no such group (or has unacceptably high membership
fees), then perhaps another one would allow you membership (this,
obviously, will depend on the NSB in question, but there *are* people
who have participated in the WG21 discussion as representatives of
countries other than that in which they live). Join the committee. Go to
the meetings. It's as open as you want it to be.

If you're expecting to see detailed minutes of every single point
raised during committee discussion, then you're probably out of luck
unless you attend and make such minutes yourself, for a couple of
reasons. Firstly, a WG21 meeting usually lasts a week or so, and much of
the time is split into several subgroups - that's an awful lot of meeting
time to minute. Secondly, most of the people attending the meetings are
far too busy participating in the discussion to make such detailed
minutes, and generally only key points are recorded (this, I find, is
common whenever technically capable people are allowed to hold meetings
without the overhead of having some PHB present :)

What you *can't* necessarily expect is for *your* viewpoint to be
expressed in a meeting - unless you actually attend and express it
yourself. That doesn't mean it *won't* be expressed, especially if you
can persuade someone who does attend a meeting to express it for you. If
you participate in the process, then committee members are likely to be
aware of it and take it into consideration when forming their own
opinions.

In the current context (smart pointers), there are several committee
members who have expended considerable amounts of time and energy in
this area. Some (but not all) of them post in (and, presumably, read)
this newsgroup. They also participate in other forums (notably Boost).
Since they _have_ expended that time and effort, you may have a hard
time persuading them that your approach has benefits over other
techniques - but you won't know until you try.

Regards,
Andy S.
¹ In my case, membership doesn't cost me anything. Your situation may
be different.
--
"Light thinks it travels faster than anything but it is wrong. No matter
how fast light travels it finds the darkness has always got there first,
and is waiting for it." -- Terry Pratchett, Reaper Man

Emil Dotchevski
Jul 16, 2003, 10:42:31 PM

> There are 3 (possibly more) fundamental knobs being tweaked when making
> design decisions,
>
> a) solution complexity or programmer learning curve
> b) feature set
> c) performance or minimal overhead of solution
>
> It seems that loki is an attempt to optimize on b), boost is trying to
> optimize on a) while Life* is an attempt at optimizing c).

This classification is misleading. Loki::SmartPtr does *not* provide a
superset of the boost::shared_ptr functionality: you can't define a
set of policies for Loki::SmartPtr that can make it act "just like" a
boost::shared_ptr.

> It seems as though the consensus in this group is "to heck with optimal
> performance" what we want is a solution space with less complexity
> (lower learning costs) at the expense of performance.

Don't confuse complexity with usability: boost::shared_ptr is very
very usable while being very very simple to use. This, together with
its robust design and years of real world use is what makes
boost::shared_ptr the only candidate for standardization. You can't
standardize something that has not been proven useful and bug-free,
and this comes with years of real life usage.

Before you say boost::shared_ptr is slow or whatever, you should take
the time to study it -- if for no other reason simply because it is
now standardized.

One of the most useful features of boost::shared_ptr is the fact that
it can safely delete objects of incomplete classes, as long as their
complete definition was present at the time when the shared_ptr was
initialized. This can be combined with the use of protected,
non-virtual base destructors to further improve safety (by making it
impossible to use operator delete to dispose of a managed object)
while improving performance (you pay for a single indirection only,
just as in a normal use of a virtual destructor).

The ability to define boost::shared_ptr to an object of incomplete
class is also an important feature of the raw C/C++ pointers. This
makes it possible to use boost::shared_ptr to hide implementation
details (PIMPL) without compromising type safety. Boost::shared_ptr
aside, the only other option you have in this case is to use a raw
pointer.

Another use of boost::shared_ptr that only raw pointers can match is
that you can completely erase the type information from it -- except
that, unlike a raw pointer, it remains perfectly safe:

//Header
boost::shared_ptr<void> create_foo();

//CPP
class Foo
{
    ...
};

boost::shared_ptr<void> create_foo()
{
    return boost::shared_ptr<void>(new Foo());
}

//Somewhere else completely
{
    boost::shared_ptr<void> p( create_foo() );
    ...
} //Calls ~Foo, without even having a declaration of class Foo.

You can also define a custom deleter when you initialize a
boost::shared_ptr and this does *not* involve additional template
arguments: all shared_ptr<T> objects are of the same class for the
same T, regardless of whether custom deleter was specified or not.
This is yet another example of a useful feature that actually improves
the performance by reducing the number of automatic temporaries. For
comparison, using a policy-based design could sometimes require a
temporary object when passing a smart pointer to a function, even if
the function takes its argument by const reference.

> My ultimate concern is that abandoning performance for reducing the
> learning curve is probably the wrong trade-off for C++.

Can you be more specific? What makes you think boost::shared_ptr is
slow? Any concrete performance data that shows another solution to be
any faster?

> Or said in another way: who will the boost smart pointers please, and
> are these the appropriate target community for C++? So who in the
> standards committee is thinking at this level, and is there a consensus?

Obviously, the answer to the last question is 'yes'. Therefore, the
answer to the first question is also 'yes', as it is the committee
that decides what is appropriate for C++ and what isn't. Everything
else is simply a personal opinion, often combined with lack of
experience and/or understanding of C++.

--Emil

Emil Dotchevski
Jul 17, 2003, 12:38:08 PM

> {
> boost::shared_ptr<int> p( new int );
> func( p );
> }

The presence of the use_count() *function*, I think, does not in any
way interfere with optimizations of temporaries. Correct me if I am
wrong, but it seems to me use_count() does not imply the use of an
external counter, so there is no requirement for the
implementation to increment anything when you make a copy of a
shared_ptr.

Ben Hutchings
Jul 17, 2003, 12:38:22 PM

In article <uadbfd...@boost-consulting.com>, David Abrahams wrote:
> ric...@ex-parrot.com (Richard Smith) writes:
<snip>

>> I think the shared_ptr<T>::shared_ptr( const weak_ptr<T>& )
>> constructor still causes problems. Semantically, it needs
>> to do the following:
>>
>> if ( weak_count == difference ) throw bad_weak_ptr;
>> ++weak_count;
>>
>> Imagine at the start of this block of code, there is one
>> weak_ptr and one shared_ptr. What happens if between these
>> two lines, another thread destroys the final shared_ptr,
>> decrementing weak_count to be equal to difference?
<snip>

[Rename weak_count to total and difference to nweak]

> Pure shared_ptr operations can happen at will, so total will be
> "volatile" during the execution of this function, but it will never be
> less than nweak. Because no shared_ptrs can be created once the last
> one has disappeared, however, a special case holds that once there are
> no shared_ptrs, total becomes non-"volatile" when the mutex is held.
>
> If we *start* with ++total, we can prevent any concurrent shared_ptr
> destructions from deleting the pointee (if any). Then, assuming the
> increment also allows us to read total, we can check whether total ==
> nweak + 1 and if so, we know that there are no shared_ptrs left. We
> can then decrement total and throw. Otherwise we can proceed.
>
> Any problem here?

Yes - the pointee will be leaked if this constructor races with the
destructor.

It's not important to keep a counter of weak pointers. The transitions
we care about are:

1. Number of strong pointers drops to zero.
2. Number of strong or weak pointers drops to zero.

Alexander Terekhov presented a solution which reliably detects these
transitions without the need to update two counters. He used a count
of strong pointers and a count of weak pointers + (strong count != 0).
I shall attempt to translate this to fit into the Boost implementation,
using the following platform-dependent functions:

// Atomically read var
long atomic_read(const volatile long & var);
// Atomically set var = new_val iff var == old_val. Return success
// indication.
bool atomic_update(volatile long & var, long old_val, long new_val);

Here's my attempt:

class sp_counted_base
{
public:

    sp_counted_base(): use_count_(1), self_count_(1)
    {
    }

    virtual ~sp_counted_base() // nothrow
    {
    }

    // dispose() is called when use_count_ drops to zero, to release
    // the resources managed by *this.

    virtual void dispose() = 0; // nothrow

    // destruct() is called when self_count_ drops to zero.

    virtual void destruct() // nothrow
    {
        delete this;
    }

    virtual void * get_deleter(std::type_info const & ti) = 0;

    void add_ref()
    {
        long old_use_count;
        do
        {
            old_use_count = atomic_read(use_count_);
            if (old_use_count == 0)
                throw bad_weak_ptr();
        }
        while (!atomic_update(use_count_, old_use_count,
                              old_use_count + 1));
    }

    void release() // nothrow
    {
        long old_use_count;
        do
            old_use_count = atomic_read(use_count_);
        while (!atomic_update(use_count_, old_use_count,
                              old_use_count - 1));

        if (old_use_count == 1)
        {
            dispose();
            weak_release();
        }
    }

    void weak_add_ref() // nothrow
    {
        long old_self_count;
        do
            old_self_count = atomic_read(self_count_);
        while (!atomic_update(self_count_, old_self_count,
                              old_self_count + 1));
    }

    void weak_release() // nothrow
    {
        long old_self_count;
        do
            old_self_count = atomic_read(self_count_);
        while (!atomic_update(self_count_, old_self_count,
                              old_self_count - 1));

        if (old_self_count == 1)
            destruct();
    }

    long use_count() const // nothrow
    {
        return atomic_read(use_count_);
    }

private:

    sp_counted_base(sp_counted_base const &);
    sp_counted_base & operator= (sp_counted_base const &);

    // inv: self_count_ != 0

    long use_count_;
    long self_count_;
};

Peter Dimov
Jul 17, 2003, 8:49:03 PM

do-not-s...@bwsint.com (Ben Hutchings) wrote in message news:<slrnbhbnpc.154....@tin.bwsint.com>...

>
> Alexander Terekhov presented a solution which reliably detects these
> transitions without the need to update two counters. He used a count
> of strong pointers and a count of weak pointers + (strong count != 0).
> I shall attempt to translate this to fit into the Boost implementation,

[...]

> sp_counted_base(): use_count_(1), self_count_(1)
> {
> }

I see it now; thank you very much for extracting the solution from one
of Alexander's links in a form that I can easily understand! :-)

This approach is about 20-25% faster than the "two count atomic" case,
and about 20-25% slower than the single threaded case.

Gianni Mariani
Jul 17, 2003, 10:36:30 PM

Emil Dotchevski wrote:
>>There are 3 (possibly more) fundamental knobs being teaked when making
>>design decisions,
>>
>>a) solution complexity or programmer learning curve
>>b) feature set
>>c) performance or minimal overhead of solution
>>
>>It seems that loki is an attempt to optimize on b), boost is trying to
>>optimize on a) while Life* is an attempt at optimizing c).
>
>
> This classification is misleading. Loki::SmartPtr does *not* provide a
> superset of the boost::shared_ptr functionality: you can't define a
> set of policies for Loki::SmartPtr that can make it act "just like" a
> boost::shared_ptr.
>

How does your statement relate to design goals ?

>
>>It seems as though the consensus in this group is "to heck with optimal
>>performance" what we want is a solution space with less complexity
>>(lower learning costs) at the expense of performance.
>
>
> Don't confuse complexity with usability: boost::shared_ptr is very
> very usable while being very very simple to use. This, together with
> its robust design and years of real world use is what makes
> boost::shared_ptr the only candidate for standardization. You can't
> standardize something that has not been proven useful and bug-free,
> and this comes with years of real life usage.

Again, what has this to do with design trade-offs ?

>
> Before you say boost::shared_ptr is slow or whatever, you should take
> the time to study it -- if for no other reason simply because it is
> now standardized.
>

I have. It makes some things that I normally do slower, especially
when dealing with COM or COM-like pointers. The technique I have
described, which you evidently failed to mention here, eliminates the
need to perform reference counting in most uses of the smart
pointer (using AT_LifeView; check it out).

> One of the most useful features of boost::shared_ptr is the fact that
> it can safely delete objects of incomplete classes, as long as their
> complete definition was present at the time when the shared_ptr was
> initialized. This can be combined with the use of protected,
> non-virtual base destructors to further improve safety (by making it
> impossible to use operator delete to dispose of a managed object)
> while improving performance (you pay for a single indirection only,
> just as in a normal use of a virtual destructor).

That's nice. Why exactly is this relevant ?

>
> The ability to define boost::shared_ptr to an object of incomplete
> class is also an important feature of the raw C/C++ pointers. This
> makes it possible to use boost::shared_ptr to hide implementation
> details (PIMPL) without compromising type safety. Boost::shared_ptr
> aside, the only other option you have in this case is to use a raw
> pointer.
>
> Another use of boost::shared_ptr that only raw pointers can match is
> that you can completely erase the type information from it, except
> that it is perfectly safe:
>
> //Header
> boost::shared_ptr<void> create_foo();
>
> //CPP
> class Foo
> {
> ...
> };
> boost::shared_ptr<void> create_foo()
> {
> return boost::shared_ptr<void>(new Foo());
> }
>
> //Somewhere else completely
> {
> boost::shared_ptr<void> p( create_foo() );
> ...
> } //Calls ~Foo, without even having a declaration of class Foo.


Your assertion being? You can do this with AT_Life* pointers and I
suspect you can do this with loki pointers as well. I fail to see what
you're talking about and how this relates to design goals.

>
> You can also define a custom deleter when you initialize a
> boost::shared_ptr and this does *not* involve additional template
> arguments: all shared_ptr<T> objects are of the same class for the
> same T, regardless of whether custom deleter was specified or not.
> This is yet another example of a useful feature that actualy improves
> the performance by reducing the number of automatic temporaries. For
> comparison, using a policy-based design could sometimes require a
> temporary object when passing a smart pointer to a function, even if
> the function takes its argument by const reference.

Again, that's nice but why is this relevant to my previous post ?

>
>
>>My ultimate concern is that abandoning performance for reducing the
>>learning curve is probably the wrong trade-off for C++.
>
>
> Can you be more specific? What makes you think boost::shared_ptr is
> slow? Any concrete performance data that shows another solution to be
> any faster?
>

Slow ? How did you come to this conclusion ? Still, this does not
relate to design goals.

>
>>Or said in another way: who will the boost smart pointers please, and
>>are these the appropriate target community for C++? So who in the
>>standards committee is thinking at this level, and is there a consensus?
>
>
> Obviously, the answer to the last question is 'yes'. Therefore, the
> answer to the first question is also 'yes', as it is the committee
> that decides what is appropriate for C++ and what isn't. Everything
> else is simply a personal opinion, often combined with lack of
> experience and/or understanding of C++.
>

If the answer is so obvious then I suspect you'll be more than willing
to explain it since you've inadvertently missed placing it in the prior
paragraph.

If you cannot describe the design goals of boost::shared_ptr in 200 words
or less, you may not understand the meaning of my previous post.

I truly don't know where the design priorities lie for boost. I honestly
believe that boost::smart_ptr may actually be the wrong answer for some
designs, but unless someone can describe exactly what smart_ptr's design
goals are, the discussion will be rather frustrating.

It seems like many differences in opinion in this thread arise from a
difference in experience. For example, you'll see in my smart pointer
design (AT_Life*) there is no weak pointer. A thing that is missing
here is a concept I call "twin" which performs much of what a weak
pointer does but is exceedingly different and needs no support from
smart_ptr. std::list frustrates me because I would like an element of a
list to be able to manage itself without any smart pointers. These are
examples of where I would choose a design which is subjectively
"complex" but less code is being generated and hence subjectively
"simpler". I personally find the kinds of designs that std::list is
used for optimal in both expression and performance. These are personal
differences. The prior post HAS NOTHING TO do with this. This
discussion comes after we have some stated goals.

Your last sentence is half right. Design can be somewhat based on
personal opinion but the underlying foundations are based on fact. I'm
trying to establish a statement of fact which is the answer to the
question, "what are the design goals of smart_ptr?", in particular what
is the priority of design decisions and the idioms it is intended to
solve most efficiently. Again, these should be small, one-page kinds of
descriptions.

I can make a cut at it for AT_Life* smart pointers:

AT_Life* smart pointers are intended to provide a framework for managing
reference counted objects in an "intrusive" manner as optimally as
possible. Design decisions have favoured elimination of operations (see
AT_LifeView) through providing a mechanism where reference counted
pointer policies are described in interfaces and appropriate operations
deduced when creating temporary objects. The temporary objects are
usually no more expensive than operations on a raw pointer. This
"policy" paradigm has a small number of "holes" and it is a design
choice to leave the holes in favour of maintaining an extremely
lightweight implementation.

Can you describe the design trade-off formula for boost::smart_ptr?

/G

Peter Dimov
Jul 17, 2003, 10:37:30 PM

bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20030716...@brangdon.madasafish.com>...

> Discussing:
>
> typedef boost::shared_ptr<int> IntPtr;
>
> void func1( IntPtr p );
>
> inline void func2( IntPtr p ) {
> func1( p );
> }
>
> void func3() {
> IntPtr p( new int ) );
> func2( p );
> }
>
> pdi...@mmltd.net (Peter Dimov) wrote (abridged):

[...]

> > By the way, the temporary in your example can always be optimized away
> > because of 12.2.
>
> Thanks - that was a mistake. Corrected in the above code.

I agree that in your corrected example, a compiler that has an
intrinsic shared_ptr type and understands that one instance is a
special case (from "observable behavior" point of view) can completely
optimize out func2::p, since without use_count() the difference
between two and three instances is not detectable.

With use_count(), assuming a typical reference counted implementation,
the compiler must increment __use_count by 2 and not one. There will
be no performance difference. The storage for func2::p can still be
optimized out since nobody can see it, and I think that this can be
performed by an "ordinary" compiler with no extra knowledge of
shared_ptr (the copy constructor and destructor must be visible, of
course).

But I'm sure you understand that this is an academic example. :-) A
hypothetical compiler that has intrinsic std:: types (we are still
waiting for it) will probably start with std::vector - much, much
better ROI.

All real world implementations will likely have use_count() since they
need to be testable; if we don't include it in the specification it
will probably still be there but will be named __use_count(). A public
use_count() query is useful when testing or debugging code.

> > Move doesn't help shared_ptr much since the move cost is fairly close
> > to the copy cost.
>
> These costs are being debated elsewhere. The move cost could be 2
> assignments, with no need for any kind of threading protection, and no
> need to access the pointed-to memory.

True, moving a shared_ptr is - of course - faster than copying one,
but there are much better motivating examples. O(1) with a smaller
constant doesn't even come close to O(1) without an allocation instead
of O(N) with allocation.

There is one additional subtlety in the threading case BTW, copying a
shared_ptr is a read access while moving one is a write access, so
when moving an lvalue one needs to ensure that other threads can't see
it.

Dave Harris
Jul 17, 2003, 10:38:58 PM

da...@boost-consulting.com (David Abrahams) wrote (abridged):
> >> That's an interesting point. Of course, COM's apartment model is
> >> quite problematic in many cases, and having to decide
> >> single/multi-threadedness per-type instead of per-object can be a
> >> problem too, can't it?
> >
> > With a policy-based design, we could decide on a per-pointer basis,
> > perhaps with type-wide or program-wide defaults.
>
> You don't need policies for that. Policies decide per-type.

When we say "per-object", I take it to mean the pointed-to object, not the
pointer itself. And similarly, "per-type" means the type of the pointed-to
object. A policy-based pointer design would decide per the type of the
pointer, which is different. We could have two pointers with different
policies pointing to the same object.

Ptr<Object,singlethreaded> p1( new Object );
Ptr<Object,multithreaded> p2 = p1;


> To decide per-pointer, you need a flag and runtime checks.

OK, but that wasn't what I meant.


> > And this is an example of a policy which /ought/ to be part of the
> > pointer's type, so that passing a multi-threaded pointer to
> > single-threaded code becomes a compile-time error.
>
> Why should it be one? Most single-threaded code is perfectly
> threadsafe.

Not if it is running concurrently with other single-threaded code that
shares access to the same object. It would be reasonable to have an
explicit conversion (eg for when a higher level lock keeps the other
threads out), but not an implicit conversion. An assignment such as the
above ought to be an error.

("Single-threaded code" here means code using single-threaded idioms,
including single-threaded pointers. The compiler won't know whether there
are actually other active threads when such code is run.)


> It's usually only when you start doing "threading things" that you
> get in trouble.

If you are not doing threading things, you don't need multi-threaded
pointers and the conversion issue doesn't arise.

-- Dave Harris, Nottingham, UK

---

Alexander Terekhov
Jul 18, 2003, 4:29:29 PM

Ben Hutchings wrote:
[...]

> I shall attempt to translate this to fit into the Boost implementation,
> using the following platform-dependent functions: ...

Nah.

class sp_counted_base {
    /* ... */
    refcount<std::size_t, basic> use_count_;
    refcount<std::size_t, basic> self_count_;
    /* ... */
public:
    /* ... */
    sp_counted_base() : use_count_(1), self_count_(1) { }

    std::size_t use_count() const throw() {
        return use_count_.get();
    }

    void add_ref() throw() {
        use_count_.increment();
    }

    bool lock() throw() {
        return use_count_.increment_if_not_min();
    }

    void weak_add_ref() throw() {
        self_count_.increment();
    }

    void weak_release() throw() {
        if (!self_count_.decrement(msync::acq))
            destruct();
    }

    void release() throw() {
        if (!use_count_.decrement()) {
            dispose();
            if (!self_count_.decrement(msync::rel))
                destruct();
        }
    }
    /* ... */
};

http://www.terekhov.de/pthread_refcount_t/experimental/refcount.cpp

Oder?

regards,
alexander.

Alexander Terekhov
Jul 18, 2003, 6:34:30 PM

Peter Dimov wrote:
[...]

> This approach is about 20-25% faster than the "two count atomic" case,
> and about 20-25% slower than the single threaded case.

You might also want to try something along the lines of: < this
version supersedes "sp_counted_base" stuff that I've posted in my
previous "Nah....Oder" message in this thread here >

class sp_counted_base {
    /* ... */

    typedef refcount<std::size_t, basic> count;
    count use_count_, self_count_;

    /* ... */
public:
    /* ... */
    sp_counted_base() : use_count_(1), self_count_(1) { }

    std::size_t use_count() const throw() {
        return use_count_.get();
    }

    void add_ref() throw() {
        use_count_.increment();
    }

    bool lock() throw() {
        return use_count_.increment_if_not_min();
    }

    void weak_add_ref() throw() {
        self_count_.increment();
    }

    void weak_release() throw() {
        if (!self_count_.decrement(msync::acq, count::may_not_store_min))
            destruct();
    }

    void release() throw() {
        if (!use_count_.decrement()) {
            dispose();

            if (!self_count_.decrement(msync::rel, count::may_not_store_min))
                destruct();
        }
    }
    /* ... */
};

http://www.terekhov.de/pthread_refcount_t/experimental/refcount.cpp
(updated recently)

Oder?

regards,
alexander.

Emil Dotchevski
Jul 18, 2003, 9:36:31 PM

> How does your statement relate to design goals ?
> Again, what has this to do with design trade-offs ?
> That's nice. Why exactly is this relevant ?
> Again, that's nice but why is this relevant to my previous post ?

I demonstrated that:

A) boost::shared_ptr has features that no other smart pointers have,
and

B) Many of these features improve performance in real life situations.

I believe this is in contrast with your statement that the design goal
of boost::shared_ptr was ease of use and that because of this its
performance and feature set suffer.

> I truly don't know what the design priorities lie for boost. I honestly
> believe that boost:smart_ptr may actually be the wrong answer for some
> designs but unless someone can describe exactly what smart_ptr's design
> goals are the discussion will be rather frustrating.

The design goals are like a wish list. Design goals matter in
the design stage; boost::shared_ptr is way past that stage. You have
it, you can use it, test it, abuse it, anything you want. It either
does what you want, or it doesn't.

I am not attacking the qualities of boost::shared_ptr, you are; your
posts seem to indicate that its standardization was a mistake and that
you have a better solution. For your argument to be taken seriously,
you *must*:

1) Provide hard performance data that clearly shows that
boost::shared_ptr is slower than your own solution in real world
applications;

2) Show that your own solution's problems with exception safety can be
solved without significantly reducing its performance;

3) Explain why the so-called loop holes in your design are
insignificant in real world applications;

4) Demonstrate that your solution is universal and its claimed
advantages are not limited to its use with COM objects.

Other people have also expressed an opinion that boost::shared_ptr is
sub-standard and should not have been accepted in C++. I am yet to see
anyone provide a superior alternative (backed up with hard data).

Even if a superior design does exist, AFAIK boost::shared_ptr was the
only smart pointer design presented to the committee for formal
standardization. After all, the committee can only work with formal
proposals. It would be impractical to turn down a good solution just
because in the bright future someone may come up with a better one.

--Emil

Markus Mauhart

unread,
Jul 19, 2003, 2:20:53 AM7/19/03
to
"Alexander Terekhov" <tere...@web.de> wrote ...
>
> Nah.

also Nah.

> class sp_counted_base {
> /* ... */
> refcount<std::size_t, basic> use_count_;
> refcount<std::size_t, basic> self_count_;
> /* ... */
> public:
> /* ... */
> sp_counted_base() : use_count_(1), self_count_(1) { }
>

> void release() throw() {


> if (!use_count_.decrement()) {
> dispose();
> if (!self_count_.decrement(msync::rel))
> destruct();
> }
> }

> void add_ref() throw() {
> use_count_.increment();
> }

.... missing something w.r.t. symmetry ?-)))


> /* ... */
> };


Regards,
Markus.

Gianni Mariani

unread,
Jul 19, 2003, 2:21:24 AM7/19/03
to
Emil Dotchevski wrote:
>>How does your statement relate to design goals ?
>>Again, what has this to do with design trade-offs ?
>>That's nice. Why exactly is this relevant ?
>>Again, that's nice but why is this relevant to my previous post ?
>
>
> I demonstrated that:
>
> A) boost::shared_ptr has features that no other smart pointers have,
> and
>
> B) Many of these features improve performance in real life situations.
>
> I believe this is in contrast with your statement that the design goal
> of boost::shared_ptr was ease of use and that because of this its
> performance and feature set suffers.
>
>
>>I truly don't know where the design priorities lie for boost. I honestly
>>believe that boost::smart_ptr may actually be the wrong answer for some
>>designs, but unless someone can describe exactly what smart_ptr's design
>>goals are, the discussion will be rather frustrating.
>
>
> The design goals are like a wish list. Design goals matter in the
> design stage, and boost::shared_ptr is way past that stage. You have
> it; you can use it, test it, abuse it, anything you want. It either
> does what you want, or it doesn't.
>
> I am not attacking the qualities of boost::shared_ptr, you are; your
> posts seem to indicate that its standardization was a mistake and that
> you have a better solution. For your argument to be taken seriously,
> you *must*:

Please indicate where you think I was attacking.

I think it's a mistake to do something without really understanding what
it is smart_ptr really solves. As it stands at this time, I can see
areas where it would reduce the efficiency of code that I have written, and I
already use reference counting smart pointers. For the projects I have
been involved with, it's a mistake to use shared_ptr.

As I have already stated, I am interested in having a clear and open
discussion. I need to know what its design goals are.
>
> 1) Provide hard performance data that clearly shows that
> boost::shared_ptr is slower than your own solution in real world
> applications;

Doing nothing is certainly faster.

>
> 2) Show that your own solution's problems with exception safety can be
> solved without significantly reducing its performance;

that's if I care about solving those things.

>
> 3) Explain why the so-called loop holes in your design are
> insignificant in real world applications;

"loop holes" ? There are no loop hole problems.

>
> 4) Demonstrate that your solution is universal and its claimed
> advantages are not limited to its use with COM objects.

I never claimed it to be universal.

>
> Other people have also expressed an opinion that boost::shared_ptr is
> sub-standard and should not have been accepted in C++. I am yet to see
> anyone provide a superior alternative (backed up with hard data).

That's just it: I don't think there IS anything that merits being put
into the standard unless the goals are clearly defined. So far, I don't
see how anyone can say that shared_ptr merits inclusion when the scope
of what it is trying to solve is not well understood, nor how it may or
may not meet the needs of the users who will most likely need it.

>
> Even if a superior design does exist, AFAIK boost::shared_ptr was the
> only smart pointer design presented to the committee for formal
> standardization. After all, the committee can only work with formal
> proposals. It would be impractical to turn down a good solution just
> because in the bright future someone may come up with a better one.

Just because it's the only kid on the block does not mean it should win
the prize.

Having said all that, I'm by no means unwilling to change my mind and if
anything one of the possible outcomes of this discussion may be a shift
in my anchor points.

James Dennett

unread,
Jul 19, 2003, 8:25:30 PM7/19/03
to
Gianni Mariani wrote:

> Emil Dotchevski wrote:
>
>> 2) Show that your own solution's problems with exception safety can be
>> solved without significantly reducing its performance;
>
>
> that's if I care about solving those things.
>

You may not, for your own projects, but a component that is
unable to work "well" in the presence of exceptions is, in
my opinion, very unlikely to gain support from the committee
for inclusion in Standard C++.

-- James.

Gianni Mariani

unread,
Jul 20, 2003, 5:23:42 PM7/20/03
to
James Dennett wrote:
> Gianni Mariani wrote:
>
>> Emil Dotchevski wrote:
>>
>>> 2) Show that your own solution's problems with exception safety can be
>>> solved without significantly reducing its performance;
>>
>>
>>
>> that's if I care about solving those things.
>>
>
> You may not, for your own projects, but a component that is
> unable to work "well" in the presence of exceptions is, in
> my opinion, very unlikely to gain support from the committee
> for inclusion in Standard C++.
>

I didn't say it won't work well with exceptions. I said that I may not
care to have an overhead for having the smart pointers to solve them for
me. VERY different.

G

James Dennett

unread,
Jul 20, 2003, 6:03:37 PM7/20/03
to
Gianni Mariani wrote:
> James Dennett wrote:
>
>> Gianni Mariani wrote:
>>
>>> Emil Dotchevski wrote:
>>>
>>>> 2) Show that your own solution's problems with exception safety can be
>>>> solved without significantly reducing its performance;
>>>
>>>
>>>
>>>
>>> that's if I care about solving those things.
>>>
>>
>> You may not, for your own projects, but a component that is
>> unable to work "well" in the presence of exceptions is, in
>> my opinion, very unlikely to gain support from the committee
>> for inclusion in Standard C++.
>>
>
> I didn't say it won't work well with exceptions. I said that I may not
> care to have an overhead for having the smart pointers to solve them for
> me. VERY different.
>
> G

Emil wrote that your solution has "problems with exception
safety". One of the goals of the boost shared pointer, it
seems to me, was to help users to write exception safe code
in the presence of dynamic allocation. While the terminology
may be subjective, hence my use of quotation marks, I was just
pointing out that in one sense a component that has "problems
with exception safety" does NOT work well in the presence of
exceptions.

I've not taken the time to look in detail at your design, and
I do not level any criticism against it. I was merely stating
that if the situation is as Emil writes, and that your solution
has problems with exception safety, that is likely to make it
unacceptable to the committee.

If you'd care to explain here why these concerns are unfounded
I'm sure many committee members will read what you have to say.

Regards,

James.

Markus Mauhart

unread,
Jul 21, 2003, 2:30:59 PM7/21/03
to
""Markus Mauhart"" <markus....@nospamm.chello.at> wrote ...

> "Alexander Terekhov" <tere...@web.de> wrote ...
> >
> > void release() throw() {
> > if (!use_count_.decrement()) {
> > dispose();
> > if (!self_count_.decrement(msync::rel))
> > destruct();
> > }
> > }
>
> > void add_ref() throw() {
> > use_count_.increment();
> > }
>
> .... missing something w.r.t. symmetry ?-)))

forget this one Alexander, I missed your (un-symmetric;-) braces
late in the evening, hence I parsed ...

> > if (!use_count_.decrement()) {
> > dispose();
> > if (!self_count_.decrement(msync::rel))
> > destruct();
> > }

.... as if it was ...

if (!use_count_.decrement())
dispose();
if (!self_count_.decrement(msync::rel))
destruct();

Regards,

Peter Dimov

unread,
Jul 21, 2003, 2:31:16 PM7/21/03
to
tere...@web.de (Alexander Terekhov) wrote in message news:<3F17EF6F...@web.de>...

> Peter Dimov wrote:
> [...]
> > This approach is about 20-25% faster than the "two count atomic" case,
> > and about 20-25% slower than the single threaded case.
>
> You might also want to try something along the lines of: < this
> version supersedes "sp_counted_base" stuff that I've posted in my
> previous "Nah....Oder" message in this thread here >

[...]

For the record, the numbers above are for

http://www.pdimov.com/cpp/shared_count_x86_exp2.hpp

that uses atomic increment and decrement operations where possible
instead of cmpxchg.

Peter Dimov

unread,
Jul 21, 2003, 2:31:25 PM7/21/03
to
gi2n...@mariani.ws (Gianni Mariani) wrote in message news:<bfa904$b...@dispatch.concentric.net>...

>
> I think it's a mistake to do something without really understanding what
> it is smart_ptr really solves.

shared_ptr, you mean.

Let's start from the basics. Why do we need a standard shared
ownership smart pointer?

* It is CopyConstructible and Assignable; the standard library likes
such entities. The only standard smart pointer at the time is,
unfortunately, neither.

* The standard library is value-based; an easy way to get safe
reference semantics is to use a shared ownership smart pointer.

* Nearly every project reinvents this wheel. The wheel is hard to
reinvent correctly, and it would be good to have a standard
alternative so that transfer of ownership is possible between
libraries from different authors.

* A standard pointer benefits the C++ community, you only need to
learn one tool unless you have specific needs that are not satisfied
by it.

* A standard pointer that has been proven correct, useful, and good
enough for a variety of situations simplifies design decisions and the
process of writing new code. You simply use it and save a lot of time
in the common case where it's good enough. Some programmers will
always have specific needs that aren't covered well, so they are no
better (and no worse) than before, but the majority will benefit.

There are two candidates for such a pointer, one uses intrusive
counting, the other has an external count. Both have their pros and
cons. We have decided that the non-intrusive variant is a better fit
for 'the' smart pointer because:

* A non-intrusive pointer can be used out of the box with any type,
including built-in types, 'immutable' third party types, and current
standard library types. An intrusive pointer would raise the
inevitable question "should we redesign standard library classes so
that they have a compatible embedded count?" and requires, on average,
more work from the user, often leading to virtual inheritance
scenarios.

* A separate count makes it possible for the pointer to operate
correctly when the type of the pointee is unknown. Incomplete types
and void can be supported, supporting various implementation hiding
techniques that improve the design by removing unnecessary
dependencies. Intrusive counting typically requires that at least some
of the type be visible. boost::intrusive_ptr is, to the best of my
knowledge, the only intrusive pointer that can support incomplete
types with help from the user, but it can't do void.

* A separate count leads to a more natural handling of cv-qualified
pointee types.

* A separate count leads to a more natural support of the weak pointer
concept. A weak pointer (a reference that doesn't own) is often needed
to break cycles and is required for some caching idioms. It can also
be used as an observer that requires no support from the observed type
in order to detect the end of its lifetime (the typical solution is to
make the pointee keep a list of the observers and notify them on
destruction). This, again, leads to a design with fewer dependencies.

* shared_ptr can easily wrap legacy interfaces of the form:

class X;
X* create();
void destroy(X*);

* shared_ptr can adapt to various counting/ownership strategies. If
you need to share ownership of an object with a library via:

void f(shared_ptr<X> const & px);

you can do so even if you use another smart pointer in your code to
manage X'es.

* shared_ptr is much more complex and hard to implement than
intrusive_ptr. If there are two competitive solutions with similar
merits, one much harder to implement than the other, and we need to
choose one, it is better to put the hard to implement solution in the
standard library, since this leads to less work for users, on average.

> As it stands at this time, I can see
> areas that it would reduce efficiency of code that I have written and I
> already use reference counting smart pointers. To the projects I have
> been involved with it's a mistake to use shared_ptr.

I can only suggest that you revisit your assessment later when you are
more familiar with shared_ptr. There do exist projects where the
performance of the smart pointer counts, but you really should measure
the actual impact and see for yourself. Improving the performance of
code that takes 1% of the total execution time is rarely productive.

Eliminating extra copies is a general problem in C++, and many have
reinvented essentially the same (limited) technique for bypassing the
copy constructor. shared_ptr can benefit, too. We (a different we this
time) believe that a language level solution is needed, and have
submitted a proposal to that effect. But it is also true that
reference counted smart pointers benefit the least from move
semantics. Compared with types that do heap allocations and O(N) copy
loops such as std::vector or std::list, counted pointers are, in most
cases, lost in the noise.

> > 2) Show that your own solution's problems with exception safety can be
> > solved without significantly reducing its performance;
>
> that's if I care about solving those things.

Exception safety is important for standard library components.

Emil Dotchevski

unread,
Jul 21, 2003, 8:23:52 PM7/21/03
to
> I've not taken the time to look in detail at your design, and
> I do not level any criticism against it.

Just so I make myself clear, I do *not* criticise any particular
implementation of an alternative to the now standard shared_ptr. I do
criticise the criticism of shared_ptr, particularly because:

- It has been in real world use for years;

- It has won the acceptance of the standardization committee;

- Often people who criticise it have never used it;

- My own experience with it has led to improved design and a
reduction of dependencies in my code (and I am not aware of another
smart pointer that would have had the same effect.)

One may say that my position is too convenient for me; that is, I
don't provide any proof of shared_ptr's superiority while I have
been asking for proof of other designs' qualities. But this position
is only natural once a given component has been standardized. My
attitude towards someone who claims to have invented a better std::map
would be exactly the same: prove it is better, in a measurable way,
and in a broad application domain, and we may have a discussion.

Plus, the question of superiority is irrelevant. Of course there is a
solution that will perform better in a given specific environment,
that's not the point. It is even possible that shared_ptr's design
could be improved without limiting its application domain. This still
does not mean that its standardization was a mistake, which is why my
reaction towards anyone claiming that has been this harsh.

And yes, exception safety is a must.

--Emil

Alexander Terekhov

unread,
Jul 21, 2003, 10:01:06 PM7/21/03
to

Peter Dimov wrote:
>
> tere...@web.de (Alexander Terekhov) wrote in message news:<3F17EF6F...@web.de>...
> > Peter Dimov wrote:
> > [...]
> > > This approach is about 20-25% faster than the "two count atomic" case,
> > > and about 20-25% slower than the single threaded case.
> >
> > You might also want to try something along the lines of: < this
> > version supersedes "sp_counted_base" stuff that I've posted in my
> > previous "Nah....Oder" message in this thread here >
>
> [...]
>
> For the record, the numbers above are for
>
> http://www.pdimov.com/cpp/shared_count_x86_exp2.hpp
>
> that uses atomic increment and decrement operations

with or without the "lock" prefix? The "lock" prefix is needed on
SMP and/or HT systems (IA32, that is). I'm just curious... ;-) ;-)

regards,
alexander.

--
Unix(R) and UnixWare(R) are registered trademarks of The Open Group
in the United States and other countries. OTOH, "OPENSERVER" is the
ABANDONED (dead) trademark in the United States. In other news, all
SCO operating systems (if any/and etc.) are dead in other countries
and the United States. Please visit <http://www.sco.de>.

Richard Smith

unread,
Jul 22, 2003, 2:46:44 PM7/22/03
to
Gianni Mariani wrote:

> I didn't say it won't work well with exceptions. I said that I may not
> care to have an overhead for having the smart pointers to solve them for
> me. VERY different.

To clarify then, can you state whether you intend your smart
pointer classes to be exception safe? If you would like a
particular example, can you tell me what the following code
will print?

#include <iostream>
#include <stdexcept>
// AT_LifeControl and AT_LifeLine are from the AT framework posted
// earlier in this thread.

class helper : public AT_LifeControl {
public:
helper() : rc(1)
{ std::cout << "Constructed" << std::endl; }
virtual ~helper()
{ std::cout << "Destructed" << std::endl; }

private:
virtual int AddRef()
{ return ++rc; }
virtual int Release()
{ if (--rc) return rc; delete this; return 0; }

int rc;
};

int main()
{
try {
AT_LifeLine<helper*> h( new helper );
throw std::runtime_error("Exception");
} catch ( const std::exception& e ) {
std::cout << e.what() << std::endl;
}
}

If the code prints anything other than

Constructed
Destructed
Exception

the code is not exception safe. The code you posted earlier
does not do this and hence is not exception safe. There are
no two ways about this: your code (as posted) leaks
resources and is not exception safe.

As other people have pointed out, there is virtually zero
chance that any sort of smart pointer class that is not
exception safe will be accepted in the Standard.

--
Richard Smith

Richard Smith

unread,
Jul 22, 2003, 2:46:49 PM7/22/03
to
Alexander Terekhov wrote:

> Peter Dimov wrote:
> >
> > For the record, the numbers above are for
> >
> > http://www.pdimov.com/cpp/shared_count_x86_exp2.hpp
> >
> > that uses atomic increment and decrement operations
>
> with or without the "lock" prefix? The "lock" prefix is needed on
> SMP and/or HT systems (IA32, that is). I'm just curious... ;-) ;-)

The code in the file linked above uses the Windows
Interlocked* functions, which on my Windows box (Win2k,
dual PIII [SMP]) appear to use the lock prefix.

--
Richard Smith

Peter Dimov

unread,
Jul 22, 2003, 2:47:10 PM7/22/03
to
tere...@web.de (Alexander Terekhov) wrote in message news:<3F1C364D...@web.de>...

> Peter Dimov wrote:
> >
> > tere...@web.de (Alexander Terekhov) wrote in message news:<3F17EF6F...@web.de>...
> > > Peter Dimov wrote:
> > > [...]
> > > > This approach is about 20-25% faster than the "two count atomic" case,
> > > > and about 20-25% slower than the single threaded case.
> > >
> > > You might also want to try something along the lines of: < this
> > > version supersedes "sp_counted_base" stuff that I've posted in my
> > > previous "Nah....Oder" message in this thread here >
> >
> > [...]
> >
> > For the record, the numbers above are for
> >
> > http://www.pdimov.com/cpp/shared_count_x86_exp2.hpp
> >
> > that uses atomic increment and decrement operations
>
> with or without the "lock" prefix? The "lock" prefix is needed on
> SMP and/or HT systems (IA32, that is). I'm just curious... ;-) ;-)

With lock. _InterlockedIncrement/Decrement translate to "lock xadd".
_InterlockedCompareExchange translates to "lock cmpxchg". This is on
AMD Athlon; the numbers for a Pentium 4 may be different.

Gianni Mariani

unread,
Jul 22, 2003, 6:57:47 PM7/22/03
to
Peter Dimov wrote:
> gi2n...@mariani.ws (Gianni Mariani) wrote in message news:<bfa904$b...@dispatch.concentric.net>...
>
>>I think it's a mistake to do somthing without really understanding what
>>it is smart_ptr really solves.
>
>
> shared_ptr, you mean.

:-) - yes ... mistyped.

Agreed, but there are 100% solutions that have an overhead and 95%
solutions that have no overhead.

Most of (if not all, depending on how you measure) the statements above
also apply to the AT_Life* pointers.

Thanks Peter, this is very close to what I am looking for. You touched
on some design trade-offs which I'm very interested in understanding
further, and I could not agree more with the "Eliminating extra copies"
paragraph.

However, one of the biggest complaints I face when advocating C++ is
that applications written in C++ have poor performance. Quite frankly,
if a developer does not care about performance, they will likely pick
another language (Java, C#, Perl, Python, ... VBScript) because
theoretically these languages provide (arguably, I know) runtime checks
that prevent the user from doing nasty things that a C or C++ compiler
will let you do without asking permission.

I figure that this would indicate that the target application for C++ is
one where performance is critical. (Because even I, a hard-core C++ guy,
would build a bunch of stuff with Perl before embarking on C++ unless
performance was THE issue.)

Whether C++ libraries provide the fastest and most maintainable
solution (this is where we get subjective; your weights for "fastest"
and "maintainable" are very likely different from everyone else's) is
the true question. I tend to err on virtually no compromise on
"fastest" except for abstraction (pure abstract interfaces). In this
way I can argue that C++ is truly faster than C and is appropriate for
all cases where you would use C. An interesting example is C and
reference counted "objects". Very few instances of complex structures
in C use reference counting because it can be error prone; however, in
C++ with smart pointers, many of the errors can be eliminated with
virtually no overhead, which can lead to easier-to-write, faster,
leaner code - good. If boost::shared_ptr can't live up to this, I fail
to see its utility. shared_ptr concerns me the most because, in my
experience, when you have a good reference counted smart pointer
implementation it becomes prolific very fast. In applications I have
written over the last 2 years, nearly every call has a reference counted
smart pointer in it; hence my sensitivity to this issue.

The alternative view is that you don't need all libraries to run so
close to the iron. There are more levels of performance than just fast
and slower, and not all libraries need to be absolutely optimal. This
is likely the position I will end up taking. The AT library that I am
working on will have its own fast smart pointer implementation (and yes,
it will be 100% exception safe ...) and its goal is to provide an
alternative to the STL where the STL is slow. It will support mapped
files and other fast mechanisms, and it will be limited in the platforms
it supports.

However, before I fall into that position, I'm very curious as to who
the target audience is. This is where I need to ask the hard questions.
For those that have used shared_ptr: Is your application performance
critical? Are there appropriate alternatives to C++ for your
application? Why did you choose C++ in the first place?

<Bold statement>
Traditional text based systems for describing a program are no longer
suitable. C++ is at a level of complexity that very few people will be
able to master adequately in their lifetime as a programmer/developer
to be useful. Unfortunately, adding more complexity (like smart_ptr,
but not necessarily smart_ptr) only makes the learning curve longer and
hence makes the language less useful.

Many things are done in C++ to accommodate C, and they were done in C to
map closely to the iron so that a compiler could be efficiently built.

Conclusion: C++ is ill suited to expressing programmer intention and to
having the compiler create an optimal program from that intent. For
example, object lifetime management should be something that the
"language" can determine for you. (Conclude here that even burdening a
programmer with lifetime management is an issue of undue complexity;
granted, RAII is an interesting paradigm, but it mixes too many concepts
together, making it harder for a programmer to appreciate.)

Templates and classes are tools that allow a programmer to create a
description of intent; however, they have proven too complex for too many.

A different approach is needed. (And I have a slew of ideas here that
will bore many to tears so I'll hold off ... for now).
</Bold statement>

If you see the world from the position I have stated (regardless of
whether you think I am right or wrong), a conclusion you can come to is
that C++ would best meet the needs of its future customers if it were to
provide the leanest possible constructs (since it is already ill suited
for anyone but the top few percentile of the programming profession).

Here I have stated my position in somewhat absolute terms; however, the
reality is always a compromise, so don't take what I've said here too
literally.

The core of my concern is that where I see C++ being adopted and where
the standardization is heading are 2 very different places. This can
only lead to more frustration in adding more complexity to the standard;
hence, being VERY picky about what is standardized at this point is
better than choosing something a majority will end up regretting.

Richard Smith

unread,
Jul 22, 2003, 6:57:49 PM7/22/03
to
Gianni Mariani wrote:

> I think it's a mistake to do something without really understanding what
> it is smart_ptr really solves. As it stands at this time, I can see
> areas where it would reduce the efficiency of code that I have written, and I
> already use reference counting smart pointers. For the projects I have
> been involved with, it's a mistake to use shared_ptr.

I'm getting a bit fed up with these unsubstantiated claims
that AT_Life* is faster than boost::shared_ptr, and so I've
written some tests to see how they really compare. First,
some comments on how I've tested them.

I've compiled everything using g++ 3.2.1, with "-O2"
optimisation (this is a fairly standard optimisation level
for release builds). I've used version 1.30 of boost and
the default STL provided with g++. The AT_Life* code is as
provided in your post at the beginning of this thread.

I've built the boost shared_ptr with BOOST_DISABLE_THREADS
defined; similarly, I've implemented the AddRef and Release
methods in your hierarchy in a non-atomic manner. This
should be a fair comparison.

I used the following correspondence between AT_Life*
pointers and standard ones:

AT_LifeLine<T*> std::auto_ptr<T>
AT_LifeTime<T*> boost::shared_ptr<T>
AT_LifeView<T*> T const&

The test programs can be found at

http://www.ex-parrot.com/~richard/scratch/at_test.cc
http://www.ex-parrot.com/~richard/scratch/boost_test.cc

The timings I got demonstrated that the standard framework
is significantly faster than yours. (The standard framework
benchmarked at 1.86s, yours at 2.68s.)

I suggest that you do one of the following:

1) provide alternative evidence that demostrates your
smart pointer framework to be faster when used as
intended; or

2) accept that the standard smart pointer framework is
faster than yours.

--
Richard Smith

Dave Harris

unread,
Jul 23, 2003, 10:11:35 PM7/23/03
to
pdi...@mmltd.net (Peter Dimov) wrote (abridged):
> With use_count(), assuming a typical reference counted implementation,
> the compiler must increment __use_count by 2 and not one.

That was the trick I was missing, thanks.

-- Dave Harris, Nottingham, UK

---

Gianni Mariani

unread,
Jul 23, 2003, 10:12:33 PM7/23/03
to


Richard, thanks for taking the time to put together this benchmark.

Unfortunately, you omit a number of critical points in your discussion.

Firstly, you chose to use virtual methods for performing the reference
counting in the AT example, and secondly, your use of auto_ptr is clearly
a specific case and not as general as AT_LifeLine.

Also, your test is not indicative of how a real-life system would work.
Memory utilization as well as instruction count are huge factors.

I don't have time right now to toy with this, but if you need a few more
hints as to how you can create a true apples and apples comparison, drop
me a line.

G

Richard Smith

unread,
Jul 24, 2003, 9:27:31 PM7/24/03
to
Gianni Mariani wrote:

> Unfortunately, you omit a number of critical points in your discussion.
>
> Firstly, you chose to use virtual methods for performing the reference
> counting in the AT example

Perhaps I need to requote your original code: [slightly
abridged]

| /**
| * AT_LifeControl is an abstract base class for any class whose lifetime
| * is managed by means of reference counting.
| */
|
| class AT_LifeControl
| {
| public:
| virtual ~AT_LifeControl(); // Virtual destructor
| virtual int AddRef() = 0;
| virtual int Release() = 0;
| };

"AT_LifeControl is an abstract base class for any class
whose lifetime is managed by means of reference counting."
In my benchmark, tester::data is managed by means of
reference counting, so, as per your comment, I made it a
sub class of AT_LifeControl.


> and secondly, your use of auto_ptr is clearly a
> specific case and not as general as AT_LifeLine.

It solved the particular problem required by the test
example, and is a common requirement in many applications using
dynamically allocated memory.

> Also, your test is not indicative of how a real-life system would work.
> Memory utilization as well as instruction count are huge factors.

Do you mean (i) memory utilisation affects speed, or (ii)
the amount of memory required by a smart pointer can be
important for its own sake? In response to (i), I would
say, yes, it will, but a priori would expect the two smart
pointer frameworks to scale similarly in this respect. If
you mean (ii), then your framework does save a small amount
of space per reference counted object. This is at the
expense of not supporting (a) deletion of incomplete types,
(b) correct casting of smart pointers to objects without
virtual destructors, and (c) weak pointers. I don't have a
problem with this extra memory overhead -- it still only
requires one additional allocation per object.

> I don't have time right now to toy with this, but if you need a few more
> hints as to how you can create a true apples and apples comparison, drop
> me a line.

I don't want a "few more hints" on how to write a test
suite. What I want is for you to either accept that your
smart pointer framework is less efficient than Boost's, or
to provide hard evidence that I'm wrong and yours is more
efficient. Almost every post in this thread has asked you
for such evidence, and you have failed to provide any.

As many others have said, the Standard has now accepted the
Boost smart_ptr class into TR1. If you really feel that a
mistake has been made in accepting this, you will have to
work hard to demonstrate this and persuade others round to
your way of thinking. If your arguments centre around
efficiency, this will require you to write a benchmark
demonstrating it. If you're not prepared to do this, you
might as well give up.

--
Richard Smith
