Deterministic destruction computationally expensive?

Anand Hariharan

Nov 15, 2004, 6:37:59 AM
I recently attended a talk given by a .NET evangelist. Surprisingly,
the speaker was quite sincere and explained why Garbage collection is
no panacea (citing examples such as database connections and file
handles), taking his presentation through muddy waters of Dispose,
Close, etc.

At one point he showed how C# chose to overload the "using" keyword in
a completely unrelated context viz., to specify that the variables
within a block defined by "using" should be destroyed as soon as they
leave the scope.

At that point I asked why C# could not have simply incorporated those
semantics as a part of the language rather than requiring the
programmer explicitly request it at specific places. His response:
"Deterministic de-construction is generally expensive, especially if
one has several small objects sporadically strewn all over."

Is there a merit (statistical/empirical) to his assertion? I thought
C++ went to great lengths for RAII to be possible, eschewing runtime
guzzlers (such as mark-and-sweep) largely on performance grounds.

thank you for listening,
- Anand Hariharan

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Francis Glassborow

Nov 15, 2004, 4:40:19 PM
In article <22868227.04111...@posting.google.com>, Anand
Hariharan <mailto.anan...@gmail.com> writes

>Is there a merit (statistical/empirical) to his assertion? I thought
>C++ went to great lengths for RAII to be possible, eschewing runtime
>guzzlers (such as mark-and-sweep) largely on performance grounds.

I think people often forget that C++ uses a single threaded abstract
machine where RAII works happily. C#, among other languages, uses a
multi-threaded abstract machine where GC has some extra advantages.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Ivan Vecerina

Nov 15, 2004, 4:38:42 PM
"Anand Hariharan" <mailto.anan...@gmail.com> wrote in message
news:22868227.04111...@posting.google.com...

> I recently attended a talk given by a .NET evangelist. Surprisingly,
> the speaker was quite sincere and explained why Garbage collection is
> no panacea (citing examples such as database connections and file
> handles), taking his presentation through muddy waters of Dispose,
> Close, etc.
Not such a big surprise, since compared to Java, C# has somewhat
better integrated support for explicit destruction.

> At one point he showed how C# chose to overload the "using" keyword in
> a completely unrelated context viz., to specify that the variables
> within a block defined by "using" should be destroyed as soon as they
> leave the scope.
>
> At that point I asked why C# could not have simply incorporated those
> semantics as a part of the language rather than requiring the
> programmer explicitly request it at specific places. His response:
> "Deterministic de-construction is generally expensive, especially if
> one has several small objects sporadically sprewn all over."
>
> Is there a merit (statistical/empirical) to his assertion? I thought
> C++ went great lengths for RAII to be possible, eschewing runtime
> guzzlers (such as mark-and-sweep) largely on performance grounds.

The assertion definitely does not apply to stack-allocated C++ objects.
However, when class instances (or their data members) are allocated on
the heap, as is most often the case in C# and Java, explicit memory
deallocation can be more expensive than waiting for the next GC cycle.

It all really depends on the memory allocation algorithm that is
being used. In some approaches, freeing memory can be as simple
as setting a flag in the header of the allocated block; in others
it may require a complex look-up and table-update process.
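
To make the cheap case concrete, here is a minimal sketch; the
block-header layout and the function name are hypothetical, not any
particular allocator's:

    #include <cstddef>

    // Hypothetical header preceding each allocated block.
    struct BlockHeader {
        std::size_t size;  // size of the user data that follows
        bool free;         // deallocation amounts to flipping this flag
    };

    void free_block(void* user_ptr) {
        // Step back over the header that sits just before the user data.
        BlockHeader* h = reinterpret_cast<BlockHeader*>(
            static_cast<char*>(user_ptr) - sizeof(BlockHeader));
        h->free = true;    // O(1): no free-list walk, no table update
    }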

hth,
Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form

M Jared Finder

Nov 16, 2004, 6:31:54 AM
Anand Hariharan wrote:
> I recently attended a talk given by a .NET evangelist. Surprisingly,
> the speaker was quite sincere and explained why Garbage collection is
> no panacea (citing examples such as database connections and file
> handles), taking his presentation through muddy waters of Dispose,
> Close, etc.
>
> At one point he showed how C# chose to overload the "using" keyword in
> a completely unrelated context viz., to specify that the variables
> within a block defined by "using" should be destroyed as soon as they
> leave the scope.
>
> At that point I asked why C# could not have simply incorporated those
> semantics as a part of the language rather than requiring the
> programmer explicitly request it at specific places. His response:
> "Deterministic de-construction is generally expensive, especially if
> one has several small objects sporadically sprewn all over."
>
> Is there a merit (statistical/empirical) to his assertion? I thought
> C++ went great lengths for RAII to be possible, eschewing runtime
> guzzlers (such as mark-and-sweep) largely on performance grounds.

This just seems crazy. I can't see how C#'s using, Java's try-finally,
or C++'s automatic destruction would generate code that is different in
any way. I can see there being problems with an old ABI that requires
each function to register itself as having a cleanup step, but standards
can't prevent all stupid implementations. In addition, using garbage
collection will remove much of the work done in destructors, since most
of the resources used in a program tend to be memory.

I'd be interested in what Herb Sutter had to say about this, considering
that one of the big advantages of C++/CLI over C# is the automatic
calling of destructors.

-- MJF

James Dennett

Nov 16, 2004, 6:39:10 AM
Francis Glassborow wrote:
> In article <22868227.04111...@posting.google.com>, Anand
> Hariharan <mailto.anan...@gmail.com> writes
>
>>Is there a merit (statistical/empirical) to his assertion? I thought
>>C++ went to great lengths for RAII to be possible, eschewing runtime
>>guzzlers (such as mark-and-sweep) largely on performance grounds.
>
>
> I think people often forget that C++ uses a single threaded abstract
> machine where RAII works happily. C#, among other languages, uses a
> multi-threaded abstract machine where GC has some extra advantages.

For the most part, RAII also works well in multi-threaded situations.
Most things aren't shared between threads, in well-designed apps.

-- James

David Abrahams

Nov 16, 2004, 9:10:15 PM
James Dennett <jden...@acm.org> writes:

> Francis Glassborow wrote:
> > In article <22868227.04111...@posting.google.com>, Anand
> > Hariharan <mailto.anan...@gmail.com> writes
> >
> >>Is there a merit (statistical/empirical) to his assertion? I thought
> >>C++ went to great lengths for RAII to be possible, eschewing runtime
> >>guzzlers (such as mark-and-sweep) largely on performance grounds.
> >
> >
> > I think people often forget that C++ uses a single threaded abstract
> > machine where RAII works happily. C#, among other languages, uses a
> > multi-threaded abstract machine where GC has some extra advantages.
>
> For the most part, RAII also works well in multi-threaded situations.
> Most things aren't shared between threads, in well-designed apps.

Also, I think with a lock-free memory allocator many of the
performance advantages of GC disappear even for the multithreaded
case.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

ka...@gabi-soft.fr

Nov 16, 2004, 9:09:02 PM
mailto.anan...@gmail.com (Anand Hariharan) wrote in message
news:<22868227.04111...@posting.google.com>...

> I recently attended a talk given by a .NET evangelist. Surprisingly,
> the speaker was quite sincere and explained why Garbage collection is
> no panacea (citing examples such as database connections and file
> handles), taking his presentation through muddy waters of Dispose,
> Close, etc.

> At one point he showed how C# chose to overload the "using" keyword in
> a completely unrelated context viz., to specify that the variables
> within a block defined by "using" should be destroyed as soon as they
> leave the scope.

> At that point I asked why C# could not have simply incorporated those
> semantics as a part of the language rather than requiring the
> programmer explicitly request it at specific places. His response:
> "Deterministic de-construction is generally expensive, especially if
> one has several small objects sporadically sprewn all over."

> Is there a merit (statistical/empirical) to his assertion? I thought
> C++ went great lengths for RAII to be possible, eschewing runtime
> guzzlers (such as mark-and-sweep) largely on performance grounds.

There's certainly nothing wrong with his assertion; it's true as far as
it goes. I'm not familiar with C#, but I seem to recall reading that it
doesn't have auto variables (at least with object types); this is what
makes RAII work so well in C++. (Note that if you allocate an object
with new in C++, you also have to do something explicit to get its
destructor called when you leave the scope.) As for using, I'm not sure
why it's better than finally -- it's still the user who states what has
to be destroyed, and not the author of the class.

With regards to the performance issues, I believe that this was the case
when C++ was first being defined. Today, of course, garbage collection
typically outperforms manual memory management. And while neither does as
well on performance grounds as no memory management, i.e. variables
allocated directly on the stack, the different semantics they provide
are IMHO an even stronger argument for auto and global variables.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Francis Glassborow

Nov 16, 2004, 9:13:46 PM
In article <_Semd.145189$hj.10661@fed1read07>, James Dennett
<jden...@acm.org> writes

> > I think people often forget that C++ uses a single threaded abstract
> > machine where RAII works happily. C#, among other languages, uses a
> > multi-threaded abstract machine where GC has some extra advantages.
>
>For the most part, RAII also works well in multi-threaded situations.
>Most things aren't shared between threads, in well-designed apps.

Indeed, but the rub is in the first phrase. 'For the most part' allows
for times when it is not so. It is this kind of problem that is implicit
in using a single threaded abstract machine.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Alf P. Steinbach

Nov 17, 2004, 6:11:17 AM
* ka...@gabi-soft.fr:

>
> With regards to the performance issues, I believe that this was the case
> when C++ was first being defined. Today, of course, garbage collection
> typically outperforms manual memory management. And while neither does as
> well on performance grounds as no memory management, i.e. variables
> allocated directly on the stack, the different semantics they provide
> are IMHO an even stronger argument for auto and global variables.

"Of course ... outperforms",
"An even stronger argument for global variables",
are you trolling?

Anyway, I fail to believe most of the assertions bandied about in this
thread (not just those given by you in the paragraph above).

And the OP was asking about the cost of _deterministic_ destruction, which
is not at all in conflict with automatic garbage collection. For example,
a guarantee that an object has its destructor invoked at the point where
the last reference to the object is removed, if that happens, and otherwise
as with asynchronous garbage collection; for an example of what that would
solve, it would guarantee that external objects such as COM objects were
released in a timely manner. What is the connection to threading there
(no remaining references -> no threading issues, as I see it)? Why would it
or could it be inefficient? What does actual experience say?

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Dave Harris

Nov 17, 2004, 6:13:48 AM
mailto.anan...@gmail.com (Anand Hariharan) wrote (abridged):

> At that point I asked why C# could not have simply incorporated those
> semantics as a part of the language rather than requiring the
> programmer explicitly request it at specific places. His response:
> "Deterministic de-construction is generally expensive, especially if
> one has several small objects sporadically sprewn all over."
>
> Is there a merit (statistical/empirical) to his assertion? I thought
> C++ went great lengths for RAII to be possible, eschewing runtime
> guzzlers (such as mark-and-sweep) largely on performance grounds.

If you've already got garbage collection, the incremental cost of using it
in a specific case is probably lower than using deterministic destruction
in that case. Note that your quote doesn't claim that RAII is slower than
try/finally (or whatever the C# syntax is). In other words, it doesn't
answer your question directly. He's saying all forms of deterministic
destruction are relatively expensive. He may be arguing that having RAII
in the language would lead to it being used unnecessarily, and so make
some programs unnecessarily slow.

For his general point, compare:

void with_gc() {
    SomeObject *ptr = new SomeObject;
    some_proc();
}

with:

void with_raii() {
    std::auto_ptr<SomeObject> ptr( new SomeObject );
    some_proc();
}

Clearly the latter routine will have some extra code for the destructor of
the auto_ptr, which will make it slower. Also, in the first routine the
call to some_proc() is a tail-call, so can be optimised into a jump, but
the destructor prevents that optimisation in with_raii().

A good GC implementation will not have any offsetting inefficiencies. The
allocation cost will be the same, and deallocation is basically free - GC
costs are mainly proportional to the number of live objects. So it's a net
win for GC (given that it is present and running and the object is on the
heap anyway).

-- Dave Harris, Nottingham, UK

Francis Glassborow

Nov 17, 2004, 7:05:55 PM
In article <419ab671....@news.individual.net>, Alf P. Steinbach
<al...@start.no> writes

>What is the connection to threading there
>(no remaining references -> no threading issues, as I see it)? Why would it
>or could it be inefficient? What does actual experience say?

The use of GC assumes that there is no way to access an object other
than through references. That means that there are far fewer
programming issues in languages that do not provide other ways to access
an object. Without some measure of language support GC is expensive in
so far as writing correct code is concerned.

The problem with RAII is that dtors normally need to apply some form of
lock to update the count mechanism and that lock is applied even when
the object is only ever referenced by a single thread. Any form of lock
mechanism in a system that has more than one effective processor can
proof very expensive. Currently C++ has no concept that code may
actually be executed in parallel.

Note that it is far from trivial to design a correct multi-threading
model (Java originally got it wrong).

With care, multi-threading built on top of a single-threaded abstract
machine can be made to work as long as the real machine actually
executes as a single thread, as soon as the real machine departs from
being a realisation of the abstract machine we are in trouble; either we
have to force the real machine to behave as if it were single threaded
or we have to rely on language extensions that are probably not
portable. The really nasty thing about the latter route is that those
extensions may be purely semantic without any syntactic visibility; i.e.
the actual behaviour of our code is implementation dependent.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Peter Dimov

Nov 18, 2004, 7:25:07 AM
bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20041116215931.2492C@brangdon.m>...

> For his general point, compare:
>
> void with_gc() {
> SomeObject *ptr = new SomeObject;
> some_proc();
> }
>
> with:
>
> void with_raii() {
> std::auto_ptr<SomeObject> ptr( new SomeObject );
> some_proc();
> }
>
> Clearly the latter routine will have some extra code for the destructor of
> the auto_ptr, which will make it slower. Also, in the first routine the
> call to some_proc() is a tail-call, so can be optimised into a jump, but
> the destructor prevents that optimisation in with_raii().

These two examples are not equivalent, so efficiency doesn't matter.
with_raii() executes SomeObject::SomeObject(), some_proc(),
SomeObject::~SomeObject(). with_gc() omits the last call.

Glen Low

Nov 19, 2004, 8:01:24 AM
> At that point I asked why C# could not have simply incorporated those
> semantics as a part of the language rather than requiring the
> programmer explicitly request it at specific places. His response:
> "Deterministic de-construction is generally expensive, especially if
> one has several small objects sporadically strewn all over."

Note that C# has the notion of "value types" (unlike Java), which are
generally stack-based and are scoped like auto variables in C++.
They're not as flexible as C++ auto variables since the choice of
being stack-based is a design decision not a client decision, but they
are often used as the "small objects sporadically strewn all over".

Your question however seemed to be about syntax rather than
implementation or performance -- to paraphrase, "given that I need to
deconstruct an object anyway, why do I need to write it explicitly?"
Knowing that coding up a "using" scope (in C#) and leaving an auto
variable to fall out of scope (in C++) have identical implementations,
i.e. both run the (expensive) destructor routine and deallocate the heap
memory if any, what does his response really mean?

I think he's not saying "GC is better than RAII" -- that's purely a
performance question as the other posters have said -- but rather, "we
made this syntax to highlight the expense of a deterministic
deconstruction, e.g. the closing of a file". Much like in C++, we have
"reinterpret_cast" to highlight a dangerous C-style cast.

What other syntax choices could be made and still preserve the
simplicity of the language? Given that the object is a "reference
type", it is always allocated on the heap. Given that the default
behavior of heap objects is to be garbage collected at an
indeterminate time, how do you indicate that you want to run the
destructor at the end of scope?

Cheers,
Glen Low, Pixelglow Software
www.pixelglow.com

ka...@gabi-soft.fr

Nov 19, 2004, 9:55:27 AM
al...@start.no (Alf P. Steinbach) wrote in message
news:<419ab671....@news.individual.net>...
> * ka...@gabi-soft.fr:

> > With regards to the performance issues, I believe that this was the
> > case when C++ was first being defined. Today, of course, garbage
> > collection typically outperforms manual memory management. And
> > while neither does as well on performance grounds as no memory
> > management, i.e. variables allocated directly on the stack, the
> > different semantics they provide are IMHO an even stronger argument
> > for auto and global variables.

> "Of course ... outperforms",
> "An even stronger argument for global variables",
> are you trolling?

Just stating what I would consider rather obvious facts, mainly:

- Real measurements comparing modern garbage collection and manual
memory management tend to show garbage collection to be faster.
Although, like most benchmarks, it's better to be leery -- YMMV, as
they say.

- Global and auto variables have different semantics (lifetime, etc.)
than dynamic variables. I would have thought that in this group, no
one would disagree with that, and that mighty few, if any, would
argue that these semantics are not useful, and that it would be
better to just allocate everything dynamically.

The real argument for global and auto variables is thus, IMHO, the
additional behavior due to their different semantics. That they are
also faster than dynamic allocation is just icing on the cake.

> Anyway, I fail to believe most of the assertions bandied about in this
> thread (not just those given by you in the paragraph above).

> And the OP was asking about the cost of _deterministic_ destruction, which
> is not at all in conflict with automatic garbage collection.

The OP was asking about a statement concerning C# compared to C++.
IMHO, the statement actually avoided the real issues -- deterministic
destruction of variables which would otherwise be garbage collected may
be more expensive than just letting garbage collection do its work, but
of course, that's NOT what C++ offers. The difference between all
variables being dynamically allocated, and only those you want being
dynamically allocated, is not irrelevant. Nor is the fact that judging
from his comments, the declaration which triggers the "deterministic
de-construction" in C# is still in the user code (like finally in Java),
and does not depend on the object or its type. IMHO, this means that it
is NOT really an improvement on finally, and is still far from RAII as
we understand it in C++.

(The rest of your comments seemed to concern mainly COM, which I know
nothing about, so I'll let others answer them.)

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


ka...@gabi-soft.fr

Nov 19, 2004, 9:55:48 AM
bran...@cix.co.uk (Dave Harris) wrote in message
news:<memo.20041116215931.2492C@brangdon.m>...

> A good GC implementation will not have any offsetting inefficiencies.


> The allocation cost will be the same, and deallocation is basically
> free - GC costs are mainly proportional to the number of live
> objects. So it's a net win for GC (given that it is present and
> running and the object is on the heap anyway).

With a good relocating GC implementation, allocation cost will be much,
much less than with manual memory management. Actual deallocation costs
depend on different factors, and will often be higher than for manual
management (you do have to relocate still live objects), but the total
of both typically favors garbage collection. And the fact that the cost
of deallocation occurs asynchronously will, in many applications, allow
it to be shifted to dead time (e.g. while waiting for input), which
makes it effectively free.
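
For illustration, allocation in such a compacting collector can be a
pointer bump; a sketch only, with hypothetical globals standing in for
the collector's state:

    #include <cstddef>

    char* heap_top;  // first free byte (maintained by the collector)
    char* heap_end;  // end of the contiguous, compacted region

    void* gc_alloc(std::size_t n) {
        if (static_cast<std::size_t>(heap_end - heap_top) < n)
            return 0;        // a real collector would trigger a GC cycle here
        void* p = heap_top;  // no free-list search at all
        heap_top += n;       // just bump the pointer
        return p;
    }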

Of course, I don't think that relocating GC is compatible with C++, as
it now stands.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


John Nagle

Nov 19, 2004, 10:28:37 AM
Perl and Python both have deterministic destruction using
reference counts. It's noteworthy that in both of those languages,
memory allocation is rarely a major concern of the programmer.
That's a sign they're doing something right.

There are two basic objections to reference counts -
overhead and circular references.

Perl deals with the second problem by offering
"weak references". If you have a tree with backlinks,
the backlinks should be weak references. The object
goes away when the last ordinary (strong) reference
goes away, regardless of weak references. This will
prevent memory leaks in the more common cases. This is
not leakproof, but the Perl approach is safe against bad memory
references.
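
The same idea carries over to C++ with Boost's smart pointers (a
minimal sketch, assuming boost::shared_ptr and boost::weak_ptr are
available):

    #include <boost/shared_ptr.hpp>
    #include <boost/weak_ptr.hpp>

    struct Node {
        boost::shared_ptr<Node> child;  // strong reference: owns the child
        boost::weak_ptr<Node> parent;   // weak backlink: breaks the cycle
    };

    int main() {
        boost::shared_ptr<Node> root(new Node);
        root->child.reset(new Node);
        root->child->parent = root;     // no ownership cycle is created
    }   // last strong reference dies here; both nodes destroyed deterministically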

The performance problem is real, but optimizable.
For C++, it's an artifact of trying to retrofit reference
counts via templates.
Optimizers need to know about reference counts. Reference
count updates can potentially be hoisted out of inner loops
using standard optimization techniques, if the compiler knows
something about reference counts.

If you had optimization of reference counts and subscript
checking at compile time, you could have the safety of Perl
and Python with the speed of C/C++.

This isn't going to happen, because the current C++ committee is
totally uninterested in language safety. Until some major
changes are made in the committee, C++ will not become any
safer, and we'll continue to have very unreliable software
in that language.

John Nagle
Animats

Branimir Maksimovic

Nov 19, 2004, 11:49:47 AM
mailto.anan...@gmail.com (Anand Hariharan) wrote in message news:<22868227.04111...@posting.google.com>...
....

> At that point I asked why C# could not have simply incorporated those
> semantics as a part of the language rather than requiring the
> programmer explicitly request it at specific places. His response:
> "Deterministic de-construction is generally expensive, especially if
> one has several small objects sporadically strewn all over."
I can add:
Non-deterministic de-construction is generally expensive, especially
if one has limited resources. :)

>
> Is there a merit (statistical/empirical) to his assertion? I thought
> C++ went to great lengths for RAII to be possible, eschewing runtime
> guzzlers (such as mark-and-sweep) largely on performance grounds.
>

The assertion would be true if today's systems had unlimited
resources. But the presence of "using", the "IDisposable" interface and
similar hacks proves that wrong.
The problem is that the GC solution is *forced*, but there is no universal
sledgehammer for memory or resource utilization.
For example, when implementing malloc we used a two-layer technique.
The bottom layer is responsible for good memory utilization (not for speed),
while the top layer is responsible for performance (especially when dealing
with threads). This is OK for apps that allocate/deallocate memory chunks
frequently, but not good for ones that allocate lots of chunks and
then deallocate them.
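
A rough sketch of that two-layer idea (all names hypothetical; the real
implementation is certainly more involved):

    #include <cstdlib>
    #include <vector>

    // Top layer: a per-thread cache of fixed-size blocks -- fast, no
    // locking, because it is never shared. Bottom layer: the shared
    // allocator, played here by plain malloc/free, which worries about
    // utilization and locking.
    class ThreadCache {
        std::vector<void*> blocks_;  // blocks cached for local reuse
        static const std::size_t kBlockSize = 64;
    public:
        void* allocate() {
            if (!blocks_.empty()) {              // fast path: no lock taken
                void* p = blocks_.back();
                blocks_.pop_back();
                return p;
            }
            return std::malloc(kBlockSize);      // slow path: bottom layer
        }
        void deallocate(void* p) {
            blocks_.push_back(p);                // keep it local, defer real free
        }
        ~ThreadCache() {
            for (std::size_t i = 0; i < blocks_.size(); ++i)
                std::free(blocks_[i]);           // hand memory back at thread exit
        }
    };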

Greetings, Bane.

Mike Capp

Nov 19, 2004, 11:51:42 AM
ka...@gabi-soft.fr wrote in message news:<d6652001.04111...@posting.google.com>...

> There's certainly nothing wrong with his assertion; it's true as far as
> it goes. I'm not familiar with C#, but I seem to recall reading that it
> doesn't have auto variables (at least with object types); this is what
> makes RAII work so well in C++.

C# has "structs", which are user-defined value types allocated on the
stack. In that sense they are effectively auto. Unfortunately they're
rather restrictive - not quite POD, but heading that way. In
particular, you can't define destructors for them, which makes them
entirely useless for RAII.

C# also makes the spectacular blunder of using identical syntax for
(non-primitive) value and reference semantics, which is a perpetual
source of exciting new bugs.

> As for using, I'm not sure
> why it's better than finally

It's terser but less flexible. Both approaches become very ugly when
you have multiple objects in a scope that require cleanup on exit.
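
For contrast, in C++ any number of cleanup-requiring locals in one scope
add no syntax at all, and they clean up in reverse order of construction.
A tiny runnable sketch:

    #include <cstdio>

    struct Connection {
        Connection() { std::puts("open connection"); }
        ~Connection() { std::puts("close connection"); }
    };
    struct LogFile {
        LogFile() { std::puts("open log"); }
        ~LogFile() { std::puts("close log"); }
    };

    int main() {
        Connection c;  // constructed first, destroyed last
        LogFile log;
        // ... work ...
    }   // prints "close log", then "close connection"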

cheers,
Mike

Anand Hariharan

Nov 20, 2004, 5:20:45 AM
pdi...@gmail.com (Peter Dimov) wrote in message news:<abefd130.04111...@posting.google.com>...

> bran...@cix.co.uk (Dave Harris) wrote in message news:<memo.20041116215931.2492C@brangdon.m>...
> > For his general point, compare:
> >
> > void with_gc() {
> > SomeObject *ptr = new SomeObject;
> > some_proc();
> > }
> >
> > with:
> >
> > void with_raii() {
> > std::auto_ptr<SomeObject> ptr( new SomeObject );
> > some_proc();
> > }
> >
> > Clearly the latter routine will have some extra code for the destructor of
> > the auto_ptr, which will make it slower. Also, in the first routine the
> > call to some_proc() is a tail-call, so can be optimised into a jump, but
> > the destructor prevents that optimisation in with_raii().
>
> These two examples are not equivalent, so efficiency doesn't matter.
> with_raii() executes SomeObject::SomeObject(), some_proc(),
> SomeObject::~SomeObject(). with_gc() omits the last call.
>

Wasn't that /exactly/ Dave Harris' point? The two examples /can/ be
compared since both envisage calling the default constructor of
'SomeObject' followed by a call to some_proc. 'with_raii()' does more
(viz., calling the destructor of 'SomeObject') but without the
overhead of GC. Dave further makes a case that an implementation that
runs GC allows the optimiser to improve 'with_gc()'.

What did I miss?

- Anand

Alf P. Steinbach

Nov 20, 2004, 5:31:17 AM
* ka...@gabi-soft.fr:

> al...@start.no (Alf P. Steinbach) wrote in message
> news:<419ab671....@news.individual.net>...
> > * ka...@gabi-soft.fr:
>
> > > With regards to the performance issues, I believe that this was the
> > > case when C++ was first being defined. Today, of course, garbage
> > > collection typically outperforms manual memory management. And
> > > while neither does as well on performance grounds as no memory
> > > management, i.e. variables allocated directly on the stack, the
> > > different semantics they provide are IMHO an even stronger argument
> > > for auto and global variables.
>
> > "Of course ... outperforms",
> > "An even stronger argument for global variables",
> > are you trolling?
>
> Just stating what I would consider rather obvious facts, mainly:
>
> - Real measurements comparing modern garbage collection and manual
> memory management tend to show garbage collection to be faster.
> Although, like most benchmarks, it's better to be leary -- YMMV, as
> they say.
>
> - Global and auto variables have different semantics (lifetime, etc.)
> than dynamic variables. I would have thought that in this group, no
> one would disagree with that, and that mighty few, if any, would
> argue that these semantics are not useful, and that it would be
> better to just allocate everything dynamically.

OK, I see partly what you mean now -- those are two very different
issues, but the earlier formulation led me to believe you had them
lumped into one; what I'm still unclear on is the "_even_ stronger".

There is another dimension here that is, I think, both more important
and more directly relevant to the OP's question, "deterministic
_destruction_" (the title of this thread), namely the difference between
destructor calls and memory deallocation.

I have no doubt that pure deallocation can both in principle and in
practice be more efficiently done by a general garbage collector, simply
because it's got a global view of things (memory, processor
utilization), except in the few cases where we can use a specially
crafted most efficient allocator such as a simple free-list / cache.

I do doubt that there is any benefit from non-deterministic destructor
calls, other than the dubious one of being able to turn off destructor
calls on a per-object basis (possible in C#, and AFAIK done by default
in Microsoft's Windows Forms GUI library for .NET). The reason I doubt
that there can be any benefit is two-fold: first of all, the calls will
have to be made anyway, and I fail to see how they can be faster when done
by the garbage collector (other than that the garbage collector might
choose to do the calls when there is otherwise low processor
utilization); secondly, the kind of performance tweaking seemingly
necessary for professional libraries like the one mentioned here means
that not only do you not know when your destructor will be called, or if
it ever will be called (if the application exits before the object is
collected), you do not even know if there is a chance; so what possible
benefit can an automatically called destructor have in, for example, C# and
Java (personal guideline: don't use it for anything!)?

Is there anything in the C++ standard that prohibits operator delete
from doing just the destructor call part, which I think it can do most
efficiently and in a way that can be _relied_ on, and handing the memory
deallocation bit over to a garbage collector which is better at that?
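
Nothing in the standard obviously forbids it. A sketch of the idea
against the Boehm conservative collector (assuming its C API; the header
location varies by installation):

    #include <gc.h>      // Boehm collector; sometimes installed as <gc/gc.h>
    #include <cstddef>

    struct Resource {
        void* operator new(std::size_t n) { return GC_MALLOC(n); }
        void operator delete(void*) { }  // no-op: the collector reclaims memory
        ~Resource() { /* release handles, locks, ... deterministically */ }
    };

    int main() {
        GC_INIT();
        Resource* r = new Resource;  // allocated from the collected heap
        delete r;   // ~Resource() runs right now; the memory is freed later, by GC
    }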


> The real argument for global and auto variables is thus, IMHO, the
> additional behavior due to their different semantics. That they are
> also faster than dynamic allocation is just icing on the cake.

I guess by "behavior", when it comes to auto variables you're referring
to RAII. Well then, see last question above. Also, the D language
allows classes to be declared 'auto', with the destructor called when the
object goes out of scope; according to the documentation such objects are
currently not allocated on the stack, i.e. they have the RAII semantics
regarding destructor calls but not regarding memory reclamation.


>
> > Anyway, I fail to believe most of the assertions bandied about in this
> > thread (not just those given by you in the paragraph above).
>
> > And the OP was asking about the cost of _deterministic_ destruction, which
> > is not at all in conflict with automatic garbage collection.
>
> The OP was asking about a statement concerning C# compared to C++.
> IMHO, the statement actually avoided the real issues -- deterministic
> destruction of variables which would otherwise be garbage collected may
> be more expensive than just letting garbage collection do its work, but
> of course, that's NOT what C++ offers.

I think a comparison to C++ would not be meaningful, except perhaps a
hypothetical C++ implementation with operator delete delegating memory
reclamation to a garbage collector. But even that would not capture the
idea of having C# 'using' by default. A more meaningful comparison
would, I think, be to a language based on reference counting, such as VB
6.0 -- or even better a hypothetical language like VB 6.0 semantics
but with a general garbage collector _added_.


> (The rest of your comments seemed to concern mainly COM,

Well, no, COM was just an example in the middle of a sentence. But it
was a very important example, so I'll expand on it. Substitute "file
handle", "network connection", whatever; deterministic destructor calls,
as opposed to deterministic memory reclamation, is extremely important
to have as a property of the class in question, not something the client
code must fix in every case (often forgetting, often doing wrong).

Essentially what the C# evangelist answered the OP was that D-auto-like
classes are "generally expensive, especially if one has several small
objects sporadically strewn all over.", and that VB6-like reference
counting is "generally expensive, especially if one has several small
objects sporadically strewn all over", and so on for every alternative
that provides deterministic destructor calls.

I think that's most probably incorrect.


> [COM] I know nothing about, so I'll let others answer them.)

Uh, that's off-topic here... ;-)

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Dave Harris

Nov 20, 2004, 11:17:43 AM
pdi...@gmail.com (Peter Dimov) wrote (abridged):

> These two examples are not equivalent, so efficiency doesn't matter.
> with_raii() executes SomeObject::SomeObject(), some_proc(),
> SomeObject::~SomeObject(). with_gc() omits the last call.

Yes. That's the point. We're considering the effect that difference in
behaviour has on efficiency. With_gc() grants more freedom to the
implementation and so is potentially more efficient.

The .NET evangelist said "Deterministic de-construction is generally
expensive...". He is not comparing one kind of deterministic
deconstruction (eg try/catch) with another (eg RAII), he's comparing it
with non-deterministic de-construction. So my example code is to the
point.

Although the examples are not equivalent, the first can substitute for the
second (assuming no outstanding references), so it does make sense to
compare their efficiencies.

-- Dave Harris, Nottingham, UK


Francis Glassborow

Nov 20, 2004, 11:21:42 AM
In article <j4dnd.23136$6q2....@newssvr14.news.prodigy.com>, John Nagle
<na...@animats.com> writes

> This isn't going to happen, because the current C++ committee is
>totally uninterested in language safety.

What an extraordinary statement. The Committee (a substantially
different one from that which did most of the work on C++98) is
constrained by a requirement not to break legacy code without very good
reason. However, within those limits they are not 'totally uninterested
in language safety'; they just do not consider it the most important
issue. And as security has become of increasing importance in IT, it
has a higher place now than it would have done ten years ago.

> Until some major
>changes are made in the committee, C++ will not become any
>safer, and we'll continue to have very unreliable software
>in that language.

And who do you propose make those changes? WG21 & J16 are composed of
those people who make the effort to participate. No one who wishes to be
involved is excluded; indeed, we welcome newcomers as well as new
perspectives.


As for unreliable software, any language widely used by people ranging
across the entire spectrum of competence will have that problem. The
problem is much less with the tools than with those who are unable to
identify competence and listen to advice when they are given it.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Terje Slettebø

Nov 20, 2004, 10:13:45 PM
"John Nagle" <na...@animats.com> wrote in message
news:j4dnd.23136$6q2....@newssvr14.news.prodigy.com...

> Perl and Python both have deterministic destruction using
> reference counts. It's noteworthy that in both of those languages,
> memory allocation is rarely a major concern of the programmer.
> That's a sign they're doing something right.
>
> If you had optimization of reference counts and subscript
> checking at compile time, you could have the safety of Perl
> and Python with the speed of C/C++.
>
> This isn't going to happen, because the current C++ committee is
> totally uninterested in language safety.

That's a rather sweeping remark. Well, let's look at the facts:

- The TR1 on library extensions
(http://www2.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1711.pdf)
includes reference-counted smart pointers, for safe use of dynamically
allocated memory (see the sketch after this list).
- There's the STL, with its containers, also taking care of memory
allocation and release issues.
- There are proposals (such as this one:
http://www2.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1727.pdf) to
change undefined behaviour to diagnosed behaviour.
- There are also proposals for making the language safer and easier to use,
by providing new constructs with safer defaults (like "explicit classes")
and stricter type checking.
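
For instance, the TR1 smart pointer gives deterministic release of a
dynamically allocated resource. A minimal sketch, assuming an
implementation that ships TR1 (with Boost, substitute boost:: and
<boost/shared_ptr.hpp>):

    #include <tr1/memory>  // TR1 header location varies by implementation
    #include <cstdio>

    struct Connection {
        ~Connection() { std::puts("connection closed"); }
    };

    int main() {
        std::tr1::shared_ptr<Connection> a(new Connection);
        std::tr1::shared_ptr<Connection> b = a;  // shared ownership
    }   // last owner leaves scope: the destructor runs exactly once, here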

Indeed, it may be claimed that much of the reason for C++ was to provide a
better, safer, and more expressive language.

You claim the committee is not interested in language safety. However, have
you considered that language safety isn't the only concern? It typically
needs to be balanced against issues such as performance, not restricting
safe use, and backwards compatibility. Saying they're not interested in
language safety is patently untrue.

> Until some major
> changes are made in the committee, C++ will not become any
> safer, and we'll continue to have very unreliable software
> in that language.

As you know, the committee is made up of volunteers, who either pay the
expenses of participating themselves or have someone else (like their
employer) pay or contribute to them; they should be commended for this. Changes
won't happen unless someone who feels strongly about it, and is able to do
it, does something about it. The committee has a "ton" of issues and
proposals to deal with, already, with limited manpower. If you feel strongly
about this issue, why don't you participate, and perhaps write a paper on
this? If you think you can do better, why not join the committee?

It's easy enough to criticise someone for not doing something you want, as
long as you don't have to do it yourself... The expression "armchair
quarterback" comes to mind.

Regards,

Terje

David B. Held

Nov 20, 2004, 10:20:41 PM
Anand Hariharan wrote:

> pdi...@gmail.com (Peter Dimov) wrote in message news:<abefd130.04111...@posting.google.com>...

> > [...]


> > These two examples are not equivalent, so efficiency doesn't matter.
> > with_raii() executes SomeObject::SomeObject(), some_proc(),
> > SomeObject::~SomeObject(). with_gc() omits the last call.
>
> Wasn't that /exactly/ Dave Harris' point? The two examples /can/ be
> compared since both envisage calling the default constructor of
> 'SomeObject' followed by a call to some_proc. 'with_raii()' does more
> (viz., calling the destructor of 'SomeObject') but without the
> overhead of GC. Dave further makes a case that an implementation that
> runs GC allows the optimiser to improve 'with_gc()'.
>
> What did I miss?

The fact that ~SomeObject() might be releasing a contended resource
handle, of course.

Dave

Stephen Howe

Nov 21, 2004, 6:43:33 AM
> This isn't going to happen, because the current C++ committee is
> totally uninterested in language safety.

I doubt whether that is even remotely true.

> Until some major changes are made in the committee, C++ will not become
> any
> safer, and we'll continue to have very unreliable software in that
> language.

I would be extremely interested to know what you would propose that would
make the language "safer".
I have a sneaking suspicion it would be "features" with the drawback,
which we would all have to accept, that our executables are either slightly
slower or bigger.
Everybody would then have to swallow your opinion that the safety that
results is worth paying for with a slight loss in speed or bigger executables.
In reality, that is not even going to get off the ground.

If you can suggest "features" that make the language safer (for example, C90
added function prototypes and C++ in its infancy added type-safe linkage)
_without_ giving up any speed or efficiency and without bloating the code,
then I am sure the committee will welcome your proposals with open arms.

Perhaps it would be wise to think about what the creators of C and C++
achieved: both languages are such that the executable code produced is
pretty close to what the programmer wrote, without any excess run-time
baggage.
That was part of the design criterion.
Any "safety" occurs mostly at compile-time, not at run-time.
That is not likely to change.

Stephen Howe

Peter Dimov

Nov 21, 2004, 6:15:46 PM
mailto.anan...@gmail.com (Anand Hariharan) wrote in message
news:<22868227.04111...@posting.google.com>...
> pdi...@gmail.com (Peter Dimov) wrote in message news:<abefd130.04111...@posting.google.com>...
> > These two examples are not equivalent, so efficiency doesn't matter.
> > with_raii() executes SomeObject::SomeObject(), some_proc(),
> > SomeObject::~SomeObject(). with_gc() omits the last call.
> >
>
> Wasn't that /exactly/ Dave Harris' point? The two examples /can/ be
> compared since both envisage calling the default constructor of
> 'SomeObject' followed by a call to some_proc. 'with_raii()' does more
> (viz., calling the destructor of 'SomeObject') but without the
> overhead of GC. Dave further makes a case that an implementation that
> runs GC allows the optimiser to improve 'with_gc()'.
>
> What did I miss?

The side effects of ~SomeObject?

Peter Dimov

Nov 21, 2004, 6:16:08 PM
bran...@cix.co.uk (Dave Harris) wrote in message
news:<memo.20041119205302.2064C@brangdon.m>...

> pdi...@gmail.com (Peter Dimov) wrote (abridged):
> > These two examples are not equivalent, so efficiency doesn't matter.
> > with_raii() executes SomeObject::SomeObject(), some_proc(),
> > SomeObject::~SomeObject(). with_gc() omits the last call.
>
> Yes. That's the point. We're considering the effect that difference in
> behaviour has on efficiency. With_gc() grants more freedom to the
> implementation and so is potentially more efficient.

No, with_gc doesn't grant more freedom to the implementation. It
simply does not execute ~SomeObject (and the implementation is not
free to invoke it), whereas with_raii does execute ~SomeObject (and
the implementation is not free to not invoke it). This is the only
difference between these two examples. with_raii is not required to
immediately deallocate the memory, if that's what you mean by with_gc
having more freedom.

> The .NET evangelist said "Deterministic de-construction is generally
> expensive...". He is not comparing one kind of deterministic
> deconstruction (eg try/catch) with another (eg RAII), he's comparing it
> with non-deterministic de-construction. So my example code is to the
> point.

Non-deterministic de-construction (if by that you mean finalization)
does not improve performance. Finalizers are evil and don't play well
with high-performance collectors. The only thing that does improve
performance is omitting de-construction.

> Although the examples are not equivalent, the first can substitute for the
> second (assuming no outstanding references), so it does make sense to
> compare their efficiencies.

No, it can't, because the behavior is different. The examples are only
equivalent if ~SomeObject has no observable side effects. If this is
the case, it can be optimized out and the examples become truly
equivalent and can generate the same code.

ka...@gabi-soft.fr

Nov 22, 2004, 4:33:36 PM
al...@start.no (Alf P. Steinbach) wrote in message
news:<419e0b3d....@news.individual.net>...

They're linked by a third point: while garbage collection may be faster
than manual management of dynamically allocated objects, neither is as
fast as NO dynamic allocation -- allocating objects on the stack, or
statically. (This is actually only partially true for large or complex
objects; if stack allocation causes you to make more copies, it may be
slower.) However, the real reason to insist on supporting stack based
and static objects isn't the performance win, but the fact that they
have different semantics, and that these semantics are very, very
useful. (Of course, the improved performance isn't a bad thing either.)

> There is another dimension here that is, I think, both more important
> and more directly relevant to the OP's question, "deterministic
> _destruction_" (the title of this thread), namely the difference
> between destructor calls and memory deallocation.

That's the basic difference between stack based objects and dynamically
allocated objects.

> I have no doubt that pure deallocation can both in principle and in
> practice be more efficiently done by a general garbage collector,
> simply because it's got a global view of things (memory, processor
> utilization), except in the few cases where we can use a specially
> crafted most efficient allocator such as a simple free-list / cache.

> I do doubt that there is any benefit from non-deterministic destructor
> calls, other than the dubious one of being able to turn off destructor
> calls on a per-object basis (possible in C#, and AFAIK done by default
> in Microsoft's Windows Forms GUI library for .NET).

That's pretty much my feeling as well, and IMHO, a garbage collector
could just ignore destructors. But I know that others disagree,
including some who know the problems a lot better than I do.

> The reason I doubt that there can be any benefit is two-fold: first of
> all, the calls will have to made anyway, and I fail to see how they
> can be faster when done by the garbage collector (other than that the
> garbage collector might choose to do the calls when there is otherwise
> low processor utilization); secondly, the kind of performance tweaking
> seemingly necessary for professional libraries like the one mentioned
> here means that not only do you not know when your destructor will be
> called, or if it ever will be called (if the application exits before
> the object is collected), you do not even know if there is a chance;
> so what possible benefit can automatically called destructor have in
> for example C# and Java (personal guideline: don't use it for
> anything!)?

> Is there anything in the C++ standard that prohibits operator delete
> from doing just the destructor call part, which I think it can do most
> efficiently and in a way that can be _relied_ on, and handing the
> memory deallocation bit over to a garbage collector which is better at
> that?

I don't think so. It's an interesting idea. But I suspect that much of
the advantage of garbage collection comes precisely from not having to
call most destructors at all; at least in my code, a fair percentage of
destructors are only concerned with memory management. (Note that it's
"I suspect". I'm really just guessing with regards to performance.)

> > The real argument for global and auto variables is thus, IMHO,
> > the additional behavior due to their different semantics. That
> > they are also faster than dynamic allocation is just icing on
> > the cake.

> I guess by "behavior", when it comes to auto variables you're
> referring to RAII. Well then, see last question above. Also, the D
> language allows classes to be declared 'auto', destructor call when
> object goes out of scope, and according to the documentation such
> objects are currently not allocated on the stack, i.e. they have the
> RAII semantics regarding destructor calls but not regarding memory
> reclamation.

Sort of like an std::auto_ptr with the logic you describe above?

IMHO, the real advantage of the RAII idiom, as it is usually used in
C++, is that the client does absolutely *nothing* to make it work. He
just defines an everyday local variable, and it just works.

Of course, the disadvantage is also that the client has no real control
over it. Like most people, I use RAII to manage mutex locks, and I've
had one or two cases (among several thousands -- they are exceptions)
where the scope of a local variable did NOT correspond to the time I
needed to hold the lock. Luckily, my mutex class still allows you to do
things the old way (which in turn allows creating new management
classes, with different ownership semantics).
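
A minimal sketch of that everyday idiom, written against pthreads; the
wrapper class itself is hypothetical but representative:

    #include <pthread.h>

    class ScopedLock {
        pthread_mutex_t& m_;
    public:
        explicit ScopedLock(pthread_mutex_t& m) : m_(m) { pthread_mutex_lock(&m_); }
        ~ScopedLock() { pthread_mutex_unlock(&m_); }  // runs on every exit path
    private:
        ScopedLock(const ScopedLock&);             // noncopyable: declared,
        ScopedLock& operator=(const ScopedLock&);  // never defined
    };

    pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

    void update_shared_state() {
        ScopedLock guard(mtx);  // lock held for exactly this scope
        // ... touch shared data ...
    }                           // deterministic unlock, even if an exception is thrown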

I'm not sure; I don't really have enough experience to judge. What I do
feel sure about is: 1) using local variables (as in C++) for
deterministic destructor calls IS more efficient than trying to achieve
deterministic destruction of dynamically allocated variables, and 2)
classical garbage collection (without deterministic destruction) is
cheaper than reference counting in general.

I'd guess that 99% of the time I need deterministic destruction, the
determination corresponds to the scope of a local variable. In the few
remaining cases, the number of copies of the object is very, very
limited, so I doubt that reference counting would be an objectionable
expense.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Dave Harris

Nov 22, 2004, 4:56:40 PM
pdi...@gmail.com (Peter Dimov) wrote (abridged):
> No, with_gc doesn't grant more freedom to the implementation. It
> simply does not execute ~SomeObject (and the implementation is not
> free to invoke it), whereas with_raii does execute ~SomeObject (and
> the implementation is not free to not invoke it).

In with_gc, ~SomeObject will be executed by the garbage collector.

Remember we're not discussing C++ here, but some hypothetical mixture of
C# and C++ which has both garbage collection and RAII. Whether a C++ with
garbage collection should invoke destructors before reclamation is a
contentious point. I tend to think it shouldn't, but many people disagree
with me and I gather in C# it does.

The original poster wrote:
I asked why C# could not have simply incorporated
those semantics as a part of the language

so for this thread we have to build on C# as it is, not on C++ as it might
be.


> Non-deterministic de-construction (if by that you mean finalization)
> does not improve performance. Finalizers are evil and don't play well
> with high-performance collectors. The only thing that does improve
> performance is omitting de-construction.

I agree finalisers are evil. That isn't the point being discussed, though.
The code to call the finaliser exists once as part of the GC, and is not
replicated in each function which creates an object. That affects
efficiency.


> The examples are only equivalent if ~SomeObject has no observable
> side effects. If this is the case, it can be optimized out and
> the examples become truly equivalent and can generate the same code.

It can only be optimised out at a given call site if the implementation of
~SomeObject is known to the compiler at that call site. It may not be.

-- Dave Harris, Nottingham, UK


Herb Sutter

Nov 23, 2004, 2:01:12 PM
On 16 Nov 2004 06:31:54 -0500, M Jared Finder <ja...@hpalace.com> wrote:

>Anand Hariharan wrote:
> > I recently attended a talk given by a .NET evangelist. Surprisingly,

Just curious, who was it?

> > the speaker was quite sincere and explained why Garbage collection is
> > no panacea (citing examples such as database connections and file
> > handles), taking his presentation through muddy waters of Dispose,
> > Close, etc.
> >
> > At one point he showed how C# chose to overload the "using" keyword in
> > a completely unrelated context viz., to specify that the variables
> > within a block defined by "using" should be destroyed as soon as they
> > leave the scope.
> >

> > At that point I asked why C# could not have simply incorporated those
> > semantics as a part of the language rather than requiring the
> > programmer explicitly request it at specific places. His response:

> > "Deterministic de-construction is generally expensive, especially if
> > one has several small objects sporadically strewn all over."


> >
> > Is there a merit (statistical/empirical) to his assertion?

Short answer: No, but it's a common misconception even among experts.
Having deterministic destruction can incur a minor performance penalty,
and it is this he is thinking about. But deterministic destruction also
gains a significant practical performance advantage, and is otherwise
desirable. Further on this:

> > I thought
> > C++ went to great lengths for RAII to be possible, eschewing runtime
> > guzzlers (such as mark-and-sweep) largely on performance grounds.
>

>This just seems crazy. I can't see how C#'s using, Java's try-finally,
>or C++'s automatic destruction would generate code that is different in
>any way.

Right, but it's useful to understand the actual issue. Here it is:

Let's say you have a stack frame containing one or more conceptually local
objects. Any such object may need to be cleaned up at the end of the
function (or a more local scope), either for performance reasons or for
correctness reasons. Depending on the language you're using, you express
that essentially identically as one of the following:

- in C++, a stack-based object with a nontrivial destructor
- in C#, a using clause for a Disposable object
- in Java, the hand-coded Dispose pattern

In each case, you incur the overhead of an implicit or explicit
"try/finally" for the first local object that will need the cleanup -- and
it is that try/finally that the people who worry about performance are
talking about.
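
To make the equivalence concrete, here is a hypothetical File wrapper;
its destructor is, in effect, the finally clause the compiler writes for
you:

    #include <cstdio>

    class File {
        std::FILE* fp_;
    public:
        explicit File(const char* name) : fp_(std::fopen(name, "r")) {}
        ~File() { if (fp_) std::fclose(fp_); }  // the implicit "finally"
    private:
        File(const File&);             // noncopyable: declared, never defined
        File& operator=(const File&);
    };

    void use_file() {
        File f("data.txt");  // cleanup registered just by declaring the local
        // ... work that may throw ...
    }                        // ~File() runs here, on normal exit and on unwinding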

Note, however, that:

1. The constructs for expressing this are essentially the same in all
languages; the only question is ease of use, and the winners there are
C++, C#, and Java, in that order. (To be complete, I should acknowledge
that there are other areas where C# and Java win on ease of use, but in
this particular case it is C++ that is the simpler language.)

2. Generally it's wrong NOT to write the deterministic destruction when
objects are conceptually local. If the object needs to be Dispose'd, you
need to Dispose it. So it's usually a red herring to say this incurs some
potential overhead, because to avoid the overhead would be to write an
incorrect program (and/or a less well-performing one, see below).

3. There are offsetting performance advantages to early destruction. In
particular, you incur a local try/finally for the first local variable in
a given scope that requires the cleanup (additional ones are essentially
free because you already have the try/finally in place), but you often get
great performance benefits later by reducing finalizer pressure and GC
work. (In one example I cite in talks, the microsoft.com website uses .NET
widely but at one point found that they were spending 70% of total system
time(!) in the GC. It wasn't .NET's fault or the GC's fault, but rather a
consequence of the way the GC was being used. The CLR performance team analyzed the
problem and told the app team to make one change: Before making a
server-to-server call, clean up (Dispose) all the objects you don't need
any more. With that one change, GC went down to 1%. I submit that the
problem would never have occurred if the app had been written in C++,
which uses deterministic destruction by default. C# and Java have it off
by default, and if you forget to write "using" or the Dispose pattern then
your code will still compile, but will have either a correctness bug or a
latent performance problem.)

Otherwise, if none of the conceptually local objects requires cleanup,
you express that essentially identically as one of the following:

- in C++, a stack-based object with a trivial destructor
(or, a heap-based object)
- in C#, no using clause
- in Java, no Dispose pattern

In each case, you avoid adding the exception handling to do the cleanup.
Again, it's the same in all languages. C++ happens to turn cleanup on by
default for stack-based objects and does this optimization to
automatically avoid the overhead when the cleanup work is trivial.
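
In the trivial case there is nothing to do, so nothing is emitted -- a
minimal sketch, with an illustrative type:

struct Point { int x, y; };      // trivial destructor

void g()
{
    Point p = { 1, 2 };          // conceptually local, needs no cleanup

    // ... use p ...

}   // no destructor call and no try/finally machinery is generated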

So the argument really doesn't boil down to what some people often say,
namely whether deterministic destruction of conceptually local objects is
a good thing or not -- clearly it is important, otherwise you wouldn't
have C++ auto semantics, C# using statements, and Java Dispose patterns!
The argument really boils down to this: When you do need deterministic
destruction, you really do need it regardless of the language you're
using, and to avoid the overhead would be to write an incorrect program
(and often one with more overhead in other places).

>I can see there being problems with an old ABI that requires
>each function to register itself as having a cleanup step, but standards
>can't prevent all stupid implementations. In addition, using garbage
>collection will remove much of the work done in destructors since most
>of the resources used in a program tend to be memory.

The latter is true for trivial destructors. In short, finalizers (often
but incorrectly called "destructors that run at GC time" which they are
NOT) are fundamentally flawed and extremely complex. (See for example
http://blogs.msdn.com/cbrumme/archive/2004/02/20/77460.aspx.)

I have personally come to the conclusion that destructors and GC are
completely separate and must be kept completely separate. Trying to
conflate the two ideas is the root of most of the problems with current GC
systems in my opinion; in particular, this manifests most notably in the
case of finalizers which exactly try to tie those two things together, and
in the fact that all major current GC systems attempt to do GC instead of
destructors, rather than in addition to destructors (with the notable
exception of C++/CLI).

Of course, C++/CLI exposes what the CLI does (including finalizers) and
what C++ does (destructors) and by bringing them together shows how
beneficial destructors are even for today's GC systems. I think that
C++/CLI is the best it can be in this regard and is really compelling
over the current alternatives. I also think this approach could be taken
further and further improved upon; I have definite ideas, not ready for
publication, on potential improvements in GC by removing finalizers
outright (which could be viewed as somewhat radical and I agree that
departing from longtime practice is something that should never be done
lightly).
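
To make the separation concrete, here is how C++/CLI spells the two
concepts -- a sketch using the draft syntax:

ref class Connection
{
public:
    ~Connection()    // destructor: deterministic, becomes Dispose()
    {
        // release the database connection, file handle, etc.
    }

    !Connection()    // finalizer: runs at GC time, if at all
    {
        // last-chance cleanup only; it must not rely on other
        // finalizable objects, because finalization is unordered
    }
};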

>I'd be interested in what Herb Sutter had to say about this, considering
>that one of the big advantages of C++/CLI over C# is the automatic
>calling of destructors.

Yes. Of course, C# has other advantages; I drool over anonymous delegates
(a restricted but very useful form of lambda functions / closures). It
would be cool to have those in C++... but that's another release...

Herb

---
Herb Sutter (www.gotw.ca) (www.pluralsight.com/blogs/hsutter)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

Alf P. Steinbach

Nov 24, 2004, 4:28:26 AM
to
* Herb Sutter:

>
> I have personally come to the conclusion that destructors and GC are
> completely separate and must be kept completely separate. Trying to
> conflate the two ideas is the root of most of the problems with current GC
> systems in my opinion;

Very agreed.

What is the chance of e.g. Microsoft (or someone else) implementing the idea I
sketched earlier in this thread, of having C++ operator 'delete' just
call destructors and pass the memory deallocation work over to GC?

If someone could just do GC _right_ -- GC, and not all kinds of other
stuff -- then by default we'd get programs that would be both safer
and more efficient, instead of today's less safe and less efficient.


> Yes. Of course, C# has other advantages; I drool over anonymous delegates
> (a restricted but very useful form of lambda functions / closures). It
> would be cool to have those in C++... but that's another release...

Very disagreed. Nothing is as bad as canned functionality. Classes
that are implemented by the compiler, can only be implemented by the
compiler, and can't be extended, are as canned as they can be, VB style.

I'd rather have the corresponding and much more general Java concept in
C++ -- and for that matter, in C#.

But I think C++ is large enough as is... ;-)

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Alexander Terekhov

Nov 24, 2004, 4:26:26 AM
to

David Abrahams wrote:
[...]

> > For the most part, RAII also works well in multi-threaded situations.
> > Most things aren't shared between threads, in well-designed apps.
>
> Also, I think with a lock-free memory allocator many of the
> performance advantages of GC disappear even for the multithreaded
> case.

With real GC you don't need to maintain shared counts.

See also:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/example.html
(Expensive Explicit Deallocation: An Example)

One would still need "msync::ddhld" and "msync::ssb" for readers
and writers of atomic<section *> pointers respectively, though.

regards,
alexander.

Branimir Maksimovic

Nov 24, 2004, 4:34:29 AM
to
> bran...@cix.co.uk (Dave Harris) wrote in message
> news:<memo.20041116215931.2492C@brangdon.m>...
>
> > A good GC implementation will not have any offsetting inefficiencies.
> > The allocation cost will be the same, and deallocation is basically
> > free - GC costs are mainly proportional to the number of live
> > objects. So it's a net win for GC (given that it is present and
> > running and the object is on the heap anyway).
>
> With a good relocating GC implementation, allocation cost will be much,
> much less than with manual memory management. Actual deallocation costs
> depend on different factors, and will often be higher than for manual
> management (you do have to relocate still live objects), but the total
> of both typically favors garbage collection. And the fact that the cost
> of deallocation occurs asynchronously will, in many applications, allow
> it to be shifted to dead time (e.g. while waiting for input), which
> makes it effectively free.
>
> Of course, I don't think that relocating GC is compatible with C++, as
> it now stands.

I think both of you are wrong. Asynchronous GC will probably work ok
with single-threaded applications, but will choke on multiple threads.
You cannot move live objects without locks; that would result either in
excessive memory allocation or in slowing down the whole program.
Just imagine a design in which you have two threads that do the actual
work and one more that is responsible for memory allocation/deallocation,
controlled by a condition variable triggered by alloc/free calls,
syscalls, and perhaps timers... and that is basically a garbage
collector. Allocations can be fast -- just grab memory from an already
prepared pool and hand it out -- but the costs of reorganization and
synchronization can be high, because there is no way the GC can know
what the other threads are doing. In some situations it would work
perfectly, but in others it can turn a 4-CPU machine into a one-CPU
machine :)

How is that better than (for example) this design? There is a memory
pool of fixed-size allocators, from the strictest alignment requirement
up to some maximum value, in different categories. The category
determines the granularity of the pool, so each category is represented
as a vector of pointers to allocators; the index into that vector
determines the allocation size, and a lower_bound over the vector of
vectors selects the category. The maximum free memory that a pool of
allocators can hold is configurable. Each thread has its own independent
allocator (actually a pool of allocators), and memory blocks can be used
interchangeably between threads, so one thread can allocate and another
can free without locks. Each allocator detects blocks it does not own
and puts them in a vector of pointers called foreign blocks. When some
configurable threshold is reached, the thread puts the free list of
foreign blocks into a global map, protected by a lock and indexed by
thread id. When an allocator's pool of available blocks is exhausted, it
first takes its foreign blocks; if none are available, it acquires the
lock on the global map, looks up its own thread id, and checks whether a
free list is available there. If one is, it takes it. If not, another
allocator is used, one which is lock-based and responsible for maximum
memory utilization (defragmentation and the other things that are done
by a GC). So the cost of allocation/deallocation is a lower_bound over a
vector of e.g. 12 elements plus a single indexing into a vector computed
from the size. Basically, when the application starts allocating, the
cost is high, but once some threshold is reached this flies like a jet.
When testing against malloc from glibc we get lower memory consumption
(we have a set of predefined block sizes, which adds overhead to each
block but pays for itself in little or no fragmentation) and
significantly faster performance on multiple CPUs. Compared to the
default allocator used in gcc's standard library, it pushes and pops
from completely separate lists in 10 different threads on a single CPU
800% faster :) When compared with the __USE_MALLOC allocator it scales
better, approximately by the number of CPUs.
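
A minimal sketch of the size-class selection described above (the size
table and the names are illustrative, and the vector-of-vectors is
simplified here to a single table):

#include <algorithm>
#include <cstddef>

// Illustrative size classes, from strictest alignment up to a maximum.
static const std::size_t classSize[] =
    { 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384 };
static const std::size_t numClasses =
    sizeof classSize / sizeof classSize[0];

// Cost per allocation: one lower_bound over ~12 elements. Returns
// numClasses when the request is too big for the pools, in which case
// the caller falls back to the lock-based allocator.
std::size_t classIndexFor(std::size_t bytes)
{
    const std::size_t* p =
        std::lower_bound(classSize, classSize + numClasses, bytes);
    return static_cast<std::size_t>(p - classSize);
}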

Greetings, Bane.

Andrew Browne

Nov 24, 2004, 2:33:30 PM
to

"Herb Sutter" <hsu...@gotw.ca> wrote in message
news:3ht6q09j6gso7lank...@4ax.com...

> Let's say you have a stack frame containing one or more conceptually
> local
> objects. If any such object should be cleaned up at the end of the
> function (or more local scope), either for performance reasons or for
> correctness reasons. Depending on the language you're using, you
> express
> that essentially identically as one of the following:
>
> - in C++, a stack-based object with a nontrivial destructor

<snip>

> Otherwise, if none of the conceptually local object does not require
> cleanup, you express that essentially identically as one of the
> following:
>
> - in C++, a stack-based object with a trivial destructor
> (or, a heap-based object)

Unless I'm missing something, surely in many cases in C++/CLI what we're
actually going to be dealing with is something like the following:

void F()
{
    SomeType^ something = SomeLibraryFunctionReturningAHandle();

    // .... some code ...
    // we ought to clean up here if SomeType has a nontrivial destructor
}

We can't just do something like

void F()
{
    SomeType something(*SomeLibraryFunctionReturningAHandle());

    // .... some code ...
    // automatic cleanup here
}

because SomeLibraryFunctionReturningAHandle() may return a handle to a
derived type, SomeType may well not have a copy constructor anyway, and
in any case it would only be the copy that got cleaned up.

It seems to me that we really need something like this:-

void F()
{
    smart_handle<SomeType> something(SomeLibraryFunctionReturningAHandle());

    // .... some code ...
    // automatic cleanup here
}

and that, just as in Standard C++ we are encouraged to "avoid using bald
pointers" (e.g. Sutter in August 2004 CUJ), we should avoid using bald
handles in C++/CLI and use some smart handle equivalent of std::auto_ptr
or tr1::shared_ptr. It seems to me that the equivalent of tr1::shared_ptr
would have to use reference counting. (It could be argued that it could
be specialised to not use reference counting where it could be determined
at compile time that its template argument T had a trivial destructor --
using something like tr1::is_base_of<System::IDisposable, T> -- but T
might simply be a base type such as System::Object.) So it looks like
we'd have the performance overhead of smart pointers and garbage
collection has gained us nothing.

BTW this isn't meant to be an attack on C++/CLI, which is a development
I'm very interested in. I'd be very happy to find out that I'm mistaken
in my interpretation above!

Andrew Browne

ka...@gabi-soft.fr

Nov 24, 2004, 9:09:56 PM
to
Herb Sutter <hsu...@gotw.ca> wrote in message
news:<3ht6q09j6gso7lank...@4ax.com>...
[...]

> Let's say you have a stack frame containing one or more conceptually
> local objects. If any such object should be cleaned up at the end of
> the function (or more local scope), either for performance reasons or
> for correctness reasons. Depending on the language you're using, you
> express that essentially identically as one of the following:

> - in C++, a stack-based object with a nontrivial destructor
> - in C#, a using clause for a Disposable object
> - in Java, the hand-coded Dispose pattern

> In each case, you incur the overhead of an implicit or explicit
> "try/finally" for the first local object that will need the cleanup --
> and it is that try/finally that the people who worry about performance
> are talking about.

Doesn't this depend on the implementation? In C++, the cost of the
"try/finally" is essentially 0 in the implementations I'm familiar
with. Presumably, the try/finally in Java could use a similar
technique, but from what I've seen, they don't.

There is another essential difference: in C++, you have a separate
instance of this block, more or less, for each object which has a
non-trivial destructor. In Java, from what I have seen, it is usual to
only have one, even when several objects are concerned.

Finally, it is important to realize what is meant by "non trivial
destructor" in C++. There are two very large classes of objects which
have non trivial destructors in C++, but would not have them in Java:
- objects which use dynamic memory, and have to free it, and
- base class objects, which will typically have a virtual destructor,
which means a user defined destructor, which means a non-trivial
destructor, even if there is absolutely nothing to do.
Of course, it will probably be fairly rare for an object of the second
category to be on stack. On the other hand, it probably won't be that
rare for the on stack objects in C++ to be smart pointers, whose
destruction will trigger the destruction of another object. Ad
infinitum.
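
To make the two categories concrete -- a minimal sketch with
illustrative classes:

#include <cstddef>

// Non-trivial destructor because it must free dynamic memory:
class Buffer
{
    char* data_;
    Buffer(const Buffer&);             // noncopyable
    Buffer& operator=(const Buffer&);
public:
    explicit Buffer(std::size_t n) : data_(new char[n]) {}
    ~Buffer() { delete[] data_; }
};

// Non-trivial destructor merely because it is virtual, and therefore
// user-defined, even though there is nothing to do:
class Shape
{
public:
    virtual ~Shape() {}
};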

I think that even trying to compare a value oriented language with
manual memory management (like C++) to an everything is a dynamic object
with garbage collection language in this regard is a bit like comparing
apples to oranges. There are so many differences, starting with the
fact that you write programs in a different manner, that about the only
real thing you can do is to take two different implementations of the
same program, or the same program specifications, and compare them. But
even then, you are comparing specific implementations of the language.

> Note, however, that:

> 1. The constructs for expressing this that are essentially the same in
> all languages; the only question is ease of use, and the winners there
> are C++, C#, and Java, in that order. (To be complete, I should
> acknowlege that there are other areas where C# and Java win on ease of
> use, but in this particular case it is C++ that is the simpler
> language.)

There is also a question of robustness. I guess it is sort of linked to
ease of use by the client of the class in question, but the fact is that
if I provide my class with a non-trivial destructor in C++, there is no
way the client code can forget to call it. In Java, it is very easy --
in fact, usual in the Java code I've seen -- to forget to set up a
try/finally block. I'm not familiar with C#, but from what I gather
from this discussion, while it is much easier for the client to set up
the necessary mechanism than it would be in Java, it is still up to the
client to do so.

The critical argument for the C++ idiom (in this particular situation)
is that the client doesn't have to do anything. He gets the clean up
code automatically; in fact, he cannot avoid it.

> 2. Generally it's wrong NOT to write the deterministic destruction
> when objects are conceptually local.

I'm not sure I understand this point. In Java, for example, almost all
of my instances of StringBuffer, and a lot of instances of String, were
conceptually local, i.e. they were used locally for a single job, and
then forgotten. I never used deterministic destruction with them,
however, and I don't think that it was wrong not to.

Globally, I'd say that this is probably true for most value oriented
objects. Of course, in C++, a lot of value oriented objects don't have
a non-trivial destructor; those that do normally only have one for
memory management purposes. So these are objects which wouldn't use the
Dispose pattern in Java. But they are objects which are conceptually
local, unless you mean something else by conceptually local.

> If the object needs to be Dispose'd, you need to Dispose it. So it's
> usually a red herring to say this incurs some potential overhead,
> because to avoid the overhead would be to write an incorrect program
> (and/or a less well-performing one, see below).

Agreed. If the object needs to be Dispose'd, then the time needed to
set up the try/finally block will normally be negligible compared to the
time it needs in the dispose method, as well.

Of course, the real difference between C++ and these other languages is
that a lot more objects in C++ need to be Dispose'd, since this is how
we manage memory.

Again, with the compiler I use, there is NO exception handling cleanup
code which is executed in the normal path.

> Again, it's the same in all language. C++ happens to do turn cleanup
> on by default for stack-based objects and does this optimization to
> automatically avoid the overhead when the cleanup work is trivial.

The only optimization is not calling the non-existent destructor. I
don't call that an optimization.

> So the argument really doesn't boil down to what some people often
> say, namely whether deterministic destruction of conceptually local
> objects is a good thing or not -- clearly it is important, otherwise
> you wouldn't have C++ auto semantics, C# using statements, and Java
> Dispose patterns! The argument really boils down to this: When you do
> need deterministic destruction, you really do need it regardless of
> the language you're using, and to avoid the overhead would be to write
> an incorrect program (and often one with more overhead in other
> places).

> >I can see there being problems with an old ABI that requires each
> >function to register itself as having a cleanup step, but standards
> >can't prevent all stupid implementations. In addition, using garbage
> >collection will remove much of the work done in destructors since
> >most of the resources used in a program tend to be memory.

> The latter is true for trivial destructors.

In C++, a destructor which only does deletes is not trivial. In C# or
in Java, it would not require deterministic disposal -- in fact, one
could argue that it doesn't require it in C++ either, except that
deterministic disposal is the only type of disposal we have.

> In short, finalizers (often but incorrectly called "destructors that
> run at GC time" which they are NOT) are fundamentally flawed and
> extremely complex. (See for example
> http://blogs.msdn.com/cbrumme/archive/2004/02/20/77460.aspx.)

> I have personally come to the conclusion that destructors and GC are
> completely separate and must be kept completely separate.

This is something I've felt for a long time myself as well. On the
other hand, some of the people who know the issues far better than I do
seem to think that calling destructors during garbage collection can be
important, so I'm not sure.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Herb Sutter

Nov 25, 2004, 3:50:28 AM
to
On 24 Nov 2004 14:33:30 -0500, "Andrew Browne"

<clcppm...@this.is.invalid> wrote:
>It seems to me that we really need something like this:-
>
>void F()
>{
> smart_handle<SomeType>
>something(SomeLibraryFunctionReturningAHandle());
>
> // .... some code ...
> // automatic cleanup here
>}

Yes, that's part of the product. It's spelled "auto_handle", which we
might shorten to "auto_hnd".

>and that, just as in Standard C++ we are encouraged to "avoid using bald
>pointers" (e.g. Sutter in August 2004 CUJ), we should avoid using bald
>handles in C++/CLI and use some smart handle equivalent

That's fine in that example. Getting back to the original examples,
stack-allocated automatic objects are useful critters that we use all the
time in C++. Now they work on CLI types too. That's all.

Basically, you are demonstrating nicely that all the C++ techniques and
idioms we're used to with native types really do apply evenly across the
type system also to CLI types. :-) I think that's a good thing.

Herb

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)


Emil Dotchevski

Nov 25, 2004, 3:49:03 AM
to
> The performance problem is real, but optimizable.
> For C++, it's an artifact of trying to retrofit reference
> counts via templates.
> Optimizers need to know about reference counts. Reference
> count updates can potentially be hoisted out of innner loops
> using standard optimization techniques, if the compiler knows
> something about reference counts.

However, note that since shared_ptr and weak_ptr are now in TR1, at some
point in the future they will be part of the C++ standard. As such,
the implementation is free to optimize code based on their
standardized semantics without having to parse or understand any C++
source code that may be defining them. In other words, optimizers
would be free to "know about" reference counts.

(Then again, if compilers start to optimize based on standardized
semantics, shared_ptr and weak_ptr may not necessarily be the best
candidates for this. It would probably be more beneficial to start off
with std::vector instead, I think.)

> If you had optimization of reference counts and subscript
> checking at compile time, you could have the safety of Perl
> and Python with the speed of C/C++.

Subscript checking is a matter of implementation. A conforming C++
compiler could be checking array boundaries at run-time, including in
the case when the arrays are accessed through pointers. Also,
virtually all current implementations do range checking on all STL
containers and iterators, in debug mode.

> This isn't going to happen, because the current C++ committee is
> totally uninterested in language safety. Until some major
> changes are made in the committee, C++ will not become any
> safer, and we'll continue to have very unreliable software
> in that language.

Perhaps you mean to say that they are resisting core language changes
when standard library changes would suffice.

A core language change like the one you suggest would break a lot of
legacy code. Such code may be "unsafe", but if it is working you would
have a hard time convincing its maintainers that they should throw it
away.

--Emil

Gerhard Wesp

Nov 26, 2004, 8:03:46 AM
to
Emil Dotchevski <em...@collectivestudios.com> wrote:
> the case when the arrays are accessed through pointers. Also,
> virtually all current implementations do range checking on all STL
> containers and iterators, in debug mode.

They do?!

I just checked a very widespread open-source implementation, and it
definitely doesn't. I don't know why. I have always asked myself why
the h*** there isn't a simple assert( i < size() ) in
std::vector<>::operator[]( size_type i ). This would already catch *a
lot* of trivial errors and has no performance penalty for release
builds.
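
A sketch of what that might look like (illustrative, not any particular
vendor's implementation):

#include <cassert>
#include <cstddef>

template <typename T>
class checked_vector            // stand-in for a real std::vector
{
    T* data_;
    std::size_t size_;
public:
    std::size_t size() const { return size_; }

    // With NDEBUG defined (release builds) the assert compiles away,
    // so there is no penalty there.
    T& operator[](std::size_t i)
    {
        assert(i < size());
        return data_[i];
    }

    // construction, destruction, growth etc. omitted
};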

Cheers
-Gerhard
--
Gerhard Wesp o o Tel.: +41 (0) 43 5347636
Bachtobelstrasse 56 | http://www.cosy.sbg.ac.at/~gwesp/
CH-8045 Zuerich \_/ See homepage for email address!

Branimir Maksimovic

Nov 26, 2004, 11:26:50 AM
to
ka...@gabi-soft.fr wrote in message
news:<d6652001.04112...@posting.google.com>...

>
> In C++, a destructor which only does deletes is not trivial. In C# or
> in Java, it would not require deterministic disposal -- in fact, one
> could argue that it doesn't require it in C++ either, except that
> deterministic disposal is the only type of disposal we have.
>

Actually, this can be done in C++ too, provided that every class derives
from the same single base class, like Object. I guess C# and Java
references are not simple pointers; they are objects that interact with
the memory manager. So, if such a design is needed, something like a
shared_ptr that does not delete, but instead calls something like
MemManager::instance().dispose(Object* p), could simply notify a manager
running in another thread that the object has no references left and is
free to be deleted at an appropriate time (for example, when the pool of
available memory blocks is exhausted). I would leave the reference
counting job to shared_ptr, though.
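
A sketch of the idea, using shared_ptr's custom deleter hook (MemManager,
dispose and make_managed are illustrative names; the second thread and
the block pool are omitted for brevity):

#include <boost/shared_ptr.hpp>
#include <vector>
#include <cstddef>

struct Object { virtual ~Object() {} };

// Hypothetical manager: collects unreferenced objects and deletes them
// at an appropriate time (here, whenever reclaim() is called).
class MemManager
{
    std::vector<Object*> pending_;
public:
    static MemManager& instance() { static MemManager m; return m; }

    void dispose(Object* p) { pending_.push_back(p); }

    void reclaim()
    {
        for (std::size_t i = 0; i != pending_.size(); ++i)
            delete pending_[i];
        pending_.clear();
    }
};

// Deleter that notifies the manager instead of calling delete.
struct DisposeToManager
{
    void operator()(Object* p) const { MemManager::instance().dispose(p); }
};

boost::shared_ptr<Object> make_managed(Object* raw)
{
    return boost::shared_ptr<Object>(raw, DisposeToManager());
}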

Greetings, Bane.

Herb Sutter

Nov 26, 2004, 6:51:12 PM
to
On 24 Nov 2004 21:09:56 -0500, ka...@gabi-soft.fr wrote:
>> - in C++, a stack-based object with a nontrivial destructor
>> - in C#, a using clause for a Disposable object
>> - in Java, the hand-coded Dispose pattern
>
>> In each case, you incur the overhead of an implicit or explicit
>> "try/finally" for the first local object that will need the cleanup --
>> and it is that try/finally that the people who worry about performance
>> are talking about.
>
>Doesn't this depend on the implementation. In C++, the cost of the
>"try/finally" is essentially 0 in the implementations I'm familiar
>with. Presumably, the try/finally in Java could use a similar
>technique, but from what I've seen, they don't.

I was focusing on the fact that there is a try/finally which is present in
all cases. Separate from that is that there are different EH
implementations with different costs for that try block (VC++ currently
uses both of the popular approaches, and which one is generated depends on
the processor target).

>There is another essential difference: in C++, you have a separate
>instance of this block, more or less, for each object which has a
>non-trivial destructor. In Java, from what I have seen, it is usual to
>only have one, even when several objects are concerned.

Some of that is up to the implementation. At the source code level you can
and do get the same effect with the Java Dispose pattern and C# "using"
with what is the equivalent of nested blocks in C++ (i.e., variables with
shorter/nested lifetimes because of being in shorter/inner blocks).

>I think that even trying to compare a value oriented language with
>manual memory management (like C++) to an everything is a dynamic object
>with garbage collection language in this regard is a bit like comparing
>apples to oranges.

That is at the heart of it, yes, but the more I work in this area the
fewer truly fundamental (i.e., not solvable) differences I find.

For the memory management point: It's probably only a matter of time until
we have standardized GC support for the native heap (which is already
optional) which removes one of the two differences.

For the value oriented language point: Yes, there's a basic difference
between value and reference types -- and C++ already has both of them,
which is what makes it easier for C++ to be extended to platforms like CLI
that mostly only have the latter.

>There is also a question of robustness. I guess it is sort of linked to
>ease of use by the client of the class in question, but the fact is that
>if I provide my class with a non-trivial destructor in C++, there is no
>way the client code can forget to call it.

Yes, I was trying to be kind by only calling that "ease of use" here. :-)
I did in another place go further and point out that C++ is all about
"correctness by default," and point #2 was directed at this.

>In Java, it is very easy --
>in fact, usual in the Java code I've seen -- to forget to set up a
>try/finally block. I'm not familiar with C#, but from what I gather
>from this discussion, while it is much easier for the client to set up
>the necessary mechanism than it would be in Java, it is still up to the
>client to do so.
>
>The critical argument for the C++ idiom (in this particular situation)
>is that the client doesn't have to do anything. He gets the clean up
>code automatically;

Bingo.

>in fact, he cannot avoid it.

Other than by using the heap, of course. Or when the "local" object is
returned from another function, such as a factory, and has to be held by a
smart pointer.

>Of course, the real difference between C++ and these other languages is
>that a lot more objects in C++ need to be Dispose'd, since this is how
>we manage memory.

It turns out that a lot more objects in Java and CLI and C# need to be
Dispose'd than the designers of those environments originally seem to have
thought.

>> In short, finalizers (often but incorrectly called "destructors that
>> run at GC time" which they are NOT) are fundamentally flawed and
>> extremely complex. (See for example
>> http://blogs.msdn.com/cbrumme/archive/2004/02/20/77460.aspx.)
>
>> I have personally come to the conclusion that destructors and GC are
>> completely separate and must be kept completely separate.
>
>This is something I've felt for a long time myself as well. On the
>other hand, some of the people who know the issues far better than I do
>seem to think that calling destructors during garbage collection can be
>important, so I'm not sure.

No, it really is worse than that: It is fundamentally impossible in the
general case to correctly call _destructors_ during GC.

That is why people invented the separate concept of _finalizers_, which do
run at GC time, but finalizers really are not destructors, and they can
only do a subset of the operations that a destructor can do. For example,
finalization has to be unordered (because of cycles) and so in your
finalizer you can never reliably touch another finalizable object because
it might already have been torn down. Clearly many destructors can and do
touch other objects, and so it is impossible to call such destructors
correctly at finalization time. See the blog above for more about the
limitations of finalizers.

The main reason people (including GC experts) believe that running
finalizers at GC time is important is because pretty much all major GC
systems that have ever existed do GC _instead of_ destructors. Absent
destructors, you need to have a last chance to tear key objects down
somehow, and finalizers are that last-chance patch. My belief is that in a
language that has GC _in addition to_ destructors, you probably don't need
or want finalizers at all. Finalizers are extremely problematic, as noted
in the blog above.

Herb

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)


Herb Sutter

Nov 28, 2004, 6:38:33 AM
to
On 26 Nov 2004 18:51:12 -0500, Herb Sutter <hsu...@gotw.ca> wrote:
>That is why people invented the separate concept of _finalizers_, which do
>run at GC time, but finalizers really are not destructors, and they can
>only do a subset of the operations that a destructor can do. For example,
>finalization has to be unordered (because of cycles) and so in your
>finalizer you can never reliably touch another finalizable object because
>it might already have been torn down.

To this I should also add that you can invest effort to arrange things in
such a way that you can be sure that another finalizable object you want
to use hasn't been finalized already when your own finalizer runs, but
such designs typically require handshaking and are very brittle. Not
recommended.

John Nagle

Nov 29, 2004, 6:02:05 AM
to
Gerhard Wesp wrote:
> Emil Dotchevski <em...@collectivestudios.com> wrote:
> > the case when the arrays are accessed through pointers. Also,
> > virtually all current implementations do range checking on all STL
> > containers and iterators, in debug mode.
>
> They do?!

Few STL implementations have serious checking. STLport in debug
mode has strong checking, with iterator validation. But it's rare.


>
> I just checked a very widespread open-source implementation, and it
> definitely doesn't. I don't know why. I was always asking myself why
> the h*** isn't there a simple assert( i < size() ) in
> std::vector<>::operator[]( size_type i ).

That's by design. It's for "consistency with built-in arrays"
and "performance".

(Of course, if C++ compilers knew more about containers, most
subscript checks could be hoisted out of loops, as they were in
some advanced Pascal compilers. But that technology has been
lost to history.)

John Nagle
Animats

Howard Hinnant

Nov 29, 2004, 4:45:56 PM
to
In article <05wqd.26247$zx1....@newssvr13.news.prodigy.com>,
John Nagle <na...@animats.com> wrote:

> Gerhard Wesp wrote:
> > Emil Dotchevski <em...@collectivestudios.com> wrote:
> > > the case when the arrays are accessed through pointers. Also,
> > > virtually all current implementations do range checking on all STL
> > > containers and iterators, in debug mode.
> >
> > They do?!
>
> Few STL implementations have serious checking. STLport in debug
> mode has strong checking, with iterator validation. But it's rare.

Your information is out of date by a couple of years. Most
implementations do have such a debug mode. I can personally speak for
Metrowerks.

-Howard

Seungbeom Kim

Nov 29, 2004, 4:52:43 PM
to
John Nagle wrote:

> Gerhard Wesp wrote:
>
>>I just checked a very widespread open-source implementation, and it
>>definitely doesn't. I don't know why. I was always asking myself why
>>the h*** isn't there a simple assert( i < size() ) in
>>std::vector<>::operator[]( size_type i ).
>
> That's by design. It's for "consistency with built-in arrays"
> and "performance".

With respect to performance, asserts simply become null statements in
release builds, so does it matter? Besides, separate libraries for debug
builds and release builds are not necessary since all the definitions
are in the header files, in most cases. I think the benefit of catching
the errors outweighs the loss of performance in debug builds.

(Well, it could be argued that someone would be sure of having no
out-of-range errors but still want asserts in other cases and want to
leave the asserts in release builds..)

With respect to consistency, no matter whether it is an built-in array
or a vector, as soon as you access an out-of-range element the behaviour
is undefined, so it doesn't matter whether the program continues or
aborts. A sensible program should not depend on what would happen when
it accessed an out-of-range element.
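
For example (illustrative):

#include <vector>
#include <stdexcept>

void f(std::vector<int>& v)     // suppose v.size() == 3
{
    // v[5] = 0;                // undefined behaviour: anything may happen

    try {
        v.at(5) = 0;            // guaranteed: throws std::out_of_range
    }
    catch (const std::out_of_range&) {
        // handle the error deterministically
    }
}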

--
Seungbeom Kim

Gerhard Wesp

Nov 30, 2004, 5:01:10 PM
to
Seungbeom Kim <musi...@bawi.org> wrote:
> (Well, it could be argued that someone would be sure of having no
> out-of-range errors but still want asserts in other cases and want to
> leave the asserts in release builds..)

Then it's easy to define one's own assert() macro a la
``always_assert()''.
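
For instance, a minimal sketch of such a macro (my illustration):

#include <cstdio>
#include <cstdlib>

// Unlike assert(), this is never disabled by NDEBUG.
#define always_assert(cond) \
    ((cond) ? (void)0 \
            : (std::fprintf(stderr, "assertion failed: %s, %s:%d\n", \
                            #cond, __FILE__, __LINE__), \
               std::abort()))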

Cheers
-Gerhard
--
Gerhard Wesp o o Tel.: +41 (0) 43 5347636
Bachtobelstrasse 56 | http://www.cosy.sbg.ac.at/~gwesp/
CH-8045 Zuerich \_/ See homepage for email address!


ka...@gabi-soft.fr

Nov 30, 2004, 6:33:23 PM
to
Seungbeom Kim <musi...@bawi.org> wrote in message
news:<cog1ob$ldq$1...@news.Stanford.EDU>...
> John Nagle wrote:

> > Gerhard Wesp wrote:

> >>I just checked a very widespread open-source implementation, and it
> >>definitely doesn't. I don't know why. I was always asking myself
> >>why the h*** isn't there a simple assert( i < size() ) in
> >>std::vector<>::operator[]( size_type i ).

> > That's by design. It's for "consistency with built-in arrays"
> > and "performance".

> With respect to performance, asserts simply become null statements in
> release builds, so does it matter?

I've yet to deliver any code where asserts have simply become null
statements. Sounds strange to me, sort of like wearing a life jacket for
your lessons in port, and taking it off when you go to sea.

> Besides, separate libraries for debug builds and release builds are
> not necessary since all the definitions are in the header files, in
> most cases.

Which is formally a problem, since if I do define NDEBUG in one module,
but not in another, I have undefined behavior (violation of the
one-definition rule). Of course, I've never seen a compiler where this
is a problem, but if for some reason the compiler doesn't inline the
code (not likely for vector::operator[], but possible for more complex
functions), it's pretty much up in the air whether the function will
have the asserts or not.
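
A minimal sketch of how that happens (illustrative header):

// range.h
#include <cassert>

inline int& element(int* a, int n, int i)
{
    assert(i < n);    // expands to nothing if NDEBUG is defined
    return a[i];
}

// If a.cpp includes range.h with NDEBUG defined and b.cpp includes it
// without, the two inline definitions of element() differ -- formally
// a violation of the one-definition rule, and which version you get in
// a given call is up in the air.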

> I think the benefit of catching the errors outweighs the loss of
> performance in debug builds.

> (Well, it could be argued that someone would be sure of having no
> out-of-range errors but still want asserts in other cases and want to
> leave the asserts in release builds..)

That's the usual practice.

> With respect to consistency, no matter whether it is an built-in array
> or a vector, as soon as you access an out-of-range element the
> behaviour is undefined, so it doesn't matter whether the program
> continues or aborts. A sensible program should not depend on what
> would happen when it accessed an out-of-range element.

Certainly. On the other hand, it's far preferable to know immediately
that the program is broken, rather than for it to continue, and give
wrong results.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Gerhard Wesp

Dec 1, 2004, 5:13:59 PM
to
ka...@gabi-soft.fr wrote:
> I've yet to deliver any code where asserts have simply become null
> statements. Sounds strang to me, sort of like wearing a life jacket for
> your lessons in port, and taking it off when you go to sea.

One might just as well compare it to a child learning to ride its
bicycle using support wheels. When enough confidence is built up, the
support wheels are thrown away and you can go much faster.

See my other posting with the always_assert() macro, though!

> Which is formally a problem, since if I do define NDEBUG in one module,
> but not in another, I have undefined behavior (violation of the

Hmm... Good point. Seems to necessitate separate debug and release
builds for all your dependent modules.

Cheers
-Gerhard
--
Gerhard Wesp o o Tel.: +41 (0) 43 5347636
Bachtobelstrasse 56 | http://www.cosy.sbg.ac.at/~gwesp/
CH-8045 Zuerich \_/ See homepage for email address!


Dave Harris

Dec 1, 2004, 5:59:03 PM
to
hin...@metrowerks.com (Howard Hinnant) wrote (abridged):

> > Few STL implementations have serious checking. STLport in debug
> > mode has strong checking, with iterator validation. But it's rare.
>
> Your information is out of date by a couple of years. Most
> implementations do have such a debug mode. I can personally speak for
> Metrowerks.

Microsoft's VC++ 7.1 doesn't have such a debug mode. I believe 7.1 is
still their current release. I suspect they account for "most"
implementations, at least in the Windows world.

-- Dave Harris, Nottingham, UK


ka...@gabi-soft.fr

Dec 1, 2004, 7:58:17 PM
to
Herb Sutter <hsu...@gotw.ca> wrote in message
news:<r7req09anne3692hu...@4ax.com>...

> On 24 Nov 2004 21:09:56 -0500, ka...@gabi-soft.fr wrote:
> >> - in C++, a stack-based object with a nontrivial destructor
> >> - in C#, a using clause for a Disposable object
> >> - in Java, the hand-coded Dispose pattern

> >> In each case, you incur the overhead of an implicit or explicit