Returning locally created variables

Marcus Lindblom

Jan 5, 1999

Hi!

(newbie c++ question)

I have a function that looks like this:

ndMatrix operator*(ndMatrix &)
{
ndMatrix temp;
....
return ndMatrix;
}

which simply multiplies the two matrices. It works just as it should.

Now, it seems a bit inefficient to create an object, give it some
values, create a copy of it, return the copy and then delete the object.

Wouldn't it be better if I just created the object once and returned
a reference to that object?

When I try that, VC++ 6.0 gives me a warning, and when compiling
with optimization on (Release build), my code fails. This is probably
correct from the compiler's perspective, but not from where I see it! :)

So, is there a solution to this problem or do I have to stick with my code
the way it is?

My C++ book is too basic to include any discussion on such problems.
(I know it inside out by now anyway..)

All help gratefully accepted.
/Marcus

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]

Thiemo Seufer

Jan 5, 1999

Marcus Lindblom wrote in message <36920...@d2o54.telia.com>...
[snip]

>Now, it seems a bit inefficient to create a object, give it some
>values, create a copy of it, return the copy and then delete the object.

A good compiler may optimize the superfluous copy away.

>Wouldn't it be better if I just created the object once and returned
>a reference to that object?
>
>When I try that, VC++ 6.0 gives me a warning and when compiling
>with optimization on, (Release build), my code fails. This is probably
>correct from the compilers perspective, but not from where I see it! :)

Returning a reference or pointer to a non-static local variable is a
well-known bug. The variable is destroyed when the function returns, so
dereferencing the return value is undefined behavior; in practice it
shows up as garbage values or an access violation.
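
To make the failure concrete, here is a minimal sketch. The type Matrix2 and the functions product_by_value and product_by_reference are hypothetical names invented for this illustration; they are not taken from the original poster's code. The broken variant is left commented out so that the sketch itself stays well defined.

```cpp
#include <cassert>

// Hypothetical 2x2 matrix, used only for illustration.
struct Matrix2 {
    double m[2][2];
};

// Safe: the result is returned by value, so the caller owns its own object.
Matrix2 product_by_value(const Matrix2& a, const Matrix2& b) {
    Matrix2 temp = {};  // zero-initialized local
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                temp.m[i][j] += a.m[i][k] * b.m[k][j];
    return temp;  // copied, or constructed in place, into the caller's object
}

// Broken: 'temp' is destroyed when the function returns, so the caller
// would receive a dangling reference. Compilers typically warn about this.
//
// Matrix2& product_by_reference(const Matrix2& a, const Matrix2& b) {
//     Matrix2 temp = {};
//     /* ... same loops ... */
//     return temp;  // undefined behavior as soon as the caller uses it
// }
```

The by-value version may look wasteful, but it is the only one of the two whose result is still alive after the call.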

Thiemo Seufer

Francis Glassborow

Jan 6, 1999

In article <36920...@d2o54.telia.com>, Marcus Lindblom
<marcus.lindblom...@cyberdude.com.no.spam> writes

>Hi!
>
>(newbie c++ question)
>
>I have a function that looks like this:
>
>ndMatrix operator*(ndMatrix &)
>{
> ndMatrix temp;
> ....
> return ndMatrix;
>}
>
>which simply multiplies the two matrices. It works just as it should.
>

>Now, it seems a bit inefficient to create a object, give it some
>values, create a copy of it, return the copy and then delete the object.
>

>Wouldn't it be better if I just created the object once and returned
>a reference to that object?

But where will that object be? Local objects go in memory (usually a
stack) that is recycled when you exit the block (function).

However, C++ allows an intelligent implementation to do what you want as
an optimisation: it constructs temp directly in the space reserved for
the return value, so no copy is needed.

There are also better solutions involving lazy copying and counted
references.

>
>When I try that, VC++ 6.0 gives me a warning and when compiling
>with optimization on, (Release build), my code fails. This is probably
>correct from the compilers perspective, but not from where I see it! :)
>

>So, is there a solution to this problem or do I have to stick with my code
>the way it is?
>
>My C++ book is too basic to include any discussion on such problems.
>(I know it inside out by now anyway..)

Among other texts that may help is 'Secrets of the C++ Masters' by
Alger. Much better content than the title would suggest - it was
reissued recently with a different title, but with very little change
(sad because C++ has moved on a little since it was written)

>
>All help gratefully accepted.
>/Marcus

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Dai Corry

Jan 6, 1999

Marcus Lindblom wrote in message <36920...@d2o54.telia.com>...

>Hi!
>
>(newbie c++ question)
>
>I have a function that looks like this:
>
>ndMatrix operator*(ndMatrix &)
>{
> ndMatrix temp;
> ....
> return ndMatrix;
>}
>

>Now, it seems a bit inefficient to create a object, give it some
>values, create a copy of it, return the copy and then delete the object.
>Wouldn't it be better if I just created the object once and returned
>a reference to that object?

Run, do not walk (virtually if necessary), to the nearest good computer
bookstore and purchase Scott Meyers' book MORE EFFECTIVE C++. Then read the
discussion in Items 19 "Understand the origin of temporary objects" and 20
"Facilitate the return value optimization". (See also Item 23 in his first
book EFFECTIVE C++.)

Basically: yes, it may be inefficient. That's why the standard allows an
aggressively optimizing compiler to build the temporary object in the memory
space where the returned object will end up, saving you the cost of the
double constructor/destructor invocations. Of course, your compiler may not
be sufficiently aggressive, in which case there are at least four possible
workarounds:

1) use a better compiler
2) if you're stuck with a particular vendor's compiler, wait until they
   release a better version
3) pass in a pointer (or reference) to an object that will become the target
   of the return value and let the function build its result there. You'll
   have to manage memory yourself, and you'll lose the syntactic sugar of
   being able to call the function as if it were an operator, but you'll
   have as much control over efficiency as you like
4) don't worry about it

The option I recommend is (4). It galls, I know, but premature
hand-optimization is probably responsible for as many bugs as bad design,
IMHO. Only after you have determined by inspection and measurement that

a) your code does not run fast enough to satisfy the customer's needs, AND
b) profiling shows that this particular "extra" constructor call is the
   culprit (or is at least a significant contributor to the runtime
   burden), AND
c) your compiler does not implement the return value optimization and
   you can't replace it with one that does

should you go to the trouble and expense of -- and burden future maintainers
with -- hand-optimized code. You'll be appalled at how often it ain't worth
it. Honest.

Trust the compiler. (And profile your code.) If you can't do that, write in
assembly language.

Like me. <grin>

Gabriel Dos_Reis

Jan 6, 1999

"Marcus Lindblom" <marcus....@cyberdude.com.no.spam> writes:

> Hi!
>
> (newbie c++ question)
>
> I have a function that looks like this:
>
> ndMatrix operator*(ndMatrix &)
> {
> ndMatrix temp;
> ....
> return ndMatrix;

I assume this should read "return temp;".

> }
>
> which simply multiplies the two matrices. It works just as it should.
>

> Now, it seems a bit inefficient to create a object, give it some
> values, create a copy of it, return the copy and then delete the object.
>
> Wouldn't it be better if I just created the object once and returned
> a reference to that object?
>

No, that is a bad idea.

> When I try that, VC++ 6.0 gives me a warning and when compiling
> with optimization on, (Release build), my code fails. This is probably
> correct from the compilers perspective, but not from where I see it! :)
>
> So, is there a solution to this problem or do I have to stick with my code
> the way it is?

It depends whether your compiler implements the infamous "named return
value optimization." If so then just toggle on the appropriate options and
see what happens.

You might also want to read about expression template techniques or
do a search for past threads in this newsgroup or comp.std.c++ about
"return value optimizations"

>
> My C++ book is too basic to include any discussion on such problems.
> (I know it inside out by now anyway..)
>

May I suggest you take a look at
"The C++ Programming Language", 3rd ed
Bjarne Stroustrup, 1997 Addison Wesley.

It has a section devoted to this question and relatives.

--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr

Stan Brown

Jan 6, 1999

[posted and e-mailed]

In newsgroup comp.lang.c++.moderated, article
<36920...@d2o54.telia.com>, the lovely and talented Marcus Lindblom
(marcus....@cyberdude.com) wrote:

>I have a function that looks like this:
>
>ndMatrix operator*(ndMatrix &)
>{
> ndMatrix temp;
> ....
> return ndMatrix;
>}

^^^^^^^^ Presumably you meant "return temp" here ?

>Now, it seems a bit inefficient to create a object, give it some
>values, create a copy of it, return the copy and then delete the object.
>
>Wouldn't it be better if I just created the object once and returned
>a reference to that object?

Well, yes. But where will you create it? In the code above, temp lives on
the stack, and it is destroyed upon function return. So if you return a
reference to it, you are returning a reference to an object that no
longer exists.

>My C++ book is too basic to include any discussion on such problems.
>(I know it inside out by now anyway..)

Run, don't walk, to the bookstore and buy Scott Meyers' /More Effective
C++/. Items 19 and 20 discuss your query better than I ever could.

--
Stan Brown, Oak Road Systems, Cleveland, Ohio, USA
http://www.concentric.net/%7eBrownsta/
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.

Bill Seymour

Jan 6, 1999

Marcus Lindblom wrote:
>
> Hi!
>
> (newbie c++ question)

>
> I have a function that looks like this:
>
> ndMatrix operator*(ndMatrix &)
> {
> ndMatrix temp;
> ....
> return ndMatrix;
> }
>

> which simply multiplies the two matrices. It works just as it should.
>

> Now, it seems a bit inefficient to create a object, give it some
> values, create a copy of it, return the copy and then delete the object.
>
> Wouldn't it be better if I just created the object once and returned
> a reference to that object?
>

(First, I assume that you meant the operator* above to be a
member function of ndMatrix; otherwise I don't know what you
mean by "the two matrices" unless one of them is the uninitialized
one called temp...a probable crash unless you have a default
constructor.)

Well, to implement a binary operator, you have to make a new object
some time. The usual method, though, is to make the op= operator
a member function which returns a reference to *this, then make
the binary op itself global. Example,

class ndMatrix
{
public:
    // ...
    ndMatrix(const ndMatrix&);              // gotta have copy ctor
    ndMatrix& operator*=(const ndMatrix&);  // do the deed
};

ndMatrix& ndMatrix::operator*=(const ndMatrix& rhs)
{
    // ...
    return *this;
}

ndMatrix operator*(const ndMatrix& lhs, const ndMatrix& rhs)
{
    return ndMatrix(lhs) *= rhs;
}

Compilers are specifically permitted to optimize away the
block scope construction in the operator* function and
construct the new ndMatrix object directly in the memory
where it would be returned. I've heard this called the
"return value optimization." I think I read about it
in Scott Meyers' book, Effective C++.

Note also passing by const reference, not just by
reference; this lets the compiler know that you
won't be modifying the object referred to (or,
even better, lets *you* know, by way of a compiler
error, that you're modifying something that you
shouldn't).
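
For readers who want something they can compile, here is a minimal sketch of the same pattern. Mat2 is a hypothetical fixed-size 2x2 class standing in for ndMatrix, and the accessor at() exists only so the result can be inspected. Unlike the one-line operator* in the example earlier in this post, this variant returns a named local, which is an assumption about style and not a claim about which form any given compiler optimizes better.

```cpp
#include <cassert>

// Hypothetical fixed-size 2x2 matrix standing in for ndMatrix.
class Mat2 {
public:
    Mat2(double a, double b, double c, double d) {
        m_[0][0] = a; m_[0][1] = b;
        m_[1][0] = c; m_[1][1] = d;
    }

    // Compound assignment does the work in place and returns *this.
    Mat2& operator*=(const Mat2& rhs) {
        double r[2][2] = {};  // scratch space so we don't overwrite inputs
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j)
                for (int k = 0; k < 2; ++k)
                    r[i][j] += m_[i][k] * rhs.m_[k][j];
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j)
                m_[i][j] = r[i][j];
        return *this;
    }

    double at(int i, int j) const { return m_[i][j]; }

private:
    double m_[2][2];
};

// Non-member binary operator: copy lhs, modify the copy, return it by value.
// Both operands are const references, so neither can be changed here.
Mat2 operator*(const Mat2& lhs, const Mat2& rhs) {
    Mat2 result(lhs);
    result *= rhs;
    return result;
}
```

With this in place, `Mat2 c = a * b;` leaves a and b untouched, and `a *= b;` is available when overwriting a is acceptable.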

--Bill

Sebastien MARC

Jan 6, 1999

It is a classic question.
First, the argument of operator* should be a const reference, since the
multiplication does not modify it:
ndMatrix c = a*b; // affects neither a nor b

If you do:
ndMatrix& operator*(const ndMatrix &)
{
    ndMatrix temp;
    ....
    return temp;
}
then your reference refers to a stack frame that no longer exists. The
behavior is undefined, and a crash is likely. That's probably what is
happening to you.

You cannot really get rid of the temporary. Some might try:
ndMatrix& operator*(const ndMatrix &)
{
    static ndMatrix temp;
    ....
    return temp;
}
but you are not thread-safe in that case. Moreover, as a good design rule
you should avoid static objects: you don't really know when they are going
to be initialized.

Others might try:
void multiple_matrix(const ndMatrix&, const ndMatrix&, ndMatrix& result)
{
    ...
}

That way no temporary is created, but you lose the beauty of the operator
syntax. This is the most efficient solution, however.
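
A minimal sketch of that out-parameter approach, assuming a hypothetical 2x2 aggregate OutMat and a hypothetical function name multiply_into (neither comes from the original code):

```cpp
#include <cassert>

// Hypothetical 2x2 matrix used only for this sketch.
struct OutMat {
    double m[2][2];
};

// Writes lhs * rhs into 'result'; no matrix object is created or copied.
// Precondition (an assumption of this sketch): 'result' must not be the
// same object as 'lhs' or 'rhs', because it is overwritten while the
// operands are still being read.
void multiply_into(const OutMat& lhs, const OutMat& rhs, OutMat& result) {
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            double sum = 0.0;
            for (int k = 0; k < 2; ++k)
                sum += lhs.m[i][k] * rhs.m[k][j];
            result.m[i][j] = sum;
        }
    }
}
```

The price is visible at the call site: the caller has to declare the result first and write `multiply_into(a, b, c);` where `c = a * b;` would have been natural.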

I guess the best solution is to write something like:
ndMatrix operator*(const ndMatrix &)
{
    ndMatrix temp = *this; // use copy constructor
    ....
    return temp;
}
That way you leave a door open for your compiler to optimize the
initialization of the local object, but you still pay the price of the
temporary when you return (good compilers might eliminate it, but I
would not bet on it).

One last thing: it seems your operator* is inline. I would not make such a
function inline because of code bloat: your function is going to be
expanded at each multiplication. The multiplication itself is going to be
far more expensive than a function call anyway, and the larger executable
can end up swapping, slowing down your application significantly.

Seb.

Marcus Lindblom <marcus....@cyberdude.com.no.spam> wrote in article
<36920...@d2o54.telia.com>...

> Hi!
>
> (newbie c++ question)
>
> I have a function that looks like this:
>
> ndMatrix operator*(ndMatrix &)
> {
> ndMatrix temp;
> ....
> return ndMatrix;
> }
>
> which simply multiplies the two matrices. It works just as it should.
>
> Now, it seems a bit inefficient to create a object, give it some
> values, create a copy of it, return the copy and then delete the object.
>
> Wouldn't it be better if I just created the object once and returned
> a reference to that object?
>

> When I try that, VC++ 6.0 gives me a warning and when compiling
> with optimization on, (Release build), my code fails. This is probably
> correct from the compilers perspective, but not from where I see it! :)
>
> So, is there a solution to this problem or do I have to stick with my
> code the way it is?
>

> My C++ book is too basic to include any discussion on such problems.
> (I know it inside out by now anyway..)
>

> All help gratefully accepted.
> /Marcus
>
>

tav...@connix.com

Jan 6, 1999

In article <36920...@d2o54.telia.com>,

"Marcus Lindblom" <marcus....@cyberdude.com.no.spam> wrote:
> Hi!
>
> (newbie c++ question)
>
> I have a function that looks like this:
>
> ndMatrix operator*(ndMatrix &)
> {
> ndMatrix temp;
> ....
> return ndMatrix;
> }
>
> which simply multiplies the two matrices. It works just as it should.
>
> Now, it seems a bit inefficient to create a object, give it some
> values, create a copy of it, return the copy and then delete the object.
>
> Wouldn't it be better if I just created the object once and returned
> a reference to that object?
>

[... SNIP ...]

Marcus,

Get a copy of "Effective C++" by Scott Meyers (ISBN 0-201-92488-9) and read
Item 23: Don't try to return a reference when you must return an object.
That should explain why you need to construct a new object in the above
situation. Yeah, it may seem inefficient, but nothing else will work, so you
have to just suck up and deal.

This book (and its sequel, called, strangely enough, "More Effective C++")
belongs on every C++ programmer's bookshelf.

-Chris


Siemel Naran

Jan 7, 1999

On 5 Jan 1999 15:45:31 -0500, Marcus Lindblom

>(newbie c++ question)

Not quite.

>I have a function that looks like this:
>
>ndMatrix operator*(ndMatrix &)
>{
> ndMatrix temp;
> ....
> return ndMatrix;
>}
>
>which simply multiplies the two matrices. It works just as it should.
>
>Now, it seems a bit inefficient to create a object, give it some
>values, create a copy of it, return the copy and then delete the object.

When the compiler performs the return value optimization, the above is
maximally efficient. That is, the local variable 'temp' is created directly
in the return value space, and no copy constructor and no destructor
are called at the end of the function. Compilers aren't required
to perform the return value optimization. This implies that a
valid copy constructor and destructor must always be accessible,
even if the optimization is performed and the copy constructor and
destructor are not needed. At present, most compilers do the return
value optimization for unnamed local variables, but not for named
local variables.
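
One way to see what a particular compiler does is to count copy-constructor calls. The sketch below uses a hypothetical type Tracked and two hypothetical functions, make_unnamed and make_named; the counts you observe depend on the compiler, its options, and the language standard in use, so treat it as a measuring tool and not as a statement of what any compiler must do.

```cpp
#include <cassert>

// Hypothetical type that counts how often its copy constructor runs.
struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// Returns an unnamed temporary.
Tracked make_unnamed() {
    return Tracked(42);
}

// Returns a named local variable.
Tracked make_named() {
    Tracked out(42);
    return out;
}
```

Resetting `Tracked::copies` to zero, calling one of the functions, and then reading the counter shows how many copies that call actually cost. A count of zero means the object was built directly in the caller's storage.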

If you are using egcs, you can use named return as a temporary
workaround. Eg,
Object f() return out { Object out; /* stuff */ return out; }
The 'return' statement is not necessary. The problem with named
return is that Object() is required to have a default constructor.
Anyway, it's a temporary workaround.

>Wouldn't it be better if I just created the object once and returned
>a reference to that object?

No. The local object is destroyed when the function returns, so the
reference to it is invalid. If you use static objects, the problem becomes
how many static objects you need, as one may do the following reasonable
thing:
f(a+b,c+d); // need 2 static variables in operator+
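
The aliasing problem with a single static result can be shown in a few lines. This is a sketch with hypothetical names (StaticVec, add_static, same_object); the flaw in add_static is deliberate.

```cpp
#include <cassert>

// Hypothetical two-component value type.
struct StaticVec {
    double x, y;
};

// Deliberately flawed: every call writes into, and returns a reference to,
// the same static object.
const StaticVec& add_static(const StaticVec& a, const StaticVec& b) {
    static StaticVec result;
    result.x = a.x + b.x;
    result.y = a.y + b.y;
    return result;
}

// Plays the role of f(a+b, c+d): receives two "sums" and reports whether
// they are in fact one and the same object.
bool same_object(const StaticVec& first, const StaticVec& second) {
    return &first == &second;
}
```

A call such as `same_object(add_static(a, b), add_static(c, d))` returns true: both arguments refer to the one static object, so whichever sum was computed first has already been overwritten by the time the callee looks at it.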

>When I try that, VC++ 6.0 gives me a warning and when compiling
>with optimization on, (Release build), my code fails. This is probably
>correct from the compilers perspective, but not from where I see it! :)
>
>So, is there a solution to this problem or do I have to stick with my code
>the way it is?

VC6 is right. Just wait until the return value optimization
becomes commonplace, or use operator constructors. Even if
the return value optimization is in effect, operator
constructors may still be a little more efficient as they
save one iteration through a for loop. Eg,

class Matrix
{
public:
    typedef std::size_t size_type;  // assumed definition; needs <cstddef>
    struct MULT { };
    Matrix(const Matrix& lhs, const Matrix& rhs, MULT);
        // involves one iteration through 'lhs' and 'rhs'
    explicit Matrix(size_type N);
};

inline Matrix operator*(const Matrix& lhs, const Matrix& rhs)
{
    return Matrix(lhs,rhs,Matrix::MULT()); // return unnamed local variable
}

Because most compilers do the return value optimization for
unnamed local variables, the return value in operator* will
be created directly in the return space. So no copy ctors
and no dtors will be called, although they are required to
exist.
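
Here is a compilable sketch of the operator-constructor idea for square matrices. TagMatrix, its element accessor operator(), and the storage layout are assumptions made for illustration. Note one difference from the description above: std::vector zero-fills its storage, so this sketch does not reproduce the "single pass over uninitialized space" detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical square matrix with an "operator constructor".
class TagMatrix {
public:
    struct MULT {};  // empty tag type that selects the multiplying constructor

    // Creates a zero-filled n x n matrix.
    explicit TagMatrix(std::size_t n) : n_(n), data_(n * n, 0.0) {}

    // Builds lhs * rhs directly in the object being constructed.
    // Assumes both operands are square and of equal size.
    TagMatrix(const TagMatrix& lhs, const TagMatrix& rhs, MULT)
        : n_(lhs.n_), data_(lhs.n_ * lhs.n_, 0.0) {
        for (std::size_t i = 0; i < n_; ++i)
            for (std::size_t j = 0; j < n_; ++j)
                for (std::size_t k = 0; k < n_; ++k)
                    data_[i * n_ + j] += lhs(i, k) * rhs(k, j);
    }

    double& operator()(std::size_t i, std::size_t j) { return data_[i * n_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * n_ + j]; }

private:
    std::size_t n_;
    std::vector<double> data_;
};

// Returns an unnamed temporary built by the tagged constructor.
inline TagMatrix operator*(const TagMatrix& lhs, const TagMatrix& rhs) {
    return TagMatrix(lhs, rhs, TagMatrix::MULT());
}
```

Because the tag is a distinct type, the compiler picks the constructor by overload resolution, and the empty tag object carries no run-time information.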

You can also implement op* in terms of op*=

class Matrix
{
public:
    typedef std::size_t size_type;  // assumed definition; needs <cstddef>
    explicit Matrix(size_type N);
    Matrix(const Matrix&);
    Matrix& operator*=(const Matrix& that); // returns *this
};

inline Matrix operator*(const Matrix& lhs, const Matrix& rhs)
{
    return Matrix(lhs)*=rhs; // return unnamed local variable
}

This one makes a copy of 'lhs' and then changes the copy in
place through op*=. For unnamed local variables, most
compilers already do the return value optimization, so the
copy is created directly in the return space and changed
there. In any case, there are two traversals through a for
loop -- once in the copy ctor and once in the op*=. So the
operator ctor method is faster as it has only one traversal
through a for loop.

But there's one caveat. The compiler must also be sure that
op*= returns *this. If this is so, then Matrix(lhs) creates
the copy directly in the return space, and op*=(rhs) changes
the copy in place. But the standard does not require op*=
to return *this, although maybe it should require this.
It is possible for op*= to return some other matrix, such as
a static matrix. This means that in general, the optimization
can't be performed.

As it is reasonable to return *this, the compiler could assume
that op*= returns *this and do the optimization. Then at
link time, when the definitions of all the functions are
available, if the linker sees that op*= does not return *this,
it could do an unoptimization. I doubt any compilers do
this yet. In any case, there are still two iterations through
the for loop -- once in the copy constructor and once in the
op*=, so the operator constructor method is likely to be
faster.

One can implement op*= and op* in terms of a single core
function. This is just an extension of the operator constructor
idea. Eg,

class Matrix
{
public:
    struct MULT
    {
        void operator()(Matrix&, const Matrix&, const Matrix&) const;
    };

    Matrix(const Matrix& lhs, const Matrix& rhs, MULT mult)
    {
        // make space, but don't initialize it
        mult(*this,lhs,rhs);
    }

    Matrix& operator*=(const Matrix& that)
    {
        MULT()(*this,*this,that);
        return *this;
    }

    friend inline Matrix operator*(const Matrix& lhs, const Matrix& rhs)
    {
        return Matrix(lhs,rhs,MULT());
    }
};

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

Jamie Hamilton

Jan 8, 1999

Marcus Lindblom wrote in message <36920...@d2o54.telia.com>...
>Hi!
>
>(newbie c++ question)
>

>I have a function that looks like this:
>
>ndMatrix operator*(ndMatrix &)
>{
> ndMatrix temp;
> ....
> return ndMatrix;
>}
>

One practical suggestion that was mentioned but not much explained in
this discussion is that in order to get the best optimizations, you
generally want functions to return an unnamed temporary instead of
declaring a named variable and then returning it. E.g.:
(some minor corrections to your code included)

ndMatrix ndMatrix::operator*(const ndMatrix & matrix) const
{
    return ndMatrix(*this, matrix);
}

in which case you would need a constructor that takes two args
and does the multiplying. Unfortunately, this can be confusing
when you have multiple binary operators, so you might have to
be creative with your constructors and the arguments they take.
But the upside is that you won't get the extra copying in any
decent optimizing compiler.

Gabriel Dos_Reis

Jan 8, 1999

"Jamie Hamilton" <jham...@Radix.Net> writes:

[...]

>
> One practical suggestion that was mentioned but not much explained in
> this discussion is that in order to get the best optimizations, you
> generally
> want functions to return an unnamed temporary instead of declaring a
> named
> variable and then returning it. E.g.:
> (some minor corrections to your code included)
>
> ndMatrix ndMatrix::operator*(const ndMatrix & matrix) const
> {
> return ndMatrix(*this, matrix);
> }
>
> in which case you would need a constructor that takes two args
> and does the multiplying. Unfortunately, this can be confusing
> when you have multiple binary operators, so you might have to
> be creative with your constructors and the arguments they take.

Trying to generalize this approach leads naturally to the expression
template techniques that were mentioned earlier.

--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr


Francis Glassborow

Jan 8, 1999

In article <773gk4$5m7$1...@news1.Radix.Net>, Jamie Hamilton
<jham...@Radix.Net> writes

>One practical suggestion that was mentioned but not much explained in
>this discussion is that in order to get the best optimizations, you
>generally
>want functions to return an unnamed temporary instead of declaring a
>named
>variable and then returning it.

Actually, I remain concerned about the whole trend of this thread.
Matrices are large value based objects and therefore I would expect any
half-competent design to include some form of reference counted lazy
copying.
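
As a rough illustration of reference-counted lazy copying (often called copy-on-write), here is a sketch built on std::shared_ptr, which postdates this thread. CowMatrix and all of its member names are hypothetical, and the sketch makes no attempt to be safe when several threads write at once.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical copy-on-write square matrix.
class CowMatrix {
public:
    explicit CowMatrix(std::size_t n)
        : n_(n), data_(std::make_shared<std::vector<double> >(n * n, 0.0)) {}

    // The implicit copy constructor and assignment copy only the shared_ptr,
    // so copying a CowMatrix is cheap no matter how large the matrix is.

    double get(std::size_t i, std::size_t j) const {
        return (*data_)[i * n_ + j];
    }

    // Writing first makes sure this object has a private buffer.
    void set(std::size_t i, std::size_t j, double v) {
        detach();
        (*data_)[i * n_ + j] = v;
    }

    bool shares_storage_with(const CowMatrix& other) const {
        return data_ == other.data_;
    }

private:
    // Performs the deferred deep copy, but only if the buffer is shared.
    void detach() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<double> >(*data_);
    }

    std::size_t n_;
    std::shared_ptr<std::vector<double> > data_;
};
```

With such a design, returning a matrix by value copies one pointer and bumps a counter; the element-wise copy happens only if someone later modifies a shared matrix.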

>

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation


Siemel Naran

Jan 8, 1999

On 8 Jan 1999 18:21:10 -0500, Francis Glassborow

>Actually, I remain concerned about the whole trend of this thread.
>Matrices are large value based objects and therefore I would expect any
>half-competent design to include some form of reference counted lazy
>copying.

But why bother with this? The return value optimization already
guarantees maximal efficiency in return by value. In a year or
two, most compilers will probably do this optimization. What
about operator constructors? As current compilers do the return
value optimization for unnamed locals already, these operator
constructors are already maximally efficient. Reference counted
objects are good if we want to pass objects by value, and if
we're sure to have many copies of a single object. Reference
counting also presents problems in the face of multi-threading.

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------


Marcus Lindblom

Jan 8, 1999

Jamie Hamilton wrote in message <773gk4$5m7$1...@news1.Radix.Net>...

>Marcus Lindblom wrote in message <36920...@d2o54.telia.com>...

>>(newbie c++ question)

>
>One practical suggestion that was mentioned but not much explained in
>this discussion is that in order to get the best optimizations, you
>generally want functions to return an unnamed temporary instead of
>declaring a named variable and then returning it.

--snip--

>in which case you would need a constructor that takes two args
>and does the multiplying. Unfortunately, this can be confusing
>when you have multiple binary operators, so you might have to
>be creative with your constructors and the arguments they take.

>But the upside is that you won't get the extra copying in any
>decent optimizing compiler.

Hmm.. that was quite clever. Gnn.. I need more practise. ;-)

I suppose using some sort of enumerated variable in the constructor
call would make things a bit more clear, and that way I can use:

ndMatrix::ndMatrix(ndMatrix &m1,ndMatrix &m2,opType op)

for all my operators, right? (using opType as the enum)

/Marcus

Jamie Hamilton

Jan 9, 1999

Francis Glassborow wrote in message ...

>
>Actually, I remain concerned about the whole trend of this thread.
>Matrices are large value based objects and therefore I would expect any
>half-competent design to include some form of reference counted lazy
>copying.
>

Yes for matrix in particular, but I think the question comes up so often
in general that it's worth going over in cases where lazy evaluation
isn't so relevant. Even for small classes, if you do a lot of return
by value, then writing your code to take advantage of the return
value optimization can result in significant speedups.

Also, if you know that you are going to use every value of the matrix,
then lazy eval is just going to add overhead.

Probably it's also worth pointing out that operator *= is the best way to
go in many cases. It doesn't generate any temporaries, and thus should
be used whenever possible. E.g., instead of

a = b + c + d + e;

it's potentially much more efficient to write:

a = b;
a += c;
a += d;
a += e;
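
The difference can be measured by counting constructions. In this sketch, Counted, count_chained and count_in_place are hypothetical names; operator+ is written in terms of operator+= as discussed earlier in the thread. The exact count for the chained form depends on how much copy elision the compiler performs, but each operator+ call has to create at least one new object.

```cpp
#include <cassert>

// Hypothetical value type that counts every object constructed.
struct Counted {
    static int constructed;
    double v;
    explicit Counted(double x) : v(x) { ++constructed; }
    Counted(const Counted& o) : v(o.v) { ++constructed; }
    Counted& operator+=(const Counted& rhs) { v += rhs.v; return *this; }
};
int Counted::constructed = 0;

// Binary + implemented via +=: one new object per call at minimum.
Counted operator+(const Counted& lhs, const Counted& rhs) {
    Counted result(lhs);
    result += rhs;
    return result;
}

// Evaluates a = b + c + d + e and returns how many objects were constructed.
int count_chained(Counted& a, const Counted& b, const Counted& c,
                  const Counted& d, const Counted& e) {
    Counted::constructed = 0;
    a = b + c + d + e;
    return Counted::constructed;
}

// Same result via compound assignment; returns the construction count.
int count_in_place(Counted& a, const Counted& b, const Counted& c,
                   const Counted& d, const Counted& e) {
    Counted::constructed = 0;
    a = b;
    a += c;
    a += d;
    a += e;
    return Counted::constructed;
}
```

The in-place version constructs nothing at all, while the chained version constructs at least three temporaries, one per `+`.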

Gabriel Dos_Reis

Jan 9, 1999

sbn...@localhost.localdomain (Siemel Naran) writes:

> On 8 Jan 1999 18:21:10 -0500, Francis Glassborow
>

> >Actually, I remain concerned about the whole trend of this thread.
> >Matrices are large value based objects and therefore I would expect any
> >half-competent design to include some form of reference counted lazy
> >copying.
>

> But why bother with this? The return value optimization already
> gaurantees maximal efficiency in return by value.

But the Standard does not guarantee the RVO.

> ... In a year or

> two, most compilers will probably do this optimization. What
> about operator constructors?

The RVO has been permitted for a while, yet only a few compilers
implement it.

> ... As current compilers do the return

> value optimization for unnamed locals already, these operator
> constructors are already maximally efficient.

This topic has been discussed over and over. Consider :

struct X {

X& operator += (const X&);
};

X operator+ (const X& a, const X& b) { return X(a) += b; }

Actually operator+ returns whatever X::operator+= returns. A *good*
compiler can effectively optimize that only if it sees the
definition of X::operator+=.

(This remark was raised by John Potter recently).

--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr


Alex Martelli

Jan 9, 1999

Marcus Lindblom wrote in message <36965...@d2o54.telia.com>...
[snip]

>I suppose using some sort of enumerated variable in the constructor
>call would make things a bit more clear, and that way I can use:
>
>ndMatrix::ndMatrix(ndMatrix &m1,ndMatrix &m2,opType op)
>
>for all my operators, right? (using opType as the enum)

A simple implementation of that idea would require runtime
discrimination on the value of op (a switch, or whatever);
using different overloads (different TYPES for opType) lets
the compiler select the right operator/constructor for you
at compile time, a nice and basically-free optimization.

Alex


Francis Glassborow

Jan 9, 1999

In article <slrn79d5uv....@localhost.localdomain>, Siemel Naran
<sbn...@localhost.localdomain> writes

> Reference counted
>objects are good if we want to pass objects by value, and if
>we're sure to have many copies of a single object. Reference
>counting also presents problems in the face of multi-threading.

And since matrices are mathematical objects, passing them by value is
instinctive to those in the problem domain. If the design is for MT
use, then the designer has to work that much harder anyway :)

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation


Siemel Naran

Jan 9, 1999

On 8 Jan 1999 18:38:39 -0500, Marcus Lindblom

>Jamie Hamilton wrote in message <773gk4$5m7$1...@news1.Radix.Net>...

>I suppose using some sort of enumerated variable in the constructor

>call would make things a bit more clear, and that way I can use:
>
>ndMatrix::ndMatrix(ndMatrix &m1,ndMatrix &m2,opType op)
>
>for all my operators, right? (using opType as the enum)

Sure, this works. However, your operator constructor will have
to do a switch-case on 'op'. Unless the switch-case is done at
compile time, this implies a small run-time overhead. But this
time overhead is probably negligible compared to the matrix
operation itself. The code for your operator might look a
little messy because it will have many functions in it. Eg,

ndMatrix::ndMatrix(const ndMatrix& lhs, const ndMatrix& rhs, opType op)
{
    switch (op)
    {
        case MULT: /*stuff*/ break; // first function
        case ADD : /*stuff*/ break; // second function
    }
}

One clear advantage of this method is that you don't have to
hard-code the operator at compile time. So you can do
something like this, where the operator to use is chosen at run
time:

inline ndMatrix something(const ndMatrix& lhs, const ndMatrix& rhs)
{
    return ndMatrix
        (lhs,rhs,
         (time is AM) ? ndMatrix::PLUS : ndMatrix::MINUS // pseudocode condition
        );
}
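
A compilable sketch of run-time selection, using a hypothetical two-component class EnumVec in place of ndMatrix. The bool parameter `add` of the hypothetical function combine stands in for whatever condition is only known at run time.

```cpp
#include <cassert>

// Hypothetical value type whose operator constructor takes an enum.
class EnumVec {
public:
    enum OpType { PLUS, MINUS };

    EnumVec(double x, double y) : x_(x), y_(y) {}

    // Operator constructor: the operation is chosen by a run-time switch.
    EnumVec(const EnumVec& lhs, const EnumVec& rhs, OpType op) : x_(0), y_(0) {
        switch (op) {
        case PLUS:
            x_ = lhs.x_ + rhs.x_;
            y_ = lhs.y_ + rhs.y_;
            break;
        case MINUS:
            x_ = lhs.x_ - rhs.x_;
            y_ = lhs.y_ - rhs.y_;
            break;
        }
    }

    double x() const { return x_; }
    double y() const { return y_; }

private:
    double x_, y_;
};

// The caller decides at run time which operation to apply.
inline EnumVec combine(const EnumVec& lhs, const EnumVec& rhs, bool add) {
    return EnumVec(lhs, rhs, add ? EnumVec::PLUS : EnumVec::MINUS);
}
```

The switch costs a branch per call, which is negligible next to real matrix arithmetic, and in exchange the operation no longer has to be fixed when the code is compiled.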

The other way is to use dummy nested structs instead of enums.
My previous post gave an example of how to do this. This
method allows you to extend the nested structs with extra
variables. But the operator constructor to use must be decided
at compile time. Eg:

class ndMatrix
{
public:
    struct MULT { };
    struct DIV { };
    ndMatrix(const ndMatrix& lhs, const ndMatrix& rhs, MULT);
    ndMatrix(const ndMatrix& lhs, const ndMatrix& rhs, DIV);
};

ndMatrix operator*(const ndMatrix& lhs, const ndMatrix& rhs)
{
    return ndMatrix(lhs,rhs,ndMatrix::MULT());
}

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------


Siemel Naran

Jan 9, 1999

On 9 Jan 1999 08:42:44 -0500, Jamie Hamilton <jham...@Radix.Net> wrote:

>Probably its also worth pointing out that operator *= is the best way to
>go in many cases. It doesn't generate any temporaries, and thus should
>be used whenever possible. E.g., instead of
>
>a = b + c + d + e;
>
>it's potentially much more efficient to write:
>
>a = b;
>a += c;
>a += d;
>a += e;

If your expression calls for this sort of thing a lot, then it is
better to write lazy evaluation objects.

"a+b" creates a Plus2 object.

struct Plus2
{
    Plus2(const Matrix& lhs, const Matrix& rhs)
        : a(lhs), b(rhs) { }

    operator Matrix() const; // actually do the adding

    const Matrix& a;
    const Matrix& b;
};

Then "(a+b)+c" creates a Plus3 object.

struct Plus3
{
    Plus3(const Plus2& lhs, const Matrix& rhs)
        : a(lhs), b(rhs) { }

    operator Matrix() const; // actually do the adding

    const Plus2& a;
    const Matrix& b;
};

This method is better because it will involve only one
iteration through a for loop. However, the operator
conversion function may not use the return value
optimization. Eg,

Plus3::operator Matrix() const
{
    Matrix out(dimensions); // create local 'out'
    for (int i=0; i<N; i++) out[i]=a.a[i]+a.b[i]+b[i];
    return out; // copy local 'out' into return space
}

But once compilers start to do the return value
optimization (one or two years, I hope), then the
variable 'out' will be created directly in the return
space. If we can afford to, we should assume that
compilers already do the optimization, because sooner
or later, they've got to start doing it.
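
Here is a small self-contained sketch of the Plus2/Plus3 idea, under some assumptions of mine: Matrix is a toy wrapper around std::vector<double>, g_passes is a counter added only to make the number of loops visible, and Plus3 keeps its Plus2 by value rather than by reference so that a stored Plus3 cannot refer to a dead temporary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy matrix type, assumed for illustration only.
struct Matrix
{
    explicit Matrix(std::size_t n, double fill = 0.0) : v(n, fill) { }
    double operator[](std::size_t i) const { return v[i]; }
    double& operator[](std::size_t i) { return v[i]; }
    std::size_t size() const { return v.size(); }
    std::vector<double> v;
};

int g_passes = 0;  // counts element-wise loops, for illustration only

struct Plus2
{
    Plus2(const Matrix& lhs, const Matrix& rhs) : a(lhs), b(rhs) { }

    operator Matrix() const  // actually do the adding
    {
        Matrix out(a.size());
        ++g_passes;
        for (std::size_t i = 0; i < out.size(); ++i) out[i] = a[i] + b[i];
        return out;
    }

    const Matrix& a;
    const Matrix& b;
};

struct Plus3
{
    Plus3(const Plus2& lhs, const Matrix& rhs) : a(lhs), b(rhs) { }

    operator Matrix() const  // one loop adds all three operands
    {
        Matrix out(b.size());
        ++g_passes;
        for (std::size_t i = 0; i < out.size(); ++i)
            out[i] = a.a[i] + a.b[i] + b[i];
        return out;
    }

    Plus2 a;          // two references, cheap to copy
    const Matrix& b;
};

inline Plus2 operator+(const Matrix& l, const Matrix& r) { return Plus2(l, r); }
inline Plus3 operator+(const Plus2& l, const Matrix& r) { return Plus3(l, r); }
```

With these definitions, "Matrix r = a + b + c;" builds a Plus2, then a Plus3, and runs exactly one loop when the Plus3 is converted to Matrix.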

One may also use templates:
template <size_t N> struct Plus;
Now class Plus has a member function operator[] that calculates
the i'th element.

template <size_t N>
struct Plus
{
    Plus(const Plus<N-1>& lhs, const Matrix& rhs)
        : a(lhs), b(rhs) { }

    double operator[](size_t i) const;
    // return by value, not const_reference!
    // return a[i]+b[i]

    operator Matrix() const;
    // call this->operator[](i) once per element
    // return result

    const Plus<N-1> a;
    const Matrix& b;
};
Siemel Naran

Jan 9, 1999

On 9 Jan 1999 08:57:58 -0500, Gabriel Dos_Reis <gdos...@sophia.inria.fr> wrote:
>sbn...@localhost.localdomain (Siemel Naran) writes:

>> ... As current compilers do the return
>> value optimization for unnamed locals already, these operator
>> constructors are already maximally efficient.

>This topic has been discussed over and over. Consider :
>
> struct X {
>
> X& operator += (const X&);
> };
>
> X operator+ (const X& a, const X& b) { return X(a) += b; }
>
>Actually operator+ returns whatever X::operator+= returns. A *good*
>compiler can effectively optimize that only if it sees the
>definitions of X::operator+=.
>
>(This remark was raised by John Potter recently).

I don't think you read my full post. I said that operator
constructors, which are 3-arg constructors, are more
efficient because they will employ only one iteration
through a for loop. By contrast, X(a) entails one
iteration through a for loop, and += entails a second
iteration through the for loop. So operator constructors
already save one iteration through a for loop. Plus,
as most compilers already do the return value optimization
for unnamed locals, these operator constructors are
already maximally efficient. For example code, see this post:
    From sbn...@localhost.localdomain Sat Jan 9 15:03:43 1999
    Newsgroups: comp.lang.c++
    Subject: Re: [Q}: destructor called twice when object returned from function?

And as for the possibility that operator*= does not return
*this, this was mentioned in my previous post too. After
all, operator*= may return some unrelated static variable.
Incidentally, this point was brought up even earlier by
Christopher Eltschka. My post said that it is reasonable
to assume that operator*= returns *this. Thus the
compiler may do the full optimization (even though the
function is not really inline and the compiler doesn't
know that it returns *this). Then at link time, when
the definitions of all functions are available, if the
compiler sees that operator*= does not return *this, it
can do the un-optimization. For sure, it's an esoteric
optimization, and I don't think any compilers do it.
But it is possible and reasonable. Still, operator
constructors are even faster.
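
To make the loop counting concrete, here is a rough sketch. The struct X, its Add tag, the hand-written copy constructor and the g_loops counter are all assumptions of mine, added only so the loops can be counted. Note that "return X(a) += b;" returns through a reference, so the result has to be copied into the return space as well.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

int g_loops = 0;  // counts element-wise loops, for illustration only

struct X
{
    struct Add { };  // hypothetical tag selecting the operator constructor

    explicit X(std::size_t n, double fill = 0.0) : v(n, fill) { }

    // Copy constructor written out by hand so that its loop is counted.
    X(const X& other) : v(other.v.size())
    {
        ++g_loops;
        for (std::size_t i = 0; i < v.size(); ++i) v[i] = other.v[i];
    }

    // Operator constructor: a single loop produces the sum directly.
    X(const X& a, const X& b, Add) : v(a.v.size())
    {
        ++g_loops;
        for (std::size_t i = 0; i < v.size(); ++i) v[i] = a.v[i] + b.v[i];
    }

    X& operator+=(const X& rhs)
    {
        ++g_loops;
        for (std::size_t i = 0; i < v.size(); ++i) v[i] += rhs.v[i];
        return *this;
    }

    std::vector<double> v;
};

// Style 1: copy the left operand, add in place, then copy the result out
// (the returned expression is an X&, so that last copy cannot be elided).
inline X add_via_compound(const X& a, const X& b) { return X(a) += b; }

// Style 2: build the result directly with the operator constructor.
inline X add_via_constructor(const X& a, const X& b)
{
    return X(a, b, X::Add());
}
```

How many copies the compiler removes in style 2 depends on the compiler, but style 1 always needs at least three loops, and style 2 always needs fewer than style 1.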

Martijn Lievaart

Jan 10, 1999

[ comp.std.c++ added to newsgroups, as my question is about what the
standard allows]

Jamie Hamilton wrote in message <773gk4$5m7$1...@news1.Radix.Net>...
>

>Marcus Lindblom wrote in message <36920...@d2o54.telia.com>...
>>Hi!
>>
>>(newbie c++ question)
>>
>>I have a function that looks like this:
>>
>>ndMatrix operator*(ndMatrix &)
>>{
>> ndMatrix temp;
>> ....
>> return ndMatrix;
>>}
>>
>
>

>One practical suggestion that was mentioned but not much explained in
>this discussion is that in order to get the best optimizations, you
>generally
>want functions to return an unnamed temporary instead of declaring a
>named

>variable and then returning it. E.g.:
>(some minor corrections to your code included)
>
>ndMatrix ndMatrix::operator*(const ndMatrix & matrix) const
>{
> return ndMatrix(*this, matrix);
>}
>

>in which case you would need a constructor that takes two args
>and does the multiplying. Unfortunately, this can be confusing
>when you have multiple binary operators, so you might have to
>be creative with your constructors and the arguments they take.
>But the upside is that you won't get the extra copying in any
>decent optimizing compiler.
>

I was under the impression that the standard now disallows this
optimization (the unnamed return value optimization), although it used
to be a common one (cfront). Instead, the compiler is now allowed to
optimize when there is a named variable (the named return value
optimization).

Am I correct or is Jamie correct?

Thanks in advance,
Martijn
--
My reply-to address is intentionally set to /dev/null,
you can reach me at mlievaart at orion in nl

Jamie Hamilton

Jan 10, 1999

This is much harder to do correctly than your code implies.
A Plus2 object holds references to two other matrices (a and b).
What happens if those matrices are changed by some other operation?
In that case, the Plus2 object has to do the calculation and store
the result before a and b can be changed. So *every* non-const matrix
function must check whether a Plus2 object is holding a reference to
this, and trigger the relevant calculation. That's a lot of extra
code, and a high potential for mistakes.
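
The situation can be reproduced with a short sketch. The Matrix type here is a toy of my own, and Plus2 follows the quoted struct. A Plus2 that is stored and converted later sees whatever its operands contain at conversion time, not what they contained when the Plus2 was created.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy matrix type, assumed for illustration only.
struct Matrix
{
    explicit Matrix(std::size_t n, double fill = 0.0) : v(n, fill) { }
    std::vector<double> v;
};

struct Plus2
{
    Plus2(const Matrix& lhs, const Matrix& rhs) : a(lhs), b(rhs) { }

    operator Matrix() const  // the addition happens here, not in operator+
    {
        Matrix out(a.v.size());
        for (std::size_t i = 0; i < out.v.size(); ++i)
            out.v[i] = a.v[i] + b.v[i];
        return out;
    }

    const Matrix& a;
    const Matrix& b;
};

inline Plus2 operator+(const Matrix& l, const Matrix& r) { return Plus2(l, r); }
```

For example, after "Plus2 pending = a + b;" followed by a change to an element of a, converting pending to Matrix uses the changed element.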

Siemel Naran wrote in message ...

>If your expression calls for this sort of thing a lot, then it is
>better to write lazy evaluation objects.
>
>"a+b" creates a Plus2 object.
>
>struct Plus2
>{
> Plus2(const Matrix& lhs, const Matrix& rhs)
> : a(lhs), b(rhs) { }
>
> operator Matrix() const; // actually do the adding
>
> const Matrix& a;
> const Matrix& b;
>};
>
>

Boris Schaefer

Jan 11, 1999

"Sebastien MARC" <sm...@sailfish.com> writes:

| You cannot really get rid of the temporary. Some might try:
| ndMatrix& operator*(const ndMatrix &)
| {
| static ndMatrix ndMatrix;
| ....
| return ndMatrix;
| }
| but you are not thread-safe is that case. Moreover as a good design rule
| you should avoid static objects: You don't know when they are going to
| be
| initialized really.

I don't think this is entirely true. Static objects in functions are
initialized upon the first call of the function. When GLOBAL static
objects are initialized is implementation-defined (i.e. unknown to you
and me). Having said this, it's still a bad idea to use a
static, not just because of thread safety, but because expressions
like

ndMatrix m, a, b, c;
m = a * b * c;

will yield wrong results.
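
A small sketch of why this goes wrong, using a 2x2 toy type Mat2 and helper names (make, same, mul_by_value) that are my own. In "a * b * c" the second call receives the static result as its left operand and overwrites it while still reading from it.

```cpp
#include <cassert>

struct Mat2
{
    double v[2][2];
};

inline Mat2 make(double a, double b, double c, double d)
{
    Mat2 m;
    m.v[0][0] = a; m.v[0][1] = b;
    m.v[1][0] = c; m.v[1][1] = d;
    return m;
}

inline bool same(const Mat2& x, const Mat2& y)
{
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            if (x.v[i][j] != y.v[i][j]) return false;
    return true;
}

// Reference version: returns a fresh object by value.
inline Mat2 mul_by_value(const Mat2& x, const Mat2& y)
{
    Mat2 r;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
            r.v[i][j] = 0.0;
            for (int k = 0; k < 2; ++k)
                r.v[i][j] += x.v[i][k] * y.v[k][j];
        }
    return r;
}

// Broken version: every call writes into, and returns, the same static.
inline const Mat2& operator*(const Mat2& x, const Mat2& y)
{
    static Mat2 result;  // shared by every call
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
            result.v[i][j] = 0.0;  // may clobber x if x is 'result'
            for (int k = 0; k < 2; ++k)
                result.v[i][j] += x.v[i][k] * y.v[k][j];
        }
    return result;
}
```

With a = (1 2; 3 4), b the identity and c = (0 1; 1 0), the by-value version gives (2 1; 4 3) for a*b*c, while the static version produces a different matrix.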

| Others might try:
| void multiple_matrix(const ndMatrix&, const ndMatrix&, ndMatrix&
| result);
| {
| ...
| }
|
| That way not temporary is created but you loose the beauty of the
| operator=. This is the most efficient solution however.

You can get the beauty of operator* and the speed of multiply_matrix
with expression templates. You get even more, because you probably
don't want to write a function for every matrix expression you use.

If you only define mul

e.g.

For something like:

void foo()
{
    matrix m, a, b, c, d;
    // initialize a,b,c,d to some values

    m = a * b + c * d;
}

you would either have to write:

void mul(const matrix& m1, const matrix& m2, matrix& result)
{
    // multiply m1 and m2 and assign it to result
}

void add(const matrix& m1, const matrix& m2, matrix& result)
{
    // add m1 and m2 and assign it to result
}

void foo()
{
    matrix m, a, b, c, d, res1, res2;
    // initialize a,b,c,d to some values

    mul(a, b, res1);
    mul(c, d, res2);
    add(res1, res2, m);
}

or this:

void mul_add_mul(const matrix& a, const matrix& b,
                 const matrix& c, const matrix& d, matrix& m)
{
    // multiply a by b and c by d,
    // add the results and assign them to m
}

void foo()
{
    matrix m, a, b, c, d;
    // initialize a,b,c,d to some values

    mul_add_mul(a, b, c, d, m);
}

The first of these two solutions also creates a load of temporaries
(res1 and res2 in the example) for longer expressions; they just
aren't compiler-generated temporaries, but user-generated ones.

The second approach is fast, doesn't create temporaries, but is
cumbersome, if you have many different matrix expressions. You would
have to write one function for every type of expression you use.

That's where expression templates come in. They can be fast, create
only one temporary (no matter how long the expression) and are easy to
use, because you can simply write

void foo() {
    matrix m, a, b, c, d;
    // initialize a,b,c,d to some values

    m = a * b + c * d;
}

Basically, in your class matrix you have an

    operator=(matrix_expr expr)

that assigns the expression to the matrix element by element, via
operator[](int) or, as I do, operator()(int,int) (it's easier to
transpose matrices this way because you only have to interchange
indices).

I give here the shortest (useful in this context) example that adds
matrices --- matrix multiplication is more complicated, because it
involves dot product of rows and columns. element-wise multiplication
is essentially the same thing as addition (from the code's POV, not
mathematically speaking).

template < typename E1, typename E2, int M, int N >
class matrix_addexpr
{
private:
    E1 e1_;
    E2 e2_;
public:
    // this choice is fairly arbitrary; more sophistication
    // might well be appropriate
    typedef typename E1::value_type value_type;

    matrix_addexpr(const E1& e1, const E2& e2) : e1_(e1), e2_(e2) {}
    value_type operator()(int i, int j) const { return e1_(i, j) + e2_(i, j); }
};

template < typename T, int ROWS, int COLS >
class matrix
{
public:
    typedef T value_type;
    typedef matrix<T,ROWS,COLS> self_type;

    // needed for "matrix m;" now that other constructors are declared
    matrix() {}

    template < typename E1, typename E2 >
    matrix(const matrix_addexpr<E1,E2,ROWS,COLS>& expr) {
        //
        // expr(i, j) calls matrix_addexpr::operator()(int, int)
        // which evaluates the expression
        //
        for(int i = 0; i < ROWS; i++)
            for(int j = 0; j < COLS; j++)
                data_[i * COLS + j] = expr(i, j);
    }

    value_type& operator()(int i, int j)       { return data_[i * COLS + j]; }
    value_type  operator()(int i, int j) const { return data_[i * COLS + j]; }

    template < typename E1, typename E2 >
    self_type& operator=(matrix_addexpr<E1,E2,ROWS,COLS> expr) {

        // create temporary to hold the evaluated expression so that
        // assignment doesn't accidentally change a value that would be
        // referenced later in the expression like it would happen in:
        // m = m * a;

        self_type tmp(expr);

        for(int i = 0; i < ROWS*COLS; i++)
            data_[i] = tmp.data_[i];

        return *this;
    }

private:
    value_type data_[ROWS*COLS];
};

template < typename T1, typename T2, int M, int N >
inline
matrix_addexpr < matrix <T1,M,N>,
                 matrix <T2,M,N>, M, N >
operator+(const matrix<T1,M,N>& v1,
          const matrix<T2,M,N>& v2)
{
    typedef matrix_addexpr< matrix <T1,M,N>,
                            matrix <T2,M,N>, M, N > expr_type;

    return expr_type(v1, v2);
}

// you need three more definitions for all this to be generally useful:
// operator+(matrix, matrix_addexpr)
// operator+(matrix_addexpr, matrix)
// operator+(matrix_addexpr, matrix_addexpr)

I think this code should compile, but I'm not sure, since I've taken
this from a small matrix library I wrote (with many ideas shamelessly
stolen from blitz++), edited it heavily for length and then didn't try
to compile it again.
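
For readers who want something they can paste into a compiler, here is a cut-down variant of the same idea. It is a sketch under my own assumptions, not the library code: the element type is fixed to double, the names add_expr and matrix<R, C> are made up for this example, and the expression nodes hold references to their operands instead of copies, so an add_expr must not outlive the statement that created it.

```cpp
#include <cassert>
#include <cstddef>

// Expression node for "e1 + e2" on R x C operands. It only references its
// operands; nothing is added until an element is requested.
template <typename E1, typename E2, std::size_t R, std::size_t C>
class add_expr
{
public:
    add_expr(const E1& e1, const E2& e2) : e1_(e1), e2_(e2) { }
    double operator()(std::size_t i, std::size_t j) const
    {
        return e1_(i, j) + e2_(i, j);
    }
private:
    const E1& e1_;
    const E2& e2_;
};

template <std::size_t R, std::size_t C>
class matrix
{
public:
    explicit matrix(double fill = 0.0)
    {
        for (std::size_t k = 0; k < R * C; ++k) data_[k] = fill;
    }

    double& operator()(std::size_t i, std::size_t j) { return data_[i * C + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * C + j]; }

    // Evaluates the whole expression in one pass over the elements.
    template <typename E1, typename E2>
    matrix& operator=(const add_expr<E1, E2, R, C>& expr)
    {
        matrix tmp;  // the only temporary, however long the expression is
        for (std::size_t i = 0; i < R; ++i)
            for (std::size_t j = 0; j < C; ++j)
                tmp(i, j) = expr(i, j);
        for (std::size_t k = 0; k < R * C; ++k) data_[k] = tmp.data_[k];
        return *this;
    }

private:
    double data_[R * C];
};

// One operator+ per operand combination.
template <std::size_t R, std::size_t C>
inline add_expr<matrix<R, C>, matrix<R, C>, R, C>
operator+(const matrix<R, C>& a, const matrix<R, C>& b)
{
    return add_expr<matrix<R, C>, matrix<R, C>, R, C>(a, b);
}

template <typename A, typename B, std::size_t R, std::size_t C>
inline add_expr<add_expr<A, B, R, C>, matrix<R, C>, R, C>
operator+(const add_expr<A, B, R, C>& a, const matrix<R, C>& b)
{
    return add_expr<add_expr<A, B, R, C>, matrix<R, C>, R, C>(a, b);
}

template <typename A, typename B, std::size_t R, std::size_t C>
inline add_expr<matrix<R, C>, add_expr<A, B, R, C>, R, C>
operator+(const matrix<R, C>& a, const add_expr<A, B, R, C>& b)
{
    return add_expr<matrix<R, C>, add_expr<A, B, R, C>, R, C>(a, b);
}

template <typename A, typename B, typename A2, typename B2,
          std::size_t R, std::size_t C>
inline add_expr<add_expr<A, B, R, C>, add_expr<A2, B2, R, C>, R, C>
operator+(const add_expr<A, B, R, C>& a, const add_expr<A2, B2, R, C>& b)
{
    return add_expr<add_expr<A, B, R, C>, add_expr<A2, B2, R, C>, R, C>(a, b);
}
```

With this, "m = a + b + c + d;" on matrix<2, 2> objects builds three nested add_expr nodes and fills m in a single pass over the elements.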

The drawback of all this is, that the code is hard to read, takes long
to write and is annoying to debug because of unreadable LONG error
messages.

The end result is nice though.

--
Boris Schaefer -- s...@psy.med.uni-muenchen.de

You cannot kill time without injuring eternity.

Jim Gewin

Jan 11, 1999

Martijn Lievaart wrote:
> Jamie Hamilton wrote in message <773gk4$5m7$1...@news1.Radix.Net>...
> >

> >ndMatrix ndMatrix::operator*(const ndMatrix & matrix) const
> >{
> > return ndMatrix(*this, matrix);
> >}

[...]

> >But the upside is that you won't get the extra copying in any
> >decent optimizing compiler.
> >
>
> I was under the impression that the standard disallowed this
> optimization
> now (unnamed returnvalue optimization), although it used to be a common
> optimization (cfront). Instead the compiler is now allowed to optimize
> when
> there is a named variabele (named returnvalue optimization).
>
> Am I correct or is Jamie correct?
>
> Thanks in advance,
> Martijn

It appears that you are correct, and the returned
object must have a name:

from 12.8 [class.copy]

For a function call with a class return type, if the
expression in the return statement is the name of a
local object, and the cv-unqualified type of the local
object is the same as the function return type, an
implementation is permitted to omit creating the
temporary object to hold the function return value,
even if the class copy constructor or destructor has
side effects. In these cases, the object is destroyed
at the later of times when the original and the copy
would have been destroyed without the optimization.

Jim
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://reality.sgi.com/austern_mti/std-c++/faq.html ]

Boris Schaefer

Jan 11, 1999

sbn...@localhost.localdomain (Siemel Naran) writes:

| On 8 Jan 1999 18:38:39 -0500, Marcus Lindblom

| >Jamie Hamilton wrote in message <773gk4$5m7$1...@news1.Radix.Net>...
|

| >I suppose using some sort of enumerated variable in the constructor
| >call would make things a bit more clear, and that way I can use:
| >
| >ndMatrix::ndMatrix(ndMatrix &m1,ndMatrix &m2,opType op)
| >
| >for all my operators, right? (using opType as the enum)
|
| Sure, this works. However, your operator constructor will have
| to do a switch-case on 'op'. Unless the switch-case is done at
| compile time, this implies a small run-time overhead. But this
| time overhead is probably negligible compared to the matrix
| operation itself. The code for your operator might look a
| little messy because it will have many functions in it. Eg,
|
| ndMatrix::ndMatrix(const ndMatrix& lhs, const ndMatrix& rhs, opType op)
| {
| switch (op)
| {
| case MULT: /*stuff*/ break; // first function
| case ADD : /*stuff*/ break; // second function
| }
| }

Another way, avoiding the switch, would be to use templates:

class add
{
public:
    static inline int eval(int a, int b) { return a + b; }
};

class sub
{
public:
    static inline int eval(int a, int b) { return a - b; }
};

// must also be declared as a member template inside class ndMatrix
template < typename OP >
ndMatrix::ndMatrix(const ndMatrix& lhs, const ndMatrix& rhs, OP op)
{
    for(int i = 0; i < rows*cols; i++)
        data_[i] = OP::eval(lhs[i], rhs[i]);
}

ndMatrix operator+(const ndMatrix& lhs, const ndMatrix& rhs)
{
    return ndMatrix(lhs, rhs, add());
}

this doesn't scale well to matrix multiplications.
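
A compilable sketch of the same policy idea, with assumptions of mine: ndMatrix is reduced to a flat std::vector<double>, the policies work on double, and both operands are taken to have the same size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Policy types: each supplies a static eval() that combines two elements.
struct add { static double eval(double a, double b) { return a + b; } };
struct sub { static double eval(double a, double b) { return a - b; } };

// Hypothetical stand-in for ndMatrix.
class ndMatrix
{
public:
    explicit ndMatrix(std::size_t n, double fill = 0.0) : data_(n, fill) { }

    // The policy argument is only used to deduce OP; which eval() gets
    // called is settled at compile time, so no switch is needed.
    template <typename OP>
    ndMatrix(const ndMatrix& lhs, const ndMatrix& rhs, OP)
        : data_(lhs.data_.size())
    {
        for (std::size_t i = 0; i < data_.size(); ++i)
            data_[i] = OP::eval(lhs.data_[i], rhs.data_[i]);
    }

    double operator[](std::size_t i) const { return data_[i]; }

private:
    std::vector<double> data_;
};

inline ndMatrix operator+(const ndMatrix& lhs, const ndMatrix& rhs)
{
    return ndMatrix(lhs, rhs, add());
}

inline ndMatrix operator-(const ndMatrix& lhs, const ndMatrix& rhs)
{
    return ndMatrix(lhs, rhs, sub());
}
```

The limitation is visible in the loop: eval() only ever sees one element from each operand, which is fine for element-wise operations but not for a row-by-column product.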

--
Boris Schaefer -- s...@psy.med.uni-muenchen.de

Nothing cures insomnia like the realization that it's time to get up.

Jamie Hamilton

Jan 11, 1999

Jim Gewin wrote in message <369939D1...@worldnet.att.net>...
>Martijn Lievaart wrote:

>from 12.8 [class.copy]
>
>For a function call with a class return type, if the
>expression in the return statement is the name of a
>local object, and the cv-unqualified type of the local
>object is the same as the function return type, an
>implementation is permitted to omit creating the
>temporary object to hold the function return value,
>even if the class copy constructor or destructor has
>side effects. In these cases, the object is destroyed
>at the later of times when the original and the copy
>would have been destroyed without the optimization.
>

Does this forbid optimization in the unnamed case or
merely make it easier in the named case? It doesn't seem
to say anything about what happens when the copy constructor
and destructor don't have side effects.

I may well be out of date here, but it certainly seems
like the unnamed case would be easier to
implement than the named case. I can't see why
you would permit the named and not the unnamed.
Please correct this impression if it's wrong.


Siemel Naran

Jan 12, 1999

On 11 Jan 1999 08:15:29 -0500, Boris Schaefer

>The first of these two solutions also creates a load of temporaries
>(res1 and res2 in the example) for longer expressions, they just
>aren't compiler generated tempoaries, but user generated ones.
>
>The second approach is fast, doesn't create temporaries, but is
>cumbersome, if you have many different matrix expressions. You would
>have to write one function for every type of expression you use.

In addition, the second approach has many iterations through for
loops, so your lazy evaluation method is still faster as in some
cases it will get by with one iteration through a for loop.

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------


Ian McCulloch

Jan 12, 1999

Jamie Hamilton wrote in message <77bjin$502$1...@news1.Radix.Net>...

>This is much harder to do correctly than you imply from
>your code. A Plus2 object has references to two other
>matrices(a and b). What happens if those matrices are changed
>by some other operation? In this case, the Plus2 object
>has to do the calculation and store the result before
>a and b can be changed. So *every* non-const matrix function
>must check to see if a Plus2 object is holding a reference to
>this, and induce the relevant calculation. That's a lot of extra
>code, and a high potential for mistakes.

I suspect that is not a problem in practice. If you have a persistent
expression (i.e. one that is not just used as a temporary in an expression
evaluation) then you WANT the result of the expression to depend on the
current value of the operands at the time the expression is *evaluated*,
not the values at the time the expression is *created*. A more pathological
problem is what happens if a mutating function is called WITHIN an
expression. I guess the simple answer to that is to not have any mutating
functions that you would want to use within an expression, but it's
unavoidable sometimes. Anyway, is this problem any different from an
ordinary expression where some mutating functions are used on the right
hand side (e.g. int b = a + a++)? I haven't thought about this much, but
are there any expressions where the C++ expression evaluator produces
well-defined results, but Siemel's example doesn't?

Cheers,
Ian McCulloch

Gabriel Dos_Reis

Jan 12, 1999

"Jamie Hamilton" <jham...@Radix.Net> writes:

[...]

>

> I may well be out of date here, but it certainly seems like the
> unnamed case would be easier to implement than the named case. I can't
> see why you would permit the named and not the unnamed. Please
> correct this impression if it's wrong.

Actually, in my experience the unnamed return value optimization
is easier to implement than the NRVO.
And the Standard explicitly allows the URVO through the same
paragraph that allows the NRVO.

--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr

---

Siemel Naran

Jan 12, 1999

On 11 Jan 1999 23:24:15 GMT, Jamie Hamilton <jham...@Radix.Net> wrote:
>Jim Gewin wrote in message <369939D1...@worldnet.att.net>...

>>from 12.8 [class.copy]

>>
>>For a function call with a class return type, if the
>>expression in the return statement is the name of a
>>local object, and the cv-unqualified type of the local
>>object is the same as the function return type, an
>>implementation is permitted to omit creating the
>>temporary object to hold the function return value,
>>even if the class copy constructor or destructor has
>>side effects. In these cases, the object is destroyed
>>at the later of times when the original and the copy
>>would have been destroyed without the optimization.

>Does this forbid optimization in the unnamed case or
>merely make it easier in the named case? It doesn't seem
>to say anything about what happens when the copy constructor
>and destructor don't have side effects.

From common sense, as the following expression
{ Thing t(1,2); return t; } // return named local
is conceptually equivalent to
{ return Thing(1,2); } // return unnamed local
the return value optimization should apply equally to the
named and unnamed cases.
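
One way to see what a particular compiler does is to count copy constructor calls. The sketch below is only an experiment: the static copies counter and the function names make_named and make_unnamed are my own additions, and how many copies actually get removed depends on the compiler and its settings, so no specific count is promised here.

```cpp
#include <cassert>

struct Thing
{
    static int copies;  // number of copy constructor calls observed
    int a, b;

    Thing(int a_, int b_) : a(a_), b(b_) { }
    Thing(const Thing& other) : a(other.a), b(other.b) { ++copies; }
};

int Thing::copies = 0;

// Return a named local.
inline Thing make_named() { Thing t(1, 2); return t; }

// Return an unnamed local.
inline Thing make_unnamed() { return Thing(1, 2); }
```

After "Thing x = make_named();" the counter Thing::copies is at most 2 (local into return space, return space into x), and 0 if the compiler applies the optimization at both places. The same bounds hold for make_unnamed().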

The above quote leaves out the first sentence, and this
sentence suggests that the optimization apply just as
well to the unnamed case. Here's the full paragraph,
from 12.8 "Copying class objects", item 15, where I've
split each sentence into its own paragraph:

(i) Whenever a temporary class object is copied
using a copy constructor, and this object and the
copy have the same cv-unqualified type, an
implementation is free to treat the original and
the copy as two different ways of referring to
the same object and not perform a copy at all,
even if the class copy constructor or destructor
have side effects.

(ii) For a function with a class return type, if
the expression in the return statement is the
name of a local object, and the cv-unqualified
type of the local object is the same as the
function return type, an implementation is
permitted to omit creating the temporary object
to hold the function return value, even if the
class copy constructor or destructor has side
effects.

(iii) In these cases, the object is destroyed at
the later of times when the original and the copy
would have been destroyed without the
optimization.

From (i), we see the general statement of the return
value optimization. In short, it is always possible,
and this is whether we are dealing with named or
with unnamed locals.

Sentence (ii) is a repeat of (i) for the special
case of named locals. It used to be the case that
the optimization could only apply to unnamed
locals, or temporaries as some prefer to call it.
But as they extended the rule to named locals, I
guess they figured they should add a sentence to
spell it out in full. (Or at least this is my
guess about the history of this paragraph.)

I'm somewhat confused about this business of
cv-unqualified types. Are they saying that if
the local object has type "T" and the return has
type "const T", that the return value
optimization can't be applied? Eg,
    const T read(std::istream& input) {
        T out;
        input >> out;
        return out; // copy required?
    }

It is reasonable to return const values. For
builtin types, const return values don't mean
a thing. But for user types, const return
values do mean something. They prevent users
from accidentally calling non-const member
functions on the returned temporary. In short,
const return values are useful, and it would
be nice if the return value optimization
applied to them. So, what's the deal here?
Why can't the return value optimization be
applied here? Or do top level consts not
figure into the definition of cv-unqualified?
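
A short sketch of what a const return value buys you. The struct T and the functions plain() and guarded() are made-up names, and the checks use <type_traits>, which needs a newer compiler than the ones discussed in this thread.

```cpp
#include <cassert>
#include <type_traits>

struct T
{
    int value;
    T() : value(0) { }
    void set(int v) { value = v; }      // non-const member
    int get() const { return value; }   // const member
};

inline T plain() { T out; out.set(7); return out; }
inline const T guarded() { T out; out.set(7); return out; }

// plain().set(1);    // compiles: modifies a temporary that is then thrown away
// guarded().set(1);  // does not compile: set() is non-const, the temporary is const

// For class types the const on the return type is kept on the temporary.
static_assert(std::is_const<decltype(guarded())>::value,
              "guarded() yields a const temporary");
static_assert(!std::is_const<decltype(plain())>::value,
              "plain() yields a non-const temporary");
```

Const member functions such as get() can still be called on either temporary.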

As for pointer types, who cares whether the
return value optimization is applied? After
all, copying a pointer is so cheap anyway.
Eg,
    T const * f() {
        T * out=0;
        return const_cast<T const*>(out); // copy required?
    }
As 'out' and the return value don't have the same
cv-unqualified type, a call to the copy constructor
is forced. But we're copying pointers anyway, and
the copy is inexpensive. But if pointers of type
"T *" and type "T const *" have the same
representation, I don't see why the compiler can't apply
the return value optimization. Eg, suppose that
sizeof(T*)==sizeof(T const *)==4. Then the variable
'out' could be created directly in the return
space. IOW, the optimization should still apply.

>I may well be out of date here, but it certainly seems
>like the unnamed case would be easier to
>implement than the named case. I can't see why
>you would permit the named and not the unnamed.
>Please correct this impression if it's wrong.

Yes, the unnamed case is easier to implement as the
compiler doesn't have to track variables. Egcs
already does the return value optimization for unnamed
locals. I'm sure most do.

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

Gabriel Dos_Reis

Jan 12, 1999

Jim Gewin <jge...@worldnet.att.net> writes:

[...]

>
> It appears that you are correct, and the returned object must have a
> name:
>

> from 12.8 [class.copy]
>
> For a function call with a class return type, if the expression in the
> return statement is the name of a local object, and the cv-unqualified
> type of the local object is the same as the function return type, an
> implementation is permitted to omit creating the temporary object to
> hold the function return value, even if the class copy constructor or
> destructor has side effects. In these cases, the object is destroyed
> at the later of times when the original and the copy would have been
> destroyed without the optimization.

This paragraph starts with:

Whenever a temporary class object is copied using a copy
constructor, and this object and the copy have the same
cv-unqualified type, an implementation is permitted to treat
the original and the copy as two different ways of referring
to the same object and not perform a copy at all, even if the
class copy constructor or destructor have side effects.

I see this as the unnamed return value optimization.

--
Gabriel Dos Reis, dos...@cmla.ens-cachan.fr


Siemel Naran

Jan 12, 1999

On 12 Jan 1999 12:54:42 -0500, Ian McCulloch <ipm...@rsphy1.anu.edu.au> wrote:
>Jamie Hamilton wrote in message <77bjin$502$1...@news1.Radix.Net>...

>>This is much harder to do correctly than you imply from
>>your code. A Plus2 object has references to two other
>>matrices(a and b). What happens if those matrices are changed
>>by some other operation? In this case, the Plus2 object
>>has to do the calculation and store the result before
>>a and b can be changed. So *every* non-const matrix function
>>must check to see if a Plus2 object is holding a reference to
>>this, and induce the relevant calculation. That's a lot of extra
>>code, and a high potential for mistakes.

>I suspect that is not a problem in practice. If you have a persistent
>expression (ie one that is not just used as a tempoary in an expression
>evaluation) then you WANT the result of the expression to depend on the
>current value of the operands at the time the expression is *evaluated*,
>not the values at the time the expression is *created*.

Right, this was the case I had in mind:
const Matrix result=a+b+c+d;

The problem arises when people store the Plus<N> objects, as in:
const Plus<4> temp=a+b+c+d; // same as Plus<4> temp(a+b+c+d)
// change a
Matrix result=temp; // changed value of 'a' not considered

To discourage this sort of thing, we make the copy constructor
of class Plus<N> private. The above expression "temp(a+b+c+d)"
actually invokes a call to the copy constructor to copy the
temporary denoted by "a+b+c+d" into 'temp'. The return space
optimization says that the compiler can make the return space
of "a+b+c+d" synonymous with 'temp', thereby avoiding this
call to the copy constructor. Nevertheless, an accessible
copy constructor is required. Try this program:

---------------------------------------------

struct Plus2
{
public:
    Plus2() { }

private:
    Plus2(const Plus2&);

    // declared only; defining f() here as well would clash with the
    // definition below
    friend Plus2 f();
};

Plus2 f()
{
    return Plus2();
    // return value optimization says that call to copy ctor and dtor
    // to copy local unnamed Plus2() into the return space
    // can be eliminated at the 'return' statement
    // even still, an accessible copy ctor and dtor is required
    // as Plus2 copy ctor is accessible, this is ok
}

int main()
{
    Plus2 p(f());
    // return space optimization says that call to copy ctor and dtor
    // to copy return of f() into 'p'
    // can be eliminated
    // as Plus2 copy ctor is not accessible, this is an error
}

---------------------------------------------

But C++ allows you to bind a const reference to the temporary return:

int main()
{
    const Plus2& p=f(); // ok
    // change the matrices denoted in p.lhs and p.rhs
    // now evaluate p.operator Matrix()
}

This technique of binding a reference to a temporary is extremely
useful in pass by const reference. Eg,
function(const Matrix&, const Matrix&);
function(a+b,c+d);
But binding a reference to a return value is much less useful.
It is not often used, and some people don't even know of the
feature. If C++ didn't allow you to bind a reference to a
temporary, the above program wouldn't compile.
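
For those who have not met the feature, here is a small sketch of what binding a const reference to a returned temporary does to the temporary's lifetime. The Tracked type and the g_alive counter are my own, added only to observe construction and destruction.

```cpp
#include <cassert>

int g_alive = 0;  // number of Tracked objects currently alive

struct Tracked
{
    Tracked() { ++g_alive; }
    Tracked(const Tracked&) { ++g_alive; }
    ~Tracked() { --g_alive; }
};

inline Tracked f() { return Tracked(); }

// Returns how many Tracked objects are alive while the reference is in scope.
inline int alive_while_bound()
{
    const Tracked& p = f();  // the temporary now lives as long as 'p'
    (void)p;                 // silence "unused variable" warnings
    return g_alive;          // read before 'p' goes out of scope
}
```

Inside alive_while_bound() exactly one Tracked object is alive while p is in scope, and it is destroyed when p goes out of scope.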

But to discourage the above construct, we make the operator
function non-const
Plus2::operator Matrix();
Now "Matrix=p" calls p.operator Matrix(), and this does not
work as 'p' is const.

>A more pathological
>problem is what happens if a mutating function is called WITHIN an
>expression. I guess the simple answer to that is to not have any
>mutating
>functions that you would want to use within an expression, but its
>unavoidable sometimes. Anyway, is this problem any different to an
>ordinary
>expression where some mutating functions are used on the right hand side
>(eg
>int b = a + a++)? I havnt thought about this much, but are there any
>expressions where the C++ expression evaluator produces well defined
>results, but Siemel's example doesnt?

This is the more general case. Here the result of "a+b" is not a
Plus2, but rather a matrix that contains within it a Plus2. It's
very similar to reference counted Strings:

    Matrix a,b;
    Matrix c=a+b; // 'c' represented internally as sum of 'a' and 'b'
    a+=a; // 'a' changed; this forces 'c' to be evaluated in full

    String a;
    String c=a; // 'c' represented internally as 'a' (ie, shallow copy)
    a+=a; // 'a' changed; this forces 'c' to be evaluated in full

This lazy evaluation Matrix is much harder to implement.

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------


All...@my-dejanews.com

Jan 13, 1999

In article <slrn79li0j....@localhost.localdomain>,

sbn...@uiuc.edu wrote:
> (i) Whenever a temporary class object is copied
> using a copy constructor, and this object and the
> copy have the same cv-unqualified type, an
> implementation is free to treat the original and
> the copy as two different ways of referring to
> the same object and not perform a copy at all,
> even if the class copy constructor or destructor
> have side effects.
>
> (ii) For a function with a class return type, if
> the expression in the return statement is the
> name of a local object, and the cv-unqualified
> type of the local object is the same as the
> function return value, an implementation is
> permitted to omit creating the temporary object
> to hold the function return value, even if the
> class copy constructor or destructor has side
> effects.
>
> (iii) In these cases, the object is destroyed at
> the later of times when the original and the copy
> would have been destroyed without the
> optimization.

[...]

> I'm somewhat confused about this business of
> cv-unqualified types. Are they saying that if
> the local object has type "T" and the return has
> type "const T", that the return value
> optimization can't be applied? Eg,

I'm pretty sure that "the cv-unqualified type of ... is the same as ..."
means that the two types must match, EXCEPT for the const and volatile
qualifiers. Which means that type T matches type const T, but does not
match type U.
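
This reading can be checked mechanically with the type traits that later C++ standards added (they did not exist when this was posted). A small sketch, where the helper name same_cv_unqualified and the empty structs T and U are made up for illustration:

```cpp
#include <cassert>
#include <type_traits>

struct T {};
struct U {};

// True when A and B name the same type once top-level const/volatile
// qualifiers are stripped from both.
template <class A, class B>
constexpr bool same_cv_unqualified() {
    return std::is_same<typename std::remove_cv<A>::type,
                        typename std::remove_cv<B>::type>::value;
}
```

Under this definition T matches const T (and const volatile T), while T never matches U.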

> As for pointer types, who cares whether the
> return value optimization is applied? After
> all, copying a pointer is so cheap anyway.
> Eg,
> T const * f() {
> T * out=0;
> return const_cast<T const*>(out); // copy required?
> }
> As 'out' and the return value don't have the same
> cv qualified type, a call to the copy constructor
> is forced. But we're copying pointers anyway, and
> the copy is inexpensive. But if pointers of type
> "T *" and type "T const *" have the same
> representation, I don't see why the compiler can't apply
> the return value optimization. Eg, suppose that
> sizeof(T*)==sizeof(T const *)==4. Then the variable
> 'out' could be created directly in the return
> space. IOW, the optimization should still apply.

Copying a pointer never has extra side-effects. So this is
already permitted due to the "as-if" rule; the side effects
would match an abstract machine that didn't have this
"optimization."

--
All...@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.

Boris Schaefer

Jan 13, 1999

sbn...@localhost.localdomain (Siemel Naran) writes:

| On 11 Jan 1999 08:15:29 -0500, Boris Schaefer
|
| >The first of these two solutions also creates a load of temporaries
| >(res1 and res2 in the example) for longer expressions, they just
| >aren't compiler generated temporaries, but user generated ones.
| >
| >The second approach is fast, doesn't create temporaries, but is
| >cumbersome, if you have many different matrix expressions. You would
| >have to write one function for every type of expression you use.
|
| In addition, the second approach has many iterations through for
| loops, so your lazy evaluation method is still faster as in some
| cases it will get by with one iteration through a for loop.

You could make the second approach work with only one iteration
through for loops, if you have a function for EVERY expression type
you have.

If you only use:

m = a + b;
m = a + b + c;
m = a + b - c;

you could write these functions:

m_plus_m(const matrix& m1,
         const matrix& m2, matrix& result);

m_plus_m_plus_m(const matrix& m1,
                const matrix& m2,
                const matrix& m3, matrix& result);

m_plus_m_minus_m(const matrix& m1,
                 const matrix& m2,
                 const matrix& m3, matrix& result);

As I said, if you use many different expressions, this can become
quite annoying.

Furthermore, I think the lazy evaluation I used *always* (no matter
what the expressions) iterates through 2 for loops: one when creating
the temporary and evaluating the expression and another when assigning
the temporary to the result.

This is to make expressions like m = a * m; work correctly.

If it had only one loop (evaluating the expression and assigning
the result in one step), then this expression would yield wrong
results, because elements still needed to evaluate the expression
would already have been assigned to.

e.g.

m[0][0] = dot(a.row(0), m.column(0));
m[1][0] = dot(a.row(1), m.column(0));

The second assignment gives the wrong result because m.column(0)[0]
no longer holds its original value, but contains:

dot(a.row(0), m.column(0)) // as computed from the original m

I don't think there is a way (at least not a simple one) to check
whether the rhs expression contains the lhs and only use the 2-loop
approach, if that's the case.
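
The aliasing problem is easy to reproduce on a small case. The following is only a sketch with made-up names (Mat2, make_mat2, mul_one_pass, mul_two_pass) and 2x2 integer matrices, not the matrix class discussed in this thread. It computes m = a * m once with a single pass that writes straight into m, and once through a temporary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Mat2 = std::array<std::array<int, 2>, 2>;

// Build a 2x2 matrix from its four elements in row-major order.
inline Mat2 make_mat2(int a, int b, int c, int d) {
    Mat2 r{};
    r[0][0] = a; r[0][1] = b;
    r[1][0] = c; r[1][1] = d;
    return r;
}

// One pass: each element of m is overwritten while later elements still
// need the old values, so the result is wrong for m = a * m.
inline void mul_one_pass(const Mat2& a, Mat2& m) {
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 2; ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < 2; ++k) sum += a[i][k] * m[k][j];
            m[i][j] = sum;  // row i+1 will read this instead of the old value
        }
}

// Two passes: evaluate into a temporary, then assign it to m.
inline void mul_two_pass(const Mat2& a, Mat2& m) {
    Mat2 tmp{};
    for (std::size_t i = 0; i < 2; ++i)
        for (std::size_t j = 0; j < 2; ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < 2; ++k) sum += a[i][k] * m[k][j];
            tmp[i][j] = sum;
        }
    m = tmp;
}
```

With a filled with ones and m the identity, the two-pass version yields a as expected, while the one-pass version ends with a 2 in the bottom-right element.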

--
Boris Schaefer -- s...@psy.med.uni-muenchen.de

Jones' Motto:
Friends come and go, but enemies accumulate.

Francis Glassborow

Jan 14, 1999

In article <slrn79li0j....@localhost.localdomain>, Siemel Naran
<sbn...@localhost.localdomain> writes

>I'm somewhat confused about this business of
>cv-unqualified types. Are they saying that if
>the local object has type "T" and the return has
>type "const T", that the return value
>optimization can't be applied? Eg,

No, IMO just the reverse: it is saying 'ignore any cv qualification' and
look at the underlying types.

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Siemel Naran

Jan 14, 1999

On 13 Jan 1999 12:58:50 -0500, Boris Schaefer
>sbn...@localhost.localdomain (Siemel Naran) writes:

>if you only use:
>m = a + b;
>m = a + b + c;
>m = a + b - c;
>
>you could write these functions:
>
>m_plus_m(const matrix& m1,
> const matrix& m2, matrix& result);
>
>m_plus_m_plus_m(const matrix& m1,
> const matrix& m2,
> const matrix& m3, matrix& result);
>
>m_plus_m_minus_m(const matrix& m1,
> const matrix& m2,
> const matrix& m3, matrix& result);
>
>as i said, if you use many different expressions, this can become
>quite annoying.

This approach is essentially equivalent to the lazy evaluation
method! Provided we don't save the Plus<N> objects. Consider
"a+b+c". Our function
Plus<3>::operator Matrix() const;
is equivalent to your m_plus_m_plus_m(...).

The implementation of Plus<3>::operator Matrix() I had in
mind uses Plus<3>::operator[](size_type), which calls
Plus<2>::operator[](size_type). But this is equivalent
after inlining. Eg,

Plus<3>::operator[](size_type i)
{
return a[i]+b[i]; // a has type Plus<2>, b has type Matrix
}

// after inlining Plus<2>::operator[](size_type)
Plus<3>::operator[](size_type i)
{
return a.a[i]+a.b[i]+b[i];
}

And this is equivalent to the "a[i]+b[i]+c[i]" that your
function m_plus_m_plus_m(...) would probably use.
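
A compact sketch of this idea, under assumptions: the names Sum and Vec are invented here (Sum<Vec, Vec> plays the role of Plus<2>, Sum<Sum<Vec, Vec>, Vec> that of Plus<3>), the element type is fixed to double, and vectors are used instead of matrices to keep it short. The proxies only hold references, and the single loop runs when the result is constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Proxy for "lhs + rhs"; it stores references and computes elements on demand.
template <class L, class R>
struct Sum {
    const L& lhs;
    const R& rhs;
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

struct Vec {
    std::vector<double> v;

    explicit Vec(std::size_t n, double x = 0.0) : v(n, x) {}

    // Evaluating constructor: the only loop over the elements.
    template <class L, class R>
    Vec(const Sum<L, R>& e) : v(e.size()) {
        for (std::size_t i = 0; i < v.size(); ++i) v[i] = e[i];
    }

    double operator[](std::size_t i) const { return v[i]; }
    std::size_t size() const { return v.size(); }
};

// a + b builds the two-operand proxy.
inline Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) { return {a, b}; }

// (proxy) + c nests the proxy, so a + b + c is still one expression object.
template <class L, class R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& a, const Vec& b) {
    return {a, b};
}
```

Writing "Vec m = a + b + c;" then evaluates a[i]+b[i]+c[i] in one pass. The proxies must not outlive the full expression, since they refer to temporaries.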

BTW, I seem to be mixing up Matrix and Vector :).

>Furthermore, I think the lazy evaluation I used *always* (no matter
>what the expressions) iterates through 2 for loops: one when creating
>the temporary and evaluating the expression and another when assigning
>the temporary to the result.
>
>This is to make expressions like m = a * m; work correctly
>
>If it would have only 1 loop (evaluating the expression and assigning
>the result in one step), then this expression would yield wrong
>results, because elements needed to evaluate the expression are
>assigned to.

Yes, the return value optimization doesn't apply to assignment.

Maybe you could make a private member function
Matrix& Matrix::multreverse(const Vector& that);
which basically does this=that*this. By contrast
Matrix& Matrix::operator*=(const Vector& that);
does this=this*that. Both multreverse and op*= do
the calculation in place, so there are no temps.

Now make a*m return a Mult2 object that contains
a reference to 'a' and 'm'. Then m=a*m calls
Matrix& Matrix::operator=(const Mult2&);
which then calls multreverse.

Mult2 should be a nested class of class Matrix
because no one else would often need to access it
-- it is too specialized to be a non-member.
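
A rough sketch of how this could look, with several assumptions: a fixed 2x2 Matrix of doubles, both operands of type Matrix, and a helper named product that is not mentioned above. The names Mult2 and multreverse follow the text. The aliasing test is a plain address comparison, and this multreverse still needs to save one column at a time, so it is "in place" only in the sense that no full temporary matrix is created.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

class Matrix {
public:
    // Proxy returned by a * m; it only remembers its operands.
    struct Mult2 {
        const Matrix& lhs;
        const Matrix& rhs;
    };

    Matrix(double a, double b, double c, double d) : e_{{a, b, c, d}} {}

    double operator()(std::size_t r, std::size_t c) const {
        return e_[2 * r + c];
    }

    // m = a * m goes to multreverse; anything else uses one temporary.
    Matrix& operator=(const Mult2& p) {
        if (&p.rhs == this && &p.lhs != this) return multreverse(p.lhs);
        const Matrix tmp = product(p.lhs, p.rhs);
        e_ = tmp.e_;
        return *this;
    }

private:
    // *this = that * *this, one column at a time (two saved values).
    Matrix& multreverse(const Matrix& that) {
        for (std::size_t c = 0; c < 2; ++c) {
            const double col0 = (*this)(0, c);
            const double col1 = (*this)(1, c);
            for (std::size_t r = 0; r < 2; ++r)
                e_[2 * r + c] = that(r, 0) * col0 + that(r, 1) * col1;
        }
        return *this;
    }

    // Plain product into a fresh matrix.
    static Matrix product(const Matrix& a, const Matrix& b) {
        Matrix r(0, 0, 0, 0);
        for (std::size_t i = 0; i < 2; ++i)
            for (std::size_t j = 0; j < 2; ++j)
                r.e_[2 * i + j] = a(i, 0) * b(0, j) + a(i, 1) * b(1, j);
        return r;
    }

    std::array<double, 4> e_;
};

inline Matrix::Mult2 operator*(const Matrix& a, const Matrix& b) {
    return {a, b};
}
```

As in the text, this handles only the exact form m = a * m specially.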

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

Siemel Naran

Jan 14, 1999

On 13 Jan 1999 17:32:23 GMT, All...@my-dejanews.com
>In article <slrn79li0j....@localhost.localdomain>,

>> (i) Whenever a temporary class object is copied
>> using a copy constructor, and this object and the
>> copy have the same cv-unqualified type, an
>> implementation is free to treat the original and
>> the copy as two different ways of referring to
>> the same object and not perform a copy at all,
>> even if the class copy constructor or destructor
>> have side effects.

>I'm pretty sure that "the cv-unqualified type of ... is the same as ..."
>means that the two types must match, EXCEPT for the const and volatile
>qualifiers. Which means that type T matches type const T, but does not
>match type U.

Yes, I realized after posting that cv-qualified might be
different from cv-unqualified. So "T" is not equivalent
to "const T" by cv-qualified. But "T" is (perhaps?)
equivalent to "const T" by cv-unqualified. Although
"const T" is not equivalent to "T" by cv-unqualified.
Is this right? Can you give a precise definition of
'cv-unqualified'? Eg, does this always force a call
to the copy constructor upon return (ie, the return
value optimization does not apply), and why:

T f() { const T out; return out; }

>Copying a pointer never has extra side-effects. So this is
>already permitted due to the "as-if" rule; the side effects
>would match an abstract machine that didn't have this
>"optimization."

Yes, but only if the representation of "T*" is the same
as that of "T*const" -- that is, they have the same
sizeof, the same numerical interpretation, etc.

Bill Wade

Jan 14, 1999

All...@my-dejanews.com wrote in message
<77gqbb$f8s$1...@nnrp1.dejanews.com>...

>Copying a pointer never has extra side-effects. So this is
>already permitted due to the "as-if" rule; the side effects
>would match an abstract machine that didn't have this
>"optimization."

Not quite. Copying a pointer can modify a global or an argument (the
pointer).

int* Foo(int*& x)
{
static int i;
static int j;

int* result = &i;
x = &j;
return result;
}

int* p = Foo(p);

I believe this is conforming (it is legal to modify POD before it is
constructed). If 'result' is an alias for 'p' (NRVO) this would
incorrectly leave p pointing at j.
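
A checkable variant of the example, as a sketch: the statics are given distinct values (1 and 2, chosen here) so the result can be observed through the pointer, p is declared at namespace scope so that it is zero-initialised before Foo runs, and after_assignment is a made-up helper that repeats the call with an ordinary assignment.

```cpp
#include <cassert>

int* Foo(int*& x) {
    static int i = 1;  // value chosen so the target can be identified
    static int j = 2;

    int* result = &i;
    x = &j;            // writes through the reference argument
    return result;     // the returned value is still &i
}

// Namespace-scope p: zero-initialised first, then initialised from Foo(p).
int* p = Foo(p);

// Same pattern with an ordinary assignment, for comparison.
inline int after_assignment() {
    int* q = nullptr;
    q = Foo(q);
    return *q;
}
```

Without any aliasing of 'result' and 'p', the pointer ends up referring to i in both cases.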

Francis Glassborow

Jan 14, 1999

In article <slrn79q6eq....@localhost.localdomain>, Siemel Naran
<sbn...@localhost.localdomain> writes

>Yes, but only if the representation of "T*" is the same
>as that of "T*const" -- that is, they have the same
>sizeof, the same numerical interpretation, etc.

I suppose that they might be different but I find it hard to accept.
const and volatile are really concerned with compile time behaviour.
Runtime objects do not carry cv qualification (that was discussed and
firmly rejected by WG21 & J16 when discussing RTTI). While static
behaviour can depend on cv qualification (via function overloading)
dynamic behaviour, to the best of my knowledge, cannot.

As long as an appropriate copy ctor exists and is accessible it can be
optimised away when return by value is used. Of course if the
programmer chooses to disallow copying of const-qualified instances the
compiler will generate an error. e.g.

struct X {
// anything
X (X &);
};

X fn(){ X const x;
// anything
return x; // ERROR
}
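
The same restriction can be observed without triggering the compile error, using type traits from later C++ standards. A sketch, where the default constructor and the two constant names are additions made for this illustration:

```cpp
#include <cassert>
#include <type_traits>

struct X {
    X() {}
    X(X&) {}  // copy constructor taking a non-const reference
};

// Copying from a non-const lvalue is possible...
constexpr bool copy_from_nonconst = std::is_constructible<X, X&>::value;

// ...but a const X cannot bind to X&, so this copy is rejected.
constexpr bool copy_from_const = std::is_constructible<X, const X&>::value;
```

Here copy_from_nonconst is true and copy_from_const is false, which is why "return x;" with a const local fails to compile for such a class.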

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Valentin Bonnard

Jan 14, 1999

Siemel Naran wrote:
>
> On 13 Jan 1999 17:32:23 GMT, All...@my-dejanews.com
> >In article <slrn79li0j....@localhost.localdomain>,
>
> >> (i) Whenever a temporary class object is copied
> >> using a copy constructor, and this object and the
> >> copy have the same cv-unqualified type, an
> >> implementation is free to treat the original and
> >> the copy as two different ways of referring to
> >> the same object and not perform a copy at all,
> >> even if the class copy constructor or destructor
> >> have side effects.
>
> >I'm pretty sure that "the cv-unqualified type of ... is the same as ..."
> >means that the two types must match, EXCEPT for the const and volatile
> >qualifiers. Which means that type T matches type const T, but does not
> >match type U.
>
> Yes, I realized after posting that cv-qualified might be
> different from cv-unqualified.

The text in the std simply means that cv-qualifiers aren't
considered relevant for this optimisation.

const T foo () { return T(); } // optimisation possible
T foo () { return T(); } // optimisation possible
T foo () { const T x; return x; } // optimisation possible
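
Whether a given compiler actually performs these optimisations can be measured by counting copy-constructor calls. A sketch with invented names (the three make_* functions mirror the three lines above, copies_for is a helper); the counts depend on the compiler, its settings and the language version, so no fixed output is claimed.

```cpp
#include <cassert>

struct T {
    static int copies;  // counts copy-constructor calls
    T() {}
    T(const T&) { ++copies; }
};
int T::copies = 0;

const T make_const_prvalue() { return T(); }
T make_prvalue() { return T(); }
T make_from_const_local() { const T x; return x; }

// Returns how many copies one call plus initialisation costs here.
template <class F>
int copies_for(F f) {
    T::copies = 0;
    T t = f();
    (void)t;
    return T::copies;
}
```

Each call can cost at most two copies (into the return value, then into the destination); a compiler that applies both optimisations reports zero.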

> Yes, but only if the representation of "T*" is the same
> as that of "T*const" -- that is, they have the same
> sizeof, the same numerical interpretation, etc.

And it's the case.

--

Valentin Bonnard mailto:bonn...@pratique.fr
info about C++/a propos du C++: http://pages.pratique.fr/~bonnardv/

Boris Schaefer

Jan 16, 1999

sbn...@localhost.localdomain (Siemel Naran) writes:

<snip>

| >Furthermore, I think the lazy evaluation I used *always* (no matter
| >what the expressions) iterates through 2 for loops: one when creating
| >the temporary and evaluating the expression and another when assigning
| >the temporary to the result.
| >
| >This is to make expressions like m = a * m; work correctly
| >
| >If it would have only 1 loop (evaluating the expression and assigning
| >the result in one step), then this expression would yield wrong
| >results, because elements needed to evaluate the expression are
| >assigned to.
|
| Yes, the return value optimization doesn't apply to assignment.
|
| Maybe you could make a private member function
| Matrix& Matrix::multreverse(const Vector& that);
| which basically does this=that*this. By contrast
| Matrix& Matrix::operator*=(const Vector& that);
| does this=this*that. Both multreverse and op*= do
| the calculation in place, so there are no temps.
|
| Now make a*m return a Mult2 object that contains
| a reference to 'a' and 'm'. Then m=a*m calls
| Matrix& Matrix::operator=(const Mult2&);
| which then calls multreverse.

I'm not sure if I understand you correctly, but I think that this
would solve the problem only for m = a * m and not in general, so
something like:

matrix<4, 4> a, b, m;

// assign some values to a, b and m, which are square matrices,
// since the following expression is not in general valid for
// matrices but certainly is valid for square ones.

m = a + b * m;

would still need one temporary and two for loops.

Maybe you could explain a bit more on how you think one can solve this
with only one pass through a for loop.

--
Boris Schaefer -- s...@psy.med.uni-muenchen.de

A language that doesn't affect the way you think about programming is
not worth knowing.

Siemel Naran

Jan 17, 1999

On 16 Jan 1999 15:51:56 -0500, Boris Schaefer
>sbn...@localhost.localdomain (Siemel Naran) writes:

>I'm not sure if I understand you correctly, but I think that this
>would solve the problem only for m = a * m and not in general, so
>something like:
>
> matrix<4, 4> a, b, m;
>
> // assign some values to a, b and m, which are square matrices,
> // since the following expression is not in general valid for
> // matrices but certainly is valid for square ones.
>
> m = a + b * m;
>
>would still need one temporary and two for loops.
>
>Maybe you could explain a bit more on how you think one can solve this
>with only one pass through a for loop.

Note that matrix*matrix is itself three nested loops, although I've
been treating it as one loop, as in sentences like: "Matrix a=b+c
involves one iteration through a for loop". Similarly,
matrix*vector is two loops. Anyway, why not just use a new
variable, say 'n', so that you can write "Matrix n=a+b*m" and so that
the return value optimization can be applied?

In any case, my proposed solution for handling "a=b*a" in the
most efficient manner only handles this one case, and not
generalizations like "a=c+b*a".

In any case, I'm pretty sure C++ is able to do whatever is
maximally efficient (as few temps as possible, as few loops as
possible). Exactly what is maximally efficient and how to
implement it in the most general manner, I know only for the
simplest of cases.

--
----------------------------------
Siemel B. Naran (sbn...@uiuc.edu)
----------------------------------

Christopher Eltschka

Jan 18, 1999

Bill Wade wrote:
>
> All...@my-dejanews.com wrote in message
> <77gqbb$f8s$1...@nnrp1.dejanews.com>...
>
> >Copying a pointer never has extra side-effects. So this is
> >already permitted due to the "as-if" rule; the side effects
> >would match an abstract machine that didn't have this
> >"optimization."
>
> Not quite. Copying a pointer can modify a global or an argument (the
> pointer).
>
> int* Foo(int*& x)
> {
> static int i;
> static int j;
>
> int* result = &i;
> x = &j;
> return result;
> }
>
> int* p = Foo(p);
>
> I believe this is conforming (it is legal to modify POD before it is
> constructed). If 'result' is an alias for 'p' (NRVO) this would
> incorrectly leave p pointing at j.

Your scenario involves _two_ optimisations:

- The NRVO optimisation makes the return value (an unnamed temporary)
equivalent to result (it "names" that return value "result").
More exactly, instead of creating that temporary, it uses result
directly (12.8/15, sentence (ii)).
- The second optimisation avoids the copy of that return value
by making that temporary return value the same as p. But: Due
to the NRVO, the return value is not an unnamed temporary - it's
the named variable "result" (12.8/15, sentence (i)).

From both I conclude that a compiler is not permitted to
optimize out _both_ copies, except under the as-if rule.
If that conclusion is correct, the URVO (the optimisation for an
unnamed temporary in the return statement) should still be more
efficient in general, since there both temporaries may be
optimized out (the one created in the function and then
copied to the return value, and then the return value copied
to the destination).

If I'm right, I see a further interesting conclusion:

If the compiler wants to take full advantage of URVO, it must
allow the second optimisation, even if it doesn't see the
function source (i.e. the function is not inline).
This means that the NRVO cannot be used for non-inline
functions, since the compiler must expect the return value
to be optimized out at the caller side (which would not
be legal after NRVO). And this means that NRVO isn't quite
as useful as one may think.

However, if I'm wrong, then NRVO seems indeed dangerous for such
cases. Maybe it would be better to restrict NRVO to non-POD types
(optimisation of POD types can usually be done under the as-if
rule anyway). Or to describe the effect not as just allowing
the optimisation, but as "optimising as if the copy constructor
and destructor had no side effects". That is, the existence of
copy constructor and destructor (with possible side effects)
doesn't limit the optimisations, but the only optimisations
allowed are those which don't violate the as-if rule otherwise
(that is, not via copy constructor/destructor).
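
The difference between the two forms discussed above can be observed with a type whose copy constructor has a visible side effect. This is only a sketch: `Tracked`, `make_unnamed`, `make_named` and the two counting helpers are hypothetical names, and the counts depend on the compiler and language mode. Since C++17 the copy in the unnamed case is no longer an optional optimisation, while eliding the copy of a named local remains optional.

```cpp
#include <cassert>

// Hypothetical type: counts every copy construction.
struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// Unnamed return value: the form the thread calls URVO.
Tracked make_unnamed(int v)
{
    return Tracked(v);
}

// Named local returned by value: the form the thread calls NRVO.
Tracked make_named(int v)
{
    Tracked result(v);
    result.value += 1;   // do some work on the named object
    return result;
}

// Each helper reports how many copies one initialization performed.
// Anything from 0 (all copies elided) to 2 (none elided) is conforming.
int copies_for_unnamed()
{
    Tracked::copies = 0;
    Tracked t = make_unnamed(1);
    (void)t;
    return Tracked::copies;
}

int copies_for_named()
{
    Tracked::copies = 0;
    Tracked t = make_named(1);
    (void)t;
    return Tracked::copies;
}
```

Comparing the two counts on a given compiler, with and without optimisation, shows which of the two copies that compiler actually removes.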

All...@my-dejanews.com

Jan 20, 1999, 3:00:00 AM

In article <77og1m$c...@netlab.cs.rpi.edu>,

sbn...@uiuc.edu wrote:
>
> On 13 Jan 1999 17:32:23 GMT, All...@my-dejanews.com
> >In article <slrn79li0j....@localhost.localdomain>,

> >Copying a pointer never has extra side-effects. So this is
> >already permitted due to the "as-if" rule; the side effects
> >would match an abstract machine that didn't have this
> >"optimization."
>

> Yes, but only if the representation of "T*" is the same
> as that of "T*const" -- that is, they have the same
> sizeof, the same numerical interpretation, etc.

Well, yes.

I try not to make *any* assumptions about the target environment,
but stretching my imagination I can't think of a reason why a
compiler would represent a T* differently from a T*const. In each
case, it is used to locate a T, and although the permitted
operations need not be the same, you can't assume that a non-null
T*const points to a T that was declared const.

Further, there's a practical requirement that conversion from T*
to T*const is very fast; if this were not true, then const
functions and pretty much any other use of const would incur a
run-time penalty. The fastest conversion between two pointer types
occurs when they have identical representation; in that situation,
the compiler needs only to decide that the pointer is the new type,
and it is.

----

All...@my-dejanews.com is a "Spam Magnet" -- never read.
Please reply in USENET only, sorry.


Stanley Friesen [Contractor]

Jan 22, 1999, 3:00:00 AM

In article <780hce$ko4$1...@nnrp1.dejanews.com>, <All...@my-dejanews.com> wrote:
>I try not to make *any* assumptions about the target environment,
>but stretching my imagination I can't think of a reason why a
>compiler would represent a T* differently from a T*const.

Not only is there no reason to do so, the standard explicitly
forbids them to be different.


James Kuyper

Jan 24, 1999, 3:00:00 AM

"Stanley Friesen [Contractor]" wrote:
>
> In article <780hce$ko4$1...@nnrp1.dejanews.com>, <All...@my-dejanews.com> wrote:
> >I try not to make *any* assumptions about the target environment,
> >but stretching my imagination I can't think of a reason why a
> >compiler would represent a T* differently from a T*const.
>
> Not only is there no reason to do so, the standard explicitly
> forbids them to be different.

Citation, please?
---

Bill Wade

Jan 26, 1999, 3:00:00 AM

James Kuyper wrote in message <36A8C577...@wizard.net>...

>"Stanley Friesen [Contractor]" wrote:

>> [T* and T*const must have the same representation]

>Citation, please?

"The cv-qualified or cv-unqualified versions of a type ... shall have
the same representation and alignment requirements." (3.9.3p1)

For the related question: Must (T*) and (T const*) have the same
representation? The answer is also yes:

Any type is layout-compatible with itself. (paraphrased 3.9p11)

"Pointers to cv-qualified and cv-unqualified versions (3.9.3) of
layout-compatible types shall have the same value representation and
alignment requirements (3.9)." (3.9.2p3)
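
The size and alignment consequences of the two quoted paragraphs can be spot-checked at compile time. This is a minimal sketch assuming a C++11 or later compiler; `T` and `same_bytes` are hypothetical names. The byte comparison looks at the full object representation, which goes slightly beyond the "value representation" wording of 3.9.2p3, so treat it as a check that is expected to hold on mainstream implementations, not as a guarantee taken from the standard.

```cpp
#include <cassert>
#include <cstring>

struct T { int payload; };   // hypothetical pointee type

// T* versus T* const: cv-qualified and unqualified versions of one type.
static_assert(sizeof(T*) == sizeof(T* const), "same size expected");
static_assert(alignof(T*) == alignof(T* const), "same alignment expected");

// T* versus T const*: pointers to unqualified and cv-qualified T.
static_assert(sizeof(T*) == sizeof(const T*), "same size expected");
static_assert(alignof(T*) == alignof(const T*), "same alignment expected");

// Compare the stored bytes of three pointers to the same object.
bool same_bytes(T* plain)
{
    T* const top_const = plain;   // top-level const on the pointer itself
    const T* to_const = plain;    // pointer to const T
    return std::memcmp(&plain, &top_const, sizeof plain) == 0
        && std::memcmp(&plain, &to_const, sizeof plain) == 0;
}
```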

HTH

Stanley Friesen [Contractor]

Jan 26, 1999, 3:00:00 AM

In article <36A8C577...@wizard.net>,
James Kuyper <kuy...@wizard.net> wrote:

>"Stanley Friesen [Contractor]" wrote:
>> Not only is there no reason to do so, the standard explicitly
>> forbids them to be different.
>
>Citation, please?

Unfortunately my copy of the standard is at home.

The issue is discussed in either chapter 2 or 3, either in the section about
object representations, or in the section about general pointer types. In
one of those places it specifically states that cv-qualifiers do not change
the object representation of pointers.


Andrew Koenig

Jan 27, 1999, 3:00:00 AM

In article <36af4c6f....@news.lspace.org>,
Alan Bellingham <al...@lspace.org> wrote:

> I can't give you chapter and verse, but any implementation that allowed
> it would break the equivalence between

> extern void foo(T * bar) ;

> and

> void foo(T * const bar) { ... }

Why do you think so? I don't see it.
--
Andrew Koenig
a...@research.att.com
http://www.research.att.com/info/ark

Ron Natalie

Jan 27, 1999, 3:00:00 AM

Stanley Friesen [Contractor] wrote:
>
> >Citation, please?
>
> Unfortunately my copy of the standard is at home.
>

In 3.9.3 in the first paragraph describing const and volatile:

The cv-qualified or cv-unqualified versions of a type are
distinct types; however they shall have the same
representation and alignment requirements
---

Andrew Koenig

Jan 29, 1999, 3:00:00 AM

In article <36b43d7f....@news.lspace.org>,
Alan Bellingham <al...@lspace.org> wrote:
> a...@research.att.com (Andrew Koenig) wrote:

> >In article <36af4c6f....@news.lspace.org>,
> >Alan Bellingham <al...@lspace.org> wrote:

> [May T* and T*const be represented differently?]

> >> I can't give you chapter and verse, but any implementation that
> >> allowed it would break the equivalence between

> >> extern void foo(T * bar) ;

> >> and

> >> void foo(T * const bar) { ... }

> >Why do you think so? I don't see it.

> (Any time ark disagrees with one is time for one to get worried.)

Flattery will get you somewhere, but I'm not sure where.

> Forgive me if I'm wrong, but I thought that a function definition is
> allowed to add a further level of constness to its parameters, so that
> such declarations as f(int const i) are allowed. Since the parameter is
> passed by value, the const is irrelevant to the caller, but it may have
> some utility to the function definition itself. Now I only received my
> copy of the standard yesterday, so I've not had a chance to read through
> it, and I may be mistaken.

Declarations such as f(int const i) are allowed, but are equivalent
to f(int i), as they are in C.

> _If_ I am correct, and if this applies to any type of parameter value,
> then a function may be defined as taking a T * const, but be declared as
> taking a T *. (The parameter value being the content of the pointer,
> i.e. an address.)

Yes.
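
The equivalence agreed on above can be confirmed by the compiler itself. This is a minimal sketch assuming C++11 for `static_assert` and `<type_traits>`; `T` and the body of `foo` are made up for illustration. Top-level const on a parameter is dropped when the function type is formed, so the declaration and the definition below name the same function.

```cpp
#include <cassert>
#include <type_traits>

struct T { int payload; };   // hypothetical pointee type

// What a caller in another translation unit would see.
void foo(T* bar);

// Top-level const on a parameter is not part of the function type.
static_assert(std::is_same<void(T*), void(T* const)>::value,
              "T* and T* const parameters give the same function type");
static_assert(std::is_same<void(int), void(int const)>::value,
              "int and int const parameters give the same function type");

// Definition of the very same foo: const only restricts the body, which
// may not reseat bar but may still modify the object it points to.
void foo(T* const bar)
{
    bar->payload = 42;
}
```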

> If the function definition is in one translation unit, and another
> translation unit only sees the declaration, then there needs to be some
> way of ensuring that the function itself receives the correct machine
> representation of the pointer.

Yes.

> I can see that, if the definition has the prototype in scope, a compiler
> could notice that callers may be calling using T* pointers, but what of
> the case that the function has been written without a prototype visible
> (yes, bad practice, but legal, surely)?

> In this case, I can see three paths out:

> 1) A T*const as a parameter is actually interpreted by the compiler as
> only a T*, with compile time checking as normal to ensure no T++ style
> modifications.

> 2) T*const and T* have the same representations as per 3.9.3 p1.

Which they do, but we're assuming for the purposes of the discussion
that they don't :-)

> 3) The differing prototype is required to be in scope - and the T*const
> parameter is taken to be a T* anyway.

> If I'm wrong in this, please explain my mistake - you do have far more
> knowledge than I do.

How about (4): The declaration is automagically converted to T*,
whether the const is there or not, and the code that actually copies
the argument to the parameter is generated as part of the called
function, not the caller? That code, because it knows the type
of the parameter, can take care of any necessary representation
change while copying the argument to the parameter.

For that matter, because we're talking about pointers here, it
would be possible for the pointer to be copied twice (once by the
caller, once by the callee with change of representation) and the
user program would never know.

As I said before, this issue is moot because T* and T*const are required
elsewhere to have the same representation -- but if they weren't,
I think it would be implementable anyway.

Steve Clamage

Jan 30, 1999, 3:00:00 AM

James Kuyper <kuy...@wizard.net> writes:

>> >I try not to make *any* assumptions about the target environment,
>> >but stretching my imagination I can't think of a reason why a
>> >compiler would represent a T* differently from a T*const.
>>

>> Not only is there no reason to do so, the standard explicitly
>> forbids them to be different.

>Citation, please?

3.9.2 "Compound types" paragraph 3:

"Pointers to cv-qualified and cv-unqualified versions of
layout-compatible types shall have the same value representation
and alignment requirements."

--
Steve Clamage, stephen...@sun.com