C++ templates vs. .NET generics

Eugene Gershnik

Oct 4, 2003, 8:37:07 PM

Take a look at the
http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx

Is the "run-time type expansion causes less code bloat" argument given there
a bogus one or there is something behind it?

BTW my favorite BS pearl from this article is:

"Of all the fringe benefits of run-time type expansion, my favorite is a
somewhat subtle one. Generic code is limited to operations that are certain
to work for any constructed instantiation of the type. The side effect of
this restriction is that CLR generics are more understandable and usable
than their C++ template counterparts."

Eugene

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

WW

Oct 5, 2003, 1:50:27 PM

Eugene Gershnik wrote:
> Take a look at the
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx
>
> Is the "run-time type expansion causes less code bloat" argument
> given there a bogus one or there is something behind it?
>
> BTW my favorite BS pearl from this article is:
>
> "Of all the fringe benefits of run-time type expansion, my favorite
> is a somewhat subtle one. Generic code is limited to operations that
> are certain to work for any constructed instantiation of the type.
> The side effect of this restriction is that CLR generics are more
> understandable and usable than their C++ template counterparts."

IMHO the question is: do you prefer runtime or compile-time errors? I
prefer the latter. And IMHO answering that question the wrong way is the
same sort of mistake as preferring buffer overruns to 0.2% slower
programs. I do not think they check at compile time whether the generic
instantiation makes sense; they only generate it at runtime. If they do
check (which I find unlikely), then the question is: do you prefer
something to be done once, or a million times?

C++ templates and code bloat. C++ compiler vendors (especially Microsoft)
are just getting template *functionality* right. And we all know
optimization can only come after that. In theory C++ compilers/linkers
could do an awful lot of optimization there. For example, they could
generate the very same code once for int and long if the two have the
same representation, and refer to it by two names. Or they could split
templates (at compile time) into type-independent and type-dependent
parts and generate only one instance of the type-independent parts, etc.
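
A minimal sketch of that split, done by hand rather than by the compiler.
The names PtrStackBase and PtrStack are hypothetical, chosen only for
illustration: the storage logic lives in a non-template base that works on
void*, and the template layer adds nothing but casts, so every PtrStack<T>
shares the same std::vector<void*> instantiation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Type-independent part (hypothetical name): all storage logic works on
// void*, so it does not depend on the element type at all.
class PtrStackBase {
public:
    std::size_t size() const { return items_.size(); }
    bool empty() const { return items_.empty(); }

protected:
    void push_raw(void* p) { items_.push_back(p); }

    void* pop_raw() {
        assert(!items_.empty());  // popping an empty stack is a caller bug
        void* p = items_.back();
        items_.pop_back();
        return p;
    }

private:
    std::vector<void*> items_;
};

// Type-dependent part: a thin wrapper that only adds the casts.
template <typename T>
class PtrStack : private PtrStackBase {
public:
    using PtrStackBase::size;
    using PtrStackBase::empty;

    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
};
```

Whether this actually shrinks a given binary depends on the compiler and
linker; the sketch only shows where the type-dependent boundary can be drawn.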

Anyway, I am very, very suspicious of absolute statements about the
quality of C++ templates coming from a shop which did not manage to get
them right for many years, or even attempt to implement them (export
aside) until very recently. My point is: nothing whatsoever proves that
those guys bashing C++ templates in the article know or understand what
C++ templates are.

I find it rather tiring that the center of the Solar system as well as the
inventor of the keynote keeps comparing VM based (not far from interpreted)
languages to C++. C++ is the language to write a portable VM in. Java and
C# are languages running on a portable VM.

In dynamic languages (such as Java and C#) it is perfectly possible to write
"self-modifying" code. Just like in CA-Clipper it was possible to create
completely new classes at runtime. What we see above is pretty close to this.
"We have postponed all generics bindings to runtime, so you have a huge
freedom to write your code". But this also means "we have postponed nearly
all error handling to runtime, so basically it is all up to you." Since it
is proven that software cannot be fully tested, this is a recipe for chaos(*).

Also, since the work is postponed to runtime (whether it needs to be or
not), it has an effect on speed. Furthermore, some of it cannot be done
only once. If I have a component loaded/released dynamically then I have to
generate the generics each time I re-load it. And even if I keep it in
memory I have to check if I have generated it or not (this might be very
cheap, actually zero overhead above the usual dynamic name binding).

Also the attempt to hide the complexity from the programmer ends up as a
nightmare when looking at real life products. CA-Clipper programmers used
LOCATE (sequential search) instead of SEEK (indexed search) because all the
complexity was hidden from them. Next thing was that people started to make
"query optimizing database drivers". Which is not bad, except that users
started to index all fields because "that will make all my queries fast" and
then it took ages to insert a record...

I am not saying that C++ strikes a good balance on how much of the
complexity is shown. But hiding has to have a limit. And automation has to
have a limit. Or a backdoor to good old manual (see professional cameras,
see my "proposal in adding resource handlers to gc").

But making people feel (via marketing and product documentation) that
playing ball with an egg or a softball or a piece of rock is the same is
simply irresponsible. Everything has a price and people who do not know
what they are doing or what things they trigger into action will just be
ignorant of the consequences, therefore unable to reliably create value.

I have one question I always wanted to ask of these guys who sell their
languages by bashing C++. I mean I wanted to ask the question so that they
can answer it for themselves. If C++ is so bad, why do the designers of
every new OO (or multiparadigm) language try so hard to prove that their
language is better? If C++ is so bad, why is it important to prove that
your product is better?

(*) Unless you have very experienced and disciplined programmers. But the
whole idea of such languages is that the programmer does not need to be
such, since the language is "safe".

--
WW aka Attila

Igor Ivanov

Oct 6, 2003, 7:36:39 AM

"Eugene Gershnik" <gers...@nospam.hotmail.com> wrote in message news:<EPudndK2YbN...@speakeasy.net>...

> Take a look at the
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx
>
> Is the "run-time type expansion causes less code bloat" argument given there
> a bogus one or there is something behind it?
>
> BTW my favorite BS pearl from this article is:
>
> "Of all the fringe benefits of run-time type expansion, my favorite is a
> somewhat subtle one. Generic code is limited to operations that are certain
> to work for any constructed instantiation of the type. The side effect of
> this restriction is that CLR generics are more understandable and usable
> than their C++ template counterparts."

Not BS but rather a mild statement of the fact that C++ templates are
used for the purpose for which they are poorly suited.

Igor

Glen Low

Oct 6, 2003, 8:43:58 AM

> Is the "run-time type expansion causes less code bloat" argument given there
> a bogus one or there is something behind it?

Compilers and linkers could do a lot to reduce this. By the ODR,
vector<int> should be the same anywhere in the program, and thus is a
case for the compiler or linker not to repeat the code thereof (not
really a limitation of the language per se).

> "Of all the fringe benefits of run-time type expansion, my favorite is a
> somewhat subtle one. Generic code is limited to operations that are certain
> to work for any constructed instantiation of the type. The side effect of
> this restriction is that CLR generics are more understandable and usable
> than their C++ template counterparts."

They've adopted the same route as GJ of using interface constraints on
the allowable types for their template parameters. This should improve
compile-time checking since the compiler can now verify the
correctness of the template vs. the interface, but ties their generic
implementation back to rigid type hierarchies, which was one of the
freeing things about C++ templates.
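
The trade-off can be sketched in C++ itself. This is only an illustration
with hypothetical names (Shape, Square, Rect, area_via_interface,
area_via_template): the interface version accepts only types that derive
from Shape, which is roughly the kind of constraint described above, while
the template version accepts any type that happens to have an area() member.

```cpp
// Constraint-by-interface style: the argument must derive from Shape.
struct Shape {
    virtual ~Shape() {}
    virtual double area() const = 0;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const { return side * side; }
};

inline double area_via_interface(const Shape& s) { return s.area(); }

// Unconstrained template style: Rect is unrelated to Shape, yet it works
// because the template only needs an area() member to exist.
struct Rect {
    double w, h;
    double area() const { return w * h; }
};

template <typename T>
double area_via_template(const T& t) { return t.area(); }
```

Passing a Rect to area_via_interface would be rejected at the call site,
whereas a type without area() passed to area_via_template is rejected only
inside the instantiated template body.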

Regards,
Glen Low, Pixelglow Software
www.pixelglow.com

Dietmar Kuehl

Oct 6, 2003, 9:42:31 PM

Eugene Gershnik wrote:

> Take a look at the
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx
>
> Is the "run-time type expansion causes less code bloat" argument given
> there a bogus one or there is something behind it?

Well, I think it depends on the generic code you write. If your class
depends essentially on generic constraints I doubt that you will be able
to be much better than C++ code: the code differs significantly between
each instantiation. On the other hand, if you have no constraints or
only non-generic ones, the generic code becomes essentially handling of
some pointers with fixed types plus a few casts. Of course, in both
cases C# and C++ can do essentially the same: nothing in the first and
delegate to a common implementation in the second case. It would also
be interesting to see the performance of complex generic algorithms
being compared to C++ implementations: With optimization turned on,
compilation of templates with heavy use of inline functions can take
ages but the resulting code could be blindingly fast. With the
parameterization (ie. effectively algorithm configuration) done at
run time this seems to ask for slow performance: either the optimizer
eats the time or the execution of the algorithm. Of course, if the
algorithm is called often after being optimized just once, it could be
similar to C++ templates.
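
To make the compile-time vs. run-time parameterization concrete, here is a
small C++ sketch. All names (AddOne, add_one, apply_n_static,
apply_n_dynamic) are hypothetical. It shows the two styles side by side and
says nothing about how the CLR's JIT actually treats generic code.

```cpp
#include <cassert>

// A function object: its call operator is visible to the compiler at the
// point of instantiation, so it can typically be inlined.
struct AddOne {
    int operator()(int x) const { return x + 1; }
};

inline int add_one(int x) { return x + 1; }

// Compile-time parameterization: one instantiation per functor type.
template <typename F>
int apply_n_static(F f, int x, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) x = f(x);
    return x;
}

// Run-time parameterization: one body serves every callback, and each step
// is an indirect call unless the optimizer can see through the pointer.
inline int apply_n_dynamic(int (*f)(int), int x, int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) x = f(x);
    return x;
}
```

Both functions compute the same result; the difference is only in how much
the optimizer knows about f when it compiles the loop.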

After answering your immediate question, a few thoughts in general:

From a first glance (I have just read the two articles and this is all the
information I have about C# generics) the constraints approach seems
to have benefits over the C++ approach. In particular, it effectively
avoids the whole name look-up issue (in particular argument dependent
lookup). On the other hand, I'm not yet sure whether the constraints
approach is itself constraining: in generic code it is important that
eg. return types can be dependent on the function argument types in a
way specific to the arguments. I'm not yet sure that this can be
expressed in generic interfaces.

I'm not sure whether I simply missed it but apparently [partial]
specialization and non-type arguments are not present with generics.
However, both are quite important tools, eg. for recursive templates:
often, there is a generic version for the general case but it builds
upon a special case. For example, an n-dimensional array can be seen
as an array of (n-1)-dimensional arrays - except for 'n == 1' in
which case it is simply an array of objects. But don't get distracted
by this example: there may be solutions to this particular issue but
the problem is much more general.
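
The n-dimensional array example can be sketched in C++ as follows. NdArray
is a hypothetical name, and for brevity the sketch assumes the same extent
in every dimension. It relies on exactly the two features in question: a
non-type argument (N) and a partial specialization for 'N == 1'.

```cpp
#include <cassert>
#include <cstddef>

// General case: an N-dimensional array is an array of (N-1)-dimensional ones.
template <typename T, std::size_t N, std::size_t Extent>
struct NdArray {
    typedef NdArray<T, N - 1, Extent> Row;
    Row rows[Extent];

    Row& operator[](std::size_t i) {
        assert(i < Extent);
        return rows[i];
    }
    static std::size_t element_count() {
        return Extent * Row::element_count();
    }
};

// Special case N == 1 ends the recursion: simply an array of objects.
template <typename T, std::size_t Extent>
struct NdArray<T, 1, Extent> {
    T items[Extent];

    T& operator[](std::size_t i) {
        assert(i < Extent);
        return items[i];
    }
    static std::size_t element_count() { return Extent; }
};
```

For instance, NdArray<int, 3, 2> holds 2 * 2 * 2 elements and can be indexed
as a[1][0][1]. The sketch does not guard against N == 0.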

In general, CLR generics seem to be less powerful than C++ templates.
On the other hand, this may be due to the audience of the articles I
have read: there is no point in confusing people unaware of generic
approaches with the finer but also essential points. I should look
for a more thorough description of CLR generics...
--
<mailto:dietma...@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>

Graham Batty

Oct 7, 2003, 11:27:17 AM

"Eugene Gershnik" <gers...@nospam.hotmail.com> wrote in message
news:EPudndK2YbN...@speakeasy.net...

> Take a look at the
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx
>
> Is the "run-time type expansion causes less code bloat" argument given there
> a bogus one or there is something behind it?

I suspect that this is actually largely a misunderstanding of the nature of
both types of generics. In particular, I think this person expects C#
generics to be entirely based on runtime polymorphism. As I understand it,
from the more technical descriptions presented in the article, this is not
the case. In fact, what C# is doing is something that a lot of people feel
C++ should do as well. That is, move template generation to the link phase
to avoid multiple compilation.

Of course, in .net, the link phase is in fact the JIT compiler phase. So
really, I don't know that there is really all that much more benefit
involved here in terms of code bloat (not where it counts, in terms of cache
hits and misses, anyways). What they've done is make generics an integral
part of the .net bytecode, and as such, the JIT compiler will do the work of
instantiating all uses of the template at run time, but not necessarily
DURING run time.

> BTW my favorite BS pearl from this article is:
>
> "Of all the fringe benefits of run-time type expansion, my favorite is a
> somewhat subtle one. Generic code is limited to operations that are certain
> to work for any constructed instantiation of the type. The side effect of
> this restriction is that CLR generics are more understandable and usable
> than their C++ template counterparts."

I don't know that this is BS. In fact, if I'm right above, it essentially
means that constraints were pretty much necessary. After all, as it stands,
you get pretty crappy errors on templates in C++ (as the author of this
article is overly fond of pointing out), but if those templates were
generated at link time, you would end up with far far worse errors. Not only
that, but because C# generics are bytecode themselves, they can really be
external linkage, and so you don't even have the advantage of the code,
which is where the errors would show up, being available in the source file
where the instantiation failed.

So constraints are not so much a feature as a patching misfeature here imo.
Not only that, but it severely limits the application of templates in
general. Your template arguments must provide a set interface (in the
COM/Java/C# sense) or be part of a particular class hierarchy. So in
essence, there really is no generic programming here. Truly, I think C#
generics really are only useful for containers, not for any real sense of
generic programming (in the Stepanov & Lee sense). A step forward, and it
will certainly make the language stronger, but from C++'s perspective, I
think it is in fact a step backwards. I hope that if C++ ever has built-in
constraints, they will be stronger than this in general.

Graham.

Daveed Vandevoorde

Oct 7, 2003, 3:08:21 PM

Dietmar Kuehl <dietma...@yahoo.com> wrote:
[...]

> I'm not sure whether I simply missed it but apparently [partial]
> specialization and non-type arguments are not present with generics.

[...]

That is correct. I was quite surprised when I was first shown
the C# generics syntax which reused the angle brackets that
C++ has had so many problems with. However, without the
possibility of explicit specialization, those angle brackets are
much less of a problem.

Like you say, coming from C++ this feels like a serious loss
of functionality, but when I brought it up, one of the C#
generics designers was convinced that explicit specialization
will never be needed (and as a consequence angle brackets
will never cause parsing trouble).

Daveed

Chris Perkins

Oct 7, 2003, 6:37:49 PM

Dietmar Kuehl

> approaches with the finer but also essential points. I should look
> for a more thorough description of CLR generics...

I thought this one was quite good:

http://research.microsoft.com/projects/clrgen/generics.pdf

Chris Perkins

Jan Bares

Oct 8, 2003, 2:20:41 AM

> Of course, in .net, the link phase is in fact the JIT compiler phase. So
> really, I don't know that there is really all that much more benefit
> involved here in terms of code bloat (not where it counts, in terms of
> cache hits and misses, anyways). What they've done is make generics an
> integral part of the .net bytecode, and as such, the JIT compiler will do
> the work of instantiating all uses of the template at run time, but not
> necessarily DURING run time.

Hi,

if I understand .NET correctly, code bloat will be reduced by the fact that
the instantiation will be shared across all managed modules in the same
address space. At least in Windows, any application is a set of many modules
(DLLs etc.); without instantiating templates at run time you cannot share
them.

Best regards, Jan

Dietmar Kuehl

Oct 8, 2003, 4:53:26 AM

Daveed Vandevoorde wrote:

> Dietmar Kuehl <dietma...@yahoo.com> wrote:
>> I'm not sure whether I simply missed it but apparently [partial]
>> specialization and non-type arguments are not present with generics.

> That is correct.

I investigated CLR generics more closely, essentially by reading the
paper by the designers and lots of information I found on the internet.
The result is rather disappointing! Things are *much* worse than I had
expected: Essentially, CLR generics are up to avoiding the excessive
use of casts when using containers and that's about it. There is
effectively no support for Generic Programming (which does not
necessarily mean that it is entirely impossible but it would be more
complex than necessary).

Here are the two most important missing features:

- As mentioned, specialization is completely missing - not to mention
partial specialization. This is effectively necessary to cope with
degenerate cases and/or to differentiate between different
capabilities of the parameter types.
- It is hard - if not impossible - to infer associated types, eg. the
return type of some function. However, this seems to be necessary
to spell out the constraints (well, I'm not entirely sure about this
because I haven't found any useful documentation of what is allowed
in the constraints).
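
For readers less familiar with the C++ side, here is a minimal sketch of
both features as they are commonly used there. The function names
(first_or_default, advance_steps, advance_steps_impl) are hypothetical. The
first function takes its return type from an associated type of the
container. The second selects an implementation from the iterator's
capabilities via tag dispatch on std::iterator_traits, which is itself
partially specialized for pointers in the standard library.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Associated type: the return type is inferred from the container.
template <typename Container>
typename Container::value_type first_or_default(const Container& c) {
    typedef typename Container::value_type value_type;
    return c.empty() ? value_type() : *c.begin();
}

// Random-access iterators can jump in a single step.
template <typename It>
long advance_steps_impl(It& it, long n, std::random_access_iterator_tag) {
    it += n;
    return n == 0 ? 0 : 1;
}

// Weaker iterators have to walk; returns the number of steps taken.
template <typename It>
long advance_steps_impl(It& it, long n, std::input_iterator_tag) {
    assert(n >= 0);
    for (long i = 0; i < n; ++i) ++it;
    return n;
}

// Picks one of the two versions above at compile time.
template <typename It>
long advance_steps(It& it, long n) {
    typedef typename std::iterator_traits<It>::iterator_category category;
    return advance_steps_impl(it, n, category());
}
```

With a std::vector iterator the call resolves to the one-step version; with
a std::list iterator it resolves to the walking version, because the
bidirectional tag converts to std::input_iterator_tag.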

The STL has been out there for eight years, there have been loads of
discussions on these issues in the C++ forum, etc. But this experience is
plainly ignored.

> I was quite surprised when I was first shown
> the C# generics syntax which reused the angle brackets that
> C++ has had so many problems with. However, without the
> possibility of explicit specialization, those angle brackets are
> much less of a problem.

Well, they definitely avoid the shift vs. double angle bracket problem
because there are no non-type template arguments and thus there is no
ambiguity. The issue with angle brackets and dependent names does
not really surface because it is necessary to provide constraints
which avoid the ambiguity here. I'm not sure I see the particular
problem with explicit specialization. ... but then I'm not sure what
you are referring to exactly: I can picture full specialization or
explicit instantiation but I'm not sure whether either of these terms
is exact.
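
A short C++ sketch of the two parsing issues, written in C++03 style. The
names Int, sum and nested_total are hypothetical. It shows the space needed
between closing angle brackets, the parentheses that keep '>>' a shift
inside a non-type argument, and the 'typename' needed for dependent names.

```cpp
#include <cstddef>
#include <vector>

// Non-type argument: once these exist, '>' and '>>' inside an argument list
// may be operators rather than closing brackets.
template <int N>
struct Int {
    enum { value = N };
};

// Dependent names: without 'typename' the parser has to assume that
// Container::value_type does not name a type.
template <typename Container>
typename Container::value_type sum(const Container& c) {
    typename Container::value_type total = typename Container::value_type();
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        total += *it;
    return total;
}

inline int nested_total() {
    // C++03 needs the space in '> >' so it is not lexed as a shift.
    std::vector<std::vector<int> > rows(2, std::vector<int>(3, 1));
    int total = 0;
    for (std::size_t i = 0; i < rows.size(); ++i) total += sum(rows[i]);
    // The parentheses keep '>>' a shift inside the argument list.
    return total + Int<(8 >> 2)>::value;
}
```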

> Like you say, coming from C++ this feels like a serious loss
> of functionality, but when I brought it up, one of the C#
> generics designers was convinced that explicit specialization
> will never be needed (and as a consequence angle brackets
> will never cause parsing trouble).

Did any of the C# generics designers ever picture using generics for
more interesting uses than type safe containers? ... and did any of
the C# generics designers ever try to do any form of Generic
Programming? Unfortunately, it is not really only a problem with C#
but it extends to the whole CLR: types are encoded in MSIL and hence
there is no support for specialization or type inference.

--
<mailto:dietma...@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>

Dietmar Kuehl

Oct 8, 2003, 4:54:41 AM

Chris Perkins wrote:
> Dietmar Kuehl
>> approaches with the finer but also essential points. I should look
>> for a more thorough description of CLR generics...
>
> I thought this one was quite good:
>
> http://research.microsoft.com/projects/clrgen/generics.pdf

Yes, it seems to give a fairly complete description of CLR generics -
and I was really disappointed. Unfortunately, the article does not really
cover the finer points like what can go into the constraints. But my
expectations are low: they did a good job in preventing use of CLR
generics for Generic Programming.

--
<mailto:dietma...@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>

Dave Boyle

Oct 8, 2003, 2:10:18 PM

> Yes, it seems to give a fairly complete description of CLR generics -
> and I was really disappointed. Unfortunately, the article does not really
> cover the finer points like what can go into the constraints. But my
> expectations are low: they did a good job in preventing use of CLR
> generics for Generic Programming.

At the risk of sounding provocative, did anyone expect generics to
support hard-core, Alexandrescu-style template usage? Such support
wouldn't seem consonant with the rest of the design philosophy of the
.NET Framework.

Cheers,

Dave

Dietmar Kuehl

Oct 9, 2003, 4:48:48 PM

david...@ed.tadpole.com (Dave Boyle) wrote:
> At the risk of sounding provocative, did anyone expect generics to
> support hard-core, Alexandrescu-style template usage?

I suppose this is a rhetorical question but it still asks for a reaction:
did anyone *want* generics to support hard-core, Alexandrescu-style
template usage? This stuff is, no doubt, way cool and allows for things
we surely want to take advantage of in C++. C# and the CLR are, however,
a different platform with different techniques and often different goals.
Although I personally enjoy doing cool stuff (well, in some sense
everything I'm doing is "cool" work due to my name :-) the primary
objective is to get the task done. I don't think that advanced template
techniques are necessary for the context of C# and the CLR. However, some
stuff is indeed crucial - and apparently absent.

On the other hand, I have to admit that I'm personally not really that
much interested in creating configurable data structures which seems to
be the primary focus of Alexandrescu and Eisenecker/Czarnecki. My focus
is on data structure independent algorithms where "algorithm" refers
effectively to any kind of operation: there is no point in investing in
highly configurable data structures since the crucial knowledge lies in
the algorithms. Data structures are just tossed together... (of course,
I know that there is considerable amount of effort in creating some data
structures but my focus is not at all on this stuff). My personal view
point is that object oriented techniques are well up to data structure
configuration (possibly at the cost of a little bit of performance) but
not at all for algorithms stuff.

> Such support wouldn't seem consonant with the rest of the design
> philosophy of the .NET Framework.

I think that an appropriate set of meta programming facilities could
create a simple to use environment where all the stuff Alexandrescu
addresses is also possible. How this looks exactly is, however, still
unclear and will probably take still some years of experimentation and
research. After all, meta programming facilities and generic
programming have become popular only relatively recently and are still
immature. Using a pragmatic approach for the .Net framework means to
select a set of easy to use facilities which have a reasonable benefit.
The question is where to draw the line: this is a trade-off between
complexity and benefit. My personal impression is that the current
scope is too narrow to reach beyond the trivial examples and a little
bit more complexity would yield huge benefit. To me it seems that the
team designing C# generics has a strong object oriented background but
no real experience with generic programming: what they are currently
discussing has similar goals and limitations as the first C++ templates
I have seen. This yields some benefit (primarily adding a little bit of
type safety and avoiding some casts) but it is not yet the huge step
forward. However, to make a huge step forward, I think only a tiny
little piece is missing: the capability to infer associated types (what
is clumsily done with nested typedefs in C++). This would not yet yield
the meta programming level we got in C++, it would still be more limited
but it would open the possibility to do quite powerful stuff within the
CLR.

--
<mailto:dietma...@yahoo.com> <http://www.dietmar-kuehl.de/>
Phaidros eaSE - Easy Software Engineering: <http://www.phaidros.com/>

Glen Low

Oct 10, 2003, 12:22:47 PM

> However, to make a huge step forward, I think only a tiny
> little piece is missing: the capability to infer associated types (what
> is clumsily done with nested typedefs in C++). This would not yet yield
> the meta programming level we got in C++, it would still be more limited
> but it would open the possibility to do quite powerful stuff within the
> CLR.

In practice you would probably use some combination of .NET reflection
and generics to get what you want. It's awkward because it would be
more runtime-based than C++ and thus more prone to errors, but it may
still be serviceable.

For example, in C++:

class outer
{
public:
    typedef char inner;
};

outer::inner x;

but in C#:

public class outer
{
    public static Type inner { get { return typeof (char); } }
}

object x = Activator.CreateInstance (outer.inner); // creates an instance from the reflected Type

With the advent of generics, you probably could hide some of the root
object / casting shenanigans behind a generic interface.

If the metadata support for generics is good (which is what they
claim) and the reflection support for generics is good (my inference),
you can work out something. Full specialization could be achieved
awkwardly by the code checking for the reflected Type and shunting
calls to different implementation objects (a la handle/body idiom or
one of the GoF behavior patterns).

I wonder how .NET generics would handle the curiously recursive
inheritance idiom so beloved of their ATL:

public interface X <T> { ... }

public class Y: X <Y> { ... } // no such thing as a fwd declaration in C#...
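
For comparison, a minimal C++ sketch of the idiom as ATL-style code uses it.
Doubler and value() are hypothetical names; Y mirrors the class above. The
base class template is parameterized on the class that derives from it, so
it can call into the derived class without virtual functions.

```cpp
// Base class template parameterized on the class deriving from it.
template <typename Derived>
class Doubler {
public:
    int doubled() const {
        // Valid because Derived is required to inherit from Doubler<Derived>.
        return 2 * static_cast<const Derived*>(this)->value();
    }
};

// Y names itself as the template argument of its own base class.
class Y : public Doubler<Y> {
public:
    explicit Y(int v) : v_(v) {}
    int value() const { return v_; }

private:
    int v_;
};
```

Here Y(7).doubled() resolves the call to Y::value() at compile time.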

Cheers,

Glen Low, Pixelglow Software
www.pixelglow.com

Andrei Alexandrescu

Oct 15, 2003, 8:43:45 AM

"Jan Bares" <jan....@antek.cz.no.spam> wrote in message
news:bm099l$1dgo$1...@ns.felk.cvut.cz...

> if I understand .NET well, the code bloat will be reduced by the fact, that
> the instantiation will be shared across all managed modules in the same
> address space. At least in Windows any application is set of many modules
> (DLL's etc), without instantiating templates at run time you cannot share
> them.

That is correct. But it's all a tradeoff. Separate instantiation for each
type does produce code bloat, but it's often more efficient.

Actually an impressive system is Pizza
(http://pizzacompiler.sourceforge.net/), which, through a compiler switch,
would select (over the same syntax!) between what they call "homogeneous"
and "heterogeneous" translations. It's sad Pizza didn't make it in Java and
GJ did.

Andrei

Aaron Bentley

Nov 1, 2003, 5:26:20 AM

Eugene Gershnik wrote:

> Take a look at the
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx
>
> Is the "run-time type expansion causes less code bloat" argument given there
> a bogus one or there is something behind it?

No, it's true.

Every time you instantiate a template, you cause the compiler to
generate code for that type. So square<float>(x) and square<int>(x)
might as well be separate handcoded square_float() and square_int()
functions. Templates allow you to generate a lot of code automatically,
so you can wind up wasting space if you're careless.
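
Spelled out as a sketch (the square template and the two handcoded
functions are written here only for illustration):

```cpp
// One template...
template <typename T>
T square(T x) { return x * x; }

// ...whose two instantiations are equivalent to these handcoded functions.
inline int square_int(int x) { return x * x; }
inline float square_float(float x) { return x * x; }
```

Calling square<int>(3) and square<float>(1.5f) makes the compiler generate
two separate function bodies, just as if both handcoded versions had been
written out.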

> BTW my favorite BS pearl from this article is:
>
> "Of all the fringe benefits of run-time type expansion, my favorite is a
> somewhat subtle one. Generic code is limited to operations that are certain
> to work for any constructed instantiation of the type. The side effect of
> this restriction is that CLR generics are more understandable and usable
> than their C++ template counterparts."

It's like saying a pen is more understandable and usable than a laser
printer.

Some of the more mind-bending stuff in template metaprogramming is
possible because only successful expansions can be used. Like having
multiple versions of a function where the version is selected based on
type traits.
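
A minimal sketch of that kind of selection. It assumes the later
standardized std::enable_if and <type_traits> for brevity (in 2003 one would
hand-roll the equivalent or take it from a library), and describe is a
hypothetical name. An overload whose return type fails to expand is dropped
without an error, so only the matching version remains.

```cpp
#include <string>
#include <type_traits>

// Participates in overload resolution only for integral types.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
describe(T) { return "integral"; }

// Participates in overload resolution only for floating-point types.
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
describe(T) { return "floating point"; }
```

Calling describe with a type that is neither integral nor floating point is
a compile-time error, because no version expands successfully.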

C# generics seem to be in some strange twilight realm between generic
programming and polymorphism.

Aaron

--
Aaron Bentley
www.aaronbentley.com

Troll_King

Nov 2, 2003, 11:52:09 AM

"Eugene Gershnik" <gers...@nospam.hotmail.com> wrote in message news:<EPudndK2YbN...@speakeasy.net>...

> Take a look at the
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx
>
> Is the "run-time type expansion causes less code bloat" argument given there
> a bogus one or there is something behind it?
>
> BTW my favorite BS pearl from this article is:
>
> "Of all the fringe benefits of run-time type expansion, my favorite is a
> somewhat subtle one. Generic code is limited to operations that are certain
> to work for any constructed instantiation of the type. The side effect of
> this restriction is that CLR generics are more understandable and usable
> than their C++ template counterparts."

This question doesn't have a very stable context because Standard C++
is a programming language and .Net is a program. I think that the C++
standard is a guideline for compiler writers, but it does not say that you
have to write the implementation only one way; the issue is left open
to the implementor.

I believe that .Net and Java have their purpose, and that purpose is
related to business software solutions that leverage the vendor's
research and development through reuse. This is not suitable for the
needs of a system implementor who needs to work with a lightweight
language, but it is the current implementation that a solutions implementor
using .Net on the Microsoft product will be forced to use. Hopefully
it is a good implementation, but even if it weren't, what choice would
you have?

Markus Werle

Nov 5, 2003, 5:12:01 PM

Eugene Gershnik wrote:

> Take a look at the
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx
>
> Is the "run-time type expansion causes less code bloat" argument given
> there a bogus one or there is something behind it?

Well, it's perhaps true for the Microsoft compiler (though I doubt this),
but I cannot see a real code bloat with Intel's C++.
Actually, using Inter-Procedural Optimization (optimizes during linkage)
not only destroys _any_ intermediate symbol, but sometimes also
optimizes away some calls to library functions if linked statically
(which in one case cuts down the runtime from 45 minutes to 1.6
seconds for some part of my code)

The rant about code bloat is nothing but more historical
garbage from worse times which are gone by now.

Markus

--

Build your own Expression Template Library with Daixtrose!
Visit http://daixtrose.sourceforge.net/

elefant.alba.dp.ua

Nov 8, 2003, 8:30:49 AM

This may not be On Topic for this newsgroup but for the sake of completeness
to this thread, there is a much more extensive summary of C# generics
available at:
http://msdn.microsoft.com/vcsharp/default.aspx?pull=/library/en-us/dv_vstechart/html/csharp_generics.asp

--Dilip

"Markus Werle" <numerical....@web.de> wrote in message
news:bob8gt$1cc115$1...@ID-153032.news.uni-berlin.de...

> Eugene Gershnik wrote:
>
> > Take a look at the
> > http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx
> >
> > Is the "run-time type expansion causes less code bloat" argument given
> > there a bogus one or there is something behind it?
>
> Well, it's perhaps true for the microsoft compiler (though I doubt this),
> but I cannot see a real code bloat with Intel's C++.
> Actually, using Inter-Procedural Optimization (optimizes during linkage)
> not only destroys _any_ intermediate symbol, but sometimes also
> optimizes away some calls to library functions if linked statically
> (which in one case cuts down the runtime from 45 minutes to 1.6
> seconds for some part of my code)


Scott Meyers

Nov 9, 2003, 6:42:20 AM

On 5 Nov 2003 17:12:01 -0500, Markus Werle wrote:

> Eugene Gershnik wrote:
> Well, it's perhaps true for the microsoft compiler (though I doubt this),
> but I cannot see a real code bloat with Intel's C++.
> Actually, using Inter-Procedural Optimization (optimizes during linkage)
> not only destroys _any_ intermediate symbol, but sometimes also
> optimizes away some calls to library functions if linked statically
> (which in one case cuts down the runtime from 45 minutes to 1.6
> seconds for some part of my code)
>
> The rant about code bloat is nothing but another historical
> garbage from worser times which have gone by now.

Not according to what I hear from developers and not according to what I
read from researchers. I'm working more and more with developers of
embedded systems, people who instinctively look into any increase in the
size of their programs (even when those sizes may be very large, e.g., many
megabytes), and they tell me that use of templates increases the size of
their programs. This is often their conclusion from experimenting with the
STL, which they'd like to use, but they find that it increases the size of
their programs too much. These people worry about template-induced code
bloat not because of experience they had five years ago with bad compilers
and linkers or because of random rumors they heard from people who may or
may not have actually used templates, but because they have tried them
themselves within the last year or two, and their empirical observation was
that using templates generally leads to bloated code.

On the research front, the phenomenon of "post-pass code optimization" has
yielded evidence that templates DO lead to code bloat (at least on the
tested platforms) in the form of literally identical binary code segments.
These identical code segments are not always complete functions, however.
For example, suppose we have this template:

template<typename T>
class MyClass {
    ...
private:
    static int x;
};

And suppose that on a machine where int and long are the same size, we
instantiate MyClass for both int and long. At first glance, we'd expect
the instantiations of MyClass's member functions to be identical at the
binary level for the two types, but note that MyClass has a static data
member. That means that any reference to MyClass::x will differ between
MyClass<int> and MyClass<long>, and, assuming that each member function of
MyClass refers to x, that means that *none* of the implementations of
MyClass's member functions can be shared across the two instantiations.
(As an aside, my understanding of .NET generics is that this is a case
where they would achieve code sharing across instantiations, because the
generated code would explicitly note that only references to the static
need to change across instantiations.)
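
A minimal sketch of the situation just described, under stated assumptions: the members bump() and address_of_x() are hypothetical stand-ins for the elided "..." in the template, and demo_distinct_statics() exists only to exercise them. It shows that each instantiation owns its own static, which is why code that touches x cannot be byte-identical across instantiations.

```cpp
#include <cassert>

// Sketch only: bump() and address_of_x() are hypothetical members standing in
// for the elided "..." in the template above.
template <typename T>
class MyClass {
public:
    // Reads and writes the per-instantiation static, so the code generated for
    // MyClass<int>::bump and MyClass<long>::bump refers to different addresses.
    static int bump() { return ++x; }

    static const int* address_of_x() { return &x; }

private:
    static int x;
};

// One definition per instantiation: MyClass<int>::x and MyClass<long>::x
// are separate objects.
template <typename T>
int MyClass<T>::x = 0;

// Usage sketch: the two statics live at different addresses and count
// independently of each other.
inline void demo_distinct_statics() {
    assert(MyClass<int>::address_of_x() != MyClass<long>::address_of_x());
    int first_int = MyClass<int>::bump();
    int first_long = MyClass<long>::bump();
    assert(MyClass<int>::bump() == first_int + 1);
    assert(MyClass<long>::bump() == first_long + 1);
}
```

Calling demo_distinct_statics() from main() is enough to try it. Whether a given linker can still share the rest of each function body is exactly the open question here; the sketch only demonstrates the differing addresses.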

Anyway, my point here is that the assertion that "The rant about code bloat is
nothing but another historical garbage from worser times which have gone by
now" is not supported by the facts, at least not in my experience.

You can find more information on post-pass code optimization here; I found
this stuff really interesting:

Bruno De Bus et al., "Post-Pass Compaction Techniques," Communications of
the ACM, August 2003.

Papers at http://www.elis.ugent.be/~brdsutte/squeeze++/pubs.html,
including:

Bjorn De Sutter et al., "Sifting out the Mud: Low Level C++ Code
Reuse," OOPSLA 2002.

Bjorn De Sutter et al., "On the Side-Effects of Code Abstraction,"
LCTES 2003.

As an aside, the reference in the last paper to "code abstraction" is
really talking about un-inlining functions from the binary. One of the
experiments they run is to profile the code for hot spots, then feed that
information back into a tool that un-inlines only the infrequently called
functions. The end result in some of their experiments are programs that
run faster and are about half as big. One could thus argue that the
original optimized programs (yes, they enable full compiler optimizations)
were bloated by a factor of about 100%, though one can't necessarily blame
that bloat on templates. (It depends on where the "excessive" inlining
came from.)

The experiments are run on gcc-based systems that seem to use a template
instantiation repository and that don't seem to eliminate duplicates
terribly well during linking, so I think that their results would be less
dramatic on other platforms, but the example of the template that refers to
a static was an enlightening example to me of how C++ templates can lead to
bloat and how .NET generics can avoid it (for that particular case).

There are still lots of people who would like to make more use of templates
in general and the STL in particular, but who, in spite of their using
contemporary compilers and linkers, find that the general use of templates
DOES bloat their code. These people would love to find a solution to this
problem. So would I. If you have ideas on how to go about doing it,
please post. Before doing so, you might want to review the thread at
http://tinyurl.com/u7sf, as that is a discussion of this issue from last
year that is, I suspect, still pretty relevant.

Scott

Glen Low

Nov 9, 2003, 9:15:41 AM

> Actually, using Inter-Procedural Optimization (optimizes during linkage)
> not only destroys _any_ intermediate symbol, but sometimes also
> optimizes away some calls to library functions if linked statically
> (which in one case cuts down the runtime from 45 minutes to 1.6
> seconds for some part of my code)

If I understand it correctly, IPO is an object-code version of
inlining.

The question is: is inlining better than IPO? Is there anything that
is accomplished by source code inlining that cannot be done with
object code IPO? Are there any/many C++ compilers/linkers implementing
IPO?

If IPO is indeed that powerful, we might be able to ship object code
libraries with most or all of the power of source code libraries.

Cheers,
Glen Low, Pixelglow Software
www.pixelglow.com


Glen Low

Nov 10, 2003, 4:42:03 AM

To some extent, the code bloat could be ameliorated by adopting a
variation of Coplien's indirect template technique. IMHO, the typical
Standard C++ Library implementations don't do this enough.

1. Put all the non-inlineable, shareable code in a non-template base
class.
2. Put all the inlineable code in the actual template class.
3. The subclass interfaces to the base class through void*.

The end result is something closely approximating .NET generics or
Java type erasures. It's more manual but more flexible, since the
implementor can choose whether a function ought to be inlined (and
possibly duplicated between template instantiations) or not.
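
The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: PtrStackBase, PtrStack and their members are hypothetical names, and the base-class members are defined in-class just to keep the sketch in one piece (in a real library they would live in a single .cpp file so they are compiled once).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Step 1: non-template base holding the shareable code. It only ever
// sees void*, so there is one copy regardless of how many element types
// are used. (Hypothetical class; bodies would normally be out of line.)
class PtrStackBase {
public:
    std::size_t size() const { return items_.size(); }

protected:
    void push_raw(void* p) { items_.push_back(p); }

    void* pop_raw() {
        assert(!items_.empty());
        void* p = items_.back();
        items_.pop_back();
        return p;
    }

private:
    std::vector<void*> items_;
};

// Steps 2 and 3: the template layer contains only trivially inlineable
// forwarding code and talks to the base through void*.
template <typename T>
class PtrStack : private PtrStackBase {
public:
    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
    using PtrStackBase::size;
};
```

With this layout PtrStack<int> and PtrStack<double> differ only in a static_cast, which an optimizer would typically reduce to nothing, while the container logic itself is shared by hand rather than by the linker.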

Cheers,
Glen Low, Pixelglow Software
www.pixelglow.com


Markus Werle

Nov 10, 2003, 2:40:04 PM

Scott Meyers wrote:

> On 5 Nov 2003 17:12:01 -0500, Markus Werle wrote:
> > Eugene Gershnik wrote:
> > Well, it's perhaps true for the microsoft compiler (though I doubt
> > this), but I cannot see a real code bloat with Intel's C++.
> > Actually, using Inter-Procedural Optimization (optimizes during
> > linkage) not only destroys _any_ intermediate symbol, but sometimes
> > also optimizes away some calls to library functions if linked
> > statically (which in one case cuts down the runtime from 45 minutes to
> > 1.6 seconds for some part of my code)
> >
> > The rant about code bloat is nothing but another historical
> > garbage from worser times which have gone by now.
>
> Not according to what I hear from developers and not according to what I
> read from researchers. [...]

Hmm. Hmm.

> On the research front, the phenomenon of "post-pass code optimization" has
> yielded evidence that templates DO lead to code bloat (at least on the
> tested platforms) in the form of literally identical binary code segments.

So the post-post-pass can find those and fold the code?
Do you know whether the best 3 compilers in the market
do already have such technology builtin or not?

> These identical code segments are not always complete functions, however.
> For example, suppose we have this template:
>
> template<typename T>
> class MyClass {
> ...
> private:
> static int x;
> };

I am a bit confused:
I always thought that the compiler is allowed to do _anything_
it likes as long as the executable yields the requested results.
So after a
icpc -static -o my.exe -ipo [all my object files]
I do not even _expect_ any entity MyClass<int>::x or others
to exist anymore.

Do I have a wrong view on this?

> And suppose that on a machine where int and long are the same size, we
> instantiate MyClass for both int and long. [...]

As long as MyClass<int>::x and MyClass<long>::x have different semantics
I'd prefer to blame the semantics and not the templates for the code bloat.
If You want those to be linked to the same memory snippet, I cannot see
any problem here. A little redesign will clearly ensure exactly this.
This is what traits were invented for, right?
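
One possible shape for such a redesign, as a sketch only: it uses a plain non-template base class rather than a traits class, and MyClassShared, bump() and address_of_x() are hypothetical names. Note that it deliberately changes the semantics, since every instantiation now sees the same x.

```cpp
#include <cassert>

// Hypothetical non-template holder: a single x for all instantiations.
struct MyClassShared {
    static int x;
};
int MyClassShared::x = 0;

template <typename T>
class MyClass : private MyClassShared {
public:
    // Both MyClass<int>::bump and MyClass<long>::bump now refer to the
    // same address, so nothing in their bodies depends on T any more.
    static int bump() { return ++x; }

    static const int* address_of_x() { return &x; }
};

// Usage sketch: one shared static, one shared count.
inline void demo_shared_static() {
    assert(MyClass<int>::address_of_x() == MyClass<long>::address_of_x());
    int n = MyClass<int>::bump();
    assert(MyClass<long>::bump() == n + 1);
}
```

This only helps when a shared static is what the design wants. If each instantiation really needs its own x, the redesign does not apply.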

(Also I fail to like _any_ code which uses statics unless it's compile
time calculation, because the management might have the idea to sell the
code in a multithreaded version and then the hell breaks loose. :-))

> Anyway, my point here is that assertions that "The rant about code bloat
> is nothing but another historical garbage from worser times which have
> gone by now" is not supported by the facts, at least not in my experience.

Maybe I was way too optimistic about the "gone by now".
Let me retry:

Templates are not responsible for code bloat, but rather
the fact that

a) ... most compilers have only rudimentary
support for dealing correctly/perfectly with templates.

b) ... templates are used unwisely.

From a theoretical point of view I cannot believe
that anything that can be optimized by human hands
will stay out of reach of existing or future compilers,
be it on source code level or during linkage.

My own experience with "vanishing symbols" makes me
think that templates are no problem for my favourite
compiler and that code bloat can be avoided either
by intelligent use of templates or high performance compilers.

Maybe I should not have used "rant", but I find it
dangerous to make a design decision based on today's
compilers which hinders 80% of what C++ can offer.

I can imagine implementations of the STL which take care
of the tight memory situation.

I agree with you that testing existing STLs on embedded systems
might bring tears to the eyes.

> You can find more information on post-pass code optimization here; I
> found this stuff really interesting:
>
> Bruno De Bus et al., "Post-Pass Compaction Techniques," Communications
> of the ACM, August 2003.
>
> Papers at http://www.elis.ugent.be/~brdsutte/squeeze++/pubs.html,
> including:
>
> Bjorn De Sutter et al., "Sifting out the Mud: Low Level C++ Code
> Reuse," OOPSLA 2002.
>
> Bjorn De Sutter et al., "On the Side-Effects of Code Abstraction,"
> LCTES 2003.
>
> As an aside, the reference in the last paper to "code abstraction" is
> really talking about un-inlining functions from the binary. One of the
> experiments they run is to profile the code for hot spots, then feed that
> information back into a tool that un-inlines only the infrequently called
> functions.

Which sounds like what today's compilers offer ...

> The end result in some of their experiments are programs that
> run faster and are about half as big.

Which is, in the end, a very strong argument _for_ templates:
That compiler nearly made it. A factor of two might well be triggered
by a tiny little code change or by switching compiler _revisions_.

> One could thus argue that the
> original optimized programs (yes, they enable full compiler optimizations)
> were bloated by a factor of about 100%, though one can't necessarily blame
> that bloat on templates. (It depends on where the "excessive" inlining
> came from.)

As I said above: one can't blame that bloat on templates but on the
compiler in use. I think most C++ compilers will include such a
post-processing step if this is not already true for some of them.

> The experiments are run on gcc-based systems that seem to use a template
> instantiation repository and that don't seem to eliminate duplicates
> terribly well during linking, so I think that their results would be less
> dramatic on other platforms, but the example of the template that refers
> to a static was an enlightening example to me of how C++ templates can
> lead to bloat and how .NET templates can avoid it (for that particular
> case).
>
> There are still lots of people who would like to make more use of
> templates in general and the STL in particular, but who, in spite of their
> using contemporary compilers and linkers, find that the general use of
> templates
> DOES bloat their code.

Maybe they have not read "Effective STL" yet ;-)

I mean, after reading the book by the man himself,
my code bloat was immediately and significantly reduced, and the
speedup was incredible. (Thanks, btw.)

> These people would love to find a solution to this
> problem. So would I.

I am still convinced: this problem - if it is one - is gone pretty soon.

1. Embedded systems will be shipped with more main memory, no end in sight.

2. If You are in need of an embedded STL, how about testing
http://www.syncdata.it/stlce/stldownload.html
or ask Andrei or the Boost people to write You one within a week or so
(grin)

> If you have ideas on how to go about doing it,
> please post. Before doing so, you might want to review the thread at
> http://tinyurl.com/u7sf, as that is a discussion of this issue from last
> year that is, I suspect, still pretty relevant.

To cite Francis Glassborow (2002-05-17)

"The point that needs to be made is that in almost all cases where
templates are accused of creating code bloat the fault is either a
compiler that fails in its responsibilities, or a programmer who is
working beyond their skills."

I'd like to add: ... or a template library which was written
without embedded systems in mind.

Premature optimization always is a tradeoff between memory consumption
and speed.

Maybe I should excuse myself: I use Expression Templates to generate code.
(see e.g. Daixtrose).
So most of the time when I write that nice keyword "template", this is a
strong hint to the compiler that I'd like _not_ to see the name that
follows this keyword in the set of visible symbol names lurking in the
executable.

This works pretty well and the performance issues _I_ have come
from allocation/reallocation or imperfect memory layout or
simply have the reason that I am a C++ beginner and stupid.
I never encountered templates as the problem,
so maybe my point of view is a little bit unrepresentative.

Markus

--

Build your own Expression Template Library with Daixtrose!
Visit http://daixtrose.sourceforge.net/


P.J. Plauger

Nov 11, 2003, 5:18:20 AM

"Markus Werle" <numerical....@web.de> wrote in message
news:boo5m1$1g8m98$1...@ID-153032.news.uni-berlin.de...

> Templates are not responsible for code bloat, but rather
> the fact that
>
> a) ... most compilers have only rudimentary
> support for dealing correctly/perfectly with templates.

So we blame the implementors...

> b) ... templates are used unwisely.

and the users...

> From a theoretical point of view I cannot believe
> that anything that can be optimized by human hands
> will stay out of reach of existing or future compilers,
> be it on source code level or during linkage.

and look to the future...

> My own experience with "vanishing symbols" makes me
> think that templates are no problem for my favourite
> compiler and that code bloat can be avoided either
> by intelligent use of templates or high performance compilers.
>
> Maybe I should not have used "rant", but I find it
> dangerous to make a design decision based on today's
> compilers which hinders 80% of what C++ can offer.

And I find it dangerous to promulgate a language standard
that ignores the limitations of existing technology.

> .....

> I am still convinced: this problem - if it is one - is gone pretty soon.
>
> 1. Embedded systems will be shipped with more main memory, no end in
> sight.

But *small* embedded systems will continue to be important. Perhaps
the mean size of desktop applications has grown over the years, but for
embedded systems what we see instead is an increasing *range* of useful
sizes.

> 2. If You are in need of an embedded STL, how about testing
> http://www.syncdata.it/stlce/stldownload.html
> or ask Andrei or the Boost people to write You one within a week or so
> (grin)

Or you can get a commercially supported version prepackaged for CE,
if you can't afford free software.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

andrew queisser

Nov 12, 2003, 4:16:17 AM

Markus Werle <numerical....@web.de> wrote in message news:<boo5m1$1g8m98$1...@ID-153032.news.uni-berlin.de>...
> 1. Embedded systems will be shipped with more main memory, no end in sight.

Yes, but as soon as the current generation of embedded systems has
enough memory to allow the programmer some leeway the next generation
of ever-lower cost embedded systems pops up. That generation might
only have 4K of memory.

Andrew

Markus Werle

Nov 12, 2003, 3:36:51 PM

P.J. Plauger wrote:

> "Markus Werle" <numerical....@web.de> wrote in message
> news:boo5m1$1g8m98$1...@ID-153032.news.uni-berlin.de...

> [...]

>> Maybe I should not have used "rant", but I find it
>> dangerous to make a design decision based on today's
>> compilers which hinders 80% of what C++ can offer.
>
> And I find it dangerous to promulgate a language standard
> that ignores the limitations of existing technology.

I think this is exactly the point that is debatable.
What are indeed the limitations of existing technology and
even more important: how long will they last?

Looking at the development of C++ compilers during
the past 4 years, I see not only rapid convergence towards the
ISO standard (yes, they all tend to eat my stuff now :-),
but also an incredible acceleration in the performance
of both the compiler itself and the executable.

Today's compilers already show strongly nonlinear
behaviour with regard to executable performance.
Small changes in the code yield (often unexpected)
large changes in runtime performance, and even good
rules of thumb tend to become invalid in certain
circumstances.

If we think in terms of five years, a simple extrapolation
of what we have today makes me think that
the rule of thumb "Templates yield code bloat" is so
inexact that it should not be held high anymore.

Markus

--

Build your own Expression Template Library with Daixtrose!
Visit http://daixtrose.sourceforge.net/


Mogens Hansen

Nov 13, 2003, 3:45:38 AM

"Scott Meyers" <Use...@aristeia.com> wrote in message
news:MPG.1a16e95d9...@news.hevanet.com...

[8<8<8<]

> On the research front, the phenomenon of "post-pass code optimization" has
> yielded evidence that templates DO lead to code bloat (at least on the
> tested platforms) in the form of literally identical binary code segments.
> These identical code segments are not always complete functions, however.

That could be hard for the compiler/linker to merge.
How about manually splitting the functions into subfunctions, to help the
compiler/linker ?

> For example, suppose we have this template:
>
> template<typename T>
> class MyClass {
> ...
> private:
> static int x;
> };
>
> And suppose that on a machine where int and long are the same size, we
> instantiate MyClass for both int and long. At first glance, we'd expect
> the instantiations of MyClass's member functions to be identical at the
> binary level for the two types, but note that MyClass has a static data
> member. That means that any reference to MyClass::x will differ between
> MyClass<int> and MyClass<long>, and, assuming that each member function of
> MyClass refers to x, that means that *none* of the implementations of
> MyClass's member functions can be shared across the two instantiations.

That should be pretty easy to solve manually:

template <typename T>
class MyClass {
public:
    void Func()
    { FuncImpl(x_); }

private:
    void FuncImpl(int& x)
    {
        // ...
    }

    static int x_;
};

Now MyClass<int>::FuncImpl and MyClass<long>::FuncImpl might be fully binary
identical and thus can be folded into one function, since they don't
directly depend on the static member.

This code might be a _little_ slower, since x_ is accessed indirectly.
But that seems like a fair engineering trade-off to me: size for speed.

As an experiment I compiled the following code
<C++ code>
#include <typeinfo>
#include <iostream>

template <typename T>
class foo
{
public:
    foo(int i_value);

    static void print(std::ostream& os);

private:
    static void init(const std::type_info& ti, int& i, int i_value);
    static void print(const std::type_info& ti, std::ostream& os, int& i);

private:
    static int i_;
};

template <typename T>
int foo<T>::i_;

template <typename T>
foo<T>::foo(int i_value)
{
    init(typeid(T), i_, i_value);
}

template <typename T>
void foo<T>::init(const std::type_info& ti, int& i, int i_value)
{
    i = i_value;

    std::cout << "foo<" << ti.name() << ">::init(" << i_value << ");"
              << std::endl;
}

template <typename T>
inline void foo<T>::print(std::ostream& os)
{
    print(typeid(T), os, i_);
}

template <typename T>
void foo<T>::print(const std::type_info& ti, std::ostream& os, int& i)
{
    os << "foo<" << ti.name() << ">::print(" << i << ");" << std::endl;
}

int main()
{
    foo<int > fi(1);
    foo<long> fl(2);

    foo<int >::print(std::cout);
    foo<long>::print(std::cout);
}
<C++ code/>

with Microsoft Visual C++.NET 2003 and Intel C++ 7.1 for Windows.

With both compilers, when compiling in release mode the functions
void foo<int>::init(const std::type_info& ti, int& i, int i_value)
void foo<long>::init(const std::type_info& ti, int& i, int i_value)
were folded into one function, and
void foo<int>::print(const std::type_info& ti, std::ostream& os, int& i)
void foo<long>::print(const std::type_info& ti, std::ostream& os, int& i)
were folded into one function.
Of course the output from the program was correct as expected.

Please note that the Microsoft linker has supported this option (/OPT:ICF)
for many years.
IIRC it was introduced in Microsoft Visual C++ V5.0, primarily to support
ATL for writing small components.
It's not brand new or only available in research environments.
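
For anyone who wants to check their own toolchain, here is a rough probe, offered as a sketch rather than a reliable test: holder, store_fn and the two helper functions are hypothetical names, and comparing function addresses is only a hint. A build with folding enabled may report identical addresses, but a compiler is free to evaluate the comparison at compile time and report "not folded" even when the linker later merges the bodies, so inspecting the map file remains the dependable check.

```cpp
#include <cassert>

template <typename T>
struct holder {
    // Neither the signature nor the body depends on T, so the
    // instantiations are candidates for identical-code folding.
    static void store(int& i, int value);
};

template <typename T>
void holder<T>::store(int& i, int value) { i = value; }

typedef void (*store_fn)(int&, int);

// Hint only: true if both instantiations ended up at the same address.
inline bool instantiations_share_address() {
    store_fn a = &holder<int>::store;
    store_fn b = &holder<long>::store;
    return a == b;
}

// Whatever the linker does, behaviour must be unchanged.
inline void demo_store() {
    int v = 0;
    holder<int>::store(v, 3);
    assert(v == 3);
    holder<long>::store(v, 4);
    assert(v == 4);
}
```

The result of instantiations_share_address() is intentionally not asserted anywhere, since it depends on the compiler, the linker and the options used.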

Kind regards

Mogens Hansen

Scott Meyers

Nov 16, 2003, 6:38:26 AM

On 10 Nov 2003 14:40:04 -0500, Markus Werle wrote:

> Scott Meyers wrote:
> > On the research front, the phenomenon of "post-pass code optimization" has
> > yielded evidence that templates DO lead to code bloat (at least on the
> > tested platforms) in the form of literally identical binary code segments.
>
> So the post-post-pass can find those and fold the code?
> Do you know whether the best 3 compilers in the market
> do already have such technology builtin or not?

None do, to the best of my knowledge. It's still in the research stages,
and most of the work I know about has been applied only to C. Consult the
papers I cited before for details.

> I am a bit confused:
> I always thought that the compiler is allowed to do _anything_
> it likes as long as the executable yields the requested results.

True. But in the case of code bloat, the question is not what
compilers/linkers are allowed to do, but what they do do.

> So after a
> icpc -static -o my.exe -ipo [all my object files]
> I do not even _expect_ any entity MyClass<int>::x or others
> to exist anymore.
>
> Do I have a wrong view on this?

I don't know, because I don't know what the "icpc" line is supposed to do.
However, I don't see how, in the general case, physically different statics
can be folded.

> As long as MyClass<int>::x and MyClass<long>::x have different semantics
> I'd prefer to blame the semantics and not the templates for the code bloat.

But if you compare the generated code for C++ templates with that of .NET
generics (the topic of this thread), you'll see that C++ compilers -- at
least the ones I know about -- generate completely different functions even
if their bodies differ by only one or two instructions. In .NET, multiple
instantiations of a generic will share code except in places where they
absolutely cannot (e.g., because they are referencing different statics).
That's because generics will be supported directly in the underlying IL.
At least that's my understanding.

> If You want those to be linked to the same memory snippet, I cannot see
> any problem here. A little redesign will clearly ensure exactly this.
> This is what traits were invented for, right?

It depends on how you choose to look at it. The .NET approach shows that
programmers don't have to concern themselves with this kind of thing. You
seem to believe that all we have to do is educate the hundreds of thousands
of C++ programmers to program differently, and the problem will go away.

> From a theoretical point of view I cannot believe
> that anything that can be optimized by human hands
> will stay out of reach of existing or future compilers,
> be it on source code level or during linkage.

That may be true, but it doesn't help people who are wrestling with the
output of current compilers now.

> > As an aside, the reference in the last paper to "code abstraction" is
> > really talking about un-inlining functions from the binary. One of the
> > experiments they run is to profile the code for hot spots, then feed that
> > information back into a tool that un-inlines only the infrequently called
> > functions.
>
> Which sounds like what today's compilers offer ...

To the best of my knowledge, the only compilers that offer the ability to
use POGO are Intel's and Microsoft's, the latter for 64 bit code only. I
don't know whether either of them will "un-inline" code automatically. I
have never heard of them doing that.

> > The end result in some of their experiments are programs that
> > run faster and are about half as big.
>
> Which is, in the end, a very strong argument _for_ templates:
> That compiler nearly made it. A factor of two might well be triggered
> by a tiny little code change or by switching compiler _revisions_.

You should check the data. The data in their experiments show that the
greatest reductions from their tools come from code using templates, i.e.,
the most bloated programs are the ones making the heaviest use of
templates.

> As I said above: one can't blame that bloat on templates but on the
> compiler in use. I think most C++ compilers will include such a
> post-processing step if this is not already true for some of them.

As far as I know, none do. Which do, to your knowledge?

> 1. Embedded systems will be shipped with more main memory, no end in sight.

And that memory will be filled up with new functionality faster than the
memory gets added, at least for some types of systems. I'm working with
people now who have 32MB of memory and are still literally counting bytes.
Embedded systems with more memory also have to do more. If they don't do
more, they don't get more memory. Why would they? Memory costs money, and
if you're moving lots of units, every penny saved on hardware counts.

Scott

Scott Meyers

Nov 16, 2003, 7:30:34 PM

On 13 Nov 2003 03:45:38 -0500, Mogens Hansen wrote:
> That should be pretty easy to solve manually:
>
> template <typename T>
> class MyClass {
> public:
> void Func()
> { FuncImpl(x_); }
>
> private:
> void FuncImpl(int& x)
> {
> // ...
> }
>
> static int x_;
> };
>
> Now MyClass<int>::FuncImpl and MyClass<long>::FuncImpl might be fully binary
> identical and thus can be folded into one function, since they don't
> directly depend on the static member.

So all we need to do is educate C++ programmers that they should never
refer to the static members of a class directly but should instead pass the
address of each static they want to refer to to an intermediate function?
And can that intermediate function really be inlined as you've done it
above? If it's inlined, doesn't the function go away, yielding the
original problem?

And what is your suggestion for how to write templates that will be
instantiated on different types, e.g., int and double? There will still
likely be many common instruction sequences in each member function, but
some conceptually identical instructions will actually differ due to the
need to refer to different data types. Note that in .NET (part of the
topic of this thread), these issues are handled automatically during code
generation -- programmers need not worry about them.

Regarding the assertion in this thread that I originally responded to (that
"the rant about code bloat is ... historical garbage from worser times
which have gone by now"), I think that the existence of the ability to
manually code around code bloat is not the same as evidence that most
programmers are aware of these workarounds or that they employ them on a
regular basis. It's been well known for 15 years that base classes should
generally declare virtual destructors, but failure to declare them
continues to be a problem. So much so that the standard for C++ may be
modified to generate them automatically for classes that declare other
virtual functions. Identification of a solution to a problem does not make
the problem go away. It goes away only if people embrace the solution, and
that doesn't always happen.

> As an experiment I compiled the following code

> with Microsoft Visual C++.NET 2003 and Intel C++ 7.1 for Windows.
>
> With both compilers, when compiling in release mode the functions
> void foo<int>::init(const std::type_info& ti, int& i, int i_value)
> void foo<long>::init(const std::type_info& ti, int& i, int i_value)
> were folded into one function, and
> void foo<int>::print(const std::type_info& ti, std::ostream& os, int& i)
> void foo<long>::print(const std::type_info& ti, std::ostream& os, int& i)
> were folded into one function.

So let's look at those functions:

> template <typename T>
> void foo<T>::init(const std::type_info& ti, int& i, int i_value)
> {
> i = i_value;
>
> std::cout << "foo<" << ti.name() << ">::init(" << i_value << ");" <<
> std::endl;
> }

Notice that this function does not refer to the class's static member.
Hence this is not an example of the issue I originally referred to.

> template <typename T>
> inline void foo<T>::print(std::ostream& os)
> {
> print(typeid(T), os, i_);
> }

This does refer to the static member i_, but it's inline, so we'd expect
the different memory locations for i_ to be folded into the call site
inside main.

> template <typename T>
> void foo<T>::print(const std::type_info& ti, std::ostream& os, int& i)
> {
> os << "foo<" << ti.name() << ">::print(" << i << ");" << std::endl;
> }

Like init, this version of print fails to refer to the class's static
member, hence demonstrates nothing about the issue in question.

> Please note that the Microsoft linker has supported this option (/OPT:ICF)
> for many years.
> IIRC it was introduced in Microsoft Visual C++ V5.0, primarily to support
> ATL for writing small components.
> It's not brand new or only available in research environments.

And do most other compilers have a similar ability? What about gcc?
MetroWerks? Green Hills?

Scott

Mogens Hansen

Nov 20, 2003, 5:05:59 AM

"Scott Meyers" <Use...@aristeia.com> wrote:
> On 13 Nov 2003 03:45:38 -0500, Mogens Hansen wrote:

[8<8<8]

> So all we need to do is educate C++ programmers that they should never
> refer to the static members of a class directly but should instead pass the
> address of each static they want to refer to to an intermediate function?

No, since it is not a problem for everybody.
It is good that this issue is described, as you have done.
If the compiler creates more code than expected and necessary _and_ if that
is a problem for the concrete project, then the programmer needs to solve
the problem.
In that case the static member shouldn't be referenced directly.
The programmer should also be aware of a pragmatic workaround: using an
intermediate function which uses a reference to the static.

> And can that intermediate function really be inlined as you've done it
> above?

No, you might be right.
In the sample code below, which I actually tested, the implementation
function was specifically not inlined.

> If it's inlined, doesn't the function go away, yielding the
> original problem?

You may be right.

>
> And what is your suggestion for how to write templates that will be
> instantiated on different types, e.g., int and double?

If the static variable has a type which doesn't depend on the template
arguments then it's a non-issue whether the template arguments are binary
identical (like int and long on some platforms) or not (like int, float and
std::string).

If the static variable has a type which depends on the template arguments,
then the parts of the functions which depend on the template argument could
be factored into separate functions (using the template method pattern,
which doesn't have anything to do with C++ templates).
The functions should then be accessed indirectly (through function pointers
or by making the functions virtual) for the sole purpose of merging the main
implementation functions across different instantiations.
That is basically moving parts of the functionality from compile-time
polymorphism to run-time polymorphism, accepting slower execution in
exchange for smaller size.
The problem is solved by adding another level of indirection which separates
the commonalities from the variabilities.

Like:

<C++ code>
#include <typeinfo>
#include <iostream>

#include <string>

template <typename T>
class foo
{
public:

foo(T i_value);

static void print(std::ostream& os);

private:
static void print_i(std::ostream& os);

private:
static T i_;
};

template <typename T>
T foo<T>::i_;

template <typename T>
foo<T>::foo(T i_value)
{
i_ = i_value;
}

namespace {

void foo_print_impl(const std::type_info& ti, std::ostream& os,
void (*print_i_func)(std::ostream&))
{
os << "foo<" << ti.name() << ">::print(";
print_i_func(os);
os << ");" << std::endl;
}

} // end of unnamed namespace

template <typename T>
inline void foo<T>::print(std::ostream& os)
{

foo_print_impl(typeid(T), os, print_i);
}

template <typename T>
void foo<T>::print_i(std::ostream& os)
{
os << i_;
}

int main()
{
foo<int > fi(1);

foo<float> fl(2.2f);
foo<std::string> fs("Hello world!");

foo<int >::print(std::cout);
foo<float>::print(std::cout);
foo<std::string>::print(std::cout);
}
<C++ code/>

which any compiler should be able to translate without code-bloat.

> There will still
> likely be many common instruction sequences in each member function, but
> some conceptually identical instructions will actually differ due to the
> need to refer to different data types. Note that in .NET (part of the
> topic of this thread), these issues are handled automatically during code
> generation -- programmers need not worry about them.

My understanding of the current state of compilers (like the Intel and
Microsoft compilers) is that they can merge identical functions.
A finer granularity seems to require something like what is described in the
papers that you mentioned.

Does .NET ensure that identical code blocks smaller than a function are
merged at assembly level during execution?

>
> Regarding the assertion in this thread that I originally responded to (that
> "the rant about code bloat is ... historical garbage from worser times
> which have gone by now"), I think that the existence of the ability to
> manually code around code bloat is not the same as evidence that most
> programmers are aware of these workarounds or that they employ them on a
> regular basis.

Right.
I would definitely like the tool chains in general to generate optimal code,
such that low level, manual performance tweaking can be avoided.
But until that happens it is good to know how to tweak the code when needed.

[8<8<8<]

> > template <typename T>
> > inline void foo<T>::print(std::ostream& os)
> > {
> > print(typeid(T), os, i_);
> > }
>
> This does refer to the static member i_, but it's inline, so we'd expect
> the different memory locations for i_ to be folded into the call site
> inside main.

Exactly.
That's the whole point of the function.

>
> > template <typename T>
> > void foo<T>::print(const std::type_info& ti, std::ostream& os, int& i)
> > {
> > os << "foo<" << ti.name() << ">::print(" << i << ");" << std::endl;
> > }
>
> Like init, this version of print fails to refer to the class's static
> member, hence demonstrates nothing about the issue in question.

Right, the whole point of this function is that it doesn't refer to the
class's static member or to the template arguments.
In fact in the example above foo<T>::print(const std::type_info& ti,
std::ostream& os, int& i) could have been a global non-template function (as
shown in the code above).

It does demonstrate that it is simple to create a pragmatic solution to
the issue without changing the interface or semantics of the class.

Together the two functions isolate the part of the functionality which
depends on the template argument from the part which doesn't.
The technique of refactoring the code so that it isolates the
template-argument-dependent code from the template-argument-independent code
is generally applicable to problems related to code bloat from C++
templates.

I fully acknowledge that it would be better if the tool chain (compiler /
linker / post-link optimizer - whatever) could solve the problem without
changing the source code.

In my experience (also from embedded, hard real-time systems with limited
resources using C++) templates offer unique possibilities to write compact,
flexible, high-performance, relatively high-level code better than the
alternatives (like classic object-oriented programming with class hierarchies
and runtime polymorphism).
That's because templates allow several decisions to be made at compile time,
and they often allow functions to be inlined, thus enabling the compiler's
optimizer to optimize more.

Of course templates can be used to write sub-optimal software - just like
any other language feature.
Of course it is harder to write high performance, hard real time code for a
target with limited resources, than it is to write low performance
bloat-ware for a PC.

[8<8<8<]

> > It's not brand new or only available in research environments.
>
> And do most other compilers have a similar ability? What about gcc?
> MetroWerks? Green Hills?

I don't know.
Several C++ compilers which I use don't support that kind of optimization.

But that's a general issue.
What should we do if the tools are not as good as we would like or as they
could be?
What if the chosen compiler generates code which runs two times slower than
a competitor's state-of-the-art compiler?
We might have to write some of the critical parts of the code in assembler,
even though we know that it is sub-optimal in terms of programmer
productivity, portability and maintainability.
What if the chosen compiler doesn't support partial template specialization
and we need it?

We can try to put pressure on the vendors and pay them to improve the tools,
and point to techniques which solve the problems.

Kind regards

Mogens Hansen

Scott Meyers

Nov 21, 2003, 4:45:39 AM

On 20 Nov 2003 05:05:59 -0500, Mogens Hansen wrote:
> In that case the static member shouldn't be referenced directly.
> The programmer should also be aware of a pragmatic workaround, by using an
> intermediate function which uses a reference to the static.

I'm confused about how this can work. Given a class template Tem and a
static variable s, we're assuming that the code for Tem's member functions
is identical for instantiation parameters T1 and T2 except for references
to s (because Tem<T1>::s and Tem<T2>::s have different addresses). That
means that any member function mf in Tem that refers to s will generate
different binary code for Tem<T1>::mf and Tem<T2>::mf, because each will
refer to the appropriate address for s.

You seem to suggest that instead of referring to s (which, for simplicity,
I'll assume is an int), we introduce a static member function getS that
will return a reference to s:

template<typename T>
int& Tem<T>::getS() { return s; }

We'll assume that getS is not inlined, because we seem to agree that if
it's inlined, we're back to the original problem. But now Tem<T1>::getS and
Tem<T2>::getS are different functions, so if Tem::mf calls getS, we'll get
different function bodies for Tem<T1>::mf and Tem<T2>::mf. So I don't
understand how your general technique is supposed to work.

> > And what is your suggestion for how to write templates that will be
> > instantiated on different types, e.g., int and double?
>
> If the static variable has a type which doesn't depend on the template
> arguments then it's a non-issue whether the template arguments are binary
> identical (like int and long on some platforms) or not (like int, float and
> std::string).

Why?

> template <typename T>
> class foo
> {
> public:
> foo(T i_value);
>
> static void print(std::ostream& os);
>
> private:
> static void print_i(std::ostream& os);
>
> private:
> static T i_;
> };

> template <typename T>
> void foo<T>::print_i(std::ostream& os)
> {
> os << i_;
> }

I don't see how the functions generated from this template can be merged,
since they each refer to a different i_.

> template <typename T>
> inline void foo<T>::print(std::ostream& os)
> {
> foo_print_impl(typeid(T), os, print_i);
> }

I don't see how the functions generated from this template can be merged,
since they each refer to a different print_i.

> My understanding of the current state of compilers (like the Intel and
> Microsoft compilers) is that they can merge identical functions.
> A finer granularity seem to require something like what is described in the
> papers that you mentioned.
>
> Does .NET ensure that identical code-block smaller than a function is
> merged at assembly level during execution ?

My understanding is yes. I'm basing this on Jason Clark's October MSDN
article (and some subsequent email exchanges). The article is at
http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx.

> I would definitely like the tool chains in general to generate optimal code,
> such that low level, manual performance tweaking can be avoided.
> But until that happens it is good to know how to tweak the code when needed.

Here we agree.

> In my experience (also from embedded, hard real-time systems with limited
> resources using C++) templates offers unique possibilities to write compact,
> flexible, high performance, relative high level code better than the
> alternatives (like classic object oriented programming with class hierachies
> and runtime polymorphy).
> That's because template allow several decisions to be made at compile time,
> and it often allows functions to be inlined and thus enabling the compilers
> optimizer to optimize more.

We agree here, too. But my experience has been that getting developers of
such systems to be willing to try such techniques is made more difficult by
the fact that when they try very simple experiments with templates, they
often find that their code becomes MUCH larger than they anticipated. That
scares them away from using templates. Even when they can be convinced
that templates have some powerful properties, they must wrestle with the
fact that they will have to seriously reevaluate their approach to writing
code -- for their entire team. This is a nontrivial undertaking. Telling
people that there are sophisticated and unintuitive tricks that overcome
template code bloat is rarely as reassuring as you seem to think it should
be.

Scott

DILIP

Nov 21, 2003, 10:11:01 PM

"Scott Meyers" <Use...@aristeia.com> wrote in message

news:MPG.1a26b9887...@news.hevanet.com...

> > Does .NET ensure that identical code-block smaller than a function is
> > merged at assembly level during execution ?
>
> My understanding is yes. I'm basing this on Jason Clark's October MSDN
> article (and some subsequent email exchanges). The article is at
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx.

I seem to be repeating this again and again in different threads, but for the
sake of completeness let me point out that there is a much newer article by
Juval Lowy at MSDN over here:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vstechart/html/csharp_generics.asp

This piece has a much better and more exhaustive coverage of .NET generics.

To round out the whole thing here is another complete piece on the exact
difference between Generics and Templates:
http://blogs.gotdotnet.com/branbray/PermaLink.aspx/c14dce49-3254-4976-b38c-88bfd61b65cf

thanks
--Dilip

Peter Dimov

Nov 22, 2003, 5:30:36 AM

Scott Meyers <Use...@aristeia.com> wrote in message news:<MPG.1a26b9887...@news.hevanet.com>...
[...]

> We agree here, too. But my experience has been that getting developers of
> such systems to be willing to try such techniques is made more difficult by
> the fact that when they try very simple experiments with templates, they
> often find that their code becomes MUCH larger than they anticipated. That
> scares them away from using templates.

I agree that templates are an easy way to increase the size of the
code. But I don't think that so far you have shown an example where
the equivalent non-template code would have been smaller. Note
"equivalent". You can merge non-template functions by hand by using
void* and casts (for instance), and you can also merge the common
parts of template functions by hand by using void* and casts. And you
can easily bloat the code by using copy and paste instead of
templates.

> Even when they can be convinced
> that templates have some powerful properties, they must wrestle with the
> fact that they will have to seriously reevaluate their approach to writing
> code -- for their entire team.

It seems to me that the opposite is true. If they apply their existing
approach to templated code, they can successfully eliminate the
redundancy. It is only the expectation that templates can somehow do
this for them that is not met.

Scott Meyers

Nov 22, 2003, 7:24:56 PM

On 22 Nov 2003 05:30:36 -0500, Peter Dimov wrote:
> I agree that templates are an easy way to increase the size of the
> code. But I don't think that so far you have shown an example where
> the equivalent non-template code would have been smaller.

But I never made a claim to that effect. My argument was and is that
Markus Werle's original claim,

The rant about code bloat is nothing but another historical garbage from
worser times which have gone by now.

was not well founded.

> It seems to me that the opposite is true. If they apply their existing
> approach to templated code, they can successfully eliminate the
> redundancy. It is only the expectation that templates can somehow do
> this for them that is not met.

I continue to argue that it's not that simple. If you're working with a
linker that doesn't strip out duplicates and you use implicit
instantiation, non-inline functions, and separate compilation, you will get
multiple copies of template functions in your binary, thus bloating your
code. There is no real analog to this problem in non-template-based code.
There are ways to work around this (e.g., manual instantiation in a single
translation unit), but I wouldn't call that an application of "their
existing approach."
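As a minimal sketch of the manual-instantiation workaround mentioned above: the class template counter and the split into counter.h / counter.cpp are hypothetical names chosen for illustration, and everything is shown in one file so that it compiles on its own. The assumption is that client translation units see only the declaration, so they cannot instantiate the member functions implicitly, and that a single translation unit holds the definitions plus one explicit instantiation per supported type.

```cpp
// --- counter.h (hypothetical): declaration only ---------------------------
// Client code includes just this part, so it never sees the member function
// bodies and cannot instantiate them implicitly.
template <typename T>
class counter
{
public:
    counter();
    void add(const T& amount);
    T total() const;

private:
    T total_;
};

// --- counter.cpp (hypothetical): the single translation unit --------------
// The definitions live here, together with one explicit instantiation for
// each type the program is allowed to use.
template <typename T>
counter<T>::counter() : total_() {}

template <typename T>
void counter<T>::add(const T& amount) { total_ += amount; }

template <typename T>
T counter<T>::total() const { return total_; }

// Explicit instantiation definitions: the members of counter<int> and
// counter<double> are generated here, and only here.
template class counter<int>;
template class counter<double>;
```

The price of this arrangement is that using counter with any other type fails at link time until someone adds an instantiation to the list, which is part of why it is a change of approach and not a free fix.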

In case there is any confusion here, I'm a huge fan of templates. In my
work with the embedded community, I recommend them frequently, and I spend
a lot of time and energy explaining how code bloat can arise and the steps
they can take to combat it. My point is that breezy claims that "code
bloat is a problem that no longer exists" or "using templates eliminates
redundancy[*]" don't help make the case for using templates for people who are
concerned about bloat, because very simple tests can show (or at least seem
to show) that the claims are false. We don't strengthen the case for
templates by pretending that people don't have trouble with code bloat
arising from them.

Scott

[*] I agree that they eliminate redundancy in source code, but they can
lead to replication in object code, and for people working in
space-constrained environments, they care much more about the latter.

Mogens Hansen

Nov 22, 2003, 7:27:34 PM

"Scott Meyers" <Use...@aristeia.com> wrote:
> On 20 Nov 2003 05:05:59 -0500, Mogens Hansen wrote:
> > In that case the static member shouldn't be referenced directly.
> > The programmer should also be aware of a pragmatic workaround, by using an
> > intermediate function which uses a reference to the static.
>
> I'm confused about how this can work. Given a class template Tem and a
> static variable s, we're assuming that the code for Tem's member functions
> are identical for instantiation parameters T1 and T2 except for references
> to s (because Tem<T1>::s and Tem<T2>::s have different addresses).

Right.
But it's also important to note that Tem<T1>::s and Tem<T2>::s are the same
type.

> That
> means that any member function mf in Tem that refers to s will generate
> different binary code for Tem<T1>::mf and Tem<T2>::mf, because each will
> refer to the appropriate address for s.

Right.

>
> You seem to suggest that instead of referring to s (which, for simplicity,
> I'll assume is an int), we introduce a static member function getS that
> will return a reference to s:
>
> template<typename T>
> int& getS() { return s; }
>
> We'll assume that getS is not inlined, because we seem to agree that if
> it's inlined, we're back to the original problem. But now Tem<T1>::getS and
> Tem<T2>::getS are different functions, so if Tem::mf calls getS, we'll get
> different function bodies for Tem<T1>::mf and Tem<T2>::mf. So I don't
> understand how your general technique is supposed to work.

No, that's not what I suggested, and I agree that it probably won't reduce
code bloat.
I suggested the introduction of a level of indirection, which can separate
the commonalities between Tem<T1>::mf and Tem<T2>::mf from the
variabilities.

The commonalities are:
* The algorithm (potentially large and thus prone to code bloat)
* The type of s (in this case int)

The variability is:
* The address of s

A reference to s as a function parameter captures these commonalities and
variabilities exactly:

template <typename T>
class Tem
{
public:
// This function isolates the variation due to the different
// addresses of s for different types T.
// This function should be inlined.
void mf()
{ mf(s); }

private:
// This function isolates the commonalities across
// all types T.
// It contains the actual implementation,
// which can be long and thus prone to code bloat
// if s were accessed directly.
// This function should not be inlined.
// For any types T1 and T2, Tem<T1>::mf(int&) and Tem<T2>::mf(int&)
// will (most likely) be binary identical.
void mf(int& s_arg);

static int s;
};
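A compilable sketch of the same split, with one deliberate variation: the shared body is moved into an ordinary non-template function (the global-function form mentioned earlier in the thread), so the result does not depend on the linker folding identical instantiations. The function add_to_counter and the step parameter are made up for illustration.

```cpp
// Shared implementation: an ordinary non-template function, so exactly one
// copy exists. It sees the static only through the reference it is given.
int add_to_counter(int& counter, int step)
{
    counter += step;
    return counter;
}

template <typename T>
class Tem
{
public:
    // Thin forwarder, meant to be inlined: the only code that names
    // Tem<T>::s, hence the only code that differs between instantiations.
    int mf(int step) { return add_to_counter(s, step); }

private:
    static int s;
};

// One counter per instantiation, always of type int.
template <typename T>
int Tem<T>::s = 0;
```

Tem<int> and Tem<double> still keep separate counters, because each forwarder passes the address of its own s; only the body that does the work is shared.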

>
> > > And what is your suggestion for how to write templates that will be
> > > instantiated on different types, e.g., int and double?
> >
> > If the static variable has a type which doesn't depend on the template
> > arguments then it's a non-issue whether the template arguments are binary
> > identical (like int and long on some platforms) or not (like int, float and
> > std::string).
>
> Why?

Because if the static variable has a fixed type (independent of the template
argument - say int) then the only difference between Tem<T1>::mf and
Tem<T2>::mf will be the address of the static variable.
This difference is independent of which types T1 and T2 are.
Thus it's a non-issue for this discussion whether T1 and T2 might in fact be
binary identical (like int and long on some platforms) or not (like int,
double or std::string).
However the possible code bloat from accessing a static variable in a member
function is an issue - it's just independent of the actual type of T.

[8<8<8<]

> > template <typename T>
> > void foo<T>::print_i(std::ostream& os)
> > {
> > os << i_;
> > }
>
> I don't see how the functions generated from this template can be merged,
> since they each refer to a different i_.

Right.
It can't and it shouldn't.
But that's not due to i_ having a different address for different types T.
That's because i_ has a different type for different types T.
It is fundamental that printing (or adding, comparing, copying etc.)
different types is handled differently at the binary level, even though it's
the same at the source code level.
Printing an int is fundamentally different from printing a double or a
std::string object.
This is exactly the variation that we want to express with templates.

Thus the inability to merge this function is not code bloat, using a
somewhat vague definition of code bloat like "redundant duplication of
identical functions, which could have been merged if the tool chain was more
sophisticated".

This function isolates part of the variation of printing for different types
of T.

>
> > template <typename T>
> > inline void foo<T>::print(std::ostream& os)
> > {
> > foo_print_impl(typeid(T), os, print_i);
> > }
>
> I don't see how the functions generated from this template can be merged,
> since they each refer to a different print_i.

Right.
It can't and it shouldn't.
But that doesn't lead to code bloat, because it captures the fundamental
differences of printing for different types of T.
These differences have to be present at the binary level.
No amount of optimization can make the differences go away.
Thus there is no redundancy which can be eliminated and thus no code bloat.

This function expresses the variation of how to print for different types of
T:
* Which type_info object to use
* How to print i_

The global, non-template function foo_print_impl captures the commonalities
of how to print for different types of T:
* Print some fixed text
* Print the name of the type - using a type_info object
* Print some more fixed text
* Print the value of i_ - using a pointer to function
* Print some more fixed text
* Flush the stream

[8<8<8<]

> > Does .NET ensure that identical code-block smaller than a function is
> > merged at assembly level during execution ?
>
> My understanding is yes. I'm basing this on Jason Clark's October MSDN
> article (and some subsequent email exchanges). The article is at
> http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx.

Thank you for the reference. I'll read it soon.

I'm sorry if my question was ambiguous - English isn't my first language.
By "at assembly level during execution" I referred to machine code after the
MSIL has been JIT'ed - I was not referring to .NET Assemblies.
Does that affect your answer?

[8<8<8<]

> Telling
> people that there are sophisticated and unintuitive tricks that overcome
> template code bloat is rarely as reassuring as you seem to think it should
> be.

While I of course acknowledge your experience from teaching and working with
a lot of people in a lot of different organisations, I don't consider the
techniques that I've shown in this thread sophisticated and unintuitive,
compared to what it takes to write high performance, compact code for
embedded systems in general.

I think that code bloat from C++ templates can generally be eliminated or
reduced (if/when needed) by refactoring the code according to an
analysis of commonalities and variabilities.

Kind regards

Mogens Hansen

Gabriel Dos Reis

Nov 23, 2003, 10:01:47 AM

Scott Meyers <Use...@aristeia.com> writes:

| > It seems to me that the opposite is true. If they apply their existing
| > approach to templated code, they can successfully eliminate the
| > redundancy. It is only the expectation that templates can somehow do
| > this for them that is not met.
|
| I continue to argue that it's not that simple. If you're working with a
| linker that doesn't strip out duplicates and you use implicit
| instantiation, non-inline functions, and separate compilation, you will get
| multiple copies of template functions in your binary, thus bloating your
| code.

So, it is proof by taking an implementation that does not support
language use. I think the same argument can be made about virtually
any language. Which makes the "argument" uninteresting.

| There is no real analog to this problem in non-template-based code.

Sure, there is. If you're going to imagine dumb implementations for
template codes, then why not also consider dumb implementations for
non-template codes? Otherwise it is comparison of apples vs. oranges.

| In case there is any confusion here, I'm a huge fan of templates. In my

I did not understand the issue as about being "fan of templates" vs. not being.

--
Gabriel Dos Reis
g...@integrable-solutions.net

David Abrahams

Nov 23, 2003, 10:05:13 AM

Scott Meyers <Use...@aristeia.com> writes:

> On 22 Nov 2003 05:30:36 -0500, Peter Dimov wrote:
>> I agree that templates are an easy way to increase the size of the
>> code. But I don't think that so far you have shown an example where
>> the equivalent non-template code would have been smaller.
>
> But I never made a claim to that effect. My argument was and is that
> Markus Werle's original claim,
>
> The rant about code bloat is nothing but another historical garbage from
> worser times which have gone by now.
>
> was not well founded.
>
>> It seems to me that the opposite is true. If they apply their existing
>> approach to templated code, they can successfully eliminate the
>> redundancy. It is only the expectation that templates can somehow do
>> this for them that is not met.
>
> I continue to argue that it's not that simple. If you're working
> with a linker that doesn't strip out duplicates and you use implicit
> instantiation, non-inline functions, and separate compilation, you
> will get multiple copies of template functions in your binary, thus
> bloating your code.

Just to pick a nit or two: not quite. You might be working with a
compiler that does link-time instantiation, in which case there might
be no duplicates to strip out in the first place. Also, you won't get
duplicates if you don't instantiate the same template specialization
in two different translation units. Finally, there are two kinds of
"duplicate" worth considering: instantiations of the same function
template specialization, and function template specializations which
just happen to generate identical object code.

> There is no real analog to this problem in non-template-based code.

I guess it depends whether you count the preprocessor and other code
generation systems.

I think it's worth pointing out that "eliminating bloat" isn't
necessarily the same as "eliminating duplication". The issue is much
more complicated than that. I don't know what .NET does, but Haskell
compiles every generic function down to object code before it is ever
used, so there is absolutely no duplication. Any time the function is
called, the compiler generates a package of function and object
pointers with a prescribed format, something like this:

template <class X>
X next(X x) { return ++x; }

compiles down to something like:

struct next_
{
void (*incr)(void*);
void (*copy)(void*, void*);
std::size_t x_size;
};

void next(next_& n, void* result_storage, void* x)
{
char storage[n.x_size]; // alloca, maybe.
n.copy(x,storage);
n.incr(storage);
n.copy(storage,result_storage);
}

And it's up to the caller to package up a next_ structure before
invoking the next function. This is essentially turning static
polymorphism into dynamic polymorphism (**), and it's great for
reducing the code contribution of really large functions which use
only a few operations that depend on the types of the function's
arguments, but it's really bad for little functions. If X is
std::list<int>::iterator above, you end up doing an enormous amount of
work and generating quite a bit of code for what should be not much
more than a single pointer dereference.
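For readers who want to try this, here is a runnable variant of the hand-written translation above, under two assumptions that are not in the original sketch: the result is an already-constructed object that gets assigned to, instead of raw storage that is copied into (which avoids the variable-length array), and the typed front end is called next_of so that it cannot clash with std::next. The names next_ops, next_ops_for and next_erased are likewise made up for illustration.

```cpp
// The "dictionary": the only operations next_erased needs from X.
struct next_ops
{
    void (*assign)(void* dst, const void* src);  // *dst = *src
    void (*incr)(void* obj);                     // ++*obj
};

// Compiled once, whatever X is: all type-specific work goes through ops.
void next_erased(const next_ops& ops, void* result, const void* x)
{
    ops.assign(result, x);
    ops.incr(result);
}

// Per-type glue that the caller has to package up before the call.
template <typename X>
struct next_ops_for
{
    static void assign(void* dst, const void* src)
    {
        *static_cast<X*>(dst) = *static_cast<const X*>(src);
    }

    static void incr(void* obj) { ++*static_cast<X*>(obj); }

    static next_ops make()
    {
        next_ops ops = { &assign, &incr };
        return ops;
    }
};

// Typed front end with the shape of the original template.
template <typename X>
X next_of(X x)
{
    X result(x);  // must be a live object before it is assigned through ops
    next_erased(next_ops_for<X>::make(), &result, &x);
    return result;
}
```

Even in this tiny case the cost is visible: incrementing an int now takes two indirect calls and a dictionary, which is the "really bad for little functions" side of the trade.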

As far as I know, getting everything just right involves walking the
line between static and dynamic polymorphism, compiling some generic
things down to object code, and leaving others to be "instantiated"
inline. If you look through the Boost libraries (e.g. Boost.Function)
you can see that there are several which are designed to let you tune
how that happens manually, by sticking statically-polymorphic types
in runtime-polymorphic wrappers. The shared_ptr template has some
similar properties in its deleter.

Also as far as I know, doing that kind of transformation automatically
remains a research topic. Last I heard, Todd Veldhuizen was working
on it.

(**) but only at the implementation level; the templates are
completely typechecked when they're compiled. Someone seems to
be claiming the same for .NET.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Mogens Hansen

Nov 23, 2003, 10:35:51 AM

"Scott Meyers" <Use...@aristeia.com> wrote:

[8<8<8<]

> I continue to argue that it's not that simple. If you're working with a
> linker that doesn't strip out duplicates and you use implicit
> instantiation, non-inline functions, and separate compilation, you will get
> multiple copies of template functions in your binary, thus bloating your
> code.

We have to be careful about what we are talking about.
One of the problems when discussing code bloat in relation to C++ templates
is that it is not well defined what is actually meant.

Are you saying that for two compilation units A and B, which both use the
same template specialisation, say
std::vector<int>::push_back(const int&)
we risk having 2 instances of that code linked into the final application?

If that's the case, there is no doubt that it is downright a bug in the tool
chain (most likely the linker in a conventional implementation).
It would be a violation of the "One Definition Rule" (§3.2-3 in the C++
Standard), which states that
<quote>
Every program shall contain exactly one definition of every non-inline
function or object that is used in that program
<quote/>

Thus this case is not worth consideration as a general problem, but as a
specific bug in a specific implementation.
I have never seen such a problem and definitely wouldn't accept such a state.

There is no doubt that for our embedded hard real-time application, which
runs on a moderate computer (approx. 20 MHz Motorola 68300 family processor
with approx. 2 MByte RAM and 2 MByte Flash PROM), and which uses templates
heavily to achieve good performance and high-level code, the application
would never fit in memory if the tool chain had such a problem.
The tool chain is more than 5 years old, except for minor updates.

It is worth noting that the ability of the linker (assuming a conventional
compiler and linker implementation) to eliminate duplicate functions from
multiple compilation units is not only required if the compiler uses
"greedy instantiation" for templates, but also for eliminating functions
declared inline but in fact not inlined (spilled inline functions).
See
C++ Templates, The Complete Guide
David Vandevoorde, Nicolai M. Josuttis
ISBN 0-201-73484-2
page 155-156 for further details.

The classic example of code bloat from C++ templates is a program which uses,
say,
std::vector<int*>::push_back
and
std::vector<std::string*>::push_back
Since both containers contain pointers, it is very likely that the two
functions are binary identical (assuming that they are not inlined).
With a naive (but probably common) implementation, both functions will be
linked into the application, and that is fully compliant with the C++
Standard.
If both functions are present in the application, this redundancy is a waste
of space (code bloat), since they could be merged.

This situation is completely different from 2 compilation units using the
same specialization.

To avoid this kind of problem the tool chain (the linker) can use one
instantiation and discard all the others.
This is what the Microsoft linker has been capable of since Visual C++ V5.0.
A caveat is that if the address of the function is taken, they have to be
unique - that is
&std::vector<int*>::push_back != &std::vector<std::string*>::push_back
but that can easily be solved using a call-thunk to which the function
pointer points.

Another low-tech solution, which doesn't require support from the tool chain,
is to use a template specialisation as described in
The C++ Programming Language, Special Edition
Bjarne Stroustrup
ISBN 0-201-70073-5
page 342.
This is in fact what we do in our project.

> There is no real analog to this problem in non-template-based code.
> There are ways to work around this (e.g., manual instantiation in a single
> translation unit), but I wouldn't call that an application of "their
> existing approach."

I think the right solution is to make the vendor fix the bug or choose
another vendor, if I have understood the scenario correctly.

Kind regards

Mogens Hansen

Bjarne Stroustrup

Nov 23, 2003, 7:35:40 PM

Scott Meyers <Use...@aristeia.com> wrote (about templates and code
bloat):

> [*] I agree that they eliminate redundancy in source code, but they can
> lead to replication in object code, and for people working in
> space-constrained environments, they care much more about the latter.

I recently talked with some major users of C++ in very constrained
embedded environments. They had a rule that said that no unused code
may be left in deployed code. That makes sense in that environment.
However, use of classes often leads to unused code (because a member
function isn't called) and use of class hierarchies can be even more
problematic, because how do you avoid having uncalled overriding
functions left in the code?

The solution was to use templates. An unused template function isn't
instantiated. We verified this through the complete tool chain (from a
major supplier in the embedded systems world). Thus templates can be
used to avoid code bloat that is hard to avoid without them.

There are - IMO - too many myths about code bloat and templates
floating around. Some of those myths are just myths, some are fueled
by antique implementations, and some are fueled by designs that are
unsuitable for templatization (e.g. class hierarchies with excessive
numbers of member functions or templates where only part of a large
function depends on a template parameter). I recommend a look at the
standard committee's technical report on performance issues (link on
my C++ page) for how to deal with the basic performance issues.

- Bjarne Stroustrup; http://www.research.att.com/~bs

Gabriel Dos Reis

Nov 23, 2003, 7:36:36 PM

David Abrahams <da...@boost-consulting.com> writes:

[...]

| I think it's worth pointing out that "eliminating bloat" isn't
| necessarily the same as "eliminating duplication". The issue is much
| more complicated than that. I don't know what .NET does, but Haskell
| compiles every generic function down to object code before it is ever
| used, so there is absolutely no duplication. Any time the function is
| called, the compiler generates a package of function and object
| pointers with a prescribed format, something like this:

What you describe is what OCaml (and I guess SML/NJ) uses to implement
separate compilation of modules and (parametric) polymorphic
functions. And you're right in pointing out the drawbacks in terms of
space and time. For simple functions, it is far from clear that it
reduces any code bloat.

Current unchecked C++ templates offer a near-perfect way to achieve
performance in terms of space and time; but of course, it requires
skill. Just like anything else. There is no free lunch.

--
Gabriel Dos Reis
g...@integrable-solutions.net

Balog Pal

Nov 24, 2003, 6:24:09 AM

"Bjarne Stroustrup" <b...@research.att.com> wrote in message
news:188b3370.0311...@posting.google.com...

> There are - IMO - too many myths about code bloat and templates
> floating around. Some of those myths are just myths, some are fueled
> by antique implementations, and some are fueled by designs that are
> unsuitable for templatization (e.g. class hierarchies with excessive
> numbers of member functions or templates where only part of a large
> function depends on a template parameter).

Yeah. Examining my self-written templates, I found they fall into 2 classes.
One has templates that are light, with all or most functions inline and about
2-3 instructions in size.

The other class has big stuff. But there I have a nontemplate base class
that does the common things, which are independent of the template params.
Then the template "finalizes" the class by adding the specific stuff --
which generally turns out not to be really big.

Thinking about it -- I didn't even do that to avoid code bloat (with which I
encountered no problem), but because it makes the whole thing much clearer.
[The other reason is that the base class may be part of a GUI framework --
that has poor support for templates, etc]

As for Scott's original statement -- I still think that for a fair discussion
of code bloat we should present alternatives that do the same thing, where
one, templated, implementation is bloated and the other is not. As everyone
knows, 'delete' is the most efficient data compressor. ;-)

Wrt static class members in a template -- is it really so widespread as to be
worth talking about? The only thing I could come up with as a benign class
static was a mutex to guard some operations. But on systems with threads you
more likely can live with a few kilobytes of repeated object code.

Fighting bloat is better done the same way as fighting for speed. Use the
profiler in one case and the map file in the other -- once it is obvious you
have a problem. Then, having discovered the source of the wasted bytes, you
go and fix the one or two classes really adding to the bloat, and leave the
rest alone.

Paul

David Abrahams

Nov 24, 2003, 7:29:01 AM

"Mogens Hansen" <moge...@dk-online.dk> writes:

> "Scott Meyers" <Use...@aristeia.com> wrote:
>
> [8<8<8<]
> > I continue to argue that it's not that simple. If you're working with a
> > linker that doesn't strip out duplicates and you use implicit
> > instantiation, non-inline functions, and separate compilation, you will
> > get multiple copies of template functions in your binary, thus bloating
> > your code.
>
> We have to be careful about what we are talking about.
> One of the problems when discussing code bloat in relation to C++ templates
> is that it is not well defined what is actually meant.

That's a good point, but...

> Are you saying that for two compilation units A and B, which both use the
> same template specialisation, say
> std::vector<int>::push_back(const int&)
> we risk having 2 instances of that code linked into the final application?
>
> If that's the case, there is no doubt that it is downright a bug in the tool
> chain (most likely the linker in a conventional implementation).
> It would be a violation of the "One Definition Rule" (§3.2-3 in the C++
> Standard), which states that
> <quote>
> Every program shall contain exactly one definition of every non-inline
> function or object that is used in that program
> <quote/>

You've misinterpreted the ODR. The standard doesn't talk about (or
even have a concept of) "code linked into the final application".
The ODR is about source code.

> It is worth noting that the ability of the linker (assuming a conventional
> compiler and linker implementation) to eliminate duplicate functions from
> multiple compilation units is required not only if the compiler uses
> "greedy instantiation" for templates, but also for eliminating functions
> declared inline but not in fact inlined (spilled inline functions).
> See
> C++ Templates, The Complete Guide
> David Vandevoorde, Nicolai M. Josuttis
> ISBN 0-201-73484-2
> pages 155-156 for further details.

Not really. It just needs to guarantee that every time you take the
address of the inline function, the values compare equal. There's no
reason the executable can't contain (and use) as many copies of *any*
function (not just inline ones) as the compiler chooses to generate.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

ka...@gabi-soft.fr

Nov 25, 2003, 1:49:38 PM

Gabriel Dos Reis <g...@integrable-solutions.net> wrote in message
news:<m3ptfje...@uniton.integrable-solutions.net>...
> Scott Meyers <Use...@aristeia.com> writes:

> | > It seems to me that the opposite is true. If they apply their
> | > existing approach to templated code, they can successfully
> | > eliminate the redundancy. It is only the expectation that
> | > templates can somehow do this for them that is not met.

> | I continue to argue that it's not that simple. If you're working
> | with a linker that doesn't strip out duplicates and you use implicit
> | instantiation, non-inline functions, and separate compilation, you
> | will get multiple copies of template functions in your binary, thus
> | bloating your code.

> So, it is proof by taking an implementation that does not support
> language use. I think the same argument can be made about virtually
> any language. Which makes the "argument" uninteresting.

Unless you have to use such an implementation. In a perfect world, all
C++ compilers would be 100% conformant, bug free, and generate perfect
code, minimal in both size and execution time. In the real world, I've
never seen such a compiler, and I have to deal with what I've got.

Templates are, despite everything, still relatively new. All of the
compilers I know today implement them in some way, but practically none
of the compilers (Comeau is the only exception I know of) are fully
conformant. So today, most compiler writers are more concerned about
attaining conformance than about tuning to get optimal code (be it
optimal for size or for speed).

> | There is no real analog to this problem in non-template-based code.

> Sure, there is. If you're going to imagine dumb implementations for
> template codes, then why not also consider dumb implementations for
> non-template codes? Otherwise it is comparison of apples vs. oranges.

Because the actual implementations one has to deal with are dumb for
template code, but not for non-template code?

I know that implementations can do better, but from what little I've
seen or heard, implementations for embedded processors (precisely where
code bloat is most likely to be a problem) are not on the cutting edge
of technology.

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

Val. Creux

Nov 25, 2003, 1:59:36 PM

Code bloat outside the scope of the C++ standard itself exists in real life
too, and not only with antique implementations.

On one of the latest embedded C++ projects I've worked on, with more
than 10 applications and more than 10 shared libraries, we were
beaten to death by code bloat.
Each application and each shared library got a copy of the same
instantiation of each template we were using (either from the C++ standard
library or from other libraries).
Even after merging shared libraries, we unfortunately had to move, due to
time/management pressure, to non-template classes dealing with
void *.

I agree that it is outside the scope of the C++ standard itself, but it
happens.

I've just begun to look at some kind of post-processing tool that could
optimize/rewrite a set of applications and shared libraries by
eliminating such duplicates, instead of relying on a magic compiler. Any
other links/ideas welcome.

Val.

b...@research.att.com (Bjarne Stroustrup) wrote in message news:<188b3370.0311...@posting.google.com>...
<...>

Gabriel Dos Reis

Nov 26, 2003, 4:07:24 AM

Valery...@yahoo.com (Val. Creux) writes:

[...]

| I agree that it is outside the scope of the C++ standard itself, but it
| happens.

Yes; but I believe one of the proper actions is for users to pressure
their implementation providers to deliver better products, with higher
quality.

--
Gabriel Dos Reis
g...@integrable-solutions.net

Scott Meyers

Nov 26, 2003, 2:58:46 PM

On 22 Nov 2003 19:27:34 -0500, Mogens Hansen wrote:
> I suggested the introduction of a level of indirection, which can separate
> the commonalities between Tem<T1>::mf and Tem<T2>::mf from the
> variabilities.
>
> The commonalities are:
> * The algorithm (potentially large and thus prone to code bloat)
> * The type of s (in this case int)
>
> The variability is:
> * The address of s
>
> A reference to s as a function parameter captures these commonalities and
> variabilities exactly:

Ah, now I understand. Thank you for spelling it out so clearly. I
apologize for being a slow learner.

> > > Does .NET ensure that identical code-block smaller than a function is
> > > merged at assembly level during execution ?
> >
> > My understanding is yes. I'm basing this on Jason Clark's October MSDN
> > article (and some subsequent email exchanges). The article is at
> > http://msdn.microsoft.com/msdnmag/issues/03/10/NET/default.aspx.
>
> Thank you for the reference. I'll read it soon.
>
> I'm sorry if my question was ambiguous - English isn't my first language.
> By "at assembly level during execution" I referred to machine code after the
> MSIL has been JIT'ed - I was not referring to .NET Assemblies.
> Does that affect your answer?

It doesn't, but my understanding of .NET is not very deep, so I could be
mistaken. At any rate, I also don't know whether the .NET VM saves the
jitted code it generates from IL upon execution or it throws it away and
regenerates it each time. If the latter, I don't see how bloat would be an
issue. I'm pretty sure that type-independent parts of the IL are merged,
because the IL represents the uninstantiated template directly, if my
understanding is correct.

> I think that it is general applicable to eliminate or reduce code bloat from
> C++ templates (if/when needed) by refactoring the code according to an
> analysis of commonalities and variabilities.

I agree, but I also think that this kind of thinking is very uncommon among
C++ programmers. Thank you for putting it front and center before me.

Scott

Ben Hutchings

Nov 26, 2003, 3:01:24 PM

Mogens Hansen wrote:
<snip>

> The classic example of code bloat from C++ templates is a program which uses
> say
> std::vector<int*>::push_back
> and
> std::vector<std::string*>::push_back
> Since both containers contain pointers it is very likely that the two
> functions are binary identical (assuming that they are not inlined).

<snip>

> To avoid this kind of problem the tool chain (the linker) can use one
> instantiation and discard all the others.
> This is what the Microsoft linker has been capable of since Visual C++ V5.0.
> A caveat is that if the address of the function is taken, they have to be
> unique - that is
> &std::vector<int*>::push_back != &std::vector<std::string*>::push_back

Unfortunately the Microsoft implementation doesn't do this. This does
break at least one real-world C program, that being Mozilla's JSRef
library.

> but that can easily be solved using a call-thunk to which the function
> pointer points.

I think very simple prologues would do - either nops or short forward
branches if there are a large number of binary-identical functions.
There might be a small speed cost - but the branch should be predicted
and the reduction in code size may improve locality of reference and so
increase speed.

Alternately, or additionally, symbol references in object files could
have a flag that indicates, when set, that the object file doesn't
use the address in a way that would allow it to be compared [1]. The
linker could merge functions completely only if all references to them
have this flag set.

This latter optimisation could also be used for constant data that
doesn't require dynamic initialisation. Visual C++ already does this
for string literals because there is no uniqueness requirement on their
addresses.

[1] Calling a function by name can never do this. Storing or passing
a reference or pointer to it might allow it to be compared later.
I'm not sure just how commonly this is done. It might be hard for
a compiler to rule out the possibility that any function-to-pointer
conversion could eventually be used to compare the pointer value.

Gabriel Dos Reis

Nov 27, 2003, 5:26:45 AM

Ben Hutchings <do-not-s...@bwsint.com> writes:

| > but that can easily be solved using a call-thunk to which the function
| > pointer points.
|
| I think very simple prologues would do - either nops or short forward
| branches if there are a large number of binary-identical functions.

or multiple entry points. The common vendor ABI

http://www.codesourcery.com/cxx-abi/

does not explicitly require it, but it is sort of implicit there.
I hope that as implementations are maturing, they would provide for
such common sense code sharing.

--
Gabriel Dos Reis
g...@integrable-solutions.net

David Abrahams

Nov 29, 2003, 11:42:24 AM

Scott Meyers <sme...@aristeia.com> writes:

>> I think that it is general applicable to eliminate or reduce code bloat from
>> C++ templates (if/when needed) by refactoring the code according to an
>> analysis of commonalities and variabilities.
>
> I agree, but I also think that this kind of thinking is very uncommon among
> C++ programmers.

Really? That sort of refactoring was a part of my daily existence
even when I was just learning to write "good old OO" C++. It's just
very hard to manage *any* codebase unless you do it constantly. How
do most C++ programmers get by?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Gabriel Dos Reis

Nov 29, 2003, 11:43:36 AM

ka...@gabi-soft.fr writes:

yes, if you use a dumb implementation, you get dumb results. That is
true for nearly every functionality.

| In a perfect world, all C++ compilers would be 100% conformant,

note that this issue has nothing to do with being 100% conformant. It is
more a quality-of-implementation issue than a conformance
issue.

--
Gabriel Dos Reis
g...@integrable-solutions.net

Balog Pal

Nov 30, 2003, 5:29:38 AM

"David Abrahams" <da...@boost-consulting.com> wrote in message
news:usmk9t...@boost-consulting.com...

> >> I think that it is general applicable to eliminate or reduce code bloat
> >> from C++ templates (if/when needed) by refactoring the code according
> >> to an analysis of commonalities and variabilities.
> >
> > I agree, but I also think that this kind of thinking is very uncommon
> > among C++ programmers.
>
> Really? That sort of refactoring was a part of my daily existence
> even when I was just learning to write "good old OO" C++. It's just
> very hard to manage *any* codebase unless you do it constantly. How
> do most C++ programmers get by?

Yeah. Bruce Eckel repeats "programming is a process of separating things
that change from things that stay the same" every few chapters. And imho
he's damn right. :-)

What thinking is common then?

Paul

stelios xanthakis

Dec 1, 2003, 7:16:10 PM

David Abrahams <da...@boost-consulting.com> wrote in message news:<usmk9t...@boost-consulting.com>...

> Scott Meyers <sme...@aristeia.com> writes:
>
> >> I think that it is general applicable to eliminate or reduce code bloat from
> >> C++ templates (if/when needed) by refactoring the code according to an
> >> analysis of commonalities and variabilities.
> >
> > I agree, but I also think that this kind of thinking is very uncommon among
> > C++ programmers.
>
> Really? That sort of refactoring was a part of my daily existence
> even when I was just learning to write "good old OO" C++. It's just
> very hard to manage *any* codebase unless you do it constantly. How
> do most C++ programmers get by?

My theory is: with "good old OO", inheritance, virtuals (and maybe even
virtual inheritance), the goal is maximum code reuse: The programmer
has to think hard to find the common denominators of code and put them
in base classes (not interface classes, OOP base classes).
With templates we achieve code expansion for maximum speed: The template
code is re-generated many times. It's just a speed/space tradeoff. By the
way, the programmer has to think less.

So while OOP encourages the process of "refactoring", generic programming
does not, and this is maybe why it's uncommon.

It'd be nice to move members of templates that do not depend on the actual
template arguments out of the template. Hmmm. Unfortunately, templates +
inheritance is too complex for people who just want to get the job done.

C-ya

stelios

Hendrik Schober

Dec 1, 2003, 7:18:40 PM

David Abrahams <da...@boost-consulting.com> wrote:
> Scott Meyers <sme...@aristeia.com> writes:
>
> > > I think that it is general applicable to eliminate or reduce code bloat from
> > > C++ templates (if/when needed) by refactoring the code according to an
> > > analysis of commonalities and variabilities.
> >
> > I agree, but I also think that this kind of thinking is very uncommon among
> > C++ programmers.
>
> Really?

Yes. Well, in my experience, anyway.

> That sort of refactoring was a part of my daily existence
> even when I was just learning to write "good old OO" C++. It's just
> very hard to manage *any* codebase unless you do it constantly. How
> do most C++ programmers get by?

By just hacking another feature into the
tangled mass of hacks making up their
company's flagship product.

I don't know which programmer circles you
usually are in. (I know you from this ng,
boost, standardization.) But they do seem to
have a well filtered audience.
Most of the code I have seen certainly does
not indicate that refactoring is a common
activity among C++ programmers.
Deeply nested class hierarchies with a wild
mix of virtual and non-virtual functions all
intermingled, code copied to wherever it's
needed, adding includes without thinking
about it etc. can be found in much recent
code. FWIW, most C++ programmers I know are
just starting to accept that templatizing a
function on the iterator type it uses should
be a rather common activity for the average
programmer when it is helpful -- which doesn't
say at all that they would do the
templatization themselves.

Schobi

--
Spam...@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Henrik Vallgren

Dec 2, 2003, 5:09:08 AM

In the foreword of "Modern C++ Design", Scott Meyers writes of understanding
C++ templates beyond "container-of-T". At MSDN you can read "An Introduction
to C# Generics" by Juval Lowy. Here's a quote for you:

"In C++, templates are nothing more than glorified macros and they do not
persist to the compiled binary."

It's almost amusing ...

Henrik Vallgren

David Abrahams

Dec 2, 2003, 5:44:51 AM

"Hendrik Schober" <Spam...@gmx.de> writes:

> David Abrahams <da...@boost-consulting.com> wrote:
>> Scott Meyers <sme...@aristeia.com> writes:
>>
>> > > I think that it is general applicable to eliminate or reduce code bloat from
>> > > C++ templates (if/when needed) by refactoring the code according to an
>> > > analysis of commonalities and variabilities.
>> >
>> > I agree, but I also think that this kind of thinking is very uncommon among
>> > C++ programmers.
>>
>> Really?
>
> Yes. Well, in my experience, anyway.
>
>> That sort of refactoring was a part of my daily existence
>> even when I was just learning to write "good old OO" C++. It's just
>> very hard to manage *any* codebase unless you do it constantly. How
>> do most C++ programmers get by?
>
>
> By just hacking another feature into the
> tangled mass of hacks making up their
> company's flagship product.

I was afraid someone would give that answer, but in a way it proves my
point. The implication I read into Scott's post was: "we have to be
especially careful to teach people to refactor when they use
templates". I think my point is that if you don't constantly refactor
you'll generate a wretched mess no matter what programming technology
you use.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Scott Meyers

Dec 2, 2003, 8:02:34 AM

On 24 Nov 2003 06:24:09 -0500, Balog Pal wrote:
> Wrt static class members in a template -- is it really that spread to worth
> talking about?

The example of the class static was simply to demonstrate that even with a
compiler/linker that merges duplicate function implementations, a seemingly
innocuous class feature (in this case reference to a static data member)
would prevent such compilers/linkers from being able to merge otherwise
identical function implementations generated from a common template.

Scott

Scott Meyers

Dec 2, 2003, 8:21:30 AM

On 23 Nov 2003 19:35:40 -0500, Bjarne Stroustrup wrote:
> However, use of classes often leads to unused code (because a member
> function isn't called) and use of class hierarchies can be even more
> problematic, because how do you avoid having uncalled overriding
> functions left in the code?
>
> The solution was to use templates. An unused template function isn't
> instantiated. We verified this through the complete tool chain (from a
> major supplier in the embedded systems world). Thus templates can be
> used to avoid code bloat that is hard to avoid without them.

For non-virtual functions, I'd expect that the solution would be the same
as in C: create a library instead of a simple object file, then link
against the library. It's my understanding that this should link in only
referenced functions. Is there a reason why this strategy isn't viable for
member functions?

For virtual functions (which is what I assume you're referring to by
"uncalled overriding functions"), you seem to be suggesting that one can
replace class hierarchies and dynamic binding with templates and static
binding. We both know that, in general, that's not the case. So can you
please elaborate on what you were referring to?

> There are - IMO - too many myths about code bloat and templates
> floating around. Some of those myths are just myths, some are fueled
> by antique implementations, and some are fueled by designs that are
> unsuitable for templatization (e.g. class hierarchies with excessive
> numbers of member functions or templates where only part of a large
> function depends on a template parameter). I recommend a look at the
> standard committee's technical report on performance issues (link on
> my C++ page) for how to deal with the basic performance issues.

I am familiar with that report, though I confess to not having had time to
examine its discussion of avoiding code bloat. I'd expect it to discuss
(1) explicit instantiation as a way of avoiding linkers that don't merge
duplicate instantiations and (2) migrating type-independent code (really
template-parameter-independent code) into non-template classes or
functions, but does it discuss the kinds of (to my mind) sophisticated
commonality/variability analysis that Mogens posted regarding avoiding
bloat in templates containing references to static data?

Scott

Scott Meyers

Dec 2, 2003, 11:56:29 AM

On 29 Nov 2003 11:42:24 -0500, David Abrahams wrote:
> Scott Meyers <sme...@aristeia.com> writes:
>
> >> I think that it is general applicable to eliminate or reduce code bloat from
> >> C++ templates (if/when needed) by refactoring the code according to an
> >> analysis of commonalities and variabilities.
> >
> > I agree, but I also think that this kind of thinking is very uncommon among
> > C++ programmers.
>
> Really? That sort of refactoring was a part of my daily existence
> even when I was just learning to write "good old OO" C++. It's just
> very hard to manage *any* codebase unless you do it constantly. How
> do most C++ programmers get by?

I must begin with my favorite quote from Bjarne: "Nobody knows what *most*
C++ programmers are doing."

With that caveat, let me clarify that my reference to "this kind of
thinking" was referring to the commonality/variability analysis that Mogens
posted regarding references to class statics in template code. I think
that a lot more programmers are at least familiar with the idea of moving
template-parameter-independent code out of templates, though how many take
the time to do this is not clear. My guesses are that (1) it's not
terribly common among the people who know about it, (2) not that many
people know about it, and (3) the percentage of people who know about it
who work in embedded systems is smaller than the percentage who work on
hosted implementations. But that's all speculation, though it's not blind
speculation; I do get out a little.

Here's more speculation. The reason why it's harder for templates is that
you can't physically see the replication in the source code. If I have a
class C1 and a class C2 and they both have functions that do more or less
the same thing (call these functions f and g), people can look at the
source code and see that C1::f and C2::g look pretty darn similar. That
becomes the basis of their commonality/variability analysis and
refactoring. But if they see this instead,

template<typename T>
void C<T>::f(const T& p)
{
...
}

they think they see only one function. It takes a tremendous amount of
experience before they begin to viscerally understand that they aren't
looking at one function there, they're looking at many. That's a huge
hurdle to get over, and it's possible that part of the problem is that many
people (including me) get sloppy and refer to "the function f." But f
isn't a function, it's *many* functions, and it's hard to perform
commonality/variability analysis across multiple functions if you don't
really understand that you have multiple functions in front of you.

Even after one has acquired that understanding, it's typically nontrivial
to vivisect f into the parts that are dependent on T and those that are
not. And until you've done that, you can't begin to look for opportunities
to refactor.

In my opinion, templates are complicated on two levels. First, they have
odd language rules that, to the uninitiated, border on the magical. Try to
remember the first time you found out that names from templated base
classes are not automatically inherited (i.e., visible in derived classes).
Recall the difficulty you had in understanding the difference between
template argument deduction and overload resolution. Second, they have code
generation behavior that is anything but transparent. For example, some
compilers generate multiple copies across translation units, some don't.
Some linkers merge identical instantiations regardless of type, others
don't. People like you, Dave, are so comfortable with templates that,
well, you write books about template metaprogramming. I'm supposed to know
a fair amount about C++, and TMP makes my head explode. So on a scale of 1
to Dave wrt templates, I'm about a 2. The people I consult with are
generally about a 1. The people I work with on embedded systems are about
a 0.1, not because they aren't smart, but because they don't have years of
experience with templates. In the code fragment above, they don't see
multiple functions. They see one function that has a really funny looking
declaration and that may or may not generate way more code than they
expect.
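
For anyone who has not hit the base-class rule yet, a minimal sketch with hypothetical class names: inside a template, an unqualified name is not looked up in a base class that depends on T, so the call has to be qualified with this-> (or with Base<T>::).

```cpp
template <typename T>
class Base {
public:
    int count() const { return 42; }
};

template <typename T>
class Derived : public Base<T> {
public:
    int twice() const {
        // return 2 * count();     // rejected by conforming compilers:
        //                         // count is not looked up in Base<T>
        return 2 * this->count();  // OK: lookup is deferred until T is known
    }
};
```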

How do most C++ programmers get by, you ask? In many cases, I think they
just try (template) things out, and if they don't work as expected, they
back away from them and go on doing things the way they understand.
They're not dumb. They're not scared. They just don't have time to learn
a brand new way of programming that has platform-dependent (i.e.,
unpredictable) code generation implications, and fundamentally, that's what
templates represent.

They want to use them. They just find that when they try, often as not,
they get burned. So they're cautious.

Scott

Balog Pal

Dec 2, 2003, 4:41:07 PM12/2/03

to

"stelios xanthakis" <may...@freemail.gr> wrote in message
news:8a018872.03120...@posting.google.com...

> So while OOP encourages the process of "refactoring", generic programming
> does not, and this is maybe why it's uncommon.

As Scott mentioned earlier, both approaches fend off redundancy at the
source level, which is the level that really counts when looking at the
conceptual and logical layers.

Then he observes much redundancy left in the object code -- a thing on the
physical layer that a programmer should not be concerned about. In an ideal
world. ;-) In the real one he reopened the thing as a problem that DIDN'T
go away, contrary to what everyone tends to think.

In general I don't think a programmer has to refactor or shape code _early_
to avoid problems on the physical layer, or to cover problems of the
implementation. That is hackery that does not improve but cripples code. It
may be necessary like any by-hand optimisation, but it would be really bad to
put it forward as 'do it this way'. IMHO the same "don't do it / don't do it
yet" guidelines shall apply.

Even if it turns out to be a problem with a certain compiler, I'd consider
replacing the compiler as a first step, and only then go reshaping code.

> It'd be nice to put members of templates that do not depend on the actual
> template arguments, out of the template.

I observe it happens naturally -- where it is indeed natural; where it
isn't, forcing it is just a way to obfuscation. And the compilers I see do in
fact collapse those functions. The problem with static members really does
prevent that, but I tried to think hard about what statics templates
generally use, and found almost nothing. The only frequent static I found was
a mutex to make some internal operations thread-safe. But IMHO on a system
with threads a few wasted kbytes in the image are not likely a RL problem.

> Hmmm. Unfortunatelly, templates+
> inheritance is too complex for people who just want to get the job done.

Interesting, I found that templates are stuff that people prefer not to touch
for some reason (meaning writing their own templates). Then those who do
decide to use that tool use it correctly, and without problems.

OTOH inheritance seems natural and everyone tends to jump on it, then
produces a complete mess, despite the fact that we have tons of literature on
dos and don'ts.

Paul

Mogens Hansen

Dec 2, 2003, 5:04:38 PM12/2/03

to

"stelios xanthakis" <may...@freemail.gr> wrote in message
news:8a018872.03120...@posting.google.com...

> David Abrahams <da...@boost-consulting.com> wrote in message
> news:<usmk9t...@boost-consulting.com>...
> > Scott Meyers <sme...@aristeia.com> writes:
> >
> > >> I think that it is general applicable to eliminate or reduce code
> > >> bloat from C++ templates (if/when needed) by refactoring the code
> > >> according to an analysis of commonalities and variabilities.
> > >
> > > I agree, but I also think that this kind of thinking is very uncommon
> > > among C++ programmers.
> >
> > Really? That sort of refactoring was a part of my daily existence
> > even when I was just learning to write "good old OO" C++. It's just
> > very hard to manage *any* codebase unless you do it constantly. How
> > do most C++ programmers get by?
>
> My theory is: with "good old OO", inheritance, virtuals (and maybe even
> virtual inheritance), the goal is maximum code reuse: The programmer
> has to think hard to find the common denominators of code and put them
> in base classes (not interface classes, OOP base classes).
> With templates we achieve code expansion for maximum speed: The template
> code is re-generated many times. It's just a speed/space tradeoff. By the
> way, the programmer has to think less.

It's not a reuse vs. speed issue.

Take a look at std::vector.
It is written using generic programming, and it provides good performance,
but it's definitely about reuse (or simply _use_).
It is more usable than its OO predecessors (i.e. Smalltalk-inspired libraries
like NIH - with all due respect).

>
> So while OOP encourages the process of "refactoring", generic programming
> does not, and this is maybe why it's uncommon.

Refactoring, although a buzz-word, is simply a matter of having a feedback
loop in the development process to guide the development in the right
direction based on experience.

The analysis of commonalities and variabilities (and refactoring) is
applicable and necessary irrespective of the programming paradigm, be it
procedural, object oriented, generic programming or template meta
programming. All of these can easily and rightfully exist in one
application.

Please take a look at the excellent book
Multi-Paradigm DESIGN in C++
James O. Coplien
ISBN 0-201-82467-1
which gives these issues a careful treatment.

It's not a coincidence that Scott Meyers (IIRC) called James O. Coplien's
previous book (Advanced C++) "the LSD book" because it's purple and it
expands your mind every time you use it :-)
"Multi-Paradigm DESIGN in C++" has some of the same mind-expanding
features - although at a different level of granularity.
It is demanding to read, but good things don't come for free.

Kind regards

Mogens Hansen

Hendrik Schober

Dec 2, 2003, 5:05:20 PM12/2/03

to

David Abrahams <da...@boost-consulting.com> wrote:
> [...]

> >> > I agree, but I also think that this kind of thinking is very uncommon among
> >> > C++ programmers.
> >>
> >> Really?
> >
> > Yes. Well, in my experience, anyway.
>

> [...]
> >> [...] How

> >> do most C++ programmers get by?
> >
> >
> > By just hacking another feature into the
> > tangled mass of hacks making up their
> > company's flag ship product.
>
> I was afraid someone would give that answer, but in a way it proves my
> point. The implication I read into Scott's post was: "we have to be
> especially careful to teach people to refactor when they use
> templates". I think my point is that if you don't constantly refactor
> you'll generate a wretched mess no matter what programming technology
> you use.

Actually I'm on your side here.
But you asked whether it's true that
this kind of thinking is uncommon and
how programmers get by. And I think if
one deals with the people that show up
in this ng, one will get a wrong
impression of Joe Programmer.

Schobi

--
Spam...@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

David Abrahams

Dec 2, 2003, 5:17:57 PM12/2/03

to

Scott Meyers <Use...@aristeia.com> writes:

> On 23 Nov 2003 19:35:40 -0500, Bjarne Stroustrup wrote:
>> However, use of classes often lead to unused code (because a member
>> function isn't called) and use of class hierarchies can be even more
>> problematic, because how do you avoid having uncalled overriding
>> functions left in the code.
>>
>> The solution was to use templates. An unused template function isn't
>> instantiated. We verified this through the complete too chain (from a
>> major supplier in the embedded systems world). Thus templates can be
>> used to avoid code bloat that is hard to avoid without them.
>
> For non-virtual functions, I'd expect that the solution would be the same
> as in C: create a library instead of a simple object file, then link
> against the library. It's my understanding that this should link in only
> referenced functions.

Traditionally, static libraries are just archives of object files, and
some linkers only discard unreferenced code on translation unit
boundaries. If you use any function in the translation unit, you get
the whole wad. AFAIK, that is the tradition inherited from K&R 'C',
and even with modern GNU ld you still have to choose special linker
flags to get individual unreferenced symbols to be discarded [no fair
discounting the need for special flags if you're going to argue that
people who turn off inlining get bloat when they use templates ;->].
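
For GCC with GNU ld, the "special flags" are, as far as I know, per-function sections combined with section garbage collection. A minimal sketch with hypothetical file and function names; the build lines in the comment assume a typical GNU toolchain, and how much the image actually shrinks depends on the toolchain and target:

```cpp
// Build (GNU toolchain assumed):
//   g++ -Os -ffunction-sections -fdata-sections -c geometry.cpp
//   g++ main.o geometry.o -Wl,--gc-sections -o app
//
// Without these flags, geometry.o is typically linked as a unit, so
// unused_volume() stays in the image even though nothing calls it.

// Hypothetical translation unit "geometry.cpp".
int used_area(int w, int h) { return w * h; }

int unused_volume(int w, int h, int d) { return w * h * d; }
```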

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Balog Pal

Dec 2, 2003, 5:19:57 PM12/2/03

to

"Scott Meyers" <Use...@aristeia.com> wrote in message
news:MPG.1a35d3ea9...@news.hevanet.com...

> The example of the class static was simply to demonstrate that even with a
> compiler/linker that merges duplicate function implementations, a
> seemingly innocuous class feature (in this case reference to a static data
> member) would prevent such compilers/linkers from being able to merge
> otherwise identical function implementations generated from a common
> template.

Well, but are there many such examples?
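
For concreteness, the static-member case quoted above might look like the following sketch (Counter is a hypothetical name, not Scott's original example). The body of bump() is textually the same for every T, but each instantiation refers to its own Counter<T>::calls object, so the generated functions embed different addresses and a linker that merges byte-identical functions has nothing identical to merge:

```cpp
template <typename T>
struct Counter {
    static int calls;                      // one distinct object per instantiation
    static int bump() { return ++calls; }  // refers to Counter<T>::calls
};

template <typename T>
int Counter<T>::calls = 0;
```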

I observe the whole stuff with great interest. I remember the days when most
linkers worked only at the object file level -- so if you wanted a single
function from a module you got everything there, and also anything
referenced by those not really needed functions. A function-level link
option is not really new imho.

Templates, meanwhile, just avoid that problem by instantiating only the
functions you really use -- not generating any extra. So templates should not
increase but decrease the code bloat compared to a solution doing the same.
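
A minimal sketch of that behaviour, with hypothetical names: implicitly instantiating a class template instantiates the declarations of its member functions, but a member function's definition is only instantiated when it is used. Here less_than() would not even compile for NoLess, yet the program is fine because nothing calls it.

```cpp
struct NoLess { int v; };  // deliberately has no operator<

template <typename T>
struct Box {
    T value;

    const T& get() const { return value; }

    // Needs T to support operator<; never instantiated for Box<NoLess>
    // below, so no code is generated for it and no error is reported.
    bool less_than(const Box& other) const { return value < other.value; }
};

int use_box() {
    Box<NoLess> b{NoLess{7}};
    return b.get().v;  // only Box<NoLess>::get is instantiated
}
```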

And the linker's ability to merge even absolutely unrelated functions is like
the cherry on the pie. Using some features leads to different functions.
Well. But why would someone think that's not the case in the first place?

I still wonder what the alternative "good" solution avoiding bloat would be
-- to compare the template use with.

Paul

Mogens Hansen

Dec 3, 2003, 7:46:57 PM12/3/03

to

"Scott Meyers" <Use...@aristeia.com> wrote:
> On 23 Nov 2003 19:35:40 -0500, Bjarne Stroustrup wrote:
> > However, use of classes often lead to unused code (because a member
> > function isn't called) and use of class hierarchies can be even more
> > problematic, because how do you avoid having uncalled overriding
> > functions left in the code.
> >
> > The solution was to use templates. An unused template function isn't
> > instantiated. We verified this through the complete too chain (from a
> > major supplier in the embedded systems world). Thus templates can be
> > used to avoid code bloat that is hard to avoid without them.
>
> For non-virtual functions, I'd expect that the solution would be the same
> as in C: create a library instead of a simple object file, then link
> against the library. It's my understanding that this should link in only
> referenced functions. Is there a reason why this strategy isn't viable
> for member functions?
>
> For virtual functions (which is what I assume you're referring to by
> "uncalled overriding functions"), you seem to be suggesting that one can
> replace class hierarchies and dynamic binding with templates and static
> binding. We both know that, in general, that's not the case. So can you
> please elaborate on what you were referring to?

Let me contribute with some experience based on experiments, measurements
and comparison between various design techniques and compilers.

A couple of years ago I wrote a (proof of concept) GUI library for Win32,
with an abstraction level at least as high as what MFC provides.
It provided the basic features that (almost) all GUI libraries do,
compared to the Win32 C level programming:
* Type safety (an edit control is a known type - not just a HWND)
* Easy association of instance specific data (data members in a window
object)
* Message-cracking

An important (although trivial) observation, which started the project, was
that a given application-defined window type is statically defined.
Properties like:
* which window messages the window intercepts and handles
* which data members are needed
are well known and fixed at compile-time.
Thus we don't need any dynamic binding for things like message-handling -
except the single dynamic dispatch which is inherent in MS-Windows (the
winproc).
This observation, of course, matches exactly with the fact that a typical
Win32 API based application written in C doesn't have any dynamic dispatch
except for the winproc.

The need for dynamic binding in libraries like MFC, Borland OWL is due to
the layering.
There is the library layer (MFC, OWL etc) and there is the application
layer. The application depends on the library and the library is independent
of the application.
The library provides a fixed set of functionality which can be extended
through inheritance and virtual functions. The library vendor doesn't know
my derived special window types, so they need to supply hooks for them.
I don't mean to criticize those libraries, I just wanted to compare the
design.
I chose GUI for Win32 because several libraries exist, so comparison was
doable.

My library was written with extensive use of templates and inlining, but
without a single virtual method and without any preprocessor macros. The
hooks for extension were made through static polymorphism using the
"Curiously Recurring Template Pattern", such that the library at compile-time
could learn about the specific derived window classes, although they were of
course unknown at the time the library was written.

My point is that sometimes the dynamic binding is added to a library due to
some design artifact and not because the need is inherent in the problem
domain.
When that is the case, there is room for optimization.
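
The hook mechanism can be illustrated with a minimal sketch. It assumes hypothetical message ids and class names and has nothing to do with the actual library or the real Win32 API: the base template casts itself to the derived window type, so handlers are bound at compile time and no virtual table is involved.

```cpp
// Hypothetical message ids, standing in for real window messages.
enum Message { MSG_PAINT = 1, MSG_CLOSE = 2 };

template <typename DerivedWindow>
class WindowBase {
public:
    // The single dispatch point, standing in for the winproc.
    int dispatch(int message) {
        DerivedWindow& self = static_cast<DerivedWindow&>(*this);
        switch (message) {
            case MSG_PAINT: return self.on_paint();  // bound at compile time
            case MSG_CLOSE: return self.on_close();
            default:        return 0;                // message not handled
        }
    }

    // Default handlers; a derived class hides (not overrides) the ones it wants.
    int on_paint() { return 0; }
    int on_close() { return 0; }
};

class MainWindow : public WindowBase<MainWindow> {
public:
    int on_paint() { ++paints; return 1; }  // replaces the default handler
    int paints = 0;                         // instance-specific data
};
```

Since every call in dispatch() is resolved statically, the compiler is free to inline the handlers, which is presumably where the low overhead comes from.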

I made a couple of test applications using different compilers (Microsoft,
Intel and Borland - multiple versions from each vendor) and different
libraries (Win32 API, MFC, OWL, WTL, VCL and my library).

These applications were written to compare programming style, and
performance in terms of lines of code, program size and memory usage.
I did my best to write the test applications in a style which was true to
the library and reasonably efficient (like no use of Wizard code generation).

One set of test applications was the functionality of the COLOR1 application
from the book
Programming Windows, Fifth Edition
Charles Petzold
ISBN 1-57231-995-X

The program size was (all libraries linked statically):
  Win32 C API:  100%       (the reference - considered optimal)
  MFC:          265%-590%  (dependent on compiler)
  OWL:          345%
  WTL:          130%
  VCL:          775%-900%
  My library:   115%-190%

Note that my library was able to achieve very low overhead, but also
that it was quite dependent on the quality of implementation of the
compiler.
Also note that WTL, which uses somewhat similar techniques but doesn't
provide the same level of abstraction (like no message cracking) also gave
low overhead.
Also note that it is reasonable to say that VCL provides the highest level of
abstraction and the highest level of overhead.

The two template based libraries (WTL and my library) produced significantly
lower overhead than the more traditional OO libraries (MFC, OWL and VCL).

The memory usage was
  Win32 C API:  100%       (the reference - considered optimal)
  MFC:          125%-225%  (dependent on compiler)
  OWL:          125%
  WTL:          125%-130%  (dependent on compiler)
  VCL:          145%-265%
  My library:   101%-283%  (dependent on compiler)

Note that my library was able to achieve near zero overhead, but also that
it was quite dependent on the quality of the compiler.

Another set of test applications measured how many Windows messages could be
processed in a given period of time:
  Win32 C API:  100%     (the reference - considered optimal)
  MFC:          42%-23%  (dependent on compiler)
  OWL:          49%
  WTL:          50%-60%  (dependent on compiler)
  VCL:          3%
  My library:   96%-98%  (dependent on compiler)

Note that my library was able to achieve near zero overhead.

It was my clear impression that the message loop inside Windows is highly
optimized, which means that adding a few additional assembler instructions
is clearly measurable.
It also means that this number is not very significant for real world
applications, since by far most of the time spent in a message-handler is
spent on doing the actual work and not on whatever overhead the library
imposes.

Back then I started to write a paper about my observations, but then .NET
came along and the focus shifted from Win32, and I got busy doing other
things, so the paper was never finished.
Maybe I should resurrect the project and the paper and make measurements
with newer compilers, since it's not really Win32 specific but applicable to
high performance applications in general.

Kind regards

Mogens Hansen

ka...@gabi-soft.fr

Dec 3, 2003, 7:47:54 PM12/3/03

to

"Henrik Vallgren" <henrik....@stream-space.com> wrote in message
news:<nZJyb.40219$dP1.1...@newsc.telia.net>...

> In the foreword of "Modern C++ Design", Scott Meyers writes of
> understanding C++ templates beyond "container-of-T". At MSDN you can
> read "An Introduction to C# Generics" by Juval Lowy. Here's a quote
> for you:

> "In C++, templates are nothing more than glorified macros and they do
> not persist to the compiled binary."

> It's almost amusing ...

In which way.

Templates don't persist to the compiled binary -- they are purely a
compile time feature. Generally speaking, I would consider this a
feature, not a defect.

The most important difference between templates and macros is the way
templates interact with scoping and name lookup. Generally speaking,
this is only important in generic programming because you need to
leverage off function overload resolution in order to provide
conditionals -- in short, the interaction is important in order to
overcome a weakness in templates as a generic language. Compared to
some macro languages I've used (the one in Intel's ASM 86, for example),
I'd consider templates closer to watered down macros than to glorified
macros.
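
To illustrate what "leveraging off function overload resolution to provide conditionals" can look like, here is a minimal tag-dispatch sketch. The names in namespace sketch are hypothetical; only std::iterator_traits and the standard iterator tags are real. The "if random access, else" decision is made by overload resolution at compile time, and the return value (the number of iterator operations performed) is there only so the choice is observable.

```cpp
#include <iterator>
#include <list>
#include <vector>

namespace sketch {

// Chosen when the iterator is random access: one jump.
template <typename It>
int steps_impl(It& it, int n, std::random_access_iterator_tag) {
    it += n;
    return 1;
}

// Chosen for every weaker category (their tags derive from this one): n steps.
template <typename It>
int steps_impl(It& it, int n, std::input_iterator_tag) {
    for (int i = 0; i < n; ++i) ++it;
    return n;
}

// The "conditional": overload resolution picks a branch from the tag type.
template <typename It>
int advance_counting(It& it, int n) {
    typedef typename std::iterator_traits<It>::iterator_category category;
    return steps_impl(it, n, category());
}

}  // namespace sketch
```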

But of course, any analogy will be false some of the time, and the best
way to categorize C++ templates is to say that they are C++ templates --
and to refer the person to chapter 14 of the standard (which should keep
him occupied for a bit).

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

Scott Meyers

Dec 4, 2003, 10:46:00 AM12/4/03

to

On 2 Dec 2003 17:17:57 -0500, David Abrahams wrote:

> Scott Meyers <Use...@aristeia.com> writes:
> > For non-virtual functions, I'd expect that the solution would be the same
> > as in C: create a library instead of a simple object file, then link
> > against the library. It's my understanding that this should link in only
> > referenced functions.
>
> Traditionally, static libraries are just archives of object files, and
> some linkers only discard unreferenced code on translation unit
> boundaries. If you use any function in the translation unit, you get
> the whole wad. AFAIK, that is the tradition inherited from K&R 'C',
> and even with modern GNU ld you still have to choose special linker
> flags to get individual unreferenced symbols to be discarded [no fair
> discounting the need for special flags if you're going to argue that
> people who turn off inlining get bloat when they use templates ;->].

Okay, but if you programmed in C and had a bunch of free functions that all
worked with a particular type of struct (the moral equivalent of member
functions), you'd have the same problem: if they were all defined in a single
translation unit and you were using an all-or-nothing-per-translation-unit
linker, you might end up linking in unused functions. You'd avoid the problem
by not putting all the function definitions in a single translation unit. But
you can split up C++ member function definitions in an equivalent manner. So
again it seems to me that you can apply your favorite C technique to C++ to
eliminate functions that are never called.

You can use templates, too, but it occurs to me that to make that work, you
have to introduce a template parameter of some kind. If such a parameter makes
conceptual sense, fine, but if it's just there to take advantage of the fact
that function templates are instantiated only if used, that strikes me as a
hack uglier than the hack of possibly having to put each member function
definition in its own file.

I've never thought of using templates purely as a way to avoid generating dead
code, but the more I think about it, the less I like it. To me, templates
model something fairly high-level in your design: whether you need a single
class (or function) or a collection of classes (or functions) with certain
things in common. To turn a class into a template is to revise a design
statement from "I need a particular type of thing" to "I need a whole bunch of
particular types of things." To change from a presumably true design statement
(I need only one type) to a presumably false one (I need many types) just to
avoid generating dead code strikes me as, well, misleading at best.

Scott

Henrik Vallgren

Dec 4, 2003, 10:48:48 AM12/4/03

to

<ka...@gabi-soft.fr> skrev i meddelandet
news:d6652001.03120...@posting.google.com...

> "Henrik Vallgren" <henrik....@stream-space.com> wrote in message
> news:<nZJyb.40219$dP1.1...@newsc.telia.net>...
> > In the foreword of "Modern C++ Design", Scott Meyers writes of
> > understanding C++ templates beyond "container-of-T". At MSDN you can
> > read "An Introduction to C# Generics" by Juval Lowy. Here's a quote
> > for you:
>
> > "In C++, templates are nothing more than glorified macros and they do
> > not persist to the compiled binary."
>
> > It's almost amusing ...
>
> In which way.

Microsoft starts out from C++ templates, removes metaprogramming
capabilities and compile time type safety, and releases "drastically
different" (improved) functionality.

> Templates don't persist to the compiled binary -- they are purely a
> compile time feature. Generally speaking, I would consider this a
> feature, not a defect.

Still, I can export them from my binaries, allowing cross dll interfaces?

> The most important difference between templates and macros is the way
> templates interact with scoping and name lookup. Generally speaking,
> this is only important in generic programming because you need to
> leverage off function overload resolution in order to provide
> conditionals -- in short, the interaction is important in order to
> overcome a weakness in templates as a generic language. Compared to
> some macro languages I've used (the one in Intel's ASM 86, for example),
> I'd consider templates closer to watered down macros than to glorified
> macros.

How about template metaprogramming: would you describe functional
programming as "glorified macros"? How do you define the distinction
between macros and programming languages?

Henrik Vallgren

Hendrik Schober

Dec 4, 2003, 10:53:49 AM12/4/03

to

Mogens Hansen <moge...@dk-online.dk> wrote:
> [...]

> Back then I started to write a paper about my observations, but then .NET
> came along and the focus shifted from Win32, and I got busy doing other
> things, so the paper was never finished.
> Maybe I should resurrect the project and the paper and make measurements
> with newer compilers, since it's not realy Win32 specific but applicable for
> high performance application in general.

I suppose you know this:
http://sourceforge.net/projects/notus

> Kind regards
>
> Mogens Hansen

Schobi

--
Spam...@gmx.de is never read
I'm Schobi at suespammers dot org

"Sometimes compilers are so much more reasonable than people."
Scott Meyers

Joe Greer

Dec 4, 2003, 11:10:20 AM12/4/03

to

> But of course, any analogy will be false some of the time, and the best
> way to categorize C++ templates is to say that they are C++ templates --
> and to refer the person to chapter 14 of the standard (which should keep
> him occupied for a bit).

For the most part, I fail to see the point of comparing an OS feature
to a language feature. C++ templates are more powerful in some ways.
They are more expressive and you can write code which uses the
contained object in various ways if it supports the features the
template needs. On the other hand, I defy you to write a typesafe
container using C++ templates that can be invoked in a typesafe way by
Eiffel or Visual Basic using objects declared in those languages.
.NET generics can do this. In fact that is their whole point in
existing.

Just my thoughts, for what they are worth.

joe

Francis Glassborow

Dec 4, 2003, 6:15:07 PM12/4/03

to

In article <MPG.1a384eb49...@news.hevanet.com>, Scott Meyers
<Use...@aristeia.com> writes

>I've never thought of using templates purely as a way to avoid generating dead
>code, but the more I think about it, the less I like it. To me, templates
>model something fairly high-level in your design: whether you need a single
>class (or function) or a collection of classes (or functions) with certain
>things in common. To turn a class into a template is to revise a design
>statement from "I need a particular type of thing" to "I need a whole bunch of
>particular types of things." To change from a presumably true design statement
>(I need only one type) to a presumably false one (I need many types) just to
>avoid generating dead code strikes me as, well, misleading at best.

More than a decade ago TopSpeed's smart linker not only linked modules
across the four languages they supplied but also only linked what was
used. Sad that more than ten years later that linker is defunct and no
one that I know of gets close to its functionality.

--
Francis Glassborow ACCU
If you are not using up-to-date virus protection you should not be reading
this. Viruses do not just hurt the infected but the whole community.

ka...@gabi-soft.fr

Dec 4, 2003, 7:56:11 PM12/4/03

to

David Abrahams <da...@boost-consulting.com> wrote in message

news:<u1xrng...@boost-consulting.com>...

> Scott Meyers <Use...@aristeia.com> writes:
> > On 23 Nov 2003 19:35:40 -0500, Bjarne Stroustrup wrote:
> >> However, use of classes often lead to unused code (because a member
> >> function isn't called) and use of class hierarchies can be even
> >> more problematic, because how do you avoid having uncalled
> >> overriding functions left in the code.

> >> The solution was to use templates. An unused template function
> >> isn't instantiated. We verified this through the complete too chain
> >> (from a major supplier in the embedded systems world). Thus
> >> templates can be used to avoid code bloat that is hard to avoid
> >> without them.

> > For non-virtual functions, I'd expect that the solution would be the
> > same as in C: create a library instead of a simple object file, then
> > link against the library. It's my understanding that this should
> > link in only referenced functions.

> Traditionally, static libraries are just archives of object files, and
> some linkers only discard unreferenced code on translation unit
> boundaries. If you use any function in the translation unit, you get
> the whole wad. AFAIK, that is the tradition inherited from K&R 'C',
> and even with modern GNU ld you still have to choose special linker
> flags to get individual unreferenced symbols to be discarded [no fair
> discounting the need for special flags if you're going to argue that
> people who turn off inlining get bloat when they use templates ;->].

Note that the traditional point of view is at least partially supported
by the C++ standard. The standard leaves it completely up to the
implementation as to how to specify whether a translation unit is part
of the program or not, but it pretty much considers that a translation
unit is either in the program, or it isn't -- it cannot be partially in
the program.

This is significant when the translation unit contains static variables
with dynamic initialization; if I call a function from module X, any
static variables in that translation unit must exist in the program, and
be initialized before the call, even if they are not referenced
otherwise. (As you well know, the constructor could register the object
with some globally known registry.)

This makes pulling in just part of an object file rather difficult.
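
A minimal sketch of the kind of static object in question, with hypothetical names (registry, Registrar, module_x_function): the static is never referenced by name, yet its constructor has a visible side effect, so a linker cannot simply drop it when it pulls module_x_function() out of the object file.

```cpp
#include <string>
#include <vector>

// Hypothetical globally known registry; constructed on first use so that
// it exists before any registration runs.
std::vector<std::string>& registry() {
    static std::vector<std::string> names;
    return names;
}

struct Registrar {
    explicit Registrar(const char* name) { registry().push_back(name); }
};

// Imagine the next two definitions living in module X's translation unit.

// Dynamically initialized, never referenced anywhere else.
static Registrar module_x_registration("module_x");

int module_x_function() { return 1; }
```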

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

Balog Pal

Dec 4, 2003, 7:59:17 PM12/4/03

to

"Scott Meyers" <Use...@aristeia.com> wrote in message

news:MPG.1a384eb49...@news.hevanet.com...

> Okay, but if you programmed in C and had a bunch of free functions that
> all worked with a particular type of struct (the moral equivalent of member
> functions), you'd have the same problem: if they were all defined in a
> single translation unit and you were using an
> all-or-nothing-per-translation-unit linker, you might end up linking in
> unused functions.

Exactly that was the common thing happening.

> You'd avoid the problem by not putting all the function definitions in a
> single translation unit. But you can split up C++ member function
> definitions in an equivalent manner. So again it seems to me that you can
> apply your favorite C technique to C++ to eliminate functions that are
> never called.

Yeah, in practice that solution was followed to some extent: placing only a
few functions, those likely needed together, in a single unit. That still
leaves some unused stuff, but supposedly not that much.

It is pretty boring to do even if it comes easy. Then you reach a point
where the implementations of the functions use some elements local to the TU:
static functions, maybe data. Then you can't split the unit any more. Or if
you do anyway, you must make some of the stuff public, taking on all the
problems that come with it.

It is definitely a workaround that costs quite a lot, and programmers had
better cry for more intelligent tools that can do it at the object file
level.

> You can use templates, too, but it occurs to me that to make that work,
> you have to introduce a template parameter of some kind.

IMHO templates were not presented in this context as an alternative to
traditional functions. The 'bloat-creating' capabilities were merely put
side by side, and following "the natural good way" predicts less bloat in
units using more templates than where the same thing is created via
templateless code.

> If such a parameter makes conceptual sense, fine, but if it's just there
> to take advantage of the fact that function templates are instantiated
> only if used, that strikes me as a hack uglier than the hack of possibly
> having to put each member function definition in its own file.

Sure. :)

Paul

Scott Meyers

Dec 5, 2003, 7:49:39 AM

I wrote:

> I've never thought of using templates purely as a way to avoid generating
> dead code, but the more I think about it, the less I like it. To me,
> templates model something fairly high-level in your design: whether you
> need a single class (or function) or a collection of classes (or
> functions) with certain things in common. To turn a class into a
> template is to revise a design statement from "I need a particular type
> of thing" to "I need a whole bunch of particular types of things." To
> change from a presumably true design statement (I need only one type) to
> a presumably false one (I need many types) just to avoid generating dead
> code strikes me as, well, misleading at best.

Except when it's not. Maybe I'm late to the game here, but it occurred to
me that if we can design a class to be instantiated only once and call it a
singleton class, there's no reason why we can't design a template to be
instantiated only once and call it a singleton template. Something like
this:

template<int id = 0>
class Foo {
...
};

We can then use a static assert inside Foo<id> (possibly as an empty base
class of Foo<id>) to allow only Foo<0> to compile. Users can then use
Foo<> instead of Foo and more or less deal with Foo as if it's not a
template.

I've been doing other things for a while. Is this well-tilled soil?
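
A minimal sketch of the singleton-template idea, assuming a C++11 compiler (static_assert postdates this thread; in 2003 one would have used a library-level static assert). The member names answer and unused and the function demo_singleton_template are hypothetical. Because member functions of a class template are instantiated only when used, unused should generate no code here.

```cpp
#include <cassert>

// Hypothetical "singleton template": only Foo<0>, spelled Foo<>, may be used.
template<int id = 0>
class Foo {
    // Depends on id, so it fires only when some other Foo<N> is instantiated.
    static_assert(id == 0, "Foo is a singleton template; use Foo<>");
public:
    int answer() const { return 42; }  // instantiated because the demo calls it
    int unused() const { return -1; }  // never called, so never instantiated
};

// Foo<> reads almost like an ordinary class at the point of use.
inline void demo_singleton_template() {
    Foo<> f;
    assert(f.answer() == 42);
    // Foo<1> g;  // would fail to compile with the message above
}
```

The cost is that every use site has to write Foo<> rather than Foo.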

Francis Glassborow

Dec 5, 2003, 5:31:18 PM

In article <MPG.1a39ad0a7...@news.hevanet.com>, Scott Meyers
<Use...@aristeia.com> writes

>Except when it's not. Maybe I'm late to the game here, but it occurred to
>me that if we can design a class to be instantiated only once and call it a
>singleton class, there's no reason why we can't design a template to be
>instantiated only once and call it a singleton template. Something like
>this:
>
> template<int id = 0>
> class Foo {
> ...
> };
>
>We can then use a static assert inside Foo<id> (possibly as an empty base
>class of Foo<id>) to allow only Foo<0> to compile. Users can then use
>Foo<> instead of Foo and more or less deal with Foo as if it's not a
>template.
>
>I've been doing other things for a while. Is this well-tilled soil?

I think the way to go is:

template<int id = 0> class Foo; // declaration not definition

template<>
class Foo<0>{
...
}; // definition for complete specialisation

Then you can use Foo<> as a 'singular class'

But this is rather out of my league.

--
Francis Glassborow ACCU
If you are not using up-to-date virus protection you should not be reading
this. Viruses do not just hurt the infected but the whole community.

stelios xanthakis

Dec 5, 2003, 6:09:48 PM

"Mogens Hansen" <moge...@dk-online.dk> wrote in message news:<bqhu70$2elu$1...@news.cybercity.dk>...

There are two kinds of reuse:
1. The programmer writes the code once and calls it many times
2. In the generated assembly the code exists once.

Consider the difference between a function and a macro. In both cases we write
the code once. In the case of a macro the code is expanded many times,
wherever it is invoked. Can we say that a macro is code reuse?
As far as typing is concerned, yes. As far as executable size is concerned, no.

I don't have access to the std::vector code but what I know is that
templates are "smart macros": Code is expanded for each specialization
with different types.

But I should really look before jumping to conclusions.

> The analysis of commonalities and variabilities (and refactoring) is
> applicable and necessary irrespective of the programming paradigm, be it
> procedural, object-oriented, generic programming or template
> metaprogramming, all of which can easily and rightfully coexist in one
> application.

Agree.

>
> Please take a look at the excellent book
> Multi-Paradigm Design for C++
> James O. Coplien
> ISBN 0-201-82467-1
> which gives these issues a careful treatment.

Thanks. I'll check it out (stopped buying programming books after
TCPL which I read every 6 months).

Stelios

David Abrahams

Dec 5, 2003, 6:11:07 PM

Scott Meyers <Use...@aristeia.com> writes:

> I wrote:
>
> > I've never thought of using templates purely as a way to avoid generating
> > dead code

Me neither; that wasn't the point. My point was that if you choose a
technology (templates, runtime polymorphism, .NET, whatever) and
aren't thinking too much about doing stuff to keep your code size
down, templates stand as good a chance of doing a good job with code
size as any of the other technologies. If you switch
technologies on a project where you are thinking about code size
without adjusting your approach to reducing it accordingly, you
shouldn't be surprised if the code size jumps.

> > , but the more I think about it, the less I like it. To me,
> > templates model something fairly high-level in your design: whether you
> > need a single class (or function) or a collection of classes (or
> > functions) with certain things in common. To turn a class into a
> > template is to revise a design statement from "I need a particular type
> > of thing" to "I need a whole bunch of particular types of things." To
> > change from a presumably true design statement (I need only one type) to
> > a presumably false one (I need many types) just to avoid generating dead
> > code strikes me as, well, misleading at best.
>
> Except when it's not. Maybe I'm late to the game here, but it occurred to
> me that if we can design a class to be instantiated only once and call it a
> singleton class, there's no reason why we can't design a template to be
> instantiated only once and call it a singleton template. Something like
> this:
>
> template<int id = 0>
> class Foo {
> ...
> };
>
> We can then use a static assert inside Foo<id> (possibly as an empty base
> class of Foo<id>) to allow only Foo<0> to compile. Users can then use
> Foo<> instead of Foo and more or less deal with Foo as if it's not a
> template.
>
> I've been doing other things for a while. Is this well-tilled soil?

STLPort has been doing something similar for some time to generate
global variables even in its header-only configurations.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Howard Hinnant

Dec 5, 2003, 6:14:24 PM

In article <MPG.1a39ad0a7...@news.hevanet.com>,
Scott Meyers <Use...@aristeia.com> wrote:

Fwiw, I've been using this technique in the Metrowerks std::lib for
about 5 years now. Except that I templated my class on a bool instead
of an int. The class in question is an implementation detail (of
set/map), and thus not part of the public interface. <shrug> It's been
working for me (and my customers) pretty well, and I really don't have
any motivation to change it at the moment. Though I'm always looking
for ways to improve our lib.

-Howard

Mogens Hansen

Dec 5, 2003, 6:37:36 PM

"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote:

> More than a decade ago TopSpeed's smart linker not only linked modules
> across the four languages they supplied but also only linked what was
> used. Sad that more than ten years later that linker is defunct and no
> one that I know of gets close to its functionality.

I tried a simple example with 3 different current compilers on MS-Windows:
* Microsoft Visual C++ 7.1 (.NET 2003)
* Intel C++ 7.1 for Windows
* Borland C++Builder V6.0

All of them were able to eliminate unused functions even if some function
in the same translation unit was referenced.

For the sample code below, only
void foo();
was part of the final application.
The functions
void bar();
void foobar();
were eliminated by the linkers.
I verified this by looking at the linker-generated MAP files and by looking
at the generated programs with a debugger.

Borland C++Builder is capable of compiling and linking Pascal (Delphi
Language) source code seamlessly into a C++ based application.
While it compiles the Pascal code, it generates C++ header files.

IIRC this is the way it has been for a long time for all those products.

Kind regards

Mogens Hansen

<C++ code - fnyt.cpp>
void foo();
void bar();
void foobar();

int main()
{
    foo();   // only foo is referenced; bar and foobar are never called
}
<C++ code - fnyt.cpp/>

<C++ code - foobar.cpp>
#include <iostream>

using namespace std;

void foo()
{
    cout << "foo called" << endl;
}

void bar()
{
    cout << "bar called" << endl;
}

void foobar()
{
    foo();
    bar();
}
<C++ code - foobar.cpp/>

David Abrahams

Dec 6, 2003, 5:23:31 AM

Francis Glassborow <fra...@robinton.demon.co.uk> writes:

> In article <MPG.1a39ad0a7...@news.hevanet.com>, Scott Meyers
> <Use...@aristeia.com> writes
>>Except when it's not. Maybe I'm late to the game here, but it occurred to
>>me that if we can design a class to be instantiated only once and call it a
>>singleton class, there's no reason why we can't design a template to be
>>instantiated only once and call it a singleton template. Something like
>>this:
>>
>> template<int id = 0>
>> class Foo {
>> ...
>> };
>>
>>We can then use a static assert inside Foo<id> (possibly as an empty base
>>class of Foo<id>) to allow only Foo<0> to compile. Users can then use
>>Foo<> instead of Foo and more or less deal with Foo as if it's not a
>>template.
>>
>>I've been doing other things for a while. Is this well-tilled soil?
> I think the way to go is:
>
> template<int id = 0> class Foo; // declaration not definition
>
> template<>
> class Foo<0>{
> ...
> }; // definition for complete specialisation
>
> Then you can use Foo<> as a 'singular class'

The problem if you do that is that Foo gets compiled immediately
whether or not it's ever used. You might as well drop the template
part.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Scott Meyers

Dec 6, 2003, 5:26:22 AM

On 5 Dec 2003 18:11:07 -0500, David Abrahams wrote:
> Scott Meyers <Use...@aristeia.com> writes:

> STLPort has been doing something similar for some time to generate
> global variables even in its header-only configurations.

Do you happen to know what their motivation was for using a template instead
of a class if they didn't want to instantiate on multiple parameter values?

On 5 Dec 2003 18:14:24 -0500, Howard Hinnant wrote:
> Fwiw, I've been using this technique in the Metrowerks std::lib for
> about 5 years now. Except that I templated my class on a bool instead
> of an int. The class in question is an implementation detail (of
> set/map), and thus not part of the public interface. <shrug> It's been
> working for me (and my customers) pretty well, and I really don't have
> any motivation to change it at the moment. Though I'm always looking
> for ways to improve our lib.

But what was the motivation for using a template instead of a class,
assuming you were never going to instantiate on more than one
parameter value?

Scott

Thorsten Ottosen

Dec 6, 2003, 5:37:33 AM

"stelios xanthakis" <may...@freemail.gr> wrote in message
news:8a018872.03120...@posting.google.com...
> "Mogens Hansen" <moge...@dk-online.dk> wrote in message
news:<bqhu70$2elu$1...@news.cybercity.dk>...
> > "stelios xanthakis" <may...@freemail.gr> wrote in message

> Consider the difference between a function and a macro. In both cases we write
> the code once. In the case of a macro the code is expanded many times,
> wherever it is invoked. Can we say that a macro is code reuse?
> As far as typing is concerned, yes. As far as executable size is concerned, no.

It depends. If the function is small, then inlining can actually make the
code smaller, because there is no need for a new stack frame.

> I don't have access to the std::vector code but what I know is that
> templates are "smart macros": Code is expanded for each specialization
> with different types.

Not necessarily. It is widely known that e.g. pointers can share a single
void-pointer implementation. Even built-in types such as int and long can share
the same code in some cases (if they have the same size).
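
One way this sharing is commonly arranged can be sketched as follows. This is an illustration only, not how any particular std::vector is implemented: the names PtrStackImpl, PtrStack and demo_ptr_stack are hypothetical. The non-template core is compiled once, and each instantiation of the typed wrapper adds only a few inline casts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-template core: compiled once, shared by every pointer type.
class PtrStackImpl {
    std::vector<void*> items_;
public:
    void push(void* p) { items_.push_back(p); }
    void* top() const { return items_.back(); }
    void pop() { items_.pop_back(); }
    std::size_t size() const { return items_.size(); }
};

// Thin typed wrapper: each instantiation is only a handful of inline casts.
template<class T>
class PtrStack {
    PtrStackImpl impl_;
public:
    void push(T* p) { impl_.push(p); }
    T* top() const { return static_cast<T*>(impl_.top()); }
    void pop() { impl_.pop(); }
    std::size_t size() const { return impl_.size(); }
};

inline void demo_ptr_stack() {
    int i = 1;
    double d = 2.0;
    PtrStack<int> si;
    PtrStack<double> sd;
    si.push(&i);
    sd.push(&d);
    assert(*si.top() == 1);    // both stacks run the same PtrStackImpl code
    assert(*sd.top() == 2.0);
}
```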

> But I should really look before jumping to conclusions.

yes.

best regards

Thorsten

Francis Glassborow

Dec 6, 2003, 9:36:00 AM

In article <bqqgst$18rf$1...@news.cybercity.dk>, Mogens Hansen
<moge...@dk-online.dk> writes

>I tried a simple example with 3 different current compilers on MS-Windows:
> * Microsoft Visual C++ 7.1 (.NET 2003)
> * Intel C++ 7.1 for Windows
> * Borland C++Builder V6.0
>
>All of them were able to eliminate unused functions even if some function
>in the same translation unit was referenced.
>
>For the sample code below only
> void foo();
>was part of the final application
>The functions
> void bar();
> void foobar();
>were eliminated by the linkers.
>I verified this by looking at the linker-generated MAP files and by looking
>at the generated programs with a debugger.
>
>
>Borland C++Builder is capable of compiling and linking Pascal (Delphi
>Language) source code seamless into a C++ based application.
>While it compiles the Pascal code, it generates C++ header files.
>
>
>IIRC this is the way it has been for a long time for all those products.

Did they also eliminate statics and string literals that were part of
those functions' definitions? The last time I looked (which admittedly
was some time ago) they did not. OTOH the last time I looked,
optimisation flags set for aggressive optimisation were removing things
that should not be removed. The problem for many compilers/linkers is
getting it exactly right: 'as much as possible but no more.'

--
Francis Glassborow ACCU
If you are not using up-to-date virus protection you should not be reading
this. Viruses do not just hurt the infected but the whole community.

Balog Pal

Dec 6, 2003, 9:38:35 AM

"Scott Meyers" <Use...@aristeia.com> wrote in message

news:MPG.1a39ad0a7...@news.hevanet.com...

> > code strikes me as, well, misleading at best.
>
> Except when it's not. Maybe I'm late to the game here, but it occurred to
> me that if we can design a class to be instantiated only once and call it a
> singleton class, there's no reason why we can't design a template to be
> instantiated only once and call it a singleton template. Something like
> this:
>
> template<int id = 0>
> class Foo {
> ...
> };

That would be a cool idea for a different C++ language, one where you could
take a class definition, insert that template<int id = 0>, and have everything
work as it did without it (let's assume types got magically adjusted at the use
site).

But in fact you must rewrite the implementation of the class too, as you get
hit by two-phase lookup, dependent names, names from a dependent base class not
being found, and so on.
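
A small sketch of one such rewrite, assuming a conforming compiler that implements two-phase lookup. The names CounterBase, Counter and demo_dependent_base are hypothetical. Once the base class depends on the template parameter, an unqualified call to a base member is no longer found and has to be spelled this->.

```cpp
#include <cassert>

template<int id = 0>
struct CounterBase {
    int start() const { return 10; }
};

template<int id = 0>
struct Counter : CounterBase<id> {
    int next() const {
        // return start() + 1;     // ill-formed: an unqualified name is not
        //                         // looked up in a dependent base class
        return this->start() + 1;  // OK: lookup is deferred to instantiation
    }
};

inline void demo_dependent_base() {
    Counter<> c;
    assert(c.next() == 11);
}
```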

I guess it's only worth it in extreme cases.

[Going out of 2003, I see very few excuses for compilers/linkers not to
provide "function-level linking".]

Paul

P.J. Plauger

Dec 6, 2003, 9:41:16 AM

"Scott Meyers" <Use...@aristeia.com> wrote in message

news:MPG.1a3ab8b63...@news.hevanet.com...

> On 5 Dec 2003 18:11:07 -0500, David Abrahams wrote:
> > Scott Meyers <Use...@aristeia.com> writes:
> > STLPort has been doing something similar for some time to generate
> > global variables even in its header-only configurations.
>
> Do you happen to know what their motivation was for using a template instead
> of a class if they didn't want to instantiate on multiple parameter values?
>
> On 5 Dec 2003 18:14:24 -0500, Howard Hinnant wrote:
> > Fwiw, I've been using this technique in the Metrowerks std::lib for
> > about 5 years now. Except that I templated my class on a bool instead
> > of an int. The class in question is an implementation detail (of
> > set/map), and thus not part of the public interface. <shrug> It's been
> > working for me (and my customers) pretty well, and I really don't have
> > any motivation to change it at the moment. Though I'm always looking
> > for ways to improve our lib.
>
> But what was the motivation for using a template instead of a class,
> assuming you were never going to instantiate on more than one
> parameter value?

You can include a template in multiple translation units and not
get multiple definitions of the static object initializers. Thus,
you can keep all the code in a header.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

David Abrahams

Dec 6, 2003, 7:58:00 PM

Scott Meyers <Use...@aristeia.com> writes:

> On 5 Dec 2003 18:11:07 -0500, David Abrahams wrote:
> > Scott Meyers <Use...@aristeia.com> writes:
> > STLPort has been doing something similar for some time to generate
> > global variables even in its header-only configurations.
>
> Do you happen to know what their motivation was for using a template instead
> of a class if they didn't want to instantiate on multiple parameter values?
>
> On 5 Dec 2003 18:14:24 -0500, Howard Hinnant wrote:
> > Fwiw, I've been using this technique in the Metrowerks std::lib for
> > about 5 years now. Except that I templated my class on a bool instead
> > of an int. The class in question is an implementation detail (of
> > set/map), and thus not part of the public interface. <shrug> It's been
> > working for me (and my customers) pretty well, and I really don't have
> > any motivation to change it at the moment. Though I'm always looking
> > for ways to improve our lib.
>
> But what was the motivation for using a template instead of a class,
> assuming you were never going to instantiate on more than one
> parameter value?

I'm not sure, but a static member that never gets used still has to
be initialized... unless it's a member of an uninstantiated template.
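
A minimal sketch of that effect, with hypothetical names (Registry, note_init, read_used, demo_lazy_static). The definition of a static data member of a class template is instantiated only if the member is actually used, so the initializer of the unused member should never run.

```cpp
#include <cassert>

int init_calls = 0;                       // counts initializers that really ran
int note_init() { return ++init_calls; }

template<int id = 0>
struct Registry {
    static int used;    // referenced below, so its definition is instantiated
    static int unused;  // never referenced, so its definition never is
};

template<int id> int Registry<id>::used = note_init();
template<int id> int Registry<id>::unused = note_init();

// The only use of Registry<> in the program.
int read_used() { return Registry<>::used; }

inline void demo_lazy_static() {
    assert(read_used() == 1);  // used was initialized by the first note_init call
    assert(init_calls == 1);   // unused was never initialized
}
```

With an ordinary non-template class, both static members would have to be defined and initialized whether or not anything referred to them.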

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

James Kanze

Dec 6, 2003, 8:05:17 PM

"Henrik Vallgren" <henrik....@stream-space.com> writes:

|> > Templates don't persist to the compiled binary -- they are purely
|> > a compile time feature. Generally speaking, I would consider this
|> > a feature, not a defect.

|> Still, I can export them from my binaries, allowing cross dll
|> interfaces?

Not as templates, at least not with the compilers I know. As far as the
standard is concerned, all of the phases of compilation (including
linking and template instantiation) are finished before execution
starts. Most implementations today permit some form of dynamic linking
as an extension, but I know of none which permit new instantiation of
templates during the dynamic link phase.

|> > The most important difference between templates and macros is the
|> > way templates interact with scoping and name lookup. Generally
|> > speaking, this is only important in generic programming because
|> > you need to leverage off function overload resolution in order to
|> > provide conditionals -- in short, the interaction is important in
|> > order to overcome a weakness in templates as a generic language.
|> > Compared to some macro languages I've used (the one in Intel's
|> > ASM 86, for example), I'd consider templates closer to watered
|> > down macros than to glorified macros.

|> How about template metaprogramming: would you describe functional
|> programming as "glorified macros"? How do you define the distinction
|> between macros and programming languages?

Macros are a programming language. Historically, the main difference is
what they output: not executable code, but text. With this distinction,
however, not even the C/C++ preprocessor is a macro language, because
it outputs (according to the standard) a token stream, and not text.

In practice, I would say that what distinguishes macros from other
programming languages is the fact that they are embedded into some other
language; their output is input for the containing language, in some
form or another. But that they operate at a level above the containing
language. By this definition, templates aren't macros, but it is only
the second condition which prevents them from being macros: while they
provide a meta-language which serves to generate input for the basic
language, they don't operate above the containing language; they respect
scope, they have access to type information and internal constants, etc.

What the language is used for (meta-programming, etc.) is irrelevant to
the question.

--
James Kanze mailto:ka...@gabi-soft.fr

Conseils en informatique orientée objet/

Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France +33 1 41 89 80 93

Matt Austern

Dec 6, 2003, 8:59:22 PM

Scott Meyers <Use...@aristeia.com> writes:

> On 5 Dec 2003 18:11:07 -0500, David Abrahams wrote:
> > Scott Meyers <Use...@aristeia.com> writes:
> > STLPort has been doing something similar for some time to generate
> > global variables even in its header-only configurations.
>
> Do you happen to know what their motivation was for using a template instead
> of a class if they didn't want to instantiate on multiple parameter values?

Yes, I'm the one that did this. I wanted the STL implementation
(which at the time was just an STL implementation, not a full
standard library implementation) to be in the form of headers only
instead of headers and source files. So there are a few things that
are templates or inline functions for purely artificial reasons, just
because I need to be able to include them in multiple translation
units. That's the primary reason that the old SGI-style allocators
were templatized, for example.

It was an important design goal at the time I did this work, for
reasons that have long been irrelevant.

Mogens Hansen

Dec 6, 2003, 9:01:14 PM

"Francis Glassborow" <fra...@robinton.demon.co.uk> wrote:

> Did they also eliminate statics and string literals that were part of
> those functions' definitions.

The Borland compiler doesn't remove the unused string literal
"bar called"
even though the function
void bar();
isn't linked into the application.
I could find it in the exe file, using a hex editor.

The Microsoft and Intel compiler tool chains on MS-Windows do remove the
unused string literal.
Perhaps it's unfair to count the Microsoft and the Intel compilers as two
different implementations, since AFAIK the Intel compiler uses the Microsoft
linker.

Kind regards

Mogens Hansen

Glen Low

Dec 6, 2003, 9:01:38 PM

> You can include a template in multiple translation units and not
> get multiple definitions of the static object initializers. Thus,
> you can keep all the code in a header.

Interesting, I've been trying to figure out the same problem, could
you elaborate?

For example, suppose a.h is

template <typename T> struct U
{
    static const int arr [2];
};

Now if I put in the header a static object initializer

template <> const int U<int>::arr [2] = {1, 2};

That is fine until I have multiple translation units #including a.h,
then I get multiple definitions at the link stage (gcc 3.3), so how
did you fix that?

Cheers,
Glen Low, Pixelglow Software
www.pixelglow.com

John Potter

Dec 7, 2003, 9:47:23 AM

On 6 Dec 2003 21:01:38 -0500, gle...@pixelglow.com (Glen Low) wrote:

> > You can include a template in multiple translation units and not
> > get multiple definitions of the static object initializers. Thus,
> > you can keep all the code in a header.

> Interesting, I've been trying to figure out the same problem, could
> you elaborate?

> For example, suppose a.h is

> template <typename T> struct U
> {
> static const int arr [2];
> };

> Now if I put in the header a static object initializer

> template <> const int U<int>::arr [2] = {1, 2};

That specialization is not a template any more.

> That is fine until I have multiple translation units #including a.h,
> then I get multiple definitions at the link stage (gcc 3.3), so how
> did you fix that?

template <class T>
int const U<T>::arr[2] = { 1, 2 };

Now you can include it in more than one translation unit and instantiate
it in more than one and gcc will not have a problem. You might even
output the address of the array in several units to see that there is
only one.

John

Pete Becker

Dec 7, 2003, 10:02:07 AM

Glen Low wrote:
>
> template <> const int U<int>::arr [2] = {1, 2};
>
> That is fine until I have multiple translation units #including a.h,
> then I get multiple definitions at the link stage (gcc 3.3), so how
> did you fix that?
>

That's an explicit specialization, not a template. Here's what works:

template<class T> struct demo
{
    static int data;
};

template<class T>
int demo<T>::data = 17;

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

P.J. Plauger

Dec 7, 2003, 10:03:11 AM

"Glen Low" <gle...@pixelglow.com> wrote in message
news:9215d7ac.03120...@posting.google.com...

> > You can include a template in multiple translation units and not
> > get multiple definitions of the static object initializers. Thus,
> > you can keep all the code in a header.
>
> Interesting, I've been trying to figure out the same problem, could
> you elaborate?
>
> For example, suppose a.h is
>
> template <typename T> struct U
> {
> static const int arr [2];
> };
>
> Now if I put in the header a static object initializer
>
> template <> const int U<int>::arr [2] = {1, 2};
>
> That is fine until I have multiple translation units #including a.h,
> then I get multiple definitions at the link stage (gcc 3.3), so how
> did you fix that?

You've written an explicit specialization, which is just another
spelling for a definition. Leave it in template form and you
can tolerate multiple inclusions.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Glen Low

Dec 8, 2003, 6:06:32 AM

> You've written an explicit specialization, which is just another
> spelling for a definition. Leave it in template form and you
> can tolerate multiple inclusions.

Thanks for all your suggestions, but they don't address the big
picture of the problem. What I need is a way of mapping a type to a
constant at compile time. In the case of a simple constant, there is
no problem, but as soon as I need to map a type to a complex constant like
the array above, I get the multiple inclusion problem.

I suppose, stretching the gray cells a bit, I could use a partial
specialization instead of an explicit specialization.

template <typename T, int dummy = 0> struct U
{
    static const int arr [2];
};

template <int dummy> const int U<int,dummy>::arr [2] = {1, 2};

I haven't tested this out yet, though.
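
A sketch of how the partial-specialization idea could be made to compile, on the assumption that the partial specialization U<int, dummy> is itself declared before its static member is defined (a member of an undeclared partial specialization cannot be defined out of class). The generic {0, 0} definition and the function demo_type_to_array are illustrative additions.

```cpp
#include <cassert>

// Primary template: the extra dummy parameter exists only to keep
// every definition below in template form.
template<typename T, int dummy = 0>
struct U {
    static const int arr[2];
};

// Generic definition, used for any T without its own specialization.
template<typename T, int dummy>
const int U<T, dummy>::arr[2] = {0, 0};

// The partial specialization must be declared before its member is defined.
template<int dummy>
struct U<int, dummy> {
    static const int arr[2];
};

// Still a template (dummy is unspecified), so it may live in a header
// that is included from several translation units.
template<int dummy>
const int U<int, dummy>::arr[2] = {1, 2};

inline void demo_type_to_array() {
    assert(U<int>::arr[0] == 1 && U<int>::arr[1] == 2);
    assert(U<char>::arr[0] == 0);
}
```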

Cheers,
Glen Low, Pixelglow Software
www.pixelglow.com

Allan W

Dec 10, 2003, 6:14:19 AM

James Kanze <ka...@alex.gabi-soft.fr> wrote

> Macros are a programming language. Historically, the main difference is
> what they output: not executable code, but text. With this distinction,
> however, not even the C/C++ preprocessor is a macro language, because
> it outputs (according to the standard) a token stream, and not text.

Conceptually, the C++ preprocessor does output text, and then the next
phase of compiling reads it back in again. The fact that the intermediate
I/O is skipped is merely an optimization.

The Microsoft Visual C++ compiler (6.0, anyway) includes a command-line
switch, /P, that causes the compiler to write preprocessed output to a
file; by default, the filetype is .i. Other switches control if the
preprocessed output goes to stdout instead, and if it includes #line
directives. I would assume that most (or all) of the other C++ compilers
have a similar facility.

> In practice, I would say that what distinguishes macros from other
> programming languages is the fact that they are embedded into some other
> language; their output is input for the containing language, in some
> form or another. But that they operate at a level above the containing
> language. By this definition, templates aren't macros, but it is only
> the second condition which prevents them from being macros: while they
> provide a meta-language which serves to generate input for the basic
> language, they don't operate above the containing language; they respect
> scope, they have access to type information and internal constants, etc.

I agree.

ka...@gabi-soft.fr

Dec 10, 2003, 3:12:42 PM

all...@my-dejanews.com (Allan W) wrote in message
news:<7f2735a5.03120...@posting.google.com>...

> James Kanze <ka...@alex.gabi-soft.fr> wrote
> > Macros are a programming language. Historically, the main
> > difference is what they output: not executable code, but text. With
> > this distinction, however, not even the C/C++ preprocessor is a
> > macro language, because it outputs (according to the standard) a
> > token stream, and not text.

> Conceptually, the C++ preprocessor does output text, and then the next
> phase of compiling reads it back in again. The fact that the
> intermediate I/O is skipped is merely an optimization.

I think you've got it backwards. Conceptually, the C++ preprocessor
does not output text, at least according to the standard. Many
implementations do in fact use text as the intermediate format, but this
is an implementation technique, not something mandated by the standard.

(Note that this is a change with regards to K&R C. I guess you could
consider it an innovation of the first C standard.)

> The Microsoft Visual C++ compiler (6.0, anyway) includes command-line
> switch /P causes the compiler to write preprocessed output to a file;
> by default, the filetype is .i.

Every C or C++ compiler I know of has an option to output the
"preprocessor output". The standard doesn't require it, however, and it
is also quite possible for the compiler to re-textize the tokens -- it
has to be able to do so in order to implement the ## operator anyway.

Relevant sections of the standard are §2.1 (Phases of translation),
particularly paragraphs 3, 4 and 7:

3 The source file is decomposed into preprocessing tokens and
sequences of white-space characters (including comments). [...]

4 Preprocessing directives are executed and macro invocations are
expanded. [...]

[...]

7 White-space characters separating tokens are no longer significant.
Each preprocessing token is converted into a token.

Note that from phase three on, the compiler is dealing with
"preprocessor tokens", and not text.

This does create problems later on. Consider §16.3.3/2-3, for example:

If, in the replacement list, a parameter is immediately preceded or
followed by a ## preprocessing token, the parameter is replaced by
the corresponding argument's preprocessing token sequence [without
expansion].

For both object-like and function-like macro invocations, before the
replacement list is reexamined for more macro names to replace, each
instance of a ## preprocessing token in the replacement list (not
from an argument) is deleted and the preceding preprocessing token
is concatenated with the following preprocessing token. If the
result is not a valid preprocessing token, the behavior is undefined.
The resulting token is available for further macro replacement. The
order of evaluation of ## operators is unspecified.

On one hand, you will note how it speaks not of text, but of
concatenating tokens. On the other hand, you will note how it very
sneakily omits defining what concatenating a token means:-). (There's
also the problem that what constitutes a token is context dependent. So
if the results of concatenation give <xyz.h> after an #include, it's
legal, but if they give it elsewhere, it's undefined behavior.) All of
the compilers I know interpret it to mean: convert the two operand
tokens to text, concatenate the text, then reconvert the results to a
token. If the result isn't a legal token, reformat your hard disk:-).
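
For contrast, a small sketch of ## uses whose result is a single valid preprocessing token, so their behavior is well defined. The macro name PASTE and the function demo_paste are hypothetical.

```cpp
#include <cassert>

#define PASTE(a, b) a##b   // pastes two preprocessing tokens into one

int PASTE(foo, bar) = 42;  // identifiers foo and bar become identifier foobar
int PASTE(x, 1) = 7;       // identifier x and pp-number 1 become identifier x1

// PASTE(+, /) would try to form "+/", which is not a valid preprocessing
// token, so the behavior would be undefined.

inline void demo_paste() {
    assert(foobar == 42);
    assert(x1 == 7);
}
```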

Seriously, most compilers do simply work with text, and all that is
really necessary is that the text string that results after all of the
## operators have been evaluated can be correctly tokenized, in the
context in which it appears. Most, but not all: g++ 3.3.1, for example,
will complain about things like:

#define build( x, y ) <##x##/##y##>
#include build( threading, mutex.h )

This works with every other compiler I've tried.

It's arguably legal, because the total results of all of the ## are a
legal preprocessing token in this context. But the argument is very
tenuous; the intermediate results aren't. And even if you accept that
argument, you then have the problem that in this particular context, <,
/, and > are not legal preprocessor tokens, so you had a tokenization
error before the ## was invoked. But frankly, I'd like someone to
explain §16.2/4 to me in any plausible way without introducing text:-).
(I'd have posted a bug report to the g++ developers if I could figure
out what the standard really says here. Somehow saying that I don't
know what the standard says you're to do, so you have to do what I want,
doesn't seem too serious, however.)

--
James Kanze GABI Software mailto:ka...@gabi-soft.fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung

11 rue de Rambouillet, 78460 Chevreuse, France, +33 (0)1 30 23 45 16

Balog Pal

Dec 10, 2003, 3:27:37 PM

"Allan W" <all...@my-dejanews.com> wrote in message
news:7f2735a5.03120...@posting.google.com...

> Conceptually, the C++ preprocessor does output text, and then the next
> phase of compiling reads it back in again. The fact that the intermediate
> I/O is skipped is merely an optimization.

There's also preprocessing-tokenisation.

> The Microsoft Visual C++ compiler (6.0, anyway) includes a command-line
> switch, /P, that causes the compiler to write preprocessed output to a
> file; by default, the filetype is .i. Other switches control if the
> preprocessed output goes to stdout instead, and if it includes #line
> directives. I would assume that most (or all) of the other C++ compilers
> have a similar facility.

Yes, but actually reading and compiling that file as the primary input may
not be the same as the original compilation. If you use token-pasting, it
will produce a single token. But once written out as text and re-read, it
may split into multiple tokens with a different meaning.

Paul

Ben Hutchings

Dec 10, 2003, 3:29:03 PM

Allan W wrote:
> James Kanze <ka...@alex.gabi-soft.fr> wrote
> > Macros are a programming language. Historically, the main difference is
> > what they output: not executable code, but text. With this distinction,
> > however, not even the C/C++ preprocessor is a macro language, because
> > they output (according to the standard) a token stream, and not text.
>
> Conceptually, the C++ preprocessor does output text, and then the next
> phase of compiling reads it back in again.

Read what the standard has to say (section 2.1). Phases 1 to 4 of
translation cover what is generally thought of as preprocessing.
The result of those phases as described is a sequence of
preprocessing tokens, not a mere sequence of characters.

> The fact that the intermediate I/O is skipped is merely an optimization.

The intermediate I/O is an implementation detail.

> The Microsoft Visual C++ compiler (6.0, anyway) includes a command-line
> switch, /P, that causes the compiler to write preprocessed output to a
> file; by default, the filetype is .i. Other switches control if the
> preprocessed output goes to stdout instead, and if it includes #line
> directives.

It also does this wrongly. I forget exactly what the problem is
but ISTR that it can in some cases delete necessary whitespace
between tokens.
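
A small sketch of why whitespace between tokens can matter once the token stream is written back out as text. EMPTY is a hypothetical macro invented for this illustration, and the example is a generic one, not a reconstruction of the specific Visual C++ problem mentioned above:

```cpp
#include <cassert>

// EMPTY is a hypothetical macro that expands to no tokens at all.
#define EMPTY

// The preprocessing tokens here are `-`, `EMPTY`, `-`, `x`. After macro
// expansion the token stream is `-` `-` `x`: two unary minus operators.
inline int negate_twice(int x) {
    return -EMPTY-x;
}

// Self-check: negating twice gives back the original value.
inline void check_negate_twice() {
    assert(negate_twice(5) == 5);
}

// If a preprocessor dumped those three tokens as text with no space
// between the two `-` tokens, the text would read `--x`, and re-reading
// it would tokenize as a pre-decrement of x instead.
```

So a textual dump only round-trips if the tool takes care to keep such tokens apart.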

> I would assume that most (or all) of the other C++ compilers
> have a similar facility.

<snip>

"Traditional" C and C++ compilers use a separate program for
preprocessing, so of course they can do this.

Allan W

Dec 12, 2003, 9:19:23 PM

> > James Kanze <ka...@alex.gabi-soft.fr> wrote
> > > Macros are a programming language. Historically, the main
> > > difference is what they output: not executable code, but text. With
> > > this distinction, however, not even the C/C++ preprocessor is a
> > > macro language, because they output (according to the standard) a
> > > token stream, and not text.

> all...@my-dejanews.com (Allan W) wrote

> > Conceptually, the C++ preprocessor does output text, and then the next
> > phase of compiling reads it back in again. The fact that the
> > intermediate I/O is skipped is merely an optimization.

ka...@gabi-soft.fr wrote

> I think you've got it backwards. Conceptually, the C++ preprocessor
> does not output text, at least according to the standard. Many
> implementations do in fact use text as the intermediate format, but this
> is an implementation technique, not something mandated by the standard.
>
> (Note that this is a change with regards to K&R C. I guess you could
> consider it an innovation of the first C standard.)

Thanks! I stand corrected.

Sean Kelly

Dec 17, 2003, 4:58:35 AM

Scott Meyers <Use...@aristeia.com> wrote in message news:<MPG.1a35ddb7c...@news.hevanet.com>...
>
> I must begin with my favorite quote from Bjarne: "Nobody knows what *most*
> C++ programmers are doing."
>
> With that caveat, let me clarify that my reference to "this kind of
> thinking" was referring to the commonality/variability analysis that Mogens
> posted regarding references to class statics in template code. I think
> that a lot more programmers are at least familiar with the idea of moving
> template-parameter-independent code out of templates, though how many take
> the time to do this is not clear. My guesses are that (1) it's not
> terribly common among the people who know about it, (2) not that many
> people know about it, and (3) the percentage of people who know about it
> who work in embedded systems is smaller than the percentage who work on
> hosted implementations. But that's all speculation, though it's not blind
> speculation; I do get out a little.

A related question might be: is such code separation for space
efficiency an important and missing factor in programs that do not use
it? I would think that programmers who must think about such things
(embedded programmers) should have a greater understanding of the
issues and workarounds than a typical programmer. I would assume that
there's quite a camp of C-oriented embedded programmers out there who
may not know the ins and outs of templates, but I would be surprised
if there were C++-oriented embedded programmers who didn't know about
this technique and generated bloated code as a result.

Sean
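
For readers who have not seen the technique being discussed, here is a rough sketch of moving template-parameter-independent code out of a template. The class names (StackBase, FixedStack) and the fixed capacity are assumptions made up for this illustration; nothing here comes from the posts above:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example types; none of these names come from the thread.

// Everything that does not depend on T lives here and is compiled once,
// no matter how many FixedStack<T> instantiations a program uses.
class StackBase {
public:
    std::size_t size() const { return count_; }
    bool empty() const { return count_ == 0; }

protected:
    static constexpr std::size_t kCapacity = 16;

    // Returns the index of the next free slot and marks it as used.
    std::size_t reserve_slot() {
        assert(count_ < kCapacity);
        return count_++;
    }

    // Marks the top slot as free and returns its index.
    std::size_t release_slot() {
        assert(count_ > 0);
        return --count_;
    }

private:
    std::size_t count_ = 0;
};

// Only the code that really needs T is generated per instantiation.
template <typename T>
class FixedStack : public StackBase {
public:
    void push(const T& v) { items_[reserve_slot()] = v; }
    T pop() { return items_[release_slot()]; }

private:
    T items_[kCapacity];
};

// Small usage helper: push 1 and 2, then pop the top element.
inline int push_two_pop_one() {
    FixedStack<int> s;
    s.push(1);
    s.push(2);
    return s.pop();
}
```

Whether this actually saves space depends on the compiler and on how much of the class is independent of T; in a toy like this the bookkeeping would likely be inlined anyway.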
