Techniques to reduce executable size

Qu0ll

Mar 24, 2009, 12:25:29 PM
I come from a Java world where we have tools like ProGuard which analyze all
the components of an application and strip out classes and members that are
not being used. Is there an equivalent in C++ or does this happen
automatically? For example, if I use one or two classes in the Boost
library, do I get the entire library when I link my program? Similarly if I
use just part of the STL, do I get the whole thing or just those parts I
use?

--
And loving it,

-Qu0ll (Rare, not extinct)
_________________________________________________
Qu0llS...@gmail.com
[Replace the "SixFour" with numbers to email me]

Victor Bazarov

Mar 24, 2009, 12:42:04 PM
Qu0ll wrote:
> I come from a Java world where we have tools like ProGuard which analyze
> all the components of an application and strip out classes and members
> that are not being used. Is there an equivalent in C++ or does this
> happen automatically?

C++ has a concept, related mostly to templates, that you don't pay for
what you don't use. Compiler implementors also develop their tools with
reduced program size in mind; it is always one of the goals.
For example, linkers (the programs that tie different object modules
together) can perform some reduction by not linking modules from which
no function is used.

> For example, if I use one or two classes in the
> Boost library, do I get the entire library when I link my program?

Most likely not.

> Similarly if I use just part of the STL, do I get the whole thing or
> just those parts I use?

Only the parts that you use. That's one of the selling points of C++
templates.

The compiler/linker cannot always achieve the full reduction in machine
code, since which functions/modules get used can depend on the data
the program has to process. Theoretically, if you have an
exhaustive set of tests, you can run your program under a *coverage*
tool to collect coverage data and then manually remove the code that is
never executed.
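To make the data-dependence point concrete, here is a minimal sketch (the handler names and the command table are made up for illustration, they are not from any real program). Which handler runs depends on a string that is only known at run time, so the toolchain has to keep all three in the executable even if a given run only ever uses one:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical handlers; the names are illustrative only.
int handle_add(int a, int b) { return a + b; }
int handle_sub(int a, int b) { return a - b; }
int handle_mul(int a, int b) { return a * b; }

using Handler = int (*)(int, int);

// The command name is ordinary run-time data (it could come from a file or
// from user input). All three handlers have their address taken in the
// table, so none of them can be discarded at link time.
int run_command(const std::string& name, int a, int b) {
    static const std::map<std::string, Handler> table = {
        {"add", &handle_add},
        {"sub", &handle_sub},
        {"mul", &handle_mul}};
    const auto it = table.find(name);
    assert(it != table.end() && "unknown command");  // precondition of this sketch
    return it->second(a, b);
}
```

A coverage run over your real inputs might show that, say, handle_mul is never reached, and only then could you remove it by hand.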

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Qu0ll

Mar 24, 2009, 12:55:58 PM
"Victor Bazarov" <v.Aba...@comAcast.net> wrote in message
news:gqb2gt$73r$1...@news.datemas.de...

Thanks Victor for the prompt, comprehensive reply. So there is no
need for a tool like ProGuard, as the linker and the template mechanism do
this automatically. Does this apply to individual classes defined in the
same physical file? That is, will the linker only link in those that are
actually used even when they are defined in the same compilation unit as
some which are not used? In Java we have each class in a separate file but
this doesn't appear to be the way in C++.

Victor Bazarov

Mar 24, 2009, 1:31:59 PM

The trick with the templates is that they aren't really compiled
separately from your code. When used in your code, each template is
*instantiated* by the compiler, and the code is added to the program and
is actually shared between the modules (the linker should take care of
unifying the code). Templates that aren't instantiated are only
compiled for the sake of a syntax check; no machine code is
generated for them.
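A small sketch of what that means in practice (Box and the helper functions are invented for this illustration). The member negated() would not even compile for std::string, which has no unary minus, yet the code below is fine because that member is never used with std::string and so is never instantiated for it:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

template <typename T>
struct Box {
    T value;

    const T& get() const { return value; }

    // The body is only instantiated for a given T if this member is
    // actually used with that T.
    T negated() const { return -value; }
};

// Uses Box<std::string>::get() only. Box<std::string>::negated() is never
// instantiated, so no machine code (and no error) is produced for it.
std::size_t length_of(const std::string& text) {
    Box<std::string> box{text};
    return box.get().size();
}

// Uses Box<int>::negated(), so that one member is instantiated for int.
int negate(int x) {
    assert(x != std::numeric_limits<int>::min());  // -x would overflow
    Box<int> box{x};
    return box.negated();
}
```

If you add a call to negated() on a Box<std::string>, the compiler should reject it at that point, which shows the body really was left alone until then.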

James Kanze

Mar 24, 2009, 5:19:52 PM
On Mar 24, 5:25 pm, "Qu0ll" <Qu0llSixF...@gmail.com> wrote:
> I come from a Java world where we have tools like ProGuard
> which analyze all the components of an application and strip
> out classes and members that are not being used. Is there an
> equivalent in C++ or does this happen automatically? For
> example, if I use one or two classes in the Boost library, do
> I get the entire library when I link my program? Similarly if
> I use just part of the STL, do I get the whole thing or just
> those parts I use?

I'm not quite sure I understand what the Java tool does; Java
only loads classes on an as-needed basis, so you never have
something you don't use. In statically compiled languages
(thus, C++), the linker only pulls in the object files it needs
from a library. Beyond that, it's a question of how the library
files were made---for widely used general purpose libraries,
each function should generally be in a separate object file; for
application specific classes, on the other hand, it's more usual
to use one object file for the entire class, which means that
you get all of the functions for the class as soon as you use
any one of them.

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Qu0ll

Mar 24, 2009, 6:01:58 PM
"James Kanze" <james...@gmail.com> wrote in message
news:483ed2b8-5958-4ea9...@z9g2000yqi.googlegroups.com...

[...]

> I'm not quite sure I understand what the Java tool does; Java
> only loads classes on an as needed basis, so you never have
> something you don't use.

ProGuard reduces the size of a JAR by removing classes and members not
required by the application or applet. This is particularly useful for
applets where the emphasis is on trying to keep the JARs as small as
possible to permit faster downloads.

> In statically compiled languages
> (thus, C++), the linker only pulls in the object files it needs
> from a library. Beyond that, it's a question of how the library
> files were made---for widely used general purpose libraries,
> each function should generally be in a separate object file; for
> application specific classes, on the other hand, it's more usual
> to use one object file for the entire class, which means that
> you get all of the functions for the class as soon as you use
> any one of them.

Do you really mean that each function may be in a separate file? So a
single class spans several files?

blargg

Mar 24, 2009, 6:30:34 PM
Qu0ll wrote:
> James Kanze wrote:
[...]

> ProGuard reduces the size of a JAR by removing classes and members not
> required by the application or applet. This is particularly useful for
> applets where the emphasis is on trying to keep the JARs as small as
> possible to permit faster downloads.
>
> > In statically compiled languages
> > (thus, C++), the linker only pulls in the object files it needs
> > from a library. Beyond that, it's a question of how the library
> > files were made---for widely used general purpose libraries,
> > each function should generally be in a separate object file; for
> > application specific classes, on the other hand, it's more usual
> > to use one object file for the entire class, which means that
> > you get all of the functions for the class as soon as you use
> > any one of them.
>
> Do you really mean that each function may be in a separate file? So a
> single class spans several files?

Yes, some linkers can't "dead-strip" unused objects within an object file,
only the entire object file's contents. Others are smart and don't care
whether everything is in a single file or multiple files. Obviously one
should use the latter if executable size is critical. With dumb linkers,
one can still often identify common subsets of functionality that are
either mostly used, or not used at all, and keep their implementations in
separate files, achieving close to what a smart linker does.

James Kanze

Mar 25, 2009, 4:31:09 AM
On Mar 24, 11:01 pm, "Qu0ll" <Qu0llSixF...@gmail.com> wrote:
> "James Kanze" <james.ka...@gmail.com> wrote in message

> news:483ed2b8-5958-4ea9...@z9g2000yqi.googlegroups.com...

> [...]


> > In statically compiled languages
> > (thus, C++), the linker only pulls in the object files it needs
> > from a library. Beyond that, it's a question of how the library
> > files were made---for widely used general purpose libraries,
> > each function should generally be in a separate object file; for
> > application specific classes, on the other hand, it's more usual
> > to use one object file for the entire class, which means that
> > you get all of the functions for the class as soon as you use
> > any one of them.

> Do you really mean that each function may be in a separate file? So a
> single class spans several files?

Yes. From a QoI point of view, I would expect this in any
general purpose library.

There are exceptions, of course. There's no point in doing it
if the application is going to pick up all the functions anyway,
e.g. because they're virtual. And of course, templates are a
completely different problem---functions which aren't used won't
even be compiled, much less have an object file to be linked in
(although this varies somewhat, depending on the instantiation
strategy). But for general purpose libraries, the rule is one
function per source file for non-virtual non-template functions.
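As a rough sketch of that layout (the class Square and the file names are hypothetical), the comments below mark where each file would begin. It is shown as a single listing so that it compiles as-is, but in a library each part would be its own source file and end up as its own object file in the archive:

```cpp
#include <cassert>

// ---- square.h (hypothetical header) ----
class Square {
public:
    explicit Square(double side);
    double area() const;
    double perimeter() const;

private:
    double side_;
};

// ---- square_ctor.cpp ----
Square::Square(double side) : side_(side) {}

// ---- square_area.cpp ----
double Square::area() const { return side_ * side_; }

// ---- square_perimeter.cpp ----
double Square::perimeter() const { return 4.0 * side_; }

// ---- client code ----
// Only the constructor and area() are referenced here, so a linker that
// pulls object files from a static library on demand has no reason to
// bring in the object file built from square_perimeter.cpp.
double tile_area(double side) {
    assert(side >= 0.0);  // precondition of this sketch
    const Square tile(side);
    return tile.area();
}
```

With everything in one square.cpp instead, using any one member would normally drag in the object code for all of them.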

James Kanze

Mar 25, 2009, 4:34:31 AM
On Mar 24, 11:30 pm, blargg....@gishpuppy.com (blargg) wrote:
> Qu0ll wrote:
> > James Kanze wrote:
> [...]
> > Do you really mean that each function may be in a separate file? So a
> > single class spans several files?

> Yes, some linkers can't "dead-strip" unused objects within an
> object file, only the entire object file's contents.

A linker can't strip unused objects from an object file and
still be conformant. Some can strip unused functions, but this
functionality isn't very widespread, and isn't available on
most machines. (It has more to do with the object file format
than the linker, I think. Not including a function which isn't
used, even when other things in the object file are used, is
fairly trivial to implement, IF the information concerning the
extent of the function is present in the object file.)
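One toolchain-specific illustration, offered as a sketch rather than a general rule: GCC can place each function in its own section (-ffunction-sections), and GNU ld can then discard unreferenced sections (--gc-sections), which gives the linker exactly the per-function extent information mentioned above. The function names here are made up, and the comment about what gets dropped assumes the program's entry point calls report_size() and nothing else:

```cpp
#include <cassert>

// Possible build line with GCC and GNU ld (other toolchains differ):
//   g++ -Os -ffunction-sections -fdata-sections -Wl,--gc-sections ...

// Reachable through report_size(), so it stays in the executable.
int kilobytes(int bytes) {
    assert(bytes >= 0);  // precondition of this sketch
    return (bytes + 1023) / 1024;  // round up to whole kilobytes
}

// Never referenced. It has external linkage, so the compiler still has to
// emit it into the object file; only the linker can decide to drop it, and
// only if it can tell where the function begins and ends.
int megabytes(int bytes) {
    return (bytes + 1048575) / 1048576;  // round up to whole megabytes
}

int report_size(int bytes) { return kilobytes(bytes); }
```

Without per-function sections, both functions live in one block of code in the object file and typically come or go together.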

blargg

Mar 25, 2009, 4:18:47 PM
James Kanze wrote:
> On Mar 24, 11:30 pm, blargg....@gishpuppy.com (blargg) wrote:
> > Qu0ll wrote:
> > > James Kanze wrote:
> > [...]
> > > Do you really mean that each function may be in a separate file? So a
> > > single class spans several files?
>
> > Yes, some linkers can't "dead-strip" unused objects within an
> > object file, only the entire object file's contents.
>
> A linker can't strip unused objects from an object file and
> still be conformant.

OK, but it can strip objects which are never ultimately referenced, and
whose constructor has no side-effects which modify any of the referenced
objects. Which is a good reason for avoiding static-duration objects with
side-effects which might not be used by a particular program.
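To show the kind of object I mean, here is a sketch with invented names (Registrar, plugin_names). Nothing ever refers to png_registrar by name, but its constructor changes state that the rest of the program can observe, so it cannot simply be stripped:

```cpp
#include <cassert>
#include <string>
#include <vector>

// A function-local static sidesteps initialization-order problems between
// translation units.
std::vector<std::string>& plugin_names() {
    static std::vector<std::string> names;
    return names;
}

// Registers a name as a side effect of construction.
struct Registrar {
    explicit Registrar(const char* name) {
        assert(name != nullptr);
        plugin_names().push_back(name);
    }
};

// Static-duration object that is never referenced by name. Its constructor
// modifies plugin_names(), which other code reads.
static Registrar png_registrar("png");

bool is_registered(const std::string& name) {
    for (const std::string& registered : plugin_names()) {
        if (registered == name) return true;
    }
    return false;
}
```

A program that never asks about "png" still pays for the registrar and for everything its constructor pulls in.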

James Kanze

Mar 26, 2009, 5:26:14 AM
On Mar 25, 9:18 pm, blargg....@gishpuppy.com (blargg) wrote:
> James Kanze wrote:
> > On Mar 24, 11:30 pm, blargg....@gishpuppy.com (blargg) wrote:
> > > Qu0ll wrote:
> > > > James Kanze wrote:
> > > [...]
> > > > Do you really mean that each function may be in a separate file? So a
> > > > single class spans several files?

> > > Yes, some linkers can't "dead-strip" unused objects within an
> > > object file, only the entire object file's contents.

> > A linker can't strip unused objects from an object file and
> > still be conformant.

> OK, but it can strip objects which are never ultimately
> referenced, and whose constructor has no side-effects which
> modify any of the referenced objects.

Whose constructor or destructor has no side-effects which affect
observable behavior. Provided it can distinguish those objects
from ones whose constructor or destructor does have visible
side-effects.

> Which is a good reason for avoiding static-duration objects
> with side-effects which might not be used by a particular
> program.

In general, why link in something that isn't used? (But to tell
the truth, I'm not too sure what you're saying we should avoid.)

Paavo Helde

Mar 26, 2009, 5:39:58 PM
"Qu0ll" <Qu0llS...@gmail.com> wrote:

> I come from a Java world where we have tools like ProGuard which
> analyze all the components of an application and strip out classes and
> members that are not being used. Is there an equivalent in C++ or
> does this happen automatically? For example, if I use one or two
> classes in the Boost library, do I get the entire library when I link
> my program? Similarly if I use just part of the STL, do I get the
> whole thing or just those parts I use?

This very much depends on whether you are using static or dynamic
libraries. In case of dynamic libraries (.dll, .dylib, .so) the whole
library file needs to be present at run-time. However, in case of modern
OS-es only touched parts/pages are actually loaded in memory, so the size
of the dynamic library does not directly affect the program speed. A
potential benefit of dynamic libraries is that if there are multiple
applications using the same dynamic library, they might be able to share
the same copy, loaded into memory only once and thus improving the overall
system performance.

In case of static libraries, the linker is able to strip out unused code,
to a greater or lesser extent, as explained by other responders. For a single
web download, like a Java applet, the executable should be fully static-linked
and stripped as much as possible, depending on the capabilities of the linker.

In case of Boost, for example, you can force either static or dynamic
linking as you see fit. The default rules have changed over time,
in my impression towards preferring static linking.

The Java jar files are probably more similar to dynamic-link libraries, so
what you are asking for would be some kind of tool to shrink and repackage
dynamic libraries, depending on which parts of them are used by a certain
application. I am not aware of any such tool, but they might exist. There
is also a logical complication that the application may want to resolve and
call any function from a dynamic library at run-time, using the function
name in string form, probably coming from an external source. Too bad if
the function has been stripped off at this point... Another problem is that
such stripping would depend on the application, so different applications
would not be able to share the same dynamic library any more.

hth
Paavo


Paavo Helde

Mar 26, 2009, 6:05:10 PM
Paavo Helde <pa...@nospam.please.ee> wrote:

> "Qu0ll" <Qu0llS...@gmail.com> wrote:


>> or does this happen automatically? For example, if I use one or two
>> classes in the Boost library, do I get the entire library when I link
>

> This very much depends on whether you are using static or dynamic
> libraries. In case of dynamic libraries (.dll, .dylib, .so) the whole

Just a clarification for the OP: large parts of Boost are header-only template
code; the whole static vs. dynamic library thing does not affect them, as
there is no library. Yes, the Boost library appears to consist of many
libraries and many non-libraries ;-)

Paavo

Bart van Ingen Schenau

Mar 27, 2009, 4:41:09 AM
On Mar 26, 10:39 pm, Paavo Helde <pa...@nospam.please.ee> wrote:
> "Qu0ll" <Qu0llSixF...@gmail.com> wrote:

>
> > I come from a Java world where we have tools like ProGuard which
> > analyze all the components of an application and strip out classes and
> > members that are not being used.  Is there an equivalent in C++ or
> > does this happen automatically?
>
<snip>

> The Java jar files are probably more similar to dynamic-link libraries, so

I would say that a jar file is comparable to the result of passing all
object files and libraries, that make up an application, to the
librarian instead of the linker.

For Java, that is an acceptable method of packaging, because every
machine that supports Java must be able to execute, interpret or
compile the byte-code.
For C++, there is no such assumption that the machine running the
application has the capability to link object files together. At best,
the machine is able to dynamically load parts of the application (when
it supports DLL/so's). For that reason, the packaging method of jar
files is not suitable for C++ code and there are no tools for C++ that
do something similar to ProGuard.

<snip>


> Another problem is that
> such stripping would depend on the application, so different applications
> would not be able to share the same dynamic library any more.

And one of the prime reasons for using DLL's is that different
applications are able to share the same library, so you can save disk
(only one copy of the library is needed) and memory space (multiple
running applications sharing the same library instance).
This would be completely negated if you have application-specific
stripped-down DLL versions.

For jar-files, this is not a concern, because their contents are not
shared between applications.

>
> hth
> Paavo

Bart v Ingen Schenau
