Crypto++ "lite" for use with STLport?

Robert Roessler

Sep 17, 2007, 12:44:27 AM
to Crypto++
So... Crypto++ and STLport (recent versions, anyway) don't mix all that
well. What I mean is that Crypto++ can NOT be used with STLport in the
"no iostreams" (_STLP_NO_IOSTREAMS) mode, which is its simplest case -
requiring only the STLport source and headers (no library needs to be
built).

If you try to use STLport in this mode with Crypto++, you get immediate
compile errors because STLport has detected Crypto++'s use of iostreams.

Since I like using STLport but am very pleased with the speed of
Crypto++ hashing functions, I looked for a solution... I decided to try
and build a Crypto++ "lite" version as a proof-of-concept. I will
present my results, and hope that either a) Wei agrees that this extra
packaging of Crypto++ has merit, or b) Wei (or anyone) explains an even
easier way to achieve my goals. ;)

Paraphrasing Wei's minimum hashing test code from a previous thread:

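#define CRYPTOPP_ENABLE_NAMESPACE_WEAK 1

#include <tchar.h>      // _tmain and _TCHAR (MSVC)
#include "md5.h"
#include "sha.h"
#include "whrlpool.h"

USING_NAMESPACE(CryptoPP)
USING_NAMESPACE(std)

int _tmain(int argc, _TCHAR* argv[])
{
    Weak::MD5 md5;
    SHA sha;
    Whirlpool w;
    byte a[100];    // scratch buffer, reused for each digest
    // Input length is 0, so each call just writes the digest of the
    // empty message into a; the point is only to pull the code in.
    md5.CalculateDigest(a, a, 0);
    sha.CalculateDigest(a, a, 0);
    w.CalculateDigest(a, a, 0);
    return 0;
}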

I can build this with VS 2005 SP1 with settings similar to those used in
Crypto++ (with the addition of the linker "optimization" switch
"opt:nowin98") and get an exe size of 138 KB. When I substitute
"cryptlite.lib" for the standard "cryptlib.lib", this comes down to 84.5
KB. :)

I created "cryptlite.lib" by adding a "cryptlite" project to the
"cryptest" solution. It is [for now] just a static lib containing only
the files needed for the above hashing test code:

algparam.cpp, cpu.cpp, cryptlib.cpp, filters.cpp, md5.cpp, misc.cpp,
mqueue.cpp, queue.cpp, sha.cpp, and whrlpool.cpp. The ONLY source mod
required is blocking the "include <locale>" in stdcpp.h, which I
accomplished by wrapping said include with "#ifndef CRYPTOPP_LITE".

So for BUILDING this subset lib, just define this symbol in the
"cryptlite" project. When USING this lib, CRYPTOPP_LITE must be
defined, as well as CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES - both
before the includes of the Crypto++ headers.
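
The guard and the define ordering described above can be sketched roughly as follows. This is only an illustration under assumptions: the real stdcpp.h contains other includes that are omitted here, and LiteBuildSkipsLocale and CheckLiteConfiguration are hypothetical helpers that exist only so the effect of the guard can be checked.

```cpp
#include <cassert>

// Client side (and the "cryptlite" project): both symbols are defined
// before any Crypto++ header is included.
#define CRYPTOPP_LITE 1
#define CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES 1

// Roughly what the guarded part of stdcpp.h looks like after the one
// source modification (assumption: other includes are left out).
#ifndef CRYPTOPP_LITE
#include <locale>   // skipped in the "lite" configuration
#endif

// Hypothetical helper: reports whether the guard above was active.
constexpr bool LiteBuildSkipsLocale()
{
#ifdef CRYPTOPP_LITE
    return true;
#else
    return false;
#endif
}

// Hypothetical self-check for the configuration.
inline void CheckLiteConfiguration()
{
    assert(LiteBuildSkipsLocale());
}
```

Defining the symbols in the project settings instead of in each source file should have the same effect, as long as they are visible before the first Crypto++ include.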

Why do we care about any of this? Since we can now use the desired
STLport "headers-only" mode with Crypto++, we can rebuild the above test
case, and we see... 84 KB. Big deal. BUT - let's look at a real app
which uses the above hashes plus a lot of STL:

RFtp PRO 3.2
============
727 KB baseline code using STLport and LibTomCrypt hashes

803 KB MS STL + cryptlib
760 KB MS STL + cryptlite
737 KB STLport + cryptlite

Things start looking a bit more appealing with more realistic tests. :)

Notice that, even aside from STLport considerations, the "cryptlite"
subset of Crypto++ may have a niche where size matters (e.g.,
embedded?). ;) Clearly, more of the standard Crypto++ modules could be
added - pretty much everything that doesn't require iostreams or
whatever the locale header is providing?

Finally, what's up with CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES? I ask
because I notice that a lot of the space savings of the "cryptlite" lib
can be realized with the ordinary "cryptlib" plus this define in the
client code... note that this is really just an educational question,
since it does not solve the STLport conflict.

Comments?

Robert Roessler
roes...@rftp.com
http://www.rftp.com

Parch

Sep 20, 2007, 10:00:50 PM
to Crypto++ Users
Your comment about this -> CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES
is interesting. I would not have guessed that this define is necessary
when building as a Windows lib -- only as a DLL. Would using it when
building it as a lib help build-time performance in any way?

Robert Roessler

Sep 24, 2007, 3:53:10 PM
to Crypto++
Parch wrote:
> Your comment about this -> CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES
> is interesting. I would not have guessed that this define is necessary
> when building as a Windows lib -- only as a DLL. Would using it when
> building it as a lib help build-time performance in any way?

Well, since no one else is chiming in yet, *I* used this def like I did
everything else in constructing this Crypto++ subset lib: stuff in the
modules I knew I wanted for hashing, and then keep adding (or in this
case subtracting) until the compile/link errors go away... :)

Clearly, cruelly ripping these modules away from their assumed
infrastructure and dependencies can result in having to force things
that would have happened "naturally" - e.g., dll.cpp would define this
symbol in more "normal" builds.

In this particular case, adding iterhash.cpp directly into the library
resulted in lots of errors... it appears that for this file and a number
of others in the same "category", you want to let the *header* file
iterhash.h #include the .cpp itself, and
CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES seemed to be what would drive this.

As to why I observed the results I did with the standard cryptlib.lib
and defining the symbol, I can conjecture that it causes some symbols to
be defined "early", causing the linker to not have to bring in some
other larger modules to resolve undefines(?).

Robert Roessler

Sep 30, 2007, 8:13:04 PM
to Crypto++
Robert Roessler wrote:
> So... Crypto++ and STLport (recent versions, anyway) don't mix all that
> well. What I mean is that Crypto++ can NOT be used with STLport in the
> "no iostreams" (_STLP_NO_IOSTREAMS) mode, which is its simplest case -
> requiring only the STLport source and headers (no library needs to be
> built).
>
> If you try to use STLport in this mode with Crypto++, you get immediate
> compile errors because STLport has detected Crypto++'s use of iostreams.
>
> Since I like using STLport but am very pleased with the speed of
> Crypto++ hashing functions, I looked for a solution... I decided to try
> and build a Crypto++ "lite" version as a proof-of-concept. I will
> present my results, and hope that either a) Wei agrees that this extra
> packaging of Crypto++ has merit, or b) Wei (or anyone) explains an even
> easier way to achieve my goals. ;)

> ...

> I created "cryptlite.lib" by adding a "cryptlite" project to the
> "cryptest" solution. It is [for now] just a static lib containing only
> the files needed for the above hashing test code:
>
> algparam.cpp, cpu.cpp, cryptlib.cpp, filters.cpp, md5.cpp, misc.cpp,
> mqueue.cpp, queue.cpp, sha.cpp, and whrlpool.cpp. The ONLY source mod
> required is blocking the "include <locale>" in stdcpp.h, which I
> accomplished by wrapping said include with "#ifndef CRYPTOPP_LITE".
>
> So for BUILDING this subset lib, just define this symbol in the
> "cryptlite" project. When USING this lib, CRYPTOPP_LITE must be
> defined, as well as CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES - both
> before the includes of the Crypto++ headers.

> ...

> Notice that, even aside from STLport considerations, the "cryptlite"
> subset of Crypto++ may have a niche where size matters (e.g.,
> embedded?). ;) Clearly, more of the standard Crypto++ modules could be
> added - pretty much everything that doesn't require iostreams or
> whatever the locale header is providing?

I note that stdcpp.h has been updated on the SVN repo to no longer
include the locale header file... assuming this wasn't a coincidence,
then this message was probably read by Wei Dai. ;)

This means that my sole required change in the Crypto++ sources is no
longer necessary, and so the CRYPTOPP_LITE stuff isn't either... but
that still doesn't alter the fundamental issue of needing iostreams to
do the full cryptlib.lib build, and therefore doesn't obviate the
[potential] need/usefulness of a "lite" subset version of Crypto++.

Beyond Parch's curiosity (echoing mine) about
CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES, an additional comment or two on
why this "lite" idea does or doesn't make sense would be welcome. I can
also contribute my [trivial] "cryptlite.lib" project file, but it really
only currently deals with md5, sha*, and whirlpool hashes.

Parch

Oct 1, 2007, 4:57:16 AM
to Crypto++ Users
I still find this confusing. But here are some ideas after playing
around with a minimal build with those CPP files you had in your
project. Warning: I wasn't using STLPORT, just VS2005 SP1 compiler
with its usual headers.

Without defining that symbol, I got rid of all the iterhash related
compile errors by:
--cutting all the template function definitions out of iterhash.cpp
--pasting them in, near the end of iterhash.h, inside the crypto++
namespace, just before this bit (which I then removed)

#ifdef CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES
#include "iterhash.cpp"
#endif

NAMESPACE_BEGIN(CryptoPP)

#ifdef WORD64_AVAILABLE
CRYPTOPP_DLL_TEMPLATE_CLASS IteratedHashBase<word64,
HashTransformation>;
...
etc.
....
NAMESPACE_END(CryptoPP)

Now this left a few other link errors, but adding fips140.cpp got rid
of those. Finally, using the trivial minimum hashing code you posted,
and enabling function level linking and link-time-code generation
optimization options for release mode build, and the nowin98 option, I
reproduced your exe file size result.

So, what have I learned? Not sure, I still don't fully understand
MANUALLY_INSTANTIATE_TEMPLATES, but I feel a little closer.

I guess another solution you could look at, rather than defining
MANUALLY_INSTANTIATE_TEMPLATES throughout your whole project, would be
to have e.g. 'lite.cpp', which would be a substitute for dll.cpp and
would define MANUALLY_INSTANTIATE_TEMPLATES locally, but include just
the header files you want, so that all the iostreams code doesn't get
dragged in.

Wei Dai

Oct 1, 2007, 8:23:47 PM
to Robert Roessler, Crypto++
Robert Roessler wrote:
> Beyond Parch's curiosity (echoing mine) about
> CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES, an additional comment or two on
> why this "lite" idea does or doesn't make sense would be welcome. I can
> also contribute my [trivial] "cryptlite.lib" project file, but it really
> only currently deals with md5, sha*, and whirlpool hashes.

Thanks for reminding me about this. It looks like what happened was that the
template classes in iterhash.h were being instantiated in dll.cpp via the
use of CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES. I thought the Microsoft
linker would be smart enough to discard the parts of dll.cpp that are not
needed, but apparently it isn't, so it ended up bringing in a lot more code
into the final executable than is necessary.

I've checked into SVN a change to have the iterhash template classes be
instantiated separately in iterhash.cpp, which should fix this particular
problem. I'm now getting an executable size of 98K when using the 3 hash
functions with the resulting cryptlib.lib.

As for having a separate "cryptlite.lib" project, I'd prefer not creating
another project if not necessary, to avoid the additional maintenance
overhead. In theory, the linker *should* be able to discard any code that
isn't necessary from the final executable.


Robert Roessler

Oct 1, 2007, 10:44:48 PM
to Crypto++

I must be doing a really poor job at explaining my problem / use case... ;)

Your adjustments to how iterhash.cpp and iterhash.h relate are *not* an
improvement for me (the code size went up 2k), and the way they worked
previously was not an issue for me to start with (reverting these
changes would be totally fine with me, and appreciated).

I used CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES to emulate the usage I
saw in Crypto++ itself (the dll.cpp case specifically), and that was
never an issue (for me).

MY deal all along has been using Crypto++ with STLport, which in the
general case does seem to have some "bring in too much code" issues...
but in particular, I use STLport with _STLP_NO_IOSTREAMS defined, which
means 2 things:

1) you don't NEED to build the STLport library at all, as you are using
it in "headers only" mode; this is a wonderful and simple way to use a
non-Microsoft STL *as long as you don't use iostreams*

2) you CAN'T use STLport this way if the code you are building uses any
of the iostreams stuff AT ALL

The consequence of #2 is that I CAN'T build the full Crypto++ library,
at least without [possibly major?] restructuring / ifdef-controlled
subsetting to optionally not use iostreams... which presumably doesn't
make any sense for the mainstream Crypto++ product.

So the idea of Crypto++ "lite" arises out of an "iostreams-less" version
of the library, so that it can be built with this "style" of STLport
usage... with a side-effect of a smaller and simpler library that *may*
be useful in some small-memory niches, all for the low, low cost of
defining (and as you point out, maintaining) a vcproj file. :)

Not to be long-winded, but I needed to make it clear that this is not
[primarily, and IME] a "how much code is linked in" issue. ;)

Tim Lovell-Smith

Oct 2, 2007, 1:37:14 AM
to Crypto++ Users
Sorry, Robert, you're right. I don't think I have really been helping
with your main problem. But I did think I saw you having a secondary
problem: while customizing the library, you had to figure out the
special #defines needed to get the iterhash stuff to compile. I kind of
hoped that by talking about that problem, a better design avoiding it
would appear.

However, back to your main issue. Basically, you want to be able to
e.g. #define CRYPTOPP_NO_IOSTREAMS in your project settings, so that
crypto++ would then be buildable as a library only (no test suite), not
including anything that obviously depends on iostreams? I haven't made
a project to test this, but I think the restructuring needed to do this
would not be major.
All the dependencies on iostreams I have seen are through 1) test
code, 2) "files.h" & "files.cpp", and 3) indirectly, the all-consuming
"dll.cpp". Perhaps you could give it a go. I think you could probably
build most of the crypto++ modules (e.g. most of the ciphers) without
iostreams.

Robert Roessler

Oct 2, 2007, 3:30:20 AM
to Crypto++
Tim Lovell-Smith wrote:
> Sorry, Robert, you're right. I don't think I have really been helping
> with your main problem. But I did think I saw you having a secondary
> problem, which was that during customization of the library having to
> figure out the special #defines needed to get the iterhash stuff to
> compile. I kind of hoped that by talking about that problem, a better
> design avoiding those problems would appear.

Maybe I just got lucky, but I could tell that iterhash.{cpp,h} is (was)
one instance of a structuring pattern that represented a trick I could
use as I added modules and built up "cryptlite.lib". ;)

And that is mostly why I would rather that the band-aid-y change to
these files was undone - otherwise, I may have to figure out new tricks
as I go. :|

> However, back to your main issue. Basically being able to e.g. #define
> CRYPTOPP_NO_IOSTREAMS in your project settings and crypto++ would then
> be buildable only as a library (no test suite) not including anything
> obviously depending iostreams? I haven't made a project to test this
> but I think the restructuring needed to do this would not be major.
> All the dependencies on iostreams I have seen are through 1) test
> code, 2) "files.h" & "files.cpp", and 3) indirectly, the all-consuming
> "dll.cpp". Perhaps you could give it a go. I think you could probably
> build most of the crypto++ modules (e.g. most of the ciphers) without
> iostreams.

When building cryptlib.lib (using the "headers-only" STLport), the first 2
dozen or so files compile great, but starting with rw.cpp, a good number
of the compiles fail with integer.h (which uses iostreams) and
modarith.h (which uses Integer)... so it might indeed be fairly do-able
to just ifdef iostreams stuff out, IFF everything is this localized /
concentrated.
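
A rough sketch of what "ifdef-ing the iostreams stuff out" could look like for a class in the style of Integer. Everything here is hypothetical: DemoInteger is a toy stand-in rather than the real Integer class, and DEMO_NO_IOSTREAMS stands in for the CRYPTOPP_NO_IOSTREAMS symbol suggested above, which does not exist in the library as far as this thread shows.

```cpp
#include <cassert>
#include <string>

// Hypothetical switch; a real build would set it in the project settings.
#define DEMO_NO_IOSTREAMS 1

#ifndef DEMO_NO_IOSTREAMS
#include <ostream>   // only pulled in when iostreams are allowed
#endif

// Toy stand-in for a class like Integer (not the real one).
class DemoInteger
{
public:
    explicit DemoInteger(unsigned long value) : m_value(value) {}

    // Formatting that needs no iostreams, so it is always available.
    std::string ToDecimalString() const
    {
        if (m_value == 0)
            return "0";
        std::string digits;
        for (unsigned long v = m_value; v != 0; v /= 10)
            digits.insert(digits.begin(), static_cast<char>('0' + v % 10));
        return digits;
    }

private:
    unsigned long m_value;
};

#ifndef DEMO_NO_IOSTREAMS
// The only iostream-dependent piece; it disappears in the "lite" build.
inline std::ostream& operator<<(std::ostream &out, const DemoInteger &n)
{
    return out << n.ToDecimalString();
}
#endif

// Small self-check that the class works without the stream operator.
inline void DemoIntegerSelfCheck()
{
    assert(DemoInteger(0).ToDecimalString() == "0");
}
```

Whether this is practical for the real headers depends on the iostreams usage being as localized as it appears from the compile errors.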

Wei Dai

Oct 2, 2007, 5:21:25 AM
to cryptop...@googlegroups.com
> So the idea of Crypto++ "lite" arises out of an "iostreams-less" version
> of the library, so that it can be built with this "style" of STLport
> usage... with a side-effect of a smaller and simpler library that *may*
> be useful in some small-memory niches, all for the low, low cost of
> defining (and as you point out, maintaining) a vcproj file. :)

I see. The problem is that the cost isn't that low, because I have to
support multiple compilers, and don't want to create, test and maintain new
project files for each. Plus I'm not sure anyone else besides you is
interested in the "lite" project. Of course if you want to share it with
others, you can post it to the Crypto++ wiki, or the Google group, or your
own web site if you have one, etc.

The problem of hash-function-only usage causing the linker to pull in
unnecessary iostream code is a more general one that might affect many more
people, which is why I wanted to get that fixed.

Are there any other questions I haven't answered?


Mouse

Oct 2, 2007, 8:42:50 AM
to cryptop...@googlegroups.com
> > So the idea of Crypto++ "lite" arises out of an "iostreams-less"
> > version of the library....... with a side-effect of a smaller and
> > simpler library that *may* be useful in some small-memory niches,
> > all for the low, low cost of defining (and as you point out,
> > maintaining) a vcproj file. :)
>
> I see. The problem is that the cost isn't that low, because I
> have to support multiple compilers, and don't want to create,
> test and maintain new project files for each. Plus I'm not
> sure anyone else besides you is interested in the "lite"
> project.

I think this "low-memory" version would be very useful, particularly in the
embedded realm (restrictions on both flash where the executables are stored
and RAM where they compete for run-time space).

Robert Roessler

Oct 2, 2007, 11:44:11 PM
to Crypto++
Wei Dai wrote:
>> So the idea of Crypto++ "lite" arises out of an "iostreams-less" version
>> of the library, so that it can be built with this "style" of STLport
>> usage... with a side-effect of a smaller and simpler library that *may*
>> be useful in some small-memory niches, all for the low, low cost of
>> defining (and as you point out, maintaining) a vcproj file. :)
>
> I see. The problem is that the cost isn't that low, because I have to
> support multiple compilers, and don't want to create, test and maintain new
> project files for each. Plus I'm not sure anyone else besides you is
> interested in the "lite" project. Of course if you want to share it with
> others, you can post it to the Crypto++ wiki, or the Google group, or your
> own web site if you have one, etc.

I didn't think you were going to go for this. :)

> The problem of hash-function-only usage causing the linker to pull in
> unnecessary iostream code is a more general one that might affect many more
> people, which is why I wanted to get that fixed.

Hmmm. So how did the changes to iterhash.{cpp,h}, which I would rather
you hadn't made (since they made my exe larger and are now less clear
than previously), address this? And *why* did code size get worse?

> Are there any other questions I haven't answered?

Something about CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES would be cool:
a) what does it do in general, and b) why does code get smaller if I
define it on the CLIENT side? It even reduces some of the extra size
introduced by the iterhash.{cpp,h} changes - but doesn't get rid of all
of it.

Tim Lovell-Smith

Oct 3, 2007, 8:00:54 PM
to Crypto++ Users
> Hmmm. So how did the changes to iterhash.{cpp,h} which I rather you
> hadn't made (since they made my exe larger and are now less clear than
> previously) address this? And *why* did code size get worse?

I would guess that whether the code gets bigger or smaller is up to
your compiler/linker settings, since nothing changes about whether you
actually use those functions or not - only whether the linker
optimizes them out. Do you have the same optimizations enabled in your
exe project and in cryptlib? Especially function-level linking, which
affects whether individual functions can get optimized out, and
/OPT:REF? I'm not sure, but maybe link-time code generation might be
helpful too.

Wei Dai

Oct 3, 2007, 8:29:16 PM
to Robert Roessler, Crypto++
Robert Roessler wrote:
> Hmmm. So how did the changes to iterhash.{cpp,h} which I rather you
> hadn't made (since they made my exe larger and are now less clear than
> previously) address this? And *why* did code size get worse?

Before the changes, a number of template classes in iterhash.h were being
instantiated into dll.obj. So when you use a hash function, the linker
brings dll.obj into your .exe. dll.obj references iostream, so the linker
also brings in iostream (even though it really shouldn't, because it only
needs the iterhash template functions).

Now, those template classes are instantiated into iterhash.obj, which
doesn't reference iostream.

I don't understand why it would cause code size to get worse for you. I wish
the linker had an option that would produce some output explaining which
pieces of code it's linking in, how big they are, and why (i.e. some kind of
dependency graph), but it doesn't, so it's not obvious how to debug these
kinds of issues.

> Something about CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES would be cool
> and a) what it does in general, and b) why does code get smaller if I
> define it on the CLIENT side? It even reduces some of the extra size
> introduced by the iterhash.{cpp,h} changes - but doesn't get rid of all
> of it.

CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES changes the definitions of a number
of macros in config.h. When it is not defined, certain templates are not
instantiated. When it is defined, those same templates are explicitly
instantiated. If you define it in a .cpp file in your own application, it
causes those templates to be instantiated into the .obj file compiled from
that .cpp file, so the linker no longer has to look for it in cryptlib.lib.
I guess that reduces code size because when the linker was linking in the
templates from cryptlib.lib, it was also linking in other unneeded code.
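
The mechanism described above can be illustrated with a self-contained sketch. All names are hypothetical: DEMO_MANUALLY_INSTANTIATE_TEMPLATES and DEMO_TEMPLATE_CLASS stand in for the real symbol and the config.h macros, DemoHashBase is a toy template and not IteratedHashBase, and the exact macro definitions in config.h are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical switch, standing in for CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES.
#define DEMO_MANUALLY_INSTANTIATE_TEMPLATES 1

// Hypothetical macro, modelled on the kind of macro config.h is said to define.
#ifdef DEMO_MANUALLY_INSTANTIATE_TEMPLATES
#define DEMO_TEMPLATE_CLASS template class          // instantiate in this .obj
#else
#define DEMO_TEMPLATE_CLASS extern template class   // expect it in another .obj
#endif

// Toy class template (assumption: not the real IteratedHashBase).
template <class T>
class DemoHashBase
{
public:
    DemoHashBase() : m_sum(0), m_count(0) {}

    // Accumulates the input words; a placeholder for real hashing work.
    void Update(const T *input, std::size_t length)
    {
        for (std::size_t i = 0; i < length; ++i)
            m_sum += input[i];
        m_count += length;
    }

    T Sum() const { return m_sum; }
    std::size_t Count() const { return m_count; }

private:
    T m_sum;
    std::size_t m_count;
};

// With the switch defined, this is an explicit instantiation definition, so
// the code lands in the .obj compiled from this file and the linker does not
// need to fetch it from a library member.
DEMO_TEMPLATE_CLASS DemoHashBase<unsigned int>;

// Uses the instantiated template.
unsigned int DemoSum()
{
    const unsigned int words[3] = {1, 2, 3};
    DemoHashBase<unsigned int> h;
    h.Update(words, 3);
    assert(h.Count() == 3);
    return h.Sum();
}
```

In this toy, the "extern template" branch only works if some other translation unit provides the instantiation, which is the role the library object file would play.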


Robert Roessler

Oct 3, 2007, 10:27:16 PM
to Crypto++
Wei Dai wrote:
> Robert Roessler wrote:
>> Hmmm. So how did the changes to iterhash.{cpp,h} which I rather you
>> hadn't made (since they made my exe larger and are now less clear than
>> previously) address this? And *why* did code size get worse?
>
> Before the changes, a number of template classes in iterhash.h were
> being instantiated into dll.obj. So when you use a hash function, the
> linker brings dll.obj into your .exe. dll.obj references iostream, so
> the linker also brings in iostream (even though it really shouldn't,
> because it only needs the iterhash template functions).
>
> Now, those template classes are instantiated into iterhash.obj, which
> doesn't reference iostream.
>
> I don't understand why it would cause code size to get worse for you. I
> wish the linker had an option that would produce some output explaining
> which pieces of code it's linking in, how big they are, and why (i.e.
> some kind of dependency graph), but it doesn't, so it's not obvious how
> to debug these kinds of issues.

"What we have here is a failure to communicate..." (immortal words if
there ever were any). :)

Everyone but me keeps wanting to talk about linkers and "optimizing
away" code... but remember, this thread started with and is still about
the slimmed-down SUBSET of Crypto++ which ONLY includes the .cpp files
listed in the original posting:

algparam.cpp, cpu.cpp, cryptlib.cpp, filters.cpp, md5.cpp, misc.cpp,
mqueue.cpp, queue.cpp, sha.cpp, and whrlpool.cpp.

The file dll.cpp never enters the picture!

So the situation I describe should be easier to understand, given that
we are not talking about the whole library body of code... so with the
exact same compiler and linker options, I have the following 2 cases:

1) using the current HEAD from the SVN repo, and ADDING iterhash.cpp to
the "cryptlite" project, the library ultimately yields a 739K exe.

2) forcing iterhash.{cpp,h} back to rev 375, REMOVING iterhash.cpp from
the project gives me a library which produces a 737K exe.

Since both the library and my client code are using the link-time code
generation option, you would [like to] think that the opportunities for
"optimizing away" code (to use your model) would be equivalent... but it
almost acts like with your "old" code (iterhash.{cpp,h} @ rev 375), the
source being included when the CLIENT is compiled rather than in the
library is giving better exploitation of code-removal possibilities (or
other source-sensitive optimization).

Actually, I hope it is something else...

>> Something about CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES would be cool
>> and a) what it does in general, and b) why does code get smaller if I
>> define it on the CLIENT side? It even reduces some of the extra size
>> introduced by the iterhash.{cpp,h} changes - but doesn't get rid of all
>> of it.
>
> CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES changes the definitions of a
> number of macros in config.h. When it is not defined, certain templates
> are not instantiated. When it is defined, those same templates are
> explicitly instantiated. If you define it in a .cpp file in your own
> application, it causes those templates to be instantiated into the .obj
> file compile from that .cpp file, so the linker no longer has to look
> for it in cryptlib.lib. I guess that reduces code size because when the
> linker was linking in the templates from cryptlib.lib, it was also
> linking in other unneeded code.

This latter is what I opined might be happening at the end of my
September 24 response to the poster formerly known as "Parch". ;)

Tim Lovell-Smith

Oct 4, 2007, 11:45:16 PM
to Crypto++ Users
I think I was wrong before. It isn't whether the linker is discarding
functions or not - it's the code generation.

If I apply Wei's changes, AND am careful to have exactly the same C/C++
options enabled in my .exe project as my .lib project, I get exactly
the same output .exe size.

Also, telling the optimizer to optimize for speed, but favour size
over speed (weird how you can do that) shaves off a few more KB.

Do these suggestions work for you?

Tim

Robert Roessler

Oct 5, 2007, 5:53:06 AM
to Crypto++
Tim Lovell-Smith wrote:
> I think I was wrong before. It isn't whether the linker is discarding
> functions or not - it's the code generation.
>
> If I apply Wei's changes, AND am careful to have exactly the same C/C++
> options enabled in my .exe project as my .lib project, I get exactly
> the same output .exe size.

I have been careful about holding everything constant *except* the most
recent changes to iterhash.{cpp,h} - and observe the 2K change as reported.

I do NOT have identical options in my client project as the "cryptlite"
static lib project - but they are quite close. My client settings are
what they are for historical reasons, and the "cryptlite" settings are
duplicated so as to match the main Crypto++ settings.

They match in all the key particulars: multi-threaded NON-dll, linker
code generation, omit frame pointers, enable intrinsics... the library
uses -O2 and "any suitable" for inline expansion, favor size or speed is
"neither". My client uses -Ox and "only __inline" along with "favor
small code". And everyone is using -Gy for function-level linking.

> Also, telling the optimizer to optimize for speed, but favour size
> over speed (weird how you can do that) shaves off a few more KB.

I have been using this for quite some time... I guess I have assumed it
means if multiple sequences could be emitted that are *close* in speed,
use the smallest one (whereas if there is a larger expected difference
in speed, use the faster version even if it is larger).

> Do these suggestions work for you?

See above comments... ;) As a reality-check, what are you actually using as
a test bed for comparison? Have you created your own version of the
"cryptlite" static lib project in your Crypto++ working copy tree?

Tim Lovell-Smith

Oct 5, 2007, 4:05:16 PM
to Crypto++ Users
Yes, I have made my own cryptlite project, based on the cryptlib
project, and separate project for the exe. I will post a few numbers,
which I should have done last time.

First, baseline cryptlite command line & exe command lines before
Wei's changes (with the best parameters I tried):

/O2 /Ob2 /Oi /Os /Oy /GL /D "NDEBUG" /D "_WINDOWS" /D "WIN32" /D
"_VC80_UPGRADE=0x0600" /GF /FD /EHsc /MT /Gy
/Fp".\Release/cryptlib_lite.pch" /Fo".\Release/" /Fd".\Release/" /W3
/nologo /c /Zi /TP /errorReport:prompt

/O2 /Ob2 /Oi /Os /Oy /GL /I "../CryptoCPP" /D "WIN32" /D "NDEBUG" /D
"_CONSOLE" /D "_UNICODE" /D "UNICODE" /GF /FD /EHsc /MT /Gy
/Fo"Release\\" /Fd"Release\vc80.pdb" /W3 /nologo /c /Wp64 /Zi /TP
/errorReport:prompt

You can see a few differences, but importantly optimization parameters
are the same between projects. The exe code is the minimal hashing
example at the top of the thread. Result: an 80 KB exe (81,920 bytes
according to Explorer).

Now I (manually) add iterhash.cpp to the lib project and apply Wei
Dai's patches to iterhash.{h,cpp}, and rebuild the entire solution
(lib and exe). Result: an 80 KB exe.

This is a negative result if you will - the patch is not yet proven to
affect code size at all, with the same settings between projects.

Now, keeping Wei Dai's patch, I one by one change my exe optimization
settings closer to what you describe:
set /Ob1 for the exe -> 80.0 KB (81,920 bytes, a suspiciously round
number)
set /Ob1 and /Ox for the exe -> the same, 80.0 KB (81,920 bytes)

Next step, I change the favour fast/small code option in the cryptlib
project settings to 'Neither' -> 84.0 KB (86,016 bytes).

OK, so far, if you're with me, I have increased code size, but I
haven't proven a positive result: that the patch affects code size
when you have different settings, so I need one last experiment: keep
the optimizations constant with the last experiment, and undo the
patch. Note that with the patch, the 'lib' project's optimizations
will apply to iterhash.cpp code, not the 'exe' project's. Result ->
83.5 KB (85,504 bytes).

A positive result if you will. The patch is shown to affect code size
-- WITH different settings between projects. Because which project it
gets built in changes.

A final experiment: override the per-.cpp-file optimizations on
iterhash.cpp, and put it back to 'favor small code', which should have
the same effect as reversing Wei's patch. Result: -> 84.0 KB (86,016
bytes), as expected.

Eric Hughes

Oct 5, 2007, 6:46:00 PM
to Crypto++
At 06:23 PM 10/1/2007, Wei Dai wrote:
>In theory, the linker *should* be able to discard any code that
>isn't necessary from the final executable.

The rule of linkers from way back is that all functions in a compilation
unit are linked. The linker, as a rule, doesn't know what is "used" or
not, merely what is referenced, that is, reference implies use. This made
perfect sense in an era where assembly language hacks were common (think
absolute addresses), which destroyed any hope of getting it right in
general. It's less sensible when object code is almost universally
generated within a high-level language context.

While it's obvious these days, in the past there was no notion of an "entry
point" that was universal enough to be able to root a tree (or forest, as
with a DLL) in order to be able to perform dependency tracking. Long ago,
I worked with code that used JMP tables for indexed system calls; there was
no way to specify the address in that JMP instruction (an entry point) from
any other JMP instruction.

I can't say I'm current with what people have done to remedy
this. Function-level linking is one way of addressing this, but it's not
universal. There would still remain issues with cross-language linking,
etc., that make this risky to turn on by default. The real point is that,
historically, the "reference implies use" principle isn't the worst.

The work-around is to put each function into its own compilation
unit. This is a pain, but it pretty much always works because then unit
dependency maps to function dependency.
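
As a rough illustration of that workaround (not code from Crypto++; every name below is made up for the example), the listing marks where each function would be split into its own source file. It is shown as one listing only so that it compiles as-is. The assumption is a static library built from those files: a linker pulls object files out of an archive on demand, so the unit holding toy_mul would never be pulled in by a client that only references toy_add. With MSVC, the function-level route mentioned above is the /Gy compiler switch combined with the /OPT:REF linker switch.

```cpp
#include <cassert>

// ---- unit 1: would be its own file, e.g. a hypothetical "toy_add.cpp" ----
int toy_add(int a, int b) { return a + b; }

// ---- unit 2: would be its own file, e.g. a hypothetical "toy_mul.cpp" ----
// Nothing in the client below references this function, so with one
// function per unit its object file is never requested from the library.
int toy_mul(int a, int b) { return a * b; }

// ---- client code: references toy_add only ----
int client_uses_add_only() { return toy_add(2, 3); }

// Small self-check that a main() could call.
void toy_units_self_check() { assert(client_uses_add_only() == 5); }
```

The cost is exactly the "pain" noted above: many small files, and more declarations to keep in sync in a shared header.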

As for debugging link-size issues, there's DUMPBIN for MSVC.

Eric

Robert Roessler

Oct 12, 2007, 11:29:40 PM
to Crypto++

So (after an absence), it would appear from my results and your
exhaustive analysis that the Crypto++ library would be better left the
way it was (WRT iterhash.{cpp,h}, anyway), in that it gives the library
user more control over how the code is actually generated. If the user
has said that they would [e.g.] rather favor size over speed, then the
more of the library's code that is left "exposed" this way (getting
compiled with client apps), the more that preference can take effect.

So I will renew my request of Wei Dai to please revert the
iterhash.{cpp,h} changes, given that

a) it's not clear that the problem they were trying to solve needed to
be solved or was requested to be solved,

b) it's not clear that they had the intended effect in any case, and

c) it does seem clear that they *can* have a deleterious effect on
Crypto++ client apps.

Thanks to Wei Dai and Tim Lovell-Smith for the time and thought that has
gone into this issue/puzzle (and its explanation/solution). :)

Tim Lovell-Smith

Oct 14, 2007, 5:12:48 AM
to Crypto++ Users
Sigh, I was hoping you would change your mind, Robert, given what I
found. I will try to explain why. Firstly, I still think this change
is a positive one because

1) it increases the transparency of compilation behaviour (functions
get compiled in the .cpp file they are defined in)

2) the transparent compilation behaviour, and /also/ the fact that
client code no longer has to fudge the define, make it easier to modify
Crypto++, which would be useful for making 'lite' crypto++ projects, or
just for hacking at and improving the crypto++ code in general.

There are a few other places where
CRYPTOPP_MANUALLY_INSTANTIATE_TEMPLATES is used, e.g. eccrypto.cpp.
If all of these were handled the same way as iterhash.{h,cpp}, both
benefits would apply even more widely. This would e.g. make it easier
for me to compile an elliptic curve cryptography specific crypto++
(something I am considering doing for my own project, as it happens).
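
For anyone following along, here is a minimal sketch of the general technique being argued about: explicit template instantiation in the library's own .cpp file. This is NOT the actual iterhash code. ToyHash, toy_hash32 and the file names in the comments are all hypothetical. The point is only that the member function bodies get compiled where the explicit instantiation lives, so that file's compiler settings apply to them, not the client's.

```cpp
#include <cassert>
#include <cstddef>

// ---- would live in a header, e.g. a hypothetical "toyhash.h" ----
template <class T>
class ToyHash {
public:
    ToyHash() : m_state(0) {}
    void Update(const unsigned char* data, std::size_t len);  // body not in header
    T Final() const { return m_state; }
private:
    T m_state;
};

// ---- would live in the library, e.g. a hypothetical "toyhash.cpp" ----
template <class T>
void ToyHash<T>::Update(const unsigned char* data, std::size_t len) {
    // Simple multiply-and-add mixing, for illustration only (not a real hash).
    for (std::size_t i = 0; i < len; ++i)
        m_state = m_state * 31 + data[i];
}

// Explicit instantiation definitions: code for these two types is
// generated right here, under whatever settings this file is built with.
template class ToyHash<unsigned int>;
template class ToyHash<unsigned long long>;

// ---- client code: only uses the already-instantiated type ----
unsigned int toy_hash32(const unsigned char* data, std::size_t len) {
    ToyHash<unsigned int> h;
    h.Update(data, len);
    return h.Final();
}

// Small self-check that a main() could call.
void toy_hash_self_check() {
    const unsigned char msg[] = {'a', 'b', 'c'};
    assert(toy_hash32(msg, 3) == 96354u);  // ((97*31)+98)*31+99
}
```

The trade-off is the one discussed in this thread: a client can only use the types the library chose to instantiate, and the client's own optimization switches no longer reach that code.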

On the other hand, you claim there is a deleterious effect, which
could cancel out the positives. I have been trying to figure out what
this effect is, and my conclusion is that there isn't really any. If
you want the smallest compile possible, you can modify the per-file
compilation settings for iterhash.cpp (right click, properties in
Visual Studio) and adjust them to whatever does the best job of
minimizing the final code size. You can even take this to the extreme
and tweak the settings for every file in crypto++ until you have the
exe at its minimum size.

Finally, I think it's important to consider that it is unproven whether
other crypto++ users would benefit overall from the changes you are
asking for. You have a special interest in small file sizes. But
other people might have a special interest in e.g. long-run throughput
performance. Not to mention, the size effect with different compilers
might be entirely different. For crypto++ to place special weight on a
2 KB file size change in the face of unknown effects in the other
dimensions of performance doesn't seem sensible to me.

In summary, yup, Crypto++ aims at optimizing performance a lot of the
time. But I think it is most important for Wei to provide a generally
useful library. The degree of tweaking which you are doing is best
left up to the end-user.

