I am considering using LLVM in a project for Windows CE, where space is at a premium. My jaw dropped when I checked the size of HowToUseJIT.exe (VC++ Win32 debug): 15.4 MB! The release build of HowToUseJIT is “only” 3.39 MB, but this is still 85% larger than the binary to which I was thinking of adding LLVM.
The top ten LLVM libraries (Win32 *.lib) are pretty huge:
Release (bytes)   Debug (bytes)   Name
     24,510,490      71,038,240   LLVMCodeGen.lib
     21,084,666      56,724,338   LLVMCore.lib
     14,624,218      37,070,488   LLVMAnalysis.lib
     11,987,202      30,711,450   LLVMScalarOpts.lib
      8,600,668      23,837,478   LLVMSelectionDAG.lib
      8,634,324      23,802,952   LLVMTransformUtils.lib
      8,347,134      20,840,744   LLVMipo.lib
      5,061,702      11,028,744   LLVMX86CodeGen.lib
      3,857,612       9,270,012   LLVMInstCombine.lib
      3,330,608       7,820,760   LLVMSupport.lib
The binaries are vastly larger than the source code; for example, everything in lib/CodeGen is 3.63 MB and everything in lib/VMCore is 907 KB. This is quite different from my typical experience; my own C++ source code is larger than the Release DLL it compiles into.
Does anyone know why this stuff is so big, and whether there is a way to get a bare subset of LLVM that fits in under 1 MB?
> The top ten LLVM libraries (Win32 *.lib) are pretty huge:
>
> Release Bld Debug Bld Name
> 24,510,490 71,038,240 LLVMCodeGen.lib
> 21,084,666 56,724,338 LLVMCore.lib
> 14,624,218 37,070,488 LLVMAnalysis.lib
> 11,987,202 30,711,450 LLVMScalarOpts.lib
> 8,600,668 23,837,478 LLVMSelectionDAG.lib
> 8,634,324 23,802,952 LLVMTransformUtils.lib
> 8,347,134 20,840,744 LLVMipo.lib
> 5,061,702 11,028,744 LLVMX86CodeGen.lib
> 3,857,612 9,270,012 LLVMInstCombine.lib
> 3,330,608 7,820,760 LLVMSupport.lib
Not sure about Win32, but here are some numbers on OS X for comparison:
5,282,356 libLLVMCodeGen.a
3,087,436 libLLVMAnalysis.a
1,682,476 libLLVMInstCombine.a
I believe these are all release builds.
Trevor
_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> On Jul 23, 2010, at 10:24 AM, David Piepgrass wrote:
>
>> The top ten LLVM libraries (Win32 *.lib) are pretty huge:
>>
>> Release Bld Debug Bld Name
>> 24,510,490 71,038,240 LLVMCodeGen.lib
[snip]
> Not sure about Win32, but here are some numbers on OS X for comparison:
>
> 5,282,356 libLLVMCodeGen.a
Comparing the size of the static libraries makes little sense, and even
less when they are compiled by different tools. What really matters is
the size of the executables.
I agree that LLVM can be considered a heavyweight dependency in this
respect.
[snip]
Why is the size of static libraries a "nonsensical" topic of discussion? Anyway, in the same example I mentioned that the size of HowToUseJIT (an executable) is very large (15.4 MB debug, 3.4 MB release); as it is a small example, I'd expect any real-world executable to be larger. I think it's fair to wonder what makes it so large, why MacOS seems to get different results, and whether it is possible to construct an example less than 1 MB.
>> Comparing the size of the static libraries makes little sense, and even
>> less when they are compiled by different tools. What really matters is
>> the size of the executables.
>>
>> I agree that LLVM can be considered a heavyweight dependency on this
>> aspect.
>
> Why is the size of static libraries a "nonsensical" topic of
> discussion?
Why do you care about the size of library files?
> Anyway, in the same example I mentioned that the size of HowToUseJIT
> (an executable) is very large (15.4 MB debug, 3.4 MB release); as it is
> a small example, I'd expect any real-world executable to be larger.
Of course a real-world project would be larger, but not in "linear"
proportion to HowToUseJIT. That example application pulls in a big
chunk of the LLVM libraries. That is what makes it large, not the code
in howtousejit.cpp. My compiler, for instance, is anything but a toy
application and is 5.7 MB.
> I think it's fair to wonder what makes it so large, why MacOS seems
> to get different results,
If you want to compare sizes, you must limit your comparisons to
executable files. Why would it be relevant that Xcode produces library
files smaller than Visual Studio? It's comparing apples to oranges.
> and whether it is possible to construct an example less than 1 MB.
An LLVM JIT compiler for x86 under 1 MB? I doubt it is possible without a
major rewrite of LLVM.
> > I think it's fair to wonder what makes it so large, why MacOS seems to get different results, and whether it is possible to construct an example less than 1 MB.
I assumed dynamic libraries and static libraries were similar in size, but I just checked some of my own static libraries and they are indeed much larger than the executables they compile to. Sorry, it just never occurred to me that they would be much different.
> > Anyway, in the same example I mentioned that the size of HowToUseJIT
> > (an executable) is very large (15.4 MB debug, 3.4 MB release); as it is
> > a small example, I'd expect any real-world executable to be larger.
>
> Of course a real-world project would be larger, but not in "linear"
> proportion to HowToUseJIT. That example application pulls in a big
> chunk of the LLVM libraries. That is what makes it large, not the code
> in howtousejit.cpp. My compiler, for instance, is anything but a toy
> application and is 5.7 MB.
>
> > and whether it is possible to construct an example less than 1 MB.
>
> An LLVM JIT compiler for x86 under 1 MB? I doubt it is possible without
> a major rewrite of LLVM.
Even with no optimizations? Drat. That means I can't use it.
It's too bad nobody's written a utility to profile the sizes of C++ classes/functions... that would sure help an investigation like this. A question at StackOverflow didn't turn up any such utility:
http://stackoverflow.com/questions/1051597/is-there-a-function-size-profiler-out-there
Why? I'd never checked, but I always assumed the LLVM JIT was much
larger than 3.4 MB.
For comparison:
[rnk@tamalpais google3]$ du -h /usr/lib/gcc/x86_64-linux-gnu/4.4/cc1plus
10M /usr/lib/gcc/x86_64-linux-gnu/4.4/cc1plus
[rnk@tamalpais google3]$ du -h `which python2.6`
2.5M /usr/bin/python2.6
It seems reasonable that a JIT compiler with optimizers would weigh in
somewhere between an interpreter and a full C++ compiler.
Reid
> Why would it be relevant that Xcode produces library
> files smaller than Visual Studio? It's comparing apples to oranges.
The size of static libraries is relevant because it places an upper
bound on the size of the executable. Otherwise we can only speak
anecdotally about "typical" executables that use "some" of the LLVM
features.
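To put a number on that upper bound, here is a minimal Python sketch that totals the release sizes of the ten libraries listed at the start of the thread and compares the total with the 3.39 MB release build of HowToUseJIT.exe. It assumes "MB" means 1024*1024 bytes, which the thread does not state, and the variable names are only illustrative.

```python
# Release-build sizes in bytes of the ten largest LLVM static libraries,
# copied from the table at the start of the thread.
release_lib_bytes = {
    "LLVMCodeGen.lib": 24_510_490,
    "LLVMCore.lib": 21_084_666,
    "LLVMAnalysis.lib": 14_624_218,
    "LLVMScalarOpts.lib": 11_987_202,
    "LLVMSelectionDAG.lib": 8_600_668,
    "LLVMTransformUtils.lib": 8_634_324,
    "LLVMipo.lib": 8_347_134,
    "LLVMX86CodeGen.lib": 5_061_702,
    "LLVMInstCombine.lib": 3_857_612,
    "LLVMSupport.lib": 3_330_608,
}

MB = 1024 * 1024  # assumption: the thread's "MB" is 2**20 bytes

# Naive upper bound: every byte of every library ends up in the executable.
total_bytes = sum(release_lib_bytes.values())

# Release build of HowToUseJIT.exe, as reported in the first message.
exe_bytes = 3.39 * MB

print(f"top-ten libraries: {total_bytes / MB:.1f} MB")
print(f"HowToUseJIT.exe:   {exe_bytes / MB:.2f} MB")
print(f"ratio:             {total_bytes / exe_bytes:.0f}x")
```

On these figures the bound holds, but it sits far above the size of the one executable that was actually measured.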
As for the apples-to-oranges comparison between GCC output and Visual
Studio output, having additional data points from other environments
may be helpful in understanding whether a size issue affects all
platforms or is specific to Visual Studio.
Trevor
You are forgetting that python2.6 includes a lot of libraries.
The object file for the interpreter is about 130k.
>
> Reid
Mark.
> On Jul 27, 2010, at 4:11 PM, Óscar Fuentes wrote:
>
>> Why would it be relevant that Xcode produces library
>> files smaller than Visual Studio? It's comparing apples to oranges.
>
> The size of static libraries is relevant because it places an upper
> bound on the size of the executable.
This is not strictly true, as some compilers can generate code from the
library contents (think of a library containing LLVM bytecode, or a C++
library that relies on the linker for template instantiation). But for
our purposes, let's accept that upper bound.
> Otherwise we can only speak anecdotally about "typical" executables
> that use "some" of the LLVM features.
That upper bound is useful only in the case where the combined size of
*all* the libraries looks small enough to you. OTOH, labeling a library
as fatware just by looking at the size of the static libraries is wrong.
I'd say that the right method for estimating the size of a project that
uses LLVM is to determine the features you need (JIT? static code
generation? optimizations? backend(s)? etc.) and to create an executable
that links them in. That is far more accurate than adding up the file
sizes of static libraries.
> As for the apples-to-oranges comparison between GCC output and Visual
> Studio output, having additional data points from other environments
> may be helpful in understanding whether a size issue affects all
> platforms or is specific to Visual Studio.
As mentioned before, this only makes sense for executable files. Of
course with debug info stripped, optimizations enabled and with the
runtime C/C++ libraries dynamically linked.
> On Wed, Jul 28, 2010 at 9:01 AM, David Piepgrass
> <dpiep...@mentoreng.com> wrote:
> >> An LLVM JIT compiler for x86 under 1 MB? I doubt it is possible
> >> without a major rewrite of LLVM.
> >
> > Even with no optimizations? Drat. That means I can't use it.
>
> Why? I'd never checked, but I always assumed the LLVM JIT was much
> larger than 3.4 MB.
It is ~4.8M here.
Here are some size comparisons from ClamAV on Linux:
without JIT, -m32, -Os, stripped: 835K
with JIT, -m32, -Os, stripped: 5.6M
with JIT, -m64, -O2, stripped: 8.8M
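The difference between the first two builds gives a rough estimate of what the JIT adds at -m32 -Os. A minimal sketch of that arithmetic, assuming K and M here mean 1024 and 1024*1024 bytes (the message does not say):

```python
KB = 1024        # assumption: "K" is 2**10 bytes
MB = 1024 * KB   # assumption: "M" is 2**20 bytes

# Stripped ClamAV sizes quoted above.
without_jit_m32_os = 835 * KB
with_jit_m32_os = 5.6 * MB
with_jit_m64_o2 = 8.8 * MB

# Same flags with and without the JIT, so the difference is the JIT's cost.
jit_cost = with_jit_m32_os - without_jit_m32_os
print(f"JIT adds about {jit_cost / MB:.1f} MB at -m32 -Os")

# The third build changes both word size and optimization level, so this
# ratio does not isolate either effect.
print(f"-m64 -O2 vs -m32 -Os: {with_jit_m64_o2 / with_jit_m32_os:.2f}x")
```

That works out to roughly the same ~4.8M figure given above for the JIT on its own.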
If LLVM is compiled with debug info and not stripped, then it can be as
big as 70MB.
The JIT of course includes the code to compile LLVM IR to x86 machine
code, and to run some minimal optimizations on the IR (mem2reg, dce).
On Tue, 27 Jul 2010 16:49:14 -0600
David Piepgrass <dpiep...@mentoreng.com> wrote:
> Why is the size of static libraries a "nonsensical" topic of
> discussion?
Because they include copies of the same code multiple times.
When you link an executable you only get 1 copy.
Think of templates being instantiated in different files with the same
type.
They also include symbol (and perhaps debug) information on Linux.
I think VS keeps symbols separate.
A more useful upper bound for size would be to create a shared library
from all of LLVM.
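A toy model of the overcounting, in Python: three hypothetical object files each carry their own copy of the same template instantiation. The archive stores every copy, while the linked image keeps one. All names and byte counts are invented for illustration.

```python
# Hypothetical object files: symbol name -> code size in bytes.
objects = {
    "a.o": {"std::vector<int>::push_back": 400, "a_func": 1000},
    "b.o": {"std::vector<int>::push_back": 400, "b_func": 1500},
    "c.o": {"std::vector<int>::push_back": 400, "c_func": 800},
}

# A static library is an archive of the object files, so every copy counts.
archive_bytes = sum(
    size for symbols in objects.values() for size in symbols.values()
)

# The linker merges identical instantiations, keeping one copy per symbol.
linked = {}
for symbols in objects.values():
    for name, size in symbols.items():
        linked.setdefault(name, size)
linked_bytes = sum(linked.values())

print(f"archive: {archive_bytes} bytes, linked: {linked_bytes} bytes")
```

This only models duplicate instantiations. A real link also leaves out archive members that nothing refers to, which shrinks the result further.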
On Wed, 28 Jul 2010 21:56:58 +0200
Óscar Fuentes <o...@wanadoo.es> wrote:
> I'd say that the right method for estimating the size of a project
> that uses LLVM is to determine the features you need (JIT? static code
> generation? optimizations? backend(s)? etc.) and to create an
> executable that links them in. That is far more accurate than adding
> up the file sizes of static libraries.
Agreed, once you've chosen what features you need, the LLVM code you
link in will be about the same size.
Best regards,
--Edwin
I've had some success analyzing the binary size by making Visual C++
generate a .map file. This basically tells you at what binary offset each
function is located in the exe or dll. By subtracting each function's offset
from the offset of the next function you get the actual binary size of each
function (including alignment). Then you can aggregate these by class or by
object file and sort by size to get a real idea of where the big code is.
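A minimal Python sketch of that procedure. It assumes the .map file has already been parsed into (offset, symbol, object file) tuples; the entries, the names, and the end-of-section offset below are all made up, and the real .map layout is not reproduced here.

```python
from collections import defaultdict

# Hypothetical, already-parsed map entries: (offset, symbol, object file).
entries = [
    (0x1000, "foo", "a.obj"),
    (0x1040, "bar", "a.obj"),
    (0x1100, "baz", "b.obj"),
    (0x1400, "qux", "b.obj"),
]
section_end = 0x1500  # assumed end of the code section


def function_sizes(entries, section_end):
    """Size of each function = offset of the next function minus its own."""
    ordered = sorted(entries)
    sizes = []
    for i, (offset, name, obj) in enumerate(ordered):
        if i + 1 < len(ordered):
            next_offset = ordered[i + 1][0]
        else:
            next_offset = section_end  # last function runs to section end
        sizes.append((name, obj, next_offset - offset))
    return sizes


def size_by_object(sizes):
    """Total the per-function sizes per object file, largest first."""
    totals = defaultdict(int)
    for _name, obj, size in sizes:
        totals[obj] += size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


sizes = function_sizes(entries, section_end)
for obj, total in size_by_object(sizes):
    print(f"{total:8d}  {obj}")
```

Grouping by class instead of by object file would only need a different key in `size_by_object`, for example the part of a demangled name before the last `::`.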
If I recall correctly, I got the JIT down to about 2 MB using information
collected from the .map file. This still included some optimization passes.
Unfortunately a lot of features are fairly tightly interwoven. If you don't
need support for debugging, exceptions, garbage collection, intrinsics,
arbitrary precision integers and/or vectors, I bet you could get it way
smaller. But it would take considerable effort to pry loose. Also, some
passes can do a lot of things which you might not be interested in. For
example InstCombine is huge but you probably only need a handful of the
possible combinations to make your JIT code a lot faster. Most of the
optimizations can be performed statically in the high-level language anyway
(e.g. replacing a division by 2 with a right shift).
So I'm quite convinced that LLVM can be made smaller than 1 MB, but it will
take some custom work. With a good test suite you can systematically cut
things out and ensure everything keeps working. Unfortunately it will become
infeasible to merge patches from main development into your local tree, so
you won't benefit from any advances or bug fixes there. But if you're happy
with LLVM 2.7's functionality, now and later, then this is a feasible option
if you have a couple months to do the custom work.
Cheers,
Nicolas