[Haskell-beginners] GHC-generated executables size

.

Oct 16, 2010, 6:26:12 AM
to begi...@haskell.org
Hi,
I was playing around with ghc again, and I was wondering what makes
the executables so large and how I could make them smaller (strip works,
but is there anything more I can do?)
More specifically, I am compiling a program that uses the GTK+ bindings,
HDBC, and some things from Prelude.
The program simply displays a window, and reads and writes values
from/into a data base file. Not much, really.
Anyway, the program size is 20MB without stripping, and 10MB after
stripping ...

Any hints?

Thanks and good night for now ..
Christian


_______________________________________________
Beginners mailing list
Begi...@haskell.org
http://www.haskell.org/mailman/listinfo/beginners

Aleksandar Dimitrov

Oct 16, 2010, 6:44:19 AM
to begi...@haskell.org, .
Hi Christian,

> I was playing around with ghc again, and I was wondering what makes
> the executables so large and how I could make them smaller (strip works,
> but is there anything more I can do?)
> More specifically, I am compiling a program that uses the GTK+ bindings,
> HDBC, and some things from Prelude.
> The program simply displays a window, and reads and writes values
> from/into a data base file. Not much, really.
> Anyway, the program size is 20MB without stripping, and 10MB after
> stripping ...

GHC links Haskell libraries statically by default. There is some effort to bring dynamic linking to GHC:
http://hackage.haskell.org/trac/ghc/wiki/SharedLibraries/PlatformSupport?redirectedfrom=DynamicLinking

You'll have to ask more knowledgeable people about its status and if it's
recommended for current projects. Since the GTK libraries are so vast,
even including one call (which forces the libs to be linked into the
executable) will dramatically increase the statically linked executable's
size.

Regards,
Aleks

Daniel Fischer

Oct 16, 2010, 7:24:02 AM
to begi...@haskell.org, .
On Saturday 16 October 2010 12:26:00, . wrote:
> Hi,
> I was playing around with ghc again, and I was wondering what makes
> the executables so large and how I could make them smaller (strip works,
> but is there anything more I can do?)
> More specifically, I am compiling a program that uses the GTK+ bindings,
> HDBC, and some things from Prelude.
> The program simply displays a window, and reads and writes values
> from/into a data base file. Not much, really.
> Anyway, the program size is 20MB without stripping, and 10MB after
> stripping ...
>
> Any hints?

Two things spring to mind (in addition to the static linking mentioned by
Aleksandar).

1) If you didn't compile the packages with -split-objs, when you use one
function from a module, the entire object file for the module is linked in.
For packages with many modules or many dependencies, that adds up pretty
fast.

If you set

split-objs: True

in your ~/.cabal/config, packages installed via cabal-install (the cabal
executable) will be built with -split-objs and only the needed functions
will be linked in (at least if you compile your programmes with
optimisations; I don't know whether -O0 uses the split object files or the
monolithic ones).
(Downside: building the packages takes longer, duh; and you need more disk
space for monolithic+split object files, duh again).

2) If it's not (only) that,
it's probably the same effect as discussed in
http://hackage.haskell.org/trac/ghc/ticket/4387

Simon (PJ) says:
"Every module has a module-initialisation routine. Apart from initialising
the module, it calls the module-initialisation routine for each imported
module. So if M imports module SpecConstr from package ghc, then the
module-initialisation routine for M will call the initialisation routine for
SpecConstr. Even though nothing from SpecConstr is ultimately used."

So if you import a module (you don't even need to use anything from it)
which transitively imports a lot of modules, you get a ton of module-
initialisation routines.
People are thinking about how to handle this best (since it affects the
vector package, on which a lot of other packages depend, it's not
unimportant).

.

Oct 16, 2010, 9:04:01 AM
to Daniel Fischer, begi...@haskell.org
Hi Daniel,
thanks for the explanations.
I have tried reinstalling with
cabal install --reinstall gtk,
having set
split-objs: True
in ~/.cabal/config before.

However, the compile yielded a single .o file again, and recompiling and
re-linking my little program does not change its size ...
Any other idea what I might be doing wrong?
Also, I saw in the ghc documentation for -split-objs that one should
only use it "if you know exactly what you're doing".
Do you know what the caveats are?

Thanks again,
Christian

On Sat, 2010-10-16 at 13:23 +0200, Daniel Fischer wrote:
> On Saturday 16 October 2010 12:26:00, . wrote:
> > Hi,
> > I was playing around with ghc again, and I was wondering what makes
> > the executables so large and how I could make them smaller (strip works,
> > but is there anything more I can do?)
> > More specifically, I am compiling a program that uses the GTK+ bindings,
> > HDBC, and some things from Prelude.
> > The program simply displays a window, and reads and writes values
> > from/into a data base file. Not much, really.
> > Anyway, the program size is 20MB without stripping, and 10MB after
> > stripping ...
> >
> > Any hints?
>
> Two things spring to mind (in addition to the static linking mentioned by
> Aleksandar).
>
> 1) If you didn't compile the packages with -split-objs, when you use one
> function from a module, the entire object file for the module is linked in.
> For packages with many modules or many dependencies, that adds up pretty
> fast.

Henk-Jan van Tuyl

Oct 16, 2010, 2:44:40 PM
to begi...@haskell.org, .
On Sat, 16 Oct 2010 12:26:00 +0200, . <ch.g...@googlemail.com> wrote:

> Hi,
> I was playing around with ghc again, and I was wondering what makes
> the executables so large and how I could make them smaller (strip works,
> but is there anything more I can do?)

UPX [0] compresses executables and DLLs quite well (use it after "strip"
for the best results).

Regards,
Henk-Jan van Tuyl


[0] http://upx.sourceforge.net/

--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
--

Daniel Fischer

Oct 16, 2010, 3:10:38 PM
to ., begi...@haskell.org
On Saturday 16 October 2010 15:03:52, . wrote:
> Hi Daniel,
> thanks for the explanations.
> I have tried reinstalling with
> cabal install --reinstall gtk,
> having set
> split-objs: True
> in ~/.cabal/config before.
>
> However, the compile yielded a single .o file again,

Yes, there's the single .o file for the package, the question is, what's in
the archive (libHSxxx.a). Rule of thumb, if it's more than twice as large
as the HSxxx.o, it contains the split object files (you can verify by
looking at it with nm).
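
The rule of thumb above can be sketched as a small check. This is only an illustration under assumptions: the file name CheckSplit.hs and the names fileSize and looksSplit are made up for the example, and the result is a hint, not proof; looking inside the archive with nm or ar remains the real verification.

```haskell
-- Sketch of the rule of thumb: an archive that is more than twice as large
-- as the package's single object file probably contains split objects.
-- fileSize and looksSplit are illustrative names, not part of GHC or Cabal.
import System.Environment (getArgs)
import System.IO (IOMode (ReadMode), hFileSize, withFile)

-- Size of a file in bytes.
fileSize :: FilePath -> IO Integer
fileSize path = withFile path ReadMode hFileSize

-- True if the archive is more than twice the size of the object file.
looksSplit :: Integer -> Integer -> Bool
looksSplit archiveBytes objectBytes = archiveBytes > 2 * objectBytes

main :: IO ()
main = do
  args <- getArgs
  case args of
    [archive, object] -> do
      a <- fileSize archive
      o <- fileSize object
      putStrLn (if looksSplit a o
                  then "archive probably contains split objects"
                  else "archive is probably monolithic")
    _ -> putStrLn "usage: CheckSplit libHSxxx.a HSxxx.o"
```

It could be run as, for example, runghc CheckSplit.hs libHSxxx.a HSxxx.o, with the two paths replaced by the files of the package in question.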

> and recompiling and re-linking my little program does not change
> its size ...

Possibly the packages have been built with -split-objs originally, or your
GHC doesn't support split-objs, in either case, sorry for the inconvenience
:-/

gtk apparently contains hundreds of modules, so it may be the
module-initialisation functions. Quick test:

$ nm yourexecutable | grep stginit | wc -l
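
For readers who prefer to do the counting in Haskell, the grep and wc steps of that pipeline can be sketched as follows. This is a minimal sketch, not a replacement for the one-liner; countStginit and the file name CountInit.hs are names invented for the example, and the program expects the output of nm on standard input.

```haskell
-- Counts the lines of `nm` output that mention "stginit", mirroring
--   nm yourexecutable | grep stginit | wc -l
-- countStginit is an illustrative name, not part of any library.
import Data.List (isInfixOf)

-- Number of input lines containing the substring "stginit".
countStginit :: String -> Int
countStginit = length . filter ("stginit" `isInfixOf`) . lines

main :: IO ()
main = interact (\input -> show (countStginit input) ++ "\n")
```

Usage would be along the lines of: nm yourexecutable | runghc CountInit.hs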

> Any other idea what I might be doing wrong?

Nothing, probably.

What ghc version are you using, on which platform (OS, arch), which
packages does your executable need?

> Also, I saw in the ghc documentation for -split-objs that one should
> only use it "if you know exactly what you're doing".
> Do you know what the caveats are?

Doesn't work on all platforms (Sparc seems to have some problems iirc),
slower compile times. That's what I'm aware of.

>
> Thanks again,

.

Oct 16, 2010, 3:32:36 PM
to Daniel Fischer, begi...@haskell.org
On Sat, 2010-10-16 at 21:09 +0200, Daniel Fischer wrote:
> On Saturday 16 October 2010 15:03:52, . wrote:
> > Hi Daniel,
> > thanks for the explanations.
> > I have tried reinstalling with
> > cabal install --reinstall gtk,
> > having set
> > split-objs: True
> > in ~/.cabal/config before.
> >
> > However, the compile yielded a single .o file again,
>
> Yes, there's the single .o file for the package, the question is, what's in
> the archive (libHSxxx.a). Rule of thumb, if it's more than twice as large
> as the HSxxx.o, it contains the split object files (you can verify by
> looking at it with nm).

Ah .. ok, stupid me ;) the archive is more than twice the size, and ar
tells me about a lot of object files inside it. So compiling with
-split-objs appears to have worked out.
I have seen, however, that I have 2 versions of gtk (0.11.2 and 0.11.0)
on the system. I will try to find out which one was actually used when I
called "ghc --make guiMain"; how would I go about that?

> > and recompiling and re-linking my little program does not change
> > its size ...
>
> Possibly the packages have been built with -split-objs originally, or your
> GHC doesn't support split-objs, in either case, sorry for the inconvenience
> :-/
>
> gtk apparently contains hundreds of modules, so it may be the
> module-initialisation functions. Quick test:
>
> $ nm yourexecutable | grep stginit | wc -l

I tried that, there are almost 900 matches.

> > Any other idea what I might be doing wrong?
>
> Nothing, probably.
>
> What ghc version are you using, on which platform (OS, arch), which
> packages does your executable need?

I am using Ubuntu 10.10, ghc 6.12.1, on a 4-core AMD Phenom.
The program I am compiling needs these packages:
gtk, Text.Parsec, Time, Database.HDBC.Sqlite3, Locale,
Data.ByteString.Lazy.


--Christian

Daniel Fischer

Oct 16, 2010, 4:18:46 PM
to ., begi...@haskell.org
On Saturday 16 October 2010 21:32:27, . wrote:
> I have seen, however, that I have 2 versions of gtk (0.11.2 and 0.11.0)
> on the system. I will try to find out which one was actually used when I
> called "ghc --make guiMain"; how would I go about that?
>

Normally, ghc picks the later version. When compiling, you can see which
version is used by giving the appropriate verbosity flag (-v2 or greater).
After the fact, the only way I know is

$ nm executable | grep stginit_gtk | more (or less)

and then you'll see the z-encoded package version, lines like

080ca5dc T __stginit_gtkzm0zi11zi2_Stuff

for gtk-0.11.2.
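
Reading the z-encoded names by eye gets tedious, so here is a small sketch that decodes them. It assumes symbol lines of the form shown above and handles only the escapes needed for package names and versions (zm for '-', zi for '.', zu for '_', zz for 'z'); GHC's full z-encoding has more cases. decodeZ and packageOf are names made up for this example.

```haskell
-- Extracts and decodes the package part of __stginit_<package>_<Module>
-- symbols from `nm` output read on standard input.
-- decodeZ and packageOf are illustrative names, not GHC functions.
import Data.List (nub, stripPrefix, tails)
import Data.Maybe (mapMaybe)

-- Decode the subset of z-encoding that occurs in package ids.
decodeZ :: String -> String
decodeZ ('z':'m':rest) = '-' : decodeZ rest
decodeZ ('z':'i':rest) = '.' : decodeZ rest
decodeZ ('z':'u':rest) = '_' : decodeZ rest
decodeZ ('z':'z':rest) = 'z' : decodeZ rest
decodeZ (c:rest)       = c   : decodeZ rest
decodeZ []             = []

-- Package id of an stginit symbol on this line, if there is one.
-- Symbols without a package part (no '_' after the prefix) give Nothing.
packageOf :: String -> Maybe String
packageOf line =
  case [rest | t <- tails line, Just rest <- [stripPrefix "__stginit_" t]] of
    (rest:_) -> case break (== '_') rest of
                  (pkg, '_':_) -> Just (decodeZ pkg)
                  _            -> Nothing
    []       -> Nothing

-- Print each distinct package id once.
main :: IO ()
main = interact (unlines . nub . mapMaybe packageOf . lines)
```

Fed the sample line above, packageOf yields Just "gtk-0.11.2", so piping nm output through the program lists which package versions actually ended up in the executable.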

> > > and recompiling and re-linking my little program does not change
> > > its size ...
> >
> > Possibly the packages have been built with -split-objs originally, or
> > your GHC doesn't support split-objs, in either case, sorry for the
> > inconvenience
> >
> > :-/
> >
> > gtk apparently contains hundreds of modules, so it may be
> > the module-initialisation functions. Quick test:
> >
> > $ nm yourexecutable | grep stginit | wc -l
>
> I tried that, there are almost 900 matches.
>

Hm, that shouldn't take you anywhere near 20M.

> > > Any other idea what I might be doing wrong?
> >
> > Nothing, probably.
> >
> > What ghc version are you using, on which platform (OS, arch), which
> > packages does your executable need?
>
> I am using Ubuntu 10.10, ghc 6.12.1, on a 4-core AMD Phenom.
> The program I am compiling needs these packages:
> gtk, Text.Parsec, Time, Database.HDBC.Sqlite3, Locale,
> Data.ByteString.Lazy.
>

No obvious suspect.
Depending on how curious you are, you could split off small sub-programmes
to see what's taking a lot of space. But that's going to be tedious.

You could ask on IRC (#haskell) whether anybody knows about huge
executables with gtk, or on haskell-cafe, on the gtk2hs mailing list, or
you could open a ticket at http://hackage.haskell.org/trac/gtk2hs/ or
http://hackage.haskell.org/trac/ghc/newticket?type=bug whichever you
consider more likely to be responsible (you needn't create an account for
either bug-tracker, both have guest accounts with the guest password in
plain view - though you may need to look for it a bit). But first ask
elsewhere.

.

Oct 17, 2010, 7:10:45 AM
to Daniel Fischer, begi...@haskell.org
Ok, after recompiling gtk again and hiding gtk-0.11.0 (which was still
monolithic), it worked. The executable is now, after stripping, a mere
2.7MB in size (which is still large for what it does, but much better).
Thanks to all who answered for your help!

Christian

Daniel Fischer

Oct 17, 2010, 7:59:29 AM
to ., begi...@haskell.org
On Sunday 17 October 2010 13:10:32, . wrote:
> Ok, after recompiling gtk again and hiding gtk-0.11.0 (which was still
> monolithic), it worked.

Good.

> The executable is now, after stripping, a mere
> 2.7MB in size (which is still large for what it does, but much better).

Well, the executable contains the runtime, so it's naturally much larger
than a small C programme. Compare with

$ cat helloWorld.hs
module Main (main) where

main :: IO ()
main = putStrLn "Hello, World!"
$ ghc --make helloWorld.hs
[1 of 1] Compiling Main ( helloWorld.hs, helloWorld.o )
Linking helloWorld ...
$ ls -l helloWorld
-rwxr-xr-x 1 dafis users 618581 17. Okt 13:53 helloWorld

I suppose on a 64-bit system, you get about twice the numbers, so 2.7MB for
a programme using gtk isn't exorbitant.

> Thanks to all who answered for your help!
>
> Christian

Cheers,
Daniel

Christian Gosch

Oct 17, 2010, 8:57:54 AM
to begi...@haskell.org
Hi,
I was playing around with ghc again, and I was wondering what makes
the executables so large and how I could make them smaller (strip works,
but is there anything more I can do?)
More specifically, I am compiling a program that uses the GTK+ bindings,
HDBC, and some things from Prelude.
The program simply displays a window, and reads and writes values
from/into a data base file. Not much, really.
Anyway, the program size is 20MB without stripping, and 10MB after
stripping ...

Any hints?

Thanks and good night for now ..
Christian

Gaius Hammond

Oct 17, 2010, 3:14:47 PM
to Christian Gosch, begi...@haskell.org

On 15 Oct 2010, at 21:37, Christian Gosch wrote:

> Hi,
> I was playing around with ghc again, and I was wondering what makes
> the executables so large and how I could make them smaller (strip
> works,
> but is there anything more I can do?)
> More specifically, I am compiling a program that uses the GTK+
> bindings,
> HDBC, and some things from Prelude.
> The program simply displays a window, and reads and writes values
> from/into a data base file. Not much, really.
> Anyway, the program size is 20MB without stripping, and 10MB after
> stripping ...

A Haskell program needs to schlep around a runtime with it. So when
you compare it to say a Java program, you need to compare it to the
size of a JAR + the size of the VM, and in that case the size isn't
so extraordinary. So that is the "why". As to the "what can you do":
probably not much that you've not already tried, other than supplying
your program as a script for #!/usr/bin/runghc

http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/runghc.html

That would mean your program wasn't so self-contained, as the target
machine would also need to have all the "stuff" that you have compiled
in, e.g. the HDBC development package.
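
A script of that kind might look like the following minimal sketch. It assumes runghc is installed at /usr/bin/runghc on the target machine (the path mentioned above) and uses a hypothetical file name hello.hs; greeting is just a name chosen for the example.

```haskell
#!/usr/bin/runghc
-- Run directly as a script after `chmod +x hello.hs`; no executable is built,
-- but the target machine needs GHC and every package the program imports.

greeting :: String
greeting = "Hello, World!"

main :: IO ()
main = putStrLn greeting
```

The trade-off is the one described above: nothing large has to be shipped, but the program is interpreted on every start and depends on the target's GHC installation.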

Cheers,


G

Daniel Fischer

Oct 19, 2010, 8:02:11 AM
to begi...@haskell.org
On Tuesday 19 October 2010 13:23:40, David Virebayre wrote:
> 2010/10/17 Daniel Fischer <daniel.i...@web.de>:

> > $ cat helloWorld.hs
> > module Main (main) where
> >
> > main :: IO ()
> > main = putStrLn "Hello, World!"
> > $ ghc --make helloWorld.hs
> > [1 of 1] Compiling Main             ( helloWorld.hs, helloWorld.o )
> > Linking helloWorld ...
> > $ ls -l helloWorld
> > -rwxr-xr-x 1 dafis users 618581 17. Okt 13:53 helloWorld
>
> jhc makes remarkably small executables, this example takes 11268
> bytes, 5756 when stripped.

Yes, JHC doesn't put a large runtime into the executables (and, as a
whole-programme compiler, it can remove more dead code than GHC can).

Stripping GHC's helloWorld:
-rwxr-xr-x 1 dafis users 377240 19. Okt 13:55 helloWorld

>
> Sad that many libraries, and gtk2hs don't work with it.

What makes it unusable for me is that it doesn't yet have arbitrary
precision integers (if that changes, the library problem might step in).
I hope it gets proper Integers and better library support soon, it's quite
an exciting project.

>
> David.
