Any hints?
Thanks and good night for now ..
Christian
_______________________________________________
Beginners mailing list
Begi...@haskell.org
http://www.haskell.org/mailman/listinfo/beginners
> I was playing around with ghc again, and I was wondering what makes
> the executables so large and how I could make them smaller (strip works,
> but is there anything more I can do?)
> More specifically, I am compiling a program that uses the GTK+ bindings,
> HDBC, and some things from Prelude.
> The program simply displays a window, and reads and writes values
> from/into a data base file. Not much, really.
> Anyway, the program size is 20MB without stripping, and 10MB after
> stripping ...
GHC links statically. There is some effort to bring dynamic linking to GHC:
http://hackage.haskell.org/trac/ghc/wiki/SharedLibraries/PlatformSupport?redirectedfrom=DynamicLinking
You'll have to ask more knowledgeable people about its status and if it's
recommended for current projects. Since the GTK libraries are so vast,
even including one call (which forces the libs to be linked into the
executable) will dramatically increase the statically linked executable's
size.
Regards,
Aleks
Two things spring to mind (in addition to the static linking mentioned by
Aleksandar).
1) If you didn't compile the packages with -split-objs, when you use one
function from a module, the entire object file for the module is linked in.
For packages with many modules or many dependencies, that adds up pretty
fast.
If you set
split-objs: True
in your ~/.cabal/config, packages installed via cabal-install (the cabal
executable) will be built with -split-objs and only the needed functions
will be linked in (at least if you compile your programmes with
optimisations, I don't know whether -O0 uses split object files or the
monolithic ones).
(Downside: building the packages takes longer, duh; and you need more disk
space for monolithic+split object files, duh again).
2) If it's not (only) that,
it's probably the same effect as discussed in
http://hackage.haskell.org/trac/ghc/ticket/4387
Simon (PJ) says:
"Every module has a module-initialisation routine. Apart from initialising
the module, it calls the module-initialisation routine for each imported
module. So if M imports module SpecConstr from package ghc, then the
module-initialisation routine for M will call the initialisation routine for
SpecConstr. Even though nothing from SpecConstr is ultimately used."
So if you import a module (you don't even need to use anything from it)
which transitively imports a lot of modules, you get a ton of module-
initialisation routines.
People are thinking about how to handle this best (since it affects the
vector package, on which a lot of other packages depend, it's not
unimportant).
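For point 1), here is a small sketch of how one might check whether the setting is actually active. The helper name split_objs_enabled is made up for illustration, and the check assumes the setting sits on an uncommented line of its own, as in the default ~/.cabal/config layout. Note that the setting only affects packages built after the change, so packages that are already installed have to be rebuilt.

```shell
# Hypothetical helper: succeed if the given cabal config file contains an
# uncommented "split-objs: True" line (commented "-- split-objs: ..." lines
# do not count).
split_objs_enabled() {
    grep -q '^[[:space:]]*split-objs:[[:space:]]*True[[:space:]]*$' "$1"
}

config="$HOME/.cabal/config"
if [ -f "$config" ] && split_objs_enabled "$config"; then
    echo "split-objs is enabled in $config"
else
    echo "split-objs is not enabled in $config"
fi
```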
However, the compile yielded a single .o file again, and recompiling and
re-linking my little program does not change its size ...
Any other idea what I might be doing wrong?
Also, I saw in the ghc documentation about -split-objs that one should
only use it "if you know exactly what you're doing".
Do you know what the caveats are?
Thanks again,
Christian
On Sat, 2010-10-16 at 13:23 +0200, Daniel Fischer wrote:
> On Saturday 16 October 2010 12:26:00, . wrote:
> > Hi,
> > I was playing around with ghc again, and I was wondering what makes
> > the executables so large and how I could make them smaller (strip works,
> > but is there anything more I can do?)
> > More specifically, I am compiling a program that uses the GTK+ bindings,
> > HDBC, and some things from Prelude.
> > The program simply displays a window, and reads and writes values
> > from/into a data base file. Not much, really.
> > Anyway, the program size is 20MB without stripping, and 10MB after
> > stripping ...
> >
> > Any hints?
>
> Two things spring to mind (in addition to the static linking mentioned by
> Aleksandar).
>
> 1) If you didn't compile the packages with -split-objs, when you use one
> function from a module, the entire object file for the module is linked in.
> For packages with many modules or many dependencies, that adds up pretty
> fast.
>
> If you set
>
>
> Hi,
> I was playing around with ghc again, and I was wondering what makes
> the executables so large and how I could make them smaller (strip works,
> but is there anything more I can do?)
UPX [0] compresses executables and DLLs quite well (use it after "strip"
for the best results).
Regards,
Henk-Jan van Tuyl
[0] http://upx.sourceforge.net/
--
http://Van.Tuyl.eu/
http://members.chello.nl/hjgtuyl/tourdemonad.html
--
Yes, there's the single .o file for the package; the question is what's in
the archive (libHSxxx.a). Rule of thumb: if it's more than twice as large
as the HSxxx.o, it contains the split object files (you can verify by
looking at it with nm).
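The rule of thumb can be written down as a quick check. This is only a sketch: the helper name looks_split is made up, the file names are the placeholders used above, and the factor of two is the heuristic from this mail, not anything GHC guarantees.

```shell
# Hypothetical helper implementing the rule of thumb above: succeed if the
# archive is more than twice as large as the single object file.
looks_split() {
    archive_size=$(( $(wc -c < "$1") ))   # size of libHSxxx.a in bytes
    object_size=$(( $(wc -c < "$2") ))    # size of HSxxx.o in bytes
    [ "$archive_size" -gt $(( object_size * 2 )) ]
}

# Usage (paths are placeholders, adjust to where the package is installed):
#   looks_split libHSxxx.a HSxxx.o && echo "probably built with -split-objs"
#   ar t libHSxxx.a | wc -l     # number of object files inside the archive
```

A large member count from ar t points the same way as the size comparison.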
> and recompiling and re-linking my little program does not change
> its size ...
Possibly the packages have been built with -split-objs originally, or your
GHC doesn't support split-objs, in either case, sorry for the inconvenience
:-/
gtk apparently contains hundreds of modules, so it may be the
module-initialisation functions. Quick test:
$ nm yourexecutable | grep stginit | wc -l
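To see which package contributes most of those routines, the count can be broken down per package. A rough sketch, assuming the symbols have the form __stginit_<package>_<module> (which is what ghc 6.12 produces, as far as I can tell); the helper name stginit_per_package and the sample lines are made up for illustration.

```shell
# Hypothetical helper: group __stginit symbols by (z-encoded) package name.
# Reads nm output on stdin, prints "<count> <package>", largest count first.
stginit_per_package() {
    sed -n 's/.*__stginit_\([^_]*\)_.*/\1/p' \
      | sort | uniq -c \
      | awk '{ print $1, $2 }' \
      | sort -k1,1nr -k2,2
}

# Made-up sample input; on a real system run:
#   nm yourexecutable | stginit_per_package
stginit_per_package <<'EOF'
080ca5dc T __stginit_gtkzm0zi11zi2_Stuff
080ca5e0 T __stginit_gtkzm0zi11zi2_More
080d1234 T __stginit_base_Prelude
EOF
```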
> Any other idea what I might be doing wrong?
Nothing, probably.
What ghc version are you using, on which platform (OS, arch), which
packages does your executable need?
> Also, I saw in the ghc documentation about -split-objs that one should
> only use it "if you know exactly what you're doing".
> Do you know what the caveats are?
Doesn't work on all platforms (Sparc seems to have some problems iirc),
slower compile times. That's what I'm aware of.
>
> Thanks again,
Ah .. ok, stupid me ;) the archive is more than twice the size, and ar
tells me about a lot of object files inside it. So compiling with
-split-objs appears to have worked out.
I have seen, however, that I have 2 versions of gtk (0.11.2 and 0.11.0)
on the system. I will try to find out which one was actually used when I
called "ghc --make guiMain"; how would I go about that?
> > and recompiling and re-linking my little program does not change
> > its size ...
>
> Possibly the packages have been built with -split-objs originally, or your
> GHC doesn't support split-objs, in either case, sorry for the inconvenience
> :-/
>
> gtk apparently contains hundreds of modules, so it may be the
> module-initialisation functions. Quick test:
>
> $ nm yourexecutable | grep stginit | wc -l
I tried that, there are almost 900 matches.
> > Any other idea what I might be doing wrong?
>
> Nothing, probably.
>
> What ghc version are you using, on which platform (OS, arch), which
> packages does your executable need?
I am using Ubuntu 10.10, ghc 6.12.1, on a 4-core AMD Phenom.
The program I am compiling needs these packages:
gtk, Text.Parsec, Time, Database.HDBC.Sqlite3, Locale,
Data.ByteString.Lazy.
--Christian
Normally, ghc picks the later version. When compiling, you can see which
version is used by giving the appropriate verbosity flag (-v2 or greater).
After the fact, the only way I know is
$ nm executable | grep stginit_gtk | more (or less)
and then you'll see the z-encoded package version, lines like
080ca5dc T __stginit_gtkzm0zi11zi2_Stuff
for gtk-0.11.2.
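If the z-encoding is hard to read, a small filter can undo it. This sketch handles only the three escapes that show up in package names like the one above (zm for '-', zi for '.', zz for a literal 'z') and leaves every other escape untouched; the helper name zdecode_pkg is made up.

```shell
# Hypothetical helper: decode zm -> '-', zi -> '.', zz -> 'z' on stdin.
# '@' serves as a temporary placeholder so that the 'z' from "zz" is not
# mistaken for the start of another escape.
zdecode_pkg() {
    sed -e 's/zz/@/g' -e 's/zm/-/g' -e 's/zi/./g' -e 's/@/z/g'
}

echo 'gtkzm0zi11zi2' | zdecode_pkg    # prints gtk-0.11.2
```

It can be put into the pipeline above, e.g. nm executable | grep stginit_gtk | zdecode_pkg | less.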
> > > and recompiling and re-linking my little program does not change
> > > its size ...
> >
> > Possibly the packages have been built with -split-objs originally, or
> > your GHC doesn't support split-objs, in either case, sorry for the
> > inconvenience
> >
> > :-/
> >
> > gtk apparently contains hundreds of modules, so it may be
> > the module-initialisation functions. Quick test:
> >
> > $ nm yourexecutable | grep stginit | wc -l
>
> I tried that, there are almost 900 matches.
>
Hm, that shouldn't take you anywhere near 20M.
> > > Any other idea what I might be doing wrong?
> >
> > Nothing, probably.
> >
> > What ghc version are you using, on which platform (OS, arch), which
> > packages does your executable need?
>
> I am using Ubuntu 10.10, ghc 6.12.1, on a 4-core AMD Phenom.
> The program I am compiling needs these packages:
> gtk, Text.Parsec, Time, Database.HDBC.Sqlite3, Locale,
> Data.ByteString.Lazy.
>
No obvious suspect.
Depending on how curious you are, you could split off small sub-programmes
to see what's taking a lot of space. But that's going to be tedious.
You could ask on IRC (#haskell) whether anybody knows about huge
executables with gtk, or on haskell-cafe, on the gtk2hs mailing list, or
you could open a ticket at http://hackage.haskell.org/trac/gtk2hs/ or
http://hackage.haskell.org/trac/ghc/newticket?type=bug whichever you
consider more likely to be responsible (you needn't create an account for
either bug-tracker, both have guest accounts with the guest password in
plain view - though you may need to look for it a bit). But first ask
elsewhere.
Christian
Good.
> The executable is now, after stripping, a mere
> 2.7MB in size (which is still large for what it does, but much better).
Well, the executable contains the runtime, so it's naturally much larger
than a small C programme. Compare with
$ cat helloWorld.hs
module Main (main) where
main :: IO ()
main = putStrLn "Hello, World!"
$ ghc --make helloWorld.hs
[1 of 1] Compiling Main ( helloWorld.hs, helloWorld.o )
Linking helloWorld ...
$ ls -l helloWorld
-rwxr-xr-x 1 dafis users 618581 17. Okt 13:53 helloWorld
I suppose on a 64-bit system, you get about twice the numbers, so 2.7MB for
a programme using gtk isn't exorbitant.
> Thanks to all who answered for your help!
>
> Christian
Cheers,
Daniel
> Hi,
> I was playing around with ghc again, and I was wondering what makes
> the executables so large and how I could make them smaller (strip
> works,
> but is there anything more I can do?)
> More specifically, I am compiling a program that uses the GTK+
> bindings,
> HDBC, and some things from Prelude.
> The program simply displays a window, and reads and writes values
> from/into a data base file. Not much, really.
> Anyway, the program size is 20MB without stripping, and 10MB after
> stripping ...
A Haskell program needs to schlep around a runtime with it. So when
you compare it to say a Java program, you need to compare it to the
size of a JAR + the size of the VM, and in that case the size isn't
so extraordinary. So that is the "why". As to the "what can you do":
probably not much that you've not already tried, other than supplying
your program as a script for #!/usr/bin/runghc
http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/runghc.html
That would mean your program wasn't so self-contained, as the target
machine would also need to have all the "stuff" that you have compiled
in, e.g. the HDBC development package.
Cheers,
G
Yes, JHC doesn't put a large runtime into the executables (and, as a
whole-programme compiler, it can remove more dead code than GHC can).
Stripping GHC's helloWorld:
-rwxr-xr-x 1 dafis users 377240 19. Okt 13:55 helloWorld
>
> Sad that many libraries, and gtk2hs don't work with it.
What makes it unusable for me is that it doesn't yet have arbitrary
precision integers (if that changes, the library problem might step in).
I hope it gets proper Integers and better library support soon, it's quite
an exciting project.
>
> David.